Abstract: The purpose of this study is to develop intelligent negotiation agents that behave rationally so as to improve the final outcomes of a one-to-many negotiation. A Bayesian learning model for multi-attribute one-to-many negotiation, namely Bayes Improved-ITA, is proposed. The agents employ a Bayesian belief updating process to model their opponents' utility structure. The performance of Bayes Improved-ITA is promising when compared with the results of one-to-many negotiations that use a genetic-based machine learning model and a heuristic search algorithm. The experimental results show that, with knowledge of their opponents' preferences and constraints, negotiation agents can achieve more optimal outcomes.
INTRODUCTION
The increasing popularity of the Internet and the World Wide Web (WWW) fuels the rise of electronic commerce (e-commerce). Both suppliers and consumers are freed from location and time constraints and can complete business transactions through computers and networks. Based on Nissen's Commerce Model (Nissen, 1997), Feldman's e-commerce value chain (Feldman, 1999) and Maes and Media Lab's Consumer Buying Behavior (CBB) model for e-commerce (Moukas et al., 2000), a buying process is divided into six stages: need identification, product brokering, merchant brokering, negotiation, purchase and delivery, and product service and evaluation. Among these six stages, negotiation is the key component in e-commerce (Sandholm, 1999). Nevertheless, negotiation is a complicated process, as it must resolve conflicts among all the parties involved, which may have conflicting goals. The process becomes more complex because negotiating parties are reluctant to reveal their preferences for fear of exploitation (Hindriks and Tykhonov, 2008). Thus, both buyer and seller face the problem of converging on a common area of interest regarding price and other terms of the transaction (e.g., delivery date and product warranty) during a negotiation. Human-based negotiation, which requires the parties involved to gather in a particular place at a fixed time, is time consuming and costly, while the outcomes achieved are typically sub-optimal (Bosse and Jonker, 2005).
In the literature, software agents (Jonker et al., 2007; Sim, 2007; Manisterski et al., 2008) have been proposed and implemented to automate the negotiation stage in online trading. These automated negotiation agents support one-to-one negotiation. The Intelligent Trading Agency (ITA) (Rahwan et al., 2002) is a framework that conducts one-to-many negotiation by means of a number of coordinated, simultaneous one-to-one multi-attribute negotiations. This model opens more alternatives to a party in a negotiation, as one party can negotiate concurrently with several parties and finally deal with the one that provides the best offer. In most negotiation agent systems, however, the agents do not take the dynamics of the negotiation into consideration when selecting the best course of action. A buyer or seller may change his decision during a negotiation because of environmental factors or for individual reasons. The self-interested nature of the agents makes them spend more time searching for feasible solutions, while the final outcomes obtained are normally sub-optimal. Moreover, negotiation agents normally implement strategies that are programmed prior to the start of a negotiation. These strategies become obsolete after a period of time and new strategies are required to replace them. Thus, a hypothesis can be made: an adaptive agent that keeps pace with the ever-changing environment has a higher probability of negotiating successfully than agents without the learning ability. The work of Lau (2005), Lin et al. (2006), Narayanan and Jennings (2006), Lopez-Carmona and Velasco (2006), Praca et al. (2008) and Hindriks and Tykhonov (2008) on automated negotiation systems informs the design of intelligent negotiation agents. This study discusses the implementation of intelligent negotiation agents in one-to-many negotiation.
A Bayesian learning model for multi-attribute negotiation is proposed to capture the opponent agents' preferences and constraints during the negotiation. The Bayesian learning model enables the negotiation agents to behave rationally and make better decisions during a one-to-many negotiation.
BAYES IMPROVED-ITA
Bayes Improved-ITA agents learn during the negotiation by means of a Bayesian learning approach. The learning model of the proposed negotiation agents is adapted from Bazaar (Zeng and Sycara, 1998) and implemented in the multi-attribute one-to-many negotiation framework proposed by Rahwan et al. (2002). A Bayesian network is used to update the knowledge and beliefs that each agent holds about the environment and the other agents.
System Operation
Figure 1 depicts the system operation of Bayes Improved-ITA. A negotiation is initialized with a buyer agent and several seller agents. The buyer agent is made up of a coordinating agent and several sub-buyers. Each negotiation agent holds a belief network about its opponent agent's acceptability criteria, together with domain knowledge of negotiation. The negotiation starts when the sub-buyers send their requirements to the seller agents in the form of string messages.
Fig. 1: System operation of Bayes Improved-ITA
Each seller agent then searches its database for the available packages. Bayes Improved-ITA models negotiation as a sequential decision-making process. When an agent receives an offer from its opponent agent, it first extracts the data from the offer. These data are the negotiation issues and the value proposed for each issue. The agent uses this information to update its Bayesian belief network. After that, the agent generates a counter offer to the opponent agent based on the probability distribution of the hypotheses in its belief network. Learning takes place at each cycle (an exchange of offer and counter offer), when the agents update their individual subjective beliefs before entering the next cycle of negotiation. This process is repeated in both directions until one of the negotiation agent pairs reaches an agreement or one side terminates the negotiation.
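The offer and counter-offer cycle described above can be sketched in a few lines of Python. This is a minimal illustration of the control flow for one sub-buyer and seller pair only, not the Bayes Improved-ITA implementation: the names `make_agent` and `negotiate`, the single price issue, the fixed concession step and all numeric values are assumptions introduced here, and the point at which a learning agent would update its belief network is marked with a comment.

```python
from typing import Callable, Optional, Tuple

# ("accept" | "quit" | "counter", counter-offer or None)
Reply = Tuple[str, Optional[int]]


def make_agent(start: int, limit: int, step: int, is_buyer: bool) -> Callable[[int], Reply]:
    """Hypothetical fixed-concession agent negotiating a single issue (price)."""
    state = {"next": start}  # the proposal this agent will make next

    def respond(incoming: int) -> Reply:
        # A learning agent would update its belief network from `incoming` here.
        proposal = state["next"]
        good_enough = incoming <= proposal if is_buyer else incoming >= proposal
        if good_enough:
            return ("accept", None)
        # Concede by one step towards the private limit for the next cycle.
        if is_buyer:
            state["next"] = min(limit, proposal + step)
        else:
            state["next"] = max(limit, proposal - step)
        return ("counter", proposal)

    return respond


def negotiate(buyer, seller, opening: int, max_cycles: int = 20):
    """Alternate offers until acceptance, withdrawal or the cycle limit."""
    offer = opening  # the seller's opening offer
    for cycle in range(1, max_cycles + 1):
        action, bid = buyer(offer)
        if action != "counter":
            return action, offer, cycle
        action, ask = seller(bid)
        if action != "counter":
            return action, bid, cycle
        offer = ask
    return "timeout", offer, max_cycles


buyer = make_agent(start=100, limit=180, step=20, is_buyer=True)
seller = make_agent(start=200, limit=140, step=20, is_buyer=False)
result = negotiate(buyer, seller, opening=220)
print(result)  # (outcome, agreed or last price, number of cycles)
```

In a one-to-many setting, a coordinating agent would run several such loops concurrently, one per sub-buyer, and settle with the seller whose final offer is best.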
System Architecture
Figure 2 depicts the architecture of an individual learning agent in Bayes Improved-ITA. The main architecture of an individual learning agent consists of a detector, the proposed Bayesian Learning Environment (BALE), a negotiation engine and an effector. The negotiation engine is equipped with knowledge of a set of constraint variables with different levels of satisfaction. The two main functions of the negotiation engine are evaluating offers and generating new offers.
The detector detects an offer from the opponent agent as an incoming signal. The agent then extracts the value of each negotiation issue from the offer received. This information forms the opponent's threshold learning data for BALE.
BALE is the environment in which the agent implements the learning mechanism. It consists of a Bayesian belief network, a Conditional Probability Table Learner (CPTLearner) and a Case Generator (CGen). Prior to a negotiation, each negotiation agent is initialized with a belief network. The belief network contains the learning agent's beliefs about the opponent agent's acceptability criteria for each negotiation issue. For a negotiation that involves multiple attributes, e.g., the total price of hard disk and CPU and the CPU speed in computer trading, these beliefs concern the opponent agent's reservation value for the total price of hard disk and CPU and its threshold value for CPU speed. This information is the opponent agent's threshold of offer acceptability. When the learning agent receives an offer from its opponent, the opponent's threshold learning data from the detector are used to update the belief network.
Fig. 2: Architecture of an individual Bayes Improved-ITA agent
CPTLearner is responsible for updating the belief network by revising the probability distribution of each hypothesis. The probability distribution of the belief network is revised whenever the detector detects an incoming signal from the opponent agent. CGen then generates the predicted value of each negotiation issue based on the probability distribution of the hypotheses in the belief network. Using these values together with the user's utility information, the negotiation engine generates a counter offer and sends it to the opponent agent through the effector.
Offer Evaluation and Generation
Figure 3 depicts the activity diagram for evaluating an offer and generating a new offer in a Bayes Improved-ITA sub-buyer or seller agent. Evaluating an offer means reasoning about the acceptability of the criteria in an offer sent by the opponent agent. Bayes Improved-ITA agents use utility theory and constraint propagation techniques (Yokoo and Hirayama, 2000; Rahwan et al., 2002) to evaluate an offer. Prior to a negotiation, the agents on both sides, i.e., buyer and seller, represent their preferences and constraints in a utility function. When an agent receives an offer from its opponent, it checks its utility function to ensure that the proposed value of each negotiation issue satisfies all the constraints. The utility information of an agent is, however, private and protected from other agents. For every negotiation issue there is an acceptability threshold: the agent will not accept an offer whose value falls below it. When the value of a received offer is above the agent's threshold of acceptability, a zone of agreement exists between the two parties; otherwise, the negotiation fails.
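The threshold check at the heart of offer evaluation can be sketched as follows. This is an illustrative simplification rather than the utility-theory and constraint-propagation evaluation used by the agents: the `Threshold` class, the `higher_is_better` flag (so that a buyer's price limit and a minimum CPU speed can share one representation) and the numeric values are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Threshold:
    """Hypothetical private acceptability threshold for one negotiation issue."""
    value: float
    higher_is_better: bool  # True: offer must be >= value; False: offer must be <= value


def is_acceptable(offer: Dict[str, float], thresholds: Dict[str, Threshold]) -> bool:
    """Return True only if every issue in the offer meets its threshold."""
    for issue, threshold in thresholds.items():
        proposed = offer[issue]
        if threshold.higher_is_better:
            ok = proposed >= threshold.value
        else:
            ok = proposed <= threshold.value
        if not ok:
            return False
    return True


# Illustrative buyer: pays at most RM180 and wants at least a 2.0 GHz CPU.
buyer_thresholds = {
    "total_price": Threshold(180.0, higher_is_better=False),
    "cpu_speed": Threshold(2.0, higher_is_better=True),
}
good = is_acceptable({"total_price": 150.0, "cpu_speed": 2.4}, buyer_thresholds)
bad = is_acceptable({"total_price": 200.0, "cpu_speed": 2.4}, buyer_thresholds)
print(good, bad)
```

An offer that passes this check for both parties lies in the zone of agreement; the thresholds themselves stay private to each agent.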
Generating a new offer means searching for prospective solutions that lie within the zone of agreement. Knowing the opponent agent's threshold of offer acceptability helps to optimize the final negotiation outcome, since the agent can select values based on a more accurate estimate of the opponent's utility structure when generating a new offer.
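One simple way to use a predicted opponent threshold when generating an offer is sketched below for a seller and a single price issue. The interpolation rule, the function name `propose_price` and the `share` parameter are assumptions made for illustration; the text does not specify how the negotiation engine combines the prediction with the agent's own utility function.

```python
from typing import Optional


def propose_price(own_floor: float, predicted_ceiling: float, share: float = 0.5) -> Optional[float]:
    """Pick a price inside the estimated zone of agreement.

    own_floor         -- the seller's own reservation price (private)
    predicted_ceiling -- the buyer's reservation price as predicted by learning
    share             -- fraction of the estimated surplus the seller claims (assumed knob)
    """
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must lie in [0, 1]")
    if predicted_ceiling < own_floor:
        return None  # no estimated zone of agreement
    return own_floor + share * (predicted_ceiling - own_floor)


print(propose_price(140.0, 180.0))  # midpoint of the estimated zone
print(propose_price(140.0, 120.0))  # None: the estimated zone is empty
```

The more accurate the predicted threshold, the less likely such a proposal is to fall outside the opponent's real acceptability range.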
Fig. 3: Activity diagram for evaluating offers and generating new offers in Bayes Improved-ITA
Bayes Improved-ITA agents learn during the negotiation by updating the Bayesian belief network that represents the opponent agent's acceptability criteria for the negotiation issues as well as the agent's own domain knowledge about negotiation. If a zone of agreement exists between both agents after the evaluation state, the values of the variables in the opponent agent's offer are passed to BALE as the opponent's threshold learning data. The probability distribution of the hypotheses in the Bayesian belief network is then revised. This belief network helps Bayes Improved-ITA agents to model the utility structure of their opponent agents. The outcome of the Bayesian learning mechanism is a set of predicted opponent threshold values for the negotiation issues. Combining these with its own utility function, the agent generates a new offer and sends it to the opponent agent.
BAYESIAN LEARNING IN NEGOTIATION
The Bayesian belief network resides in BALE. It is the abstraction of an agent's belief about its opponent's threshold value for each negotiation issue. Figure 4 depicts the Bayesian network for the computer trading negotiation problem. A computer trade that involves two negotiation issues, namely the total price of hard disk and CPU and the CPU speed, is taken as an illustration. The two negotiation issues are the variables, which are represented as nodes in the Bayesian network. Since Bayes Improved-ITA agents learn the opponent's threshold of offer acceptability from their partial belief about each negotiation issue's threshold value together with their domain knowledge about negotiation, the agent's domain knowledge is also represented as a node in the network.
Each node contains states that denote the hypotheses, H. An agent's partial belief about its opponent's threshold value for each negotiation issue can be represented by a set of hypotheses H_i, i = 1, ..., N. For instance, there may be H_1 = RM100 and H_2 = RM180 for the belief about the opponent agent's reservation value for the total price of hard disk and CPU. For both variables, namely the total price of hard disk and CPU and the CPU speed, every candidate hypothesis is assigned the same prior probability. Following Zeng and Sycara (1998), the domain knowledge can be an observation such as: "usually in our business, people will offer a price which is above their reservation price by 17%". This statement is represented by a set of conditional statements such as P(e_1 | H_1) = 0.3, where e_1 = RM117 and H_1 = RM100 (Zeng and Sycara, 1998). In this example, e_1 represents the value of the negotiation issue proposed by the opponent agent, while H_1 is the hypothesis about the opponent agent's reservation price. Figure 5 depicts the learning algorithm of Bayes Improved-ITA. Bayesian updating occurs when the agent receives an offer as a signal from the opponent agent. Along with domain-specific knowledge, these new signals enable the agent to revise its belief about the opponent agent's reservation value for the total price of hard disk and CPU, as well as the threshold value for CPU speed, in the form of a posterior subjective evaluation over H_i.
Fig. 4: Bayesian network for the computer trading negotiation problem
Fig. 5: Learning algorithm of Bayes Improved-ITA
Given the signal e in the form of an offer made by the opponent agent and the domain knowledge encoded in the form of conditional statements, the agent can use the Bayesian updating rule to revise its belief about the opponent's reservation value for the total price of hard disk and CPU and the threshold value for CPU speed. The Bayesian updating rule (Zeng and Sycara, 1998) is:
$$P(H_i \mid e) = \frac{P(H_i)\,P(e \mid H_i)}{\sum_{k=1}^{N} P(e \mid H_k)\,P(H_k)} \tag{1}$$
Thus, the agent can predict the opponent agent's reservation value for the total price of hard disk and CPU, as well as the threshold value for CPU speed, from the probability distribution of the hypotheses for each negotiation issue. With this information, the agent can generate a new offer based on a more accurate estimate of the opponent agent's utility structure.
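The Bayesian updating rule can be illustrated with the two price hypotheses used in the text, RM100 and RM180, and an incoming offer of RM117. The sketch below is a hand-written example and not the Netica-J based implementation: the Gaussian shape of the likelihood around the 17% markup rule, its spread `sigma`, and the use of the posterior mean as the predicted value are all assumptions introduced here.

```python
import math
from typing import Dict


def likelihood(offer: float, hypothesis: float, markup: float = 0.17, sigma: float = 10.0) -> float:
    """Assumed P(e | H): unnormalised Gaussian centred on the marked-up reservation price."""
    expected_offer = hypothesis * (1.0 + markup)
    return math.exp(-0.5 * ((offer - expected_offer) / sigma) ** 2)


def bayes_update(prior: Dict[float, float], offer: float) -> Dict[float, float]:
    """Posterior over hypotheses: P(H_i | e) is proportional to P(e | H_i) * P(H_i)."""
    unnormalised = {h: likelihood(offer, h) * p for h, p in prior.items()}
    total = sum(unnormalised.values())
    if total == 0.0:
        return dict(prior)  # the offer is uninformative under every hypothesis
    return {h: weight / total for h, weight in unnormalised.items()}


def predict(belief: Dict[float, float]) -> float:
    """Assumed point prediction: the posterior mean of the hypotheses."""
    return sum(h * p for h, p in belief.items())


# Uniform prior over the opponent's reservation price, as in the text.
prior = {100.0: 0.5, 180.0: 0.5}
posterior = bayes_update(prior, offer=117.0)
print(posterior)
print(predict(posterior))
```

Because RM117 is exactly 17% above RM100, the posterior shifts almost entirely to the RM100 hypothesis; repeating the update with each new offer refines the estimate cycle by cycle.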
EXPERIMENTAL RESULTS
A computer trading scenario is used in the experiments. The trade starts when the buyer agent sends a request for a computer package to all participating seller agents simultaneously. The seller agents search their databases for available packages and, in return, generate an offer to the buyer agent. The negotiation involves four issues: CPU name, CPU speed, hard disk manufacturer and total price of hard disk and CPU. The non-negotiable attributes are the CPU name and the hard disk manufacturer, while the negotiable attributes are the total price of hard disk and CPU and the CPU speed. The intention of both the buyer and the seller agents is to get the best possible deal from their opponent. A negotiation cycle consists of one exchange of offer and counter offer by each pair of negotiation agents. The composite buyer agent consists of three sub-buyer instances and a coordinating agent, implemented as a multi-threaded system. Bayes Improved-ITA uses Netica-J, a Bayesian network toolkit from Norsys Software Corp., to implement Bayesian learning.
To evaluate the performance of Bayes Improved-ITA, two major experiments are conducted. The first experiment evaluates the performance of Bayesian learning in the negotiation problem. The second experiment examines the performance of the learning agents in Bayes Improved-ITA by observing the final negotiation outcomes and the agents' justification of negotiation decisions. The outcomes are compared with those of ITA in terms of the joint utility of the negotiation agents. The agents' justification of negotiation decisions in Bayes Improved-ITA is compared with the results obtained by ITA and by negotiation agents that use a genetic-based machine learning model (GA-Negotiation Agents).
Performance of Bayesian Learning in Negotiation
Figure 6 shows the performance graph of the proposed Bayesian Learning Environment (BALE) on 300 randomly generated negotiation problems. Each negotiation problem contains two negotiation issues, namely the total price of hard disk and CPU and the CPU speed. The blue accuracy curve reflects the difference between the actual and predicted total price of hard disk and CPU over 300 iterations of learning and remains around 84.5%. The pink curve represents the accuracy of the CPU speed predicted by BALE over 300 iterations of learning and remains around 93.2%. It can be observed that the accuracy of the predicted output increases over time and remains consistent towards the end of the iterations.
Comparison of Joint Utility
Joint utility is the criterion that measures the social welfare of the buyer and seller agents (Cheng et al., 2005). In other words, the joint utility indicates the quality of a particular negotiation process (Zeng and Sycara, 1998). Table 1 shows the joint utility of ITA and Bayes Improved-ITA. Bayes Improved-ITA has a higher joint utility value than ITA, which indicates that Bayes Improved-ITA achieves more optimal outcomes than ITA. Thus, Bayes Improved-ITA delivers a better quality of negotiation than ITA.
Comparison of Agents' Justification of Decision
An agent's justification of decision reflects the agent's ability to make a rational decision during the negotiation. It is measured as the difference between the payoff values achieved by a pair of negotiation agents over the negotiation cycles. Ideally, this difference should be as small as possible for a pair of negotiation agents, as this means that both agents are able to act rationally in approaching an agreement.
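The measure described above can be computed directly from the two agents' payoffs in each cycle. In the sketch below, the function name `payoff_gaps` and the payoff values are illustrative assumptions and are not experimental data; taking the absolute difference is also an assumption about how the difference is reported.

```python
from typing import List, Sequence


def payoff_gaps(buyer_payoffs: Sequence[float], seller_payoffs: Sequence[float]) -> List[float]:
    """Absolute payoff difference between buyer and seller for each negotiation cycle."""
    if len(buyer_payoffs) != len(seller_payoffs):
        raise ValueError("both agents need one payoff value per cycle")
    return [abs(b - s) for b, s in zip(buyer_payoffs, seller_payoffs)]


# Made-up payoffs over three cycles, for illustration only.
buyer_payoffs = [0.75, 0.5, 0.5]
seller_payoffs = [0.25, 0.25, 0.5]
gaps = payoff_gaps(buyer_payoffs, seller_payoffs)
print(gaps)  # a shrinking gap means the pair is converging
```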
Table 1: Joint utility of ITA and Bayes Improved-ITA
Fig. 6: Performance graph of BALE
Fig. 7: Agents' justification of decision between (a) sub-buyer 1 and seller 0, (b) sub-buyer 2 and seller 1 and (c) sub-buyer 3 and seller 2
Figure 7 shows the agents' justification of negotiation decisions in a one-to-many negotiation between three pairs of sub-buyer and seller agents. Figure 7a shows the justification of negotiation decisions between sub-buyer 1 and seller 0 for ITA, GA-Negotiation Agents and Bayes Improved-ITA during the negotiation. Bayes Improved-ITA shows the best performance: its agents make better decisions than those of ITA and GA-Negotiation Agents, as the difference between the negotiation payoffs of the buyer and seller agents is smaller than for the other two systems. ITA, in contrast, performs poorly in justifying the negotiation decision. Figure 7b shows the justification of negotiation decisions between sub-buyer 2 and seller 1 for ITA, GA-Negotiation Agents and Bayes Improved-ITA. Bayes Improved-ITA again shows the best performance, which indicates that its agents make better decisions than those of the other two systems. Although this pair of negotiation agents does not reach an agreement by the end of the negotiation, the agents perform well in justifying their decisions throughout the negotiation. Figure 7c shows the justification of negotiation decisions between sub-buyer 3 and seller 2 for ITA, GA-Negotiation Agents and Bayes Improved-ITA. Bayes Improved-ITA outperforms the other two systems in justifying the negotiation decision. In conclusion, ITA has a weaker ability to make decisions during the negotiation than GA-Negotiation Agents and Bayes Improved-ITA.
DISCUSSION
A learning negotiation model, namely BALE, is tested on randomly generated negotiation problems. The proposed Bayesian learning model estimates the opponent's threshold value for each negotiation issue. The accuracy obtained in the first experiment shows that applying a machine learning approach to multi-attribute negotiation is promising. Although Narayanan and Jennings (2006) have proposed a Bayesian learning approach to negotiation problems, their framework focused on learning the opponent's negotiation strategy and the negotiation presented involved only a single attribute. A related study by Zeng and Sycara (1998) uses a Bayesian updating approach to model the opponent's utility structure in single-attribute negotiations. The experimental work here shows that a machine learning approach is effective in learning negotiations that involve multiple issues. Achieving a win-win scenario has always been the main concern in negotiation. Even though human negotiators can perform well in negotiation, the outcomes are sub-optimal (Bosse and Jonker, 2005; Hindriks and Tykhonov, 2008). According to the joint utility value, which measures whether the buyer and seller agents have reached optimal outcomes, the negotiation agents with the learning ability (Bayes Improved-ITA) achieve a higher joint utility than the non-learning negotiation agents in ITA. Bosse and Jonker (2008) have also demonstrated that learning agents achieve more optimal outcomes in a one-to-one negotiation. In this study, Bayes Improved-ITA has shown how the agents' learning capability relates to the outcomes of negotiation in a one-to-many negotiation framework. The results of the second experiment imply that having knowledge of the opponent's preferences during the negotiation can optimize the negotiation outcomes. In terms of the agents' justification of decision, Bayes Improved-ITA outperforms GA-Negotiation Agents and ITA.
The results indicate that Bayes Improved-ITA agents can make better decisions during the negotiation. Nevertheless, in the early stage of negotiation, Bayes Improved-ITA shows a larger difference in negotiation payoff between buyer and seller agents than GA-Negotiation Agents and ITA. This is due to the lack of information that the agent has collected from its opponent during the early stage of negotiation.
CONCLUSION
The main focus of this study is to optimize negotiation outcomes in one-to-many negotiation. The proposed intelligent negotiation agents are able to capture the preferences and constraints of their opponent agents during a negotiation. This information provides an important reference for the agents when they generate a new offer. Bayes Improved-ITA agents learn during the negotiation by modeling the utility structure of their opponents with a Bayesian network. The negotiation learning method proposed in BALE takes the form of reinforcement learning, in which the agents perceive inputs (offers) from the environment and take actions (counter offers) based on the cumulative reward returned by the environment for their previous actions. Bayes Improved-ITA agents improve the negotiation outcomes by optimizing the benefits of all parties involved in the negotiation. The experimental results show that the adaptive nature of the agents can increase the fitness of these autonomous agents in the dynamic electronic market. More optimal outcomes can be achieved when the negotiation agents take the opponent's preferences and constraints into consideration during the negotiation rather than being purely self-interested. As future work, the negotiation outcomes can be further optimized by integrating Bayes Improved-ITA with a grid computing approach, which allows the negotiation agents to allocate and acquire resources effectively in a distributed environment.
ACKNOWLEDGMENT
The authors wish to thank Dr. Iyad Rahwan for providing helpful information and data sets of ITA.