next up previous
Next: Preliminary Definitions Up: Extremal Behaviour in Multiagent Contract Negotiation Previous: Extremal Behaviour in Multiagent Contract Negotiation

Introduction

Mechanisms for negotiating allocation of resources within a group of agents form an important body of work within the study of multiagent systems. Typical abstract models derive from game-theoretic perspectives in economics and among the issues that have been addressed are strategies that agents use to obtain a particular subset of the resources available, e.g. [Kraus, 2001; Rosenschein & Zlotkin, 1994; Sandholm, 1999], and protocols by which the process of settling upon some allocation of resources among the agents involved is agreed, e.g. [Dignum & Greaves, 2000; Dunne, 2003; Dunne & McBurney, 2003; McBurney et al., 2002].

The setting we are concerned with is encapsulated in the following definition.

Definition 1   A resource allocation setting is defined by a triple $\langle{\mathcal A},{\mathcal R},{\mathcal U}\rangle $ where

\begin{displaymath}
{\mathcal A}=\{A_1,A_2,\ldots,A_n\}   ;   \
{\mathcal R}=\{r_1,r_2,\ldots,r_m\}
\end{displaymath}

are, respectively, a set of (at least two) agents and a collection of (non-shareable) resources. A utility function, $u$, is a mapping from subsets of ${\mathcal R}$ to rational values. Each agent $A_i\in{\mathcal A}$ has associated with it a particular utility function $u_i$, so that ${\mathcal U}$ is $\langle u_1,u_2,\ldots,u_n\rangle $. An allocation $P$ of ${\mathcal R}$ to ${\mathcal A}$ is a partition $\langle P_1,P_2,\ldots,P_n\rangle $ of ${\mathcal R}$. The value $u_i(P_i)$ is called the utility of the resources assigned to $A_i$. A utility function, $u$, is monotone if whenever $S\subseteq T$ it holds that $u(S)\leq u(T)$, i.e. the value assigned by $u$ to any set of resources, $T$, is never less than the value $u$ attaches to any subset, $S$ of $T$.

Two major applications in which the abstract view of Definition 1 has been exploited are e-commerce and distributed task realisation. In the first ${\mathcal R}$ represents some collection of commodities offered for sale and individual agents seek to acquire a subset of these, the ``value'' an agent attaches to a specific set being described by that agent's utility function. In task planning, the ``resource'' set describes a collection of sub-tasks to be performed in order to realise some complex task, e.g. the ``complex task'' may be to transport goods from a central warehouse to some set of cities. In this example ${\mathcal R}$ describes the locations to which goods must be dispatched and a given allocation defines those places to which an agent must arrange deliveries. The utility functions in such cases model the cost an agent associates with carrying out its alloted sub-tasks.

Within the very general context of Definition 1, a number of issues arise stemming from the observation that it is unlikely that some initial allocation will be seen as satisfactory either with respect to the views of all agents in the system or with respect to divers global considerations. Thus, by proposing changes to the initial assignment individual agents seek to obtain a ``better'' allocation. This scenario raises two immediate questions: how to evaluate a given partition and thus have a basis for forming improved or optimal allocations; and, the issue underlying the main results of this paper, what restrictions should be imposed on the form that proposed deals may take.

We shall subsequently review some of the more widely studied approaches to defining conditions under which some allocations are seen as ``better'' than others. For the purposes of this introduction we simply observe that such criteria may be either quantitative or qualitative in nature. As an example of the former we have the approach wherein the ``value'' of an allocation $P$ is simply the sum of the values given by the agents' utility functions to the subsets of ${\mathcal R}$ they have been apportioned within $P$, i.e. $\sum_{i=1}^{n} u_i(P_i)$: this is the so-called utilitarian social welfare, which to avoid repetition we will denote by $\sigma_u(P)$. A natural aim for agents within a commodity trading context is to seek an allocation under which $\sigma_u$ is maximised. One example of a qualitative criterion is ``envy freeness'': informally, an allocation, $P$, is envy-free if no agent assigns greater utility to the resource set ($P_j$) held by another agent than it does with respect to the resource set ($P_i$) it has actually been allocated, i.e. for each distinct pair $\langle i,j\rangle $, $u_i(P_i)\geq u_i(P_j)$.

In very general terms there are two approaches that have been considered in treating the question of how a finite collection of resources might be distributed among a set of agents in order to optimise some criterion of interest: ``contract-net'' based methods, e.g. [Dunne et al., 2003; Endriss et al., 2003; Endriss & Maudet, 2004b; Sandholm, 1998, 1999] deriving from the work of [Smith, 1980]; and ``combinatorial auctions'', e.g. [ Parkes & Ungar, 2000a; Parkes & Ungar, 2003b; Sandholm et al., 2001; Sandholm, 2002; Sandholm & Suri, 2003; Tennenholz, 2000; Yokoo et al., 2004 amongst others] The significant difference between these is in the extent to which a centralized controlling agent determines the eventual distribution of resources among agents.

One may view the strategy underlying combinatorial auctions as investing the computational effort into a ``pre-processing'' stage following which a given allocation is determined. Thus a controlling agent (the ``auctioneer'') is supplied with a set of bids - pairs $\langle S_j,p_j\rangle $ wherein $S_j$ is some subset of the available resources and $p_j$ the price agent $A_j$ is prepared to pay in order to acquire $S_j$. The problem faced by the auctioneer is to decide which bids to accept in order to maximise the overall profit subject to the constraint that each item can be obtained by at most one agent.

What we shall refer to as ``contract-net schemes'' typically eschew the precomputation stage and subordination to a controlling arbiter employed in auction mechanisms, seeking instead to realise a suitable allocation by an agreed sequence of deals. The contract-net (in its most general instantiation) for scenarios of $m$ resources distributed among $n$ agents, is the complete directed graph with $n^m$ vertices (each of which is associated with a distinct allocation). In this way a possible deal $\langle P,Q\rangle $ is represented as an edge directed from the vertex labelled with $P$ to that labelled $Q$. Viewed thus, identifying a sequence of deals can be interpreted as a search process which, in principle, individual agents may conduct in an autonomous fashion.

Centralized schemes can be effective in contexts where the participants cooperate (in the sense of accepting the auctioneer's arbritration). In environments within which agents are highly self-interested to the extent that their aims conflict with the auction process or in which there is a high degree of ``uncertainty'' about the outcome, in working towards a final allocation, the agents involved may only be prepared to proceed ``cautiously'': that is, an agent will only accept a proposed reallocation if satisfied that such would result in an immediate improvement from its own perspective. In such cases, the process of moving from the initial allocation, $P_{init}$, to the eventual reallocation $P_{fin}$ is by a sequence of local rational deals, e.g. an agent might refuse to accept deals which reduced $\sigma_u$ because of the possibility that it suffers an uncompensated loss in utility. A key issue here is the following: if the deal protocol allows only moves in which at each stage some agent $A_j$ offers a single resource to another agent $A_j$ then the rational reallocation $\langle P_{init},P_{fin}\rangle $ can always be implemented; if, however, every single move must be ``rational'' then $\langle P_{init},P_{fin}\rangle $ may not be realisable.

We may, informally, regard the view of such agents as ``myopic'', in the sense that they are unwilling to accept a ``short-term loss'' (a deal $\langle P,Q\rangle $ under they might incur a loss of utility) despite the prospect of a ``long-term gain'' (assuming $\sigma_u(P_{fin})>\sigma_u(P_{init})$ holds).

There are a number of reasons why an agent may adopt such views, e.g. consider the following simple protocol for agreeing a reallocation.

A reallocation of resources is agreed over a sequence of stages, each of which involves communication between two agents, $A_i$ and $A_j$. This communication consists of $A_i$ issuing a proposal to $A_j$ of the form $(buy,r,p)$, offering to purchase $r$ from $A_j$ for a payment of $p$; or $(sell,r,p)$, offering to transfer $r$ to $A_j$ in return for a payment $p$. The response from $A_j$ is simply $accept$ (following which the deal is implemented) or $reject$.
This, of course, is a very simple negotiation structure, however consider its operation within a two agent setting in which one agent, $A_1$ say, wishes to bring about an allocation $P_{fin}$ (and thus can devise a plan - sequence of deals - to realise this from an initial allocation $P_{init}$) while the other agent, $A_2$, does not know $P_{fin}$. In addition, assume that $A_1$ is the only agent that makes proposals and that a final allocation is fixed either when $A_1$ is ``satisfied'' or as soon as $A_2$ rejects any offer.

While $A_2$ could be better off if $P_{fin}$ is realised, it may be the case that the only proposals $A_2$ will accept are those under which it does not lose, e.g. some agents may be sceptical about the bona fides of others and will accept only deals from which they can perceive an immediate benefit. There are several reasons why an agent may embrace such attitudes within the schema outlined: once a deal has been implemented $A_2$ may lose utility but no further proposals are made by $A_1$ so that the loss is ``permanent''. We note that even if we enrich the basic protocol so that $A_1$ can describe $P_{fin}$, $A_2$ may still reject offers under which it suffers a loss, since it is unwilling to rely on the subsequent deals that would ameliorate its loss actually being proposed. Although the position taken by $A_2$ in the setting just described may appear unduly cautious, we would claim that it does reflect ``real'' behaviour in certain contexts. Outside the arena of automated allocation and negotiation in multiagent systems, there are many examples of actions by individuals where promised long-term gains are insufficient to engender the acceptance of short term loss. Consider ``chain letter'' schemes (or their more subtle manifestation as ``pyramid selling'' enterprises): such have a natural lifetime bounded by the size of the population in which they circulate, but may break down before this is reached. Faced with a request to ``send $10 to the five names at the head of the list and forward the letter to ten others after adding your name'' despite the possibility of significant gain after a temporary loss of $50, to ignore such blandishments is not seen as overly sceptical and cautious: there may be reluctance to accept that one will eventually receive sufficient recompense in return and suspicion that the name order has been manipulated.

In summary, we can identify two important influences that lead to contexts in which agents prefer to move towards a reallocation via a sequence of ``rational'' deals. Firstly, the agents are self-interested but operating in an unstable environment, e.g. in the ``chain letter'' setting, an agent cannot reliably predict the exact point at which the chain will fail. The second factor is that computational restrictions may limit the decisions an individual agent can make about whether or not to accept a proposed deal. For example in settings where all deals involve one resource at a time, $A_2$ may reject a proposal to accept some resource, $r$, since $r$ is only ``useful'' following a further sequence of deals: if this number of further deals is ``small'' then $A_2$ could decide to accept the proposed deal since it has sufficient computational power to determine that there is a context in which $r$ is of value; if this number is ``large'' however, then $A_2$ may lack sufficient power to scan the search space of future possibilities that would allow it to accept $r$. Notice that in the extreme case, $A_2$ makes its decision solely on whether $r$ is of immediate use, i.e. $A_2$ is myopic. A more powerful $A_2$ may be able to consider whether $r$ is useful should up to $k$ further deals take place: in this case, $A_2$ could still refuse to accept $r$ since, although of use, $A_2$ cannot determine this with a bounded look ahead.

In total for the scenario we have described, if $A_1$ wishes to bring about an allocation $P_{fin}$ then faced with the view adopted by $A_2$ and the limitations imposed by the deal protocol, the only ``effective plan'' that $A_1$ could adopt is to find a sequence of rational deals to propose to $A_2$.

Our aim in this article is to show that combining ``structural'' restrictions (e.g. only one resource at a time is involved in a local reallocation) with rationality restrictions can result in settings in which any sequence to realise a reallocation $\langle P,Q\rangle $ must involve exponentially many (in $\vert{\mathcal R}\vert$) separate stages. We refine these ideas in the next sub-section.


Subsections
next up previous
Next: Preliminary Definitions Up: Extremal Behaviour in Multiagent Contract Negotiation Previous: Extremal Behaviour in Multiagent Contract Negotiation
Paul Dunne 2004-11-26