Next: Related Work Up: Results Previous: The Competition

Controlled Testing

In order to evaluate ATTac-2000's adaptive hotel bidding strategy in a controlled manner, we ran several game instances with ATTac-2000 playing against two variants of itself:

1. High-bidder always computed G* based on the current hotel prices (as opposed to using priors and averages of past closing prices).
2. Low-bidder always computed G* as in variant 1, but also only bid for hotel rooms at $50 over the current ask price (as opposed to the marginal utility, which tended to be more than $1000).
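The difference in bid levels between the two variants can be illustrated with a minimal sketch. It assumes, as the parenthetical in variant 2 suggests, that the high-bidder bids the marginal utility of a room. The names `HotelQuote`, `high_bidder_bid`, and `low_bidder_bid` are hypothetical and not taken from ATTac-2000's code, and the computation of G* is not modelled.

```python
from dataclasses import dataclass

# Fixed increment over the ask price used by variant 2, as stated in the text.
LOW_BID_INCREMENT = 50.0


@dataclass
class HotelQuote:
    """Hypothetical container for the inputs to one hotel bid."""
    ask_price: float          # current ask price of the hotel auction
    marginal_utility: float   # assumed value of one more room to the agent


def high_bidder_bid(quote: HotelQuote) -> float:
    """Variant 1 (assumed): bid the marginal utility of the room."""
    return quote.marginal_utility


def low_bidder_bid(quote: HotelQuote) -> float:
    """Variant 2: bid $50 over the current ask price."""
    return quote.ask_price + LOW_BID_INCREMENT


# Illustrative numbers only; they are not measurements from the experiments.
example = HotelQuote(ask_price=120.0, marginal_utility=1040.0)
high_bid = high_bidder_bid(example)
low_bid = low_bidder_bid(example)
```

With a marginal utility above $1000, the high-bidder's bid is far above the ask price, while the low-bidder's bid stays within $50 of it.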

At the extremes, with ATTac-2000 and 7 high-bidders playing, at least one hotel price skyrockets in every game since all agents bid very high for the hotel rooms. On the other hand, with ATTac-2000 and 7 low-bidders playing, hotel prices never skyrocket since all agents but ATTac-2000 bid close to the ask price. Our goal was to measure whether ATTac-2000 could perform well in both extreme scenarios as well as various intermediate ones. Table 5 summarizes our results.


 
Table 5: The difference between ATTac-2000's score and the score of each of the other seven agents averaged over all games in a controlled experiment. All differences are statistically significant at the 0.001 level, except the one marked with an asterisk (italicized in the original). Each row corresponds to a different number of high-bidders (excluding ATTac-2000 itself). The first column presents the number of high-bidders as well as the number of experiments we ran for that scenario (in parentheses). The column labeled "agent i" shows how much better ATTac-2000 did on average than agent i. Scores above the stair-step line (the left-hand span in each row) are for high-bidders (variant 1) and scores below the line (the right-hand span) are for low-bidders (variant 2). Results for identical agents are averaged to obtain a single average score difference for each type of agent in each row; arrows mark the range of agents that each averaged value covers. In all cases, ATTac-2000 beats the other agents.

#high (games)  agent 2  agent 3  agent 4  agent 5  agent 6  agent 7  agent 8
7 (14)         <-------------------------- 9526 --------------------------->
6 (87)         <--------------------- 10679 ---------------------->   1389
5 (84)         <----------------- 10310 ----------------->  <---- 2650 ---->
4 (48)         <------------ 10005 ------------->  <-------- 4015 --------->
3 (21)         <-------- 5067 --------->  <------------- 3639 ------------->
2 (282)        <---- 209* ---->  <----------------- 2710 ------------------>
 

Each row of Table 5 corresponds to a different number of high-bidders in the game; for example, the row labeled with 4 high-bidders corresponds to ATTac-2000 playing with 4 copies of variant 1 and 3 copies of variant 2. Results for identical agents are averaged to obtain a single average score difference for each type of agent in each row. The first column also shows, in parentheses, the number of games played for that row; this number differs from row to row. In all cases, we ran enough game instances to achieve statistically significant results, and in some cases we ran more instances than turned out to be required.

The column labeled agent i shows the difference between ATTac-2000's score and the score of agent i averaged over all games. In all scenarios, these differences are positive, showing that ATTac-2000 outscored all other agents on average (see footnote 5). Statistical significance was computed from paired t-tests; all results are significant at the 0.001 level except for the single marked entry (the high-bidders in the last row). As mentioned before, if the number of high-bidders is greater than or equal to 3, we expect the price for contentious hotels to rise, and in all such scenarios ATTac-2000 significantly outperforms all the other agents. The large score differences in the top rows of Table 5 arise mainly because the other agents receive large negative scores after buying many expensive hotel rooms.
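For readers unfamiliar with the test, the following is a minimal sketch of how a paired t statistic can be computed from per-game scores. It is not the authors' analysis code: the function name `paired_t_statistic` and the sample scores are hypothetical, and only the statistic is computed, not the p-value.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(attac_scores, other_scores):
    """Return (mean difference, t statistic, degrees of freedom).

    Each position in the two lists is assumed to hold the scores of
    ATTac-2000 and one other agent in the same game.
    """
    if len(attac_scores) != len(other_scores):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(attac_scores, other_scores)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two paired games")
    mean_diff = mean(diffs)
    # Standard error of the mean difference (sample std dev, n - 1 denominator).
    std_err = stdev(diffs) / math.sqrt(n)
    if std_err == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean_diff, mean_diff / std_err, n - 1


# Made-up scores for four games, for illustration only.
attac = [3000.0, 2500.0, 3200.0, 2800.0]
other = [1000.0, 900.0, 1500.0, 600.0]
mean_diff, t_stat, dof = paired_t_statistic(attac, other)
```

The mean difference corresponds to one entry of Table 5. Significance at a given level is then judged by comparing the t statistic against the t distribution with the returned degrees of freedom; SciPy's `scipy.stats.ttest_rel` reports the statistic together with a p-value.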

In these experiments, ATTac-2000 always uses its adaptive hotel price expectations, even when there are only 2 high-bidders. In the last row, with 2 high-bidders, very little bidding up of hotel prices is expected, and we do not get statistical significance relative to the two high-bidders (agent 2 and agent 3), since their strategies are nearly identical to ATTac-2000's in this case. We do, however, get high statistical significance relative to all the other agents (copies of variant 2). Thus, ATTac-2000's adaptivity to hotel prices appears to help substantially when hotel prices do skyrocket, and does not appear to prevent ATTac-2000 from winning on average when they do not.

The results of Table 5 provide strong evidence for ATTac-2000's ability to adapt robustly to varying numbers of competing agents that bid up hotel prices near the end of the game. Note that ATTac-2000 is not designed to perform well against itself. If 8 copies of ATTac-2000 play against each other repeatedly, they will all favor the same hotel rooms and thus all consistently get large negative scores. It would be interesting to determine whether there exists a strategy that is both harmful to ATTac-2000 and beneficial to the adversary.


Peter Stone
2001-09-13