May 27, 3:00, WeH 4601

A Distributed RL Scheme for Packet Routing

Justin Boyan (joint work with Michael Littman)

We describe an adaptive algorithm for packet routing in which a reinforcement learning module is embedded into each node of a switching network. In unit time, a node examines the top packet in its queue and sends it to the neighbor estimated to give the shortest routing time to that packet's destination. Only local information and local communication are used at each node to keep up-to-date estimates of shortest routing times. In simple experiments involving a 36-node, irregularly connected network, this learning approach proves superior to a nonadaptive algorithm based on precomputed shortest paths.
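
The per-node behavior described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the speakers' implementation: the abstract does not give the update rule, so the rule here (move the estimate toward queue delay plus link delay plus the chosen neighbor's own best estimate) is assumed, as are the class and function names (Node, forward), the zero initial estimates, and the learning rate of 0.5.

```python
from collections import defaultdict


class Node:
    """One switching node holding purely local routing-time estimates."""

    def __init__(self, name, neighbors, learning_rate=0.5):
        self.name = name
        self.neighbors = list(neighbors)
        self.learning_rate = learning_rate  # assumed step size
        # estimate[dest][neighbor]: estimated time to reach dest via neighbor.
        # Starting every estimate at 0.0 is an assumption of this sketch.
        self.estimate = defaultdict(lambda: {n: 0.0 for n in self.neighbors})

    def best_neighbor(self, dest):
        """Neighbor with the smallest estimated routing time to dest."""
        table = self.estimate[dest]
        return min(self.neighbors, key=lambda n: table[n])

    def remaining_time(self, dest):
        """This node's own best estimate, reported back to a sender."""
        if dest == self.name:
            return 0.0
        return min(self.estimate[dest].values())

    def update(self, dest, neighbor, queue_delay, link_delay, reported):
        """Move the estimate toward the newly observed target (assumed rule)."""
        old = self.estimate[dest][neighbor]
        target = queue_delay + link_delay + reported
        self.estimate[dest][neighbor] = old + self.learning_rate * (target - old)


def forward(nodes, current, dest, queue_delay=0.0, link_delay=1.0):
    """Send one packet one hop, then update the sender's estimate
    using only the reply from the neighbor it chose."""
    node = nodes[current]
    nxt = node.best_neighbor(dest)
    reported = nodes[nxt].remaining_time(dest)
    node.update(dest, nxt, queue_delay, link_delay, reported)
    return nxt


# Hypothetical four-node network: A and D are each linked to both B and C.
nodes = {
    "A": Node("A", ["B", "C"]),
    "B": Node("B", ["A", "D"]),
    "C": Node("C", ["A", "D"]),
    "D": Node("D", ["B", "C"]),
}
first_hop = forward(nodes, "A", "D")
second_hop = forward(nodes, "A", "D")
```

In this toy run the first packet from A goes to B (ties are broken by neighbor order), which raises A's estimate for B; the second packet then tries C. Each update uses only the sender's own table and a single value reported by the chosen neighbor, which is the sense in which the scheme relies on local information and local communication.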