=================================================================
SPACE-TIME OPTIMIZER: SEARCH FOR AN EFFECTIVE ALLOCATION OF ROOMS
=================================================================

Radar Proposed SOW for Year 3

Date of last revision: 10/16/2005
Period of performance: 11/1/2005 - 10/31/2006

We describe the main tasks involved in building a new optimizer, and
then discuss the approach to building it, required human resources,
and delivery schedule. This description is an extension to the main
Space-Time SOW at www.cs.cmu.edu/~eugene/Radar/Notes/st-sow.doc.


----------
Main tasks
----------

The work on the optimizer will involve four main tasks: extending the
current representation of conference-planning scenarios, developing
data structures for efficient use of this representation, improving
the optimization algorithm, and adding control knowledge.


-= Domain representation =-

We currently represent a conference-planning scenario by an XML file
that includes a description of available rooms, conference events, and
scheduling constraints. We plan to add five extensions to the current
representation.

() We will extend the representation of uncertain knowledge, based on
the encoding of uncertain values by probability distributions. We plan
to approximate arbitrary probability distributions by collections of
uniform distributions, which will allow efficient optimization and
simplify the learning of default assumptions about the world. We also
plan to build a library of operations on probability distributions,
which will allow deriving statistically accurate conclusions from
uncertain knowledge.

() We will replace piecewise-linear preference functions with numeric
properties of events, such as expected attendance and preferred
duration, and we will use the new representation of uncertain data in
the encoding of event properties. This replacement will simplify the
presentation of data to the user and elicitation of additional data.
For example, displaying an expected attendance of a session is more
user-friendly than displaying the related piecewise-linear preference
function. As another example, asking the user to input an expected
attendance is more effective than asking her to modify a preference
function. Furthermore, the new representation will enable the system
to learn default properties of events. For example, the system may
learn that the usual attendance of regular sessions is between 20 and
40 people; then, it can use this knowledge to choose an appropriate
room size when the user does not specify the related preference.

() We will extend the representation to account for all elements of
the IET scoring function, including room costs, movable equipment,
services, and vendors.

() We will also include all elements related to Aaron Steinfeld's
definition of good schedules, including alignment of events by time and
rooms, reduction in the number of rooms used in the schedule, and
appropriate distribution of events among these rooms.

() We will ensure that the representation is sufficiently expressive
to allow extensions to the scoring function and definition of good
schedules without a significant re-implementation.


-= Data structures =-

We plan to revamp the data structures for representing the conference
planning scenario in the computer memory. The work on new data
structures has three main goals.

() We will build an object-oriented model; that is, we will create
data structures corresponding to the main elements of the scheduling
task, such as events, rooms, days, and pieces of equipment, and use a
semantic network to represent relations among these elements.

() We will ensure that these structures index the schedule data by
events, times, rooms, and other relevant features.

() We will include a mechanism for caching intermediate results of the
optimization, which will help to improve the search efficiency.


-= Optimization algorithms =-

We will extend the hill-climbing optimization algorithm, developed in
September 2005 for the Year 2 tests. Specifically, we will account for
the new domain features, improve the search space, and investigate the
use of randomization, look-ahead, and backtracking for improving its
performance. This work will address the following five goals.

() Developing an optimizer that accounts for all features of the new
representation, such as uncertainty, room costs, movable equipment,
vendors, and alignment of events by time and rooms.

() Improving the efficiency and solution quality, by reducing the
search space and selecting appropriate techniques for exploring it.

() Implementing a very fast version of the optimizer for evaluating
improvements in multiple alternative scenarios, which is required for
the information elicitation.

() Adding control knowledge for guiding the optimization algorithm,
which will potentially allow learning of optimization strategies.

() Investigating techniques for explaining the optimizer's decisions
to the user, which will include displaying the constraints that
determine the placement of specific events, and demonstrating negative
consequences of moving events to other time slots.


-= Control knowledge =-

We will develop mechanisms for encoding control knowledge, which will
allow the system to add missing parts of the world model and guide the
optimization. These mechanisms will support both hand-coding and
automated learning of control knowledge. They will also allow the
system to accept constraints and advice from the user, which is
required for supporting the collaboration with the user. We plan to
implement three control mechanisms.

() Default rules: We will develop a mechanism for making reasonable
assumptions about unspecified resources and constraints. For instance,
the system will assume that regular sessions need projectors, even if
the user does not explicitly ask for them, and that auditoriums are
larger than conference rooms. We will use production rules that derive
default properties of rooms and events from known properties. This
mechanism will provide more flexibility than the current mechanism
based on a type hierarchy. In particular, learning will not be
constrained by a specific hierarchy, and the system will be able to
learn dependencies among properties of rooms and events, such as "the
capacity of a room is usually proportional to its size."

() Elicitation rules: We will add production rules for guiding
elicitation, such as "if the size of an auditorium is unknown, then
ask about it." This mechanism will help to obtain critical information
early in the elicitation process, and reduce the number of questions.

() Control variables: We will add a number of parameters for tuning
the optimization, such as the search type, termination conditions, and
desirable trade-off between the IET score and schedule simplicity.


----------------------
Approach and resources
----------------------

We expect that the development of a new optimizer based on the current
optimization architecture should be more effective than adjusting some
optimization package to the Space-Time task, and that this work will
require 1.5 full-time programmers.


-= Approach =-

We plan to build a new optimizer by extending and partially
reimplementing the current optimizer, which will involve building a
number of new components "from scratch" in C# or C++. We feel that
this approach is more effective than using an off-the-shelf package or
modifying some existing optimizer for solving the conference-planning
problem. We plan to use ideas and algorithms from CMRadar, but we do
not plan to reuse its code. This approach will allow us to build a
system streamlined to the Space-Time representation, make the system
transparent for learning, and simplify its maintenance. The reasons
for avoiding off-the-shelf packages are as follows.

() The current Space-Time optimizer provides a starting point for the
Year 3 work, which may be more effective than using another system as
a starting point. Furthermore, adding the CMRadar techniques to the
current Space-Time optimizer may be more effective than starting with
CMRadar and modifying it for the Space-Time optimization.

() The expressiveness of constraints in off-the-shelf packages is
limited, and representing all domain features in the input language of
an off-the-shelf package may be difficult. Furthermore, general search
techniques may be inefficient for our problem, and the work involved
in adapting a general package to our problem may be comparable to
implementing our own search algorithm. We encountered these problems
when trying to adapt ILOG to solving the office allocation problem.

() To our knowledge, the existing packages do not support reasoning
with uncertainty, which is central to the Space-Time system. Also,
they do not provide sufficient support for adding control knowledge,
generating explanations of their decisions, and storing intermediate
search results, required for learning and collaboration with the user.

We plan to develop the new optimizer in either C# or C++; the final
selection between these two languages will depend on the preference of
the lead programmer, as well as on the integration requirements.


-= Human resources and deliverables =-

We expect that the implementation of all features of the new optimizer
will take a year, and require the work of 1.5 full-time programmers.
We also expect to use additional 0.5 of a programmer for the related
learning and integration work, which will involve helping the graduate
students working on learning, and extending the "plumbing" code that
connects the components of the Space-Time system. This estimate is
consistent with the resources requested in the earlier Statement of
Work for the Space-Time group, which included two programmers and two
graduate students.

We plan to built an initial version of the new optimizer in four
months, and incrementally extend its functionality over the next eight
months; a tentative delivery schedule is as follows.

February 2006: We will develop a new representation of uncertain
knowledge and preferences, make representation fully consistent with
the IET scoring function, and implement new data structures. We will
also implement an optimization algorithm for the new representation.

June 2006: We will add all elements of representation related to Aaron
Steinfeld's definition of good schedules. We will also develop and
evaluate techniques for improving the performance of the optimization
algorithm, and implement a fast version of the optimizer for the use
in the information elicitation.

October 2006: We will add control-knowledge mechanisms and investigate
techniques for explaining the optimizer's decision to the user. This
work may continue after October 2006, during Year 4.