================================================================= SPACE-TIME OPTIMIZER: SEARCH FOR AN EFFECTIVE ALLOCATION OF ROOMS ================================================================= Radar Proposed SOW for Year 3 Date of last revision: 10/16/2005 Period of performance: 11/1/2005 - 10/31/2006 We describe the main tasks involved in building a new optimizer, and then discuss the approach to building it, required human resources, and delivery schedule. This description is an extension to the main Space-Time SOW at www.cs.cmu.edu/~eugene/Radar/Notes/st-sow.doc. ---------- Main tasks ---------- The work on the optimizer will involve four main tasks: extending the current representation of conference-planning scenarios, developing data structures for efficient use of this representation, improving the optimization algorithm, and adding control knowledge. -= Domain representation =- We currently represent a conference-planning scenario by an XML file that includes a description of available rooms, conference events, and scheduling constraints. We plan to add five extensions to the current representation. () We will extend the representation of uncertain knowledge, based on the encoding of uncertain values by probability distributions. We plan to approximate arbitrary probability distributions by collections of uniform distributions, which will allow efficient optimization and simplify the learning of default assumptions about the world. We also plan to build a library of operations on probability distributions, which will allow deriving statistically accurate conclusions from uncertain knowledge. () We will replace piecewise-linear preference functions with numeric properties of events, such as expected attendance and preferred duration, and we will use the new representation of uncertain data in the encoding of event properties. This replacement will simplify the presentation of data to the user and elicitation of additional data. For example, displaying an expected attendance of a session is more user-friendly than displaying the related piecewise-linear preference function. As another example, asking the user to input an expected attendance is more effective than asking her to modify a preference function. Furthermore, the new representation will enable the system to learn default properties of events. For example, the system may learn that the usual attendance of regular sessions is between 20 and 40 people; then, it can use this knowledge to choose an appropriate room size when the user does not specify the related preference. () We will extend the representation to account for all elements of the IET scoring function, including room costs, movable equipment, services, and vendors. () We will also include all elements related to Aaron Steinfeld's definition of good schedules, including alignment of events by time and rooms, reduction in the number of rooms used in the schedule, and appropriate distribution of events among these rooms. () We will ensure that the representation is sufficiently expressive to allow extensions to the scoring function and definition of good schedules without a significant re-implementation. -= Data structures =- We plan to revamp the data structures for representing the conference planning scenario in the computer memory. The work on new data structures has three main goals. () We will build an object-oriented model; that is, we will create data structures corresponding to the main elements of the scheduling task, such as events, rooms, days, and pieces of equipment, and use a semantic network to represent relations among these elements. () We will ensure that these structures index the schedule data by events, times, rooms, and other relevant features. () We will include a mechanism for caching intermediate results of the optimization, which will help to improve the search efficiency. -= Optimization algorithms =- We will extend the hill-climbing optimization algorithm, developed in September 2005 for the Year 2 tests. Specifically, we will account for the new domain features, improve the search space, and investigate the use of randomization, look-ahead, and backtracking for improving its performance. This work will address the following five goals. () Developing an optimizer that accounts for all features of the new representation, such as uncertainty, room costs, movable equipment, vendors, and alignment of events by time and rooms. () Improving the efficiency and solution quality, by reducing the search space and selecting appropriate techniques for exploring it. () Implementing a very fast version of the optimizer for evaluating improvements in multiple alternative scenarios, which is required for the information elicitation. () Adding control knowledge for guiding the optimization algorithm, which will potentially allow learning of optimization strategies. () Investigating techniques for explaining the optimizer's decisions to the user, which will include displaying the constraints that determine the placement of specific events, and demonstrating negative consequences of moving events to other time slots. -= Control knowledge =- We will develop mechanisms for encoding control knowledge, which will allow the system to add missing parts of the world model and guide the optimization. These mechanisms will support both hand-coding and automated learning of control knowledge. They will also allow the system to accept constraints and advice from the user, which is required for supporting the collaboration with the user. We plan to implement three control mechanisms. () Default rules: We will develop a mechanism for making reasonable assumptions about unspecified resources and constraints. For instance, the system will assume that regular sessions need projectors, even if the user does not explicitly ask for them, and that auditoriums are larger than conference rooms. We will use production rules that derive default properties of rooms and events from known properties. This mechanism will provide more flexibility than the current mechanism based on a type hierarchy. In particular, learning will not be constrained by a specific hierarchy, and the system will be able to learn dependencies among properties of rooms and events, such as "the capacity of a room is usually proportional to its size." () Elicitation rules: We will add production rules for guiding elicitation, such as "if the size of an auditorium is unknown, then ask about it." This mechanism will help to obtain critical information early in the elicitation process, and reduce the number of questions. () Control variables: We will add a number of parameters for tuning the optimization, such as the search type, termination conditions, and desirable trade-off between the IET score and schedule simplicity. ---------------------- Approach and resources ---------------------- We expect that the development of a new optimizer based on the current optimization architecture should be more effective than adjusting some optimization package to the Space-Time task, and that this work will require 1.5 full-time programmers. -= Approach =- We plan to build a new optimizer by extending and partially reimplementing the current optimizer, which will involve building a number of new components "from scratch" in C# or C++. We feel that this approach is more effective than using an off-the-shelf package or modifying some existing optimizer for solving the conference-planning problem. We plan to use ideas and algorithms from CMRadar, but we do not plan to reuse its code. This approach will allow us to build a system streamlined to the Space-Time representation, make the system transparent for learning, and simplify its maintenance. The reasons for avoiding off-the-shelf packages are as follows. () The current Space-Time optimizer provides a starting point for the Year 3 work, which may be more effective than using another system as a starting point. Furthermore, adding the CMRadar techniques to the current Space-Time optimizer may be more effective than starting with CMRadar and modifying it for the Space-Time optimization. () The expressiveness of constraints in off-the-shelf packages is limited, and representing all domain features in the input language of an off-the-shelf package may be difficult. Furthermore, general search techniques may be inefficient for our problem, and the work involved in adapting a general package to our problem may be comparable to implementing our own search algorithm. We encountered these problems when trying to adapt ILOG to solving the office allocation problem. () To our knowledge, the existing packages do not support reasoning with uncertainty, which is central to the Space-Time system. Also, they do not provide sufficient support for adding control knowledge, generating explanations of their decisions, and storing intermediate search results, required for learning and collaboration with the user. We plan to develop the new optimizer in either C# or C++; the final selection between these two languages will depend on the preference of the lead programmer, as well as on the integration requirements. -= Human resources and deliverables =- We expect that the implementation of all features of the new optimizer will take a year, and require the work of 1.5 full-time programmers. We also expect to use additional 0.5 of a programmer for the related learning and integration work, which will involve helping the graduate students working on learning, and extending the "plumbing" code that connects the components of the Space-Time system. This estimate is consistent with the resources requested in the earlier Statement of Work for the Space-Time group, which included two programmers and two graduate students. We plan to built an initial version of the new optimizer in four months, and incrementally extend its functionality over the next eight months; a tentative delivery schedule is as follows. February 2006: We will develop a new representation of uncertain knowledge and preferences, make representation fully consistent with the IET scoring function, and implement new data structures. We will also implement an optimization algorithm for the new representation. June 2006: We will add all elements of representation related to Aaron Steinfeld's definition of good schedules. We will also develop and evaluate techniques for improving the performance of the optimization algorithm, and implement a fast version of the optimizer for the use in the information elicitation. October 2006: We will add control-knowledge mechanisms and investigate techniques for explaining the optimizer's decision to the user. This work may continue after October 2006, during Year 4.