One of the characteristics of a database system is that the persistent data is more valuable and longer-lived than the applications that access it. It is natural to expect that, over the lifetime of a database, needs will change, forcing changes to the database schema that must preserve the data stored therein. This task is particularly troublesome in the new generation of database programming languages, where complex type specifications are supported. Even today, with the minimal support provided by available database systems, schema evolution occurs many times in the lifetime of a real database. Improved tools for reasoning about and managing schema evolution will result in greater efficiency and will promote evolutions that programming complexity has discouraged in the past.
I have been concentrating on this problem in the context of Object-Oriented Database Management Systems (OODBMS), where changes to the database schema take the form of modifications to the class (or type) hierarchy. Such changes may alter the representation of objects, the interface to them, or both. I have been developing a general and extensible framework for the management of type evolution and its effects. The framework provides two features not available in other systems: 1) compatibility for old applications, and 2) the ability to install arbitrary changes to the schema and database. The framework is based upon the notions of schema versioning and conformance. Especially important is the novel option of employing explicit programmer authority in the adaptation process, which allows for the installation of arbitrary evolutions and gives more control over the nature and efficiency of the compatibility support.
Compatibility relaxes the normally strong dependency between database and application. Database managers can evolve a database without having to change all client applications synchronously. So long as the old interface is supported, the evolution can be performed freely, with older clients "upgrading" if and when their schedules permit. This decentralization of control is particularly useful in CAD and OIS applications and will become very important as highly distributed computation (e.g., over the so-called Information Superhighway) becomes commonplace.
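The idea of keeping an old interface alive over an evolved representation can be illustrated with a small sketch. This is not the framework's implementation, only a minimal Python illustration under stated assumptions: the `Person` type, its two versions, the `PersonV1View` adapter, and the `VersionRegistry` class are all hypothetical names invented for this example. The adapter stands in for the programmer-supplied adaptation code mentioned above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass
class PersonV2:
    """Current (version 2) representation: the name is stored in two fields."""
    first: str
    last: str


class PersonV1View:
    """Hypothetical programmer-written adapter. It exposes the old version 1
    interface (a single `name` attribute) on top of a version 2 object."""

    def __init__(self, target: PersonV2) -> None:
        self._target = target

    @property
    def name(self) -> str:
        return f"{self._target.first} {self._target.last}"

    @name.setter
    def name(self, value: str) -> None:
        # Splitting on the first space is an assumption made for illustration.
        first, _, last = value.partition(" ")
        self._target.first = first
        self._target.last = last


class VersionRegistry:
    """Maps (type name, schema version) to an adapter over the current version."""

    def __init__(self, current: Dict[str, int]) -> None:
        self._current = current
        self._adapters: Dict[Tuple[str, int], Callable[[Any], Any]] = {}

    def register(self, type_name: str, version: int,
                 adapter: Callable[[Any], Any]) -> None:
        self._adapters[(type_name, version)] = adapter

    def view(self, type_name: str, version: int, obj: Any) -> Any:
        # Clients compiled against the current version see the object directly;
        # older clients see it through the registered adapter.
        if version == self._current[type_name]:
            return obj
        return self._adapters[(type_name, version)](obj)


registry = VersionRegistry(current={"Person": 2})
registry.register("Person", 1, PersonV1View)

stored = PersonV2("Ada", "Lovelace")
old_client = registry.view("Person", 1, stored)
print(old_client.name)  # prints "Ada Lovelace"
```

In this sketch the stored object exists only in its newest form, and an old client's reads and writes are translated on each access. Other designs, such as keeping several representations or converting objects lazily, trade efficiency against storage differently. The sketch does not indicate which of these the framework uses.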
This work has led to some interesting spin-offs. The mechanisms introduced to support this framework are general enough to help support other advanced database features, such as schema integration, replication, and data migration. The revised data model for evolution has also suggested new ideas about the nature of subtyping and conformance.