StagedDB/CMP Project
Welcome
This large, long-term project introduces a revolutionary staged design for high-performance,
evolvable DBMS that are easy to tune and maintain. We break the database system into modules
and encapsulate them into self-contained stages connected to each other through queues.
With the advent of highly parallel chip multiprocessors, database system designers are called to
revisit their designs. We study the performance of commercial database systems on evolving computer
architectures, and we believe that conventional database designs are inherently limited in how well
they can perform in such environments. The different approach taken by staged database designs, on
the other hand, makes them better suited to high performance in the new computing landscape.
This work is supported in part by the project "III-COR: Staged Database Systems: Maximizing Locality through Service-based Data Management" (National Science Foundation Award #0713409).
We recently released Shore-MT, a scalable multi-threaded port of the Shore storage manager. It can be downloaded from here.
Current Focus
Our work focuses on four directions:
- Study the performance of database systems running OLTP and DSS workloads on emerging
  hardware, such as highly parallel chip multiprocessors.
- Build a staged relational query engine that can optimally manage available disk bandwidth,
  RAM, and CPU cycles across multiple concurrent queries, and provide a significant performance
  boost over conventional query engines.
- Apply the Staged DB design, coupled with smart scheduling, to online transaction processing (OLTP)
  engines in order to optimize both instruction and data cache (processor cache) performance,
  as well as to improve the intra-transaction parallelism in those workloads.
- Optimize chip multiprocessors for commercial workloads, especially database applications such as
  online transaction processing (OLTP) and decision support systems (DSS).
More in "Goals"...
Staged Database Systems
Our group proposed the use of staging for database systems.
According to the Staged Database System design, the previously monolithic database system is
decomposed into a set of stages. Each stage has its own queue and
thread support. New queries queue up in the first stage, are encapsulated into a "packet",
and pass through the five stages shown at the top of the figure below. A packet carries
the query’s "backpack": its state and private data. Inside the execution engine, a query
can issue multiple packets to increase parallelism.
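
The queue-and-thread structure described above can be illustrated with a minimal Python sketch. It is not taken from the project's code base: the class names (Packet, Stage), the run_pipeline helper, and the five stage names are assumptions made for illustration, since the figure that names the stages is not reproduced here. Each stage owns one queue and one worker thread, and a packet carries the query's state from stage to stage.

```python
import queue
import threading
from dataclasses import dataclass, field

# Sentinel object that tells a stage's worker thread to shut down.
_STOP = object()


@dataclass
class Packet:
    """A query's "backpack": its state and private data (hypothetical layout)."""
    query: str
    state: dict = field(default_factory=dict)
    trace: list = field(default_factory=list)  # names of the stages visited so far


class Stage:
    """A self-contained stage with its own queue and its own worker thread."""

    def __init__(self, name, next_stage, finished):
        self.name = name
        self.next_stage = next_stage  # None for the last stage
        self.finished = finished      # queue that collects completed packets
        self.inbox = queue.Queue()
        self.thread = threading.Thread(target=self._run, daemon=True)

    def enqueue(self, packet):
        self.inbox.put(packet)

    def _run(self):
        while True:
            packet = self.inbox.get()
            if packet is _STOP:
                # Forward the shutdown signal so downstream stages drain and exit.
                if self.next_stage is not None:
                    self.next_stage.enqueue(_STOP)
                return
            # Placeholder for the stage's real work: record the visit only.
            packet.trace.append(self.name)
            if self.next_stage is not None:
                self.next_stage.enqueue(packet)
            else:
                self.finished.put(packet)


def run_pipeline(queries,
                 stage_names=("connect", "parse", "optimize", "execute", "disconnect")):
    """Push each query through all stages and return the completed packets."""
    finished = queue.Queue()
    stages, next_stage = [], None
    for name in reversed(stage_names):  # build back to front to link the stages
        next_stage = Stage(name, next_stage, finished)
        stages.append(next_stage)
    stages.reverse()
    for stage in stages:
        stage.thread.start()
    for text in queries:
        stages[0].enqueue(Packet(query=text))
    stages[0].enqueue(_STOP)
    for stage in stages:
        stage.thread.join()
    results = []
    while not finished.empty():
        results.append(finished.get())
    return results


if __name__ == "__main__":
    for packet in run_pipeline(["SELECT 1", "SELECT 2"]):
        print(packet.query, "->", " > ".join(packet.trace))
```

The sketch uses one thread per stage and one packet per query for brevity. A fuller model would let the execution stage issue several packets for a single query, and would give each stage a scheduling policy for deciding which queued packet to serve next.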
There are multiple research problems associated with this new database system architecture,
ranging from optimizing hardware resource usage, to job queueing and scheduling under multiple
constraints, to multi-query processing and optimization.