With the growing compute power of modern multi-core systems and the increasing amount of data accessed by many applications, the memory channel will become an even more critical bottleneck for both system performance and energy efficiency. Existing systems use memory only as a data store, a place where data is written and later read back. However, this approach requires large amounts of data to be transferred over the memory channel even for simple operations such as bulk data copy or initialization. As a result, such operations incur high latency and consume significant memory bandwidth and energy.
This thesis presents a series of mechanisms that will allow the processor to perform certain bulk data operations completely within DRAM, thereby eliminating the need to transfer large amounts of data over the memory channel. To keep the cost of DRAM low, our mechanisms aim to exploit the organization and operation of DRAM as much as possible. As a preliminary step, we have already developed and evaluated a mechanism (RowClone) for performing bulk data copy and initialization completely within DRAM.
This thesis proposal consists of two parts. The first part aims to explore performing other bulk data operations (e.g., gather-scatter, randomization) within DRAM. The second part aims to improve the hardware support for bulk data copy and initialization, further improving performance and energy efficiency compared to RowClone. If successful, we expect the mechanisms proposed by this thesis to significantly advance the state of the art for many important applications.
Todd Mowry (Co-Chair)
Onur Mutlu (Co-Chair)
Rajeev Balasubramonian (University of Utah)
deb [atsymbol] cs.cmu.edu