Operation Chaining Asynchronous Pipelined Circuits

 

In ICCAD

Girish Venkataramani and Seth Copen Goldstein

November, 2007

Abstract

We define operation chaining (op-chaining) as an optimization problem to determine the optimal pipeline depth for balancing performance against energy demands in pipelined asynchronous designs. Since there are no clock period requirements, asynchronous pipeline stages can have non-uniform latencies. We exploit this fact to coalesce several stages together thereby saving power and area due to the elimination of control-path resources from the pipeline. The trade-off is potentially reduced pipeline parallelism.

In this paper, we formally define this optimization as a graph covering problem, which finds sub-graphs that will be synthesized as an opchained pipeline stage. We then define the solution space for provably correct solutions and present an algorithm to efficiently search this space. The search technique partitions the graph based on post-dominator relationships to find sub-graphs that are potential op-chain candidates. We use knowledge of the Global Critical Path (GCP) [13] to evaluate the performance impact of accepting a candidate sub-graph and formulate a heuristic cost function to model this trade-off. The algorithm has a quadratic-time complexity in the size of the dataflow graph. We have implemented this algorithm within an automated asynchronous synthesis toolchain [12]. Experimental evidence from applying the algorithm on several media processing kernels reveals that the average energy-delay and energy-delay-area products improve by about 1.4x and 1.8x respectively, with a maximum improvement of 5x and 18x.


download pdf


@inproceedings{venkataramani-iccad07,
  author = {Venkataramani, Girish and Goldstein, Seth Copen},
  title = {Operation Chaining Asynchronous Pipelined Circuits},
  booktitle = {ICCAD},
  abstract = {We define operation chaining (op-chaining) as an
     optimization problem to determine the optimal pipeline depth for
     balancing performance against energy demands in pipelined
     asynchronous designs. Since there are no clock period
     requirements, asynchronous pipeline stages can have non-uniform
     latencies. We exploit this fact to coalesce several stages
     together thereby saving power and area due to the elimination of
     control-path resources from the pipeline. The trade-off is
     potentially reduced pipeline parallelism. 

In this paper, we formally define this optimization as a graph covering problem, which finds sub-graphs that will be synthesized as an opchained pipeline stage. We then define the solution space for provably correct solutions and present an algorithm to efficiently search this space. The search technique partitions the graph based on post-dominator relationships to find sub-graphs that are potential op-chain candidates. We use knowledge of the Global Critical Path (GCP) [13] to evaluate the performance impact of accepting a candidate sub-graph and formulate a heuristic cost function to model this trade-off. The algorithm has a quadratic-time complexity in the size of the dataflow graph. We have implemented this algorithm within an automated asynchronous synthesis toolchain [12]. Experimental evidence from applying the algorithm on several media processing kernels reveals that the average energy-delay and energy-delay-area products improve by about 1.4x and 1.8x respectively, with a maximum improvement of 5x and 18x.}, month = {November}, year = {2007}, url = {http://www.cs.cmu.edu/~seth/papers/venkataramani-iccad07.pdf}, keywords = {Asychronous Circuits, CAD, Global Critical Path} }


Related Papers

Asychronous Circuits
Heterogeneous Latch-Based Asynchronous Pipelines
Girish Venkataramani, Tiberiu Chelcea, and Seth Copen Goldstein. Asynchronous Circuits and Systems, International Symposium on, pages 83–92, 2008.
Slack Analysis in the System Design Loop
Girish Venkataramani and Seth Copen Goldstein. In IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES-ISSS), pages 231–236, October, 2008.
Area Optimizations for Dual-Rail Circuits Using Relative-Timing Analysis
Tiberiu Chelcea, Girish Venkataramani, and Seth Copen Goldstein. In Proceedings of the 13th IEEE International Symposium on Asynchronous Circuits and Systems, pages 117–128, March, 2007.
Global Critical Path: A Tool for System-Level Timing Analysis
Girish Venkataramani, Mihai Budiu, Tiberiu Chelcea, and Seth Copen Goldstein. In Proceedings of the 44th ACM/IEEE Design Automation Conference, pages 783–786, June, 2007.
Operation Chaining Asynchronous Pipelined Circuits
Girish Venkataramani and Seth Copen Goldstein. In ICCAD, November, 2007.
Self-Resetting Latches for Asynchronous Micro-Pipelines
Tiberiu Chelcea, Girish Venkataramani, and Seth Copen Goldstein. In Proceedings of the 44th ACM/IEEE Design Automation Conference, pages 986–989, June, 2007.
Hardware Compilation of Application-Specific Memory Access Interconnect
Girish Venkataramani, Tobias Bjerregaard, Tiberiu Chelcea, and Seth Copen Goldstein. IEEE Transactions on Computer Aided Design of Integrated Circuits and Systems, 25(5):756–771,2006.
Leveraging Protocol Knowledge in Slack Matching
Girish Venkataramani and Seth Copen Goldstein. In IEEE/ACM International Conference on Computer-Aided Design (ICCAD), November, 2006.
Modeling the Global Critical Path in Concurrent Systems
Girish Venkataramani, Tiberiu Chelcea, Mihai Budiu, and Seth Copen Goldstein. Carnegie Mellon University Technical Report No. CMU-CS-06-144, August, 2006.
Tartan: Evaluating Spatial Computation for Whole Program Execution
Mahim Mishra, Timothy J Callahan, Tiberiu Chelcea, Girish Venkataramani, Mihai Budiu, and Seth Copen Goldstein. In 12th ACM International Conference on Architecture Support for Programming Languages and Operating Systems (ASPLOS), pages 163–174, October, 2006.
Adding Faster with Application Specific Early Termination
David Ryan Koes, Tiberiu Chelcea, Charles Onyeama, and Seth Copen Goldstein. Carnegie Mellon University Technical Report No. CMU-CS-05-101, pages 20, May, 2005.
SOMA: A Tool for Synthesizing and Optimizing Memory Accesses in ASICs
Girish Venkataramani, Tobias Bjerregaard, Tiberiu Chelcea, and Seth Copen Goldstein. In IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES-ISSS), pages 231–236, September, 2005.
HLS Support for Unconstrained Memory Accesses
Girish Venkataramani, Tiberiu Chelcea, and Seth Copen Goldstein. In IEEE 14th International Workshop on Logic Synthesis (IWLS), June, 2005.
Spatial Computation
Mihai Budiu, Girish Venkataramani, Tiberiu Chelcea, and Seth Copen Goldstein. In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 14–26, October, 2004.
Translating ANSI C to Asynchronous Circuits
Mihai Budiu, Girish Venkataramani, Tiberiu Chelcea, and Seth Copen Goldstein. In 10th IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC '04), April, 2004.
C to Asynchronous Dataflow Circuits: An End-to-End Toolflow
Girish Venkataramani, Mihai Budiu, Tiberiu Chelcea, and Seth Copen Goldstein. In IEEE 13th International Workshop on Logic Synthesis (IWLS), June, 2004.
Molecules, Gates, Circuits, Computer
Seth Copen Goldstein and Mihai Budiu. In Molecular Nanoelectronics,, January, 2003.
Global Critical Path
Slack Analysis in the System Design Loop
Girish Venkataramani and Seth Copen Goldstein. In IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES-ISSS), pages 231–236, October, 2008.
Global Critical Path: A Tool for System-Level Timing Analysis
Girish Venkataramani, Mihai Budiu, Tiberiu Chelcea, and Seth Copen Goldstein. In Proceedings of the 44th ACM/IEEE Design Automation Conference, pages 783–786, June, 2007.
Operation Chaining Asynchronous Pipelined Circuits
Girish Venkataramani and Seth Copen Goldstein. In ICCAD, November, 2007.
Leveraging Protocol Knowledge in Slack Matching
Girish Venkataramani and Seth Copen Goldstein. In IEEE/ACM International Conference on Computer-Aided Design (ICCAD), November, 2006.
Modeling the Global Critical Path in Concurrent Systems
Girish Venkataramani, Tiberiu Chelcea, Mihai Budiu, and Seth Copen Goldstein. Carnegie Mellon University Technical Report No. CMU-CS-06-144, August, 2006.
CAD
Slack Analysis in the System Design Loop
Girish Venkataramani and Seth Copen Goldstein. In IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES-ISSS), pages 231–236, October, 2008.
Area Optimizations for Dual-Rail Circuits Using Relative-Timing Analysis
Tiberiu Chelcea, Girish Venkataramani, and Seth Copen Goldstein. In Proceedings of the 13th IEEE International Symposium on Asynchronous Circuits and Systems, pages 117–128, March, 2007.
Global Critical Path: A Tool for System-Level Timing Analysis
Girish Venkataramani, Mihai Budiu, Tiberiu Chelcea, and Seth Copen Goldstein. In Proceedings of the 44th ACM/IEEE Design Automation Conference, pages 783–786, June, 2007.
Operation Chaining Asynchronous Pipelined Circuits
Girish Venkataramani and Seth Copen Goldstein. In ICCAD, November, 2007.
Leveraging Protocol Knowledge in Slack Matching
Girish Venkataramani and Seth Copen Goldstein. In IEEE/ACM International Conference on Computer-Aided Design (ICCAD), November, 2006.
Modeling the Global Critical Path in Concurrent Systems
Girish Venkataramani, Tiberiu Chelcea, Mihai Budiu, and Seth Copen Goldstein. Carnegie Mellon University Technical Report No. CMU-CS-06-144, August, 2006.
SOMA: A Tool for Synthesizing and Optimizing Memory Accesses in ASICs
Girish Venkataramani, Tobias Bjerregaard, Tiberiu Chelcea, and Seth Copen Goldstein. In IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES-ISSS), pages 231–236, September, 2005.
Translating ANSI C to Asynchronous Circuits
Mihai Budiu, Girish Venkataramani, Tiberiu Chelcea, and Seth Copen Goldstein. In 10th IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC '04), April, 2004.
C to Asynchronous Dataflow Circuits: An End-to-End Toolflow
Girish Venkataramani, Mihai Budiu, Tiberiu Chelcea, and Seth Copen Goldstein. In IEEE 13th International Workshop on Logic Synthesis (IWLS), June, 2004.
Molecules, Gates, Circuits, Computer
Seth Copen Goldstein and Mihai Budiu. In Molecular Nanoelectronics,, January, 2003.
MolSpice: Designing Molecular Logic Circuits
Seth Copen Goldstein, James Ellenbogen, David Almassiam, Matt Brown, Mark Cannarsa, Jesse Klein, Schuyler Schell, Geoff Washburn, and Matthew M Ziegler. In Ninth Foresight Conference on Molecular Nanotechnology, November, 2001.
Static Profile-driven Compilation for FPGAs
Srihari Cadambi and Seth Copen Goldstein. In Proceedings of the 11th International Conference on Field-Programmable Logic and Applications, August, 2001.
BitValue Inference: Detecting and Exploiting Narrow Bitwidth Computations
Mihai Budiu and Seth Copen Goldstein. Carnegie Mellon University Technical Report, June, 2000. See budiu-europar00.
Efficient Place and Route for Pipeline Reconfigurable Architectures
Srihari Cadambi and Seth Copen Goldstein. In ICCD '00, September, 2000.
BitValue Inference: Detecting and Exploiting Narrow Bitwidth Computations
Mihai Budiu, Majd Sakr, Kevin Walker, and Seth Copen Goldstein. In Proceedings of the 2000 Europar Conference, volume 1900, pages 969–979,August, 2000. Also appeared as CMU CS Technical Report, CMU-CS-00-141, October 2000..
CPR: A Configuration Profiling Tool
Srihari Cadambi and Seth Copen Goldstein. In 7th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM '99), pages 104, April, 1999.


Back to publications list