Computer Science Speaking Skills Talk

  • Remote Access Enabled
  • Virtual Presentation
  • Ph.D. Student
  • Computer Science Department
  • Carnegie Mellon University

Progressive Compressed Records: Taking a Byte out of Deep Learning Data

Deep learning training accesses vast amounts of data at high velocity, posing bandwidth challenges for datasets retrieved over commodity networks and storage devices. A common approach to reducing bandwidth is to resize or compress data prior to training. We introduce a way to dynamically reduce the overhead of fetching and transporting data with a method we term Progressive Compressed Records (PCRs). PCRs deviate from previous storage formats by combining progressive compression with an efficient on-disk layout to view a single dataset at multiple fidelities, all without adding to the total dataset size. We show that the amount of compression a dataset can tolerate depends on the training task at hand. We then show that PCRs can enable tasks to readily access appropriate levels of compression at runtime, resulting in a 2x speedup in training time on average.
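To make the layout idea concrete, here is a minimal sketch of one way a single file could serve several fidelities without duplicating data. It is an illustration under stated assumptions, not the format presented in the talk: it assumes each image has already been split into ordered progressive "scans" (placeholder byte strings here, not real progressive image data), and that scans are stored grouped by level so that a prefix of the file holds every image at a lower fidelity. The names pack_record, prefix_size, and read_at_fidelity are hypothetical.

```python
from typing import List, Tuple


def pack_record(scans_per_image: List[List[bytes]]) -> Tuple[bytes, List[List[int]]]:
    """Store scans grouped by level: all level-0 scans, then all level-1 scans, ...

    Returns the packed bytes plus, for each level, the byte length of every
    image's scan at that level (a tiny index needed to split the data again).
    """
    num_levels = len(scans_per_image[0])
    if any(len(scans) != num_levels for scans in scans_per_image):
        raise ValueError("every image must have the same number of scans")

    blob = bytearray()
    lengths: List[List[int]] = []
    for level in range(num_levels):
        level_lengths = []
        for scans in scans_per_image:
            blob += scans[level]
            level_lengths.append(len(scans[level]))
        lengths.append(level_lengths)
    return bytes(blob), lengths


def prefix_size(lengths: List[List[int]], fidelity: int) -> int:
    """Number of bytes that must be read to get the first `fidelity` levels."""
    return sum(sum(level_lengths) for level_lengths in lengths[:fidelity])


def read_at_fidelity(blob: bytes, lengths: List[List[int]], fidelity: int) -> List[bytes]:
    """Read only a prefix of the record and regroup its bytes per image."""
    prefix = blob[:prefix_size(lengths, fidelity)]
    images = [bytearray() for _ in lengths[0]]
    pos = 0
    for level_lengths in lengths[:fidelity]:
        for i, n in enumerate(level_lengths):
            images[i] += prefix[pos:pos + n]
            pos += n
    return [bytes(image) for image in images]


# Two images with three placeholder scans each (assumed, not real image data).
scans = [[b"A0", b"A1a", b"A2"], [b"B0x", b"B1", b"B2yy"]]
blob, lengths = pack_record(scans)
low = read_at_fidelity(blob, lengths, 1)   # reads only the first 5 bytes
full = read_at_fidelity(blob, lengths, 3)  # reads the whole record
```

In this sketch the packed record is exactly as large as the scans it contains, and a lower-fidelity read touches one contiguous prefix rather than a separate resized copy of the dataset.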

Presented in Partial Fulfillment of the CSD Speaking Skills Requirement.

Remote Participation Enabled. See announcement for registration details.

For More Information, Please Contact: