Light-weight In-situ Analysis with Frugal Resource Usage

Friday, November 10th, 2017 from 12-1 pm in GHC 6501.

Qing Zheng, CSD

In this talk, Qing presents Parallel Logging DB (PLDB), a new in-situ analysis technique for indexing data within DeltaFS. Designed as a scalable, server-less file system for HPC platforms, DeltaFS scales file system metadata performance with application scale. PLDB is a novel extension to the DeltaFS data plane that enables in-situ indexing of massive amounts of data written simultaneously to a single DeltaFS directory, across an arbitrarily large number of files. PLDB achieves this through a compaction-free mechanism for reordering and indexing data and a log-structured storage layout that packs small writes into large log objects, all while using compute node resources frugally. We demonstrate the efficiency of PLDB with VPIC, a widely used simulation code developed at Los Alamos National Laboratory that scales to trillions of particles. With DeltaFS, we modify VPIC to create one file per particle under a special directory; each file receives the writes of that particle's output data. Dynamically indexing the directory's underlying storage with PLDB yields a 5,000x speedup in VPIC particle trajectory queries, which require reading all data for a single particle. This speedup increases with simulation scale, while the overhead is fixed at 3% of available memory and 8% of final storage.

General Info

The Student Seminar Series is an informal research seminar by and for SCS graduate students from noon to 1 pm on Tuesdays and Fridays. Lunch is provided by the Computer Science Department (personal thanks to Sharon Burks and Debbie Cavlovich!). At each meeting, a different student speaker will give an informal, 40-minute talk about his/her research, followed by questions/suggestions/brainstorming. We try to attract people with a diverse set of interests, and encourage speakers to present at a very general, accessible level.

So why are we doing this, and why take part? In the best-case scenario, this will lead to some interesting cross-disciplinary work among people in different fields, and people may get some new ideas about their research. In the worst-case scenario, a few people will practice their public speaking and the rest will get together for a free lunch.

Guideline & Speaking Requirement Need-to-Know

Note: Step #1 below applies to all SSS speakers. You can schedule AT MOST THREE talks per semester.

SSS is an ideal forum for SCS students to give presentations that count toward fulfilling their speaking requirements. The specifics, though, vary with each department. For instance, students in CSD will need to be familiar with the notes in Section 8 of the Ph.D. document and follow the instructions outlined on the Speakers Club homepage. Roughly speaking, these are the steps:

  1. Schedule a talk with SSS by sending your talk title, abstract, additional info (like "Joint work with..." or "In Partial Fulfillment of the Speaking Requirement"), and a picture of yourself (preferably jpeg) to sss@cs at least TWO WEEKS before your scheduled talk.
  2. After your SSS slot is confirmed, go to the Speakers Club Calendar and schedule your talk at least THREE WEEKS in advance of the talk date.
  3. On the day of your talk, make sure you print Speakers Club evaluation forms for your evaluators to use.

Students outside of CSD will need to check with their respective departments regarding the procedure. As another example, ISRI students fulfill their speaking requirements by attending a Software Research Seminar each semester and giving X number of presentations per school year. If you have experience with your department's requirements that might help other students there, please feel free to contribute your knowledge by emailing us. Thank you!

SSS Coordinators

Armaghan Naik, Computational Biology
Lin Xiao, CSD
Qing Zheng, CSD


