Light-weight In-situ Analysis with Frugal Resource Usage
Friday, November 10th, 2017 from 12-1 pm in GHC 6501.
In this talk Qing presents Parallel Logging DB (PLDB), a new in-situ analysis technique for indexing data within DeltaFS. With its design as a scalable, server-less file system for HPC platforms, DeltaFS scales file system metadata performance with application scale. The new PLDB is a novel extension to the DeltaFS data plane, enabling in-situ indexing of massive amounts of data written to a single DeltaFS directory simultaneously, and in an arbitrarily large number of files. PLDB achieves this through a compaction-free indexing mechanism for reordering and indexing data, and a log-structured storage layout to pack small writes into large log objects, all while ensuring compute node resources are used frugally. We demonstrate the efficiency of our PLDB through VPIC, a widely-used simulation code developed at Los Alamos National Lab that scales to trillions of particles. With DeltaFS, we modify VPIC to create a file under a special directory for each particle to receive writes of that particle's output data. Dynamically indexing the directory's underlying storage using PLDB allows us to achieve a 5,000x speedup in VPIC particle trajectory queries, which require reading all data for a single particle. This speedup increases with simulation scale, while the overhead is fixed at 3% of available memory and 8% of final storage.
The Student Seminar Series is an informal research seminar by and for SCS graduate students from noon to 1 pm on Tuesdays and Fridays. Lunch is provided by the Computer Science Department (personal thanks to Sharon Burks and Debbie Cavlovich!). At each meeting, a different student speaker will give an informal, 40-minute talk about his/her research, followed by questions/suggestions/brainstorming. We try to attract people with a diverse set of interests, and encourage speakers to present at a very general, accessible level.
So why are we doing this and why take part? In the best case scenario, this will lead to some interesting cross-disciplinary work among people in different fields and people may get some new ideas about their research. In the worst case scenario, a few people will practice their public speaking and the rest get together for a free lunch.
Note: Step #1 below are applicable to all SSS speakers. You can schedule AT MOST THREE talks per semester.
SSS is an ideal forum for SCS students to give presentations that count toward fulfilling their speaking requirements. The specifics, though, vary with each department. For instance, students in CSD will need to be familiar with the notes in Section 8 of the Ph.D. document and follow the instructions outlined on the Speakers Club homepage. Roughly speaking, these are the steps:
Armaghan Naik, Computational Biology