Summingbird, an open-source project recently released by Twitter, lets engineers build data processing pipelines that run both in a streaming context, provided by Twitter Storm, and in an offline batch context, through Apache Hadoop. This talk will cover the practical motivation for building such a system and explain the core Summingbird architecture and components.
Dmitriy Ryaboy (@squarecog) manages the Twitter Analytics Infrastructure team. He's previously worked at Cloudera, Ask.com, and Lawrence Berkeley National Laboratory. He holds a Master's degree in VLIS from CMU and a Bachelor's in EECS from UC Berkeley.
Faculty Host: Andy Pavlo
jennsbl [atsymbol] cs.cmu.edu