The “P” in AMPLab stands for “People,” and an important research thrust in the lab was integrating human processing into analytics pipelines. Starting with the CrowdDB project on human-powered query answering and continuing into the more recent SampleClean and AMPCrowd/Clamshell projects, we have been investigating ways to maximize the benefit of involving people in data collection, data cleaning, and query answering. In this talk I will present an overview of these projects and discuss some future directions for hybrid cloud/crowd data-intensive applications and systems.
Michael J. Franklin is the Liew Family Chair of Computer Science and Sr. Advisor to the Provost for Computation and Data at the University of Chicago, where his research focuses on database systems, data analytics, data management, and distributed computing systems. Franklin previously was the Thomas M. Siebel Professor and chair of the Computer Science Division of the EECS Department at the University of California, Berkeley. He co-founded and directed Berkeley’s Algorithms, Machines and People Laboratory (AMPLab), which created industry-changing open source Big Data software such as Apache Spark and BDAS, the Berkeley Data Analytics Stack. At Berkeley he also served as an executive committee member for the Berkeley Institute for Data Science. He currently serves as a Board Member of the Computing Research Association and on the NSF CISE Advisory Committee.
Franklin is an ACM Fellow and a two-time recipient of the ACM SIGMOD “Test of Time” award. His other honors include the Outstanding Advisor award from Berkeley’s Computer Science Graduate Student Association, and the “Best Gong Show Talk” personally awarded by Andy Pavlo at this year’s CIDR conference.