Also appointed at: U. Pittsburgh
School of Medicine, Computational and Systems Biology (adjunct)
Gates-Hillman Complex 8103. Phone 412-268-7678
Assistant: Christina Melucci, GHC
Mailing address: Carnegie Mellon / Computer Science, 5000 Forbes
Ave., Pgh PA 15213
Short CV | Long CV
I am on sabbatical in Israel through the end of June 2018, so I am not able to meet or take
on any new engagements.
Do you have a question about 10601A/C for Fall 2018? Please read this.
Welcome! My interests:
- Forecasting Epidemics: The long-term vision of our Delphi research group is to
make epidemiological forecasting as universally accepted and useful as
weather forecasting is today. As was the case with weather forecasting, this
will likely take a long time. In the shorter term, we
select high-value epidemiological forecasting targets (currently Influenza
and Dengue); create baseline forecasting methods for them; establish
metrics for measuring and tracking forecasting accuracy; estimate the
limits of forecastability for each target; and
identify new sources of data that could be helpful for forecasting.
Forecasting challenges: We have participated, and done very well, in all epidemiological
forecasting challenges organized by the U.S. government to date: Influenza
2013-2014 (CDC); Chikungunya 2015 (DARPA); Dengue 2009-2014 (White House OSTP);
Influenza 2014-2015 (CDC, winner); Influenza 2015-2016 (CDC, winner); Influenza 2016-2017 (CDC, winner).
  - Try our operational, geographically detailed, real-time flu nowcasting.
  - Try our operational, weekly updated flu forecasting.
We are part of the multi-university MIDAS research group.
2016: CDC has just named us “Most Accurate
Forecaster” for 2015-2016.
- Information and Communication Technologies for Development (ICT4D),
and specifically Spoken Language Technologies for Development (SLT4D), which
is the term we coined for our own subfield of ICT4D: finding ways to use
spoken language technologies (like automatic speech recognition, speech
synthesis, and human-machine dialog systems) to aid socio-economic
development around the world.
Our current project, Polly, uses telephone-based viral
entertainment to reach low-literate people in Pakistan and India, familiarizing
them with speech interfaces and then introducing them to development-related
services. First deployed in Lahore in
May 2012, Polly reached over 165,000 users all over Pakistan and fielded over
2.5 million phone calls in 8 months. In
2013 we launched Polly in Bangalore, India, and it ended up spreading virally
to West Bengal, New Delhi and other areas of India. In March 2015 we deployed Polly in Guinea,
for person-to-person spreading of approved public health messages about Ebola
in many languages, in collaboration with the US embassy in Conakry. In 2016, in collaboration with Information
Technology University (Lahore), we launched two new services in Pakistan: Baang, a
voice-based Reddit, and Sawaal, a voice-based quiz game.
A previous project, HealthLine,
investigated the use of a telephone-based automated dialog system for access to
healthcare information by low-literate community health workers in Pakistan.
- Machine Learning for Social Good
(ML4SG). I continuously seek problems in non-profits
and government organizations, domestically and abroad, which can benefit
from machine learning solutions, and match them with suitable teams of
students and supervising faculty.
If your organization could use free machine learning or data science
expertise to help improve its societal impact, please contact me. The best cases are those where the potential
for societal impact is evident, the questions are well defined, and
significant relevant data is available.
Otherwise, I can work with you to get your problem ready for our
students. This initiative benefits
from a generous gift from Uptake (thanks guys!).
- Data Numeracy for
All. I believe that universal data
numeracy is as important in the 21st century as universal
literacy was in the 20th.
We need to increase the understanding of (and comfort with) data in
all segments of society. I am
interested in devising effective ways of doing that.
Students (department, topic): Logan Brooks (CSD, Epi-forecasting), Amanda Coston (MLD and Heinz, ML4SG), Aaron Rumack (MLD, Epi-forecasting), Lisheng
Gao (MLD, Epi-forecasting), Zirui (Edward) Wang (CS, Epi-forecasting).
Past students: David Farrow (CompBio,
viral evolution + Epi-forecasting), Ali
ICT4D), Chuang Wu (CompBio,
viral genotype-phenotype mapping), Jahanzeb Sherwani (CSD,
Yong Lu (CSD, CompBio), Dan Bohus (CSD,
dialog systems), Stefanie Tomko
(LTI, speech communication), Jerry
(Xiaojin) Zhu (LTI, MLD, semi-supervised learning),
Chase (RI, speech recognition).
Past Post-docs: Andy Walsh (computational
virology), Xiaojin Wang
(machine learning), Stan F. Chen (language modeling), Pierre DuPont (language modeling).
My favorite quotes.