Also appointed at: U. Pittsburgh
School of Medicine, Computational and Systems Biology (adjunct)
Gates-Hillman Complex 8103. Phone 412-268-7678
Assistant: Christina Melucci, GHC
Mailing address: Carnegie Mellon / Computer Science, 5000 Forbes
Ave., Pgh PA 15213
Short CV | Long CV
Do you have a question about 10601A/C for Fall 2017? Please read this before emailing.
Welcome! My interests:
- Forecasting Epidemics: The long-term vision of our Delphi research group is to
make epidemiological forecasting as universally accepted and useful as
weather forecasting is today. As
was the case with weather forecasting, this will likely take a long
time. In the shorter term, we
select high-value epidemiological forecasting targets (currently Influenza
and Dengue); create baseline forecasting methods for them; establish
metrics for measuring and tracking forecasting accuracy; estimate the
limits of forecastability for each target; and identify new sources of
data that could be helpful to the forecasting goal.
  - Epi-Forecasting challenges: We have
participated, and done very well, in all epidemiological forecasting challenges
organized by the U.S. government to date: Influenza 2013-2014 (CDC);
Chikungunya 2015 (DARPA); Dengue 2009-2014 (White House OSTP); Influenza
2014-2015 (CDC, winner); Influenza 2015-2016 (CDC, winner); Influenza 2016-2017 (CDC, winner).
  - Try our operational, geographically
detailed, real-time flu nowcasting service.
  - Try our operational, weekly updated flu
are part of the multi-university MIDAS
  - December 2016: CDC has just named us “Most Accurate
Forecaster” for 2015-2016.
- Information and Communication Technologies for Development (ICT4D),
and specifically Spoken Language Technologies for Development (SLT4D), which
is the term we coined for our own subfield of ICT4D: finding ways to use
spoken language technologies (like automatic speech recognition, speech
synthesis, and human-machine dialog systems) to aid socio-economic
development around the world.
Our current project, Polly, uses
telephone-based viral entertainment to reach low-literate people in Pakistan
and India, familiarizing them with speech interfaces and then introducing them
to development-related services. First deployed
in Lahore in May 2012, Polly reached over 165,000 users all over Pakistan and
fielded over 2.5 million phone calls in 8 months. In 2013 we launched Polly in Bangalore,
India, and it ended up spreading virally to West Bengal, New Delhi and other areas
of India. In March 2015 we deployed
Polly in Guinea, for person-to-person spreading of approved Public Health
messages about Ebola in many languages, in collaboration with the US embassy in
Conakry. In 2016, in collaboration with
Information Technology University (Lahore) we launched two new services in
Pakistan: Baang, a voice-based
Reddit, and Sawaal, a voice-based
Our previous project, HealthLine,
investigated the use of a telephone-based automated dialog system for access to
healthcare information by low-literate community health workers in Pakistan.
- Machine Learning for Social Good
(ML4SG). I continuously seek problems in
non-profits and government organizations, domestically and abroad, which
can benefit from machine learning solutions, and match them with suitable
teams of students and supervising faculty.
If your organization could use free machine learning or data
science expertise to help improve its societal impact, please contact me. The best cases are those where the potential
for societal impact is evident, the questions are well defined, and
significant relevant data is available.
Otherwise, I can work with you to get your problem ready for our
students. This initiative benefits
from a generous gift from Uptake (thanks guys!).
- Data Numeracy for
All. I believe that universal data
numeracy is as important in the 21st century as universal
literacy was in the 20th.
We need to increase the understanding of (and comfort with) data in
all segments of society. I am
interested in devising effective ways of doing that.
Students (department, topic): Logan Brooks (CSD, Epi-forecasting), Amanda Coston (MLD and Heinz, ML4SG), Aaron Rumack (MLD, Epi-forecasting), Lisheng
Gao (MLD, Epi-forecasting), Zirui (Edward) Wang (CS, Epi-forecasting).
Past Students (department, topic):
David Farrow (CompBio, viral evolution + ICT4D), Chuang Wu (CompBio, viral
genotype-phenotype mapping), Jahanzeb Sherwani (CSD, ICT4D), Yong Lu (CSD,
CompBio), Dan Bohus (CSD, dialog systems), Stefanie Tomko (LTI, speech
communication), Jerry (Xiaojin) Zhu (LTI, MLD, semi-supervised learning),
Lin Chase (RI, speech recognition).
Past Post-docs: Andy Walsh (computational
virology), Xiaojin Wang (machine learning), Stan F. Chen (language modeling), Pierre DuPont (language modeling).
My favorite quotes.