May 2010: M.S. Machine Learning, Carnegie Mellon University
Aug 2005: Master of Human-Computer Interaction, Carnegie Mellon University
2004: B.Eng. Information Engineering (Honors), The Chinese Univ. of Hong Kong
Online Auction Fraud Detection
Advisor: Prof. Soung-Chang Liew
2000: Diocesan Boys' School (high school), Hong Kong
Research Interests
My research bridges Data Mining and Human-Computer Interaction to synthesize systems and tools that help people understand and interact with Big Data. My thesis focuses on massive networks with billions of nodes and edges.
I blend techniques from machine learning (Belief Propagation), data mining (anomaly detection), visualization and user interaction. Notable projects:
Apolo: mixed-initiative system (ML + visualization) for making sense of large network data;
helps users incrementally find relevant subgraphs to explore, without overwhelming them
Polonium: malware detection using machine learning over a massive graph with 37 billion machine-file relationships;
protects 120 million Symantec machines worldwide; patented
NetProbe: auction fraud detection (eBay) that fingers bad guys by identifying their suspicious transactions;
appeared in The Wall Street Journal, USA Today, and more
PEGASUS: award-winning project that creates graph algorithms for massive graph data
Academic Honors & Awards
2012: Carnegie Mellon School of Computer Science Dissertation Award, Honorable Mention
2010: Open Source Software World Challenge, Silver Award (3rd place), for the PEGASUS project, which mines billion-node graphs. U Kang, Duen Horng Chau, Christos Faloutsos.
2009-2010: Symantec Research Labs Graduate Fellowship, covers full tuition and stipend (in addition to the previous fellowship). Re-selected as one of only three graduate students worldwide to receive the award.
2009: Yahoo! Key Scientific Challenges Award, in Information Retrieval, Algorithms, and Data Mining. Awarded as one of 20 graduate students nationwide, with a $5,000 unrestricted research gift.
2008-2009: Symantec Research Labs Graduate Fellowship, covers full tuition and stipend.
Selected as one of only two graduate students to receive the award. Featured in: The Wall Street Journal, MSN Money, CNNMoney.com, The Pittsburgh Tribune-Review, Fox Business, AOL Money & Finance, MSNBC Business.
2008: Winner, Symantec research competition (internal).
2006: Best Presentation Award, 3rd Place, PKDD'06.
2001, 2002, 2003: Dean's List
2002, 2003: Nominated for university scholarships
Employment
Summer-Fall 2011: Google, Ads Backend Team, Mountain View, CA
Ph.D. Software Engineer Intern. Mentor: Dr. Arun Swami
Fall 2009: Symantec Research Labs, Los Angeles, CA
Ph.D. Research Intern. Mentor: Mr. Carey Nachenberg
Summer 2008: Symantec Research Labs, Los Angeles, CA
Ph.D. Research Intern. Mentor: Mr. Darren Shou
2005-2007: Carnegie Mellon University, Human-Computer Interaction Institute, Pittsburgh, PA
Research Associate. Supervisor: Prof. Brad Myers
Summer 2003: The Chinese University of Hong Kong, Lightwave Communications Lab, Hong Kong
Undergraduate Research Assistant
2002-2007: Elfware Company, Hong Kong
System Consultant
Web-scale malware detection & fraud detection, and Interactive graph exploration & mining for Massive Graph Analytics (CSE 8803 MGA). Instructor: Prof. David Bader. Fall 2012. Georgia Tech.
Data Warehousing and Data Mining for Database Applications (15-415).
Instructor: Prof. Christos Faloutsos. Spring 2012. Carnegie Mellon University.
Web Search for Science of the Web (15-396). Instructor: Prof. Luis von Ahn, Brendan Meeder. Fall 2011. Carnegie Mellon University.
Making Sense of Large Networks for Sensemaking: Cognitive, Social, and Technical Perspectives (05-899).
Instructor: Prof. Niki Kittur. Spring 2011. Carnegie Mellon University.
Teaching Assistant
Science of the Web (15-396).
Instructor: Prof. Luis von Ahn, Brendan Meeder. Fall 2011. Carnegie Mellon University.
Multimedia Databases and Data Mining (15-826).
Instructor: Prof. Christos Faloutsos. Spring 2010. Carnegie Mellon University.
Conference Papers
Mining Connection Pathways for Marked Nodes in Large Graphs.
Leman Akoglu, Jilles Vreeken, Hanghang Tong, Duen Horng Chau, Nikolaj Tatti, and Christos Faloutsos. Proceedings of SIAM International Conference on Data Mining (SDM) 2013. May 2-4, 2013. Austin, Texas. [25% acceptance rate]
PEGASUS: Mining Billion-Scale Graphs in the Cloud.
U Kang, Duen Horng Chau, and Christos Faloutsos. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2012. Mar 25-30, 2012. Kyoto, Japan
Unifying Guilt-by-Association Approaches: Theorems and Fast Algorithms.
Danai Koutra, Tai-You Ke, U Kang, Duen Horng (Polo) Chau, Hsing-Kuo Kenneth Pao, and Christos Faloutsos. European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (PKDD) 2011. Sept 5-9, 2011. Athens, Greece. [20% acceptance rate]
Mining Large Graphs: Algorithms, Inference, and Discoveries.
U Kang, Duen Horng (Polo) Chau, Christos Faloutsos. IEEE International Conference on Data Engineering (ICDE) 2011. April 11-16, 2011. Hannover, Germany. [20% acceptance rate]
Apolo: Making Sense of Large Network Data by Combining Rich User Interaction and Machine Learning.
Duen Horng (Polo) Chau, Aniket Kittur, Jason I. Hong, Christos Faloutsos. ACM Conference on Human Factors in Computing Systems (CHI) 2011.
May 7-12, 2011. Vancouver, BC, Canada. [26% acceptance rate]
Polonium: Tera-Scale Graph Mining and Inference for Malware Detection.
Duen Horng (Polo) Chau, Carey Nachenberg, Jeffrey Wilhelm, Adam Wright, Christos Faloutsos. Proceedings of SIAM International Conference on Data Mining (SDM) 2011.
April 28-30, 2011. Mesa, Arizona. [25% acceptance rate]
On the Vulnerability of Large Graphs.
Hanghang Tong, B. Aditya Prakash, Charalampos Tsourakakis, Tina Eliassi-Rad, Christos Faloutsos, Duen Horng (Polo) Chau. IEEE International Conference on Data Mining (ICDM) 2010. Dec 14-17, 2010. Sydney, Australia. [19% acceptance rate]
What to Do When Search Fails: Finding Information by Association.
Duen Horng (Polo) Chau, Brad Myers, and Andrew Faulring. ACM Conference on Human Factors in Computing Systems (CHI) 2008. April 5-10, 2008. Florence, Italy. New York: ACM Press, Pages 999-1008. [22% acceptance rate]
NetProbe: A Fast and Scalable System for Fraud Detection in Online Auction Networks.
Shashank Pandit, Duen Horng (Polo) Chau, Samuel Wang, Christos Faloutsos. International Conference on World Wide Web (WWW) 2007. May 8-12, 2007. Banff, Alberta, Canada. Pages 201-210. [15% acceptance rate]
Eyes on the Road, Hands on the Wheel: Thumb-based Interaction Techniques for Input on Steering Wheels.
Ivan E. Gonzalez, Jacob O. Wobbrock, Duen Horng (Polo) Chau, Andrew Faulring, Brad A. Myers. Graphics Interface (GI) 2007. May 28-30, 2007. Montreal, Quebec, Canada. Pages 95-102. [48% acceptance rate]
Demonstrating the Viability of Automatically Generated User Interfaces.
Jeffrey Nichols, Duen Horng (Polo) Chau, Brad A. Myers. ACM Conference on Human Factors in Computing Systems (CHI) 2007. April 28-May 3, 2007. San Jose, CA. Pages 1283-1292. [25% acceptance rate]
An Alternative to Push, Press, and Tap-tap-tap: Gesturing on an Isometric Joystick for Mobile Phone Text Entry.
Jacob O. Wobbrock, Duen Horng (Polo) Chau and Brad A. Myers. ACM Conference on Human Factors in Computing Systems (CHI) 2007. April 28-May 3, 2007. San Jose, CA. Pages 667-676. [25% acceptance rate]
Huddle: Automatically Generating Interfaces for Systems of Multiple Connected Appliances.
Jeffrey Nichols, Brandon Rothrock, Duen Horng (Polo) Chau, Brad A. Myers. ACM Symposium on User Interface Software and Technology (UIST) 2006. October 15-18, 2006. Montreux, Switzerland. Pages 279-288. [23% acceptance rate]
In-stroke Word Completion.
Jacob O. Wobbrock, Brad A. Myers, and Duen Horng (Polo) Chau. ACM Symposium on User Interface Software and Technology (UIST) 2006. October 15-18, 2006. Montreux, Switzerland. Pages 333-336. [23% acceptance rate]
Detecting Fraudulent Personalities in Networks of Online Auctioneers.
Duen Horng (Polo) Chau, Shashank Pandit, and Christos Faloutsos. European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (PKDD) 2006. Sept 18-22, 2006. Berlin, Germany. Pages 103-114. [15% acceptance rate] Best Presentation Award, 3rd place
A Linguistic Analysis of How People Describe Software Problems.
Andrew J. Ko, Brad A. Myers, and Duen Horng (Polo) Chau.
Proceedings of VL/HCC 2006. September 4-8, 2006, Brighton, UK. Pages 127-134. [28% acceptance rate]
Answering Why and Why Not Questions in User Interfaces.
Brad Myers, David A. Weitzman, Andrew J. Ko, and Duen Horng (Polo) Chau. ACM Conference on Human Factors in Computing Systems (CHI) 2006. April 22-27, 2006. Montreal, Canada. Pages 397-406. [23% acceptance rate]
Invited Articles
Catching Bad Guys with Graph Mining. Polo Chau. Crossroads: The ACM Magazine for Students - The Fate of Money. Volume 17 Issue 3, Spring 2011. Pages 16-18.
Case study on fraud detection using social-network analysis.
Duen Horng (Polo) Chau and Christos Faloutsos. Encyclopedia of Social Networks and Mining. In preparation.
Polonium: Tera-Scale Graph Mining for Malware Detection.
Duen Horng (Polo) Chau, Carey Nachenberg, Jeffrey Wilhelm, Adam Wright, Christos Faloutsos. The 2nd Workshop on Large-scale Data Mining: Theory and Applications (LDMTA 2010). July 25, 2010. Washington, DC. IBM Student Travel Fellowship
Inference of Beliefs on Billion-Scale Graphs.
U Kang, Duen Horng (Polo) Chau, Christos Faloutsos. The 2nd Workshop on Large-scale Data Mining: Theory and Applications (LDMTA 2010). July 25-28, 2010. Washington, DC.
Supporting Ad Hoc Sensemaking: Integrating Cognitive, HCI, and Data Mining Approaches.
Aniket Kittur, Duen Horng (Polo) Chau, Christos Faloutsos, Jason I. Hong. Sensemaking Workshop at CHI 2009. April 4-5, 2009. Boston, MA.
Feldspar: A System for Finding Information by Association.
Duen Horng (Polo) Chau, Brad Myers, and Andrew Faulring. PIM 2008: CHI 2008 Workshop on Personal Information Management. April 5-6, 2008. Florence, Italy.
Fraud Detection in Electronic Auction.
Duen Horng (Polo) Chau and Christos Faloutsos.
Proceedings of EWMF'05: European Web Mining Forum, at ECML/PKDD'05. October 3-7, 2005. Porto, Portugal.
Poster Papers
Fast Interactive Visualization for Multivariate Data Exploration.
Changhyun Lee, Wei Zhuo, Jaegul Choo, Duen Horng (Polo) Chau. Extended Abstracts, CHI 2013. Apr 27-May 2, 2013. Paris, France.
TopicViz: Semantic Navigation of Document Collections.
Jacob Eisenstein, Duen Horng (Polo) Chau, Aniket Kittur, Eric P. Xing. Extended Abstracts, CHI 2012. May 5-10, 2012. Austin, TX, USA.
Parallel Crawling for Online Social Networks.
Duen Horng (Polo) Chau, Shashank Pandit, Samuel Wang, and Christos Faloutsos. International Conference on World Wide Web (WWW) 2007. May 8-12, 2007. Banff, Alberta, Canada. Pages 201-210.
Integrating Isometric Joysticks into Mobile Phones for Text Entry.
Duen Horng (Polo) Chau, Jacob O. Wobbrock, Brad A. Myers, Brandon Rothrock. Extended Abstracts, CHI 2006. April 22-27, 2006. Montreal, Canada. Pages 640-645.
Demos
Interactively and Visually Exploring Tours of Marked Nodes in Large Graphs.
Duen Horng (Polo) Chau, Leman Akoglu, Jilles Vreeken, Hanghang Tong, Christos Faloutsos. ASONAM 2012. Aug 2012. Istanbul, Turkey.
TourViz: Interactive Visualization of Connection Pathways in Large Graphs.
Duen Horng (Polo) Chau, Leman Akoglu, Jilles Vreeken, Hanghang Tong, Christos Faloutsos. KDD 2012. Aug 2012. Beijing, China.
Large Graph Mining System for Patterns, Anomalies & Visualization.
Leman Akoglu, Duen Horng (Polo) Chau, U Kang, Danai Koutra, Christos Faloutsos. Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) 2012. May 29-Jun 1, 2012. Kuala Lumpur, Malaysia.
OPAvion: Mining and visualization in large graphs.
Leman Akoglu, Duen Horng (Polo) Chau, U Kang, Danai Koutra, Christos Faloutsos. ACM SIGMOD Conference 2012. May 20-24, 2012. Scottsdale, Arizona, USA
Apolo: Interactive Large Graph Sensemaking by Combining Machine Learning and Visualization.
Duen Horng (Polo) Chau, Aniket Kittur, Jason I. Hong, Christos Faloutsos. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) 2011. Aug 21-24, 2011. San Diego, California, USA.
SHIFTR: A Fast and Scalable System for Ad Hoc Sensemaking of Large Graphs.
Duen Horng (Polo) Chau, Aniket Kittur, Christos Faloutsos, Hanghang Tong, Jason I. Hong. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) 2009. June 28-July 1, 2009. Paris, France.
GRAPHITE: A Visual Query System for Large Graphs.
Duen Horng (Polo) Chau, Christos Faloutsos, Hanghang Tong, Jason I. Hong, Brian Gallagher, Tina Eliassi-Rad. International Conference on Data Mining (ICDM) 2008. Dec 15-19, 2008. Pisa, Italy.
Video
SHIFTR: a user-directed, link-based system for ad hoc sensemaking of large heterogeneous data collections.
Duen Horng (Polo) Chau, Aniket Kittur, Christos Faloutsos, Jason I. Hong. Extended Abstracts of CHI 2009. Apr 4-9, 2009. Boston, MA.
Design Awards & Leadership
2006: Winner, Carnegie Mellon ID card design contest. My design has been in use since then.
Winner, Graduate Student Assembly logo design contest. Carnegie Mellon University.
Winner, Language Technologies Institute T-shirt design contest. Carnegie Mellon University.
Carnegie Mellon University, Tepper School of Business. Pittsburgh, Pennsylvania, USA. Invited.
Mining Massive Graphs: Visualization and Anomaly Detection
May 13, 2011
Google. Mountain View, CA, USA
May 10, 2011
Simon Fraser University. Vancouver, Canada. Host: Prof. Jian Pei. Invited.
Detecting Fraudulent Personalities in Networks of Online Auctioneers
Feb 16, 2007
eBay. Palo Alto, CA, USA. Invited.
Panels
Chair
Apr 15, 2013
Center for Data Analytics workshop
Invited Panelist
Dec 11, 2012
SAMSI-FODAVA Workshop on Interactive Visualization and Analysis of Massive Data
Nov 13, 2012
Big Data Meets Social Media at Georgia Tech's Institute for People and Technology (iPaT)'s annual People & Technology Forum
Patents
Inferring File and Website Reputations by Belief Propagation Leveraging Machine Reputation. Adam Wright (Symantec), Duen Horng Chau.
System and Method for Identifying Related and Synonymous Phrases in Ad Creatives. Arun Swami (Google), Duen Horng Chau. Pending.
System and Method for Identifying Contextually Relevant Phrases in Ad Creatives. Arun Swami (Google), Duen Horng Chau. Pending.
Press
"Fraudsters on Internet Auction Sites Often Leave Trail Leading to Crimes", Lee Gomes, The Wall Street Journal, December 6, 2006.
"Researchers: Software Can Nip eBay Fraud in The Bud", Bob Sullivan, MSNBC, December 8, 2006.
KQV Radio live interview, Host: Frank Gottlieb, December 10, 2006.
"Researchers Developing Anti-Fraud Tool", Joe Mandak, The Associated Press, December 11, 2006. Also appeared in over 100 newspapers around the world, including USA Today, Los Angeles Times, The Washington Post.
"CMU Develops Program To Online Auction Fraud", Keith Jones, KDKA-TV News, originally aired December 12, 2006.
BizRadio live interview, Host: Brent Clenton, December 13, 2006
"Online Auction Frauds Not As Clever As They Think, Researchers Say", Alpha Doggs, Network World, December 5, 2006.
Grants
Co-PI
Proactive Detection of Insider Threats with Graph Analysis at Multiple Scales
PIs: T. Senator (SAIC) and D.A. Bader (GTRI)
Defense Advanced Research Projects Agency
Anomaly Detection at Multiple Scales (ADAMS) Program
Funded: $2,927,976 (GT portion), 5/1/2011 - 4/30/2013
Fellowship
Research proposals led to two full years of research funding from Symantec. Awarded ~$140,000.
Co-author
$35 million DARPA proposal on Anomaly Detection at Multiple Scales (ADAMS). CMU funded ~$500,000. Co-authored the winning proposal, represented CMU and led research efforts.
Helped in creating an NSF proposal HCI meets Data Mining — Attention Routing, Visualization and Sense-making of Large Graphs. $500,000. Funded.
Miscellaneous
Amazon Web Services (AWS) Teaching Grant
$4,000 in credits for students of my class Data and Visual Analytics (CSE 6242 A / CS 4803 DVA), Spring 2013
Unfunded
Helped in creating an NEC proposal for University Awards from NEC Labs Data Management. $75,000
Ph.D. Students
Advisor
Acar Tamersoy. CS. Spring 2013 - present.
Robert Pienta. CSE. Spring 2013 - present.
Collaborators
Jaegul Choo. CSE. Adv: Haesun Park. Spring 2013 - present.
Kunal Malhotra. CS. Adv: Sham Navathe. Spring 2013 - present.