October 20th, 21st and 22nd, 2017



Presented by
Faculty & Students in Carnegie Mellon's

School of Computer Science

THANKS to Our Sponsors:



Disney Research



Thanks also to these SCS Departments:
Computer Science

Institute for Software Research

Machine Learning

Language Technologies Institute

Robotics Institute





Research Team Leaders and Projects:

Team Leaders and Project Descriptions

Alan Black
Language Technologies Institute
Carnegie Mellon

Shrimai Prabhumoye
Graduate Student
Language Technologies Institute
Carnegie Mellon

Chatting with Computers
There are now a number of personal digital assistants that allow you
to talk to machines, ask them questions, give them commands, and
generally interact with them.  Apple Siri, Microsoft Cortana, Google
Home and Amazon Alexa all provide systems that non-computer experts
can use to get information.  This project will look at how these
systems work, using speech recognition and synthesis, natural
language understanding, dialog processing and natural language
generation.  We will discuss task oriented dialogs (with a specific
goal) and non-task oriented dialogs (just "chatting").

As part of this project we will develop our own working conversational
chatbot on the Amazon Alexa platform.  This project will make use of
knowledge of Python programming, Natural Language Processing, and also
creative ideas about how to use language to give the right answers, and
engage the user.

Lorrie Cranor
Institute for Software Research
Carnegie Mellon
Abby Marsh
Graduate Student
Institute for Software Research
Carnegie Mellon

The Privacy and Security of Private Browsing
Modern browsers all include private browsing modes (also known as "incognito mode"), which promise users the chance to browse the web more anonymously and to erase their digital footprint as soon as they are done browsing. Private browsing is being used by average users more frequently now than ever: news reports detail how private browsing can save hundreds of dollars on airfare, and it is common practice for those browsing online porn. But private browsing alone is not a sufficient digital privacy or security tool; if users think they are fully protected from online threats by private browsing, they may put themselves at risk of discipline by employers, identity theft, malware, or other harms. We need to understand: do people's expectations of private browsing match the reality of what it provides? Who is using private browsing, and does use of private browsing correlate with knowledge of the Snowden revelations, technical knowledge, age, or any other factor? For what types of websites are people using private browsing?

For this project, participants will develop a survey to address some of these questions and explore the use cases and users of private browsing modes. We will gather and analyze actual user data using Amazon’s Mechanical Turk marketplace, allowing participants to draw conclusions around these important issues.


Jenny Tsai-Smith
Vice President,
User Assistance Development


Lauren Cohn
Multimedia Architect

Articulating the Dimensionality of Films Using Machine Learning
Images and audio – what we see and hear become recognized patterns of human information. How we use our senses to collect, store, and process data is taught from birth. With each passing year, we gain knowledge and preferences, to quickly identify and predict how we will retain, transfer and reuse the data we collect. With time, we gain experience. With experience, we also gain bias, which feeds into how we process, interpret, and act on the data.
This project seeks to translate this human learning process to machine learning.  When we sit in a movie theater, within the first few minutes, we understand what kind of story we are about to see. Our personal algorithms for understanding and categorizing content are updated as we watch the story unfold on screen. Some movies are boring (predictable), while others are not.  How do we teach machines to “watch” a movie and make the same analytical assessments a person would?

In this project, we will seek to design an application that can categorize a short video or movie without human intervention.  We will:

  • break apart each frame of a video/movie and tease out all the composite components
  • identify the patterns and apply those patterns to other patterns and algorithms
  • demonstrate a predictive and categorized understanding of the film with each passing frame

To tackle this challenge, the team will be organized in a way that reflects how teams work in industry research contexts. Working collaboratively, the team will research and determine the best methodologies for understanding how humans experience films, breaking down and analyzing the vast amount of visual and audio information in each frame of a film, and using that data to achieve and share the genre categorization and understanding of the film content. We will use a pre-built project to allow the team to gain hands-on experience working with Oracle technologies in delivering a similar machine learning application. 

Lori Flynn
Software Security Researcher
CERT, Software Engineering Institute
Carnegie Mellon

Nancy Mead
Fellow and Principal Researcher
CERT, Software Engineering Institute
Carnegie Mellon

Carol Woody
Technical Manager, Cybersecurity Engineering
CERT, Software Engineering Institute
Carnegie Mellon

Greg Shannon
Chief Scientist
CERT, Software Engineering Institute
Carnegie Mellon

How can we develop secure smartphone software?
Smartphones are ubiquitous and routinely blend into our daily lives. While we are using them, however, the security of these devices can be compromised along with our own security and privacy!  The simple act of downloading an app can lead to massive exposure of personal information. Smartphones have been used for such nefarious purposes as stalking those who have been victims of abuse and stealing money. The software may be compromised with malware that allows attackers to access personal information such as current location, banking information, and other account information such as Facebook and PayPal passwords.

In the workshop, we will focus on Android phones and study the ways in which vulnerabilities are inadvertently introduced during software development, and how those vulnerabilities are used in cyber attacks. The team will then examine code examples to identify code that has inadvertently introduced vulnerabilities as a result of poor coding practices. We will make hands-on use of code editors and an Android emulator, and we will also review related literature, a fundamental practice in all research fields. Finally, the team will define coding rules that can be used to avoid these vulnerabilities and thus avoid exposure of personal and private information. The resultant coding rules will be published on the CERT Android coding standard website, with the names of the students who developed them, thus providing a publication opportunity for the student team’s research in the workshop. In this project, team members will participate in research activities common to software security research: examining related literature, looking at code, understanding software vulnerabilities, developing conclusions about secure practices, and publishing their work.

Bonnie Holub
Master Data Scientist, Director, Cognizant
Founder and former CEO of Adventium Labs.

Data Science answers to questions like: where should I live?
This research project will be a fast-paced investigation into data science, analytics and visualization to determine the livability and economic indicators of leading US urban areas. We’ll explore differences among and between cities. We’ll compare their economies based on the companies that are thriving in each, and we’ll look at industrial diversity and the livability issues that job seekers and technology professionals care about. The results of this weekend will be a report of our findings and provocative questions for future work. Participants need no previous experience in data mining, analytics or data visualization, but will leave with skills in all these areas. Learn firsthand why Data Scientist has been called the “Sexiest Job of the 21st Century”, and learn how your graduate school experience can combine this variety of fascinating tool sets.

George Kantor
Senior Systems Scientist
Robotics Institute
Carnegie Mellon


Robots and Cameras for Automated Plant Phenotyping 
This project will investigate the problem of using robotics, cameras, and 3D modeling technology to better understand plant growth.  This is part of an emerging research area called “high throughput phenotyping”, where automated technologies are used to gather plant information that can be used both for high efficiency plant production and for breeding plants with desirable characteristics.  After reviewing the state of the art in this area, we will brainstorm methods for the automated collection and processing of visual data.  The goal is to build a prototype robotic system that can collect plant images, process the images into high-resolution 3D models, and automatically extract useful phenotype (e.g., stem width or leaf shape) information from the models.

Geoff Kaufman
Assistant Professor
Human Computer Interaction Institute
Carnegie Mellon
Joseph Seering
Graduate Student
Human Computer Interaction Institute
Carnegie Mellon

Designing Proactive Moderation Tools and Techniques to Reduce Online Harassment and Hate Speech

Harassment and hate speech affect countless users online, and users from minority groups or groups with less social power are particularly likely to be targeted. What is particularly insidious about these targeted acts is that they often occur on the most common, most mainstream social networks. While a great deal of work has focused on identifying factors, such as anonymity, that facilitate and perpetuate these behaviors, current responses to hateful behavior online have proven to be limited in their effectiveness. The majority of these approaches have focused on reactive moderation techniques, such as user bans or removal of offensive content, which have low utility and feasibility, especially as long-term solutions to reducing hate speech and harassment.  This project explores the potential for an alternative approach: the design of proactive moderation tools and techniques, based on social psychological theories and strategies, that aim to reduce the occurrence of hate speech and harassment in anonymous online forums and promote positive self-governance in online communities. We will address this basic question: can we design psychologically grounded, easily implementable techniques and best practices for designing online forums in a way that counteracts negative behaviors and promotes greater inclusiveness and a sense of positive community?

Over the course of the weekend, we will review existing methods of moderation across a variety of online platforms and explore a plethora of psychological theories and design techniques that could be fruitfully applied to proactive moderation.  Using this foundational knowledge, we will then brainstorm and prototype a set of new proactive techniques involving the design of specific platform elements (including password generation, terms of service approvals, and interface design) that cognitively prime or reinforce particular goals, mindsets, or orientations (such as heightened self-awareness, activation of community norms or social identities related to those norms, or inductions of perspective-taking and empathy) aimed at reducing the occurrence of negative communication behaviors among users of a target platform.
Gabriella Marcu
Assistant Professor
The College of Computing and Informatics
Drexel University
Gabrielle Salib
Graduate Student
The College of Computing and Informatics
Drexel University

How can technology give at-risk youth enough social support to stay in school?

When students, teachers, and parents work together effectively, lives of children and youth are significantly impacted. However, it is not always easy for this triad to achieve communication and exchange of information. In this research, we will study the types of breakdowns that lead to loss of trust and collaboration. For example, parents of students with special needs sometimes use lawsuits to advocate for their child’s needs. High-risk youth in underserved and impoverished communities sometimes give up on school entirely due to lack of support. In this project, we will investigate existing barriers to collaboration and explore technological solutions for improving communication within the student, teacher, and parent triad. We will identify opportunities for innovative uses of technology to address barriers, foster trust, and facilitate collaboration. Components of this project will include user experience research, human-centered design, empathic design, qualitative data analysis, and rapid prototyping.
Robert Kraut
Human Computer Interaction Institute
Carnegie Mellon
Carolyn Rosé
Associate Professor
Language Technologies Institute
Carnegie Mellon
Diyi Yang
Graduate Student
Human Computer Interaction Institute
Carnegie Mellon

Modeling successful persuasion
People use persuasive communication to shape others’ attitudes and decisions on a wide range of topics. In your research project, you will investigate the factors that make messages persuasive, including communicators’ personal characteristics (e.g., authority), their links to the targets of persuasion (e.g., liking) and the language they use in their messages. The context will be loan requests that members of Kiva use to persuade other members of their teams to lend money to borrowers. Kiva is a micro-finance site designed to provide money to borrowers in the developing world. Starting from social science theories of persuasion, you will extract high-level social and language features to operationalize different persuasive strategies that requesters use. You will build and test machine learning models to predict whether messages containing those features lead lenders to give money to a borrower. These models will allow you to predict the success of unseen requests, identify strategies associated with successful loan requests, and provide guidance to future requesters.

James Morris
Professor and Former Dean of SCS
Human Computer Interaction Institute
Carnegie Mellon

Invent a Mobile Service
Cell phones are today's computing platform, especially in the developing world. Our team will develop an idea for a new mobile service. Some examples:

  • Ridesharing: Connect and motivate drivers and riders with real-time support
  • Favor Net (invented by the previous OurCS team): Facilitate requesting and performing favors on a college campus.
  • Family Memory Book: Billions of photos are being taken of children every day. How will their parents, the children grown, and their descendants enjoy them?
  • Product Finder: Snap a picture of an item you like, from TV or real life, and be told where it came from.

The team will decide on a service and develop usage scenarios.

Daniel Klein
Software and Site Reliability Engineer
Google, Pittsburgh
Timothy James
Partner Technology Manager
Google, Pittsburgh

Exploring the Intelligent Fabric of our World
Computers have become the modern backbone of communication, which creates a powerful fabric for interacting with the environment and objects around us. We'll explore the space of microcontrollers, sensors, and peripherals that enable us to become creators of a future intelligent world. Students will use this fabric to make a prototype that allows them to communicate with and control an object in their world.

Examples include wearable computing, telepresence systems, and remote monitoring.

Daniel Mosse
Professor and Former Chair of Computer Science Department
University of Pittsburgh


Discovering Network Topology
The Internet has become a ubiquitous and indispensable tool in our lives. To maintain seamless network services, it is important that the administrator maintains an accurate knowledge of the current network topology or structure. However, sending extra packets to learn the network structure places additional burden on network routers. In this project, we will use machine learning techniques to learn a network topology using only passive information collected from packets received at computers at the periphery of a network.  

Ravi Starzl
Systems Scientist
Language Technologies Institute
Carnegie Mellon

The Future of Software Systems in Healthcare and Medicine
Participants will discuss and analyze some recent software trends in healthcare and medicine. Participants will brainstorm and iterate through a research challenge related to these fields.

The workshop will provide opportunities for all participants to work in teams on exploratory research problems. Each team will be led by a researcher from industry or academia who will introduce the research problem and guide the team through the process.

There are several sessions devoted to the research workshops.
The final research session will include presentations/solutions from each team.

For questions about the workshop, see contact page