Who’s peeking at your personal data?

“Privacy engineering” doesn’t grab the same headlines as computer security, but it’s a fast-growing field for Carnegie Mellon researchers, educators and students. It should concern you, too.

[Illustration by Emily Traynor]

If you have a burning secret, and you want to share it with someone, you might turn to Whisper, an app that co-founder Michael Heyward once called the “safest place on the Internet.” Whisper allows people to anonymously reveal secrets and pair them with stock images. Users have admitted to all kinds of indiscretions, unburdening their consciences while believing that the company doesn’t collect any private information about them, so their confidences remain anonymous.

Or do they? In October 2014, a reporter for The Guardian newspaper alleged that Whisper’s editorial team—which tries to monitor the authenticity of the content it serves—was tracking certain users it deemed “newsworthy,” including a former Capitol Hill lobbyist who admitted to a salacious love life. Even turning off Whisper’s geo-location function was no guarantee of anonymity for users, The Guardian said; the app can follow them through their IP addresses. Matching details revealed in secrets with users’ locations would make the app a little less than truly anonymous.

Whisper struck back, saying that the supposed lobbyist doesn’t really exist, and that the company retains IP addresses for only seven days. Furthermore, Whisper said, it only uses the IP addresses if a user appears to be in danger, or is a threat to someone else—for instance, if a user admits to being suicidal or homicidal. The story, while complicated, highlights the growing awareness among users that very little of what they do online, or on their smartphones, is truly anonymous. It may not even be private, and they may not realize who else has access to their information.

Everyone understands why security is important—bad guys break into a database and steal personal photos or credit card information—but understanding privacy is a bit trickier. “Security is about protecting information and physical devices and resources, while privacy is about policy—about what we want to protect, and what we want to protect it from,” says Lorrie Faith Cranor, professor of computer science and engineering and public policy, and director of CMU’s CyLab Usable Privacy and Security Laboratory.

“Security tends to get the headlines—it’s concrete and easier to understand,” she says, “but privacy can be more subtle. If a company takes information about what websites I visit and sells it, I might feel like my privacy is violated, but it’s not the same headline-grabbing story.”

We rarely read the fine print

Almost every time people interact with an app or a website—whether it’s to pay a bill, order a pizza, buy music, or join a dating site—they must accept the terms of use or a privacy agreement. Yet people rarely, if ever, read those agreements, some of which contain dozens of pages of small print, written in legalese. Even if they wanted to read every agreement, they likely wouldn’t have the time. Cranor estimates that if the average Internet user took the time to read every privacy agreement for every website she accesses in a single year, it would take her about 244 hours.
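
That 244-hour figure is easy to reproduce as a back-of-the-envelope calculation. The two inputs in the sketch below are assumptions chosen here for illustration, not numbers quoted in this article: a rough count of distinct privacy policies encountered in a year and an average reading time per policy.

```python
# Back-of-the-envelope recreation of the 244-hour estimate.
# Both inputs are illustrative assumptions, not quoted figures.
policies_per_year = 1462    # assumed distinct privacy policies encountered in a year
minutes_per_policy = 10     # assumed average time to read one policy

hours = policies_per_year * minutes_per_policy / 60
print(f"About {hours:.0f} hours a year spent reading privacy policies")
# -> About 244 hours a year spent reading privacy policies
```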

Rather than plowing through privacy agreements, users are eager to download their music, find a date, or play Candy Crush. Simply clicking “I agree” when the privacy screen appears makes all that happen much more quickly.

As a result, most users don’t realize what they might be consenting to when they approve those terms of use. In those agreements, companies often disclose that they sell user data to third parties, for example, and it’s perfectly legal because, technically, users have signed an agreement allowing it.

How can software developers balance privacy with usability? How can business and government leaders develop better privacy policies? And how can consumers make more informed decisions about what personal information they share online? Through research and through education—including the first-of-its-kind master’s degree in privacy engineering—Carnegie Mellon is positioning itself to be at the forefront of these complicated conversations.

Alessandro Acquisti is one of the researchers who are studying privacy agreements and their implications. A professor of information technology and public policy at CMU’s Heinz College, he says there are a number of factors that explain why people agree to “nebulous or even deceiving” privacy agreements.

“One reason is called hyperbolic discounting,” Acquisti says. “We focus on the immediate benefit we expect to receive from the site, and discount the possible future costs associated with the site using or abusing our personal information.”

Acquisti uses behavioral economics to understand how people feel about privacy and interact with privacy policies. He knows that people feel conflicted; they desire privacy, but also feel the need to trade some of their information in exchange for a benefit. In interpersonal settings, most people are good at controlling the flow of information, but when it comes to the Internet, people struggle to understand how and where to share private information.

We now have the power to share too much

“Modern information technologies have vastly advanced our power and ability to broadcast information—often, sensitive information—to others without even realizing we are doing so, and often without realizing the consequences of doing so,” Acquisti says. “Sometimes, this power creates problems—such as over-sharing or making disclosures that we later regret having done.”

But the problem goes beyond our inability to keep embarrassing photos off of Facebook or our ill-informed opinions about politics off of Twitter. Privacy also includes how businesses use the data they collect about their customers. When a camera app asks to access your location, for instance, it can automatically tag where each photo was taken. That’s super-convenient, right? But having a few days of a user’s locations means that an app also has a pretty good idea of who the user is, and what his or her habits are. “If you have one or two days of location data, you can identify where the person lives and works,” says Jason Hong, an associate professor in the Human-Computer Interaction Institute who researches usable privacy and security and mobile computing. In many cases, with enough specific data about times, locations and habits, supposedly anonymous users can be easily identified.
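
To see how little data that kind of inference requires, here is a hedged sketch of the idea Hong describes: bucket a day or two of timestamped coordinates into coarse grid cells and compare where a phone sits overnight with where it sits during working hours. The readings, time windows and grid size below are made up for illustration.

```python
from collections import Counter
from datetime import datetime

# Made-up readings: (timestamp, latitude, longitude)
readings = [
    ("2015-03-02 01:10", 40.4440, -79.9530),   # overnight
    ("2015-03-02 03:45", 40.4439, -79.9532),
    ("2015-03-02 10:20", 40.4433, -79.9446),   # weekday daytime
    ("2015-03-02 14:05", 40.4435, -79.9443),
    ("2015-03-03 02:30", 40.4441, -79.9529),
    ("2015-03-03 11:15", 40.4434, -79.9445),
]

def grid_cell(lat, lng, precision=2):
    """Bucket coordinates into a coarse grid cell (about a kilometer across)."""
    return (round(lat, precision), round(lng, precision))

night, day = Counter(), Counter()
for ts, lat, lng in readings:
    hour = datetime.strptime(ts, "%Y-%m-%d %H:%M").hour
    bucket = night if hour < 6 or hour >= 22 else day
    bucket[grid_cell(lat, lng)] += 1

print("likely home cell:", night.most_common(1)[0][0])
print("likely work cell:", day.most_common(1)[0][0])
```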

Some people have called for a national policy on information privacy, including Acquisti. In January 2015, he and his co-authors—Laura Brandimarte (HNZ’10,’12) of CMU’s Heinz College and George Loewenstein of the Department of Social and Decision Sciences in CMU’s Dietrich College—published an article in the journal Science titled “Privacy and Human Behavior in the Age of Information.” They argued that American consumers need policies that set minimal requirements for users to make informed and rational decisions, and called for “policies that include a baseline framework of protection, such as the principles embedded in the so-called fair information practices.”

“People need assistance and even protection to aid in navigating what is otherwise a very uneven playing field,” the authors wrote. “To be effective, privacy policy should protect real people—who are naive, uncertain and vulnerable—and should be sufficiently flexible to evolve with the emerging unpredictable complexities of the information age.”

Yet creating such a policy poses problems. Regulations that are too specific could stifle innovation because they can’t anticipate new technologies. So while some experts are grappling with the idea of legislating privacy, others are focused on public education, including the U.S. Federal Trade Commission’s Julie Brill. In January, Brill, an FTC commissioner, delivered the keynote address at CMU’s annual Privacy Day forum. Since being sworn in at the agency in 2010, Brill has made consumer privacy one of her priorities. In her speech, Brill said the emerging “Internet of Things”—smart, networked devices with embedded computer technology, from toys to thermostats to vehicles—is making our privacy even more complicated.

From toys to transportation, we create a river of data

“Every time we swipe our smartphone screens to check Twitter or tap our phones to pay for coffee, we add to the swelling rivers of data that capture the details of what we do, what we buy, what we read, and where we go,” Brill said. “Soon these streams of data will reveal whether we’re at home and what we’re doing there. They will record how much we’ve exercised, when we’ve gained a few pounds, and how well we sleep. They’ll log our vital signs, and help us manage our diabetes, heart and other health conditions.”

Finding solutions that honor privacy principles is important not just to the FTC and consumers, Brill said; it’s also smart business. Companies must realize “that they need to maintain consumers’ trust if they are going to win their loyalty and business,” she said.

Although computer security isn’t the same as computer privacy, Brill did point out that good security does keep private data … well, private. While many computer security courses stress defensive measures, David Brumley, technical director of CMU’s CyLab and an associate professor of electrical and computer engineering, says he also teaches an emerging field called “cyber offense.”

“There are not a lot of differences between offense and defense—both are concerned with preventing vulnerabilities,” he says. While defensive measures involve patching security holes to keep people out, offensive measures require programmers to think strategically and predict how attackers might target a system, says Brumley, who also advises CMU’s undergraduate hacking club, the Plaid Parliament of Pwning, which is internationally ranked. “We’re teaching (students) to think about how their actions affect others,” he says.

Meeting the need for ‘privacy engineers’

Giving students the tools to develop strong, built-in security in the systems they create or maintain is an essential element to ensuring privacy. But computer privacy is developing into a specialized field of its own. Cranor says that in the past, when she’s talked to recruiters for technology companies, they’ve often said that they had no problem hiring “security engineers,” but when they posted positions for “privacy engineers,” the applications slowed to a trickle. Few applicants seemed to have much experience, or understanding, of user or data privacy.

“We saw that as a real need for a degree program,” Cranor says.

CMU’s new master of science degree in information technology—privacy engineering (MSIT-PE) is intended to fill that need, says Norman Sadeh (CS’91), a professor in CMU’s Institute for Software Research who co-founded and co-directs the MSIT-Privacy Engineering Program with Cranor. “We prepare students for both industry and government jobs where they will be asked to inform the design and refinement of products, services and business processes, taking into account privacy considerations.”

Travis Breaux, an assistant professor in CMU’s Institute for Software Research, says most of the students entering the program have a background in computer science, and many have worked in computer security. But while they have technical know-how, they don’t necessarily understand the nuances of privacy, he says. The MSIT-PE is designed to let students tackle real privacy issues. “That first-hand technical experience helps people create solutions, so they’re able to synthesize knowledge about privacy problems and risk, and (then) they can create and innovate,” Breaux says.

Adam Durity (CS’14) is one of the first eight graduates of the master’s program. He had originally applied to another graduate program in SCS; Cranor saw his application and called him to see if he’d consider the privacy-engineering program instead. At first, Durity was skeptical—he hadn’t been thinking of specializing in information privacy. As he began wondering what he could do with an advanced degree in privacy, news broke about Edward Snowden, the computer engineer who leaked thousands of classified documents he had gathered while working as a contractor for the National Security Agency. Soon, Durity heard Cranor being interviewed on National Public Radio, discussing computer security and data privacy.

It made the choice to enroll in the MSIT-PE seem obvious. “I did some research and wow, suddenly this stuff seemed really relevant,” Durity says.

In the MSIT-PE, students take classes in privacy law, ethics and security, and have plenty of opportunities to participate in research projects with faculty members. There’s also room for them to take electives in areas such as machine learning. “The experience you have is defined by you,” Durity says. “There are plenty of opportunities to join in research.”

The final requirement of the degree is a capstone project, which gives students the opportunity to complete a real-life assignment with an industry partner. Durity’s capstone examined ways to help Facebook users better understand what they were really agreeing to when they consented to the site’s privacy policy. Some users, he found, had little grasp of the consequences of sharing their personal information. The capstone forced Durity to think carefully about how to explain privacy policies in meaningful ways that make users take an interest in them.

Today, as a data privacy analyst at Google, Durity tackles the same issues, and the experience he gained in the program helps him every day. “Primarily, I’m looking at how Google is using data, and where, and why,” he says. The MSIT-PE program “helps me place that information into a larger picture.”

Using research to make privacy intuitive

The MSIT-PE program is informed by more than a decade of research at Carnegie Mellon into ways of making privacy easier for end-users and developers alike to understand. In the early 2000s, Cranor led development of Privacy Bird, a tool to help users better understand privacy policies. Created on behalf of AT&T, Privacy Bird was a plug-in for Microsoft’s Internet Explorer browser that could read privacy policies formatted according to the World Wide Web Consortium’s Platform for Privacy Preferences (P3P) specification. Users could tell Privacy Bird what sorts of concerns were important to them—having their name and address sold without their permission, for instance. When they visited a website, Privacy Bird would check for a privacy policy and then scan the agreement to see if it matched the user’s stated concerns. If there was a mismatch, the bird icon squawked to alert them.
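
The matching idea behind a tool like Privacy Bird can be sketched in a few lines. This is not the actual plug-in, which parsed machine-readable P3P policies; here both the user's concerns and each site's declared practices are reduced to simple labels, and any overlap triggers the squawk. The site names and practice labels are hypothetical.

```python
# Minimal sketch of preference-vs-policy matching, not the real Privacy Bird.
user_concerns = {"sell-contact-info", "share-location-with-third-parties"}

site_policies = {
    "example-shop.com": {"sell-contact-info", "use-cookies-for-analytics"},
    "example-news.com": {"use-cookies-for-analytics"},
}

def check(site):
    conflicts = site_policies[site] & user_concerns
    if conflicts:
        print(f"{site}: squawk! policy conflicts with your preferences: {sorted(conflicts)}")
    else:
        print(f"{site}: no conflicts with your stated concerns")

check("example-shop.com")
check("example-news.com")
```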

Since then, Cranor has been involved in a variety of projects that make privacy policies easier to understand, and she continues to develop new privacy tools. In 2013, Cranor and two of her Ph.D. students, Pedro Giovanni Leon (E’10,’14) and Blase Ur (CS’14), created the Bank Privacy website (cups.cs.cmu.edu/bankprivacy), a database of more than 6,000 U.S. banks that compares their privacy policies and lets users search for the financial institutions with the most protective policies.

But while websites continue to pose a threat to consumer privacy, our increasing reliance on smartphones has introduced a wide range of new privacy concerns. Sadeh, who heads CMU’s Mobile Commerce Lab, is examining how people interact with privacy policies and settings on their smartphones in daily life, and how effective existing privacy mechanisms really are. In one experiment, conducted with ISR Ph.D. student Hazim Almuhimedi (CS’08,’13), Sadeh studied 23 users of Android smartphones. The researchers chose Android because it let them instrument a “permission manager” tool called AppOps, which gave users the power to control how much of their sensitive information, such as their location or texting functionality, could be accessed by each app.

During the first week, the experimenters tracked how the users interacted with apps on their smartphones. In the second week, they gave the users access to a variation of AppOps that let them toggle permissions on and off, customizing privacy settings for every application on their phones. If a user did not want Angry Birds, for example, to access her location, she could simply block it from doing so. Overall, users looked at their settings 51 times that week, restricting 272 permissions on 76 different apps, but their interest in privacy varied widely; one user, for instance, didn’t review any settings at all. And after a few days with AppOps, most people left their privacy settings alone.

Then, during the third week, the experimenters began sending the users “nudges”—messages about their privacy information that read something like this: “Your location has been shared 5,398 times with Facebook, Groupon, Go Launcher EX and seven other apps in the last 14 days. Would you like to review your settings?” That got users’ attention.
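
Here is a hypothetical sketch of how such a nudge might be assembled from a log of location accesses. The app names, counts and message wording are illustrative only, not the study's actual instrumentation.

```python
from collections import Counter

# Made-up access log: one entry per time an app read the phone's location.
access_log = (["Facebook"] * 2900 + ["Groupon"] * 1400 +
              ["Go Launcher EX"] * 700 + ["Weather Widget"] * 398)

def build_nudge(log, days=14, named=3):
    counts = Counter(log)
    top = [app for app, _ in counts.most_common(named)]
    others = len(counts) - len(top)
    tail = f" and {others} other app{'s' if others != 1 else ''}" if others > 0 else ""
    return (f"Your location has been shared {len(log):,} times with "
            f"{', '.join(top)}{tail} in the last {days} days. "
            "Would you like to review your settings?")

print(build_nudge(access_log))
# -> Your location has been shared 5,398 times with Facebook, Groupon,
#    Go Launcher EX and 1 other app in the last 14 days. ...
```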

“The vast majority of users were motivated by these pop-up messages,” Sadeh says. “They also realized that there was information being shared they didn’t know about and weren’t aware of, and they revisited their settings.” After receiving their “nudges,” subjects reviewed 69 settings and blocked 122 permissions on 47 different applications. “This was after having been given access to the AppOps permission manager for a week,” Sadeh says, adding “while the permissions managers were useful, the nudges created a significantly higher level of awareness and motivated users to revisit settings with which they previously thought they were comfortable.” Sadeh and Almuhimedi’s findings, presented in April at the CHI 2015 conference in Seoul, South Korea, were widely discussed by tech bloggers and reporters covering computer security.

The research revealed that people often struggle to understand why an app might need certain information. It makes sense that a map application, or a dating app such as Tinder that matches users based on their proximity to one another, would need to know where a user is. But participants often couldn’t see why other applications would need such data. Why, for example, would an app that turns a mobile phone into a temporary flashlight need access to their location?

As a way of helping users think about the permissions requested by various apps, and make informed decisions, Hong recently launched the PrivacyGrade.org website. The content of the site is based on research conducted by Hong, Sadeh and others in collaboration with researchers at Rutgers. Users of Amazon’s Mechanical Turk were asked to read descriptions of Android apps and then answer questions based on their understanding of those descriptions. Subjects were asked whether they expected an app to use their precise location information, for instance, and then whether they understood why the app needed that information. They also were asked whether they were comfortable with an app using certain personal information when they didn’t understand why it needed it.

With the help of these crowdsourced responses, the research team developed PrivacyGrade, a tool to rank apps by balancing the users’ expectations against what the apps actually do. PrivacyGrade “is about setting boundaries and expectations,” says Hong, who also directs CMU’s Computer Human Interaction: Mobility Privacy Security, or “CHIMPS,” lab. Sometimes, he says, “it’s not clear what these companies are doing, and it’s not clear they’re doing it in our best interest.”

Most people expect that Google Maps needs to access their location to help give them accurate directions. Because Google Maps met users’ expectations, it earned an “A” rating from PrivacyGrade. But few people expect a game such as “Cut the Rope” to access and use their location, and the app appears to use that information to send users targeted advertisements. Because of that, “Cut the Rope” earned a “D.” The website is already having an effect; Hong notes that one game developer has changed its policies based on the low score it received from PrivacyGrade.
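
A toy version of the grading idea, assuming made-up expectation scores and letter-grade cutoffs (the real model is built on the crowdsourced responses described above), might look like this: an app is penalized for using sensitive data that few respondents expected it to use.

```python
# Toy sketch of expectation-vs-behavior grading; scores and cutoffs are invented.
expectations = {            # fraction of respondents expecting each use
    ("Google Maps", "location"): 0.97,
    ("Cut the Rope", "location"): 0.08,
}

actual_uses = {
    "Google Maps": ["location"],
    "Cut the Rope": ["location"],   # apparently used for targeted advertising
}

def grade(app):
    surprise = [1 - expectations.get((app, use), 0.0) for use in actual_uses[app]]
    penalty = max(surprise, default=0.0)
    for cutoff, letter in [(0.25, "A"), (0.5, "B"), (0.75, "C"), (1.01, "D")]:
        if penalty < cutoff:
            return letter

for app in actual_uses:
    print(app, "->", grade(app))
# Google Maps -> A, Cut the Rope -> D
```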

Balancing usability with privacy is a constant struggle

For developers, creating an app that works well, but which also protects user privacy, is a constant balancing act. Travis Breaux, who examines privacy from the point of view of software developers, says that sometimes, in their pursuit of the next really cool application or feature, programmers forget about privacy concerns. In 2013, Breaux and other researchers studied the interactions among Facebook, Zynga and AOL. Facebook, they noted, controls a lot of personal data, including user names, friend lists and ages, some of which it shared with Zynga, maker of popular games such as FarmVille that can be played inside Facebook.

Facebook, the researchers found, prohibited Zynga from sharing this information with advertisers, one of which was AOL. But although Facebook and Zynga had a direct relationship with users, AOL didn’t, and its obligation to Facebook and Zynga’s users was less clearly defined. The three different corporate policies had created a backdoor route through which sensitive information could potentially be shared without users knowing it. “Right now, information is flowing in these rivers, where it just converges and swirls,” Breaux says. In complex data-supply chains, it may not always be clear who’s responsible for privacy and security at any given moment.

That’s where Breaux’s research comes into play. He directs CMU’s Requirements Engineering Laboratory, which develops tools, techniques and computational methods to improve trust and assurance in software systems. One of its current research thrusts, funded by the National Science Foundation and the Office of Naval Research, is adapting formal reasoning methods to find and resolve conflicts in privacy and security policies. The goal is the creation of automated tools that developers can use to make sure they’re following best practices in the areas of privacy and security, while still creating apps that can communicate efficiently with one another. Ultimately, developers would be able to express their information-sharing needs, “then check to see if they align with any applicable laws, and with ethical rules,” Breaux says.
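
One way to picture that kind of automated check, under heavy simplification and not as Breaux's actual formalism: model each party's policy as a set of "may share X with Y" rules, then search the resulting chain for a path that carries a data type to a recipient the data's source has forbidden. The rules below echo the Facebook, Zynga and AOL example but are otherwise hypothetical.

```python
# Hedged sketch of detecting a policy conflict in a data-supply chain.
policies = {
    # party: {data_type: parties it may share that data with}
    "Facebook": {"friend-list": {"Zynga"}},
    "Zynga":    {"friend-list": {"AOL"}},
}

restrictions = [
    # (owner, data_type, forbidden recipient) declared by the data's source
    ("Facebook", "friend-list", "AOL"),
]

def reachable(start, data_type):
    """All parties the data can flow to, following each party's own policy."""
    seen, frontier = set(), [start]
    while frontier:
        party = frontier.pop()
        for nxt in policies.get(party, {}).get(data_type, set()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

for owner, dtype, forbidden in restrictions:
    if forbidden in reachable(owner, dtype):
        print(f"Conflict: {dtype} from {owner} can reach {forbidden} "
              "through the chain of individual policies.")
```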

Automated tools, crowdsourcing and other applications of technology will help ensure user privacy. But Breaux says that for the foreseeable future, education of end-users and attentiveness by developers and programmers form our first lines of protection. For now, he says, at each step of the way, before any sensitive data is used or shared, “we still need a human in the loop who asks, ‘Is this the right thing to do?’”

—Meghan Holohan is a frequent contributor to The Link who also writes for the Today Show’s website and MentalFloss.com.

For More Information: 

Jason Togyer | 412-268-8721 | jt3y@cs.cmu.edu