Contact Information

Email:   shomir.wilson -at-  
Office: 835 Rhodes Hall
Mailing Address: Shomir Wilson
Electrical Engineering and Computing Systems
University of Cincinnati
PO Box 210030
Cincinnati, OH 45221-0030

Who I Am

Update 2016-08-22: I have started as an Assistant Professor in the EECS Department at the University of Cincinnati. When time permits I will update this page and move it to a server at UC. In the meantime some content on this site may be out of date.

Previously I was a Project Scientist at Carnegie Mellon University's School of Computer Science. Before that I was a Postdoctoral Fellow at the same university and an NSF International Research Fellow at the University of Edinburgh's School of Informatics. I received my PhD in Computer Science from the University of Maryland in 2011.

My research spans natural language processing, privacy, and artificial intelligence. Recently a major focus of my work has been the Usable Privacy Policy Project.

You can read more about my research interests and my teaching experience.

Curriculum Vitae

Have a look here.


2016-09-09: On November 1 I will give an invited talk as part of HCOMP's Encore Track in Austin, Texas.

2016-07-25: We have released the OPP-115 Corpus, a unique resource of 115 website privacy policies annotated with 23K data practices. The corpus is described in detail in our recent ACL paper.

2016-07-06: Starting this fall I will be an Assistant Professor of Computer Science at the University of Cincinnati.

2016-06-03: Along with my colleagues Alessandro Oltramari and Fei Liu, I am organizing an AAAI Fall Symposium on privacy and language technologies. Visit the symposium website here and help us advertise with this flyer.

2016-05-30: Our paper "The Creation and Analysis of a Website Privacy Policy Corpus" has been accepted for poster presentation at ACL this August, and our poster abstract "Visualization and Interactive Exploration of Data Practices in Privacy Policies" has been accepted by SOUPS.

2016-04-13: Our paper "Crowdsourcing Annotations of Websites’ Privacy Policies: Can It Really Work?" has been selected as one of five Best Paper Finalists at the 25th World Wide Web Conference.

2016-03-31: Our paper "Demystifying Privacy Policies with Language Technologies: Progress and Challenges" has been accepted by the Text Analysis for Cybersecurity and Online Safety workshop at LREC. I will present it.

2016-03-20: Lifehacker posted an article about our data exploration website.

2016-03-15: The Consumerist wrote an article about our data exploration website.

2016-03-10: Here's a data exploration website that my group has created to showcase the privacy policy annotations that we've collected. Also, here's a press release about it.

2016-01-05: I gave a talk as part of the CHIME seminar series at the National University of Singapore. Thanks to Min-Yen Kan for hosting me.

2015-12-16: Our paper "Crowdsourcing Annotations of Websites’ Privacy Policies: Can It Really Work?" has been accepted for presentation at the 25th World Wide Web Conference. The camera-ready version is here.

2015-10-18: Our paper "This Table is Different: A WordNet-Based Approach to Identifying References to Document Entities" has been accepted by the Global Wordnet Conference for a half-hour presentation. Here is the camera-ready version.

2015-06-05: I am pleased to announce that I have accepted an offer for a project scientist position in the School of Computer Science here at Carnegie Mellon University, starting in August. My primary responsibility will be the Usable Privacy Policy Project, a group that I have been involved with since its inception.

2015-04-22: I recently gave talks at Johns Hopkins University's Human Language Technology Center of Excellence (announcement and slides) and the University of Pennsylvania's Positive Psychology Center (slides).

2015-02-20: I had a photograph in the art gallery of Carnegie Mellon University's SCS Day.

2014-10-07: I've been selected to be a Grand Awards Judge in Computer Science at the Intel International Science and Engineering Fair in May 2015.

2014-09-01: Our submission to HCOMP, titled "Identifying relevant text fragments to help crowdsource privacy policy annotations", was accepted for publication.

2014-08-20: I have moved from the University of Edinburgh to the Language Technologies Institute at Carnegie Mellon University for the second part of my NSF IRFP fellowship. It's been great meeting new colleagues and reconnecting with old colleagues from my previous stay here.

2014-05-09: I gave a talk today for the NLIP Seminar Series at the University of Cambridge. Here are my slides. My recent work at the University of Edinburgh is described in the second half of the presentation.

2014-05-01: My short paper submission to ACL in Baltimore was accepted. The dataset it describes is here.

2013-11-04: Here's a belated link to a press release on the Usable Privacy Policy Project that I'm involved with. Also, here's the project website.

2013-10-28: The overview of my research is now up to date.

2013-08-05: I recently arrived at the University of Edinburgh to begin the first part of my NSF International Research Fellowship. It's been great meeting several new colleagues, and I look forward to working here for the next twelve months. Also, in the coming months I will attend Ubicomp in Zurich and IJCNLP in Nagoya to present a paper at each.

2013-04-30: I recently gave a talk for the CL+NLP Lunch at Carnegie Mellon. You can take a look at my slides.

2012-11-18: I've updated the overview of my research.

2012-07-02: I've created a stub page to host the metalanguage corpus described in my recent ACL paper.

2012-03-16: My paper "The Creation of a Corpus of English Metalanguage" has been accepted for oral presentation at ACL this July in Jeju, South Korea. I will attend to present it.

2012-02-24: A profile of some of my Ph.D. research is up on

2012-02-12: I've been adding content to the EAPSI page over the past few months, and it is now complete.

2011-10-23: Prompted by my move to Carnegie Mellon last month, I've assembled this long-overdue renovation of my website. I'll add more content in the coming months.