Analysis of Social Media course


Overview & Description

The class meets Tuesdays 4:30-6:30 in Wean Hall 4623. The instructors are William Cohen and Natalie Glance (Google Pittsburgh). The course is listed as MLD 10-802 and also as LTI 11-772.

The most actively growing part of the web is "social media", e.g., wikis, blogs, bboards, and collaboratively-developed community sites like Flickr and YouTube. This seminar course will review selected papers from the recent research literature that address the problem of analyzing and understanding social media. This will be a 6-credit course, with the primary workload being attending class and presenting material.

Topics that will be covered include:

  • Text analysis techniques for sentiment analysis, analysis of figurative language, authorship attribution, and inference of demographic information about authors (e.g., age or sex).
  • Community analysis techniques for detecting communities, predicting authority, assessing influence (e.g., in viral marketing), or detecting spam.
  • Visualization techniques for understanding the interactions within and between communities.
  • Learning techniques for modeling and predicting trends in social media, or predicting other properties of media (e.g., user-provided content tags).

Students should have taken a machine learning course (e.g., 15-781 or 15-681) or have the consent of the instructor. The content of the course will be complementary to another new course, “The Social Web: Content, Communities, and Context” (05-320/05-820), which is also being offered in fall 2007.

Course Projects

For those students who have elected to upgrade the course to a full 12-credit course and submit a course project:

  • By 10/1, everyone should send the instructors a three-page writeup of your proposed project that describes the problem you are studying, the inputs and outputs of the method you plan to develop, and the dataset you plan to use, and briefly discusses the techniques you plan to use.
  • The final project is due at midnight EST on 12/13 and should be a paper in the format used by ICWSM, i.e., an 8-page, two-column conference paper. (Unfortunately the ICWSM deadline is earlier than this, Dec 3, but so it goes.)


August and September

  • Aug 28: Organizational meeting (William). Slides.

Papers discussed: Turney, ACL 2001, Pang et al, EMNLP 2002, Wiebe et al, Computational Linguistics 2005

Papers discussed: Page et al, 1999

  • Sep 18: Lecture on Slides.

Papers discussed: Cohn and Hofmann, NIPS 2001, Erosheva et al, PNAS 2004, Rosen-Zvi et al, UAI 2004, McCallum et al, IJCAI 2005, Dietz et al, ICML 2007. Ramesh also suggested some background reading papers on PLSA, LDA, and topic models.

Slides part 2b; Slides part 2c.

Papers discussed: Sun et al, KDD 2006, Wang et al, SRDS 2003, Chakrabarti et al, KDD 2004.


  • Oct 16. Interested in meeting Matt while he's here?


  • Nov 13. Community dynamics.
    • Student 1: Shilpa
    • Student 2: Hanghang Tong - Papers: [A] Deepayan Chakrabarti, Spiros Papadimitriou, Dharmendra S. Modha, Christos Faloutsos: Fully automatic cross-associations. KDD 2004: 79-88. [B] Jimeng Sun, Christos Faloutsos, Spiros Papadimitriou, Philip S. Yu: GraphScope: parameter-free mining of large time-evolving graphs. KDD 2007: 687-696.
    • Student 3: Sachin
  • Nov 27. Design of online communities. (Guest lecture from Bob Kraut, HCII).


  • Dec 4. Anonymity and privacy issues.
    • Student 1: Yimeng
    • Student 2: Justin
    • Student 3: Jana
  • Dec 11. Tagging and folksonomies.
    • Student 1: open
    • Student 2: open
    • Student 3: open