10-709 Fall 2017: Fundamentals of Learning from the Crowd

Time: Tuesday and Thursday 1.30pm to 2.50pm
Location: GHC 4303
Units: 12

Instructor: Nihar Shah
Nihar's office hours: 3-4pm every Tuesday in GHC 8211
nihars at cs dot cmu dot edu

TA: Ritesh Noothigattu
Ritesh's office hours: 3-4pm every Thursday outside GHC 8013
riteshn at cmu dot edu

Description: Crowdsourcing is a burgeoning area that is popular in academic research, industrial applications, and societal causes. In this course, we will cover the foundational theoretical principles behind crowdsourcing and learning from the crowd. We will study this field through the lens of game theory (how to incentivize people to provide better data) and that of learning theory (how to make sense of this data). We will also touch upon literature in psychology and economics that studies the behavior of people. Along the way, we will discuss several fascinating paradoxes and conduct some live experiments in class. Almost all lectures will be taught on the board. Required background material, such as scoring rules, Nash equilibrium, concentration inequalities, and random matrix theory, will be taught in class.

Evaluation: Homeworks, final project, class participation.
Prerequisites: Basic probability (e.g., the student should be comfortable with conditional expectations, the Gaussian distribution, union bound), basic linear algebra (e.g., singular value decomposition) and basic programming.

Tentative schedule (subject to change):
Sept 5: What is this course about? What is crowdsourcing?
Sept 7: How to win $40,000?
Sept 12: How to properly make strict rules?
Sept 14: How do these rules help in machine learning?
Sept 19: How do casinos help in crowdsourcing?
Sept 21: What was "A beautiful mind" all about?
Sept 26: How to administer a virtual truth serum?
Sept 28: How to predict events using people's opinions?
Oct 3: We are not who we think we are. Then who are we?
Oct 5: We are all so weird! I'll try to demonstrate.
Oct 10: What are concentration inequalities? (And no, it's not related to yoga.)
Oct 12: What are good models for ranking?
Oct 17: How to rank in a simple, robust and optimal manner?
Oct 19: What is rank centrality?
Oct 24: Guest lecture
Oct 26: Who started the gossip?
Oct 31: I started the gossip. And you can't catch me. (Guest lecture)
Nov 2: How to grade your peers?
Nov 7: What is parametric labeling?
Nov 9: What is permutation labeling?
Nov 14: How can I be fair to everyone?
Nov 16: You cannot be fair to everyone.
Nov 21: Well, ok. But still, can I try?
Nov 28: Let's try in the real world.
Nov 30: Finishing up. Do you have any questions?
Dec 5: Project presentations I
Dec 7: Project presentations II

Timeline for homeworks and project:
Sept 20, 1pm: Group names
Oct 17, 1pm: Proposal
Dec 5, 1pm: Final report
Last two lectures: Presentation
Major homeworks:
Sept 20, 1pm: Major homework 1 release
Oct 8, 1pm: Major homework 1 submission
Nov 3, 1pm: Major homework 2 release
Nov 21, 1pm: Major homework 2 submission