Peer Grading in Massive Open Online Courses (MOOCs)
Massive Open Online Courses (MOOCs) have been both hyped as steps towards cheaper, more democratic education and criticized as low-quality substitutes for traditional education.
In this project, we propose to address one of the major challenges in MOOCs: grading and feedback.
Improving the quality of grading and feedback to students would improve learning outcomes and add value to MOOC course credits, making MOOCs a more useful and sustainable educational resource.
Limited resources in large courses prevent personalized feedback from instructors.
We therefore turn to peer grading and feedback, which has the potential to support inexpensive and scalable MOOCs.
Peer grading has been tested with limited success, yet we believe that further research could make it practical and reliable.
This project aims to develop a deeper understanding of peer grading via a combination of theoretical analysis and empirical testing.
We aim to establish bounds on the reliability and scalability of peer grading systems.
Analysis of these fundamental properties will lay the groundwork for long-term MOOC research and development.
We will use our analysis to develop practical grading and feedback systems that combine student and instructor input.
Key challenges and primary research questions:
- Limited grading resources: We cannot expect students to grade more than a few peers. Likewise, we cannot expect instructors to grade more than a tiny fraction of students.
- Noisy grades: Non-expert grades will be noisy. Students may put varying amounts of effort into grading.
- Ground truth: Collecting ground truth (e.g., instructor grades) is expensive. The topic itself may be subjective.
- Reliability: What a priori quality guarantees can we give for peer grading systems? During a course, how can reliability be assessed and improved by the system or instructor?
- Scalability: What are fundamental scaling limits of peer grading systems, and what assumptions or system modifications can change those limits?
- Cardinal vs. ordinal assessment: How does the basic assessment method affect reliability of peer grades? Are cardinal grades or pairwise comparisons better, and in which types of tasks?
- Incentives: How can we build interpretable incentives into the system to encourage students to give higher quality peer grades and feedback?
- Interventions: What other interventions can improve the reliability or scalability of peer grading? Interventions might include targeted use of instructor grading, systems for student complaints, and adaptive grader-gradee assignment.
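As a toy illustration of the noisy-grades challenge above, the sketch below simulates peer grades with grader-specific noise and compares plain averaging against inverse-variance weighting. All function names and parameters here are our own illustrative choices, and the weighting step assumes grader noise levels are known, whereas in practice they would have to be estimated; this is a minimal sketch, not the project's method.

```python
import random

def simulate_peer_grades(n_submissions=200, graders_per_item=3, seed=0):
    """Simulate noisy peer grades: each submission has a true score in [40, 100],
    and each assigned grader reports it plus grader-specific Gaussian noise."""
    rng = random.Random(seed)
    true_scores = [rng.uniform(40, 100) for _ in range(n_submissions)]
    # Hypothetical grader pool: each grader has a fixed noise level (effort proxy).
    noise_sd = [rng.choice([2.0, 5.0, 15.0]) for _ in range(n_submissions)]
    grades = []
    for score in true_scores:
        row = []
        for _ in range(graders_per_item):
            g = rng.randrange(n_submissions)  # randomly assigned grader index
            row.append((g, score + rng.gauss(0, noise_sd[g])))
        grades.append(row)
    return true_scores, noise_sd, grades

def aggregate_mean(grades):
    """Baseline: unweighted average of the peer grades for each submission."""
    return [sum(v for _, v in row) / len(row) for row in grades]

def aggregate_weighted(grades, noise_sd):
    """Inverse-variance weighting, assuming grader noise levels are known."""
    out = []
    for row in grades:
        weighted = [(1.0 / noise_sd[g] ** 2, v) for g, v in row]
        out.append(sum(w * v for w, v in weighted) / sum(w for w, _ in weighted))
    return out

def rmse(est, truth):
    return (sum((e - t) ** 2 for e, t in zip(est, truth)) / len(truth)) ** 0.5
```

In this simulation, down-weighting high-noise graders typically yields a lower error than plain averaging, illustrating why estimating grader reliability matters when grading resources are limited.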
Current work:
- Cardinal vs. ordinal assessment: Psychological studies have shown that ordinal measurements (such as pairwise comparisons) can be more reliable than cardinal (numerical) measurements. By analyzing noise models and measuring noise in grading data, we are working to understand how noise levels differ across assessment tasks and settings.
- Sample complexity of peer assessment: We are studying the sample complexity of models for aggregating peer grades. We are also considering the effects of post-processing such as binning (pass/fail, as opposed to numerical grades) and complaint systems to catch errors.
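The intuition behind ordinal assessment can be illustrated with a small Thurstonian-style simulation: each comparison is derived from noisy perceptions of two submissions' quality, and a ranking is recovered by sorting on empirical win rates. This is an illustrative sketch under assumed parameters, not the noise model or aggregation scheme analyzed in our work.

```python
import random

def ordinal_ranking(true_quality, n_comparisons=2000, noise_sd=10.0, seed=1):
    """Recover a ranking from noisy pairwise comparisons.

    Each comparison picks two distinct submissions; the simulated grader
    perceives each quality with Gaussian noise and prefers the larger one
    (a Thurstonian noise model). Submissions are ranked by win rate.
    """
    rng = random.Random(seed)
    n = len(true_quality)
    wins = [0] * n
    counts = [0] * n
    for _ in range(n_comparisons):
        i, j = rng.sample(range(n), 2)
        perceived_i = true_quality[i] + rng.gauss(0, noise_sd)
        perceived_j = true_quality[j] + rng.gauss(0, noise_sd)
        wins[i if perceived_i > perceived_j else j] += 1
        counts[i] += 1
        counts[j] += 1
    rates = [w / c if c else 0.0 for w, c in zip(wins, counts)]
    # Highest empirical win rate first.
    return sorted(range(n), key=lambda k: -rates[k])
```

With enough comparisons, the recovered ordering matches the true quality ordering even though no grader ever reports a number; binning the resulting ranks (e.g., pass/fail at a cutoff) is one post-processing step of the kind studied above.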
Publications:
Nihar B. Shah, Joseph K. Bradley, Abhay Parekh, Martin Wainwright, and Kannan Ramchandran. A Case for Ordinal Peer-evaluation in MOOCs. NIPS Workshop on Data Driven Education, 2013.