Internet Search Technologies, 15-505, Fall 2007
Syllabus and Course Schedule |
This seminar is a practicum on how research from across computer science is used to provide internet search and related services. We look at selected works from the areas of systems, machine learning, language technologies, and human-computer interaction. The seminar meets weekly for 90 minutes, and each class consists of a lecture and an interactive discussion.
This schedule will change and get more fleshed out as the semester progresses.
Module | Lectures, readings, online materials | Homework |
Lecture 1: 8/28/07 (Larsen) |
Lecture 2: 9/4/07 (Monson) Parallel computation through MapReduce
· MapReduce: Simplified Data Processing on Large Clusters, Jeffrey Dean, Sanjay Ghemawat, OSDI'04: Sixth Symposium on Operating Systems Design and Implementation, 2004 |
Reading response due at start of class. |
Lecture 3: 9/11/07 (Monson) A file system optimized for streaming reads and appending writes
· The Google File System, Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung, Proceedings of the 19th ACM Symposium on Operating Systems Principles, 2003. |
Reading response due at start of class. HW 1 due. |
Lecture 4: 9/18/07 A distributed storage system for structured non-relational data
· Bigtable: A Distributed Storage System for Structured Data, Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber, 7th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2006 |
Reading response due at start of class. |
Information retrieval |
Lecture 5: 9/25/07 (Nigam) Introduction to information retrieval |
HW 2 due. |
Lecture 6: 10/2/07 (Larsen) Improved ranking using link structure: PageRank and Hubs & Authorities
· Sergey Brin,
· (Optional) Jon Kleinberg. Authoritative sources in a hyperlinked environment. Proc. 9th ACM-SIAM Symposium on Discrete Algorithms, 1998. Extended version in Journal of the ACM 46 (1999). Also appears as IBM Research Report RJ 10076, May 1997. |
Reading response due at start of class. |
Lecture 7: 10/9/07 Supervised classification and logistic regression |
Lecture 8: 10/16/07 (Nigam) Hierarchical agglomerative clustering, k-means clustering, canopies
· (Optional) Andrew McCallum, Kamal Nigam and Lyle Ungar. Efficient Clustering of High-Dimensional Data Sets with Application to Reference Matching. In Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2000. |
HW 3 due. |
Lecture 9: 10/23/07 (Larsen) More supervised classification |
N/A |
Lecture 10: 10/30/07 (Fyshe) |
N/A |
Lecture 11: 11/6/07 (Nigam) |
Lecture 12: 11/13/07 Introduction to good user interface design practice |
TBD |
Lecture 13: 11/20/07 Extended case study of user interface design |
TBD |
Lecture 14: 11/27/07 |
TBD |
Lecture 15: 12/4/07 (Nigam) Organizing companies around the practice of computer science |
TBD |