Carnegie Mellon University
15-826 Multimedia Databases and Data Mining
Fall 2011 - C. Faloutsos
Final exam study guide
Preliminaries:
- All aids allowed, except communication devices (laptops, cell-phones etc)
- Duration: 3 hours
- The exam will be comprehensive, with more emphasis on
the material after the midterm
Reminders:
- Time and place: Tue. December 13 5:30p.m.‐8:30p.m. DH 2302, as announced at the 'hub': http://www.cmu.edu/hub/docs/final-exams.pdf
- Office hours by
instructor:
- the usual Wed Dec. 7, 3-4pm
- extra: Mon Dec. 12, 2-3pm
- Several of the links are internal to CMU.
Material to be examined
As in the original reading list, with a few papers deleted.
Specifically, the material consists of all the lecture foils (including the guest lectures), and all the papers/chapters
below:
Foils:
Everything we covered in the foils (see their pdf, from the
course schedule page)
A. Multimedia Indexing
- Primary key access methods
- Secondary key and spatial access methods
- Jon
Louis Bentley,
Multidimensional binary search trees used for associative
searching, Comm. of the ACM (CACM), Volume 18 ,
Issue 9, pp. 509-517, (September 1975)
- A. Guttman
R-Trees: a Dynamic Index Structure for Spatial
Searching, Proc. ACM SIGMOD, June 1984, pp. 47-57, Boston,
Mass.
- J. Orenstein,
Spatial Query Processing in an Object-Oriented Database
System, Proc. ACM SIGMOD, May, 1986, pp. 326-336,
Washington D.C.
- Ibrahim Kamel and Christos Faloutsos,
Hilbert R-tree: An improved R-tree using fractals Proc.
of VLDB Conference, Santiago, Chile, Sept. 12-15, 1994, pp.
500-509.
- Roberto F. Santos Filho, Agma Traina, Caetano Traina Jr., and
Christos Faloutsos:
Similarity search without tears: the OMNI family of all-purpose
access methods ICDE, Heidelberg, Germany, April 2-6
2001.
- Textbook, chapters 4 and 5.
- Fractals
- Christos Faloutsos and Ibrahim Kamel,
Beyond Uniformity and Independence: Analysis of R-trees Using
the Concept of Fractal Dimension, Proc. ACM
SIGACT-SIGMOD-SIGART PODS, May 1994, pp. 4-13, Minneapolis,
MN.
- Alberto Belussi and Christos Faloutsos, Estimating
the Selectivity of Spatial Queries Using the `Correlation' Fractal
Dimension Proc. of VLDB, p. 299-310, 1995 (and
gzipped postscript )
- Power laws, lognormals etc: M. E. J. Newman, Power laws, Pareto
distributions and Zipf's law Contemporary Physics 46, 323-351
(2005) (local
pdf copy)
- Text and LSI
- Textbook, chapter 6
- Peter W. Foltz and Susan T. Dumais,
Personalized Information Delivery: an Analysis of Information
Filtering Methods, Comm. of ACM (CACM), 35, 12, Dec. 1992,
pp. 51-60.
- SVD: In PTVF ch. 2.6; Textbook Appendix D
- PageRank: Sergey Brin, Lawrence Page The Anatomy of a
Large-Scale Hypertextual Web Search Engine (1998) (local
pdf)
- HITS: Jon M. Kleinberg Authoritative Sources in a
Hyperlinked Environment JACM, 46,5 (1999) (local
pdf)
- Tensors: Tamara G. Kolda and Brett W. Bader.
Tensor decompositions and applications. Technical Report
SAND2007-6702, Sandia National Laboratories, Albuquerque, NM and
Livermore, CA, November 2007 (local
pdf copy )
- Time sequences
- DSP and image databases
- Myron Flickner, Harpreet Sawhney, Wayne Niblack, Jon Ashley,
Qian Huang, Byron Dom, Monika Gorkani, Jim Hafner, Denis Lee,
Dragutin Petkovic, David Steele and Peter Yanker
Query by Image and Video Content: the QBIC System IEEE
Computer 28, 9, Sep. 1995, pp. 23-32.
- Journal
of Intelligent Inf. Systems, 3, 3/4, pp. 231-262, 1994 An
earlier, more technical version of the IEEE Computer '95
paper.
- FastMap: Textbook chapter 11; Also in: C.
Faloutsos and K.I. Lin
FastMap: A Fast Algorithm for Indexing, Data-Mining and
Visualization of Traditional and Multimedia Datasets ACM
SIGMOD 95, pp. 163-174
.
- DFT/DCT: In PTVF ch. 12.1, 12.3, 12.4; in
Textbook Appendix B.
- Wavelets: In PTVF ch. 13.10; in Textbook Appendix C
- Karhunen-Loeve: in Textbook Appendix D.
- JPEG: Gregory K. Wallace,
The JPEG Still Picture Compression Standard, CACM, 34,
4, April 1991, pp. 31-44
- MPEG: D. Le Gall,
MPEG: a Video Compression Standard for Multimedia
Applications CACM, 34, 4, April 1991, pp. 46-58
- Fractal compression: M.F. Barnsley and A.D. Sloan,
A Better Way to Compress Images, BYTE, Jan. 1988, pp.
215-223.
- Textbook, chapter 9
B. Data mining
- Graph mining and social networks:
- Michalis Faloutsos, Petros Faloutsos and Christos Faloutsos,
On Power-Law Relationships of the Internet Topology,
SIGCOMM 1999.
- R. Albert, H. Jeong, and A.-L. Barabási,
Diameter of the World Wide Web Nature,
401, 130-131 (1999).
- Réka Albert and Albert-László
Barabási
Statistical mechanics of complex networks, Reviews of
Modern Physics, 74, 47 (2002).
- Haveliwala, Taher H. (2003) Topic-Sensitive PageRank: A Context-Sensitive Ranking Algorithm for Web Search. Technical Report. Stanford InfoLab. (Extended version of the WWW2002 paper on Topic-Sensitive PageRank.)
- D. Chakrabarti, S. Papadimitriou, D. Modha and C. Faloutsos, Fully Automatic Cross-Associations, in KDD 2004 (pages 79-88), Washington, USA
- Jure Leskovec, Jon Kleinberg, Christos Faloutsos Graphs
over Time: Densification Laws, Shrinking Diameters and Possible
Explanations, KDD 2005, Chicago, IL, USA, 2005.
- D. Chakrabarti and C. Faloutsos, Graph Mining: Laws, Generators and
Algorithms, in ACM Computing Surveys, 38(1), 2006
(
pdf draft, internal to CMU)
- Time series forecasting
- Data Mining in Databases:
- Data warehouses, OLAP and DataCubes: [HK], ch. 2.1-4 (1st ed), or 3.1-4 (2nd ed)
- Association Rules:
- Miscellaneous: ICA
Reminders: Required text
Recommended text
- [HK] Jiawei Han and
Micheline Kamber, Data Mining: Concepts
and Techniques, Morgan Kaufmann, 2000 (1st edition); 2006 (2nd
edition).
- [PTVF] William H. Press Saul A.
Teukolsky William T. Vetterling Brian P. Flannery Numerical
Recipes in C Cambridge University Press, 1992, 2nd Edition.
On-line evaluation copy
- Undergraduate DB textbook, for
those who took a db class too long ago:
Last modified: Dec. 6, 2011, by Christos Faloutsos