Carnegie Mellon University
15826 Multimedia Databases and Data Mining
Fall 2012  C. Faloutsos
Final exam study guide
Preliminaries:
 No aids allowed, except
 two, standard 8'' x 11.5'' pages, with your notes (you may use
both sides) and
 pocket calculators: strongly
recommended (logarithms)
 Duration: 3 hours
 The exam will be comprehensive,
with more emphasis on
the material after the midterm
Reminders:
 Time and place: Thu. December
13 5:30p.m.‐8:30p.m. DH A302, as announced at the 'hub':
http://www.cmu.edu/hub/docs/finalexams.pdf
 Office hours by
instructor:
 the usual Wed
Dec. 5, 23pm
 extra: Mon Dec. 10, 1112noon
 extra: Tue Dec. 11, 122pm
 Several of the links are internal
to CMU.
Material to be examined
As in the original reading list, with a few papers deleted.
Specifically, the material consists of all
the lecture foils (including the guest lectures), and all the papers/chapters
below:
1. Foils:
Everything we covered in the foils (see their pdf, from the
course schedule page)
2. Multimedia Indexing
 Primary key access methods
 Secondary key and spatial access methods
 Jon
Louis Bentley,
Multidimensional binary search trees used for associative
searching, Comm. of the ACM (CACM), Volume 18 ,
Issue 9, pp. 509517, (September 1975)
 A. Guttman
RTrees: a Dynamic Index Structure for Spatial
Searching, Proc. ACM SIGMOD, June 1984, pp. 4757, Boston,
Mass.
 J. Orenstein,
Spatial Query Processing in an ObjectOriented Database
System, Proc. ACM SIGMOD, May, 1986, pp. 326336,
Washington D.C.
 Ibrahim Kamel and Christos Faloutsos,
Hilbert Rtree: An improved Rtree using fractals
Proc.
of VLDB Conference, Santiago, Chile, Sept. 1215, 1994, pp.
500509.
 Roberto F. Santos Filho, Agma Traina, Caetano Traina Jr., and
Christos Faloutsos:
Similarity search without tears: the OMNI family of
allpurpose
access methods ICDE, Heidelberg, Germany, April 26
2001.
 MM Textbook, chapters 4 and 5.
 Fractals
 Christos Faloutsos and Ibrahim Kamel,
Beyond Uniformity and Independence: Analysis of Rtrees
Using
the Concept of Fractal Dimension, Proc. ACM
SIGACTSIGMODSIGART PODS, May 1994, pp. 413, Minneapolis,
MN.
 Alberto Belussi and Christos Faloutsos, Estimating
the Selectivity of Spatial Queries Using the `Correlation' Fractal
Dimension Proc. of VLDB, p. 299310, 1995.
 Power laws, lognormals etc: M. E. J. Newman, Power laws, Pareto
distributions and Zipf's law Contemporary Physics 46, 323351
(2005) (local
pdf copy)
 Text and LSI
 MM Textbook, chapter 6
 Peter W. Foltz and Susan T. Dumais,
Personalized Information Delivery: an Analysis of
Information
Filtering Methods, Comm. of ACM (CACM), 35, 12, Dec. 1992,
pp. 5160.
 SVD: In PTVF ch. 2.6; MM
Textbook Appendix D
 PageRank: Sergey Brin, Lawrence Page The Anatomy of a
LargeScale Hypertextual Web Search Engine (1998) (local
pdf)
Also, [GM textbook section 15.4]
 HITS: Jon M. Kleinberg Authoritative Sources in a
Hyperlinked Environment JACM, 46,5 (1999) (local
pdf) (also [GM textbook, section 15.3]
 Tensors: Tamara G. Kolda and Brett W. Bader.
Tensor decompositions and applications. Technical Report
SAND20076702, Sandia National Laboratories, Albuquerque, NM and
Livermore, CA, November 2007 (local
pdf copy ) (Also, [GM textbook, ch 16]).
 ICA: JiaYu Pan, Christos Faloutsos, Masafumi Hamamoto and
Hiroyuki Kitagawa: AutoSplit: Fast and Scalable Discovery of
Hidden Variables in Stream and Multimedia Databases, PAKDD,
Sydney, Australia, May 2004.
 Time sequences
 DSP and image databases
 Myron Flickner, Harpreet Sawhney, Wayne Niblack, Jon Ashley,
Qian Huang, Byron Dom, Monika Gorkani, Jim Hafner, Denis Lee,
Dragutin Petkovic, David Steele and Peter Yanker
Query by Image and Video Content: the QBIC System
IEEE
Computer 28, 9, Sep. 1995, pp. 2332.
 Journal
of Intelligent Inf. Systems, 3, 3/4, pp. 231262, 1994 An
earlier, more technical version of the IEEE Computer '95
paper.
 FastMap: MM Textbook chapter 11; Also in:
C.
Faloutsos and K.I. Lin
FastMap: A Fast Algorithm for Indexing, DataMining and
Visualization of Traditional and Multimedia Datasets ACM
SIGMOD 95, pp. 163174
.
 DFT/DCT: In PTVF ch. 12.1, 12.3, 12.4; in
MM Textbook Appendix B.
 Wavelets: In PTVF ch. 13.10; in MM Textbook Appendix C
 KarhunenLoeve: in MM Textbook Appendix D.
 JPEG: Gregory K. Wallace,
The JPEG Still Picture Compression Standard, CACM,
34,
4, April 1991, pp. 3144
 MPEG: D. Le Gall,
MPEG: a Video Compression Standard for Multimedia
Applications CACM, 34, 4, April 1991, pp. 4658
 Fractal compression: M.F. Barnsley and A.D. Sloan,
A Better Way to Compress Images, BYTE, Jan. 1988,
pp.
215223.
 MM Textbook, chapter 9
3. Data mining
 Graph mining and social networks:
 Patterns in graphs [GM textbook, ch 38]
 R. Albert, H. Jeong, and A.L. Barabási,
Diameter of the World Wide Web Nature,
401, 130131 (1999).
 Réka Albert and AlbertLászló
Barabási
Statistical mechanics of complex networks, Reviews of
Modern Physics, 74, 47 (2002).
 Jure Leskovec, Jon Kleinberg, Christos Faloutsos Graphs
over Time: Densification Laws, Shrinking Diameters and Possible
Explanations, KDD 2005, Chicago, IL, USA, 2005.
 Graph generators: RMAT and Kronecker
[GM textbook, ch 1214]
 Community detection [GM textbook, ch 17]
 Time series forecasting
 Data Mining in Databases:
Reminders: Required texts
Recommended text
 [HK] Jiawei Han and
Micheline Kamber, Data Mining: Concepts
and Techniques, Morgan Kaufmann, 2000 (1st edition); 2006 (2nd
edition).
 [PTVF] William H. Press Saul A.
Teukolsky William T. Vetterling Brian P. Flannery Numerical
Recipes in C Cambridge University Press, 1992, 2nd Edition.
Online evaluation copy

Undergraduate DB textbook, for
those who took a db class too long ago:
Last modified: Dec. 3, 2012, by Christos Faloutsos