Carnegie Mellon University
15-826 Multimedia Databases and Data Mining
Spring 2006 - C. Faloutsos
Reading List
NOTICE:
Several of the links are internal to CMU.
Required text
Recommended text
- [HK] Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann, 2000.
- [PTVF] William H. Press, Saul A. Teukolsky, William T. Vetterling and Brian P. Flannery, Numerical Recipes in C, Cambridge University Press, 1992, 2nd Edition. On-line evaluation copy.
- Undergraduate DB textbook, for those who took a DB class too long ago:
- Raghu Ramakrishnan, Johannes Gehrke, "Database Management
Systems," McGraw-Hill 2002 (3rd ed).
Foils:
In pdf
A. Multimedia Indexing
- Primary
key access methods
- Secondary key and spatial access methods
- A. Guttman, R-Trees: a Dynamic Index Structure for Spatial Searching, Proc. ACM SIGMOD, June 1984, pp. 47-57, Boston, Mass.
- J. Orenstein, Spatial Query Processing in an Object-Oriented Database System, Proc. ACM SIGMOD, May 1986, pp. 326-336, Washington, D.C.
- Textbook, chapters 4 and 5.
- Fractals
- Ibrahim Kamel and Christos Faloutsos, Hilbert R-tree: An improved R-tree using fractals, Proc. of VLDB Conference, Santiago, Chile, Sept. 12-15, 1994, pp. 500-509.
- Christos Faloutsos and Ibrahim Kamel, Beyond Uniformity and Independence: Analysis of R-trees Using the Concept of Fractal Dimension, Proc. ACM SIGACT-SIGMOD-SIGART PODS, May 1994, pp. 4-13, Minneapolis, MN.
- Text and LSI
- Time sequences
- DSP and image databases
- Myron Flickner, Harpreet Sawhney, Wayne Niblack, Jon Ashley, Qian Huang, Byron Dom, Monika Gorkani, Jim Hafner, Denis Lee, Dragutin Petkovic, David Steele and Peter Yanker, Query by Image and Video Content: the QBIC System, IEEE Computer, 28, 9, Sep. 1995, pp. 23-32. (hard copy: on reserve)
- Journal of Intelligent Inf. Systems, 3, 3/4, pp. 231-262, 1994. (An earlier, more technical version of the IEEE Computer '95 paper.)
- FastMap: Textbook chapter 11; also in: C. Faloutsos and K.I. Lin, FastMap: A Fast Algorithm for Indexing, Data-Mining and Visualization of Traditional and Multimedia Datasets, ACM SIGMOD 95, pp. 163-174.
- DFT/DCT: In PTVF ch. 12.1, 12.3, 12.4; in Textbook Appendix B.
- Wavelets: In PTVF ch. 13.10; in Textbook Appendix C
- Karhunen-Loeve: in Textbook Appendix D.
- JPEG: Gregory K. Wallace, The JPEG Still Picture Compression Standard, CACM, 34, 4, April 1991, pp. 31-44.
- MPEG: D. Le Gall, MPEG: a Video Compression Standard for Multimedia Applications, CACM, 34, 4, April 1991, pp. 46-58.
- Fractal compression: M.F. Barnsley and A.D. Sloan, A
Better Way to Compress Images, BYTE, Jan. 1988, pp. 215-223. (hard copy: on reserve)
- Textbook, chapter 9
B. Data mining
- Graph mining and social networks:
- Michalis Faloutsos, Petros Faloutsos and Christos Faloutsos, On
Power-Law Relationships of the Internet Topology, SIGCOMM 1999.
- R. Albert, H. Jeong, and A.-L. Barabási, Diameter of the World Wide Web, Nature, 401, 130-131 (1999).
- Réka Albert and Albert-László Barabási, Statistical mechanics of complex networks, Reviews of Modern Physics, 74, 47 (2002).
- Time series forecasting
- Statistics background: In PTVF pp. 620-621 and ch. 14.4-14.5.
- AI background / Classification
- [HK] chapter 7.3
- Rakesh Agrawal, Sakti Ghosh, Tomasz Imielinski, Bala Iyer and Arun Swami, An Interval Classifier for Database Mining Applications, VLDB Conf. Proc., Vancouver, BC, Canada, Aug. 1992, pp. 560-573.
- M. Mehta, R. Agrawal and J. Rissanen, SLIQ: A Fast Scalable Classifier for Data Mining, Proc. of the Fifth Int'l Conference on Extending Database Technology, Avignon, France, March 1996.
- Data Mining in Databases:
- Data warehouses, OLAP and DataCubes: [HK], ch. 2.
- Data reduction: [HK] chapter 3.4
- Association Rules:
- Cluster analysis: [HK] chapter 8.
- Miscellaneous (ICA, approximate counting)
- Jia-Yu Pan, Christos Faloutsos, Masafumi Hamamoto and Hiroyuki Kitagawa, AutoSplit: Fast and Scalable Discovery of Hidden Variables in Stream and Multimedia Databases, PAKDD, Sydney, Australia, May 2004.
- Christopher Palmer, Phillip Gibbons and Christos Faloutsos, ANF: A Fast and Scalable Tool for Data Mining in Massive Graphs, KDD 2002, Edmonton, Alberta, Canada, July 2002.
- Aristides Gionis, Dimitrios Gunopulos and Nikos Koudas, Efficient and Tunable Similar Set Retrieval, ACM SIGMOD, Santa Barbara, California, May 21-24, 2001.
- Phillip B. Gibbons and Yossi Matias, New sampling-based summary statistics for improving approximate query answers, ACM SIGMOD, pp. 331-342, Seattle, Washington, 1998.
RECOMMENDED OPTIONAL READING
Additional optional citations that may be useful for your project:
Multimedia indexing
- Spatial access methods:
- N. Beckmann, H.-P. Kriegel, R. Schneider and B. Seeger, The R*-Tree: an Efficient and Robust Access Method for Points and Rectangles, ACM SIGMOD, May 1990, pp. 322-331, Atlantic City, NJ. (Deferred splitting in R-trees)
- Fractals
- B. Mandelbrot, Fractal Geometry of Nature, W.H. Freeman, 1977. (The classic book on fractals)
- Manfred Schroeder, Fractals, Chaos, Power Laws: Minutes From an Infinite Paradise, W.H. Freeman and Company, 1991. (An excellent introduction to fractals)
Data mining
- Time sequences
- George E.P. Box, Gwilym M. Jenkins and Gregory C. Reinsel, Time Series Analysis: Forecasting and Control, Prentice Hall, 1994 (3rd Edition). (Time series forecasting: the classic approach. It also has the algorithms for linear predictive coding.)
- Andreas S. Weigend and Neil A. Gershenfeld, Time Series Prediction: Forecasting the Future and Understanding the Past, Addison Wesley, 1994. (Time series forecasting: non-linear/chaotic approaches)
- NEW: Spiros Papadimitriou, Jimeng Sun and Christos Faloutsos, Streaming Pattern Discovery in Multiple Time-Series, VLDB 2005, Trondheim, Norway.
- Graph mining:
- Tom Mitchell, Machine Learning, McGraw Hill, 1997.
- John Ross Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann Publishers Inc., 1993. (Introduction to data mining, with source code)
Last modified 04/03/06 by Christos Faloutsos