Fractals for Spatial and Temporal Databases

Christos Faloutsos Phone: (412)-268.1457
Department of Computer Science Fax : (412)-268.5576
Carnegie Mellon Univ. Email: christos@cs.cmu.edu
Pittsburgh, PA 15213 WWW page: http://www.cs.cmu.edu/~christos

Keywords

Fractals, data mining, power laws, temporal and spatial data

Project Award Information

Project Summary

The goal of this project is to use concepts from fractals to solve long-standing database problems in query optimization, spatial databases, and temporal databases. Moreover, it introduces the powerful tools of fractals, chaos and non-linear systems for additional applications, like forecasting, time-sequence data mining, buffering and prefetching, for large database applications.

Goals, Objectives, and Targeted Activities

We are focusing on two  settings: (a) spatial data, where the goal is to find power laws that describe real datasets
(b) temporal data, where we try to model bursty time sequences, that are not modeled well by the traditional Poisson arrivals.

Indication of Success

We have already discovered power laws in multiple settings: Moreover, we have used fractals and power laws to develop a novel dimensionality reduction method, which is fast, scalable and able to detect non-linear correlations. [Traina00]

Long range results: This is an exploratory project, whose aim is to show that power laws and fractals are the correct tools to use for data mining and pattern discovery in large spatial and temporal datasets. This is in contrast to the textbook approaches, which use the uniformity and independence assumptions, and the Gaussian and Poisson distributions; although easy to study, these assumptions are clearly unrealistic for an overwhelming majority of real datasets.

Project Impact

Project References

The following refereed publications mention the  NSF support, since March 2000:
  1. [Faloutsos00] Christos Faloutsos, Bernhard Seeger, Agma J. M. Traina and Caetano Traina Jr., Spatial Join Selectivity Using  Power Laws, SIGMOD 2000, May 2000
  2. [Korn01] Flip Korn, Bernd-Uwe Pagel and Christos Faloutsos, On the `Dimensionality Curse' and the `Self-Similarity  Blessing' IEEE-TKDE,  vol. 13, no. 1, pp. 96-111, Jan./Feb. 2001.
  3. [Moon01] Bongki Moon, H.V. Jagadish, Christos Faloutsos and  Joel Salz Analysis of the Clustering Properties of the  Hilbert  Space-Filling  Curve IEEE Trans. on Knowledge and Data Engineering (IEEE-TKDE), vol. 13, no. 1,  pp.  124-141, Jan./Feb.  2001.
  4. [Proietti00] Guido  Proietti and Christos Faloutsos Analysis of  Range Queries and Self-Spatial Join  Queries  on Real  Region Datasets Stored Using and R-Tree IEEE TKDE, 12,  5, Sept./Oct. 2000, pp 751-762.
  5. [Traina00] Caetano Traina Jr.,  Agma  Traina,  Leejay  Wu  and  Christos Faloutsos Fast feature selection using fractal dimension  XV  Brazilian  Symposium  on  Databases  (SBBD), Paraiba, Brazil, October 2000.

Area Background

The project requires familiarity with spatial, temporal and multimedia databases, as well as with fractals.
 

Area References:

GPRA performance criteria

Discoveries at and across the frontiers of science and engineering: Fractals span multiple disciplines, like storage systems design, performance evaluation, spatial/geographical databases, financial time sequences, and many more.

Connections between discoveries and their use in the service of society: Power laws can detect patterns among spatial distributions, like hospitals, residences and disease outbreaks, to help with urban planning - and this is just one of the myriad applications.

A diverse globally-oriented workforce: Our efforts are inherently cross-disciplinary, straddling databases, statistics, machine learning, storage systems, networks and statistics.