Carnegie Mellon University
15-826 Multimedia Databases and Data Mining
Spring 2017 - C. Faloutsos

List of suggested non-default projects, for PhD students

PRELIMINARIES

The projects are grouped by general theme. We also list the available data and software, to help you make the most of your effort. More links and resources may be added in the future. Reminders:

SUGGESTED TOPICS

You may negotiate with the instructor, and propose a project outside of this list.

1. GRAPH / TENSOR MINING

1.1. Spam Detection for Review Data


1.2. Weighted graphs over time





1.3. Tensor decomposition using RDBMS


1.4. Confidence-based ranking for graph classification


2. MODELING

2.1. 'Brain in a box'


3. TIME SERIES

3.1. Guess the next flu spike: Co-evolving time series mining



4. RICH / HETEROGENEOUS GRAPHS

4.1. Structural Correlation of Attributes


4.2. Attribute and/or Link Prediction in Heterogeneous Networks


4.3. Clustering in Attributed Networks


DATASETS

Unless explicitly mentioned, the datasets are either 'public' or 'owned' by the instructor; for the rest, we need to discuss 'Non-disclosure agreements' (NDAs).

Time sequences

Spatial data

Graph data - need NDA

Graph data - public

Miscellaneous:


SOFTWARE

Notes for the software: Before you modify any code, please contact the instructor - ideally, we would like to use these packages as black boxes.


BIBLIOGRAPHICAL RESOURCES:


Last modified Jan. 24, 2017, by Christos Faloutsos.