Thesis Proposal

Mining Large Dynamic Graphs and Tensors


Kijung Shin

March 5, 2018, 11:30 am EST
Newell Simon Hall 1507



Speaker

Kijung Shin
Ph.D. Student, Computer Science Department
Carnegie Mellon University


Thesis Committee



Document

The write-up [.pdf] can be found here.


Slides

The slides [.pdf] can be found here.


Abstract

Graphs are everywhere from online social networks to bipartite review graphs. Many of them are large and dynamic. Moreover, they are with extra information and thus naturally modeled as tensors. Given large dynamic graphs and tensors, how can we analyze their structure? How can we detect interesting anomalies? Lastly, how can we model the behavior of the individuals in the data?

My thesis focuses on these closely related questions, all of which are fundamental to understand large growing data on user behavior. For structure analysis, we develop streaming algorithms that incrementally update several connectivity measures in dynamic graphs. We also improve the scalability of Tucker Decomposition, which summarizes the structure of tensors. For anomaly detection, we develop several approximation algorithms that detect anomalously dense subgraphs and subtensors. The algorithms are designed for various settings, including distributed and streaming settings. Lastly, regarding behavior modeling, we build a stage model for progression of users on social media, and a game-theoretic model for purchasing behavior of individuals in social networks.

As future work, we will further explore the relation between the structure of social networks and the behavior of individuals in them by building a model for polarization in social networks. We will also advance our algorithms for larger and more dynamic data (e.g., fully dynamic graph streams where edges are not only added but also deleted).



References (Completed Work)

My thesis proposal is based on the following publications:


My thesis proposal is also based on the following preprints:



Open Source Software



Datasets (Graphs)


NameDescriptionSourceDownload
KarateClub Friendship Network KONECT and UCINET Link
Hamster Friendship Network KONECT Link
CatsterFriendship Network KONECT Link
YouTubeFriendship Network SNAP and MPI-SWS Link
FlickrFriendship Network KONECT and MPI-SWS Link
Flickr(L)Friendship Network KONECT and MPI-SWS Link
OrkutFriendship Network SNAP and MPI-SWS Link
Youtube(L) Friendship Network Network Repository and MPI SWS Link
LiveJournalFriendship Network SNAP and MPI-SWS Link
FriendsterFriendship Network KONECT Link
AdvogatoTrust Network KONECT Link
EpinionTrust Network Network Repository and SNAP Link
TwitterSubscription Network KAIST Link
EmailEmail Network SNAP Link
Email(L)Email Network KONECT and CMU Link
StanfordWeb Graph SNAP Link
NotreDameWeb Graph SNAP Link
BerkStanWeb Graph SNAP Link
GoogleWeb Graph SNAP Link
CaidaInternet Topology SNAP and CAIDA Link
SkitterInternet Topology SNAP and CAIDA Link
HepThCitation Network SNAP and Cornell Link
HepPhCitation Network SNAP and KDD Cup Link
PatentCitation Network SNAP and NBER Link


Datasets (Tensors)


Name Modes Entries Source Download
YoutubeUser X User X Date1 (Friend) or 0 KONECT Link
SMSUser X User X Timestamp # Messages NDA NDA
StackOver.User X Post X Timestamp1 (Like) or 0 KONECT Link
Wiki (Kor)User X Page X Timestamp# Revisions Wikimedia Link
Wiki (Eng)User X Page X Timestamp# Revisions Wikimedia Link
Yelp User X Business X Date X Score # Reviews Yelp Link
AppMarket User X App X Date X Score # Reviews ODDS
Android User X App X Date X Score # Reviews UCSD Link
NetflixUser X Movie X Date X Score# Reviews Netflix Link
YahooMusicUser X Item X Date X Score# Reviews Yahoo Labs Link
AcademicAuthor X Venue X Year X Keyword # Papers KDD Cup Link
DARPASrc IP X Dst IP X Timestamp # Connections Lincoln Lab Link
AirForceProtocol X Service X Flag X Src Bytes
X Dst Bytes X Counts X Srv Counts
# Connections UCI KDD Archive Link
LinkedInUser X Features X Timestamp 1 (Use) or 0 NDA NDA