Thesis Proposal
		Mining Large Dynamic Graphs and Tensors
		Kijung Shin
		March 5, 2018, 11:30 am EST  Newell Simon Hall 1507 
		
		
	
 
		
		Speaker
		Kijung Shin 
		Ph.D. Student, Computer Science Department 
		Carnegie Mellon University 
		
		
		Thesis Committee
		
				
		
		
		Document 
		The write-up [.pdf] can be found 
here. 
		
		
		
		Slides 
		The slides [.pdf] can be found 
here. 
		
		
		Abstract 
		Graphs are everywhere from online social networks to bipartite review graphs. Many of them are large and dynamic. Moreover, they are with extra information and thus naturally modeled as tensors. Given large dynamic graphs and tensors, how can we analyze their structure? How can we detect interesting anomalies? Lastly, how can we model the behavior of the individuals in the data? 
		My thesis focuses on these closely related questions, all of which are fundamental to understand large growing data on user behavior. For structure analysis, we develop streaming algorithms that incrementally update several connectivity measures in dynamic graphs. We also improve the scalability of Tucker Decomposition, which summarizes the structure of tensors. For anomaly detection, we develop several approximation algorithms that detect anomalously dense subgraphs and subtensors. The algorithms are designed for various settings, including distributed and streaming settings. Lastly, regarding behavior modeling, we build a stage model for progression of users on social media, and a game-theoretic model for purchasing behavior of individuals in social networks.  
		As future work, we will further explore the relation between the structure of social networks and the behavior of individuals in them by building a model for polarization in social networks. We will also advance our algorithms for larger and more dynamic data (e.g., fully dynamic graph streams where edges are not only added but also deleted).
		
		
		
		
		
		
		References (Completed Work)
		My thesis proposal is based on the following publications: 
		
		
		
		
		
		
		
		
		
		
		
		
		My thesis proposal is also based on the following preprints: 
		
		
		
		
		Open Source Software
		
		
		
		Datasets (Graphs)
		
		
		
		
		Datasets (Tensors)