Multimedia

Graphs are ubiquitous in statistical modeling and a broad range of machine learning applications. Examples are social networks, natural language dependency structures, latent interrelationships among tasks, and neural network topologies. Despite their versatility in representing structured data, how to fuse the information from heterogeneous and/or dynamically evolving graphs poses a grand challenge to existing machine learning theory and optimization algorithms. Furthermore, efficient graph topology optimization is another important but unsolved problem which entails searching over a combinatorially large discrete space. In this thesis, we address these open challenges in several complementary aspects:

In §1 we focus on a novel framework for fusing multiple heterogeneous graphs into a single homogeneous graph, on which learning tasks can be conveniently carried out in a principled manner. We also propose a new approach to impose analogical structures among heterogeneous nodes, which offers a theoretical unification of several representative models along with improved generalization.

In §2 we focus on graph induction problems in the context of graph-based semisupervised learning. We start with a nonparametric method that is able to recover the optimal latent label diffusion pattern over the graph, and then generalize label diffusion processes as graph convolution operations whose filter weights are induced from data residing on the non-Euclidean manifold.

In §3 we extend the scope of our modeling from static graphs to dynamic graphs. Specifically, we develop an online algorithm for multi-task learning with a provable sublinear regret bound, where a latent graph of task interdependencies is inferred on the fly. We also look at time-series forecasting tasks, showing that explicitly modeling the graph dependencies among temporally evolving variables can improve prediction accuracy.

In §4 we formulate neural architecture search as a graph topology optimization problem. We present a simple yet efficient evolutionary algorithm that automatically identifies high-performing architectures based on a novel hierarchical representation scheme, where smaller operations are automatically discovered and reused as building blocks to form larger ones. The learned architecture achieves highly competitive performance on ImageNet against the state of the art, outperforming a large number of modern convolutional neural networks that were designed by hand.

Thesis Committee:
Yiming Yang (Chair)
Jaime Carbonell
Zico Kolter
Karen Simonyan (DeepMind)

Copy of Proposal Document

In the ocean of Web data, Web search engines are the primary way to access content. As the data is on the order of petabytes, current search engines are very large centralized systems based on replicated clusters. Web data, however, is always evolving. The number of active Web sites continues to grow (180 million at the beginning of 2014), and there are currently hundreds of billions of indexed pages. At the same time, there are more than two billion Internet users, and hundreds of millions of queries are issued each day. In the near future, centralized systems are likely to become less effective against such a data-query load, suggesting the need for fully distributed search engines. Such engines need to maintain high-quality answers, fast response time, high query throughput, high availability, and scalability, in spite of network latency and scattered data. In this talk we present the main challenges behind the design of a distributed Web retrieval system and our research on all the components of such a Web search engine: crawling, indexing, and query processing.

***

Ricardo Baeza-Yates is VP of Yahoo! Labs for Europe and Latin America, leading the labs at Barcelona, Spain and Santiago, Chile, since 2006. Between 2008 and 2012 he also oversaw the Haifa lab. He is also a part-time Professor at the Dept. of Information and Communication Technologies of the Universitat Pompeu Fabra in Barcelona, Spain. During 2005 he was an ICREA research professor at the same university. Until 2004 he was Professor and Director of the Center for Web Research at the Dept. of Computing Science of the University of Chile (on leave of absence since then).

He obtained a Ph.D. from the University of Waterloo, Canada, in 1989. Before that, he obtained two master's degrees (M.Sc. CS & M. Eng. EE) and the electrical engineering degree from the University of Chile in Santiago. He is co-author of the best-selling textbook Modern Information Retrieval, published in 1999 by Addison-Wesley with a second enlarged edition in 2011, which won the ASIST 2012 Book of the Year award. He is also co-author of the 2nd edition of the Handbook of Algorithms and Data Structures, Addison-Wesley, 1991; and co-editor of Information Retrieval: Algorithms and Data Structures, Prentice-Hall, 1992, among more than 500 other publications. From 2002 to 2004 he was elected to the board of governors of the IEEE Computer Society and in 2012 he was elected to the ACM Council. He has received the Organization of American States award for young researchers in exact sciences (1993), the Graham Medal for innovation in computing given by the University of Waterloo to distinguished alumni (2007), the CLEI Latin American distinction for contributions to CS in the region (2009), and the National Award of the Chilean Association of Engineers (2010), among other distinctions. In 2003 he was the first computer scientist to be elected to the Chilean Academy of Sciences, and since 2010 he has been a founding member of the Chilean Academy of Engineering. In 2009 he was named ACM Fellow and in 2011 IEEE Fellow.

Faculty Host: Jamie Callan

Faculty, staff, students and alumni gathered May 31 to pay tribute to Mark Stehlik, outgoing assistant dean for undergraduate education for the School of Computer Science.

A party in Stehlik's honor was held in the Collaborative Commons of the Gates and Hillman Centers. About 200 people attended.

During impromptu remarks May 31, Stehlik thanked his colleagues and his students.

Stehlik, who has taught computer science at the Pittsburgh campus since 1982 and served as assistant dean since 1988, is stepping down this summer to begin a five-year appointment as associate dean for education at Carnegie Mellon Qatar.

Tom Cortina, associate teaching professor, will assume the duties of SCS assistant dean and is working with Stehlik on the transition.

Stehlik was this year's recipient of the Doherty Award for Sustained Contributions to Excellence in Education, and is a past winner of the Herbert A. Simon Award for Teaching Excellence in Computer Science.

On May 31, David Kosbie, assistant teaching professor of computer science and this year's recipient of the Simon Award, talked about how his teaching methods were patterned after those of Stehlik.

RI graduate student Marek Michalowski is featured in the Post-Gazette for his work with robots and autistic children. Michalowski's favorite robot, Keepon, is "being used to study how children interact socially, and whether the robot might particularly be able to help children with autism." Read the Post-Gazette article and watch Keepon on YouTube.
