Overview

PEGASUS is a peta-scale graph mining system written entirely in Java. It runs in a parallel, distributed manner on top of Hadoop. Hadoop is a cloud computing platform and an open source implementation of the MapReduce framework, which was originally designed by Google for web-scale data processing. PEGASUS provides large-scale algorithms for important graph mining tasks:

  Degree
  PageRank
  Random Walk with Restart (RWR)
  Radius
  Connected Components

The details of PEGASUS can be found in the following paper:

  U Kang, Charalampos E. Tsourakakis, and Christos Faloutsos. PEGASUS: A Peta-Scale Graph Mining System - Implementation and Observations. IEEE International Conference on Data Mining 2009, Miami, Florida, USA.

Graph Mining with PEGASUS

Graph mining is an area of data mining that finds patterns, rules, and anomalies in graphs.

Why Should We Care?

Graphs, or networks, are everywhere: the Web graph, social networks (Facebook, Twitter), biological networks, and many more. Finding patterns, rules, and anomalies has numerous applications, including but not limited to the following:

  Ranking web pages by search engines
  'Viral' or 'word-of-mouth' marketing
  Patterns of disease, with potential impact for drug discovery
  Computer network security: email/IP traffic and anomaly detection

Why PEGASUS?

Existing work on graph mining has limited scalability: usually, the maximum graph size is on the order of millions. PEGASUS breaks this limit by scaling the algorithms up to billion-scale graphs. The breakthrough was made possible by careful algorithm design and implementation for Hadoop, a massive cloud computing platform. To summarize, PEGASUS has three major advantages.

1. Large Graph Mining Package
   Graphs with billions of nodes and edges
2. Parallel Algorithms on Hadoop
   Massive cloud computing platform
3. Open Source
   Apache License 2.0

Thanks to PEGASUS, we could analyze one of the largest publicly available Web graphs, from Yahoo!, with 6.7 billion edges.
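
To give a feel for how a graph mining task fits the MapReduce model mentioned in the overview, here is a minimal in-memory sketch of the simplest listed task, degree computation. It is not PEGASUS code and does not use the Hadoop API. The class name DegreeSketch, the method outDegrees, and the edge-list representation are hypothetical, chosen only for illustration. The idea is that a "map" step emits a count of 1 keyed by each edge's source node, and a "reduce" step sums the counts that share a key.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: not part of PEGASUS or Hadoop.
public class DegreeSketch {

    // Computes the out-degree of every node that has at least one outgoing edge.
    // Each edge is an int[] {source, destination}.
    static Map<Integer, Long> outDegrees(List<int[]> edges) {
        Map<Integer, Long> degree = new TreeMap<>();
        for (int[] edge : edges) {
            // "Map": emit (source, 1). "Reduce": sum the values per source.
            degree.merge(edge[0], 1L, Long::sum);
        }
        return degree;
    }

    public static void main(String[] args) {
        List<int[]> edges = Arrays.asList(
            new int[] {1, 2},
            new int[] {1, 3},
            new int[] {2, 3});
        System.out.println(outDegrees(edges)); // prints {1=2, 2=1}
    }
}
```

On a real cluster the per-key summation would be spread across many machines, which is what makes the same pattern applicable to graphs far too large for one machine's memory.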
Publicity

PEGASUS is gaining popularity in both academia and industry.

  The PEGASUS paper received the best paper runner-up award at the International Conference on Data Mining (ICDM) 2009.
  The PEGASUS web site has been visited by people from 64 countries.