Overview

This demo shows how to use PEGASUS for mining large graphs. We will analyze a web graph by computing the degree, PageRank, and radius distributions, and the correlations among them. The demo is composed of the following four parts:

1. Interactive Shell
2. Managing graphs
3. Running algorithms
4. Plotting results

Interactive Shell

PEGASUS supports an interactive shell so that users can manage graphs, run algorithms, and generate plots. To access the shell, type pegasus.sh in the PEGASUS installation directory. The PEGASUS shell will then appear. For a list of the commands available in the shell, type help.

Managing graphs

To use PEGASUS, the graphs to be analyzed must first be uploaded to the Hadoop Distributed File System (HDFS). In the shell, the add command uploads a graph to HDFS. To add the local edge file 'www_edges.tab' to HDFS and name it 'www', issue the following command:

    add www_edges.tab www

You can see the list of current graphs with the list command. As you can see, the graph 'www' has been added to HDFS. Now we are ready to run algorithms.

Running Algorithms

We will compute the degree, PageRank, and radius of the www graph. For this purpose, we use the compute command.

Degree

To compute the degree, use the compute deg [graph_name] command. After you enter the command, the shell asks for additional parameters: the type of the degree and the number of reducers. In this demo, we use inout for the degree type and 10 for the number of reducers. After the parameters are entered, the degree is computed on Hadoop. When the computation is finished, you will see the following messages.

PageRank

To compute the PageRank, use the compute pagerank [graph_name] command. After you enter the command, the shell asks for additional parameters: the number of nodes in the graph, the number of reducers, and whether to symmetrize the graph. In this demo, we use 325729 for the number of nodes, 10 for the number of reducers, and nosym, which means the graph is not symmetrized. After the parameters are entered, the PageRank is computed on Hadoop. When the computation is finished, you will see the following messages.

Radius

To compute the radius, use the compute radius [graph_name] command. After you enter the command, the shell asks for additional parameters: the number of nodes in the graph, the number of reducers, and whether to symmetrize the graph. In this demo, we use 325729 for the number of nodes, 10 for the number of reducers, and makesym, which symmetrizes the graph so that we get the undirected radius. After the parameters are entered, the radius is computed on Hadoop. When the computation is finished, you will see the following messages.

Plotting Results

Once you have run the algorithms, you can plot the results to find interesting patterns and anomalies. We will show how to plot the distributions of degree, PageRank, and radius, and the correlations among them.

Degree Plot

The degree distribution is plotted by the plot deg [graph_name] command. The output file www_deg_inout.eps is generated in the current directory. Here is the degree distribution plotted.

PageRank Plot

The PageRank distribution is plotted by the plot pagerank [graph_name] command. The output file yweb_pagerank.eps is generated in the current directory. Here is the PageRank distribution plotted.

Radius Plot

The radius distribution is plotted by the plot radius [graph_name] command. The output file yweb_radius.eps is generated in the current directory. Here is the radius distribution plotted.

Correlation Plots

In addition to the distributions of individual graph properties, you can plot the correlation between pairs of properties. PEGASUS generates three correlation plots: degree vs. PageRank, radius vs. PageRank, and radius vs. degree. To generate the correlation plots, use the plot corr [graph_name] command. The following three output files are then created: [graph_name]_deg_radius.png, [graph_name]_pagerank_deg.png, and [graph_name]_pagerank_radius.png. Here are sample outputs.
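To make the degree step more concrete, the following is a minimal in-memory Python sketch of what a degree distribution is: for each degree value, the number of nodes that have it. This is not PEGASUS code, since PEGASUS runs the computation on Hadoop. The sketch assumes the edge file holds one tab-separated source/destination pair per line. The function name degree_distribution and the "in" and "out" options are hypothetical; only inout appears in this demo.

```python
from collections import Counter

# Tiny stand-in for an edge file such as www_edges.tab.
# Assumed format: one "source<TAB>destination" pair per line.
SAMPLE_EDGES = "0\t1\n0\t2\n1\t2\n2\t0\n"


def degree_distribution(edge_lines, degree_type="inout"):
    """Return {degree: number of nodes with that degree}.

    degree_type is "in", "out", or "inout" (in-degree plus out-degree).
    Nodes whose selected degree is zero do not appear in the result.
    """
    per_node = Counter()
    for line in edge_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        src, dst = line.split("\t")[:2]
        if degree_type in ("out", "inout"):
            per_node[src] += 1
        if degree_type in ("in", "inout"):
            per_node[dst] += 1
    # Count how many nodes share each degree value.
    return dict(Counter(per_node.values()))


dist = degree_distribution(SAMPLE_EDGES.splitlines(), "inout")
print(sorted(dist.items()))  # [(2, 1), (3, 2)]
```

In the sample, nodes 0 and 2 each touch three edges and node 1 touches two, so two nodes have degree 3 and one node has degree 2. Plotting such counts against degree is the kind of distribution the plot deg command visualizes.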
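The PageRank parameters described above (the number of nodes and the nosym or makesym choice) can be illustrated with a small power-iteration sketch in Python. It is only an illustration under stated assumptions, not the way PEGASUS computes PageRank on Hadoop: node ids are assumed to be integers from 0 to num_nodes - 1, the damping factor 0.85 and the fixed iteration count are assumptions, and the function name pagerank and its symmetrize flag are hypothetical stand-ins for the shell prompts.

```python
def pagerank(edges, num_nodes, symmetrize=False, damping=0.85, iterations=50):
    """Return a list of PageRank scores, one per node id 0..num_nodes-1.

    symmetrize=False corresponds to the idea behind nosym (use edges as given);
    symmetrize=True corresponds to the idea behind makesym (add reverse edges).
    """
    if symmetrize:
        # Add the reverse of every edge; the set removes duplicates.
        edges = set(edges) | {(dst, src) for src, dst in edges}

    out_links = {node: [] for node in range(num_nodes)}
    for src, dst in edges:
        out_links[src].append(dst)

    rank = [1.0 / num_nodes] * num_nodes
    for _ in range(iterations):
        # Every node receives the random-jump share.
        new_rank = [(1.0 - damping) / num_nodes] * num_nodes
        # Rank held by nodes without out-links is spread uniformly.
        dangling = sum(rank[n] for n in range(num_nodes) if not out_links[n])
        for node, targets in out_links.items():
            for target in targets:
                new_rank[target] += damping * rank[node] / len(targets)
        for node in range(num_nodes):
            new_rank[node] += damping * dangling / num_nodes
        rank = new_rank
    return rank


# A directed path 0 -> 1 -> 2, first as given, then symmetrized.
directed = pagerank([(0, 1), (1, 2)], num_nodes=3)
undirected = pagerank([(0, 1), (1, 2)], num_nodes=3, symmetrize=True)
print(directed)
print(undirected)
```

On the symmetrized path, the two end nodes receive equal scores and the middle node scores highest, while the directed version favors the node at the end of the path. The scores sum to 1 in both cases, which is why the number of nodes has to be known up front.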