Project Title

Peer-to-Peer Networks for Self-Organizing Virtual Communities

Project Award Number

IIS-0118767


Principal Investigator

Jamie Callan
School of Computer Science
Carnegie Mellon University
Pittsburgh PA 15213-8213
412-268-4525
412-268-6298
callan@cs.cmu.edu
http://www.cs.cmu.edu/~callan/


Co-PI

Ramayya Krishnan
Heinz School of Public Policy and Management
Carnegie Mellon University
Pittsburgh PA 15213-8213
412-268-2174
rk2x@andrew.cmu.edu
http://www.heinz.cmu.edu/researchers/faculty/rk2x.html


Co-PI

Alan Montgomery
Graduate School of Industrial Administration
Carnegie Mellon University
Pittsburgh PA 15213-8213
412-268-4562
alm3@andrew.cmu.edu
http://www.andrew.cmu.edu/~alm3/


Collaborator

Michael Smith
Heinz School of Public Policy and Management
Carnegie Mellon University
Pittsburgh PA 15213-8213
412-268-5978
mds@cmu.edu
http://www.heinz.cmu.edu/researchers/faculty/mds.html


Collaborator

Rahul Telang
Heinz School of Public Policy and Management
Carnegie Mellon University
Pittsburgh PA 15213-8213
412-268-1155
rtelang@andrew.cmu.edu
http://www.heinz.cmu.edu/researchers/faculty/rtelang.html

Keywords

peer-to-peer networks
virtual communities
content-based resource selection
pricing and resource allocation

Project Summary

This project extends peer-to-peer communication networks to better support the formation of virtual communities in wide area computer networks. Virtual communities bring together individuals with similar interests, but the difficulty of forming them and sustaining critical mass discourages communities that serve small populations or compete with existing communities. Large-scale peer-to-peer networks offer the possibility of self-organizing communities, in which nodes recognize and create relatively stable connections to other nodes with similar interests. The solution includes nodes that learn about their network neighborhoods, nodes that offer partial (and competing) directory services, new methods of routing messages efficiently in peer-to-peer networks, more accurate methods of making resource selection decisions in environments containing many resources, and a utility-theoretic model for decision-making by individual nodes that incorporates multiple task requirements (e.g., cost, accuracy, and reliability).

Publications and Products

A. Asvanund, S. Bagala, M. Kapadia, R. Krishnan, M. Smith, and R. Telang, "Intelligent Club Management in P2P Networks", Workshop on P2P systems. 2003.

A. Asvanund, K. Clay, R. Krishnan, and M. Smith, "An Empirical Analysis of Network Externalities in Peer-To-Peer Music Sharing Networks", International Conference on Information Systems (ICIS). 2002.

A. Asvanund, R. Krishnan, M. Smith, and R. Telang, "Building Economic Incentives for Club Management in Peer-to-Peer Networks for Self-Organizing Virtual Communities", International Conference on Information Systems. Submitted.

K. Hosanagar, R. Krishnan, V. Choudhary, and J. Chuang, "Pricing and Resource Allocation in Caching Services with Multiple Levels of QoS", Management Science. Submitted.

K. Hosanagar, R. Krishnan, J. Chuang, and V. Choudhary, "Pricing Vertically Differentiated Web Caching Services", Proceedings of the International Conference on Information Systems (ICIS). 2002.

K. Hosanagar, R. Krishnan, I. Karaesman, and A. Montgomery, "Simulation/Optimization Based Design of Comparison Shopping Engines", Proceedings of the 11th Workshop on Information Technology and Systems (WITS). 2002.

R. Jin, L. Si, A.G. Hauptmann, and J. Callan, "A language model for IR using collection information (poster description)", Proceedings of the Twenty-Fifth Annual International SIGIR Conference on Research and Development in Information Retrieval. 2002.

R. Krishnan, M. D. Smith, Z. Tang, and R. Telang, "The Virtual Commons: Why Free-Riding Can Be Tolerated in File Sharing Networks", International Conference on Information Systems (ICIS). 2002.

J. Lu and J. Callan, "Content-Based Retrieval in Hybrid Peer-to-Peer Networks", Proceedings of the Twelfth International Conference on Information and Knowledge Management (CIKM 2003). To appear.

A. L. Montgomery and B. R. Gordon, "Categorizing Web Pages Using Statistical Models of Web Navigation and Text Classification", Journal of the American Statistical Association. Submitted.

A. L. Montgomery and S. Ouyang, "The Promotional Value of Peer-to-Peer Networks", Management Science. Submitted.

L. Si and J. Callan, "Using sampled data and regression to merge search engine results", Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, p. 19, vol. 25. 2002.

L. Si and J. Callan, "Relevant document distribution estimation method for resource selection", Proceedings of the Twenty-Sixth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. 2003.

L. Si and J. Callan, "The effect of database size distribution on resource selection algorithms", Proceedings of the SIGIR 2003 Workshop on Distributed Information Retrieval, Toronto. 2003.

L. Si and J. Callan, "A semi-supervised learning approach to merging search engine results", ACM Transactions on Information Systems. In press.

L. Si, R. Jin, J. Callan, and P. Ogilvie, "A language modeling framework for resource selection and results merging", Proceedings of the Eleventh International Conference on Information and Knowledge Management (CIKM). 2002.

X. Wang and A. L. Montgomery, "The Effects of Advertising on Customer Retention and the Profitability of Auctions", Management Science. Submitted.

X. Wang, A. L. Montgomery, and K. Srinivasan, "The Use of Buy-it-Now in Auctions", Marketing Science. Submitted.

Project Impact

The scientific results are more robust and efficient peer-to-peer networks, new techniques for forming virtual communities, and a better understanding of how complex peer-to-peer networks work. A software simulator enables CS, MIS, and Business students to study virtual communities, for example by testing hypotheses about why marketplaces fail or about which policies encourage community formation. The basic science can be used to build search tools that explicitly consider tens of thousands of databases, software that supports dynamic creation of virtual communities within organizational intranets (e.g., the DoD) in response to unforeseen developments, and wireless networks in which devices work whenever they are in range of another device.

Goals, Objectives and Targeted Activities

The research focuses on centralized and local approaches to acquiring information about the characteristics of nodes in a peer-to-peer network. Information about the content at each node is particularly challenging to acquire, but we also develop and evaluate methods for acquiring information about other node characteristics, such as reliability and responsiveness. The research investigates decision-theoretic utility functions that enable nodes to optimize their decision-making with respect to their specific priorities. Our research goals are a better understanding of i) information gathering and decision-making in large-scale peer-to-peer networks; ii) network conditions in which optimizing individual utility is consistent with globally desirable conditions such as reduced network congestion; and iii) network characteristics that foster spontaneous formation of virtual communities through the individual decision-making of nodes about where to connect and how to relay messages in the network.
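
As a rough illustration of how a node might weigh its own priorities when deciding where to connect, the sketch below scores candidate neighbors with a weighted linear utility over accuracy, reliability, and cost. The linear form, the Neighbor fields, the weight values, and the function names are all assumptions made for this example; they are not the utility model developed by the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Neighbor:
    """Hypothetical summary of what a node has learned about one neighbor."""
    name: str
    accuracy: float     # estimated chance the neighbor holds relevant content, in [0, 1]
    reliability: float  # estimated chance the neighbor responds, in [0, 1]
    cost: float         # normalized cost of contacting the neighbor, in [0, 1]


def utility(n, weights):
    """Assumed linear utility: reward accuracy and reliability, penalize cost."""
    w_acc, w_rel, w_cost = weights
    return w_acc * n.accuracy + w_rel * n.reliability - w_cost * n.cost


def select_neighbors(neighbors, weights, k):
    """Return the k neighbors with the highest utility under the given weights."""
    ranked = sorted(neighbors, key=lambda n: utility(n, weights), reverse=True)
    return ranked[:k]


# Illustrative (made-up) estimates for three candidate neighbors.
candidates = [
    Neighbor("A", accuracy=0.9, reliability=0.5, cost=0.8),
    Neighbor("B", accuracy=0.6, reliability=0.9, cost=0.2),
    Neighbor("C", accuracy=0.3, reliability=0.4, cost=0.1),
]

# A node that cares only about accuracy versus one that is mainly cost-sensitive.
accuracy_first = select_neighbors(candidates, weights=(1.0, 0.0, 0.0), k=2)
cost_sensitive = select_neighbors(candidates, weights=(0.2, 0.2, 1.0), k=2)
```

With these made-up estimates, the two weightings choose different neighbors, which is the sense in which each node optimizes with respect to its own priorities.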

Area Background

There is not enough space to provide this information. A relatively extensive description of peer-to-peer systems may be found at http://www.oreillynet.com/pub/q/p2p_category.

Area References

There is not enough space to provide this information.

Potential Related Projects

Project Websites

http://www.cs.cmu.edu/~callan/Projects/P2P/index.html
This is the main website for our project.

Illustrations

Online Software

The project has produced a peer-to-peer simulator, written in Java. The simulator has been used to simulate peer-to-peer networks of up to 2,500 simple, content-based digital libraries. The simulator is not packaged for widespread distribution, but it is available to other research projects upon request.

Online Data

http://hartford.lti.cs.cmu.edu/callan/Data/
Data definitions for a peer-to-peer network of 2,500 small digital libraries, and a set of 50,000 known-item queries that can be used with them. See Lu and Callan, CIKM 2003 (cited above) for details.

Other Resources