Peer-to-Peer Content Distribution
Content distribution on the Internet uses many different
service architectures, ranging from centralized
client-server to fully distributed. The recent
widespread use of peer-to-peer applications such as
SETI@home, Napster, and Gnutella indicates that there are many
potential benefits to fully distributed peer-to-peer
systems. Peer-to-peer content distribution provides
more resilience and higher availability through
wide-scale replication of content at large numbers of
peers.
We are involved in several ongoing projects that study
different flavors of peer-to-peer content distribution.
We study the use of a simple, yet powerful observation
called interest-based locality
to provide scalable and high-performance content lookups
and retrievals in peer-to-peer systems. We are also
studying how to scale Gnutella,
a popular file-sharing application. Finally, we are
exploring how selective use of peer-to-peer
communications can enhance existing client-server
systems in the CoopNet project.
Current work on peer-to-peer content location has focused on
designing scalable algorithms. However, in a heterogeneous
environment such as the Internet, performance is an equally
important consideration. We study techniques to enhance the
performance of peer-to-peer systems. In particular, we exploit
a simple yet powerful property, interest-based locality, in the
context of peer-to-peer content location. The property says that
if a peer has a particular piece of content that we are
interested in, it is very likely that it will have other pieces
of content that we are interested in as well. Therefore, peers
that share similar interests can benefit from direct
cooperation. We propose a technique called interest-based
shortcuts to link peers that share similar interests closer
together. Peers run a fully distributed algorithm to
incrementally construct their own set of shortcuts without the
use of any global state or global communication. In addition,
shortcuts are modular and can be implemented as a performance
enhancement layer on top of any existing peer-to-peer content
location system. As a result, shortcuts yield higher lookup
performance without sacrificing scalability.
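The shortcut mechanism described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the algorithm from the paper: the `Peer` class, the `max_shortcuts` bound, the most-recently-useful ordering, and the flooding stand-in `make_flooding_fallback` are all hypothetical names and choices made for this example. A peer first asks its shortcuts, falls back to the underlying content location system on a miss, and then remembers whichever peer answered, using only local state.

```python
from collections import OrderedDict


class Peer:
    """A peer that keeps a small, bounded list of interest-based shortcuts."""

    def __init__(self, name, content, max_shortcuts=10):
        self.name = name
        self.content = set(content)
        self.max_shortcuts = max_shortcuts  # assumed bound, chosen for illustration
        # Insertion order tracks usefulness: the last entry helped most recently.
        self.shortcuts = OrderedDict()

    def lookup(self, item, fallback):
        """Return (peer, how), where how is 'shortcut', 'fallback', or 'miss'."""
        # 1. Ask shortcut peers first, most recently useful first.
        for peer in reversed(list(self.shortcuts.values())):
            if item in peer.content:
                self._remember(peer)
                return peer, "shortcut"
        # 2. Otherwise use the underlying content location system.
        peer = fallback(item)
        if peer is not None:
            self._remember(peer)
            return peer, "fallback"
        return None, "miss"

    def _remember(self, peer):
        """Add or refresh a shortcut using only local knowledge."""
        self.shortcuts.pop(peer.name, None)
        self.shortcuts[peer.name] = peer
        while len(self.shortcuts) > self.max_shortcuts:
            self.shortcuts.popitem(last=False)  # evict the least recently useful


def make_flooding_fallback(peers):
    """Stand-in for the underlying system: scan every known peer."""
    def fallback(item):
        for p in peers:
            if item in p.content:
                return p
        return None
    return fallback
```

In this sketch, a peer that locates one item at another peer through the fallback will find that peer's other items through the shortcut on later lookups, which is the behavior interest-based locality predicts.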
In addition to improving content location performance,
interest-based shortcuts can be used as a primitive for a rich
class of higher-level services. Keyword or string-matching
searches for content and performance-based content
retrieval are two examples of such services. Our SIGCOMM 2001
poster studies how performance-based content retrieval can be
implemented using interest-based shortcuts. The goal of such a
service is to retrieve content from the peer with the best
performance. Most peer-to-peer systems assume short-lived
interaction on the order of single requests. However, shortcuts
provide an opportunity for a longer-term relationship between
peers. Given this relationship, peers can afford to carefully
test shortcuts and use the best ones. In
addition, the amount of state peers need to allocate for
interest-based shortcuts is small and bounded. Therefore, peers
can store performance history for all of their shortcuts. Peers
can even perform active probing of shortcuts when needed.
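One way a peer might keep performance history and choose among shortcuts is sketched below. The details are assumptions made for illustration and are not taken from the poster: throughput in kbps as the metric, an exponentially weighted moving average as the summary, and the names `ShortcutStats`, `best_shortcut`, and `probe` are all hypothetical.

```python
class ShortcutStats:
    """Per-shortcut performance history, summarized as a moving average."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha      # weight given to the newest sample (assumed value)
        self.avg_kbps = None    # None until the first measurement arrives

    def record(self, kbps):
        if self.avg_kbps is None:
            self.avg_kbps = kbps
        else:
            self.avg_kbps = self.alpha * kbps + (1 - self.alpha) * self.avg_kbps


def best_shortcut(stats, holders, probe=None):
    """Pick the holder with the highest average throughput.

    stats:   dict mapping peer name -> ShortcutStats
    holders: names of shortcut peers known to have the content
    probe:   optional callable(name) -> kbps, used for untested shortcuts
    """
    for name in holders:
        entry = stats.setdefault(name, ShortcutStats())
        if entry.avg_kbps is None and probe is not None:
            entry.record(probe(name))  # active probing when no history exists
    measured = [n for n in holders if stats[n].avg_kbps is not None]
    if not measured:
        return None
    return max(measured, key=lambda n: stats[n].avg_kbps)
```

Because the shortcut list is small and bounded, the `stats` dictionary stays small as well, so keeping one entry per shortcut costs little memory.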
Publications
- Efficient Content Location Using Interest-Based Locality
in Peer-to-Peer Systems, Kunwadee Sripanidkulchai, Bruce
Maggs, and Hui Zhang. Infocom 2003. Paper (pdf | ps.gz)
and presentation (PowerPointShow | pdf | ps.gz)
- Enabling Efficient Content Location and Retrieval in
Peer-to-Peer Systems by Exploiting Locality in Interests,
Kunwadee Sripanidkulchai, Bruce Maggs, and Hui Zhang. SIGCOMM
2001 Poster. Poster (pdf | ps.gz) and abstract (pdf | ps.gz),
which appears in ACM SIGCOMM Computer Communication
Review, January 2002.
The surging popularity of peer-to-peer
applications has led to a pressing need for a scalable,
high-performance content location protocol. Gnutella, a peer-to-peer
file-sharing protocol, broadcasts queries to locate content and,
thus, suffers from an overwhelming amount of query and reply
traffic. We study the characteristics of queries on Gnutella and
their implications for scaling. We find that the popularity of
search strings follows a Zipf-like distribution. Taking
advantage of such a popularity distribution by caching a small
number of query results significantly decreases the amount of
traffic seen on the network. We evaluate the effectiveness of
caching and find that caching at one Gnutella node can result in
up to a 3.7-fold reduction in traffic while using only a few
megabytes of memory. As more nodes implement caching, traffic
is reduced further. Caching is a short-term solution for
increasing the scalability of Gnutella.
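To illustrate why a small cache helps under a Zipf-like popularity distribution, the sketch below simulates a single node that caches query results. It is a synthetic experiment under assumed parameters, not a reproduction of the trace-based study: the number of distinct search strings, the Zipf exponent `alpha`, the LRU eviction policy, and the function name `simulate_query_cache` are all choices made for this example, and its numbers should not be compared with the measured results above.

```python
import random
from collections import OrderedDict


def simulate_query_cache(num_strings=5000, num_queries=20000,
                         cache_size=100, alpha=1.0, seed=0):
    """Return (hits, forwarded) for an LRU cache of query results."""
    rng = random.Random(seed)
    # Zipf-like popularity: the weight of the rank-r string is 1 / r**alpha.
    weights = [1.0 / (rank ** alpha) for rank in range(1, num_strings + 1)]
    queries = rng.choices(range(num_strings), weights=weights, k=num_queries)

    cache = OrderedDict()  # search string id -> placeholder for cached results
    hits = forwarded = 0
    for q in queries:
        if q in cache:
            cache.move_to_end(q)  # answered from the cache, not re-broadcast
            hits += 1
        else:
            forwarded += 1        # the query still has to be forwarded
            cache[q] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict the least recently used entry
    return hits, forwarded


if __name__ == "__main__":
    hits, forwarded = simulate_query_cache()
    print(f"hit rate: {hits / (hits + forwarded):.2f}")
```

With these defaults the cache holds results for only 2% of the distinct search strings, yet it absorbs a much larger share of the queries, because a few popular strings account for most of the requests.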
Publications

In CoopNet, we seek to improve the performance of
client-server systems through selective use of
peer-to-peer communications. We focus on the Web flash
crowd problem and show that client cooperation offers an
effective solution. We evaluate CoopNet using traces
gathered at the MSNBC website during the flash crowds
that occurred on September 11, 2001. This is joint work
with the Systems and Networking Group at Microsoft
Research. For more information, please visit the project website.
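This page does not spell out how clients cooperate, so the following is only one plausible shape for the idea: an overloaded server answers a request with a short list of clients that recently fetched the same content instead of the content itself. Everything here is an assumption made for illustration, including the `CoopServer` class, the load threshold `max_load`, the per-URL history, and the number of peers returned per redirect.

```python
from collections import deque


class CoopServer:
    """Hypothetical server that redirects requests to peers when overloaded."""

    def __init__(self, max_load=2, peers_per_redirect=3, history=50):
        self.max_load = max_load                    # assumed overload threshold
        self.peers_per_redirect = peers_per_redirect
        self.history = history                      # clients remembered per URL
        self.active = 0                             # requests being served now
        self.recent = {}                            # url -> deque of client names

    def request(self, client, url):
        """Return ('content', url) or ('redirect', [peer, ...])."""
        holders = self.recent.setdefault(url, deque(maxlen=self.history))
        if self.active >= self.max_load and holders:
            # Overloaded: point the client at peers that already have the content.
            peers = list(holders)[-self.peers_per_redirect:]
            self._remember(holders, client)  # assumes the peer fetch succeeds
            return "redirect", peers
        self.active += 1
        self._remember(holders, client)
        return "content", url

    def done(self):
        """Mark one in-progress request as finished."""
        self.active = max(0, self.active - 1)

    @staticmethod
    def _remember(holders, client):
        if client not in holders:
            holders.append(client)
```

In this sketch the server only pays the cost of a small redirect message during a flash crowd, while the bulk of the content transfer moves to the clients.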
Publications
- Distributing Streaming Media Content Using
Cooperative Networking, Venkata N. Padmanabhan,
Helen J. Wang, Philip A. Chou, and Kunwadee
Sripanidkulchai. NOSSDAV '02. Paper (pdf).
- The Case for Cooperative Networking, Venkata
N. Padmanabhan and Kunwadee Sripanidkulchai. IPTPS
'02. Paper (pdf)
and presentation (PowerPointShow | pdf | ps.gz).
Content Location Protocols Based on Distributed
Hash Tables
- Tapestry: An Infrastructure for Wide-area
Fault-tolerant Location and Routing, Ben Zhao, John Kubiatowicz,
Anthony Joseph. This is part of the OceanStore project.
- A Scalable Content-Addressable Network, Sylvia Ratnasamy, Paul
Francis, Mark Handley, Richard Karp, Scott Shenker. ACM SIGCOMM
Conf., San Diego, CA, September 2001.
- Pastry: Scalable, distributed object location
and routing for large-scale peer-to-peer systems, Anthony
Rowstron (MSR) and Peter Druschel (Rice). This is part of the
PAST project.
- Chord: A Peer-to-Peer Lookup Service for
Internet Applications, Ion Stoica, Robert Morris, David Karger,
Frans Kaashoek, Hari Balakrishnan. ACM SIGCOMM Conf., San
Diego, CA, September 2001.
Peer-to-Peer Applications
Misc.
Kay Sripanidkulchai
