My broad research interests span the areas of distributed systems and networking. Specifically, I have looked at interesting research issues in improving data transfer performance in
I worked on my thesis under the guidance of Y. Charlie Hu. I was part of the Distributed Systems and Networking Lab. I also actively collaborate with Dave Andersen from Carnegie Mellon University, Konstantina Papagiannaki and Michael Kaminsky from Intel Research Pittsburgh and Z. Morley Mao from University of Michigan Ann Arbor.
Despite the increasing importance of bulk data transfers, such transfers
frequently remain slow and inefficient. Many modern transfer systems achieve
download throughputs of only tens of Kbit/s despite running on fast, asymmetric
broadband connections. As bulk data transfers take center-stage to deliver
content ranging from news and entertainment to critical services such as
telemedicine and telepresence, there is a pressing need to improve their
performance. This demand for efficient bulk data transfers motivates our
work--we are investigating the performance limitations of current bulk data
transfers and systematically improving their performance.
The foremost challenge is that TCP, the dominant transport protocol for the last twenty years, is inefficient in utilizing available bandwidth between a single sender-receiver pair in the presence of losses. We built Slot, an incrementally deployable system, that addresses this challenge. Slot uses in-network resources from an overlay network to effectively shorten the end-to-end TCP feedback loop; it breaks the end-to-end loop into multiple pipelined sub-loops, each of which is more efficient than the end-to-end loop, via intermediary nodes carefully chosen from an overlay network.
Given that a single sender is typically unable to saturate the receiver's bandwidth, a popular approach to improve transfer performance is to download from multiple sources simultaneously (e.g., BitTorrent). The challenge, however, is that the current multi-source systems have a fundamental limitation--these systems may still not have enough sources to saturate the receiver's bandwidth. We addressed this limitation by developing a novel technique to discover more sources for a receiver to draw from by exploiting similarity among data objects. The proposed system, SET, efficiently locates and exploits such similar and identical sources using handprinting, a technique based on constant number of lookups and mappings in a global lookup table. SET is the core transfer mechanism within DOT.
Current approaches have also attempted to leverage local resources to improve transfer performance (e.g., rsync). These approaches suffer from focusing on one particular resource strategy to the exclusion of others and are not adaptive across a wide-range of operating scenarios. To address this challenge, we proposed dsync, a system that harnesses both local and network resources intelligently.
The design of Slot's infrastructure required an understanding of the dynamics of network properties and their time scales---when and why do network properties change. The answers to these questions are also important in the design of other measurement-based systems such as VoIP and network positioning systems. Our work used network delay as an example to understand its dynamics. It is well known that congestion on a routing path causes variation in the end-to-end delay between two hosts. Our work is the first, however, to systematically analyze changes in network delays and jitter caused by routing path changes. Our key contribution was to study the predictability of the delay variations caused by these routing events by measuring properties such as the durations for which these paths last, the frequency at which path changes happen, the delay variations that can be caused due to these changes, and so on. Another recent work also investigated the predictability of one-way delay, important for applications like streaming and tree-based multicast.
Community wireless networks (mesh networks) are an appealing way to provide cheap, deployable, last mile Internet access to millions of homes, specially in rural and economically disadvantaged zones. We are also studying various issues in designing and deploying wireless mesh networks using MAP (Mesh@Purdue), a 32 node multi-radio wireless mesh testbed. We are studying problems at the medium access, network and transport layers.
We are also focused on developing and deploying real applications over protocols implemented in our wireless mesh testbed. As a first step, we have studied the issues that arise when client nodes access infrastructure points such as wireless mesh routers for Internet connectivity. Our work identifies the problem of providing fairness in multi-hop wireless infrastructure access. Finally, we develop and study a distributed routing algorithm that significantly improves the symmetrical fairness.
Obtaining satisfactory throughput across several clients is a significant barrier in these wireless networks. We proposed two systems, DMesh and MeshCache, to improve the throughput performance in mesh networks. DMesh exploits cheap and simple directional antennas placed on multiple radios via intelligent directional channel assignment and routing algorithms. This provides spatial and frequency separation between competing transmitters, thereby reducing interference and improving throughput. MeshCache is an application layer TCP relay and caching system for mesh networks. MeshCache enables data caching at each wireless mesh router by breaking long haul TCP connections (from clients to gateways) at every router. MeshCache then exploits locality in user traffic to fetch the data from peer routers instead of the Internet gateway. This reduces the hop count of the data transfers and mitigates congestion at the gateway, thereby improving throughput performance.
Wireless networks formed in an ad-hoc fashion are useful in settings such as stadiums and conferences. Although such networks are typically small in scale (about 100 nodes), the possible mobility of nodes and their churn creates performance hurdles---these networks suffer from limited scalability of routing protocols and lack of service discovery mechanisms.
We developed Dynamic P2P Source Routing (DPSR), a new routing protocol for mobile wireless ad hoc networks (MANETs) that exploits the synergy between topology-aware structured P2P overlays and MANETs for enhanced scalability. In a network of size N, when each of the N nodes communicates with the remaining N-1 nodes directly, DPSR introduces a virtual address space for routing through overlay peers, effectively reducing the number of links (and hence routes) required for all pairwise communication from O(N^2) to O(N logN).
Our second focus was to provide support for p2p applications in MANETs. Distributed Hash Tables (DHTs) have proven to be a novel and efficient platform for building distributed applications in the Internet. Using DHTs in MANETs is difficult due to the unique challenges of the wireless environment. We proposed, evaluated and implemented Ekta, an efficient integrated approach towards provisioning DHTs in MANETs. We also proposed improvements to routing protocols for MANETs to support p2p applications that provide a challenging workload.
This work identifies novel frontiers where the p2p principles of self organization and cooperation are inherent, and furthers the state-of-the-art in these new application domains.
Large-scale mobile systems have important applications in military command and control, emergency services, and sensing networks. Scalable routing, however, in such large networks in the presence of mobility is a challenging problem due to the frequent topology changes and the need for global route discovery. Geographic routing (e.g. GPSR, GFG, GOAFR) allows scalable routing in these networks using only local state. A central challenge in enabling geographic routing is designing a scalable location service to obtain the location of a destination node. We showed that in practical sized networks, asymptotically scalable location services are not necessarily the best choice. In contrast, our protocol GHLS, provides a simple, robust and scalable location service by avoiding unnecessary complex operations. In subsequent work, we applied the distributed mobile geographic hashing primitive to provide scalable multicast in mobile ad hoc networks.
This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.