An independent research project is required as part of this class. The project will be done in groups of two students. Additional details will be posted soon.
Project Ideas
A variety of recent studies have looked at the characteristics of Internet topologies. These initial efforts have opened up a number of interesting questions.
Types of networks. The early efforts have primarily concentrated on AS-AS connectivity. One possibility is that there are multiple "types" of networks: for example, physically constrained networks, where physical issues such as distance or fanout limit the possible connections, and logically constrained networks, where connections can be made to arbitrary peers but are in practice limited by "friendships" or knowledge of peers. An interesting project would search for different types of networks and try to identify their key characteristics.
Characteristics other than connectivity. In order to use topologies to drive protocol design, other characteristics (such as link bandwidth and latency) must be characterized as well. It seems likely that these characteristics have interesting properties of their own (e.g. nodes with high fanout may also have high bandwidth). An interesting project would be to measure these characteristics and identify any interesting properties.
Scaling properties. The topology characteristics of a network have significant implications for the types of traffic patterns that can scale. For example, if most traffic travels only a short distance and wide-area links have very high bandwidth, the network will easily scale with the number of end nodes. An interesting project would be to characterize the scaling properties of AS-AS topologies.
Traffic matrix model. Much of the work in topology characterization will remain very abstract until someone develops a method of mapping network traffic onto the graph.
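One very simple starting point, sketched below in Python, is a gravity-model traffic matrix: demand from node i to node j is taken to be proportional to how much i sends overall and how much j receives overall. The node names and totals are invented; the point is only to show how per-node measurements could be turned into a per-pair demand matrix laid over the topology graph.

    # Minimal gravity-model sketch: T[i][j] proportional to (bytes out of i) * (bytes into j).
    def gravity_matrix(out_totals, in_totals):
        total_in = sum(in_totals.values())
        return {(i, j): out_totals[i] * in_totals[j] / total_in
                for i in out_totals for j in in_totals if i != j}

    # hypothetical per-node totals (e.g. Mbps measured at access links)
    tm = gravity_matrix({"A": 100, "B": 40, "C": 10},
                        {"A": 30, "B": 80, "C": 40})
    print(tm[("A", "B")])   # A's offered load scaled by B's share of total demand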
Abstract path model. In order to develop a robust distributed application, developers need effective testing and simulation tools. The current approach to improving simulation tools has been to reproduce the network more accurately. Current work has concentrated on providing realistic network topologies and on scaling simulation to large numbers of nodes. Future work is likely to examine other aspects of network topology (bandwidth and latency) as well as network traffic loads. An alternate approach to improving realism may be to create the equivalent of an "impulse response" for a network path: based on some basic measurements of the path, it may be possible to reproduce the response of the path to arbitrary cross traffic.
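A very rough sketch of what "measuring a path's response" might look like follows. It assumes a Unix-like ping for delay samples, and start_cross_traffic is a purely hypothetical hook standing in for whatever tool injects a known cross-traffic load; parsing ping output this way is fragile and is only meant to illustrate building a load-to-delay response table that a simulator could replay.

    import statistics
    import subprocess

    def probe_rtt(host, count=20):
        """Crude RTT sample using the system ping (assumes a Unix-like `ping -c`)."""
        out = subprocess.run(["ping", "-c", str(count), host],
                             capture_output=True, text=True).stdout
        samples = [float(line.split("time=")[1].split()[0])
                   for line in out.splitlines() if "time=" in line]
        return statistics.mean(samples), statistics.pstdev(samples)

    def path_response(host, start_cross_traffic, loads_mbps):
        """Build a table mapping injected cross-traffic load to observed delay."""
        table = {}
        for load in loads_mbps:
            start_cross_traffic(load)      # hypothetical hook: inject `load` Mbps of cross traffic
            table[load] = probe_rtt(host)  # (mean RTT, RTT variation) under that load
        return table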
Benchmarks. Network benchmarks and planning tools are poor to non-existent. How does one know how adding a new link, or speeding up an existing one, will improve the performance of the network as a whole? Perhaps the money would be better spent adding links elsewhere? What is needed are tools that help measure the current performance of the network (benchmarks) and tools that help predict the impact of changes.
Distributed application monitoring. One of the challenges of deploying a large-scale distributed application is the problem of monitoring the operation of the application. How can the system tell whether the application is operating correctly or efficiently? Does this monitoring infrastructure have to be reinvented for each application, or can certain parts be provided as a general infrastructure?
Design a tool to measure the critical path in existing applications.
BGP measurement -- what is missing from current graphs
BGP measurement -- look at how much info path vector collects
NTP overlay network behavior. The Network Time Protocol (NTP) works by implementing a large distributed network of time server machines, interconnected to form an overlay network. Little is known about the behavior and performance of the NTP overlay network. The most recent study was done by Nelson Minar (as a course project at MIT), where he collected an impressive amount of data and wrote an NTP spider. Unfortunately, despite this impressive groundwork, he ran out of time to do a complete and systematic study. His source code, data, and term paper are now available. Your task will be to build on this work, perhaps collecting additional data, to better understand the behavior of the NTP network. The NTP service is vital to the performance of many applications and systems, and more should be known about it than currently is.
Analysis of wide-area routing data. Kim Claffy and colleagues at CAIDA have collected a lot of packet-level data from a gateway at an Internet backbone. This data is now available at http://moat.nlanr.net/Traces/ and they have a paper analyzing some of it at http://www.caida.org/outreach/papers/AIX0005/. Your task will be to analyze this data for a variety of characteristics to learn and educate people about properties of the Internet. Part of the research here will involve figuring out exactly what questions to answer once you read their paper. This data is a goldmine of information, and a number of possibilities exist; one small example of the kind of question you could ask appears below.
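As one small, concrete example, the sketch below measures how concentrated traffic is in a few large flows. It assumes Python and a hypothetical flows.csv with one row per flow and a "bytes" column, which you would first have to extract from the trace files.

    import csv

    def heavy_hitter_share(path, top_fraction=0.01):
        """Fraction of all bytes carried by the largest `top_fraction` of flows."""
        with open(path) as f:
            sizes = sorted((int(row["bytes"]) for row in csv.DictReader(f)), reverse=True)
        k = max(1, int(len(sizes) * top_fraction))
        return sum(sizes[:k]) / sum(sizes)

    # e.g. "the top 1% of flows carry X% of the bytes"
    print(heavy_hitter_share("flows.csv"))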
Anonymization techniques for traces
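One technique worth studying here is prefix-preserving address anonymization, in which two addresses sharing a k-bit prefix map to anonymized addresses sharing a k-bit prefix, so routing-level structure survives while the addresses themselves are hidden. The sketch below is a simplified, illustrative scheme in that spirit (not the real Crypto-PAn algorithm), using a keyed hash of each prefix to decide whether to flip the next bit.

    import hashlib
    import hmac
    import ipaddress

    def anonymize_ipv4(addr, key):
        """Prefix-preserving anonymization sketch: addresses sharing a real
        k-bit prefix map to outputs sharing a k-bit prefix."""
        bits = format(int(ipaddress.IPv4Address(addr)), "032b")
        out = []
        for i in range(32):
            # a keyed PRF of the (real) prefix decides whether to flip bit i
            digest = hmac.new(key, bits[:i].encode(), hashlib.sha256).digest()
            out.append(str(int(bits[i]) ^ (digest[0] & 1)))
        return str(ipaddress.IPv4Address(int("".join(out), 2)))

    # addresses from the same /16 stay in the same anonymized /16
    print(anonymize_ipv4("128.2.1.1", b"secret"), anonymize_ipv4("128.2.9.9", b"secret"))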
SCTP over CM. The Stream Control Transmission Protocol, SCTP, is an emerging standard for both lightweight reliable transports and selectively reliable transports. It has some nice properties, including message boundary support, support for multi-homed hosts, and denial-of-service protection. Your task will be to implement an SCTP that uses the CM for congestion management. A good way to do this would be to take one of the available implementations of SCTP, rip out its congestion control, and use the CM API to perform congestion control instead. In addition to adding correct congestion control functionality to SCTP (a protocol that is emerging as a popular standard for signaling applications on the Internet), you should be able to answer questions about its performance across the Internet.
Congestion control with mobility. What is the correct way to update congestion control state when a mobile host moves? Should the connection restart from scratch, or keep the old congestion window? Perhaps a hint on how to update the congestion state can be obtained by identifying how much of the connection path has changed, or by identifying where the bottleneck for the connection is.
Congestion manager with web browsing. The congestion manager provides a mechanism to schedule a group of connections that share congestion control state. What if Web browsers and servers could use this to prioritize the transfer of different parts of a Web page? How would clients and servers express their preferences? Perhaps as part of HTML or HTTP? What would be some good policies? One simple candidate policy is sketched below.
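The real CM interface is callback-based and not reproduced here; this sketch only illustrates the policy question, assuming a hypothetical byte quota handed out by a CM-like scheduler that the browser or server splits across the objects of a page according to client-assigned priority weights.

    def share_quota(quota_bytes, objects):
        """Split a byte quota across page objects in proportion to priority weights.
        objects: list of (name, bytes_remaining, weight)."""
        active = [(name, rem, w) for name, rem, w in objects if rem > 0]
        total_weight = sum(w for _, _, w in active)
        return {name: min(rem, quota_bytes * w / total_weight)
                for name, rem, w in active}

    # hypothetical page: HTML and CSS prioritized over images
    page = [("index.html", 8_000, 4), ("style.css", 4_000, 3),
            ("hero.jpg", 200_000, 1), ("footer.png", 50_000, 1)]
    print(share_quota(quota_bytes=14_600, objects=page))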
Where is congestion really occurring in the Internet? Can you trace it down? Perhaps look at nettimer-like techniques to identify where bottlenecks occur. Can you avoid these bottlenecks? Are there alternate paths available?
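For a feel of the packet-pair idea behind nettimer-like tools, here is a minimal, illustrative sender: two back-to-back packets leave the bottleneck spaced by packet_size/bottleneck_rate, so a cooperating receiver that timestamps arrivals can report gaps from which the bottleneck bandwidth falls out. The port and probe sizes are arbitrary choices, and locating (rather than just sizing) the bottleneck would require probing partial paths as well.

    import socket
    import time

    def send_packet_pairs(dst, port=33434, size=1200, pairs=10):
        """Sender side of a packet-pair probe; a cooperating receiver must
        timestamp arrivals and report the inter-packet gaps back."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        payload = b"x" * size
        for _ in range(pairs):
            sock.sendto(payload, (dst, port))
            sock.sendto(payload, (dst, port))   # second packet sent back to back
            time.sleep(0.2)                     # space the pairs out

    def bandwidth_from_gap(size_bytes, gap_seconds):
        """Bottleneck bandwidth implied by one receiver-side arrival gap (bits/s)."""
        return 8 * size_bytes / gap_seconds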
Multi-modal protocols. Most protocols are designed to operate best within a certain range of operating conditions. Examples include how routing/packet delivery is based on population and how TCP places the bulk of the burden on senders. Can we redesign these protocols to take on very different operating modes in different situations (e.g. have TCP adapt to the endpoint capabilities and have routing protocols adapt to the population size)? However, many of these operating-mode decisions must be made simultaneously by all parties involved. For unicast transport protocols, this is relatively simple; for distributed routing protocols, it may be much more difficult. This idea is especially important in mobile computing, where operating conditions are not static.
Routing based on population size. Routing in small networks tends to use broadcast to help routing/delivery (e.g. existing ad-hoc routing, Ethernet ARP). Routing in larger networks tends to use hierarchy to scale. Shouldn't routing protocols for ad-hoc networks take on different behavior based on the population of the network? How can a group of nodes make such a decision and switch the mode of routing operation as a whole? How can you estimate the population?
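As one illustration of how an individual node might feed into such a decision, the sketch below estimates population from the distinct sources recently overheard and switches routing mode with hysteresis so the network does not flap. The thresholds are invented, and getting all nodes to agree on the switch is exactly the open problem.

    # Purely illustrative thresholds.
    FLOOD_THRESHOLD = 30
    HIER_THRESHOLD = 50

    def choose_mode(current_mode, heard_sources):
        """heard_sources: set of distinct node IDs overheard in a recent window,
        used as a crude lower bound on population. Two thresholds give hysteresis."""
        n = len(heard_sources)
        if current_mode == "flooding" and n > HIER_THRESHOLD:
            return "hierarchical"
        if current_mode == "hierarchical" and n < FLOOD_THRESHOLD:
            return "flooding"
        return current_mode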
MAC based on node density. Can you improve backoff and other algorithms with an estimate of how many nodes are active? How do you get this estimate, and how long is it valid?
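A minimal illustration of using such an estimate: scale the contention window with the estimated number of active stations so that collision probability stays roughly constant, rather than always ramping up from a small fixed window after collisions. All constants are invented.

    def contention_window(active_nodes, slots_per_node=2, cw_min=16, cw_max=1024):
        """Pick a contention window proportional to the estimated number of
        active stations, clamped to sane bounds."""
        return max(cw_min, min(cw_max, active_nodes * slots_per_node))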
Ad-hoc routing with the rate of mobility as a factor.
Operating condition measurement for multi-modal protocols. See the description of multi-modal protocols above. These protocols need estimates of parameters such as population, power-rich nodes, computation-rich nodes, etc. The objective of this project is to develop techniques to monitor and track such parameters.
See the BGP-related topics above.
The current group communication primitive
in the Internet, IP Multicast, has been unsuccessful for a variety of
reasons. Several groups have begun to explore the alternate approach of
using an overlay network of multicast routers. Many open issues remain in
providing this service.
What support for congestion control should the shared infrastructure provide? Congestion control is tightly linked to monitoring losses on the Internet. Monitoring losses in multicast groups leads to the problem of ACK implosion. Therefore, having the infrastructure handle some of the congestion control may provide a more scalable solution.
Performance of Content Distribution Networks (CDNs). Content distribution networks (CDNs) such as Akamai are becoming increasingly popular, and it seems likely that most popular sites will end up being served over CDNs. Your task will be to come up with benchmarks and evaluate the performance of Akamai's network. In addition to understanding the details of how their Web CDN works, you should be able to analyze its performance under different workloads. The best way to start would be to get several accounts at different ISPs (ASes) and go from there. There is one study from Sightpath by Johnson et al. on this, which appeared in the WWW caching workshop; you should become familiar with this work.
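A bare-bones starting point for such a benchmark might look like the sketch below: time repeated fetches of the same object and compare the numbers gathered from accounts at different ISPs. The URL is a placeholder, and a real study would also need to separate DNS redirection time, TCP setup, and transfer time.

    import time
    import urllib.request

    def fetch_time(url, trials=5):
        """Time repeated fetches of one object; report best and average."""
        samples = []
        for _ in range(trials):
            start = time.time()
            with urllib.request.urlopen(url) as resp:
                resp.read()
            samples.append(time.time() - start)
        return min(samples), sum(samples) / len(samples)

    # run from several vantage points and compare
    print(fetch_time("http://www.example.com/images/logo.gif"))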
The true benefits of proxy Web caching. A lot of people attribute the good performance of deploying Web proxies to the fact that they cache pages and so reduce user-perceived latencies for Web downloads. However, given that several studies show Web cache hit rates are between 30% and 50%, the caching benefit alone does not seem significant enough to justify their deployment. There is another important benefit of proxies: they typically work by splitting TCP connections at the proxy. Most proxies sit at the well-connected end of a (sometimes shared) bottleneck. Proxy-to-client connectivity, while often low bandwidth and high latency, is often not loss-prone (e.g., modems often have huge queues). Most losses occur in the wide-area Internet, where timeouts dominate for small-window TCPs (connections are short). So, your task is to figure out how much of the benefit of proxy caches comes from TCP splitting (and the corresponding reduction in RTT) versus actual caching. In comparison with an end-to-end TCP, split TCP (with all its architectural drawbacks) is likely to perform substantially better in this environment; if so, the performance benefits of proxy Web caching are due more to TCP splitting than to caching. Your task is to prove or disprove this hypothesis and tell the world what's really going on!
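A back-of-envelope model (not a substitute for the real experiment) hints at why splitting might matter: under idealized slow start, the completion time of a short transfer is a few round trips, so cutting the RTT seen during ramp-up cuts latency even with a 0% cache hit rate. All numbers below are invented.

    def slow_start_rounds(segments, init_cwnd=1):
        """Round trips needed to deliver `segments` when the window starts at
        init_cwnd and doubles every RTT (idealized slow start, no losses)."""
        sent, cwnd, rounds = 0, init_cwnd, 0
        while sent < segments:
            sent += cwnd
            cwnd *= 2
            rounds += 1
        return rounds

    def transfer_time(segments, rtt):
        return slow_start_rounds(segments) * rtt

    # 20-segment page; 150 ms end-to-end RTT vs. a proxy splitting the path into
    # a 100 ms wide-area half and a 50 ms client-side half.
    end_to_end = transfer_time(20, 0.150)
    # crude model: the two halves ramp up largely in parallel, so completion is
    # roughly the slower half plus one client-side round trip
    split = max(transfer_time(20, 0.100), transfer_time(20, 0.050)) + 0.050
    print(end_to_end, split)   # the gap is the "RTT reduction" benefit of splitting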
Forward Error Correction. Some people advocate the use of FEC to recover from packet loss. Obviously, the use of FEC increases the data rate and (maybe) the packet loss. One could imagine a project looking at when it makes sense to use FEC and when it hurts.
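As a concrete reference point for the overhead/recovery trade-off, the sketch below shows the simplest possible FEC scheme: one XOR parity packet per block of four data packets, which adds 25% to the data rate and can rebuild any single lost packet in the block.

    import os
    import random
    from functools import reduce

    def xor_parity(packets):
        """XOR of all packets in a block: can rebuild any one missing packet."""
        return bytes(reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), packets))

    block = [os.urandom(64) for _ in range(4)]    # 4 data packets, 25% FEC overhead
    parity = xor_parity(block)

    lost = random.randrange(4)                    # simulate losing one packet
    received = [p for i, p in enumerate(block) if i != lost]
    recovered = xor_parity(received + [parity])   # survivors XOR parity = lost packet
    assert recovered == block[lost]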
Look at loss recovery in different environments (overlay networks, ad-hoc networks, etc.). Should the techniques be modified in these environments, and if so, how?
Service discovery with dynamic attributes. One of the desirable features of service discovery is to be able to do things like print to the printer with the shortest queue. Service attributes for the printer such as location and memory do not change over time; however, attributes such as queue length are dynamic. Service descriptions tend to be very simple in most systems. What if we made them into code segments that would be evaluated by either the service discovery server or the client? This would enable service discovery servers to pull information about service attributes on demand. Pulling information would make more sense for something like a compute server, where the load attribute changes frequently but few queries match the server's other attributes. It may also be useful to support attributes such as price, which may vary based on a variety of parameters including client characteristics. Issues to resolve include: how is the pull code defined (Java, maybe)? What are the security restrictions on this code? Are the pulled values cached, and is the caching defined by the code? If so, what type of persistent state is allowed across invocations?
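A toy sketch of the idea (the names and the printer-status call are hypothetical): a service description whose dynamic attributes are code evaluated on demand, with a short-lived cache standing in for the caching questions raised above.

    import time

    class Service:
        """Service description mixing static values and code-defined attributes."""
        def __init__(self, static_attrs, dynamic_attrs):
            self.static = static_attrs      # e.g. location, memory: never change
            self.dynamic = dynamic_attrs    # name -> callable, pulled on demand
            self._cache = {}                # name -> (value, expiry time)

        def attr(self, name, ttl=5.0):
            if name in self.static:
                return self.static[name]
            value, expiry = self._cache.get(name, (None, 0.0))
            if time.time() > expiry:        # pull a fresh value and cache it briefly
                value = self.dynamic[name]()
                self._cache[name] = (value, time.time() + ttl)
            return value

    def query_printer_queue(printer_name):
        return 3        # stand-in for a real status query

    printer = Service({"location": "8th floor"},
                      {"queue_length": lambda: query_printer_queue("laser1")})
    print(printer.attr("queue_length"))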
Incentives in non-administrative hierarchies. Current service discovery hierarchies primarily support only administratively scoped queries. They need to support geographically scoped queries for discovering resources in mobile environments, and network-performance-based queries for discovering rentable network resources. However, nodes in a service discovery system must also have the incentive to provide accurate responses to queries. In an administratively organized system, this incentive structure is simple; in alternate organizations, the incentive is unclear. For example, suppose Mellon Bank ran the server for the CMU area: a search for ATMs (a service) probably would not return any non-Mellon ATMs. How do we ensure that service location servers return truthful results?
Service composition and service browsing. Path creation is the composition of multiple services to make a new service. Another useful feature is the ability to browse available services. There seem to be reasonable solutions to both of these problems in centralized service discovery systems. Are there reasonable ways to enable them in a distributed service discovery system?
Protocol evolution. End-to-end protocols are deployed rapidly today (e.g. SACK was effectively deployed in 3 years) because end hosts are frequently replaced or upgraded. In a future environment with many more embedded devices, it is less likely that devices will be updated. The architecture of end-node protocol stacks has to be modified to enable automated upgrading and evolution. Unfortunately, because end nodes may be extremely simple, the active networks solution may not be as applicable (not every node will be able to support the active programming environment). Perhaps some negotiation techniques for features could provide a middle ground?
Analysis of P2P performance, algorithms, and workloads.
Better techniques to search P2P systems.
Using P2P systems for something other than standard file sharing, e.g. chat groups, a mail/netnews service, anonymized routing.
Ad-hoc routing with additional constraints (power, fan-out, etc.).
Effective use of power control in ad-hoc networks.
Ad hoc networking for improved coverage in cellular systems, i.e. could you use ad-hoc routing only in the coverage holes? How would you design your routing differently?
Ad hoc networking with hardware link evolution. Because of the nature of the devices involved and the design of radio systems, it is likely that many different radio technologies will need to coexist in this environment. Current routing techniques focus on mobility and geographic range. However, routing in this new environment will need to optimize for bridging, spectrum usage, and power consumption.
Can you use multi-rate radios and power control to minimize power for given application requirements?
A new transport protocol for the Web that provides better support for interception and redirection (i.e. caching and replication)
Look at packet classification using FPGAs and similar hardware
Parallelized firewall processing...
Intrusion detection systems
Detecting virus/worm propagation
CMU core measurements?
If none of the above projects interests you and you aren't able to come up with one on your own, here are a couple of suggestions (an "algorithm" of sorts) to get you started in finding a project that does.
Find some problem/area in networking that you like (or better still, that truly excites you, and/or that seems of immediate and important relevance to you). Then address the following questions:
Has a solution been proposed for this problem in the past and, if so, what do you think of it?
If a solution has been proposed, has it been simulated properly?
If the system has been simulated, does anyone understand its mathematical properties?
If the system has been simulated and theoretically modeled, has it been implemented and its performance studied under realistic conditions?
Is the solution scalable? Is it secure? How well does it hold up in the face of failures? Will it work in heterogeneous environments? Will it stand up to coming advances in technology?
If several solutions have been proposed, has anyone performed a comparative study of them? Why do some schemes work better than others? Can we characterize the conditions under which some schemes work better than others?