15-441 Project 1 : TFTP Server

Assigned : Wednesday, September 1st, 2004

Due : Friday, September 17th, 2004, 11:59pm

Lead TA : Shafeeq Sinnamohideen

Introduction

The purpose of this project is to give you experience in developing concurrent network applications. You will use the Berkeley Sockets API to write a Trivial File Transfer Protocol (TFTP) Server.

TFTP is a deliberately simplified file transfer protocol. Specifically, it supports only the bare minimum set of operations needed to get and put files. Unlike FTP, it does not support directory listing, renaming, or access control. It is most commonly used to boot diskless workstations. The protocol is simple enough for a client to be implemented in the workstation's firmware, which uses it to fetch the operating system kernel to boot.

Logistics

You are to work in groups of 1 (alone) on this project. You are free to discuss the project and references with your classmates, but your solution must be your own.

You may code your server in C or C++. While other languages may provide features to make network programming easier, most kernel-level networking code is written in C and future assignments will require implementing such code in a simulated environment.

While you may develop & test your code on any operating system you wish, we will grade your code on Andrew Linux machines, so please make sure your code compiles and runs as expected on them.

Your Assignment

The Server

Your server must :

- Implement the protocol specified in RFC 1350 with the following changes :
  1. Only octet mode needs to be supported. The other specified modes may be implemented for extra credit (see below).
  2. The server should listen for initial connections on a UDP port that is a command-line parameter instead of fixed at 69. This is to allow multiple students to run their servers on the same machine without interference.
- Support simultaneous connections from multiple clients.
- Not utilize more than a single thread or process. Future projects involving kernel-level networking code will have this requirement, so this is a good opportunity to practice working in that environment.
- Serve only those files contained in the directory it is run in.
- Terminate after one hour. In case you inadvertently leave your server running, at least it will eventually stop. Since the server process can access any files you let it, running it longer than necessary may present a security risk.

The RFC leaves several details up to the implementor. You should be able to justify the choices you make. You may also find yourself constrained by resource or OS limitations. For instance, you may not be able to support an infinite number of simultaneous clients, or certain file names. You should identify what these limits are and do something reasonable when they are reached.
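
One limit worth pinning down is which file names the server will accept. The sketch below shows one possible policy, not a required one: it rejects empty names, names containing a slash, and names starting with a dot, so a request cannot name a file outside the current directory. The function name is_servable_name and the length cap MAX_NAME_LEN are our own assumptions.

```c
#include <string.h>

/* Hypothetical limit chosen for illustration; pick one you can justify. */
#define MAX_NAME_LEN 255

/*
 * Return 1 if `name` may be served from the current directory, 0 otherwise.
 * Policy (an assumption, not taken from the RFC): no empty names, no '/'
 * anywhere (so no absolute paths and no "../" traversal), no leading '.'.
 */
int is_servable_name(const char *name)
{
    size_t len = strlen(name);

    if (len == 0 || len > MAX_NAME_LEN)
        return 0;
    if (name[0] == '.')
        return 0;
    if (strchr(name, '/') != NULL)
        return 0;
    return 1;
}
```

Note that this check looks only at the name itself. It does not examine symbolic links inside the directory, which is one of the details you would still need to decide how to handle.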

Test Cases

As with all network services, your server should be robust against dropped packets, clients that do not correctly implement the protocol, and malicious attacks. Thus, a large part of this assignment (and programming in general) is knowing how to test and debug your work. There are many ways to do this; be creative. We would like to know how you tested your server and how you convinced yourself it actually works. To this end, you should submit your test code along with brief documentation describing what you did to test that your server works. The test cases should include both generic ones that check the server functionality and those that test particular corner cases. If your server fails on some tests and you do not have time to fix it, this should also be documented (we would rather you know and acknowledge the pitfalls of your server, than miss them). Several paragraphs (or even a bulleted list of things done and why) should suffice for the test case documentation.
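
As an example of the kind of corner case worth testing, consider a request packet whose file name or mode string is never terminated. The sketch below is one way a server might parse an RRQ or WRQ without reading past the bytes it actually received. The function name parse_request and its return convention are our own assumptions; the packet layout (a two-byte opcode followed by two NUL-terminated strings) is from RFC 1350.

```c
#include <stddef.h>
#include <string.h>

#define OP_RRQ 1
#define OP_WRQ 2

/*
 * Parse an RRQ or WRQ held in buf[0..len). On success, return the opcode
 * and point *filename and *mode at the two NUL-terminated strings inside
 * buf. Return -1 if the packet is too short, carries another opcode, or
 * either string is not terminated within the received bytes.
 */
int parse_request(const unsigned char *buf, size_t len,
                  const char **filename, const char **mode)
{
    const unsigned char *name_end, *mode_start, *mode_end;
    int opcode;

    if (len < 4)                      /* opcode plus two terminators */
        return -1;
    opcode = (buf[0] << 8) | buf[1];  /* opcodes are big-endian */
    if (opcode != OP_RRQ && opcode != OP_WRQ)
        return -1;

    /* memchr never looks past the bytes actually received. */
    name_end = memchr(buf + 2, '\0', len - 2);
    if (name_end == NULL)
        return -1;
    mode_start = name_end + 1;
    mode_end = memchr(mode_start, '\0', len - (size_t)(mode_start - buf));
    if (mode_end == NULL)
        return -1;

    *filename = (const char *)(buf + 2);
    *mode = (const char *)mode_start;
    return opcode;
}
```

A test for this function can feed it a well-formed request, the same request with its last byte cut off, and a packet of only three bytes, and check that only the first is accepted.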

Most operating systems (including Andrew Linux and Solaris) include a TFTP client program, tftp, that you can use to test your server. By default, it assumes the server is on the standard port 69, but the default can be overridden by specifying a port number after the host name. You can get the complete manual through man tftp, but for a quick tutorial:

Running tftp presents you with a tftp> prompt.

At the tftp> prompt, connect host port tells the program to use host "host" and port "port" for all subsequent operations. Note that this step only sets a variable - it does not send any messages. Invoking tftp host port has the same effect, with no need to issue a connect.

The get and put commands start the sequence beginning with RRQ and WRQ packets respectively.

trace causes the client to print the header of every packet it receives or sends, which could come in handy for debugging.

binary tells the client to use octet mode.

If you are using a non-Andrew machine, be sure you're using the latest version of its tftp client - older versions included with some Linux distributions may be buggy. In particular, netkit-tftp-0.17 or earlier, and tftp-hpa-0.15 or earlier are known not to work. As far as we can tell, the versions on Andrew are correct.

Evaluation

Core networking : 20 points
This grade is intended to reflect your ability to write the "core" networking code that deals with creating sockets, reading/writing from them, and connection multiplexing. Even if your server doesn't actually do anything with the packets it reads, you can still receive credit here. You can receive partial credit if your code correctly handles only a single connection.
TFTP protocol : 15 points
In this section, we will test how well you read, interpreted, and implemented the TFTP protocol. We will use a variety of file sizes, but all communications will be from a well-behaved client, so even if you do no error checking, you can receive full credit here.
Robustness : 15 points
- Server robustness : 10 points
- Test cases : 5 points
Since code quality is of a high priority in server programming, we will test your program in a variety of ways using a series of test cases. These may include :
- Sending your server oversized messages to test if there is a buffer overflow.
- Making sure something reasonable happens when asked to transfer non-existent files.
- Making sure your server correctly handles clients that repeat messages
However, there are many corner cases that the RFC does not specify. You will find that this is very common in real world programming since it is difficult to foresee all the problems that might arise. Therefore, we will not require your server pass all of the test cases in order to get a full 15 points. We will also run your own test code to evaluate how you tested your work.
Style : 10 points
Poor design, documentation, or code structure will probably reduce your grade by making it hard for you to produce a working program and hard for the grader to understand it; egregious failures in these areas will cause your grade to be lowered even if your implementation performs adequately. Putting all of your code in one module counts as an egregious design failure, as does having functions that run on for pages. You may find it useful to follow one of the coding standards listed in the Coding Style section.

To help your development and testing, we suggest your server optionally take a verbosity level switch (-v level) as the command line argument to control how much information it will print.
Extra Credit : up to 6 points
More details in the extra credit section below.

Hand-In

Invocation

Your executable must be named tftpd and accept the port number as its first argument. It must run correctly when given only one argument. Any additional arguments should be documented.
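
Since the port arrives as text on the command line, it is worth validating before use. The following is a minimal sketch of one way to do that with strtol; the function name parse_port and the choice to reject port 0 are our own assumptions.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Convert a command-line argument to a UDP port number.
 * Returns the port (1..65535) on success, or -1 if the text is not a
 * decimal number in that range. strtol's usual handling of leading
 * whitespace and a sign still applies.
 */
int parse_port(const char *arg)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(arg, &end, 10);
    if (errno != 0 || end == arg || *end != '\0')
        return -1;
    if (value < 1 || value > 65535)
        return -1;
    return (int)value;
}
```

In main(), a call such as parse_port(argv[1]) that returns -1 would be a natural place to print a usage message and exit.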

Building

Your project must include a makefile called Makefile. We will build a binary from your source code using the makefile and GNU make. The makefile for this project should be simple. If you need help creating the makefile, everything you need to know (and much much more) about GNU make can be found in the GNU make manual. We will build your program by executing rm -f tftpd, and then make tftpd.

Your makefile should run gcc with option -Wall and should produce no errors.

Hand-in Procedure

You should submit the following files:

Makefile, *.c, *.h
README (in standard Unix text file format, please!) discussing how you approached the assignment.
EXTRA - If you did anything for extra credit, describe it in this file.
Test code and the corresponding documentation named as TESTS

Updated handin instructions will be provided via the course bulletin board.

Coding Style

As the course progresses, we expect you to develop and refine your personal coding style. By this we mean that there are many reasonable ways to write code so it is easy for people to read (not that anything you define as "your style" will be acceptable). Below we have provided you with a list of coding standards put forth by various groups at CMU and elsewhere. We strongly suggest that you spend some time looking through several of these before you start coding.

The Linux Kernel Coding Standard
http://pantransit.reptiles.org/prog/CodingStyle.html
The FreeBSD Kernel Source File Style Guide
here or here
The Parallel Data Lab's C Coding Standard
http://www.ece.cmu.edu/~eno/coding/CCodingStandard.html

Resources

RFC 1350 Text
http://www.ietf.org/rfc/rfc1350.txt or http://www.faqs.org/ftp/rfc/pdf/rfc1350.txt.pdf
RFC 1123 Text (only section 4.2 relates to TFTP)
http://www.ietf.org/rfc/rfc1123.txt
RFC 783 Text (obsolete)
http://www.ietf.org/rfc/rfc783.txt
Peterson and Davie, pages 374-378. (section on UDP)
BSD Sockets Primer
http://world.std.com/~jimf/papers/sockets/sockets.html
An Introductory 4.4 BSD Interprocess Communication Tutorial
http://docs.freebsd.org/44doc/psd/20.ipctut/paper.pdf
GNU C Library Manual
http://www.gnu.org/software/libc/manual/html_node/index.html
select_tut man page
http://www.die.net/doc/linux/man/man2/select_tut.2.html
A select Tutorial
http://www.lowtek.com/sockets/select.html
A sample UDP client & server
http://www.pont.net/socket/index.html
The manual pages for many library calls on your system can be accessed using the man program. Some useful things to read up on are : socket, bind, connect, select, select_tut, getsockname, sendto, recvfrom, ip, udp

Hints

Depending on your previous experience, this project may be substantially larger than your previous programming projects. You can expect to write roughly 750-1000 lines of code, possibly more, possibly less. With that in mind, this section gives suggestions for how to approach the project. Naturally, other approaches are possible, and you are free to use them.

Start early. Don't wait until the last week to start working.

Read the RFC. RFCs are written in a style that you may find unfamiliar. However, it is wise for you to become familiar with it, as it is similar to the styles of many standards organizations. You may well need to reread critical sections a few times for the meaning to sink in.

Whenever the RFC mentions TIDs, it really means "UDP port number." When the original text was written, the authors introduced this level of indirection to accommodate future protocols other than UDP/IP. The upshot of this is that your server is supposed to open a new, randomly chosen port for every transfer request it handles, rather than multiplexing many requests over a single port.
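
One common way to obtain a fresh port for each transfer is to bind a new socket to port 0 and let the kernel choose. The sketch below assumes that a kernel-chosen port is an acceptable reading of "randomly chosen"; the function name open_transfer_socket is our own.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Open a UDP socket for one transfer. Binding to port 0 asks the kernel
 * to choose an unused port, which then serves as this side's TID.
 * Returns the socket descriptor and stores the chosen port (host byte
 * order) in *port_out, or returns -1 on failure.
 */
int open_transfer_socket(unsigned short *port_out)
{
    struct sockaddr_in addr;
    socklen_t addr_len = sizeof(addr);
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(0);               /* 0 = let the kernel pick */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &addr_len) < 0) {
        close(fd);
        return -1;
    }
    *port_out = ntohs(addr.sin_port);
    return fd;
}
```

The getsockname call is only needed if you want to know the port, for example to print it at a high verbosity level. Replies sent from the new socket carry the new port as their source automatically.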

RFC 1350 mentions a "Sorcerer's Apprentice" bug in the protocol that is fixed by RFC 1123. (Yes, the numbering is a bit strange. RFC 783 was the first TFTP standard, which was revised by 1123, and finally 1350. Why RFC 1350 didn't just include the text from 1123 is anyone's guess). In any case, the fix is that ACK packets should never be retransmitted, and duplicate ACKs ignored. If an ACK is lost, the side that is waiting for it will time out and retransmit the DATA packet that generated the ACK. You should take some time to understand what the problem was and how the solution preserves correctness. (And optionally why the bug gets this name)
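
On the side that sends DATA packets, the rule above can be reduced to a small decision, sketched here under our own naming (on_ack, ACK_SEND_NEXT, ACK_IGNORE). This is an illustration of the idea rather than a complete state machine, and it ignores block number wraparound.

```c
#include <stdint.h>

enum ack_action {
    ACK_SEND_NEXT,   /* ACK for the newest DATA block: send the next one */
    ACK_IGNORE       /* duplicate or unexpected ACK: send nothing */
};

/*
 * Decide how the DATA sender reacts to an incoming ACK.
 * last_sent is the block number of the most recent DATA packet sent.
 * Only the retransmission timer, never a repeated ACK, triggers a resend.
 */
enum ack_action on_ack(uint16_t last_sent, uint16_t acked)
{
    if (acked == last_sent)
        return ACK_SEND_NEXT;
    return ACK_IGNORE;
}
```

For instance, if block 5 is outstanding and a delayed ACK for block 4 arrives, on_ack(5, 4) returns ACK_IGNORE, so block 5 is not sent a second time. Resending it at that point is what would start the doubling of packets the bug is named for.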

You can choose any reasonable value for timeout and maximum number of retransmissions. Basing your values on what the tftp client does would be a good start.
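
Whatever values you pick, each transfer needs to remember when its current wait expires and how many times it has already resent. A minimal sketch of that bookkeeping follows; the struct, the function names, and the two constants are illustrative assumptions, not recommended values.

```c
#include <time.h>

/* Illustrative values only; choose and justify your own. */
#define RETRANSMIT_TIMEOUT_SECS 5
#define MAX_RETRANSMITS 5

struct retry_state {
    time_t deadline;   /* when the current wait expires */
    int retransmits;   /* resends of the current packet so far */
};

/* Start (or restart) the timer after sending a new packet at time `now`. */
void retry_reset(struct retry_state *rs, time_t now)
{
    rs->deadline = now + RETRANSMIT_TIMEOUT_SECS;
    rs->retransmits = 0;
}

/*
 * Call when checking timers at time `now`.
 * Returns 0 if the deadline has not passed, 1 if the caller should resend
 * the last packet (the timer is re-armed), or -1 if the retry budget is
 * exhausted and the transfer should be abandoned.
 */
int retry_check(struct retry_state *rs, time_t now)
{
    if (now < rs->deadline)
        return 0;
    if (rs->retransmits >= MAX_RETRANSMITS)
        return -1;
    rs->retransmits++;
    rs->deadline = now + RETRANSMIT_TIMEOUT_SECS;
    return 1;
}
```

Passing the current time in as a parameter, rather than calling time() inside, keeps the logic easy to exercise from test code.
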

Get started by writing a simple program that receives messages from a single socket and repeats them back to the sender. It won't do anything useful, but it will get you used to working with socket, bind, connect and friends. In fact, pages 35-36 in the textbook contain a program that performs similar steps, but for TCP.
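
The core of such a first program might look like the sketch below. The function names open_udp_socket and echo_once and the buffer size are our own assumptions; a real program would call echo_once in a loop and report errors.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define ECHO_BUF_SIZE 1024   /* arbitrary size for this sketch */

/* Create a UDP socket bound to `port` on all interfaces; -1 on failure. */
int open_udp_socket(unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Receive one datagram on `fd` and send the same bytes back to whoever
 * sent it. Blocks until a datagram arrives.
 * Returns the number of bytes echoed, or -1 on error.
 */
ssize_t echo_once(int fd)
{
    char buf[ECHO_BUF_SIZE];
    struct sockaddr_in peer;
    socklen_t peer_len = sizeof(peer);
    ssize_t n;

    /* recvfrom fills in the sender's address so we know where to reply. */
    n = recvfrom(fd, buf, sizeof(buf), 0,
                 (struct sockaddr *)&peer, &peer_len);
    if (n < 0)
        return -1;
    return sendto(fd, buf, (size_t)n, 0,
                  (struct sockaddr *)&peer, peer_len);
}
```

Because UDP is connectionless, the reply address has to come from recvfrom on every datagram. This is the main difference from the TCP version, where accept gives you a socket that is already tied to one peer.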

You should familiarize yourself with the concept of blocking syscalls. Basically, calls like read, recv, send, printf, and many others, block, or halt, your program until the action the call is performing completes. For example, recv will wait until it has received at least some data to return. If nobody sends your program any packets, recv will never return. While this is a perfectly logical behavior, it means that if you're trying to listen to two sockets you can't simply alternate recving on both.

To get around this problem, we have the select syscall. It waits until any one of a specified set of file descriptors (sockets) has data waiting to be read, and tells you which ones are ready. Thus, you know which sockets you can call recv on without getting blocked. select also takes a timeout parameter, which causes it to return if no data arrives on any socket in the set within the time interval. After reading the select manpage, you should be able to extend your previous program to handle multiple connections. Note that it is possible to perform this task without using select at all - for example, by using poll or non-blocking sockets instead.
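
The pattern described above can be sketched as a small helper that waits on a set of descriptors and reports the first one that is ready. The name wait_readable and its return convention are our own assumptions; the select, FD_ZERO, FD_SET, and FD_ISSET calls are the standard ones.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Wait until at least one of fds[0..nfds) is readable or timeout_secs
 * elapses. Returns the index of the first readable descriptor, -1 on
 * timeout, or -2 on error. Every descriptor must be below FD_SETSIZE.
 */
int wait_readable(const int *fds, int nfds, long timeout_secs)
{
    fd_set readable;
    struct timeval tv;
    int i, max_fd = -1, ready;

    FD_ZERO(&readable);
    for (i = 0; i < nfds; i++) {
        FD_SET(fds[i], &readable);
        if (fds[i] > max_fd)
            max_fd = fds[i];
    }

    /* select may modify tv, so it is rebuilt on every call. */
    tv.tv_sec = timeout_secs;
    tv.tv_usec = 0;

    ready = select(max_fd + 1, &readable, NULL, NULL, &tv);
    if (ready < 0)
        return -2;
    if (ready == 0)
        return -1;

    /* select rewrote the set so that only ready descriptors remain. */
    for (i = 0; i < nfds; i++)
        if (FD_ISSET(fds[i], &readable))
            return i;
    return -1;
}
```

Two details are easy to miss: the first argument to select is the highest descriptor plus one, not the number of descriptors, and the descriptor set must be rebuilt before each call because select overwrites it.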

At this point, you are ready to implement the TFTP protocol. The previous step should have given you an idea of what book-keeping will be necessary for each transfer. Read the RFC again carefully and think about the sequence of operations. Find the common tasks and group them into procedures to avoid writing the same code twice. There's more than one way to partition the tasks, with different tradeoffs. You might start by implementing a handler for one of WRQ or RRQ and moving on to supporting the second packet type in that sequence. Use tftp with the tracing option to help you find out if what's going on is what you think is going on.
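
As one example of a common task worth factoring out, both transfer directions need to build DATA or ACK packets. The sketch below follows the RFC 1350 layout (a two-byte opcode, then a two-byte block number, both big-endian); the function names are our own and this is only one of several reasonable ways to partition the work.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OP_DATA 3
#define OP_ACK 4
#define BLOCK_SIZE 512          /* fixed DATA payload size in RFC 1350 */

/* Store a 16-bit value in network (big-endian) byte order. */
static void put_u16(unsigned char *p, uint16_t v)
{
    p[0] = (unsigned char)(v >> 8);
    p[1] = (unsigned char)(v & 0xff);
}

/*
 * Build a DATA packet in `out`, which must hold at least 4 + BLOCK_SIZE
 * bytes. `data` must be a valid pointer even when data_len is 0.
 * Returns the packet length, or 0 if data_len is too large.
 */
size_t build_data(unsigned char *out, uint16_t block,
                  const void *data, size_t data_len)
{
    if (data_len > BLOCK_SIZE)
        return 0;
    put_u16(out, OP_DATA);
    put_u16(out + 2, block);
    memcpy(out + 4, data, data_len);
    return 4 + data_len;
}

/* Build a 4-byte ACK packet in `out`. Returns the packet length. */
size_t build_ack(unsigned char *out, uint16_t block)
{
    put_u16(out, OP_ACK);
    put_u16(out + 2, block);
    return 4;
}

/* A DATA payload shorter than a full block marks the end of a transfer. */
int is_final_block(size_t data_len)
{
    return data_len < BLOCK_SIZE;
}
```

Writing the bytes individually avoids any dependence on the host's byte order or on struct padding. Note that a file whose size is an exact multiple of 512 bytes ends with a DATA packet carrying zero bytes of payload.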

You could swap the last two steps around if you prefer. Given the evaluation criteria, you could receive more credit for a server that handles a single client perfectly than one that handles multiple clients but doesn't get the protocol quite right. In either case, if you don't have the first one done a week before the project is due, you will have a hard time finishing it on time.

One way of making your server terminate on time is to call alarm() early in your main() function. alarm(n) schedules a SIGALRM to be sent n seconds in the future. Unless you are using signals in your program, the default action for SIGALRM is to terminate the program. Another way of doing this is to keep track of the total time spent waiting in select() and exit once the hour is up.
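
The alarm() approach takes only a few lines. In this sketch the function name arm_lifetime_limit and the constant name are our own; it assumes the program installs no SIGALRM handler.

```c
#include <unistd.h>

#define SERVER_LIFETIME_SECS (60 * 60)   /* one hour */

/*
 * Ask the kernel to deliver SIGALRM after the server's maximum lifetime.
 * With no handler installed, the default action terminates the process.
 */
void arm_lifetime_limit(void)
{
    alarm(SERVER_LIFETIME_SECS);
}
```

Calling arm_lifetime_limit() as the first statement of main() means the limit applies even if later setup code blocks or loops unexpectedly.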

If you don't trust the access control in your server (or are implementing it later) one way to limit the possible security risk is to make your server accept connections only from the same host. This can be done by simply setting the bind parameters so that it only accepts connections from 127.0.0.1 instead of INADDR_ANY. Once you are confident that your server is secure, you can remove this restriction.
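
A minimal sketch of the loopback-only bind follows, with the function name open_loopback_socket as our own assumption. The only change from an ordinary bind is the address constant.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a UDP socket bound to `port` on the loopback interface only.
 * Returns the descriptor, or -1 on failure.
 */
int open_loopback_socket(unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    /* INADDR_LOOPBACK is 127.0.0.1; INADDR_ANY would accept any interface. */
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

If you use this during development, remember to test from the same machine with tftp localhost port, and apply the same address to any per-transfer sockets you open.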

Extra Credit

Our intent in suggesting extra credit items is to give interested students the opportunity to explore additional topics that we do not have time to cover in class. The primary reward for working on the suggested items is the additional experience and knowledge that they give you, not extra credit points. Extra credit will be granted at the discretion of the teaching staff.

For each suggestion, we list a rough estimate of the number of points you can receive. If you have more specific expectations about the extra credit you will receive, you should consult your TAs beforehand to avoid any disappointment. If you work on the suggested topics below, please include in your project submission a file called EXTRA, describing what you have done.

Test case, 3 points. In general, your test code will be evaluated in the robustness part (see the Evaluation section). However, you can earn 3 points if your test code captures an interesting error case and is adopted for project grading.
Access control, 3 points. Since TFTP provides no access control, any client can fetch or write any file on the server that it knows the name of, and that the server process has permission to access. Most deployed servers work around this by restricting the set of files that are served. The previously stated requirement to serve only files in the current directory is one way of accomplishing this. You could, however, come up with a more flexible mechanism for providing a useful TFTP service without compromising critical files on the server.
Netascii mode, 3 points. Implement this mode as specified by the RFC.
RFC 2348 TFTP Blocksize Option, 3 points.
RFC 2349 TFTP Timeout Interval and Transfer Size Option, 3 points.