Class 24.  Network Programming.  Nov. 17, 2005

Review of Internetworking:

Client/Server Model

    Server has some service to provide.  Waits for clients to request service.
    E.g., Amazon has server.  Waits for customers to contact it to buy things.

    Transaction:
        Client contacts server.  Set up connection.
        They go back and forth in a transaction.
        Then client disconnects.
    This is known as a SESSION.

Physical structure

    * (Most) computers have connection to a local area network via network interface
    * Common LAN technologies: Switched Ethernet, Wireless

    Networks connected into an internet.
    THE Internet is one (very big) instance of an internet.

Communication

    Information sent over Internet in packets.
    Each packet contains header with information about source, destination, and format.
    Contains payload consisting of actual information.
    Typical packet size is <= 1500 bytes, so most messages must be broken into multiple packets.

    Packets are routed independently.  Complex system to determine where a packet should go.
    Typically multiple routes.  No guarantee that packets sent from source to
    destination will follow the same route.

    Internet based on BEST EFFORT service.  Everyone will try their best to
    send packet along, but if things get difficult, packets will get dropped.
    Up to hosts at either end of the connection to deal with this.

Protocol

    Communication between client & server involves combination of application
    and system software on both machines, plus the hardware & software
    supporting the Internet.

    Example: Client running Internet Explorer.  Server running Apache.

    Client Computer:
        Application: IE makes sockets library calls.
        Sockets library supports several different Internet protocols and
        includes kernel code to manage memory buffers and network interface.
        In this case, it would use TCP/IP.
        Network adapter provides connection via LAN to Internet.

    Server Computer:
        Similar structure, but with Apache accessing sockets library with its
        support for TCP/IP.
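The packet arithmetic above can be made concrete with a small sketch. The sizes used here are assumptions for illustration, not taken from these notes: a 1500-byte packet carrying a 20-byte IPv4 header and a 20-byte TCP header (no options), which leaves 1460 bytes of payload. The helper name packets_needed is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes (not from the notes): 1500-byte packets with a
   20-byte IPv4 header and a 20-byte TCP header, no options. */
#define PACKET_BYTES  1500
#define HEADER_BYTES  40
#define PAYLOAD_BYTES (PACKET_BYTES - HEADER_BYTES)

/* Hypothetical helper: number of packets needed to carry a message of
   message_bytes when each packet holds payload_bytes of data.
   This is a ceiling division. */
size_t packets_needed(size_t message_bytes, size_t payload_bytes)
{
    assert(payload_bytes > 0);
    return (message_bytes + payload_bytes - 1) / payload_bytes;
}
```

Under these assumptions a 1,000,000-byte message needs packets_needed(1000000, PAYLOAD_BYTES) = 685 packets, each of which may take a different route.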
Internet Addresses

    Most common protocol used (IPv4) has 32-bit IP addresses, stored in
    big-endian byte order (referred to as network byte order).

    IP address designates a particular host machine.
    Simplistic model is that every machine on the Internet has a unique,
    permanent IP address.  In reality, it's more complicated than that:

    * Many machines are assigned an IP address only dynamically.  Might change
      over time.  Reduces total number of IP addresses needed.
    * Many machines sit behind firewalls.  IP address not globally unique.
      (Analogy to phone extensions behind a switchboard.)
    * Some machines have multiple IP addresses.  E.g., multiple network interfaces.

    Common representation is "dotted decimal" notation.
    Give 4 numbers, one per byte, each as decimal between 0 and 255.
    Bytes given from most significant to least significant.

    E.g., IP address 0x8002C2F2 = 128.2.194.242
          (0x80 = 128, 0x02 = 2, 0xC2 = 194, 0xF2 = 242)

    Very clumsy programmer's interface to do conversions:

    /* Data structures */
    struct in_addr {
        unsigned int s_addr;   /* network byte order (big-endian) */
    };

    /* Converting between dotted decimal & binary */
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    /* cp points to string such as "128.2.194.242".
       inp points to in_addr structure, which gets filled with the binary IP address.
       Returns nonzero on success, 0 if the string is not a valid address. */
    int inet_aton(const char *cp, struct in_addr *inp);

    /* in is the actual structure containing the binary IP address.
       Returns pointer to (static) string with address in dotted decimal notation.
       String is overwritten by the next call. */
    char *inet_ntoa(struct in_addr in);

Domain Name System (DNS)

    You don't walk around thinking about dotted decimal addresses.
    Instead refer to domain names.  E.g., www.cmu.edu, www.google.com, www.npr.org, etc.

    Mapping between IP addresses & domain names is implemented by complex,
    distributed naming system known as DNS.
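The conversion interface described above can be exercised with a short sketch. The helper names dotted_to_host and host_to_dotted are hypothetical; they wrap inet_aton and inet_ntoa and use ntohl/htonl to move between network byte order and the host's own byte order.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical helper: parse a dotted decimal string into a 32-bit
   value in HOST byte order.  Returns 1 on success, 0 if the string
   is not a valid address. */
int dotted_to_host(const char *s, uint32_t *out)
{
    struct in_addr a;

    assert(s != NULL && out != NULL);
    if (inet_aton(s, &a) == 0)
        return 0;
    *out = ntohl(a.s_addr);   /* s_addr is in network byte order */
    return 1;
}

/* Hypothetical helper: format a host-order 32-bit address as dotted
   decimal.  Copies out of inet_ntoa's static buffer into buf. */
const char *host_to_dotted(uint32_t host, char *buf, size_t len)
{
    struct in_addr a;

    assert(buf != NULL && len >= 16);   /* "255.255.255.255" + NUL */
    a.s_addr = htonl(host);
    snprintf(buf, len, "%s", inet_ntoa(a));
    return buf;
}
```

With these helpers, "128.2.194.242" parses to 0x8002C2F2, matching the example above, and a string with an out-of-range component such as "128.2.222.333" is rejected.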
    Can think of DNS as having millions of host entry structures:

    /* DNS host entry structure */
    struct hostent {
        char *h_name;         /* official domain name of host */
        char **h_aliases;     /* null-terminated array of domain names */
        int h_addrtype;       /* host address type (AF_INET) */
        int h_length;         /* length of an address, in bytes */
        char **h_addr_list;   /* null-terminated array of pointers to in_addr structs */
    };

    This data structure is designed to be more general than just for IP.
    So, it has field h_addrtype where different types are indicated by numbers
    (we use AF_INET), and addresses can be of different lengths.
    Also declares list of addresses as char **.  Better way would be to use
    void **, but that is a notation that postdates original definition of
    sockets interface.

    Can get entry for a host either by giving its address, or by giving its name:

    struct hostent *gethostbyname(const char *name);

    // addr is pointer to in_addr struct.
    // len is length of in_addr struct
    // type is AF_INET
    struct hostent *gethostbyaddr(const char *addr, int len, int type);

    Here's an example of using all of this code:

    #include "csapp.h"

    /* This is very rough code.  Expects single argument to be either
       dotted decimal or name */
    int main(int argc, char **argv)
    {
        /* argv[1] is a domain name or dotted decimal IP addr */
        char **pp;
        struct in_addr addr;
        struct hostent *hostp;

        if (argc != 2) {
            fprintf(stderr, "usage: %s <domain name or dotted decimal>\n", argv[0]);
            exit(0);
        }

        // Attempt to convert from dotted decimal to IP address
        if (inet_aton(argv[1], &addr) != 0)
            // Success: Use gethostbyaddr to get host entry
            // Note size and type entries
            hostp = Gethostbyaddr((const char *)&addr, sizeof(addr), AF_INET);
        else
            // Must be domain name.
            hostp = Gethostbyname(argv[1]);

        // Get official name
        printf("official hostname: %s\n", hostp->h_name);

        // Get aliases.
        for (pp = hostp->h_aliases; *pp != NULL; pp++)
            printf("alias: %s\n", *pp);

        // Get all addresses
        for (pp = hostp->h_addr_list; *pp != NULL; pp++) {
            // Entries *should* be in_addr structs, but we assume they are unsigneds.
            // Fill in to s_addr field of local in_addr struct.
            addr.s_addr = *((unsigned int *)*pp);
            // Call inet_ntoa to get dotted decimal name
            printf("address: %s\n", inet_ntoa(addr));
        }
        return 0;
    }

    Interesting examples:
        aol.com   (Choose one of its dotted decimal addresses)
        www.aol.com
        mit.edu
        cs.mit.edu
        cobia.cs.cmu.edu
        cobia.ics.cs.cmu.edu
        128.2.205.225

Sockets

    We see that DNS lets us determine IP address from domain name.
    IP routing system provides mechanism to get from one host to another.

    To do anything useful, must establish a CONNECTION.  For TCP, have following:
        Point-to-point
        Full duplex
        Reliable.  Arbitrarily long messages.  Delivered without
            duplications, omissions, or corruption

    Connection identified by its two endpoints, which are called SOCKETs.

    A socket is identified by an IP address, plus a 16-bit PORT number.
    Ports are either WELL KNOWN, for specific service (e.g., 80 for http),
    or EPHEMERAL, assigned automatically by the kernel when a client makes a
    connection.
    Port identifies particular client or server on the host.

    Example: server machine www.cmu.edu (128.2.11.43) is running WWW server.
    When my web browser asks for web page from there, it will set up a connection:

    (Set up Ethereal.  Have it monitor VPN.  Start capture.  Browse pages at
    www.cmu.edu.  Stop capture.  Choose some http packet.  Look at stats).

        Server socket:  128.2.11.43:80
        Client socket:  Depends on session.  Sample: 128.2.216.107:2371

    Once set up connection, can conduct session.  Client requests service
    (web pages), server responds.  All over single connection.

    Type of service identified by server's port number.  Examples:
        23:    telnet
        25:    mail
        80:    http
        20/21: ftp
    (Look at /etc/services for complete list on Linux).

Socket interface

    Standard way for application programs to communicate over network.
    Once set up, makes communication look like file I/O:
        write to socket:  Send message
        read from socket: Receive message

    Example: An echo server & client.
    These are called "echoclient" and "echoserveri".

        echoserveri PORT
        echoclient HOST PORT

    Hardest part about sockets is setting them up, especially for server.
    Once established, simply use file read & write.

    Made more arcane by desire for generality (not just IP), use of old-style C
    (predates void *), and lack of subtyping.

    Most socket calls take a generic sockaddr structure:

    /* Data structures */
    struct sockaddr {
        unsigned short sa_family;   /* protocol family */
        char sa_data[14];           /* address data. */
    };

    Just gets big block of bytes.

    Specific to IP is a sockaddr_in:

    struct sockaddr_in {
        unsigned short sin_family;   /* address family (always AF_INET) */
        unsigned short sin_port;     /* port num in network byte order */
        struct in_addr sin_addr;     /* IP addr in network byte order */
        unsigned char sin_zero[8];   /* pad to sizeof(struct sockaddr) */
    };

    Uses bytes as:
        0-1:  Set to AF_INET
        2-3:  Port number (network byte order)
        4-7:  IP address (network byte order).  Format a struct in_addr
        8-15: Unused.

    Echo client:

    #include "csapp.h"

    int main(int argc, char **argv)
    {
        int clientfd, port;
        char *host, buf[MAXLINE];
        rio_t rio;

        if (argc != 3) {
            fprintf(stderr, "usage: %s <host> <port>\n", argv[0]);
            exit(0);
        }
        host = argv[1];
        port = atoi(argv[2]);

        // Open echo connection to specified host & port.
        clientfd = Open_clientfd(host, port);
        Rio_readinitb(&rio, clientfd);

        printf("type:"); fflush(stdout);
        while (Fgets(buf, MAXLINE, stdin) != NULL) {
            // Send data to server
            Rio_writen(clientfd, buf, strlen(buf));
            // Get back response
            Rio_readlineb(&rio, buf, MAXLINE);
            printf("echo:");
            // Print out response
            Fputs(buf, stdout);
            printf("type:"); fflush(stdout);
        }
        // Close the connection
        Close(clientfd);
        exit(0);
    }

    Key steps for client socket:
        socket:  allocate local data structures for a socket
        connect: Make the connection to a remote host & port
        While user has text:
            Read from user
            Write to socket: Send
            Read from socket: Receive echoed line
        close:   Close connection & deallocate data structures.
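The byte layout of sockaddr_in described earlier in this section can be checked directly. This is a minimal sketch, assuming a Linux-style layout where sin_family occupies bytes 0-1 (BSD-derived systems put a one-byte length field first, so only bytes 2 onward are inspected here). The helper names fill_sockaddr_in and sockaddr_in_byte are hypothetical.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fill *sa for a dotted decimal address and a
   port given in host byte order.  Returns 1 on success, 0 if the
   address string is not valid. */
int fill_sockaddr_in(struct sockaddr_in *sa, const char *dotted,
                     unsigned short port)
{
    assert(sa != NULL && dotted != NULL);
    memset(sa, 0, sizeof(*sa));        /* also clears the padding */
    sa->sin_family = AF_INET;
    sa->sin_port = htons(port);        /* host to network byte order */
    return inet_aton(dotted, &sa->sin_addr) != 0;
}

/* Hypothetical helper: return byte i of the structure exactly as it
   is stored in memory. */
unsigned char sockaddr_in_byte(const struct sockaddr_in *sa, size_t i)
{
    unsigned char bytes[sizeof(*sa)];

    assert(sa != NULL && i < sizeof(*sa));
    memcpy(bytes, sa, sizeof(*sa));
    return bytes[i];
}
```

For the address 128.2.11.43 and port 80, bytes 2-3 come out as 0x00 0x50 and bytes 4-7 as 128, 2, 11, 43, regardless of the host's own byte order, because both fields are stored in network byte order.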
    Opening the client connection:

    int open_clientfd(char *hostname, int port)
    {
        int clientfd;
        struct hostent *hp;
        struct sockaddr_in serveraddr;

        // Request TCP/IP socket
        if ((clientfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
            return -1;   /* check errno for cause of error */

        /* Fill in the server's IP address and port */
        if ((hp = gethostbyname(hostname)) == NULL) {
            close(clientfd);   /* don't leak the descriptor */
            return -2;   /* check h_errno for cause of error */
        }

        // Clear out entire data structure
        bzero((char *) &serveraddr, sizeof(serveraddr));
        // Set bytes 0-1
        serveraddr.sin_family = AF_INET;
        // Set bytes 4-7
        // Source is from host entry.
        // sin_addr is a field of type struct in_addr.
        // h_length indicates number of bytes in address
        bcopy((char *)hp->h_addr_list[0],
              (char *)&serveraddr.sin_addr.s_addr, hp->h_length);
        // Set bytes 2-3
        serveraddr.sin_port = htons(port);

        /* Establish a connection with the server */
        // SA is an alias: "typedef struct sockaddr SA;"
        if (connect(clientfd, (SA *) &serveraddr, sizeof(serveraddr)) < 0) {
            close(clientfd);   /* don't leak the descriptor */
            return -1;
        }
        return clientfd;
    }

    What must a server do?

    1. Set up a listening socket.  (socket/bind/listen)
       One that responds to client request at specified port.
    2. Accept a connection (accept).
       This will set up a connection socket with same port number as server
    3. Perform the echoing function (read/write)
       (read from client, convert to upper case, write back result)
    4. Close this connection, so that can accept a new one.  (close)

Echo Server.
    First take care of easy part: What the server does once the connection
    has been established:

    #include "csapp.h"

    /* Destructively modify string to be upper case */
    void upper_case(char *s)
    {
        while (*s) {
            /* Cast avoids undefined behavior for negative char values */
            *s = toupper((unsigned char)*s);
            s++;
        }
    }

    void echo(int connfd)
    {
        size_t n;
        char buf[MAXLINE];
        rio_t rio;

        Rio_readinitb(&rio, connfd);
        while ((n = Rio_readlineb(&rio, buf, MAXLINE)) != 0) {
            printf("server received %d bytes\n", (int)n);
            upper_case(buf);
            Rio_writen(connfd, buf, n);
        }
    }

    Now let's look at everything else:

    int main(int argc, char **argv)
    {
        int listenfd, connfd, port, clientlen;
        struct sockaddr_in clientaddr;
        struct hostent *hp;
        char *haddrp;
        unsigned short client_port;   /* ports range up to 65535 */

        if (argc != 2) {
            fprintf(stderr, "usage: %s <port>\n", argv[0]);
            exit(0);
        }
        port = atoi(argv[1]);

        listenfd = Open_listenfd(port);   /* Create a listening socket */
        while (1) {
            clientlen = sizeof(clientaddr);
            // Accept connection request.  Creates a new socket
            // Fills in information about the client
            connfd = Accept(listenfd, (SA *)&clientaddr, &clientlen);

            /* determine the domain name and IP address of the client */
            hp = Gethostbyaddr((const char *)&clientaddr.sin_addr.s_addr,
                               sizeof(clientaddr.sin_addr.s_addr), AF_INET);
            haddrp = inet_ntoa(clientaddr.sin_addr);
            client_port = ntohs(clientaddr.sin_port);
            printf("server connected to %s (%s), port %d\n",
                   hp->h_name, haddrp, client_port);

            // Perform echo function
            echo(connfd);
            printf("Connection closed\n");
            // Close the connection
            Close(connfd);
        }
        exit(0);   /* not reached */
    }

    Creating a listening socket

    int open_listenfd(int port)
    {
        int listenfd, optval=1;
        struct sockaddr_in serveraddr;

        /* Create a socket descriptor */
        // Allocate resources for a TCP/IP socket
        if ((listenfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
            return -1;

        /* Eliminates "Address already in use" error from bind. */
        if (setsockopt(listenfd, SOL_SOCKET, SO_REUSEADDR,
                       (const void *)&optval, sizeof(int)) < 0)
            return -1;

        // Create a sockaddr structure for the server
        /* Listenfd will be an endpoint for all requests to port
           on any IP address for this host */
        bzero((char *) &serveraddr, sizeof(serveraddr));
        serveraddr.sin_family = AF_INET;                     // IP
        serveraddr.sin_addr.s_addr = htonl(INADDR_ANY);      // Any local address
        serveraddr.sin_port = htons((unsigned short)port);   // Specified port

        // Bind this socket to this sockaddr structure
        if (bind(listenfd, (SA *)&serveraddr, sizeof(serveraddr)) < 0)
            return -1;

        // Start listening
        /* Make it a listening socket ready to accept connection requests */
        if (listen(listenfd, LISTENQ) < 0)
            return -1;
        return listenfd;
    }

    Key idea: Server has two types of sockets, with two types of file descriptors:
        Listening:  Waiting for request from client
        Connection: Servicing particular client

    Will see that can set up multiple connections simultaneously.
    All have same server host:port.  Sockets code keeps track of what to do
    with incoming message based on host:port of client.
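The key idea above can be observed on a single machine. The following is a minimal sketch, assuming the loopback interface is available: it opens a listening socket on a kernel-chosen port, and a helper reports the port at either end of a connected socket. All three helper names are hypothetical, and the raw socket calls are used directly instead of the csapp.h wrappers.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: listening socket on 127.0.0.1.  Binding to
   port 0 lets the kernel pick a free port, returned through *port.
   Returns the listening descriptor, or -1 on error. */
int open_loopback_listener(unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd;

    assert(port != NULL);
    if ((fd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 8) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}

/* Hypothetical helper: connect a client socket to 127.0.0.1:port.
   Returns the connected descriptor, or -1 on error. */
int connect_loopback(unsigned short port)
{
    struct sockaddr_in addr;
    int fd;

    if ((fd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: port number of the local end (want_peer == 0)
   or the remote end (want_peer != 0) of a connected socket.
   Returns -1 on error. */
int endpoint_port(int fd, int want_peer)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int rc;

    rc = want_peer ? getpeername(fd, (struct sockaddr *)&addr, &len)
                   : getsockname(fd, (struct sockaddr *)&addr, &len);
    return rc < 0 ? -1 : (int)ntohs(addr.sin_port);
}
```

If two clients connect and the server calls accept twice, both connection descriptors report the same local port as the listening socket, while their peer ports differ. That client host:port pair is what lets the sockets code route each incoming message to the right connection.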