Date: Tue, 05 Nov 1996 21:47:55 GMT Server: NCSA/1.5 Content-type: text/html Last-modified: Thu, 19 Jan 1995 17:35:08 GMT Content-length: 7401
One of the design features of the Web is ``hypertext.'' Hypertext is the idea of creating a link from part of one document to another part of a document (either the same document, or a different one). The concept is really no different than a cross-reference, but the referenced document is available to the user in an instant.
In a hypertext system some standard method of specifying links must be created. The designers of the Web decided that for their needs, the most appropriate way to specify the start of a link was to mark a section of a document with special tags (see the HTML tutorial). Specifying the ``destination'' of the link was more complicated, because each part of each document on the internet had to have a unique ``name''. To solve the problem, they created the Uniform Resource Identifier or URI specification. The syntax of URIs was designed to be:
The URI specification was in turn used to write specifications for the naming of documents or services available through existing internet protocols. These initial specifications were named Uniform Resource Locators or URLs. In this document we discuss some of the commonly used URL specifications.
When people talk about putting documents on the Web, they are usually referring to putting their document in the namespace of a Hypertext Transfer Protocol or HTTP server. HTTP was designed to satisfy the curious needs of the Web, namely the quick, anonymous retrieval of documents.
The http server provides each user with their own Web namespace, starting in the directory ~/public/html. The server has the permissions of any other user, so the files in this directory must be readable by any user. We will describe what the full URL of this namespace later in this document.
URLs are similar to pathnames in a file-system, and just as with pathnames it is possible to construct relative URLs. In a Unix shell, pathnames name files relative to a ``working directory.'' URLs name documents relative to the URL of the document a link is being made from. Roughly speaking, if when you make a link from document a to document b and you leave out parts of the URL for b, those parts will be copied from the URL that was used to retrieve a.
For example, if I have a document called home.html, and I want to make a link from it to another document called todo.html, I can simply use the relative URL ``todo.html''. If I have a third document in a subdirectory data called results.html, I can use the relative URL ``data/results.html'' to link to it from home.html (URLs generally use the ``/'' character to separate directory names, just like Unix). To link from data/results.html to home.html you could use the relative URL ``../home.html''.
The name the http server uses for each user's web names pace is ``/~username/''. Remember that this points to the subdirectory ~username/public/html/ and not to ~username/. If you wanted to link to my home page from a document in your directory, you could use the relative URL ``/~keeper/keeper.html''.
To link to a file or service that is not on the CS http server, you need to know its full URL. To link to documents on other http servers, a full URL looks like the following:
http://fqdn/pathname
Where fqdn is the ``fully-qualified domain name'' (as in www.cs.wisc.edu). (Sometimes a server will be running on a port other than the standard one, in which case you follow the fqdn with a colon and the port number, as in monty.cs.wisc.edu:1080.) The full URL for the home page of the CS department is:
http://www.cs.wisc.edu/
The home page is located at the root of the tree. The full URL for my home page is:
http://www.cs.wisc.edu/~keeper/keeper.html
Another useful URL type is ftp, which allows you to link to files on anonymous FTP servers. The format is exactly the same as http URLs, except http becomes ftp. The URL for the condor directory on the CS FTP server is:
ftp://ftp.cs.wisc.edu/condor/
To create a link to a specific part of a document, that document must be written in HTML, and contain a named anchor (see the tutorial). The URL would simply be the URL of the file followed by a hash mark (``#'') and the name of the anchor. If the anchor is in the same document the URL is just ``#anchorname''.
The http server provides an access control system similar to the access control lists in AFS, but because all the files are already on AFS you should use it rather than the server to protect your files.
Normally when the server gets a request for the URL of a directory it returns a list of files. However, if a file named index.html exists in the directory it is sent instead. The CS home page is stored in the file index.html in the root directory. Many people like to make a link from index.html to their home page (as in ln -s username.html index.html) so that when the URL
http://www.cs.wisc.edu/~username/
is requested their home page is returned.
Each Web browser has a method for sending a search string to a remote server, and many Web services rely on this search string. Luckily, the method used for sending the string is to translate and append it to the URL of the service preceded by a question mark. The translation changes spaces to plus signs (``+'') and encodes other seldom-used special characters in hexadecimal. The finger script on our server uses this feature. To make a link that fingers a user from a document on our server, you can use the relative URL:
/cgi-bin/finger?username