The rest of this document assumes that you already know how to write programs. It does not attempt to cover the same ground as NCSA's introduction to forms, which should be considered the starting point for explorations of forms. You should also be sure to read the Common Gateway Interface documentation, which describes the interface that the HTTP server defines between HTML messages and server-side executables.
There are some basic features of server-side scripts that, if used correctly, will minimize the potential for security problems:
When a URL such as
http://foo.bar.baz/a/b/c
is referenced, the HTTP server at foo.bar.baz will check each successively longer leading substring of /a/b/c (i.e. /a, /a/b, etc.) against the list of "ScriptAliases" defined in the server's configuration files. A ScriptAlias looks like this:
ScriptAlias /a /some/other/place/in/the/filesystem/a
which the server interprets to mean: if anyone ever references /a/something, then execute /some/other/place/in/the/filesystem/a and return its output. Note that this implies two things about the executed program: it must send a MIME Content-Type header as its first line of output, to tell the client (Mosaic) what the output actually is (HTML? GIF? JPEG? etc.), and then it should send some "useful" output, even if it's only an "OK, message received" line. See the mail-request program mentioned below for an example of how to do this.
Only programs located in places referenced by a ScriptAlias will ever be executed by the server. In addition, the server caches a directory listing of all the programs in each location referenced in a ScriptAlias whenever it is started (or restarted), and uses it to check possible server-side programs before executing them. This prevents random programs placed in the right place from being accessed without a server restart (which only a privileged user can do).
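The output requirement described above (a Content-Type header first, then some useful output) can be sketched in a few lines. This is only an illustration, not the mail-request script: the function name write_cgi_response and its parameters are made up for this example, and the blank line after the header is the standard CGI separator between headers and body.

```python
import sys


def write_cgi_response(body, content_type="text/html", out=None):
    """Write a minimal CGI response to `out` (standard output by default).

    Hypothetical helper for illustration only.
    """
    out = sys.stdout if out is None else out
    # The Content-Type header must be the first line of output.
    out.write(f"Content-Type: {content_type}\n")
    # A blank line ends the headers and starts the body.
    out.write("\n")
    # Then send some "useful" output, even if it is only an acknowledgement.
    out.write(body)
    if not body.endswith("\n"):
        out.write("\n")
```

A script would typically finish with a call such as write_cgi_response("<p>OK, message received</p>").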
If the server finds that /a/b is actually an executable program in a ScriptAlias location, it executes the program, passing it data in two ways.
First of all, any text left over from the URL that has not been "used" to find the script will be used to set the value of an environment variable named PATH_INFO. In the example above, this would be relatively simple: PATH_INFO would just be /c. However, nearly arbitrary text can be used here:
http://foo.bar.baz/a/b/long=4748.39?//limit:=$!!:h+aposto:*&%&^$$#{fhfh}
This will result in PATH_INFO being set to:
/long=4748.39?//limit:=$!!:h+aposto:*&%&^$$#{fhfh}
(note the initial `/'). The main restriction is that spaces are not allowed, or rather, will terminate the component of the URL used to set PATH_INFO.
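The way the leftover text becomes PATH_INFO can be sketched as a simple prefix search. This is a rough model of the behaviour described above, not the server's actual code: the function split_script_path and the set known_scripts (standing in for the server's cached listing of programs) are assumptions made for the example.

```python
def split_script_path(url_path, known_scripts):
    """Return (script, path_info) for the first leading prefix of url_path
    that names a known script, or None if no prefix matches.

    Hypothetical helper; `known_scripts` is a set of URL paths such as {"/a/b"}.
    """
    # A candidate prefix can end just before each "/" after the first
    # character, or at the very end of the path.
    cut_points = [i for i, ch in enumerate(url_path) if ch == "/" and i > 0]
    cut_points.append(len(url_path))
    for cut in cut_points:
        prefix = url_path[:cut]
        if prefix in known_scripts:
            # Everything not "used" to find the script is left over.
            return prefix, url_path[cut:]
    return None


known_scripts = {"/a/b"}  # assumed for illustration
result = split_script_path("/a/b/c", known_scripts)
```

Here result holds the script path /a/b and the leftover /c, which keeps its initial `/' as noted above.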
In addition, if you are using a forms interface, the values of all the <input> and <select> tags in the form will be made available as the standard input of the program.
The form data passed to your program is URL-encoded. A program called urldecode can be used by your own programs (easily, if they are shell, awk or perl scripts) to do the decoding. Invoke it as:
/cse/www/htbin-post/urldecode
and it will convert any encoded data read from its standard input into its original form on its standard output. At some point, I'll add an object module you can link with to do this from a compiled language like C (although you may get there before me, since the encoding is so simple).
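For a sense of how simple the encoding is, the decoding step can be sketched with Python's standard urllib.parse module. This is not the urldecode program itself, whose exact behaviour is not documented here; the function names below are invented for the example, and the sample field names are made up.

```python
from urllib.parse import parse_qsl, unquote_plus


def urldecode_stream(infile, outfile):
    """Read URL-encoded text from infile and write the decoded text to outfile.

    Hypothetical stand-in for a stdin-to-stdout decoder: "+" becomes a space
    and each %XX escape becomes the character it encodes.
    """
    outfile.write(unquote_plus(infile.read()))


def decode_form(encoded):
    """Split URL-encoded form data into a list of (name, value) pairs."""
    # keep_blank_values=True retains fields that were submitted empty.
    return parse_qsl(encoded, keep_blank_values=True)


# Sample data with made-up field names.
fields = decode_form("name=Jane+Doe&request=Kind%20of%20Blue&note=")
```

In a real script the input would come from standard input, for example urldecode_stream(sys.stdin, sys.stdout) after importing sys.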
More details are available about writing server scripts in the Common Gateway Interface documentation, where a number of other environment variables that are available to the program are described.
/projects/ai/590i/post-bin
An example program, called mail-request, is already there (it's a shell/awk script). This is the program I use for the interface to my music collection, so take a look at the HTML source for that stuff to see how this is used. I intended it to be usable by anyone else, and for a variety of purposes. Suggestions are welcome.
The HTTP daemon will be restarted about 4 times a day, and on the next restart after you have placed a program there, you will be able to have a link to it result in its execution. After that, you can keep changing the program in any way you see fit, and the daemon won't care - it merely notes presence or absence.
I want to reiterate that this is potentially a big security issue. Please take care in how you handle arguments, how you handle input, and what your program does or might do.
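One common way to take that care is to accept only input that matches a conservative pattern, and to pass it to other programs as a separate argument and never through a shell. The sketch below is one possible approach under those assumptions, not a rule set from this server: the pattern, the function names, and the 64-character limit are all chosen for illustration.

```python
import re
import subprocess

# Assumed whitelist: letters, digits and a few characters common in
# e-mail addresses. Anything else (spaces, quotes, ";", "$", ...) is rejected.
SAFE_VALUE = re.compile(r"[A-Za-z0-9._@+-]{1,64}")


def is_safe_value(value):
    """Return True only if the whole value matches the whitelist and cannot
    be mistaken for a command-line option."""
    return SAFE_VALUE.fullmatch(value) is not None and not value.startswith("-")


def run_with_argument(program, value):
    """Run `program` with one validated argument (hypothetical helper).

    The argument list is passed directly, so no shell ever interprets
    the value.
    """
    if not is_safe_value(value):
        raise ValueError("rejected unsafe form value")
    return subprocess.run([program, value], capture_output=True, text=True)
```

Rejecting unexpected input outright is usually safer than trying to escape it, since the script runs with whatever access the server's uid has.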
For the time being, all instances of these programs will run as the uid "nobody". Also, access to both areas (the private one and the 590i area) is currently limited to machines in the .cs.washington.edu domain. This restriction is inconvenient, and is intended to create a temporary breathing space so that we can get more experience with potential security issues.
webmaster@cs.washington.edu