Copyright © 2003-2005, Intel Corporation
Autograph Manual
Autograph is a system that automatically generates signatures for novel Internet worms that propagate using TCP transport. The signatures generated by Autograph are byte patterns unique and specific to Internet worm traffic payloads. Those signatures can be used by conventional Intrusion Detection Systems (IDSes) to identify and/or filter malicious worm flows.
Fast and accurate generation of worm signatures is crucial for IDSes to intervene effectively to halt and reverse the spread of novel Internet worms. However, signature generation is known to entail non-trivial human labor and thus significant delay. Autograph's main goal is to reduce the time needed to generate good signatures when a worm outbreak occurs.
Autograph is designed with the following desirable properties in mind:
Autograph achieves those properties by monitoring suspicious traffic flowing into the edge networks' DMZs and analyzing the prevalence of portions of flow payloads. Moreover, Autograph leverages multiple monitoring points dispersed across the Internet to achieve much faster signature detection. For more detail on Autograph's signature generation technique and the preliminary analysis, see our Usenix Security paper titled Autograph: Toward Automated, Distributed Worm Signature Detection (http://autograph.cs.cmu.edu).
Autograph was proposed recently and is still evolving. The current distribution of the Autograph source code includes only a subset of the full Autograph features presented in the Usenix Security paper.
We keep adding features to the Autograph implementation. We greatly appreciate your feedback and suggestions, which are valuable for future releases of an improved Autograph.
The official Autograph website, where you can find the latest Autograph distribution and related documentation, is located at: http://autograph.cs.cmu.edu.
Send questions on any Autograph subject to autograph-users@mailman.srv.cs.cmu.edu. You can subscribe by going to the official Autograph website.
Autograph needs to be installed at the boundary of your network, where communication between internal and external hosts can be monitored. The DMZ is the typical place to put Autograph. However, note that Autograph currently relies on port-scanner information to identify suspicious flows. Thus, Autograph needs to be placed before any proxy that filters out scanning activity.
You may choose to feed Autograph only inbound traffic because of restrictions in your network topology or a large traffic volume. In that case, Autograph will try to infer the completion of connection setup based on a time-out. Since the current version performs content-prevalence analysis on inbound suspicious flows, Autograph is still able to generate signatures. However, this one-way monitoring will affect the performance and accuracy of Autograph's suspicious flow selection heuristic.
Note: Autograph needs the full source code of rabinpoly in order to compile. You don't have to compile or install the rabinpoly library. For more detail, refer to the Installation section.
You need the following information before beginning the Autograph installation.
Autograph installation is very easy. First, un-tar both the Autograph and
rabinpoly distributions under the same directory. Let's say you build
Autograph from the tmp-autograph directory.
> cd tmp-autograph
> tar xvzf autograph-0.1.tar.gz
> tar xvzf rabinpoly-1.0.tar.gz
> cd autograph-0.1
> ./configure
This will generate Makefiles that automatically locate the rabinpoly source
distribution and will eventually install Autograph in
/usr/local/autograph. If you want to install Autograph in a
different directory, say /path/to/autograph, run
> ./configure --prefix=/path/to/autograph
If you have the rabinpoly source distribution in a different directory,
run ./configure with the --with-rabinpoly option specified.
> ./configure --with-rabinpoly=/path/to/rabinpoly-1.0
Then, you build the source code by typing,
> make
> make install
This will install the compiled Autograph components (sffilter, copp,
interpreter.pl, autograph) under the
/path/to/autograph/bin directory and the sample configuration files
(sffilter.cfg, copp.cfg, blacklist.txt) under the
/path/to/autograph/etc directory. The last command may require superuser
privileges, depending on the permissions of the installation directory.
If you want to install related documents including this manual, type
> make doc
> make docinstall
This will copy a set of compiled Autograph documents into
/path/to/autograph/doc directory.
The current version of Autograph consists of two components, each of which runs as a separate program and needs a separate configuration file.
Autograph needs sufficient hard disk space to construct the suspicious
flow pool and store intermediate information. That means you need to prepare
a directory that Autograph can access. Let's say you use /autograph-dir
for this purpose.
First, copy the two example configuration files, /path/to/autograph/etc/sffilter.cfg and /path/to/autograph/etc/copp.cfg, into /autograph-dir.
You can also find the example configuration files in the etc directory under the
original Autograph source tree.
> cd /autograph-dir
> cp /path/to/autograph/etc/sffilter.cfg .
> cp /path/to/autograph/etc/copp.cfg .
Second, you need to provide your site-specific information to Autograph by
editing the copied sffilter.cfg. The value of internal_network
should be replaced with your local network information. Assume that your network
uses the IP address spaces 10.0.0.0/16 and 172.16.1.0/24. Then the edited
sffilter.cfg looks like
internal_network 10.0.0.0/16, 172.16.1.0/24
stat_report_interval 600 # 5 min
. . .
If you want to run Autograph with the default parameter values, this is the end of the configuration. For details on configuring the other parameters, see Configuration.
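To make the meaning of the internal_network value concrete, here is a minimal Python sketch of how a comma-separated list of prefixes can be used to decide whether an address is internal. It only illustrates the idea; it is not Autograph's own parser, and the function names parse_internal_network and is_internal are hypothetical.

```python
import ipaddress

def parse_internal_network(value):
    """Split a comma-separated prefix list (hypothetical helper, not Autograph code)."""
    return [ipaddress.ip_network(part.strip()) for part in value.split(",")]

def is_internal(addr, networks):
    """Return True if addr falls inside any of the configured prefixes."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in networks)

# The example value from sffilter.cfg above.
networks = parse_internal_network("10.0.0.0/16, 172.16.1.0/24")
```

With these prefixes, 10.0.5.9 and 172.16.1.200 are treated as internal hosts, while 10.1.0.1 and 172.16.2.1 fall outside both ranges and would be treated as external.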
For your convenience, the Autograph distribution includes a wrapper shell
script, autograph, that invokes both sffilter and copp.
The wrapper shell script is in /path/to/autograph/bin or in the
aux directory under the original Autograph source tree.
If the network interface you are monitoring is eth0, you can start
Autograph by simply typing,
> cd /autograph-dir
> /path/to/autograph/bin/autograph start
If the network interface is other than eth0, specify its name with the
--with-interface=<netinterface> option. Use the --help option to
find more information on the other available options of the wrapper shell
script.
  > /path/to/autograph/bin/autograph --help
  Usage: ./autograph {start|stop} [VAR=VALUE] ...
For start command, you have to specify the following variables. Otherwise,
the program will try to determine the values after examine a few possible
directories in your system.
Variables:
  --with-autograph=ARG  Path to installed autograph directory.
  --with-copp-config=ARG        Configuration file name for COPP.
  --with-sffilter-config=ARG    Configuration file name for Filter.
  --with-bpffilter=ARG  Optional BPF filter. (default: none)
  --with-interface=ARG  Network interface. (default: eth0)
Check with ps that both programs, sffilter and copp, have been launched successfully.
To stop Autograph, use the autograph stop command.
If you didn't make any configuration change other than internal_network,
Autograph will create a suspect_pool directory under your current directory
(here, /autograph-dir). Autograph uses this directory as its suspicious
flow pool (SFP).
You will see other files in your current directory that hold the various outputs Autograph generates. NOTE: you may not see anything interesting in those files, except sig.out, unless you have configured sffilter and copp to report states. See Configuration for details.
See the bro manual (http://www.bro-ids.org) to understand bro's signature format. The script script/interpreter.pl in the Autograph source tree may be helpful for checking the generated signatures.
This chapter presents a brief overview of the current Autograph implementation included in this distribution. This will also help you understand the configurable parameters. As noted, this initial distribution omits some interesting features of the complete Autograph system. For example, this version of Autograph does not include the tattler component that allows distributed collaboration among multiple Autograph monitors. See our Usenix Security paper for the full picture of the entire Autograph system.
Currently, Autograph consists of two interacting programs (sffilter and copp). sffilter applies the suspicious flow selection heuristic (a scanner-detection-based heuristic) and triggers copp's content-prevalence analysis via a FIFO once enough suspicious flows have accumulated in the SFP. The FIFO is non-blocking, and thus sffilter can keep sending request messages as long as the FIFO has room to hold them. copp processes the requests one by one.
sffilter processes packets either from a live network interface or from tcpdump-style packet dump traces, using the libpcap library. It checks whether each packet originates from an external scanner. If so, it performs flow reassembly. The payloads of completed suspicious flows are stored as files in the SFP directory. These files (completed suspicious flows) are removed from the SFP after t_thresh seconds. If the current number of flows for a port is equal to or greater than theta_thresh, sffilter signals copp by sending a request message via the FIFO.
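The pool bookkeeping described above can be sketched in a few lines of Python. This is an in-memory illustration under stated assumptions, not sffilter's code: the real SFP is a directory of files, the class name SuspiciousFlowPool is hypothetical, and whether sffilter signals copp on every flow past the threshold or only once is not specified here (the sketch signals on every qualifying flow).

```python
from collections import defaultdict

class SuspiciousFlowPool:
    """Toy in-memory stand-in for the SFP directory (hypothetical class)."""

    def __init__(self, t_thresh, theta_thresh):
        self.t_thresh = t_thresh          # seconds a completed flow stays in the pool
        self.theta_thresh = theta_thresh  # flows per port needed to signal copp
        self.flows = defaultdict(list)    # port -> list of (arrival_time, payload)

    def expire(self, now):
        """Drop flows that have been in the pool for t_thresh seconds or more."""
        for port in list(self.flows):
            kept = [(t, p) for (t, p) in self.flows[port] if now - t < self.t_thresh]
            if kept:
                self.flows[port] = kept
            else:
                del self.flows[port]

    def add(self, port, payload, now):
        """Store a completed suspicious flow; return True if copp should be signalled."""
        self.expire(now)
        self.flows[port].append((now, payload))
        return len(self.flows[port]) >= self.theta_thresh
```

For example, with t_thresh=60 and theta_thresh=3, three flows to port 80 arriving within 20 seconds make the third add() call return True, while a flow arriving long after the earlier ones have expired does not.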
All inbound/outbound TCP packets are used to identify scanners. If an external host has made s_thresh or more failed connection attempts during the last scanner_lifetime seconds, Autograph considers the external host to be scanning. Once an IP address has been flagged as scanning, it is considered a scanner for scanner_lifetime seconds.
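The scanner rule above amounts to a sliding-window count per external address. The following Python sketch illustrates it under assumptions: the class name ScannerDetector and its methods are hypothetical, and it assumes that each new accusation restarts the scanner_lifetime period, which this manual does not state.

```python
from collections import defaultdict, deque

class ScannerDetector:
    """Sliding-window scanner detection (illustrative sketch, not sffilter code)."""

    def __init__(self, s_thresh, scanner_lifetime):
        self.s_thresh = s_thresh
        self.scanner_lifetime = scanner_lifetime
        self.failures = defaultdict(deque)  # external IP -> times of failed attempts
        self.accused_at = {}                # external IP -> time it was last flagged

    def record_failure(self, ip, now):
        """Record one failed connection attempt made by an external host."""
        window = self.failures[ip]
        window.append(now)
        # Forget failures older than scanner_lifetime seconds.
        while window and now - window[0] > self.scanner_lifetime:
            window.popleft()
        if len(window) >= self.s_thresh:
            self.accused_at[ip] = now  # assumption: re-accusation restarts the lifetime

    def is_scanner(self, ip, now):
        """True while the host is within scanner_lifetime of its last accusation."""
        t = self.accused_at.get(ip)
        return t is not None and now - t <= self.scanner_lifetime
```

With s_thresh=2 and scanner_lifetime=100, two failures at times 0 and 50 flag the host until time 150, whereas two failures 200 seconds apart never fall in the same window and do not flag it.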
Here, a failed connection attempt means one of two things. Either the internal peer host does not exist or does not run a service on the destination port, so that there is no response from the internal host within flow_inactivity_timeout seconds after the initial inbound SYN packet; or the internal peer has rejected the connection setup request. To tolerate temporary server failures and the noise from failed-connection-prone p2p applications, we keep track of each (internal host IP, port) pair used for a successful connection during the last server_lifetime seconds, and exclude failed connection attempts to those pairs from the scanner-detection count. You can turn this heuristic on or off with the live_server_heuristic parameter.
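The live-server exclusion can be pictured as a small lookup table keyed by (internal host IP, port). The Python sketch below is only an illustration of that idea; the class LiveServerTable and its method names are hypothetical, and the enabled flag merely mirrors the on/off role of live_server_heuristic.

```python
class LiveServerTable:
    """Tracks recently successful (internal IP, port) pairs (hypothetical helper)."""

    def __init__(self, server_lifetime, enabled=True):
        self.server_lifetime = server_lifetime
        self.enabled = enabled   # plays the role of live_server_heuristic on/off
        self.last_success = {}   # (internal_ip, port) -> time of last successful connection

    def record_success(self, internal_ip, port, now):
        """Remember that a connection to this internal service succeeded."""
        self.last_success[(internal_ip, port)] = now

    def counts_as_scan_failure(self, internal_ip, port, now):
        """Should a failed attempt to (internal_ip, port) count toward s_thresh?"""
        if not self.enabled:
            return True
        t = self.last_success.get((internal_ip, port))
        recently_live = t is not None and now - t <= self.server_lifetime
        return not recently_live
```

A failed attempt to a service that answered successfully within the last server_lifetime seconds is therefore not counted, while a failed attempt to any other address or port still is.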
copp waits for incoming requests from sffilter. Each request message contains the protocol number, the port, the number of suspicious flows sffilter observed, and the time. When a request message is received, copp checks the SFP directory and reads all the files corresponding to flows with the protocol and port number specified in the request message. Then it chops each flow with COPP and constructs a content-prevalence histogram. To be selected as a signature, a content block must 1) be highly ranked in the histogram, 2) be generated from at least min_prevalence flows, 3) be sent by at least source_count different external hosts, and 4) not be listed in the signature blacklist. See signature blacklist to learn how to use the signature blacklist option. The best source for more information on COPP and Autograph's content-prevalence analysis is the recent Usenix Security paper.
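The selection criteria above can be illustrated with a short Python sketch. It is a simplification under stated assumptions, not copp's implementation: chop() uses a plain byte-sum hash as a stand-in for the Rabin fingerprints that rabinpoly provides, its parameters (window, avg, breakmark, min_block) are hypothetical, and select_signatures() simply orders the blocks by prevalence instead of reproducing the selection procedure from the paper.

```python
from collections import defaultdict

def chop(payload, window=4, avg=16, breakmark=7, min_block=8):
    """Content-based chopping: end a block where a hash of the trailing
    `window` bytes equals breakmark modulo avg (toy hash, not Rabin)."""
    blocks, start = [], 0
    for i in range(window, len(payload) + 1):
        h = sum(payload[i - window:i])  # toy hash of the trailing window
        if h % avg == breakmark and i - start >= min_block:
            blocks.append(payload[start:i])
            start = i
    if start < len(payload):
        blocks.append(payload[start:])  # remainder after the last boundary
    return blocks

def select_signatures(flows, min_prevalence, source_count, blacklist=(), chopper=chop):
    """flows: iterable of (source_ip, payload_bytes).
    Returns the qualifying content blocks, most prevalent first."""
    flows_with = defaultdict(set)    # block -> indexes of flows containing it
    sources_with = defaultdict(set)  # block -> distinct source IPs that sent it
    for idx, (src, payload) in enumerate(flows):
        for block in chopper(payload):
            flows_with[block].add(idx)
            sources_with[block].add(src)
    ranked = sorted(flows_with, key=lambda b: (-len(flows_with[b]), b))
    return [
        b for b in ranked
        if len(flows_with[b]) >= min_prevalence
        and len(sources_with[b]) >= source_count
        # criterion 4: skip blocks that are substrings of a blacklisted pattern
        and not any(b in pattern for pattern in blacklist)
    ]
```

Because block boundaries depend on the content rather than on fixed offsets, a byte sequence shared by many worm flows tends to be cut into the same blocks in each flow, which is what lets the histogram count it as prevalent.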
You can specify the packet source and the configuration file on the command line.
sffilter [-hv] [-i interface] [-r inputfile1 [-r inputfile2 ...] ]
         [-w output] [-f bpffilter] [-c config]
options:
	You have to select either network interface (-i) for online monitoring
	or tcpdump packet traces (-r) for offline analysis.
	-i interface 	network interface to be monitored
	-r inputfile	tcpdump-style packet dump trace. sffilter can process
			multiple packet dump traces.
	-w output	dump packets into output in tcpdump-style format.
	-f bpffilter	tcpdump-style bpf filter.
	-c config 	sffilter configuration file.
	-h		print help
	-v		verbose
The sffilter configuration file is used to specify the following parameters.
copp [-hv] [-c config]
options:
	-c config 	copp configuration file.
	-h		print help
	-v		verbose
The copp configuration file is used to specify the following parameters.
Autograph generates signatures based on content-prevalence analysis, and thus it may report signatures that are not specific or sensitive. Signature blacklisting is one way to prevent Autograph from generating bad signatures. Create your own signature blacklist file listing the signatures that Autograph can generate but that are not good signatures. The format of the blacklist is exactly the same as Autograph's signature output (bro-style signature format). That means you can generate a signature blacklist by editing Autograph's output from a training period. Then specify the blacklist file name in the copp configuration file (blacklist parameter). Autograph will ignore all content blocks that are substrings of one of the blacklist patterns.
Here is an example signature blacklist:
signature blacklist0 {
        header ip[9:1] == 6
        payload /.*\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f\x70/
        event "Signature : An Example Signature Black List Entry"
        }
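To show what the substring rule means for the entry above, here is a minimal Python sketch that extracts the payload pattern and tests content blocks against it. It is not copp's parser: the function names are hypothetical, and it handles only the form used in this example (an optional leading .* followed by \xNN escapes and literal characters).

```python
import re

EXAMPLE = r'''
signature blacklist0 {
        header ip[9:1] == 6
        payload /.*\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f\x70/
        event "Signature : An Example Signature Black List Entry"
        }
'''

def blacklist_patterns(text):
    """Extract the byte pattern from each `payload /.../` line (simplified)."""
    patterns = []
    for body in re.findall(r"payload\s+/(.*)/", text):
        if body.startswith(".*"):
            body = body[2:]  # drop the leading wildcard
        # Turn each two-digit hex escape into its byte; keep other characters as-is.
        raw = re.sub(r"\\x([0-9a-fA-F]{2})", lambda m: chr(int(m.group(1), 16)), body)
        patterns.append(raw.encode("latin-1"))
    return patterns

def is_blacklisted(block, patterns):
    """A content block is ignored if it is a substring of any blacklist pattern."""
    return any(block in p for p in patterns)

patterns = blacklist_patterns(EXAMPLE)
```

The hex escapes in the example decode to the ASCII bytes "efghijklmnop", so a content block such as b"ghij" would be ignored, while b"abc" would not.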