.TH KWIC HUM (rev3.7)
.ds ]W UC Berkeley
.SH NAME
kwic \- key word in context concordance
.SH SYNOPSIS
.nf
\fBkwic\fP  [ \fB\-k\fIn\fP \-m \-w\fIS\fP \-f\fIn\fP \-r \-l\fIn\fP \-p\fIn\fP \-i\fIc\fP \-c\fIn\fP \-d\fIF\fR + \- ]  filename ...
\-kn: keyword is n characters long (defaults to 15)
\-m : keywords not mapped from upper to lower case
\-wS: write string S onto id field (use quotes around blanks)
\-fn: filename (up to n characters) written onto id field
\-r : reset line number to 1 at beginning of every file
\-ln: line numbering begins with line n (instead of 1)
\-pn: page numbering begins with page n (instead of 1)
\-ic: page incrementer is character c (defaults to =)
\-cn: context is n characters long (defaults to 50)
\-dF: define punctuation set according to file F
\(pl : the + character indicates cedilla or umlaut
\(mi : read text from standard input (terminal or pipe)
.fi
.SH DESCRIPTION
\fIKwic\fP is a text concordance program,
intended mainly for prose,
although it is often used for poetry as well.
Normally, it prints a left-hand keyword,
a 6-digit line number or 6-place page number
(depending on how you want to label your text),
and a context of 50 characters, centered around the keyword.
Words are separated at their natural boundaries,
and adjustment is made for backspaces.
Newline characters are printed as "/",
and tabs are printed as a single blank.
If you want to have a space after the newline "/",
use the pad option of \fItprep\fP to insert a space
at the beginning of each line in your text.
The following characters are considered to be
punctuation marks:  ,.;:-"?!()[]{}  but all other
non-alphabetic characters can be part of a word.
The punctuation set can be changed with the \-d option.
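.PP
As a rough illustration of the context rules described above,
the following Python sketch cuts a window of about half the context width
on either side of a keyword and applies the newline and tab substitutions.
The function name and the offset argument are hypothetical,
and the sketch does not adjust the window to word boundaries
or compensate for backspaces, both of which \fIkwic\fP does.
.nf
```python
def context(text, start, width=50):
    # Hypothetical helper, not taken from the kwic source.
    # Take about width/2 characters on each side of the keyword
    # that begins at offset start.
    half = width // 2
    window = text[max(0, start - half):start + half]
    # Newlines are shown as "/" and tabs as a single blank.
    return window.replace(chr(10), "/").replace(chr(9), " ")

sample = "one" + chr(10) + "two" + chr(9) + "three"
print(context(sample, 4, width=8))
```
.fi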
.PP
By default, only the first 15 characters
of the keyword are printed, followed by a vertical bar;
longer keywords are truncated.
If you want more or fewer than 15 characters in the keyword,
use the \-k option to lengthen or shorten the field.
To find the longest word in your text,
use the \fImaxwd\fP program, and set \-k accordingly.
Keywords are mapped to lower case to simplify sorting,
unless the \-m option is specified.
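.PP
The keyword field can be pictured with a short Python sketch.
The helper name is hypothetical,
and blank-padding of short keywords is an assumption made here
so that the vertical bars line up;
only the truncation, the bar, and the case mapping are stated above.
.nf
```python
def keyword_field(word, width=15, fold_case=True):
    # Hypothetical helper, not taken from the kwic source.
    # fold_case=False plays the role of the -m option.
    key = word.lower() if fold_case else word
    # Truncate long keywords, pad short ones (assumed), add the bar.
    return key[:width].ljust(width) + "|"

print(keyword_field("Concordance"))
print(keyword_field("Incomprehensibilities"))
print(keyword_field("Edda", width=8, fold_case=False))
```
.fi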
.PP
The \-w argument allows you to write an id field
(such as the name of an author or work) after the keyword.
If you want to include any blanks,
enclose the entire string in quotes: \-w"Prose Edda".
The \-f argument allows you to write the current filename,
up to a number of characters you specify.
If the filename is shorter, it will be blank-padded,
and if it is longer, it will be truncated.
.PP
If the program encounters the character "=",
which, by default, indicates pagination,
it will count pages as well as line numbers.
Line numbers will print as: ``\ 12469'',
while page numbers will print as: ``178,12''.
If you are concording a series of short poems,
each starting with line 1, type them into separate files,
and use the \-r option to reset the line number to 1
at the beginning of each new file.
If you resume concording in the middle of your text,
you can set the line number with the \-l option,
or the page number with the \-p option.
If you want to indicate pagination,
make sure that you begin your text with ``=1'',
on a line of its own, to indicate the first page.
When a new chapter starts at the top of a page,
be sure to set \-p to the number of the previous page.
The page indicator can be changed with the \-i option;
\-i% will change it to a percent sign, for instance.
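.PP
The two label layouts quoted above can be reproduced with a small Python sketch.
The function names are hypothetical,
and the split of the page label into three places for the page
and two for the line on that page is an assumption
inferred from the example ``178,12''.
.nf
```python
def line_label(line):
    # Right-justify the line number in six places.
    return "{:6d}".format(line)

def page_label(page, line):
    # Assumed layout: three places for the page, a comma,
    # and two places for the line on that page.
    return "{:3d},{:2d}".format(page, line)

print(repr(line_label(12469)))
print(repr(page_label(178, 12)))
```
.fi
Under these assumptions, any value that needs more places
than its field provides makes the label longer than six characters,
which pushes the rest of the entry to the right.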
.PP
If you are sending output to the lineprinter,
the context width can be increased with the \-c argument;
\-c110, for instance, will give you about 55 characters
on either side of the keyword in context.
Note that the lineprinter can print only 132 characters per line,
so add up your field widths carefully.
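.PP
The arithmetic can be checked with a few lines of Python.
This is only a sketch: the function is hypothetical,
and it assumes that an entry consists of the keyword, one column
for the vertical bar, an optional id field, the 6-place label,
and the context, with no other separators.
.nf
```python
LINEPRINTER_WIDTH = 132  # limit quoted in this manual page

def entry_width(keyword=15, label=6, context=50, id_field=0):
    # Assumption: one column for the bar, no other separators.
    return keyword + 1 + id_field + label + context

print(entry_width())
print(entry_width(context=110))
print(entry_width(context=110, id_field=10) <= LINEPRINTER_WIDTH)
```
.fi
With the default keyword width, \-c110 just fills a 132-column line
under these assumptions, so adding an id field with \-w or \-f
means reducing \-c or \-k by the same amount.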
.PP
If you are working with a foreign language
and need to use normal punctuation marks as diacritical marks,
you can change the default punctuation set with the \-d option.
Just type the punctuation marks you want into a file,
on a single line with no embedded spaces,
and specify the filename after the \-d in your command line.
If you have cedillas or umlauts, you can represent them
as a `+' character after the accented letter.
Use the `+' option of \fIkwic\fP, and filter your output through
either the \fIcedilla\fP or \fIumlaut\fP program.
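.PP
The effect of the punctuation set on word boundaries
can be sketched in Python as follows.
The function is hypothetical and much simpler than \fIkwic\fP itself;
it treats blanks and the listed punctuation marks as separators
and leaves every other character, including the `+' marker, inside the word.
.nf
```python
DEFAULT_PUNCT = ',.;:-"?!()[]{}'

def split_words(text, punct=DEFAULT_PUNCT):
    # Hypothetical helper, not taken from the kwic source.
    # Replace each punctuation mark with a blank, then split on blanks.
    for ch in punct:
        text = text.replace(ch, " ")
    return text.split()

print(split_words("garc+on, (Mu+ller) don't-stop"))
# A reduced set, as might be supplied in a file named with -d:
print(split_words('a"ra, (x)', punct=",.;:?!"))
```
.fi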
.PP
After generating the concordance,
alphabetize it with the Unix \fIsort\fP program.
Then group and count the keywords with the \fIformat\fP program,
and send the final results to the lineprinter.
Here is a typical program sequence for generating a concordance:
.nf
 % kwic \-c110 chapter* | sort | format | lpr
.fi
Usually, it is better to send the output of \fIformat\fP
to a file, where they can be examined and edited,
before sending the file to the lineprinter.
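.PP
The sort, group, and count steps of the sequence above
can be pictured with a Python sketch.
It is not a description of how \fIformat\fP works internally;
the function and the (keyword, context) pairs are hypothetical
stand-ins for the entries that \fIkwic\fP writes.
.nf
```python
from itertools import groupby

def group_and_count(entries):
    # entries: (keyword, context) pairs, hypothetical stand-ins.
    # Sort by keyword first, since groupby only merges neighbours.
    ordered = sorted(entries, key=lambda e: e[0])
    groups = groupby(ordered, key=lambda e: e[0])
    return [(key, len(list(group))) for key, group in groups]

entries = [("the", "on the mat"), ("cat", "a cat sat"), ("the", "by the door")]
print(group_and_count(entries))
```
.fi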
.SH FILES
A temporary file, /tmp/KwicXXXXX,
is created if \fIkwic\fP has to work with standard input,
because seeking can only be done with files.
.SH "SEE ALSO"
format(hum), kwal(hum), maxwd(hum), tprep(hum), sort(1)
.SH LIMITATIONS
Words cannot be longer than 512 characters,
nor can the first half of the context.
Line numbers cannot exceed 999999 and page numbers
cannot exceed 999,99 without skewing the output format.
Most lineprinters will not print entries longer than 132 characters,
and the CAT/4 typesetter cannot handle lines longer than 7.54 inches.
.SH AUTHOR
Bill Tuthill
.SH BUGS
If the text contains many backspaces,
the context is printed somewhat narrower than requested.
Using a wheel-like data structure might be more efficient
than using disk seeks and reads to output the contexts.
