htsnp is a program that uses haplotype motifs to locate "haplotype tagging" SNP subsets that carry all or most of the information found in a full set of SNPs sequenced in a sample population. The code uses a dynamic programming algorithm to choose an optimal SNP set from a set of motifs with measured frequencies. Details of the algorithms will be described in a forthcoming paper.

The code is called as follows:

Usages:
  ./htsnp [-i ] [-d ] [-p [-c ]]
  ./htsnp [-i ] [-a ] [-p [-c ]]
  ./htsnp [-i ] [-k ]

  -i : specifies a file of motif frequencies to use in SNP selection (default: stdin)
  -d : specifies a tolerated maximum amount of per-base error and seeks to minimize SNPs for that maximum (default: 0)
  -a : specifies a tolerated average amount of per-base error and seeks to minimize SNPs for that average (default: 0)
  -k : specifies a maximum number of SNPs and seeks to minimize expected total error given that maximum (default: 0)
  -p : specifies the population size from which motifs were derived (only used if a confidence interval is requested)
  -c : specifies the size of the confidence interval if a population size is specified (default: 0.0)

The program takes as input a motif file (created with the -s option of the hapmotif executable). It produces a set of SNPs chosen so that the other SNPs can be inferred from them.

In one version (-d), the SNPs are chosen to be a minimal set yielding a particular level of expected prediction accuracy on each SNP when used with the prediction algorithm of the predictb executable. In that algorithm, we assign a missing SNP by finding the motif spanning that site that is most probable given the known sites in the sequence, and taking whatever SNP value that motif has at that site. If a non-zero confidence interval is selected, along with the population size needed to establish it, then SNPs are chosen to provide the specified accuracy with at least the specified confidence at each site, with each site considered in isolation.

In another version (-k), the SNPs are chosen to yield a set of fixed size with minimum expected error rate. In the third version (-a), the SNPs are chosen to be a minimal set yielding a particular level of expected prediction accuracy averaged over all SNPs.
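
The prediction rule described above (pick the most probable motif spanning a missing site, then copy its value at that site) can be illustrated with a small Python sketch. This is not the predictb implementation. The Motif record, the predict_site function, and the 0/1 allele coding are hypothetical names and conventions chosen for illustration, and "most probable given the known sites" is approximated here as "highest measured frequency among the motifs that agree with every known site they cover". The actual motif file format written by hapmotif is not assumed.

```python
from typing import Dict, List, NamedTuple, Optional


class Motif(NamedTuple):
    """Hypothetical in-memory motif record (not the hapmotif file format)."""
    start: int    # index of the first site the motif covers
    alleles: str  # one character per covered site, e.g. "0110"
    freq: float   # measured frequency of the motif


def predict_site(motifs: List[Motif], known: Dict[int, str], site: int) -> Optional[str]:
    """Predict the allele at `site` from the typed sites in `known`.

    Assumption: a motif's probability given the known sites is taken to be
    its frequency if it agrees with every known site inside its span, and
    zero otherwise. Returns None when no consistent motif spans the site.
    """
    best: Optional[Motif] = None
    for m in motifs:
        end = m.start + len(m.alleles)
        if not (m.start <= site < end):
            continue  # motif does not span the missing site
        # Reject motifs that contradict a known site within their span.
        conflict = any(
            m.start <= pos < end and m.alleles[pos - m.start] != allele
            for pos, allele in known.items()
        )
        if conflict:
            continue
        if best is None or m.freq > best.freq:
            best = m
    return None if best is None else best.alleles[site - best.start]


# Toy data: three motifs over sites 0..3, with sites 1 and 3 typed.
example_motifs = [
    Motif(0, "0101", 0.5),
    Motif(0, "0011", 0.3),
    Motif(2, "10", 0.2),
]
predicted = predict_site(example_motifs, {1: "0", 3: "1"}, site=2)
```

In this toy input only the second motif agrees with both typed sites, so its value at site 2 is the prediction. A tagging set is useful to the extent that typing only the chosen SNPs still lets a rule like this recover the untyped ones within the tolerated error.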