This contains the following tools:

filter - remove unwanted sequences from a clustering
    usage: filter seq.list < cluster.L > cluster2.L
    cluster2.L will only contain sequence labels found in seq.list

hist - produce a histogram of cluster sizes from a "label"-formatted clustering.

clusc - compare clusterings, calculating Jaccard and (hopefully correctly) Rand indices. More to follow as/when I need it.

xcerpt - given a file containing a list of sequence labels (e.g. a "label"-formatted clustering), extract matching sequences from a fasta file
    usage: xcerpt list.txt fasta.seq
    creates "fasta.seq.match" and "fasta.seq.rest"
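
For reference, the pair-counting definitions of the Jaccard and Rand indices that clusc is described as calculating can be sketched as follows. This is only an illustration in Python, not the code clusc itself uses. The "label" file format is not described here, so the sketch assumes each clustering has already been read into a dict mapping sequence label to cluster id; the function names (pair_counts, jaccard_index, rand_index) are hypothetical.

```python
from collections import Counter
from math import comb  # Python 3.8+


def pair_counts(a, b):
    """Count sequence pairs by how two clusterings treat them.

    a, b: dicts mapping sequence label -> cluster id (an assumed
    in-memory form, not the on-disk "label" format).
    Only labels present in both clusterings are compared.
    Returns (both, only_a, only_b, neither).
    """
    shared = set(a) & set(b)
    joint = Counter((a[x], b[x]) for x in shared)
    in_a = Counter(a[x] for x in shared)
    in_b = Counter(b[x] for x in shared)

    both = sum(comb(c, 2) for c in joint.values())    # same cluster in a and b
    same_a = sum(comb(c, 2) for c in in_a.values())   # same cluster in a
    same_b = sum(comb(c, 2) for c in in_b.values())   # same cluster in b
    total = comb(len(shared), 2)                      # all pairs

    only_a = same_a - both
    only_b = same_b - both
    neither = total - both - only_a - only_b
    return both, only_a, only_b, neither


def jaccard_index(a, b):
    """Pairs together in both / pairs together in at least one."""
    both, only_a, only_b, _ = pair_counts(a, b)
    denom = both + only_a + only_b
    return both / denom if denom else 1.0


def rand_index(a, b):
    """Pairs on which the two clusterings agree / all pairs."""
    both, only_a, only_b, neither = pair_counts(a, b)
    total = both + only_a + only_b + neither
    return (both + neither) / total if total else 1.0


if __name__ == "__main__":
    first = {"s1": 0, "s2": 0, "s3": 1, "s4": 1}
    second = {"s1": 0, "s2": 0, "s3": 0, "s4": 1}
    print(jaccard_index(first, second))  # 0.25
    print(rand_index(first, second))     # 0.5
```

In the small example, one of the six pairs is clustered together in both inputs, three are together in exactly one, and two are apart in both, which gives a Jaccard index of 1/4 and a Rand index of 3/6. Labels that appear in only one clustering are ignored here; that is a choice made for the sketch, and clusc may handle them differently.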