Sequence Analysis with Distributed Resources

FASTA

FASTA [Pearson and Lipman 1988] is another commonly used family of programs for sequence database searches. FASTA stands for FAST-All, reflecting the fact that it can be used for either protein or nucleotide comparisons. The program achieves a high level of sensitivity for similarity searching at high speed. The speed comes from using the observed pattern of word hits to identify potential matches before attempting the more time-consuming optimized search. The trade-off between speed and sensitivity is controlled by the ktup parameter, which specifies the size of the word. Increasing ktup decreases the number of background hits, which speeds up the search at the cost of sensitivity. Not every word hit is investigated; instead, the program initially looks for segments containing several nearby hits.
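The word-hit idea can be illustrated with a minimal Python sketch. This is not the FASTA implementation: the function names (word_positions, diagonal_hits, best_diagonals) are hypothetical, and the code only counts exact shared words per diagonal, where a diagonal is the subject offset minus the query offset. Diagonals with many hits correspond to segments containing several nearby word matches.

```python
from collections import defaultdict


def word_positions(seq, ktup):
    """Map each word of length ktup to the positions where it starts in seq."""
    table = defaultdict(list)
    for i in range(len(seq) - ktup + 1):
        table[seq[i:i + ktup]].append(i)
    return table


def diagonal_hits(query, subject, ktup):
    """Count shared words per diagonal (subject offset minus query offset)."""
    table = word_positions(query, ktup)
    hits = defaultdict(int)
    for j in range(len(subject) - ktup + 1):
        # Every query position holding the same word contributes one hit.
        for i in table.get(subject[j:j + ktup], ()):
            hits[j - i] += 1
    return dict(hits)


def best_diagonals(query, subject, ktup, top=10):
    """Return the diagonals with the most word hits, best first."""
    hits = diagonal_hits(query, subject, ktup)
    return sorted(hits.items(), key=lambda kv: (-kv[1], kv[0]))[:top]


if __name__ == "__main__":
    query = "ACGTACGT"
    subject = "TTACGTACGTTT"
    for ktup in (2, 4):
        total = sum(diagonal_hits(query, subject, ktup).values())
        print(ktup, total, best_diagonals(query, subject, ktup, top=3))
```

Running the sketch with a smaller ktup reports more word hits in total, which mirrors the background-hit behaviour described above: shorter words match more often by chance, so more candidate regions have to be examined.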

The EBI offers a WWW service for FASTA; the full documentation is also available.

Like BLAST, FASTA offers a variety of programs for different searches. Here is an overview:

Program	Description
fasta3	Compare a protein sequence to a protein sequence database or a DNA sequence to a DNA sequence database using the FASTA algorithm (Pearson and Lipman, 1988; Pearson, 1996). Search speed and selectivity are controlled with the ktup (word size) parameter. For protein comparisons, ktup=2 by default; ktup=1 is more sensitive but slower. For DNA comparisons, ktup=6 by default; ktup=3 or ktup=4 provides higher sensitivity; ktup=1 should be used for oligonucleotides (DNA query lengths < 20).
ssearch3	Compare a protein sequence to a protein sequence database or a DNA sequence to a DNA sequence database using the Smith-Waterman algorithm (Smith and Waterman, 1981). ssearch3 is about 10 times slower than fasta3, but is more sensitive for full-length protein sequence comparison.
fastx3/fasty3	Compare a DNA sequence to a protein sequence database by comparing the translated DNA sequence in three frames and allowing gaps and frameshifts. fastx3 uses a simpler, faster algorithm for alignments that allows frameshifts only between codons; fasty3 is slower but produces better alignments with poor-quality sequences because frameshifts are allowed within codons.
tfastx3/tfasty3	Compare a protein sequence to a DNA sequence database, calculating similarities with frameshifts to the forward and reverse orientations.
tfasta3	Compare a protein sequence to a DNA sequence database, calculating similarities (without frameshifts) to the three forward and three reverse reading frames. tfastx3 and tfasty3 are preferred because they calculate similarity over frameshifts.
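
The three forward and three reverse reading frames mentioned for tfasta3 can be made concrete with a short Python sketch. It is an illustration only, not code from the FASTA package: the names (CODON_TABLE, reverse_complement, translate, six_frames) are hypothetical, the standard genetic code is assumed, and unknown codons are written as X.

```python
from itertools import product

# Standard genetic code, with codons ordered by base in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Translate a DNA sequence in three forward and three reverse frames."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frames("ATGGCCTAA").items():
        print(name, protein)
```

Each of the six translations could then be compared with the protein query separately. Because every frame is translated independently, a frameshift in the DNA splits a true match across two frames, which is the limitation that tfastx3 and tfasty3 address.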