Fast databank searching with a reduced amino-acid alphabet
Centre de Génétique Moléculaire du CNRS, 91198 Gif sur Yvette Cedex, France
1 To whom reprint requests should be sent
Fast search algorithms for sequence databanks generally rely on hash tables and look for exactly matching words. For proteins, sensitivity can be increased, at the expense of selectivity, by using a reduced amino-acid alphabet. We propose here an alphabet reduced to 10 symbols, which we used in modified versions of the FASTP and SCAN programs. An application to the aminoacyl-tRNA synthetases shows that this technique may be useful in detecting distant relationships between proteins.
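The idea of hashing words over a reduced alphabet can be illustrated with a minimal Python sketch. It is not the code of the modified FASTP or SCAN programs, and the 10-group partition in `GROUPS` is a hypothetical grouping chosen only for illustration, since the abstract does not list the actual alphabet. The names `reduce_sequence` and `word_hits` and the word length `k` are likewise assumptions of this sketch.

```python
from collections import defaultdict

# Hypothetical 10-symbol grouping of the 20 amino acids, for illustration
# only. This is NOT the alphabet proposed in the paper.
GROUPS = ["AG", "ST", "C", "P", "DE", "NQ", "KR", "H", "ILMV", "FWY"]
REDUCE = {aa: str(i) for i, group in enumerate(GROUPS) for aa in group}


def reduce_sequence(seq):
    """Map each residue to its group digit; unknown residues become 'X'."""
    return "".join(REDUCE.get(aa, "X") for aa in seq.upper())


def word_hits(query, target, k=2, reduced=True):
    """Return (query_pos, target_pos) pairs where words of length k match.

    With reduced=True, words are compared after alphabet reduction, so
    residues from the same group count as identical.
    """
    encode = reduce_sequence if reduced else str.upper
    q, t = encode(query), encode(target)

    # Hash table: word -> list of start positions in the target.
    index = defaultdict(list)
    for j in range(len(t) - k + 1):
        word = t[j:j + k]
        if "X" not in word:  # skip words containing an unknown residue
            index[word].append(j)

    # Look up every query word in the table.
    hits = []
    for i in range(len(q) - k + 1):
        for j in index.get(q[i:i + k], []):
            hits.append((i, j))
    return hits


# Two short peptides with no identical dipeptide in common.
print(word_hits("LKDE", "IREE", reduced=False))  # []
print(word_hits("LKDE", "IREE", reduced=True))   # [(0, 0), (1, 1), (2, 2)]
```

In this toy case exact matching finds nothing, while the reduced alphabet recovers three word hits on the same diagonal. The same mechanism also produces more chance matches between unrelated sequences, which is the loss of selectivity mentioned above.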