A scale-independent signal processing method for sequence analysis
Laboratoire de Physique et Chimie Biomoléculaire (CNRS UR 198), Université Paris VI et Institut Curie
1Atelier de Bio-informatique, Institut Curie, section Physique-Chimie, 11 rue Pierre et Marie Curie, 75231 Paris cédex 05
2L.I.P.N., Université Paris-Nord, Avenue J. B. Clément, 93430 Villetaneuse
3C.T.I.S., Centre de Recherche INRA de Jouy-en-Josas, Domaine de Vilvert, 78350 Jouy-en-Josas, France
*To whom reprint requests should be sent
In this paper, we present methods to detect and localize patterns in biologically related protein sequences (a family). The patterns common to the sequences of the family are detected by Fourier analysis. No predefined scales (codes) are needed: they are produced by the analysis procedure itself, together with the frequencies of the Fourier decompositions. Characteristic features of the family are thus expressed as (code, frequency) pairs. Various tools are proposed to localize the patterns, to compare the codes, and to evaluate the proximity of an arbitrary sequence to the investigated family. The general strategy is illustrated on a family composed of proteins
Received on October 17, 1989; accepted on January 16, 1990