Analysis of context of 5'-splice site sequences in mammalian mRNA precursors by subclass method
Department of Information Engineering, Faculty of Engineering Japan
¹Department of Chemistry, Faculty of Science, Hokkaido University, Sapporo 060, Japan
The signals that direct the excision of introns from mammalian pre-mRNA are not yet well understood. However, at least three kinds of signals (5'-splice site signals, 3'-splice site signals and branch point signals) play important roles in the excision of introns. In the present paper we treat only the 5'-splice sites. In addition to a consensus sequence for 5'-splice signals, several methods based on statistical models have been proposed and used to analyze the relative importance of each nucleotide at each position. In our approach a nucleotide sequence is regarded as a string over the symbols A, T, G and C, and important substrings of 5'-splice site sequences, called pattern sequences, are extracted. A pattern sequence specifies which nucleotide is required at a limited number of positions around the 5'-splice site. We observe that one particular pattern sequence predominantly matches the 5'-splice site sequences nearest to the 5'-end of a gene, and another predominantly matches the second nearest ones. Moreover, we confirm that the pattern sequences accurately predict authentic 5'-splice sites in unknown genes and explain some mutation examples.
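To make the notion of a pattern sequence concrete, the following minimal Python sketch treats a pattern as a small set of required nucleotides at fixed offsets from a candidate 5'-splice site and scans a sequence for positions that satisfy it. The names `EXAMPLE_PATTERN`, `matches_pattern` and `candidate_sites`, the offset convention (offset 0 is the first intron nucleotide) and the pattern itself are illustrative assumptions; they are not the pattern sequences extracted by the subclass method.

```python
from typing import Dict, List

# Hypothetical pattern, for illustration only: offset relative to the first
# intron nucleotide -> required base. Not a pattern taken from the paper.
EXAMPLE_PATTERN: Dict[int, str] = {-1: "G", 0: "G", 1: "T", 4: "G"}


def matches_pattern(seq: str, site: int, pattern: Dict[int, str]) -> bool:
    """Return True if every position constrained by the pattern holds the required base."""
    for offset, base in pattern.items():
        pos = site + offset
        # A pattern that reaches outside the sequence cannot match.
        if pos < 0 or pos >= len(seq):
            return False
        if seq[pos].upper() != base:
            return False
    return True


def candidate_sites(seq: str, pattern: Dict[int, str]) -> List[int]:
    """Return all positions in seq at which the pattern matches."""
    return [i for i in range(len(seq)) if matches_pattern(seq, i, pattern)]


if __name__ == "__main__":
    # Toy sequence: exon ends with "CAG", intron begins with "GTAAGT".
    print(candidate_sites("CAGGTAAGTCC", EXAMPLE_PATTERN))
```

Because only a few positions are constrained, a pattern of this kind leaves all other positions free, which is what distinguishes it from matching a full consensus string.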
Received on October 3, 1991; accepted on January 14, 1992