Relation Patterns and their Automatic Discovery in Biosequences
A. Brazma, I. Jonassen, I. Eidhammer, E. Ukkonen
Submitted to CABIOS
Outline of experiment:
The figure illustrates the main outline of the experiments done with the
Relation patterns found for the PROSITE families using the PROSITE patterns.
Restrictions applied, and no noise assumed.
Relation patterns found for the PROSITE families using patterns as found by Pratt.
Restrictions applied, and no noise assumed.
We analysed the 1enh entry found in the
database. We removed the last column of the alignment (which consists
mostly of gaps) and then discarded every sequence that contains
at least one gap. A pattern was then constructed with one
pattern position for each column of the alignment, each
position matching all amino acids found in that particular column:
[EGKPRSW]-[ACEGHIKLMNPQRSTVY]-[CGKLQRSY]-[ACHIKMPQTV]-[ADFIKLNPRSTV]-[ACFHILNY]-
[HKNST]-[ADGHKLNPQRSTVY]-[ADEFHKLNPQRSTVWY]-[AQSY]-[AILRTV]-[ACDEFGIKLPQRSTVY]-[AEGHIKLQRSTV]-
[LM]-[EKNQ]-[ACEGHIKLMNQRSTV]-[AEFHIKLQRSTVY]-[FY]-[ACDEFHIKLNQRSTY]-[ACDEFGHIKLMNQRSTVY]-
[ADEGHKNQST]-[AEGHKMNPQRSV]-[FHKNTY]-[ILMPV]-[ACDEGMNSTVY]-[ACEFGIKLPRSVY]-
[ACDEFGHKLNPQRSTVY]-[ADEHIKMQRTV]-[AKLRW]-
[ACDEFHIKLMQRSTVWY]-[ACDEGHIKLMNQRSTV]-[FILMVY]-[ARS]-[ADEGHKLMNQRSTV]-[ADEFGHIKLMNQRSTVY]-
[AILSTV]-[ACDEGHKMNQRSTVY]-[ILM]-[ACDGHLNPQRST]-[ADEKMPQST]-[ACDEKNQRSTVY]-[HKNQRTV]-[FILV]-
[AEKQRT]-[FILSTV]-W-[FY]-[KQS]-N-[AHKNR]-[ARS]-[AIMNRSTVY]-[KQR]
This pattern, together with the alignment, was given as input to the
algorithms A1-A4 described in the paper. These tables show some
of the results obtained.
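The pattern construction described above (drop the last alignment column, discard sequences containing a gap, then collect the residues seen in each remaining column) can be sketched in a few lines of Python. This is only an illustration and not the authors' code: the function names `column_pattern` and `pattern_to_regex`, the choice of `-` and `.` as gap characters, and the toy alignment are all assumptions made for the example. The algorithms A1-A4 are not reproduced here.

```python
import re

# Assumed gap symbols; the poster does not say which character marks a gap.
GAP_CHARS = set("-.")


def column_pattern(alignment):
    """Build a PROSITE-style pattern with one position per alignment column.

    `alignment` is a list of equal-length strings (hypothetical input format).
    """
    if not alignment:
        raise ValueError("empty alignment")
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all rows must have the same length")

    # Step 1: remove the last column (mostly gaps in the described alignment).
    trimmed = [row[:-1] for row in alignment]

    # Step 2: discard every sequence that contains at least one gap.
    ungapped = [row for row in trimmed if not GAP_CHARS & set(row)]
    if not ungapped:
        raise ValueError("no gap-free sequences left")

    # Step 3: one pattern position per column, matching all residues seen there.
    positions = []
    for column in zip(*ungapped):
        residues = "".join(sorted(set(column)))
        positions.append(residues if len(residues) == 1 else f"[{residues}]")
    return "-".join(positions)


def pattern_to_regex(pattern):
    """Convert a pattern made only of letters and [..] groups into a regex."""
    return re.compile("".join(pattern.split("-")))


if __name__ == "__main__":
    # Toy alignment, invented for illustration only.
    toy = ["KRWFQN-", "RRWYQNA", "K-WFKN-", "KKWFSNR"]
    pattern = column_pattern(toy)
    print(pattern)  # [KR]-[KR]-W-[FY]-[QS]-N
    print(bool(pattern_to_regex(pattern).fullmatch("RKWYSN")))  # True
```

Because every position accepts any residue observed in its column, the resulting pattern matches every gap-free sequence it was built from by construction. It also matches combinations of residues that never occur together in a single sequence, such as `RKWYSN` in the toy example.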
Poster presented at RECOMB 97