Relation Patterns and their Automatic Discovery in Biosequences
A. Brazma, I. Jonassen, I. Eidhammer, E. Ukkonen
Submitted to CABIOS
Experiments
PROSITE families
Outline of experiment:
The figure illustrates the main outline of the experiments done with the
PROSITE families:
Results:
Relation patterns found for the PROSITE families using the PROSITE patterns.
Restrictions applied, and no noise assumed:
- C4=10
- C4=15
- C4=20
Relation patterns found for the PROSITE families using patterns as found by Pratt.
Restrictions applied, and no noise assumed:
- C4=15
- C4=20
Homeodomain family
We analysed the 1enh entry in the HSSP database. We removed the last
column of the alignment (which consists mostly of gaps) and then removed
all sequences containing at least one gap. A pattern was then constructed
with one pattern position for each column of the alignment, each
position matching all amino acids found in that column:
[EGKPRSW]-[ACEGHIKLMNPQRSTVY]-[CGKLQRSY]-[ACHIKMPQTV]-[ADFIKLNPRSTV]-[ACFHILNY]-
[HKNST]-[ADGHKLNPQRSTVY]-[ADEFHKLNPQRSTVWY]-[AQSY]-[AILRTV]-[ACDEFGIKLPQRSTVY]-[AEGHIKLQRSTV]-
[LM]-[EKNQ]-[ACEGHIKLMNQRSTV]-[AEFHIKLQRSTVY]-[FY]-[ACDEFHIKLNQRSTY]-[ACDEFGHIKLMNQRSTVY]-
[ADEGHKNQST]-[AEGHKMNPQRSV]-[FHKNTY]-[ILMPV]-[ACDEGMNSTVY]-[ACEFGIKLPRSVY]-
[ACDEFGHKLNPQRSTVY]-[ADEHIKMQRTV]-[AKLRW]-
[ACDEFHIKLMQRSTVWY]-[ACDEGHIKLMNQRSTV]-[FILMVY]-[ARS]-[ADEGHKLMNQRSTV]-[ADEFGHIKLMNQRSTVY]-
[AILSTV]-[ACDEGHKMNQRSTVY]-[ILM]-[ACDGHLNPQRST]-[ADEKMPQST]-[ACDEKNQRSTVY]-[HKNQRTV]-[FILV]-
[AEKQRT]-[FILSTV]-W-[FY]-[KQS]-N-[AHKNR]-[ARS]-[AIMNRSTVY]-[KQR]
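The column-wise construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the experiments: the function names (column_pattern, pattern_to_regex), the gap characters "-" and ".", and the toy alignment are all hypothetical, and the real input was the 1enh alignment from HSSP.

```python
import re

# Assumption: gaps are written as '-' or '.'; the real HSSP file format
# may need its own parsing before this step.
GAP_CHARS = set("-.")


def column_pattern(alignment, drop_last_column=True):
    """Build one pattern position per alignment column (illustrative sketch)."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    # Step 1: optionally drop the last (mostly gap) column.
    rows = [row[:-1] if drop_last_column else row for row in alignment]
    # Step 2: remove every sequence that still contains at least one gap.
    rows = [row for row in rows if not (GAP_CHARS & set(row))]
    if not rows:
        raise ValueError("no gap-free sequences left")
    # Step 3: each position matches all amino acids seen in its column.
    positions = []
    for column in zip(*rows):
        residues = "".join(sorted(set(column)))
        positions.append(residues if len(residues) == 1 else f"[{residues}]")
    return "-".join(positions), rows


def pattern_to_regex(pattern):
    """Turn the hyphen-separated pattern into a regular expression."""
    return re.compile(pattern.replace("-", ""))


# Toy alignment (hypothetical data, not taken from HSSP).
toy_alignment = ["KWFQNR-", "RWYQNK-", "KW-QNRA", "KWFKNRA"]
toy_pattern, kept_rows = column_pattern(toy_alignment)
print(toy_pattern)  # [KR]-W-[FY]-[KQ]-N-[KR]
```

By construction, every gap-free sequence that was kept matches the resulting pattern, which can be checked with pattern_to_regex(toy_pattern).fullmatch(row) for each row in kept_rows.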
This pattern, together with the alignment, was given as input to
algorithms A1-A4 described in the paper. These tables show some
of the results obtained.
Poster presented at RECOMB 97