Methods for the automatic discovery of patterns in sequences

24.10.00


Table of Contents

Methods for the automatic discovery of patterns in sequences

Overview of talk

Families, patterns, motifs

Prosite: Patterns for classification

Example sequence motif (zinc finger C2H2)

Motif Usage

Protein Sequence Motif Databases

InterPro - EU funded collaboration between the databases

Motifs in Protein Analysis

Strategy for developing motifs

A Three-Step Approach to Pattern Discovery

Pattern Languages

Different description languages

The advantages and disadvantages of deterministic patterns

Evaluating patterns

Examples of Fitness Functions

Algorithms for pattern discovery

Approaches to pattern discovery

Pattern Driven - pruning the search space

An Example Algorithm: Pratt

Pratt - solution space

An Example Algorithm: Pratt

Pratt - Pattern scoring

An Example Algorithm: Pratt

Pratt - Search

Pratt - functionality

Pratt - Example

Pratt in the InterPro project

Structure Motif Discovery

SPratt - Search based algorithm for discovering Structure Motifs

SPratt - Idea

Structure - represent each residue’s neighbourhood

Mark all residues within d Angstroms

Make neighbour string - C-terminal direction

Make neighbour string - N-terminal direction

SPratt - Neighbour Strings

SPratt - discovery algorithm

Example output: Cysteine proteases

RMSd matrix

SPratt: Structures → Motif

Combining SPratt with SAP

SAP output - cysteine proteases

SAP output - 2Fe2S Ferredoxins

Test Cases - Summary of SPratt runs

Expression Profiler

Cluster Genes

EPCLUST

Retrieve Upstreams

Retrieve Upstream Regions

Mine for Regulatory Signals

SPEXS - Sequence Pattern EXhaustive Search

Large Scale Experiment on Yeast

Gene Clusters

Pattern Score vs. Cluster Score

Randomized data

Large Scale Experiment - Conclusions

Expression Profiler Web-tool

Acknowledgements

Author: Inge Jonassen

Email: inge@ii.uib.no

Home Page: http://www.ii.uib.no/~inge