Protein Structure Comparison and Structure Patterns - an Algorithmic Approach

03.08.01


Click here to start


Table of Contents

Protein Structure Comparison and Structure Patterns - an Algorithmic Approach

Outline

Organisation of tutorial

Part I - Introduction

Motifs in Protein Analysis

Sequence structure/function motifs

Example sequence motif

Sequence alignment

PPT Slide

Protein structure-function motifs

Protein Classifications

Why compare structures?

Why is there more than one way?

Some terminology

Terminology (cont’d)

How to compare structures?

Structure Motifs

Framework

PPT Slide

PPT Slide

Structure Description - Levels

Structure Description - Features

PPT Slide

PPT Slide

PPT Slide

PPT Slide

PPT Slide

Equivalences and Alignments

Scoring Equivalences

Coordinate based RMSd

Calculating coordinate based RMSd

Scoring Equivalences

Distance based RMSD

PPT Slide

PPT Slide

PPT Slide

Comparison Algorithms

Dynamic Programming - DP

Edit distance

Edit distance (cont’d)

Dynamic programming - tabular computation

Alignment of biological sequences

Global alignment of two sequences

Example of pairwise global alignment

PPT Slide

Local Alignment

Local alignment - recurrence relation

Dynamic Programming for alignment of structures

Alternating superposition and alignment

PPT Slide

A simple alternating method - initial alignment

A simple alternating method - from superposition to alignment

A simple alternating method - dynamic programming

PPT Slide

Part II - Algorithms for pairwise structure comparison

Double Dynamic Programming

Double dynamic programming

DDP - outline

Ideas for DDP

PPT Slide

Ideas for DDP (cont’d)

Algorithm for DDP

DDP - Example

Noise

The low level scoring matrices

Example

More advanced scoring scheme

Iterated DDP

Iterated DDP (cont’d)

Algorithm for iterated DDP

PPT Slide

Algorithm statements to explain

Initialising the bias matrix Q

Selecting pairs and updating Q

Coherence problem

High level contribution to low level

Termination criterion

Geometric Hashing

Geometric Hashing - Outline

Geometric hashing

2D geometric hashing

Example

Coincidence sets

Reference frames

Example Reference Frame Systems

Example Reference Frame Systems

Remarks

Hashing

Preprocessing

Example Preprocessing

Recognition

Example Recognition

Extensions

GH for structure comparison

Algorithm for preprocessing

Algorithm for recognition

Remarks

Geometric hashing for SSE-representation

Part III - Comparison by Clustering

Comparison by Clustering

Compatibility

Seed matches

Consistency

Compatibility and consistency

Compatibility and consistency

Example

The philosophy

The methods

Searching for seed matches

Consistency

Decide consistency

Consistency

Example transformation

Example relation

The clustering step

Consistency test

Clustering methods

Linear clustering

Parallel linear clustering

Hierarchical clustering

Clustering by use of transformation

Seed matches

The transformation

Clustering by use of relation

Clustering by use of relation

The relation

Alexandrov and Fisher

Alexandrov and Fischer

Clustering as graph problem

Node product graph

Node product graph - Example

Refinement

Part IV Multiple structure alignment and Motif discovery

Multiple Alignment

Multiple Structure Comparison

Example: Extending DDP to multiple alignment

MSSAP: Representing the consensus structures

Alternative approach - using a median structure

PPT Slide

Framework

SPratt - Pattern Driven Algorithm for the Discovery of Structure Motifs

SPratt - Idea

SPratt - Neighbour Strings

Neighbour string for each residue

SPratt - Discovery Algorithm

SPratt - results

Example output: Cystein proteases

RMSd matrix

SPratt: Structures ? Motif

Combining SPratt with SAP

PPT Slide

SAP output - cystein proteases

SAP output - 2Fe2S Ferrodoxins

SPratt2

Neighbour strings in SPratt2

Neighbour String Angle Constraint

Neighbour String *-Patterns

Search Problem in SPratt2

Probe ? *-Pattern

*-Pattern Exploration Seeded by Probe

Use of Probe Structure in Search

Which Probes?

Ranking of patterns

Implementation

Performance of SPratt2 compared to that of SPratt

Mining PDB

Mining PDB - time usage and number of patterns found

Need to Reduce Pattern Set: Select a “Covering” Subset of Patterns

Conclusions

Conclusions (cont’d)

Open Problems and Future Directions

Acknowledgments

Author: Ingvar Eidhammer and Inge Jonassen

Email: ingvar@ii.uib.no and inge@ii.uib.no