Methods for finding motifs in sets of related biosequences

Dr. scient thesis

1996

Inge Jonassen,

Dept. of Informatics,
University of Bergen,
Norway.

Abstract:

The automatic discovery of patterns conserved in groups of related biological sequences is an important problem in molecular biology. This thesis discusses the problem and presents a systematisation of a large number of reported methods. New methods are proposed for the automatic discovery of patterns, and of collections of patterns, in sets of unaligned protein sequences. The methods are able to discover patterns of a quite general type, and are guaranteed to find the conserved patterns that are best according to a defined evaluation function. Both non-heuristic and heuristic search methods are proposed. The problem of evaluating discovered patterns is discussed, and several new evaluation functions are proposed. The new functions are shown to have useful properties on a set of test cases. The methods proposed in this thesis were designed primarily for analysing protein sequences, but they may also be applicable to nucleotide (DNA/RNA) sequences and possibly to other types of sequence data.

Keywords

bioinformatics, protein sequences, pattern discovery, machine learning, search methods, PROSITE, minimum description length principle

The thesis:

The thesis consists of:

An introductory part - full text available in PostScript.
Research papers (more information about these on my publications page).
- Approaches to the automatic discovery of patterns in biosequences.
  Alvis Brazma, Inge Jonassen, Ingvar Eidhammer, David Gilbert.
- Finding flexible patterns in unaligned protein sequences.
  Inge Jonassen, John F. Collins, Desmond Higgins.
- Efficient discovery of conserved patterns using a pattern graph.
  Inge Jonassen.
- Scoring function for pattern discovery programs taking into account sequence diversity.
  Inge Jonassen, Carsten Helgesen, Desmond Higgins.
- Discovering patterns and subfamilies in biosequences.
  Alvis Brazma, Inge Jonassen, Esko Ukkonen, Jaak Vilo.
