A dynamic programming data base method for predicting protein secondary structure

Larry Stanfel

To a large extent, a protein's function is determined by the shape it assumes directly after it is synthesized. Thus, if one could predict the shape that an arbitrary sequence of amino acids will acquire, one could think seriously about designing a protein to carry out a desired function, such as neutralizing the adverse effects of a defective protein manufactured as a consequence of a genetic error. However, predicting 3-D shape from sequence data alone has proven to be a challenging problem. Secondary protein structure (that is, the helices, turns, and strands formed by subsequences of the amino acids) is an important determinant of 3-D structure, so a number of secondary structure prediction methods have been developed. This seminar concerns a rather novel method begun when the author was a "gjesteforsker" (guest researcher) with Informatikk in 1994-95. It uses a data base approximating the set of all known secondary structures and seeks to parse the sequence of an unknown protein into secondary structures by matching its subsequences against the data base entries so as to optimize an objective function within a dynamic programming algorithm. The technique is conceptually much simpler than most that have been suggested and, on two test sets of proteins with known secondary structures, performs very well.
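The parsing idea described above can be illustrated with a minimal sketch. It is not the author's algorithm: the objective function (a count of position-wise identical residues), the toy data base, the fallback coil label `C`, and the names `match_score` and `parse_structure` are all assumptions chosen for illustration. The sketch only shows how a dynamic programming recursion can choose the segmentation of a sequence that maximizes the total match score against data base entries.

```python
from typing import Dict, List, Tuple


def match_score(segment: str, entry: str) -> int:
    """Toy objective (an assumption): identical residues at the same position."""
    return sum(a == b for a, b in zip(segment, entry))


def parse_structure(seq: str, database: Dict[str, List[str]]) -> Tuple[int, str]:
    """Split seq into segments, each matched to a data base entry of equal
    length, so that the summed match score is maximal.

    best[i] holds the best score for the prefix seq[:i];
    back[i] records where the last segment started and its structure label.
    """
    n = len(seq)
    best = [0] * (n + 1)
    back: List[Tuple[int, str]] = [(0, "")] * (n + 1)

    for i in range(1, n + 1):
        # Fallback (an assumption): one unmatched residue, labelled coil 'C',
        # contributes a score of zero, so every sequence can be parsed.
        best[i] = best[i - 1]
        back[i] = (i - 1, "C")
        for label, entries in database.items():
            for entry in entries:
                j = i - len(entry)
                if j < 0:
                    continue  # entry is longer than the prefix seq[:i]
                candidate = best[j] + match_score(seq[j:i], entry)
                if candidate > best[i]:
                    best[i] = candidate
                    back[i] = (j, label)

    # Trace back from the end to recover one label per residue.
    pieces: List[str] = []
    i = n
    while i > 0:
        j, label = back[i]
        pieces.append(label * (i - j))
        i = j
    return best[n], "".join(reversed(pieces))


# Hypothetical two-entry data base: one helix (H) and one strand (E) fragment.
toy_database = {"H": ["AEELLKK"], "E": ["VTVTV"]}
score, labels = parse_structure("GAEELLKKGVTVTVG", toy_database)
print(score, labels)
```

With a realistic data base the inner loops would run over many entries per structure type, and the scoring would presumably be based on something richer than exact identity, such as a residue similarity measure.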
