Date of Original Version

7-2004

Type

Conference Proceeding

Abstract or Description

Protein secondary structure prediction is an important step towards understanding the relation between protein sequence and structure. However, most current prediction methods use features that are difficult for biologists to interpret. In this paper, we present a new method that applies information retrieval techniques to solve the problem: we extract a context-sensitive biological vocabulary for protein sequences and apply text classification methods to predict protein secondary structure. Experimental results show that our method performs comparably to state-of-the-art methods. Furthermore, the context-sensitive vocabularies can serve as a useful tool to discover meaningful regular expression patterns for protein structures.

DOI

10.1145/1008992.1009109

Published In

Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. SIGIR '04, 538-539.