The goal of this dissertation is to provide tools that aid in identifying the function of uncharacterized genes. To that end, we first present a study in which clustering of gene expression data was used to annotate many genes of unknown function. We then present a tool that clusters gene expression data while also identifying short regions of high sequence similarity (motifs) shared among the members of each cluster. Finally, we present a tool for identifying the functionally relevant sub-sections of protein sequences. These sub-sections can then be used to find other proteins containing similar sub-sections, even when the rest of the protein sequence is quite different. The tool can thus find more distantly related proteins that share functionally relevant features.
Kevin Thomas Horan (2011). Gene Function Prediction Based on Sequence or Expression Data. Doctoral dissertation, University of California at Riverside.
@phdthesis{Hor11,
  author = "Kevin Thomas Horan",
  title = "Gene Function Prediction Based on Sequence or Expression Data",
  school = "University of California at Riverside",
  schoolabbr = "UC Riverside",
  year = 2011,
  month = Dec,
}