The goal of this dissertation is to provide tools that aid in identifying the function of unknown genes. To that end, we first present a study in which many unknown genes were annotated by clustering gene expression data. We then present a tool that clusters gene expression data while also identifying short regions of high sequence similarity (motifs) among the members of each cluster. Finally, we present a tool for identifying the functionally relevant sub-sections of protein sequences. These sub-sections can then be used to find other proteins containing similar sub-sections, even when the rest of the protein is quite different. This tool can thus find more distantly related proteins that share functionally relevant features.
Kevin Thomas Horan (2011). Gene Function Prediction Based on Sequence or Expression Data. Doctoral dissertation, University of California at Riverside.
@phdthesis{Hor11,
   author = "Kevin Thomas Horan",
   title = "Gene Function Prediction Based on Sequence or Expression Data",
   school = "University of California at Riverside",
   schoolabbr = "UC Riverside",
   year = 2011,
   month = Dec,
}