Colloquium: Exon and Intron Detection in Human Genomic DNA
4:10 p.m. Neill Hall 5W
James Keith Miller
Abstract: The exponential growth of raw genomic data demands a shift from biological methods of gene annotation toward computational and mathematical methods. We present a novel computational approach based on likelihood ratios, which we call the multi-window method. DNA n-tuple frequencies are collected from a training set of known exons and introns. Likelihood ratios, based on these n-tuple frequencies within a window of nucleotides, are used to predict the position of each nucleotide: either its location within a codon, if the nucleotide lies in an exon, or the fact that it comes from an intron. We also compare the sensitivity and specificity of this method with those of a simple hidden Markov model that captures many of the same features as our multi-window method.
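
For readers who would like a concrete picture of the idea before the talk, here is a minimal Python sketch of likelihood-ratio classification from n-tuple frequencies. It is an illustration only and not the speaker's multi-window method: it uses a single window, and the function names (tuple_freqs, train, predict), the choice n = 3, the add-one smoothing, the window size, and the assumption that every training exon starts at codon position 0 are all hypothetical choices made for this example.

```python
import math
from collections import Counter
from itertools import product

ALPHABET = "ACGT"  # assumption: sequences contain only these four letters


def tuple_freqs(seqs, n, phase=None):
    """Add-one-smoothed n-tuple frequencies.

    If phase is given, only n-tuples starting at index i with
    i % 3 == phase are counted (assumes each sequence starts in frame).
    """
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - n + 1):
            if phase is None or i % 3 == phase:
                counts[s[i:i + n]] += 1
    total = sum(counts.values()) + len(ALPHABET) ** n
    return {"".join(t): (counts["".join(t)] + 1) / total
            for t in product(ALPHABET, repeat=n)}


def train(exons, introns, n=3):
    """Collect one frequency table per codon position, plus one for introns."""
    return {
        "n": n,
        "exon": [tuple_freqs(exons, n, phase=p) for p in range(3)],
        "intron": tuple_freqs(introns, n),
    }


def predict(seq, k, model, half_window=3):
    """Return 0, 1, or 2 (codon position of nucleotide k) or "intron".

    For each candidate codon position c, the log-likelihood ratio of
    "exon, with nucleotide k at position c" versus "intron" is summed
    over the n-tuples starting inside the window around k.
    """
    n = model["n"]
    first = max(0, k - half_window)
    last = min(len(seq) - n, k + half_window)
    starts = range(first, last + 1)
    intron_ll = sum(math.log(model["intron"][seq[j:j + n]]) for j in starts)
    best_label, best_llr = "intron", 0.0
    for c in range(3):
        exon_ll = sum(
            math.log(model["exon"][(c + j - k) % 3][seq[j:j + n]])
            for j in starts
        )
        llr = exon_ll - intron_ll
        if llr > best_llr:  # an exon label needs a positive log ratio
            best_label, best_llr = c, llr
    return best_label


# Toy training data, invented purely for illustration.
model = train(exons=["ATGGCC" * 10], introns=["TA" * 30])
labels = [predict("ATGGCCATGGCC", k, model) for k in range(3)]
```

In this sketch a nucleotide is labeled as intronic unless some codon position yields a positive log-likelihood ratio. A real evaluation would use held-out annotated sequences to estimate sensitivity and specificity, as the abstract describes.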