Math 574  -- Optimization Models in Computational Biology

Course Description
Bioinformatics and computational biology is one of the ``hottest'' interdisciplinary areas of science today. This graduate course aims to introduce a series of mathematical models which are used to solve several problems from the broad area of computational biology. Models covered include dynamic programming, clustering and trees, graph algorithms, combinatorial pattern matching, linear and integer optimization, and probabilistic models. The basics of these techniques will be illustrated by studying their applications to sequence motif search, sequence alignment, peptide identification, BLAST, phylogeny, protein structure prediction, and other problems from computational biology.

Emphasis will be on problem formulation, and not on the theory behind the models. Hence this course will be appealing to graduate students outside of mathematics. The textbook used introduces biological and mathematical ideas together in an easy way, and the same approach will be adopted throughout the course. Evaluation will be through homeworks and a course project (no exams will be given). For students coming from non-mathematical backgrounds, there will be enough non-technical exercises to work on (as a choice). Students from different backgrounds will be encouraged to work together on the assignments and project.

Syllabus

Web page for the text -   An Introduction to Bioinformatics Algorithms.

Announcements

Monday, January 9  
##   The class will meet in Webster B12 (and NOT in Neill 3W).
Saturday, March 11  
##    Description of the project is posted. You should choose a topic by Friday, March 24.
Thursday, March 16  
##    Latest paper on complexity analysis of algorithms for isothermic
sequencing by hybridization (SBH) has been uploaded (see under Papers).

Homeworks    
Homework 1     -- Due on Tuesday, Jan 24.
    Solutions to Homework 1
Homework 2     -- Due on Tuesday, Jan 31.
    Solutions to Homework 2
Homework 3     -- Due on Tuesday, Feb 7.
    Solutions to Homework 3
Homework 4     -- Due on Tuesday, Feb 14.
    Solutions to Homework 4
Homework 5     -- Due on Tuesday, Feb 21.
    Solutions to Homework 5
Homework 6     -- Due on Tuesday, Feb 28.
    Solutions to Homework 6
Homework 7     -- Due on Thursday, Mar 9.
    Solutions to Homework 7
Homework 8     -- Due on Thursday, Mar 30.
    Solutions to Homework 8
Homework 9     -- Due on Thursday, Apr 6.
    Solutions to Homework 9

Projects

Project Description  

Project Reports

  Gene Clustering    by Amy and DeAnne
  Cell Signaling    by Atef
  Language Processing    by Andrew
  Reaction Pathways and Boolean Models    by Ben and Jeff
  IP Model for MS/MS Spectra    by John
  Max Flow Model for Protein Domains    by Nate and TJ
  Hidden Markov Models    by Lisa and Seshu
  Sequence-Structure Motifs for Protein Function    by Yan and Da

Papers

Sequence Motif Discovery -
    Survey paper on DNA Binding Site representation and discovery (2002)
    A Generic Sequence Motif Discovery Algorithm   (Jan 2006)
Complexity of Isothermic Sequencing by Hybridization (SBH)   (Mar 2006)
Dynamic Programming model for Mass Spectrometry   (explains spectrum graph)  
Approximation Algorithms for SCS problem  (Shortest Common Superstring)
Four-body scoring function for protein decoy discrimination
Linear Programming Techniques for Fold Recognition
Integer Programming model for side-chain positioning
Protein Threading by Linear Programming

Software

MATLAB Tutorial from Mathworks page
Another guide to MATLAB from UBC CS.

AMPL
AMPL Handout 1

   Farmer Jones example: model file   data file
   Inventory model: model file   data file    Output from AMPL
   Hamiltonian path problem for sequencing by hybridization (SBH):
      model file   data file(example given in class)   Output from AMPL

VMD
    Sample tetrahedra - VMD script file to draw sample tetrahedra of each class
    in the Delaunay tessellation of the protein 2CI2;   All tetrahedra in 2CI2.

Links

Introduction to Computational Molecular Biology in MIT.
NCBI's genome data base.
A good animation showing DNA microarray methodology.
Bioinformatics journal web site (free access to articles on campus).
Nathan Edwards' lectures. The first one on Proteomics and Mass Spectrometry is recommended.


Last modified: Fri Feb 15 01:15:13 PST 2008