Assignments for Friday:
Assignment for Monday: Take home assignment 7 (Will be posted before Friday)
ML mapping
ML ratio test
Bayesian posterior probability through sampling
Powerpoint slides on trees and tree building are here
Introduction to Bayesian Analyses
An illustration of the usefulness of Bayesian thinking is here. Paul Lewis, a colleague in Ecology and Evolutionary Biology at the University of Connecticut, is one of the pioneers in applying a Bayesian framework to the analysis of molecular data. His lecture notes for his Introduction to Bayesian Phylogenetics are here (about four hours of lecture). Essentially, the Bayesian approach tries to assess the probability of a model, a bipartition, or a range of parameter values given the data, in contrast to ML, which assesses the probability of the data given a model.
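The contrast between the two approaches can be written out with Bayes' theorem. The symbols M (model, tree, or parameter value) and D (data, e.g., the sequence alignment) are introduced here for illustration only.

```latex
% ML works with the likelihood: the probability of the data given the model.
L(M) = P(D \mid M)

% The Bayesian approach works with the posterior: the probability of the
% model given the data, obtained from the likelihood and the prior P(M).
P(M \mid D) = \frac{P(D \mid M)\, P(M)}{P(D)},
\qquad
P(D) = \sum_{M'} P(D \mid M')\, P(M')
```

The denominator P(D) sums (or integrates) over all models and parameter values, which is usually not feasible to compute directly for trees. This is one reason sampling approaches are used instead.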
It has been shown that under some conditions the biased sampling of tree and parameter space converges on the posterior probability. The approach most often used in recent months is Markov Chain Monte Carlo (MCMC) sampling. The principle is illustrated by a little program that Paul Lewis wrote called MCRobot. This little robot runs around in a two-dimensional space over which different distributions can be defined. The walk of the robot is biased in such a way that the probability of finding the robot in a place is proportional to the height of the defined distribution at that place.
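The biased walk can be sketched in a few lines of Python. This is a minimal Metropolis sampler written for illustration, not MCRobot's actual code; the function names, the single Gaussian hill, the step size, and the seed are all assumptions.

```python
import math
import random

def hill_density(x, y, cx=0.0, cy=0.0, sd=1.0):
    """Unnormalized height of one round hill (a 2-D Gaussian); hypothetical terrain."""
    return math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sd ** 2))

def metropolis_walk(density, n_steps, step_size=0.5, start=(0.0, 0.0), seed=1):
    """Random walk whose visits are, in the long run, proportional to density.

    The start position must have a density greater than zero.
    """
    rng = random.Random(seed)
    x, y = start
    path = [(x, y)]
    for _ in range(n_steps):
        # Propose a small random step from the current position.
        nx = x + rng.gauss(0, step_size)
        ny = y + rng.gauss(0, step_size)
        # Uphill steps (ratio >= 1) are always taken; downhill steps
        # are taken with probability equal to the height ratio.
        ratio = density(nx, ny) / density(x, y)
        if rng.random() < ratio:
            x, y = nx, ny
        # A rejected step counts as another visit to the current place.
        path.append((x, y))
    return path

path = metropolis_walk(hill_density, 1000)
print(len(path))
```

With enough steps, the fraction of visits that fall within a region approaches the share of the distribution lying in that region. For the hill used here, roughly 39% of the visits should end up within one standard deviation of the peak.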
--- MCRobot demo in class ---
- Start the program. The black space you are looking at is the completely flat space in which the robot walks. To start the robot, press "Ctrl-F" (you will see the 100 steps that the robot took, connected to each other). To continue walking, press "Ctrl-N" [you can hold "Ctrl-N" for a continuous walk].
- Run the robot for a sufficient amount of time. How well does the robot explore the space?
- Now change the terrain by introducing hills. To do so, drag the mouse somewhere in the space and release the button. The hill is depicted as yellow contours. Run the robot for some time. Do you observe any noticeable difference in the robot's behavior?
- Go to the "Chains" menu and toggle "2 chains". MCRobot will now run two chains simultaneously (the cold chain [the original, blue] and a heated chain [red]). Go to the "Show" menu and choose "All chains" to have both chains depicted. Run the chains for some time. Are there any differences in how the red and blue chains explore the space?
- Go to "Robot" -> "Options" and toggle the "Allow rotation" option. This allows you to rotate the plane on which the robot walks [before, we always looked from the top]. To change the viewing angle, press the right mouse button and move the mouse. Now run the chains again. Are there any differences in how the red and blue chains explore the space when seen from this angle?
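The difference between the cold and the heated chain in the demo can be illustrated with a small calculation. The sketch below assumes a common heating scheme in which the heated chain sees the landscape raised to a power beta between 0 and 1; whether MCRobot uses exactly this scheme is an assumption, and the function name and the value beta = 0.5 are chosen only for illustration. The swapping of states between chains is left out.

```python
def acceptance_probability(current_height, proposed_height, beta=1.0):
    """Probability of accepting a proposed step.

    beta = 1.0 corresponds to the cold chain; 0 < beta < 1 flattens the
    landscape and corresponds to a heated chain (assumed heating scheme).
    """
    if current_height <= 0:
        raise ValueError("current_height must be positive")
    ratio = (proposed_height / current_height) ** beta
    return min(1.0, ratio)

# A step from a hilltop (height 1.0) down into a valley (height 0.01):
cold = acceptance_probability(1.0, 0.01, beta=1.0)  # cold (blue) chain
hot = acceptance_probability(1.0, 0.01, beta=0.5)   # heated (red) chain
print(cold, hot)
```

Under these assumptions the cold chain accepts the downhill step about 1% of the time and the heated chain about 10% of the time, so the heated chain crosses valleys between hills more readily.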
The programs that evaluate molecular sequences (e.g., MrBayes) do the same as MCRobot, but they walk around in tree and parameter space. For each place it visits, the program calculates the likelihood. The decision to take or reject a step is based on how the likelihood (weighted by the prior) at the proposed place compares with that at the current place. From all the trees and parameters visited, minus those from an initial burn-in phase, one can calculate the posterior probabilities of trees and parameters: the posterior probability of a tree is estimated as the fraction of the retained samples in which that tree occurs.
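The last step, going from visited places to posterior probabilities, amounts to counting. The sketch below is not MrBayes code or its output format; the function name, the tree strings, and the burn-in length are hypothetical and serve only to show the bookkeeping.

```python
from collections import Counter

def posterior_from_samples(samples, burn_in):
    """Estimate posterior probabilities as frequencies among the samples
    that remain after the first burn_in samples are discarded."""
    kept = samples[burn_in:]
    if not kept:
        raise ValueError("no samples left after discarding the burn-in")
    counts = Counter(kept)
    return {item: n / len(kept) for item, n in counts.items()}

# Hypothetical chain: the first two samples were taken before the chain
# settled and are discarded as burn-in.
sampled_trees = (
    ["((A,C),(B,D))"] * 2
    + ["((A,B),(C,D))"] * 6
    + ["((A,C),(B,D))"] * 2
)
posteriors = posterior_from_samples(sampled_trees, burn_in=2)
print(posteriors)
```

In this toy example eight samples are retained; the tree ((A,B),(C,D)) occurs in six of them and so receives an estimated posterior probability of 0.75. The same counting can be applied to bipartitions or to ranges of a parameter value.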
Additional material:
Paul Lewis (EEB - UConn) has written a very readable and thorough description of the Bayesian approach: from MCB/EEB372 class 22
Paul Lewis' MCRobot program that illustrates the MCMC approach to estimate posterior probabilities is here.
Olga's exercise on the value of Bayesian thinking is here.
For those interested in reading more about the application of probability mapping to comparative genome analyses: an article on the use of ML mapping in comparative genome analyses is here (see Fig. 1, 2, 3, 4, 7, and Tab. 4); an improved version of probability mapping that solves the problem of poor taxon sampling inherent in quartet analyses is here; and an article that describes the extension to more than 4 genomes is here.
Background information: The catalytic beta subunits are encoded on the chloroplast genome; the paralogous non-catalytic alpha subunits are encoded in the nucleus, translated in the cytoplasm, and transported into the plastids after translation.