Interesting discussion on Giardia:
   http://sandwalk.blogspot.com/2007/09/giardia-lamblia-genome.html

Questions on sequence alignment (see classes 13 and 14)

Remarks on last Friday - Data Formats

Review Quiz 2

Progressive alignment of multiple sequences
(e.g. clustalw/clustalx):

1) Pairwise distance calculation
2) Clustering analysis of the sequences based on pairwise alignment.
3) Iterated alignment of two most similar sequences or groups of sequences.

Problem: Step two can create a strong bias that is then recovered as "signal" in later analyses of the multiple sequence alignment.
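Steps 1 and 2 can be sketched as follows. This is a minimal illustration, not what clustalw actually does: it assumes the sequences are already the same length, uses the fraction of differing positions as a toy distance, and clusters with UPGMA (average linkage). The names p_distance and guide_tree are made up for this example.

```python
from itertools import combinations

def p_distance(a, b):
    """Toy distance: fraction of differing positions (equal-length input assumed)."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length for this toy distance")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def guide_tree(seqs):
    """Build a UPGMA-style guide tree (nested tuples) from a dict name -> sequence."""
    # Step 1: pairwise distances between all sequences.
    clusters = {name: 1 for name in seqs}  # cluster -> number of leaves in it
    dist = {frozenset((a, b)): p_distance(seqs[a], seqs[b])
            for a, b in combinations(seqs, 2)}
    # Step 2: repeatedly join the two closest clusters.
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: dist[frozenset(pair)])
        merged = (a, b)
        na, nb = clusters.pop(a), clusters.pop(b)
        for c in clusters:
            # Average linkage: size-weighted mean of the distances to a and b.
            dist[frozenset((merged, c))] = (
                na * dist[frozenset((a, c))] + nb * dist[frozenset((b, c))]
            ) / (na + nb)
        clusters[merged] = na + nb
    return next(iter(clusters))

example = {"A": "ACGTACGT", "B": "ACGTACGA", "C": "TTGTACCA", "D": "TTGTAGCA"}
print(guide_tree(example))
```

The order of joins in the resulting tree is the order in which step 3 would align sequences and groups, which is why an error in the guide tree propagates into the final alignment.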

Types of Error in a Databank search

False positives: The number of false positives is estimated by the E-value. The P-value or significance value gives the probability that a positive identification is made in error (same as with drug tests).
Danger: avoid fishing expeditions. If you do 100 tests on random data, you expect one to be positive at the 1% significance level. You could apply the Bonferroni correction (test each individual test at the desired significance level divided by the number of parallel tests), but then the hypothesis to be rejected is that all of the individual tests are not significantly different from chance.
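The arithmetic behind the fishing-expedition warning and the Bonferroni correction can be written out as a short sketch. The function names and the example p-values are invented for illustration.

```python
def expected_false_positives(alpha, n_tests):
    """Expected number of chance 'hits' when n_tests are run on random data."""
    return alpha * n_tests

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the overall level at alpha."""
    return alpha / n_tests

def significant_after_bonferroni(p_values, alpha=0.01):
    """Flag which p-values pass the Bonferroni-corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]

# 100 tests at the 1% level: about one positive is expected by chance alone.
print(expected_false_positives(0.01, 100))
# Each of the 100 tests must instead be judged at the 0.01% level.
print(bonferroni_threshold(0.01, 100))
# Hypothetical p-values from three parallel tests.
print(significant_after_bonferroni([0.00005, 0.005, 0.2]))
```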

False negatives: Homologous sequences in the databank that are not recognized as such. If there are only 12000 different protein families, then on average a sequence should have (size of the databank)/12000 matches. In other words, the number of false negatives is probably very large.
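As a quick worked example of this estimate: the databank size used below is a made-up round number, not a figure from the notes, and the function name is likewise only illustrative.

```python
def expected_family_members(databank_size, n_families=12000):
    """Average number of homologs per query if sequences are spread evenly over families."""
    return databank_size / n_families

# Hypothetical databank of 6 million protein sequences.
print(expected_family_members(6_000_000))
```

A search that returns only a handful of significant hits against a databank of that size would therefore be missing most of the true homologs.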

Trees, nets, gene transfer and the ToL

Trees:

  • Topology and branch lengths
  • Rooted vs Unrooted
  • Branches, splits, bipartitions
  • In a rooted tree: clades
  • Mono-, Para-, polyphyletic groups, cladists and a natural taxonomy
  • Shared derived versus shared primitive characters (Synapomorphy, Symplesiomorphy, Autapomorphy, homoplasy)
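The relation between rooted drawings, unrooted topologies and splits can be made concrete with a small sketch. It assumes a tree is written as nested tuples of leaf names (a representation chosen only for this example) and collects the non-trivial bipartitions; two drawings with the same set of splits have the same unrooted topology.

```python
def leaves(tree):
    """Set of leaf names below a node (a leaf is a plain string)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def splits(tree):
    """Non-trivial bipartitions of the leaf set induced by the internal branches."""
    all_leaves = leaves(tree)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        below = leaves(node)
        other = all_leaves - below
        # Skip trivial splits (one leaf versus the rest) and the root itself.
        if len(below) > 1 and len(other) > 1:
            found.add(frozenset([below, other]))
        for child in node:
            visit(child)

    visit(tree)
    return found

balanced = (("A", "B"), ("C", "D"))
ladder = ("A", ("B", ("C", "D")))   # same unrooted tree, rooted on the branch to A
other = (("A", "C"), ("B", "D"))    # a different topology
print(splits(balanced) == splits(ladder))
print(splits(balanced) == splits(other))
```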

Studies on the Origin of Life

Top down approaches (fossil and molecular records, retrodiction of biochemical pathways)

Bottom up (prebiotic chemistry)

Primordial Soup (Miller -> see reading assignment) or Primordial Pizza (Wächtershäuser -> see reading assignments)

 

The RNA world

The currently favored scientific scenario for the transition from chemistry to biology is roughly as follows:

prebiotic chemistry either on Earth or in Space, in solution or on surfaces or in the gas phase
(autocatalytic chemical cycles and chemical networks)
?
self-replicating biopolymer
?
Emergence of cells, hypercycles or other means to co-select different genes
RNA world
??
Invention of protein biosynthesis

 

The existence of the RNA world as a transitory stage is supported by the following:

  • RNA molecules have catalytic activity. Famous ribozymes are the group I self-splicing intron from Tetrahymena (a ciliate) and the RNA portion of the E. coli Ribonuclease P (involved in tRNA processing).

  • RNA molecules have the potential to function as genetic material and as enzymes, or ribozymes (this solves the chicken vs. egg problem). This also allows for comparatively easy schemes to evolve RNAs in vitro to have new or different catalytic function (blind design by evolution).

  • Many enzymatic cofactors are nucleotides or nucleotide derived (FAD, ATP). Ribosomal protein synthesis relies on RNAs. RNA is an important part of the catalytic machinery that forms the peptide bond (see Noller et al.). tRNAs contain many strange bases, suggesting that the catalytic potential of RNA molecules can go beyond what is possible with four bases only.

In vitro evolution has succeeded in evolving RNAs with novel properties, e.g. ATP binding. Jack Szostak's lab is working to evolve RNAs with template-directed RNA polymerization capabilities. The principal selection scheme is depicted in this diagram at Szostak's web page.

In vitro selection became famous with Sol Spiegelman's experiments on the in vitro replication of the phage Qbeta RNA. In this case selection was for the fastest replicating molecules: they became shorter and lost their ability to infect bacteria.
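Why selection for replication speed alone favors shorter molecules can be seen in a deliberately crude, deterministic sketch. It assumes that the replication rate is proportional to 1/length and that each transfer simply renormalizes the population; the lengths and starting frequencies are invented for illustration and are not Spiegelman's data.

```python
def serial_transfer(freqs, rounds):
    """Iterate growth and dilution; freqs maps molecule length -> frequency.

    Assumption: a molecule of length L replicates at a rate proportional to 1/L.
    """
    for _ in range(rounds):
        grown = {length: f / length for length, f in freqs.items()}
        total = sum(grown.values())
        # Dilution step: rescale so the frequencies sum to 1 again.
        freqs = {length: g / total for length, g in grown.items()}
    return freqs

def mean_length(freqs):
    """Population-average molecule length."""
    return sum(length * f for length, f in freqs.items())

# Hypothetical start: mostly full-length RNA plus rare truncated variants.
start = {4200: 0.98, 2000: 0.01, 500: 0.01}
end = serial_transfer(start, rounds=20)
print(mean_length(start), mean_length(end))
```

In this toy model nothing rewards infectivity, so the shortest variant takes over within a few transfers even though it starts at 1% of the population.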

Later inventions are the SELEX procedure to select for RNA with very specific binding properties (see left), and the selection of ribozymes with altered or new properties. In the latter case growth and selection can be either discrete or continuous. See reading materials for further discussion.

How can evolution be improved?

Genetic drift or the co-selection of slightly deleterious mutations can lead to the fixation of deleterious mutations. These mutations can be eliminated if recombination occurs between different members of the population. Another advantage of recombination is that positive properties that arose independently in different parts of the molecule can be combined, as exploited in molecular breeding and sexual PCR.
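Both effects of recombination can be shown with a toy single-point crossover. The target sequence, the parents and the fitness function (number of positions matching the target) are all invented for this sketch.

```python
def crossover(parent_a, parent_b, point):
    """Single-point recombination: head of parent_a joined to tail of parent_b."""
    return parent_a[:point] + parent_b[point:]

def fitness(seq, target):
    """Toy fitness: number of positions that match a hypothetical optimum."""
    return sum(x == y for x, y in zip(seq, target))

target = "GATTACAG"      # hypothetical optimal sequence
parent_a = "GATTTTTT"    # good first half, poor second half
parent_b = "TTTTACAG"    # poor first half, good second half

child = crossover(parent_a, parent_b, 4)
reciprocal = crossover(parent_b, parent_a, 4)
print(child, fitness(child, target))
print(reciprocal, fitness(reciprocal, target))
```

One recombinant collects the good halves of both parents, while the reciprocal product collects the poor halves and would be removed by selection.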

Illustration of the power of recombination: Molecular Computation -> the traveling salesman problem (Adleman's experiment solved an instance of the closely related directed Hamiltonian path problem). Adleman's Science paper (JSTOR link)
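The logic of Adleman's experiment (generate candidate paths, then keep only those that start and end at the right vertices and visit every vertex exactly once) can be mimicked in software by brute force. The small graph below is a made-up example, not the graph from the paper, and hamiltonian_paths is a name chosen for this sketch.

```python
from itertools import permutations

def hamiltonian_paths(edges, n_vertices, start, end):
    """All directed paths from start to end that visit every vertex exactly once."""
    edge_set = set(edges)
    inner = [v for v in range(n_vertices) if v not in (start, end)]
    paths = []
    for middle in permutations(inner):
        path = (start, *middle, end)
        # Keep the candidate only if every consecutive step is an existing edge.
        if all((a, b) in edge_set for a, b in zip(path, path[1:])):
            paths.append(path)
    return paths

# Hypothetical directed graph on 4 vertices.
edges = [(0, 1), (1, 2), (2, 3), (0, 2), (1, 3)]
print(hamiltonian_paths(edges, n_vertices=4, start=0, end=3))
```

The computer checks the candidates one after another; in the DNA version the candidates form in parallel by ligation of oligonucleotides, and the filtering is done by PCR, gel electrophoresis and affinity purification.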

In vitro evolution of proteins
Problem:
How to couple the functional protein to the genetic material.

Biological solution: cells contain the genetic material and the encoded proteins. Selection of cells that contain the more successful protein will also select the gene encoding the protein.

Alternative: Link protein to encoding RNA. (see O'Keefe and Szostak's scheme h on RNA display here)

 

Reading assignment:

  • Article on SELEX by Craig Tuerk and UConn graduate Larry Gold (optional)
  • New Quiz due next Monday!