- Are all nucleotide binding sites homologous? (E.g., consider the one in F-ATPases and the one in carbamoyl phosphate synthetase.)
- When calculating trees with clustalx/clustalw you have the option to exclude from the analysis all positions that have a gap in any of the sequences. The default is to exclude gaps only from the pairwise comparisons.
Under which conditions might it be advantageous to turn on this option, and under which conditions might the default setting be preferable?
- Likelihood-based phylogenetic reconstruction aims to find the tree under which the data set (e.g., aligned sequences) is most probable. To calculate this probability, a model describing the evolutionary process is used. Which model assumption(s) would correspond to a normal parsimony analysis?
A) Gaps (-) in the sequence alignment are treated as a 5th nucleotide or as the 21st amino acid.
B) All types of substitutions occur with the same probability.
C) The ratio between the probabilities of transitions and transversions is described by a parameter kappa that is estimated from the data.
D) A-C are correct
E) All answers are incorrect
- 100 bootstrap data-sets are created for a set of sequences. The Parsimony method is applied to these data-sets to give 100 trees. The consensus tree of this set of 100 trees is given below with bootstrap percentages indicated to the left of the branch to which the support value pertains. It has been rooted by assuming that U is the outgroup. Which of the four conclusions can be drawn?
                   /----------- W
          /--------| 87
          |        \------------ X
          |
    /-----| 91
    |     |         /----------- Y
    |     \---------| 100
/---| 100           \----------- Z
|   |
|   \-------------------------- V
|
-------------------------------- U
• A. The clade X+Y+Z never occurs in the set of 100 trees.
• B. The clade W+X+V never occurs in the set of 100 trees.
• C. The clade W+X+V could occur in up to 9 of the 100 trees.
• D. The clade W+V could occur in up to 13 of the 100 trees.
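For readers who want to see what "creating bootstrap data-sets" involves, the resampling step can be sketched as follows. This is a minimal illustration, not the procedure of any particular program: the function name `bootstrap_alignment` and the toy sequences are hypothetical, and only the taxon labels are borrowed from the question. Each replicate draws alignment columns with replacement until it has as many columns as the original alignment; a tree would then be inferred from every replicate.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: columns resampled with replacement."""
    n_cols = len(next(iter(alignment.values())))
    # Draw n_cols column indices with replacement.
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    # Apply the same column choice to every sequence so columns stay intact.
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


# Hypothetical toy alignment (made-up sequences, taxon labels from the question).
toy = {"U": "ACGTACGT", "V": "ACGTACGA", "W": "ACGAACGA"}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(100)]
```

The bootstrap percentage on a branch is then the fraction of replicate trees that contain the corresponding clade.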
- A new mutant allele has just arisen in a population. Which statement is true?
A. If the mutant is neutral with respect to the original allele, there is a 50% probability that the mutant allele will replace the original allele.
B. It is very likely to disappear in a few generations due to random drift.
C. It will only become fixed in the population if there is a strong selective advantage.
D. If the mutant allele reaches a frequency of 50%, it will almost always go on to fixation.
- The following four trees are unrooted and branch lengths are not drawn to scale. Which statement is correct?

• A. All four trees are non-equivalent.
• B. All four trees are equivalent.
• C. Only trees (i) and (iii) are equivalent.
• D. Only trees (iii) and (iv) are equivalent.
- Give each of the trees in Figure 6 in parenthesis notation:
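As a reminder of what parenthesis (Newick) notation looks like, here is a small sketch that writes a tree out in that form. Figure 6 is not reproduced here, so the example uses the rooted consensus tree from the bootstrap question instead. The nested-tuple representation and the helper name `to_newick` are assumptions made for illustration only.

```python
def to_newick(node):
    """Convert a nested tuple of taxon names into parenthesis notation."""
    if isinstance(node, str):
        return node  # a leaf is written as its name
    # An internal node is its children, comma-separated, inside parentheses.
    return "(" + ",".join(to_newick(child) for child in node) + ")"


# Topology of the consensus tree from the bootstrap question (support values omitted).
tree = (((("W", "X"), ("Y", "Z")), "V"), "U")
newick = to_newick(tree) + ";"
print(newick)  # ((((W,X),(Y,Z)),V),U);
```

Each pair of parentheses encloses one clade, so sister taxa appear together inside the innermost pairs.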
- You try to find homologs to a gene isolated from an Anabaena species. Let's call this gene xyz. The genome of this species has just been publicly released. (Anabaena is a cyanobacterium, i.e., it belongs to one of the bacterial phyla or kingdoms.) Gene xyz is listed as a putative open reading frame but with no further annotations. A search of the nr database with the putative product of the xyz gene using BLASTP results in a match to an ATPase catalytic subunit from E. coli with a P-value of 10^-10. When you use the E. coli catalytic F-ATPase subunit to search the Anabaena genome you retrieve 5 significant hits. The top hit has an E-value of 10^-62. The sequence for which you are trying to find homologs is ranked 4th. These findings suggest
A) that you need to do a PSI-BLAST search before drawing any conclusions
B) that gene xyz encodes a paralog to the F-ATPase catalytic subunit
C) that gene xyz encodes the non-catalytic F-ATPase subunit.
- The chance of obtaining false negatives is lower in a PSI-BLAST search than in a normal BLAST search. (Correct/Incorrect)
- Because the PSSM may become corrupted, the E-value of a match obtained in a later iteration of a PSI-BLAST search is not a good measure of the probability of obtaining a match of this quality by chance. Correct/Incorrect
- You should run PSI-BLAST for more than 5 iterations to ensure that accurate matches are returned. Correct/Incorrect
- You do a PSI-BLAST search using an ATP synthase catalytic subunit as query. In the 5th iteration a match to myosin with an E-value of 10^-27 is reported.
This demonstrates that at least a portion of the ATP synthase catalytic subunit is homologous to part of the myosin molecule. Correct/Incorrect
- A program calculating pairwise sequence comparisons reports a probability of 1% that a match of this quality might be due to chance. If you were to perform 1000 different sequence comparisons using this program, how many matches at this level of similarity would you expect due to chance? ________
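The relation between a per-comparison probability and the number of chance matches expected over many comparisons can be sketched as below. The numbers are hypothetical and deliberately differ from those in the question. The function names are illustrative, and the second function assumes the comparisons are independent.

```python
def expected_chance_matches(p_per_comparison, n_comparisons):
    """Expected number of chance matches: probability times number of trials."""
    return p_per_comparison * n_comparisons


def prob_at_least_one(p_per_comparison, n_comparisons):
    """Probability of one or more chance matches, assuming independent comparisons."""
    return 1.0 - (1.0 - p_per_comparison) ** n_comparisons


# Hypothetical values: a 5% per-comparison probability and 200 comparisons.
p, n = 0.05, 200
expected = expected_chance_matches(p, n)
at_least_one = prob_at_least_one(p, n)
```

This is the same reasoning that separates a P-value (per comparison) from an E-value (expected count over the whole search).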
- The programs in PHYLIP can be used to:
A) Estimate phylogenies using the Parsimony method
B) Perform bootstrap resampling of your data
C) Estimate phylogenies using the Fitch-Margoliash distance matrix method
D) Plot unrooted phylogenies in a tree diagram
E) All of the above
- Analyzing a paralogous gene family, you find that each of the two paralogs is present in all three domains of life. The two groups of paralogs are joined by a branch that connects the bacterial clades.

Ignoring horizontal gene transfer, this tree would suggest:
A) that the bacteria are a monophyletic (and not a para- or polyphyletic) group
B) that the last common ancestor of the two paralogous types of genes existed in a bacterium; and that ignoring horizontal gene transfer the most recent common ancestor of the three domains of life was a bacterium.
C) that the last common ancestor is placed in the tree of life between the bacteria on one side and the archaea and eukaryotes on the other.
- If the branches in the tree in question (15) that are indicated by *s have only 55% bootstrap support (even though the branch indicated by ** is strongly supported), this would indicate that the conclusion drawn in question (15) is not strongly supported.
Correct/incorrect
- If the branch in the tree in question (15) that is indicated by ** has only 55% bootstrap support (even though the branches indicated by *s are strongly supported), this would indicate that the conclusion in question (15) is not strongly supported.
Correct/incorrect
- You analyze a quartet of putatively orthologous sequences. The maximum parsimony tree looks like this:
The small central branch has 100% bootstrap support.
A) Maximum parsimony is not subject to the long branch attraction artifact; rather, it always tends to group the long branches with the short ones. Therefore, the finding that A and B group together is reliable.
B) This tree groups the two long branches together. The possibility exists that this result might represent a long branch attraction artifact.
C) The central branch is so strongly supported that one can exclude a long branch attraction artifact. (LBA is a statistical phenomenon and never reaches 100% bootstrap support.)
- In applying the Bayesian framework to the analysis of molecular data, which of the following are true?
A) The probability of the model given the data is assessed
B) The probability of the data given the model is assessed
C) This is the same as maximum likelihood analysis
D) Both A) and C)
E) Both B) and C)
- Analyzing alleles for a particular gene in the human population you obtain the following genealogy:

You conclude that the blue allele is rapidly spreading in the human population (over 70% of the sampled humans have this allele) and that this allele arose only about 40,000 years ago. In contrast, the "red" alleles are more divergent from one another, and they appear to have been evolving in the human lineage for more than 200,000 years. Surprisingly, the red and the blue alleles go back to a common ancestor that is older than a million years.
What could have been the origin of the blue allele in the human population about 40,000 years ago?