Take-home quiz #5

(Please return a hardcopy of your answers by Monday, November 5th before class)

 

  1. Describe the two types of error that can occur if you search for homologous sequences in a databank.

 

 

Which type occurs less frequently in a traditional BLAST search?

 

 

Which type is reduced in PSI-BLAST searches as compared to a normal BLAST search?

 

 

  1. You want to use PSI-BLAST to find possible intein-encoding genes in a genome. You plan to calculate a PSSM using the nr databank. "It does not matter which intein sequence you use as a query to build your PSSM. All inteins are homologs, and they will all find the same targets in the databank." Correct / Incorrect

 

  1. How many different unrooted tree topologies are possible for 3 species?

 

For 4 species?

 

For 5 species?

 

  1. You do a PSI-BLAST search using an uncharacterized ORF from a Methanococcus strain as query. In the zero iteration you pick up a couple of hypothetical proteins from other Euryarchaeota; in the first iteration a rho termination factor from a Gram-positive bacterium scores above the cut-off level (E < 10^-3); in the second iteration this and other termination factors score with E-values below 10^-24. Does this prove that the ORF from Methanococcus that you used as a query is a homolog of the rho termination factors?
    A) Yes
    B) No

  2. Would your conclusion change if the rho termination factor had already been picked up in the zero iteration of your search?
    A) Yes
    B) No

 

  1. Prokaryotes are characterized by the absence of a complex internal membrane system; in particular, they do not have a membrane system (= nuclear envelope) surrounding their genetic material. Do you consider the absence of the nuclear envelope a symplesiomorphy or a synapomorphy? Is the group defined by this character a proper taxonomic category?

 

 

 

  1. Archaea and Eukaryotes both have so-called TATA-binding proteins, which play an important role in directing the RNA polymerase to the promoter, whereas Bacteria do not have a homologous protein.
    Do you consider the presence of a TATA binding protein a shared derived character?
    A) Yes
    B) No

 

  1. Is this a valid argument to support shared ancestry between Archaea and at least part of the eukaryotic nucleocytoplasmic component?
    A) Yes
    B) No

 

  1. The evolution of species is often depicted as a tree. Give at least two examples of events where two branches from the "tree of life" fused to form a new organism:


  1. Under which conditions can two alleles be maintained over many generations (>>> 4Ne) in a small population of diploid organisms?

 

  1. What is the relationship between the mutation rate and the substitution rate for selectively neutral mutations?

 

  1. In analyzing a cyanobacterial genome (e.g., Prochlorococcus sp.), you try to identify genes that were recently transferred from another bacterial phylum (e.g., Proteobacteria) to the Prochlorococcales. Which reference genomes could you choose in TaxPlot (at NCBI)?

 

 

  1. BLAST is available through several web servers. Under what circumstances might it be advantageous to install the BLAST program on your personal computer?

 

 

 

 

  1. Under which circumstances might a command-line interface be preferable to a graphical user interface?

 

 

 

 

 

  1. Assume that you used blastall (with an E-value cut-off of 10^-7) to calculate this gene plot. Your query was a multiple-sequence file containing all ORFs encoded in the Thermotoga petrophila genome; your target (the databank) was a similar file from Thermotoga maritima. For both of these genomes, the ORFs were identified by the nucleotide position in the middle of the ORF. The file containing the ORFs from the T. maritima genome is here.
    In the plot one can clearly recognize syntenic regions, i.e., the neighborhood relations between matching genes appear to be the same in the two genomes. These syntenic regions are recognized as diagonal rows of BLAST hits. However, there are also many other matches that tend to form vertical and horizontal lines. Your task is to pick three points that are not part of the diagonal, and to find out which gene in T. maritima gave rise to these matches. Do not pick points that are in the same horizontal or vertical row.

Point 1

Position of central nucleotide of matching ORF in T. maritima:
Position of central nucleotide of matching ORF in T. petrophila:

Annotation of ORF in T. maritima:

Point 2

Position of central nucleotide of matching ORF in T. maritima:
Position of central nucleotide of matching ORF in T. petrophila:

Annotation of ORF in T. maritima:

Point 3

Position of central nucleotide of matching ORF in T. maritima:
Position of central nucleotide of matching ORF in T. petrophila:

Annotation of ORF in T. maritima:
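
If you prefer to look up the matches programmatically rather than by eye, the last question can be approached along the following lines. This is only a minimal sketch under several assumptions that are not stated in the quiz: the BLAST hits were saved in the tab-separated 12-column format (the `-m 8` output option of blastall), the sequence identifiers in both ORF files are the central nucleotide positions, and each FASTA header has the form `>position annotation`. The function names and the example records are hypothetical and made up for illustration; they are not real Thermotoga data.

```python
from collections import defaultdict


def parse_tabular_hits(lines, max_evalue=1e-7):
    """Yield (query_id, subject_id, evalue) from 12-column tabular BLAST output.

    Assumed column layout: query id, subject id, % identity, alignment
    length, mismatches, gap openings, q.start, q.end, s.start, s.end,
    E-value, bit score.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        evalue = float(fields[10])
        if evalue <= max_evalue:
            yield fields[0], fields[1], evalue


def subjects_hit_repeatedly(hits, min_queries=2):
    """Map each subject ORF matched by at least min_queries different queries
    to the list of those queries (sorted as text). In the gene plot such a
    subject shows up as a row or column of points, depending on the axes."""
    queries_by_subject = defaultdict(set)
    for query_id, subject_id, _evalue in hits:
        queries_by_subject[subject_id].add(query_id)
    return {
        subject: sorted(queries)
        for subject, queries in queries_by_subject.items()
        if len(queries) >= min_queries
    }


def fasta_annotations(lines):
    """Map identifier -> description for every FASTA header line.

    Assumes headers of the hypothetical form '>position annotation'.
    """
    annotations = {}
    for line in lines:
        if line.startswith(">"):
            identifier, _, description = line[1:].strip().partition(" ")
            annotations[identifier] = description
    return annotations


# Made-up example records (query = T. petrophila ORF, subject = T. maritima ORF).
EXAMPLE_HITS = [
    "1000\t1200\t98.0\t300\t6\t0\t1\t300\t1\t300\t1e-150\t500",
    "5000\t90000\t45.0\t200\t100\t2\t1\t200\t5\t204\t1e-30\t120",
    "7000\t90000\t44.0\t200\t102\t2\t1\t200\t5\t204\t1e-28\t115",
    "8000\t3000\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t30",
]
EXAMPLE_FASTA = [
    ">1200 hypothetical protein",
    "MKV",
    ">90000 example annotation",
    "MAA",
]
```

With real data you would pass the opened BLAST output file and the opened T. maritima ORF file instead of the example lists, then read the annotation for each subject position that `subjects_hit_repeatedly` reports.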