Go over take-home quiz number 6
Go over positive/diversifying selection (ppt here)
Go over SPDBV exercise on ATPase subunits. (fit/magic fit and RMS coloring)
Oligo and cDNA based micro-chips/arrays
In most microarrays, cDNAs or gene sequences are spotted onto glass slides.
Array sizes range from several hundred to several thousand spots.
Use of Cy3 and Cy5 to label the cDNA to be analyzed.
(See animation)
The result looks something like this
An alternative is to synthesize the DNA directly onto the matrix (slides from Affymetrix)
Analyzing microarrays is one of the current hot topics in bioinformatics. There are many different approaches; the goal is to find genes or groups of genes with similar responses. An example of a cluster analysis is here. For many, the holy grail of the field seems to be to construct regulatory networks and to consider their evolution. Microarrays have proven useful for finding genes that are expressed under certain circumstances and that might be useful targets for drugs or for further investigation. An often overlooked problem is the limited reproducibility and accuracy of the data.
Items worth consideration include:
How many replicates? Which types of replicates (technical, biological)?
What to use as reference? (If the reference does not have a signal, all samples are at infinity; hence a pooled reference.)
How is intensity measured (average pixel intensity, total intensity, median)?
(Also, how do you store your data? As images with information for each pixel, or only the average for each spot?)
(Fig. 3.4 and following box in A primer to genome science.)
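To make the choice between average, total, and median intensity concrete, here is a minimal sketch in Python. The function name `summarize_spot` and the pixel values are hypothetical and chosen for illustration only; real image-analysis software will differ.

```python
from statistics import mean, median


def summarize_spot(pixels):
    """Return total, mean, and median intensity for one spot's pixel values."""
    return {
        "total": sum(pixels),
        "mean": mean(pixels),
        "median": median(pixels),
    }


# Hypothetical pixel intensities for a single spot.
# The last value mimics one very bright artifact pixel (e.g., a dust speck).
spot_pixels = [100, 102, 98, 101, 99, 5000]

summary = summarize_spot(spot_pixels)
print(summary)
```

In this made-up spot, a single artifact pixel dominates the total and the mean, while the median stays close to the typical pixel value. Storing only one summary per spot makes it impossible to switch measures later, which is one argument for keeping the images.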
Units of measurement: intensity ratio for spot i: T_i = R_i/G_i, then take log2(T_i). Correction for background intensity: what is used for background? Normalizing expression? Different normalization for different intensities? (Fig. 16.3 in Bioinf.; see also John Quackenbush's paper in Nature)
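The ratio and its log transform can be sketched as follows. This is one possible treatment under stated assumptions, not a recommended pipeline: the background is a per-spot local estimate, corrected values are clamped to a hypothetical `floor` so the ratio stays finite, and normalization is a simple global median centering that ignores any intensity dependence. All function names and numbers are made up for illustration.

```python
import math
from statistics import median


def log2_ratios(red, green, red_bg, green_bg, floor=1.0):
    """Background-subtract both channels and return log2(R_i / G_i) per spot.

    Corrected intensities at or below zero are clamped to `floor`
    (an assumed choice) so that the ratio stays finite.
    """
    ratios = []
    for r, g, rb, gb in zip(red, green, red_bg, green_bg):
        r_corr = max(r - rb, floor)
        g_corr = max(g - gb, floor)
        ratios.append(math.log2(r_corr / g_corr))
    return ratios


def median_center(log_ratios):
    """Global normalization: shift all log ratios so that their median is 0."""
    m = median(log_ratios)
    return [x - m for x in log_ratios]


# Hypothetical raw intensities and local background estimates for four spots.
red = [900, 500, 2100, 150]
green = [500, 500, 600, 550]
red_bg = [100, 100, 100, 100]
green_bg = [100, 100, 100, 100]

raw = log2_ratios(red, green, red_bg, green_bg)
normalized = median_center(raw)
print(raw)
print(normalized)
```

On the log2 scale a two-fold increase (+1) and a two-fold decrease (-1) are symmetric around 0, which the plain ratio is not. An intensity-dependent normalization would replace `median_center` with a fit of log ratio against intensity.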
- Limits of measurement (log T versus intensity) (Fig. 16.4)
- Limits of significance (Fig. 16.3; see here for similar pictures), e.g., Bonferroni correction (divide the significance cutoff by the number of tests); less severe tools are available. Further validation is needed, even if statistical significance is beyond doubt.
- Hierarchical clustering, k-means, SOMs. Clustering a large number of microarray data sets leads to groups that correspond to functional units (spindle pole, ATP synthesis, mitochondrial ribosome). An example from Eisen et al. (PNAS 95, Issue 25, 14863-14868, 1998) is here. Another good overview (with many of the same figures) is here, in particular the slides on clustering.
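The Bonferroni correction from the list above amounts to a single division. A minimal sketch, with made-up p-values and a hypothetical function name:

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Divide the significance cutoff by the number of tests.

    Returns the corrected cutoff and one True/False flag per test.
    """
    cutoff = alpha / len(p_values)
    return cutoff, [p <= cutoff for p in p_values]


# Hypothetical p-values, one per gene (spot) tested.
p_values = [0.00001, 0.004, 0.03, 0.2, 0.0009]

cutoff, flags = bonferroni_significant(p_values)
print(cutoff)
print(flags)
```

With several thousand spots the corrected cutoff becomes very small (0.05 divided by 5000 is 0.00001), which is why the correction is considered severe.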
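As an illustration of hierarchical clustering of expression profiles, here is a small pure-Python sketch. It assumes average linkage and a distance of 1 minus the Pearson correlation; it is not necessarily the procedure used in the paper cited above. The gene names and profiles are invented, and profiles with zero variance are not handled.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_linkage(profiles, n_clusters):
    """Agglomerative clustering with distance 1 - r and average linkage.

    Repeatedly merges the two clusters with the smallest mean pairwise
    distance until only `n_clusters` clusters remain.
    """
    names = list(profiles)
    dist = {
        (a, b): 1 - pearson(profiles[a], profiles[b])
        for a in names
        for b in names
        if a != b
    }
    clusters = [[name] for name in names]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [(a, b) for a in clusters[i] for b in clusters[j]]
                d = sum(dist[p] for p in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical log-ratio profiles over four conditions.
profiles = {
    "geneA": [0.1, 1.0, 2.1, 3.0],
    "geneB": [0.0, 0.9, 2.0, 3.2],
    "geneC": [3.0, 2.0, 1.1, 0.0],
    "geneD": [2.8, 2.1, 0.9, 0.2],
}

clusters = average_linkage(profiles, 2)
print(clusters)
```

Genes whose expression rises together end up in one cluster and genes whose expression falls together in the other. Recording the order of the merges, rather than stopping at a fixed number of clusters, would give the tree usually drawn next to the expression heat map.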