Reproducibility of Microarray and Gene Expression Analysis: Reference

Key Points

Introduction to Gene Expression
  • Microarray heat maps can be used to compare and identify gene expression patterns.

Introduction to Hierarchical Clustering
  • Hierarchical clustering arranges observations by a continuous dissimilarity measure; the dissimilarity threshold at which the tree is cut determines the number of clusters.
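
To make the role of the threshold concrete, here is a minimal sketch in Python (not the lesson's own code, which uses Cluster 3.0 and R). It assumes single linkage and Euclidean distance, both chosen only for illustration; the function name and the toy expression profiles are hypothetical.

```python
import math
from itertools import combinations

def cluster_by_threshold(profiles, threshold):
    """Agglomerative clustering that stops merging at a dissimilarity threshold.

    Assumptions (illustrative only): single linkage, Euclidean distance.
    """
    # Start with every observation in its own cluster.
    clusters = [[p] for p in profiles]

    def linkage(a, b):
        # Single linkage: distance between the closest members of two clusters.
        return min(math.dist(x, y) for x in a for y in b)

    while len(clusters) > 1:
        # Find the closest pair of clusters (i < j by construction).
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        # Stop once even the closest pair is farther apart than the threshold.
        if linkage(clusters[i], clusters[j]) > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical two-array expression profiles for five genes.
toy_profiles = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]

for cutoff in (2.0, 6.5, 8.0):
    groups = cluster_by_threshold(toy_profiles, cutoff)
    print(cutoff, len(groups), groups)
```

Raising the threshold can only merge clusters, never split them, so the number of clusters falls as the cut moves up the tree.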

Recreating Supporting Figure 6 with TreeView
  • TreeView 3.0 can process the data files distributed with the paper to fully recreate Supporting Figure 6.

Recreating Supporting Figure 6 with Hive Plots
  • Hive plots are intended to represent large network data.

  • The expression profiles for a large number of genes can be visualized using a hive plot.

Recreating the Analysis with Cluster and TreeView
  • Reclustering the original data with Cluster 3.0 does not reproduce Supporting Figure 6.

Recreating the Analysis with R
  • We can use R to write an uncentered correlation function, but the similarity calculations are not identical to those in the original paper.

  • We can use R to create dendrograms that show the clustering of arrays and genes, but the clusters are not identical to those in the original paper.

  • Our replicate heat map is not identical to the original heat map because of differences in clustering and color coding.
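
The uncentered correlation named in the first key point of this section can be sketched as follows. This is an illustrative Python version, not the lesson's R function, and it makes no claim to reproduce the similarity values in the original paper. It assumes the usual definition (the sum of products divided by the product of the root sums of squares, with no mean subtraction) and assumes that missing values, represented here as None, are skipped pairwise. The function name and sample vectors are hypothetical.

```python
import math

def uncentered_correlation(x, y):
    """Uncentered correlation: like Pearson's r, but the means are not subtracted.

    Assumption: positions where either value is None are skipped pairwise.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if not pairs:
        raise ValueError("no positions where both values are present")
    sum_xy = sum(a * b for a, b in pairs)
    sum_xx = sum(a * a for a, _ in pairs)
    sum_yy = sum(b * b for _, b in pairs)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("correlation is undefined for an all-zero vector")
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Hypothetical expression vectors.
gene_a = [1.0, 2.0, 3.0]
gene_b = [2.0, 3.0, 4.0]   # gene_a shifted by a constant offset

r = uncentered_correlation(gene_a, gene_b)
distance = 1.0 - r          # a common way to turn the similarity into a dissimilarity
print(r, distance)
```

Because the means are not removed, a constant offset between two profiles lowers the uncentered correlation, whereas the Pearson correlation of gene_a and gene_b would be exactly 1. Small differences in how offsets and missing values are handled are one plausible source of clustering differences between implementations.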

Citations

M. Madan Babu. “Introduction to microarray data analysis” in Computational Genomics (Ed: R. Grant), Horizon Press, U.K.

Tal Galili (2015). dendextend: an R package for visualizing, adjusting, and comparing trees of hierarchical clustering. Bioinformatics. DOI: 10.1093/bioinformatics/btv428

Renaud Gaujoux, Cathal Seoighe (2010). A flexible R package for nonnegative matrix factorization. BMC Bioinformatics 2010, 11:367. http://www.biomedcentral.com/1471-2105/11/367

Bryan A. Hanson (2016). HiveR: 2D and 3D Hive Plots for R. R package version 0.2.55. academic.depauw.edu/~hanson/HiveR/HiveR.html

M. J. L. de Hoon, S. Imoto, J. Nolan, and S. Miyano: Open Source Clustering Software. Bioinformatics, 20 (9): 1453–1454 (2004).

Schena, Mark. Microarray Analysis. New Jersey: John Wiley & Sons, 2003. Print.

Therese Sørlie, Robert Tibshirani, Joel Parker, Trevor Hastie, J. S. Marron, Andrew Nobel, Shibing Deng, Hilde Johnsen, Robert Pesich, Stephanie Geisler, Janos Demeter, Charles M. Perou, Per E. Lønning, Patrick O. Brown, Anne-Lise Børresen-Dale, and David Botstein. Repeated observation of breast tumor subtypes in independent gene expression data sets. PNAS 2003 100 (14) 8418-8423; published ahead of print June 26, 2003, doi:10.1073/pnas.0932692100

Tarca AL, Romero R, Draghici S. Analysis of microarray experiments of gene expression profiling. American journal of obstetrics and gynecology. 2006;195(2):373-388. doi:10.1016/j.ajog.2006.07.001

Eisen, M. B., Spellman, P. T., Brown, P. O. & Botstein, D. (1998) Proc. Natl. Acad. Sci. USA 95, 14863–14868.