WIT Press

Pleiotropic Microarray Gene Expression Data: Advanced Tandem Multivariate Data Mining


Free (open access)



B. B. Little, E. Barner & A. T. Dobson


Microarray gene expression analysis produces massive amounts of data, and a commonly employed technique is cluster analysis computed from Pearson correlation coefficients. Several more sophisticated multivariate techniques are likewise based upon decomposition of correlation matrices and their resultant numeric products. Principal components analysis (PCA) is one such technique and has been used to analyze microarray data; rotated factor analysis, however, has not been explored extensively. The present analysis uses previously published data on 42,427 genes, drawn from experiments on global microarray gene expression in embryos at three stages of development: days one, two, and three post-fertilization. The previously published analysis used cluster analysis to correctly classify observations by stage/day based on gene expression. Data on 22,561 genes were suitable for further multivariate analysis. In the present investigation, quartimax rotated factor analysis was used to extract five factors that paralleled the cluster analysis, with days of egg and embryo development loading on separate factors. Factor scores were computed for each gene on the five factors and used for modified gene shaving, or singular value decomposition (SVD). This identified supergenes that were responsible for the majority of variance across all five factors. Path analysis of factor scores suggested that five genes might be pleiotropic or regulatory. This proof-of-concept numerical analysis provides a basis for developing multivariate analytical techniques for microarray data that are more sophisticated than cluster analysis and that can evaluate causal paths of pleiotropic control of gene expression.

Keywords: SVD, factor analysis, rotation, path analysis, pleiotropy.
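The tandem steps named in the abstract (factor extraction from a correlation matrix, quartimax rotation, per-gene factor scores, then SVD of the scores) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the data are simulated (500 genes, three days with three replicates each), factors are extracted by principal components, three factors are kept rather than five, scores use the regression method, and genes are simply ranked on the leading singular component as a loose stand-in for modified gene shaving. Path analysis is not shown.

```python
import numpy as np


def quartimax(loadings, max_iter=200, tol=1e-10):
    """Orthogonal quartimax rotation (orthomax family with gamma = 0)."""
    k = loadings.shape[1]
    rotation = np.eye(k)
    previous = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Proportional to the gradient of the quartimax criterion,
        # i.e. the sum of fourth powers of the rotated loadings.
        gradient = loadings.T @ rotated**3
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt  # nearest orthogonal matrix to the gradient
        if previous > 0 and s.sum() < previous * (1 + tol):
            break
        previous = s.sum()
    return loadings @ rotation, rotation


# Simulated expression matrix (hypothetical): rows are genes, columns are arrays.
rng = np.random.default_rng(0)
n_genes, n_days, n_reps = 500, 3, 3
latent = rng.normal(size=(n_genes, n_days))
noise = rng.normal(size=(n_genes, n_days * n_reps))
expression = np.repeat(latent, n_reps, axis=1) + 0.5 * noise

# Decompose the Pearson correlation matrix of the arrays.
corr = np.corrcoef(expression, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1][:n_days]
unrotated = eigvecs[:, order] * np.sqrt(eigvals[order])

# Rotate the retained loadings toward simple structure.
rotated, rotation = quartimax(unrotated)

# Regression-method factor scores: one row of scores per gene.
z = (expression - expression.mean(axis=0)) / expression.std(axis=0, ddof=1)
scores = z @ np.linalg.solve(corr, rotated)

# SVD of the factor-score matrix; rank genes on the leading component.
u, s, vt = np.linalg.svd(scores, full_matrices=False)
variance_share = s**2 / np.sum(s**2)
top_genes = np.argsort(np.abs(u[:, 0] * s[0]))[::-1][:10]

print("Rotated loadings (arrays x factors):")
print(np.round(rotated, 2))
print("Share of score variance per singular component:", np.round(variance_share, 3))
print("Indices of the ten highest-ranked genes:", top_genes)
```

With real data, `expression` would be replaced by the filtered gene-by-array matrix, and the number of retained factors would be chosen from the eigenvalues rather than fixed in advance.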

