Principal Component Analysis (PCA)

Principal component analysis (PCA), also known as principal components analysis, is a statistical technique for simplifying a data set. It was developed by Pearson (1901) and Hotelling (1933); the best modern reference is Jolliffe (2002). The aim of the method is to reduce the dimensionality of multivariate data whilst preserving as much of the relevant information as possible. It is a form of unsupervised learning: it relies entirely on the input data, without reference to any corresponding target data, and the criterion it maximizes is the variance of the projected data.

PCA is a linear transformation of the data to a new coordinate system. The new variables, the principal components, are linear functions of the original variables and are mutually uncorrelated. They are ordered so that the projection of the data with the greatest variance lies on the first coordinate, the projection with the second greatest variance on the second coordinate, and so on. In practice, this is achieved by computing the covariance matrix of the full (mean-centred) data set, computing its eigenvectors and eigenvalues, and sorting them by decreasing eigenvalue; the data are then projected onto the leading eigenvectors, and the dimensionality is reduced by keeping only the first few. Note that PCA's bias towards high-variance directions is not always appropriate: features with low variance may have high predictive relevance, depending on the application.
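The procedure described above (covariance matrix, eigendecomposition, sort by decreasing eigenvalue, project) can be sketched in a few lines of Python with NumPy. This is a minimal illustration and not a reference implementation: the function name `pca`, its parameters, and the synthetic data are assumptions made for the example, and the input is assumed to be a matrix with one observation per row and at least two columns.

```python
import numpy as np

def pca(X, n_components):
    """Illustrative PCA via eigendecomposition of the covariance matrix.

    X is assumed to have shape (n_samples, n_features), n_features >= 2.
    Returns the projected data, the selected eigenvectors, and all
    eigenvalues sorted in decreasing order.
    """
    X = np.asarray(X, dtype=float)
    # Centre each variable on its mean.
    Xc = X - X.mean(axis=0)
    # Covariance matrix of the variables (columns).
    cov = np.cov(Xc, rowvar=False)
    # eigh is appropriate because the covariance matrix is symmetric;
    # it returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Sort eigenpairs by decreasing eigenvalue.
    order = np.argsort(eigvals)[::-1]
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    # Keep the leading eigenvectors and project the centred data onto them.
    components = eigvecs[:, :n_components]
    scores = Xc @ components
    return scores, components, eigvals

# Synthetic example data (hypothetical): three correlated variables.
rng = np.random.default_rng(0)
mixing = np.array([[3.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.0, 0.0, 0.1]])
X = rng.normal(size=(200, 3)) @ mixing

scores, components, eigvals = pca(X, n_components=2)
print("explained variance ratio:", eigvals[:2] / eigvals.sum())
```

In this sketch the columns of `scores` are uncorrelated, and their sample variances equal the two largest eigenvalues. For large or ill-conditioned data sets, a singular value decomposition of the centred data is often preferred to forming the covariance matrix explicitly.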

Book

JOLLIFFE, I.T., 2002. Principal Component Analysis, second edition, New York: Springer-Verlag New York, Inc. [Cited by 2790] (647.34/year)

Bibliography

O'LENIC, E. and R. LIVEZEY, 1988. Practical considerations in the use of rotated principal component analysis (RPCA) in diagnostic …. Monthly Weather Review. [Cited by 46] (2.51/year)
ADLER, N. and B. GOLANY, 2001. … using data envelopment analysis combined with principal component analysis with an application to …. European Journal of Operational Research. [Cited by 36] (6.78/year)
ANDERSEN, A.H., D.M. GASH and M.J. AVISON, 1999. Principal component analysis of the dynamic response measured by fMRI: a generalized linear systems …. Magn Reson Imaging. [Cited by 40] (5.47/year)
ANDERSON, T.W., 1963. Asymptotic Theory for Principal Component Analysis. The Annals of Mathematical Statistics. [Cited by 219] (5.06/year)
BALDI, P. and K. HORNIK, 1989. Neural networks and principal component analysis: learning from examples without local minima. Neural Networks. [Cited by 316] (18.26/year)
BALSERA, M.A., et al., 1996. Principal component analysis and long time protein dynamics. J. Phys. Chem. [Cited by 58] (5.63/year)
BAUMGARTNER, R., et al., 2000. … of two exploratory data analysis methods for fMRI: fuzzy clustering vs. principal component analysis …. Magn Reson Imaging. [Cited by 47] (7.45/year)
BAZEN, A.M. and S.H. GEREZ, …. … field computation for fingerprints based on the principal component analysis of local gradients. Proceedings of ProRISC2000, 11th Annual Workshop on Circuits …. [Cited by 36] (?/year)
BECKWITH-HALL, B.M., et al., 1998. … magnetic resonance spectroscopic and principal components analysis investigations into biochemical …. Chemical Research in Toxicology. [Cited by 57] (6.86/year)
BERG, T., et al., 1995. Atmospheric trace element deposition: Principal component analysis of ICP-MS data from moss samples. Environ Pollut. [Cited by 37] (3.27/year)
BEUZEN, A. and C. BELZUNG, 1995. Link between emotional memory and anxiety states: a study by principal component analysis. Physiol Behav. [Cited by 57] (5.04/year)
BRYANT, F.B. and P.R. YARNOLD, 1995. Principal-components analysis and exploratory and confirmatory factor analysis. Reading and understanding multivariate statistics. [Cited by 110] (9.73/year)
BYRNE, G.F., P.F. CRAPPER and K.K. MAYO, 1980. Monitoring Land-Cover Change by Principal Component Analysis of Multitemporal Landsat Data. Remote Sensing Environ. [Cited by 65] (2.47/year)
CALDER, A.J., et al., 2001. A principal component analysis of facial expressions. Vision Research. [Cited by 53] (9.98/year)
CHAPMAN, R.M. and J.W. MCCRARY, 1995. EP component identification and measurement by principal components analysis. Brain Cogn. [Cited by 59] (5.22/year)
CHAVEZ, P., 1989. … spectral contrast in Landsat Thematic Mapper image data using selective principal component analysis. Photogrammetric Engineering and Remote Sensing. [Cited by 68] (3.93/year)
CICHOCKI, A., W. KASPRZAK and W. SKARBEK, 1996. Adaptive learning algorithm for principal component analysis with partial data. Proc. Cybernetics and Systems. [Cited by 27] (2.62/year)
COLLINS, M., S. DASGUPTA and R.E. SCHAPIRE, 2002. A generalization of principal components analysis to the exponential family. Advances in Neural Information Processing Systems. [Cited by 45] (10.44/year)
CRITCHLEY, F., 1985. Influence in principal components analysis. Biometrika. [Cited by 57] (2.67/year)
CROUX, C. and G. HAESBROECK, 2000. Principal component analysis based on robust estimators of the covariance or correlation matrix: …. Biometrika. [Cited by 61] (9.67/year)
DAUXOIS, J., A. POUSSE and Y. ROMAIN, 1982. Asymptotic Theory for the Principal Component Analysis of a Vector Random Function: Some …. J. MULTIVARIATE ANALY. [Cited by 63] (2.59/year)
DE LA TORRE, F. and M.J. BLACK. Robust principal component analysis for computer vision. ICCV'01. [Cited by 72] (?/year)
DE AMBROGGI, L., et al., 1997. … in Patients With Arrhythmogenic Right Ventricular Dysplasia Principal Component Analysis of the ST- …. Circulation. [Cited by 41] (4.40/year)
DONG, D. and T.J. MCAVOY, 1994. Nonlinear principal component analysis-based on principal curves and neural networks. American Control Conference. [Cited by 125] (10.15/year)
DONG, D. and T.J. MCAVOY, 1996. Batch tracking via nonlinear principal component analysis. AIChE Journal. [Cited by 36] (3.49/year)
DOTY, R.L., et al., 1994. Tests of human olfactory function: principal components analysis suggests that most measure a common …. Percept Psychophys. [Cited by 36] (2.92/year)
DUCHENE, J. and S. LECLERCQ, 1988. An optimal transformation for discriminant and principal component analysis. IEEE Transactions on Pattern Analysis and Machine …. [Cited by 60] (3.28/year)
DUNIA, R. and S.J. QIN, 1998. Joint diagnosis of process and sensor faults using principal component analysis. Control Engineering Practice. [Cited by 41] (4.93/year)
DUNTEMAN, G.H., 1989. Principal Components Analysis. books.google.com. [Cited by 327] (18.89/year)
EASTMENT, H.T. and W.J. KRZANOWSKI, 1982. Cross-Validatory Choice of the Number of Components from a Principal Component Analysis. Technometrics. [Cited by 45] (1.85/year)
FUNG, T. and E. LEDREW, 1987. Application of principal components analysis to change detection. Photogrammetric Engineering and Remote Sensing. [Cited by 106] (5.49/year)
GABRIEL, K.R., 1971. The biplot graphic display of matrices with application to principal component analysis. Biometrika. [Cited by 272] (7.70/year)
GABRIEL, K.R., 1971. The biplot graphical display of matrices with applications to principal component analysis. Biometrika. [Cited by 58] (1.64/year)
GERTLER, J., et al., 1999. Isolation enhanced principal component analysis. AIChE Journal. [Cited by 48] (6.57/year)
HANCOCK, P.J.B., A.M. BURTON and V. BRUCE…, 1996. Face processing: Human perception and principal components analysis. Memory and Cognition. [Cited by 75] (7.27/year)
HAYWARD, S., 1994. … anharmonic aspects in the dynamics of BPTI: A normal mode analysis and principal component analysis. Protein Science. [Cited by 33] (2.68/year)
HOLDEN, H. and E. LEDREW, 1998. … Healthy Corals Based on Cluster Analysis, Principal Components Analysis, and Derivative Spectroscopy. Remote Sensing of Environment. [Cited by 47] (5.66/year)
HOREL, J.D., 1981. A rotated principal component analysis of the interannual variability of the Northern Hemisphere 500 …. Monthly Weather Review. [Cited by 128] (5.06/year)
HOREL, J.D. Complex Principal Component Analysis: Theory and Examples. Journal of Applied Meteorology. [Cited by 75] (?/year)
HOTELLING, H., 1933. Analysis of a Complex of Statistical Variables Into Principal Components, Journal of Educational Psychology, volume 24, pages 417-441 and 498-520. [Cited by 804] (10.97/year)
HSIEH, W.W., 2001. Nonlinear principal component analysis by neural networks. Tellus A. [Cited by 42] (7.91/year)
HU, S. and S.M. WU, 1992. Identifying Root Causes of Variation in Automobile Body Assembly Using Principal Component Analysis. Transactions of NAMRI. [Cited by 49] (3.42/year)
IMAI, F.H., et al., 1996. Principal component analysis of skin color and its application to colorimetric color reproduction on …. J. Imaging Sci. Technol. [Cited by 28] (2.72/year)
JACKSON, D.A., 1993. Stopping Rules in Principal Components Analysis: A Comparison of Heuristical and Statistical …. Ecology. [Cited by 214] (16.08/year)
JACKSON, J.E. and G.S. MUDHOLKAR, 1979. Control Procedures for Residuals Associated with Principal Component Analysis. Technometrics. [Cited by 147] (5.38/year)
JEFFERS, J.N.R., 1967. Two Case Studies in the Application of Principal Component Analysis. Applied Statistics. [Cited by 37] (0.94/year)
JOHNSTONE, I.M., 2001. On the Distribution of the Largest Eigenvalue in Principal Components Analysis. The Annals of Statistics. [Cited by 123] (23.16/year)
JOLICOEUR, P. and J.E. MOSIMANN, 1960. Size and shape variation in the painted turtle. A principal component analysis. Growth. [Cited by 61] (1.32/year)
JOLLIFFE, I.T. and B.J. MORGAN, 1992. Principal component analysis and exploratory factor analysis. Stat Methods Med Res. [Cited by 38] (2.66/year)
JOLLIFFE, I.T., 1986. Principal Component Analysis. [Cited by 206] (10.14/year)
JOLLIFFE, I.T., 1972. Discarding Variables in a Principal Component Analysis. I: Artificial Data. Applied Statistics. [Cited by 113] (3.29/year)
JOLLIFFE, I.T., 1973. Discarding Variables in a Principal Component Analysis. II: Real Data. Applied Statistics. [Cited by 50] (1.50/year)
JOLLIFFE, I.T., 2002. Principal Component Analysis. books.google.com. [Cited by 2790] (647.34/year)
KAMBHATLA, N., 1997. Dimension Reduction by Local Principal Component Analysis. [Cited by 103] (11.06/year)
KARGUPTA, H., et al., 2001. Distributed Clustering Using Collective Principal Component Analysis. Knowledge and Information Systems. [Cited by 40] (7.53/year)
KARHUNEN, J. and J. JOUTSENSALO, 1995. Generalizations of principal component analysis, optimization problems, and neural networks. Neural Networks. [Cited by 106] (9.37/year)
KIM, K.I., K. JUNG and H.J. KIM, 2002. Face recognition using kernel principal component analysis. Signal Processing Letters, IEEE. [Cited by 56] (12.99/year)
KISTLER, D.J. and F.L. WIGHTMAN, 1992. A model of head-related transfer functions based on principal components analysis and minimum-phase …. The Journal of the Acoustical Society of America. [Cited by 107] (7.48/year)
KOSANOVICH, K.A., K.S. DAHL and M.J. PIOVOSO, 1996. Improved process understanding using multiway principal component analysis. Ind. Eng. Chem. Res. [Cited by 46] (4.46/year)
KRAMER, M.A., 1991. Nonlinear principal component analysis using autoassociative neural networks. AIChE Journal. [Cited by 365] (23.84/year)
KROONENBERG, P.M. and J. DE LEEUW, 1980. Principal component analysis of three-mode data by means of alternating least squares algorithms. Psychometrika. [Cited by 110] (4.18/year)
KROONENBERG, P.M., 1983. Three-mode Principal Component Analysis: Theory and Applications. books.google.com. [Cited by 132] (5.66/year)
KRZANOWSKI, W.J., 1987. Cross-Validation in Principal Component Analysis. Biometrics. [Cited by 33] (1.71/year)
LI, X., 1998. Principal component analysis of stacked multi-temporal images for the monitoring of rapid urban …. International Journal of Remote Sensing. [Cited by 39] (4.69/year)
LONGMAN, R.S., et al., 1989. … equation for the parallel analysis criterion in principal components analysis: Mean and 95th …. Multivariate Behavioral Research. [Cited by 53] (3.06/year)
LOUGHLIN, W.P., 1991. Principal component analysis for alteration mapping (in remote sensing). Photogrammetric Engineering and Remote Sensing. [Cited by 55] (3.59/year)
MAIER, J., et al., 1987. Principal components analysis for source localization of VEPs in man. Vision Res. [Cited by 79] (4.09/year)
MEGLEN, R.R., 1991. Examining large databases: A chemometric approach using principal component analysis. J. Chemometr. [Cited by 38] (2.48/year)
MOORE, B.C., Principal Component Analysis in Linear Systems: Controllability, Observability, and Model Reduction. ieeexplore.ieee.org. [Cited by 854] (?/year)
NG, R.T. and A. SEDIGHIAN, 1996. … multidimensional indexing structures for images transformed by principal component analysis. Proceedings of SPIE. [Cited by 33] (3.20/year)
NOMIKOS, P. and J.F. MACGREGOR, 1994. Monitoring batch processes using multiway principal component analysis. AIChE Journal. [Cited by 240] (19.50/year)
PEARSON, Karl, 1901. On lines and planes of closest fit to systems of points in space, Philosophical Magazine, Series 6, vol. 2, no. 11, pp. 559-572. [Cited by 383] (3.64/year)
PETRONI, A. and M. BRAGLIA, 2000. Vendor selection using principal component analysis. Journal of Supply Chain Management. [Cited by 38] (6.02/year)
PREISENDORFER, R.W. and C.D. MOBLEY, 1988. Principal component analysis in meteorology and oceanography. Elsevier Amsterdam. [Cited by 465] (25.40/year)
RAO, C.R., 1964. The use and interpretation of principal component analysis in applied research. Sankhya A. [Cited by 201] (4.75/year)
RASKIN, R. and H. TERRY, 1988. A principal-components analysis of the Narcissistic Personality Inventory and further evidence of …. J Pers Soc Psychol. [Cited by 145] (7.92/year)
RAYCHAUDHURI, S., J.M. STUART and R.B. ALTMAN, 2000. Principal components analysis to summarize microarray experiments: application to sporulation time …. Pac Symp Biocomput. [Cited by 239] (37.88/year)
RONEN, S., A. ARAGON-SALAMANCA and O. LAHAV, 1999. Principal component analysis of synthetic galaxy spectra. Monthly Notices of the Royal Astronomical Society. [Cited by 40] (5.47/year)
RUBNER, J. and P. TAVAN, 1989. A self-organizing network for principal-component analysis. Europhysics Letters. [Cited by 102] (5.89/year)
SCHOLKOPF, B., A. SMOLA and K.R. MULLER, 1999. Kernel principal component analysis. Advances in Kernel Methods-Support Vector Learning. [Cited by 149] (20.38/year)
SHUM, H.Y., K. IKEUCHI and R. REDDY, 1995. Principal component analysis with missing data and its application to polyhedral object modeling. IEEE Transactions on Pattern Analysis and Machine …. [Cited by 94] (8.31/year)
SILVERMAN, B.W., 1996. Smoothed Functional Principal Components Analysis by Choice of Norm. The Annals of Statistics. [Cited by 57] (5.53/year)
SMITH, M.O., P.E. JOHNSON and J.B. ADAMS, 1985. … of mineral types and abundances from reflectance spectra using principal components analysis. Lunar and Planetary Institute, NASA, American Geophysical …. [Cited by 72] (3.38/year)
SOMERS, K.M., 1986. Multivariate Allometry and Removal of Size with Principal Components Analysis. Systematic Zoology. [Cited by 45] (2.22/year)
STEPPAN, S.J., 1997. … Structure. I. Contrasting Results from Matrix Correlation and Common Principal Component Analysis. Evolution. [Cited by 43] (4.62/year)
STROTHER, S.C., et al., 1995. Principal component analysis and the scaled subprofile model compared to intersubject averaging and …. J Cereb Blood Flow Metab. [Cited by 54] (4.77/year)
STROTHER, S.C., I. KANNO and D.A. ROTTENBERG, 1995. Commentary and opinion: I. Principal component analysis, variance partitioning, and "functional …. J Cereb Blood Flow Metab. [Cited by 36] (3.18/year)
SU, C.T., 1997. Multi-response robust design by principal component analysis. Total Quality Management. [Cited by 34] (3.65/year)
TER BRAAK, C.J.F., … detrended][canonical] correspondence analysis, principal components analysis and redundancy analysis …. Groep Landbouwwiskunde. [Cited by 324] (?/year)
TER BRAAK, C.J.F., 1987. … detrended][canonical] correspondence analysis, principal components analysis and redundancy analysis. TNO Institute of Applied Computer Science, Wageningen. [Cited by 168] (8.70/year)
TIPPING, M., 2001. Sparse kernel principal component analysis. Advances in Neural Information Processing Systems. [Cited by 37] (6.97/year)
TIPPING, M.E. and C.M. BISHOP, Probabilistic principal component analysis. ingentaconnect.com. [Cited by 269] (?/year)
TONG, H. and C.M. CROWE, 1995. Detection of gross errors in data reconciliation by principal component analysis. AIChE Journal. [Cited by 47] (4.16/year)
WALL, M.E., A. RECHTSTEINER and L.M. ROCHA, Singular value decomposition and principal component analysis. arxiv.org. [Cited by 43] (?/year)
KU, W., R.H. STORER and C. GEORGAKIS, 1995. Disturbance Detection and Isolation by Dynamic Principal Component Analysis. Chemometrics and Intelligent Laboratory Systems. [Cited by 120] (10.61/year)
WIDAMAN, K.F., 1993. Common Factor Analysis Versus Principal Component Analysis: Differential Bias in Representing Model …. Multivariate Behavioral Research. [Cited by 50] (3.76/year)
WOLD, S., 1978. Cross validatory estimation of the number of components in factor and principal component analysis. Technometrics. [Cited by 42] (1.48/year)
WOLD, S., 1987. Principal component analysis. Chemometrics and Intelligent Laboratory Systems. [Cited by 418] (21.65/year)
WOOD, C.C. and G. MCCARTHY, 1984. Principal component analysis of event-related potentials: simulation studies demonstrate …. Electroencephalogr Clin Neurophysiol. [Cited by 40] (1.79/year)
XU, L. and A.L. YUILLE, 1995. Robust Principal Component Analysis by Self-organizing Rules Based on Statistical Physics Approach. IEEE Transactions on Neural Networks. [Cited by 61] (5.39/year)
YEUNG, K.Y. and W.L. RUZZO, 2001. Principal component analysis for clustering gene expression data. Bioinformatics. [Cited by 154] (29.00/year)
ZITKO, V., 1994. Principal component analysis in the evaluation of environmental data. Marine Pollution Bulletin. [Cited by 35] (2.84/year)
