Nonlinear principal component analysis of noisy data. 2007

William W Hsieh
Department of Earth and Ocean Sciences, University of British Columbia, Vancouver, BC, Canada. whsieh@eos.ubc.ca

With very noisy data, having plentiful samples eliminates overfitting in nonlinear regression, but not in nonlinear principal component analysis (NLPCA). To overcome this problem in NLPCA, a new information criterion (IC) is proposed for selecting the best model among multiple models of differing complexity and regularization (i.e. weight penalty). This IC gauges the inconsistency I between the nonlinear principal components (u and ũ) for every data point x and its nearest neighbour x̃, with I = 1 - correlation(u, ũ), where I tends to increase with overfitted solutions. Tests were performed using autoassociative neural networks for NLPCA on synthetic and real climate data (tropical Pacific sea surface temperatures and equatorial stratospheric winds), with the IC performing well in model selection and in deciding between an open-curve and a closed-curve solution.
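The inconsistency index described in the abstract can be illustrated with a minimal Python sketch. It assumes the nonlinear principal component u has already been computed for every sample by some NLPCA model, and it uses a brute-force Euclidean nearest-neighbour search. The function name inconsistency_index and the synthetic data are hypothetical, chosen for illustration only; this is not the author's implementation, and it computes only the quantity I = 1 - correlation(u, ũ) stated above.

```python
import numpy as np

def inconsistency_index(X, u):
    """Return I = 1 - corr(u, u_tilde), where u_tilde[i] is the
    component value of the nearest neighbour of sample X[i].

    X : (n_samples, n_features) data array
    u : (n_samples,) nonlinear principal component, one value per sample
    """
    X = np.asarray(X, dtype=float)
    u = np.asarray(u, dtype=float)
    # Pairwise squared Euclidean distances (brute force; fine for small n).
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(d2, np.inf)      # a point is not its own neighbour
    nearest = np.argmin(d2, axis=1)   # index of each point's nearest neighbour
    u_tilde = u[nearest]
    return 1.0 - np.corrcoef(u, u_tilde)[0, 1]

# Illustrative synthetic data (an assumption): noisy samples along a parabola.
rng = np.random.default_rng(0)
t = rng.uniform(-1.0, 1.0, 200)
X = np.column_stack([t, t**2]) + 0.05 * rng.standard_normal((200, 2))

# A smooth component varies little between neighbours, so I is small.
i_smooth = inconsistency_index(X, X[:, 0])
# A component unrelated to position mimics an extreme overfit, so I is large.
i_random = inconsistency_index(X, rng.standard_normal(200))
```

In this toy setting the smooth component gives an index close to 0, while the unrelated component gives an index close to 1, matching the statement that I tends to increase with overfitted solutions.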

UI | MeSH Term | Description | Entries
D009622 | Noise | Any sound which is unwanted or interferes with HEARING other sounds. | Noise Pollution, Noises, Pollution, Noise
D012044 | Regression Analysis | Procedures for finding the mathematical function which best describes the relationship between a dependent variable and one or more independent variables. In linear regression (see LINEAR MODELS) the relationship is constrained to be a straight line and LEAST-SQUARES ANALYSIS is used to determine the best fit. In logistic regression (see LOGISTIC MODELS) the dependent variable is qualitative rather than continuously variable and LIKELIHOOD FUNCTIONS are used to find the best relationship. In multiple regression, the dependent variable is considered to depend on more than a single independent variable. | Regression Diagnostics, Statistical Regression, Analysis, Regression, Analyses, Regression, Diagnostics, Regression, Regression Analyses, Regression, Statistical, Regressions, Statistical, Statistical Regressions
D003627 | Data Interpretation, Statistical | Application of statistical procedures to analyze specific observed or assumed facts from a particular study. | Data Analysis, Statistical, Data Interpretations, Statistical, Interpretation, Statistical Data, Statistical Data Analysis, Statistical Data Interpretation, Analyses, Statistical Data, Analysis, Statistical Data, Data Analyses, Statistical, Interpretations, Statistical Data, Statistical Data Analyses, Statistical Data Interpretations
D016571 | Neural Networks, Computer | A computer architecture, implementable in either hardware or software, modeled after biological neural networks. Like the biological system in which the processing capability is a result of the interconnection strengths between arrays of nonlinear processing nodes, computerized neural networks, often called perceptrons or multilayer connectionist models, consist of neuron-like units. A homogeneous group of units makes up a layer. These networks are good at pattern recognition. They are adaptive, performing tasks by example, and thus are better for decision-making than are linear learning machines or cluster analysis. They do not require explicit programming. | Computational Neural Networks, Connectionist Models, Models, Neural Network, Neural Network Models, Neural Networks (Computer), Perceptrons, Computational Neural Network, Computer Neural Network, Computer Neural Networks, Connectionist Model, Model, Connectionist, Model, Neural Network, Models, Connectionist, Network Model, Neural, Network Models, Neural, Network, Computational Neural, Network, Computer Neural, Network, Neural (Computer), Networks, Computational Neural, Networks, Computer Neural, Networks, Neural (Computer), Neural Network (Computer), Neural Network Model, Neural Network, Computational, Neural Network, Computer, Neural Networks, Computational, Perceptron
D017711 | Nonlinear Dynamics | The study of systems which respond disproportionately (nonlinearly) to initial conditions or perturbing stimuli. Nonlinear systems may exhibit "chaos" which is classically characterized as sensitive dependence on initial conditions. Chaotic systems, while distinguished from more ordered periodic systems, are not random. When their behavior over time is appropriately displayed (in "phase space"), constraints are evident which are described by "strange attractors". Phase space representations of chaotic systems, or strange attractors, usually reveal fractal (FRACTALS) self-similarity across time scales. Natural, including biological, systems often display nonlinear dynamics and chaos. | Chaos Theory, Models, Nonlinear, Non-linear Dynamics, Non-linear Models, Chaos Theories, Dynamics, Non-linear, Dynamics, Nonlinear, Model, Non-linear, Model, Nonlinear, Models, Non-linear, Non linear Dynamics, Non linear Models, Non-linear Dynamic, Non-linear Model, Nonlinear Dynamic, Nonlinear Model, Nonlinear Models, Theories, Chaos, Theory, Chaos
D025341 | Principal Component Analysis | Mathematical procedure that transforms a number of possibly correlated variables into a smaller number of uncorrelated variables called principal components. | Analyses, Principal Component, Analysis, Principal Component, Principal Component Analyses
