Gene selection algorithms for microarray data based on least squares support vector machine. 2006

E Ke Tang, P N Suganthan, and Xin Yao
School of Electrical & Electronic Engineering, Nanyang Technological University, Singapore. tangke@pmail.ntu.edu.sg

BACKGROUND: In discriminant analysis of microarray data, a small number of samples is typically described by a large number of genes. Conducting the discriminant analysis with all the genes is not only difficult but also unnecessary. Hence, gene selection is usually performed to identify the important genes.

RESULTS: A gene selection method searches for an optimal or near-optimal subset of genes with respect to a given evaluation criterion. In this paper, we propose a new evaluation criterion, named the leave-one-out calculation (LOOC; a list of abbreviations appears just above the list of references) measure. A gene selection method, named the leave-one-out calculation sequential forward selection (LOOCSFS) algorithm, is then presented by combining the LOOC measure with the sequential forward selection scheme. Further, a novel gene selection algorithm, the gradient-based leave-one-out gene selection (GLGS) algorithm, is also proposed. Both gene selection algorithms originate from an efficient and exact calculation of the leave-one-out cross-validation error of the least squares support vector machine (LS-SVM). The proposed approaches are applied to two microarray datasets and compared with other well-known gene selection methods using code available from the second author.

CONCLUSIONS: The proposed gene selection approaches can provide gene subsets that lead to more accurate classification results, while their computational complexity is comparable to that of existing methods. The GLGS algorithm also scales better to datasets with a very large number of genes.
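The abstract does not spell out the LOOC measure, so the following is only a minimal sketch of the two ingredients it names: an exact leave-one-out calculation for an LS-SVM, and a sequential forward selection loop driven by it. It assumes a linear kernel, labels coded as +1/-1, the standard LS-SVM linear system, and the well-known closed-form identity for leave-one-out residuals of least-squares kernel models. The function names (`lssvm_loo_predictions`, `loo_error`, `forward_select`), the `gamma` regularization parameter, and the use of the plain leave-one-out error count as the selection criterion are all illustrative assumptions, not the authors' definitions.

```python
import numpy as np

def lssvm_loo_predictions(X, y, gamma=1.0):
    """Exact leave-one-out outputs of an LS-SVM from a single fit.

    Assumptions (not from the abstract): linear kernel, labels in {+1, -1},
    and the standard LS-SVM system  [[0, 1^T], [1, K + I/gamma]] [b; a] = [0; y].
    """
    n = X.shape[0]
    K = X @ X.T                                # linear kernel (assumed)
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = 1.0
    M[1:, 0] = 1.0
    M[1:, 1:] = K + np.eye(n) / gamma
    M_inv = np.linalg.inv(M)
    sol = M_inv @ np.concatenate(([0.0], y))   # sol[0] is b, sol[1:] is alpha
    alpha = sol[1:]
    # Closed-form identity: y_i - f^(-i)(x_i) = alpha_i / (M^-1)_ii
    return y - alpha / np.diag(M_inv)[1:]

def loo_error(X, y, gamma=1.0):
    """Number of leave-one-out misclassifications (stand-in criterion)."""
    f = lssvm_loo_predictions(X, y, gamma)
    return int(np.sum(y * f <= 0))

def forward_select(X, y, n_genes, gamma=1.0):
    """Greedy sequential forward selection using the LOO error count."""
    selected = []
    remaining = list(range(X.shape[1]))
    for _ in range(n_genes):
        scores = [loo_error(X[:, selected + [g]], y, gamma) for g in remaining]
        best = remaining[int(np.argmin(scores))]  # ties go to the lowest index
        selected.append(best)
        remaining.remove(best)
    return selected
```

Calling `forward_select(X, y, n_genes=10)` on a samples-by-genes matrix `X` would return ten column indices. Each candidate subset costs one matrix inversion of size (n+1) rather than n separate retrainings, which is what makes a leave-one-out criterion affordable inside a greedy search when the number of samples n is small.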

UI | MeSH Term | Description | Entries
D010363 | Pattern Recognition, Automated | In INFORMATION RETRIEVAL, machine-sensing or identification of visible patterns (shapes, forms, and configurations). (Harrod's Librarians' Glossary, 7th ed) | Automated Pattern Recognition,Pattern Recognition System,Pattern Recognition Systems
D000465 | Algorithms | A procedure consisting of a sequence of algebraic formulas and/or logical steps to calculate or determine a given task. | Algorithm
D001185 | Artificial Intelligence | Theory and development of COMPUTER SYSTEMS which perform tasks that normally require human intelligence. Such tasks may include speech recognition, LEARNING; VISUAL PERCEPTION; MATHEMATICAL COMPUTING; reasoning, PROBLEM SOLVING, DECISION-MAKING, and translation of language. | AI (Artificial Intelligence),Computer Reasoning,Computer Vision Systems,Knowledge Acquisition (Computer),Knowledge Representation (Computer),Machine Intelligence,Computational Intelligence,Acquisition, Knowledge (Computer),Computer Vision System,Intelligence, Artificial,Intelligence, Computational,Intelligence, Machine,Knowledge Representations (Computer),Reasoning, Computer,Representation, Knowledge (Computer),System, Computer Vision,Systems, Computer Vision,Vision System, Computer,Vision Systems, Computer
D012680 | Sensitivity and Specificity | Binary classification measures to assess test results. Sensitivity or recall rate is the proportion of true positives. Specificity is the probability of correctly determining the absence of a condition. (From Last, Dictionary of Epidemiology, 2d ed) | Specificity,Sensitivity,Specificity and Sensitivity
D015203 | Reproducibility of Results | The statistical reproducibility of measurements (often in a clinical context), including the testing of instrumentation or techniques to obtain reproducible results. The concept includes reproducibility of physiological measurements, which may be used to develop rules to assess probability or prognosis, or response to a stimulus; reproducibility of occurrence of a condition; and reproducibility of experimental results. | Reliability and Validity,Reliability of Result,Reproducibility Of Result,Reproducibility of Finding,Validity of Result,Validity of Results,Face Validity,Reliability (Epidemiology),Reliability of Results,Reproducibility of Findings,Test-Retest Reliability,Validity (Epidemiology),Finding Reproducibilities,Finding Reproducibility,Of Result, Reproducibility,Of Results, Reproducibility,Reliabilities, Test-Retest,Reliability, Test-Retest,Result Reliabilities,Result Reliability,Result Validities,Result Validity,Result, Reproducibility Of,Results, Reproducibility Of,Test Retest Reliability,Validity and Reliability,Validity, Face
D016018 | Least-Squares Analysis | A principle of estimation in which the estimates of a set of parameters in a statistical model are those quantities minimizing the sum of squared differences between the observed values of a dependent variable and the values predicted by the model. | Rietveld Refinement,Analysis, Least-Squares,Least Squares,Analyses, Least-Squares,Analysis, Least Squares,Least Squares Analysis,Least-Squares Analyses,Refinement, Rietveld
D020411 | Oligonucleotide Array Sequence Analysis | Hybridization of a nucleic acid sample to a very large set of OLIGONUCLEOTIDE PROBES, which have been attached individually in columns and rows to a solid support, to determine a BASE SEQUENCE, or to detect variations in a gene sequence, GENE EXPRESSION, or for GENE MAPPING. | DNA Microarrays,Gene Expression Microarray Analysis,Oligonucleotide Arrays,cDNA Microarrays,DNA Arrays,DNA Chips,DNA Microchips,Gene Chips,Oligodeoxyribonucleotide Array Sequence Analysis,Oligonucleotide Microarrays,Sequence Analysis, Oligonucleotide Array,cDNA Arrays,Array, DNA,Array, Oligonucleotide,Array, cDNA,Arrays, DNA,Arrays, Oligonucleotide,Arrays, cDNA,Chip, DNA,Chip, Gene,Chips, DNA,Chips, Gene,DNA Array,DNA Chip,DNA Microarray,DNA Microchip,Gene Chip,Microarray, DNA,Microarray, Oligonucleotide,Microarray, cDNA,Microarrays, DNA,Microarrays, Oligonucleotide,Microarrays, cDNA,Microchip, DNA,Microchips, DNA,Oligonucleotide Array,Oligonucleotide Microarray,cDNA Array,cDNA Microarray
D020869 | Gene Expression Profiling | The determination of the pattern of genes expressed at the level of GENETIC TRANSCRIPTION, under specific circumstances or in a specific cell. | Gene Expression Analysis,Gene Expression Pattern Analysis,Transcript Expression Analysis,Transcriptome Profiling,Transcriptomics,mRNA Differential Display,Gene Expression Monitoring,Transcriptome Analysis,Analyses, Gene Expression,Analyses, Transcript Expression,Analyses, Transcriptome,Analysis, Gene Expression,Analysis, Transcript Expression,Analysis, Transcriptome,Differential Display, mRNA,Differential Displays, mRNA,Expression Analyses, Gene,Expression Analysis, Gene,Gene Expression Analyses,Gene Expression Monitorings,Gene Expression Profilings,Monitoring, Gene Expression,Monitorings, Gene Expression,Profiling, Gene Expression,Profiling, Transcriptome,Profilings, Gene Expression,Profilings, Transcriptome,Transcript Expression Analyses,Transcriptome Analyses,Transcriptome Profilings,mRNA Differential Displays
