The application of rule-based methods to class prediction problems in genomics. 2003

George Michailidis and Kerby Shedden
Department of Statistics, University of Michigan, Ann Arbor, MI 48109-1027, USA.

We propose a method for constructing classifiers using logical combinations of elementary rules. The method is a form of rule-based classification, which has been widely discussed in the literature. In this work we focus specifically on issues that arise in the context of classifying cell samples based on RNA or protein expression measurements. The basic idea is to specify elementary rules that exhibit a locally strong pattern in favor of a single class. Strict admissibility criteria are imposed to produce a manageable universe of elementary rules. Then the elementary rules are combined using a set covering algorithm to form a composite rule that achieves a perfect fit to the training data. The user has explicit control over a parameter that determines the composite rule's level of redundancy and parsimony. This built-in control, along with the simplicity of interpreting the rules, makes the method particularly useful for classification problems in genomics. We demonstrate the new method using several microarray datasets and examine its generalization performance. We also draw comparisons to other machine-learning strategies such as CART, ID3, and C4.5.
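
The pipeline outlined in the abstract (enumerate admissible elementary rules, then combine them with a set covering step) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that an elementary rule is a threshold on a single expression measurement, that a rule is admissible when it fires on at least `min_support` training samples all belonging to one class, and that the redundancy parameter is the number of times each training sample should be covered. The names `Rule`, `elementary_rules`, `greedy_cover`, `predict`, `min_support`, and `redundancy` are hypothetical, and the greedy heuristic is a standard stand-in for whatever set covering algorithm the paper uses.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    feature: int      # index of the expression measurement (assumed rule form)
    threshold: float
    above: bool       # True: fires when x[feature] >= threshold; False: when below
    label: int        # the single class this rule votes for

    def fires(self, x):
        return (x[self.feature] >= self.threshold) == self.above


def elementary_rules(X, y, min_support=2):
    """Enumerate threshold rules that fire on >= min_support samples of one class."""
    rules = []
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for above in (True, False):
                probe = Rule(j, t, above, 0)
                hit = [i for i, row in enumerate(X) if probe.fires(row)]
                labels = {y[i] for i in hit}
                # Assumed admissibility criterion: enough support, one class only.
                if len(hit) >= min_support and len(labels) == 1:
                    rule = probe._replace(label=labels.pop())
                    rules.append((rule, frozenset(hit)))
    return rules


def greedy_cover(rules, n_samples, redundancy=1):
    """Greedily pick rules until each sample is covered `redundancy` times, if possible."""
    need = [redundancy] * n_samples
    chosen, pool = [], list(rules)

    def gain(item):
        # Number of still-under-covered samples this rule would help with.
        return sum(1 for i in item[1] if need[i] > 0)

    while any(need) and pool:
        best = max(pool, key=gain)
        if gain(best) == 0:
            break  # remaining rules cover nothing new
        pool.remove(best)
        chosen.append(best[0])
        for i in best[1]:
            need[i] = max(0, need[i] - 1)
    return chosen


def predict(chosen, x):
    """Majority vote among the rules that fire; None if no rule fires."""
    votes = {}
    for rule in chosen:
        if rule.fires(x):
            votes[rule.label] = votes.get(rule.label, 0) + 1
    return max(votes, key=votes.get) if votes else None


# Toy data (made up for illustration): 4 samples, 2 measurements, 2 classes.
X = [[1.0, 5.0], [2.0, 4.0], [8.0, 5.5], [9.0, 4.5]]
y = [0, 0, 1, 1]
composite = greedy_cover(elementary_rules(X, y), len(X), redundancy=1)
print([predict(composite, x) for x in X])
```

In this sketch, raising `redundancy` asks for more rules per sample (a less parsimonious composite rule), while `redundancy=1` stops as soon as every training sample is covered once.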

MeSH terms (columns: UI, MeSH Term, Description, Entries)
D008223 Lymphoma A general term for various neoplastic diseases of the lymphoid tissue. Germinoblastoma,Lymphoma, Malignant,Reticulolymphosarcoma,Sarcoma, Germinoblastic,Germinoblastic Sarcoma,Germinoblastic Sarcomas,Germinoblastomas,Lymphomas,Lymphomas, Malignant,Malignant Lymphoma,Malignant Lymphomas,Reticulolymphosarcomas,Sarcomas, Germinoblastic
D008957 Models, Genetic Theoretical representations that simulate the behavior or activity of genetic processes or phenomena. They include the use of mathematical equations, computers, and other electronic equipment. Genetic Models,Genetic Model,Model, Genetic
D011506 Proteins Linear POLYPEPTIDES that are synthesized on RIBOSOMES and may be further modified, crosslinked, cleaved, or assembled into complex proteins with several subunits. The specific sequence of AMINO ACIDS determines the shape the polypeptide will take, during PROTEIN FOLDING, and the function of the protein. Gene Products, Protein,Gene Proteins,Protein,Protein Gene Products,Proteins, Gene
D001943 Breast Neoplasms Tumors or cancer of the human BREAST. Breast Cancer,Breast Tumors,Cancer of Breast,Breast Carcinoma,Cancer of the Breast,Human Mammary Carcinoma,Malignant Neoplasm of Breast,Malignant Tumor of Breast,Mammary Cancer,Mammary Carcinoma, Human,Mammary Neoplasm, Human,Mammary Neoplasms, Human,Neoplasms, Breast,Tumors, Breast,Breast Carcinomas,Breast Malignant Neoplasm,Breast Malignant Neoplasms,Breast Malignant Tumor,Breast Malignant Tumors,Breast Neoplasm,Breast Tumor,Cancer, Breast,Cancer, Mammary,Cancers, Mammary,Carcinoma, Breast,Carcinoma, Human Mammary,Carcinomas, Breast,Carcinomas, Human Mammary,Human Mammary Carcinomas,Human Mammary Neoplasm,Human Mammary Neoplasms,Mammary Cancers,Mammary Carcinomas, Human,Neoplasm, Breast,Neoplasm, Human Mammary,Neoplasms, Human Mammary,Tumor, Breast
D002965 Classification The systematic arrangement of entities in any field into categories classes based on common characteristics such as properties, morphology, subject matter, etc. Systematics,Taxonomy,Classifications,Taxonomies
D005260 Female Females
D005786 Gene Expression Regulation Any of the processes by which nuclear, cytoplasmic, or intercellular factors influence the differential control (induction or repression) of gene action at the level of transcription or translation. Gene Action Regulation,Regulation of Gene Expression,Expression Regulation, Gene,Regulation, Gene Action,Regulation, Gene Expression
D006801 Humans Members of the species Homo sapiens. Homo sapiens,Man (Taxonomy),Human,Man, Modern,Modern Man
D012313 RNA A polynucleotide consisting essentially of chains with a repeating backbone of phosphate and ribose units to which nitrogenous bases are attached. RNA is unique among biological macromolecules in that it can encode genetic information, serve as an abundant structural component of cells, and also possesses catalytic activity. (Rieger et al., Glossary of Genetics: Classical and Molecular, 5th ed) RNA, Non-Polyadenylated,Ribonucleic Acid,Gene Products, RNA,Non-Polyadenylated RNA,Acid, Ribonucleic,Non Polyadenylated RNA,RNA Gene Products,RNA, Non Polyadenylated
D019295 Computational Biology A field of biology concerned with the development of techniques for the collection and manipulation of biological data, and the use of such data to make biological discoveries or predictions. This field encompasses all computational methods and theories for solving biological problems including manipulation of models and datasets. Bioinformatics,Molecular Biology, Computational,Bio-Informatics,Biology, Computational,Computational Molecular Biology,Bio Informatics,Bio-Informatic,Bioinformatic,Biologies, Computational Molecular,Biology, Computational Molecular,Computational Molecular Biologies,Molecular Biologies, Computational
