Structure discovery in medical databases: a conceptual clustering approach. 1996

F A da Veiga
Department of Biomathematics, Faculty of Medicine, University of Coimbra, Portugal. faveiga@cygnus.ci.uc.pt

Clustering is an important data analysis tool for discovering structure in data sets. Although research on conceptual clustering has produced algorithms with significant advantages over earlier numerical ones, existing methods still have limitations in their applicability to biomedical domains. In this paper we describe ADAGIO, a conceptual clustering algorithm combining a low-cost preordering process with a breadth-first incremental control strategy that incorporates merging and splitting operators. Experimental evaluation indicated that the algorithm achieves a good balance between structure discovery performance and computational efficiency, and demonstrated the comparative effectiveness of its missing-information handling process. ADAGIO is able to handle qualitative, quantitative and mixed-type data. An example application to a cancer domain is given, in which the algorithm was able to suggest interesting epidemiological interpretations.
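
The abstract does not say how ADAGIO compares records with mixed attribute types or missing values. As a rough illustration of the general problem only, the sketch below uses a Gower-style dissimilarity, a standard technique swapped in here, not ADAGIO's own measure. The function name `gower_dissimilarity`, the `ranges` parameter and the toy patient records are all hypothetical.

```python
from typing import Optional, Sequence, Union

# One attribute value: a number (quantitative), a string (qualitative),
# or None when the value is missing.
Value = Optional[Union[float, str]]


def gower_dissimilarity(
    a: Sequence[Value],
    b: Sequence[Value],
    ranges: Sequence[Optional[float]],
) -> Optional[float]:
    """Gower-style dissimilarity between two mixed-type records.

    ranges[i] is the observed range (max - min) of attribute i when it is
    quantitative, or None when the attribute is qualitative.
    Attributes missing in either record are skipped. This is an assumed,
    illustrative policy, not the one described in the paper.
    """
    total = 0.0
    compared = 0
    for x, y, r in zip(a, b, ranges):
        if x is None or y is None:
            continue  # missing value: leave this attribute out
        if r is None:
            # Qualitative attribute: simple mismatch (0 or 1).
            total += 0.0 if x == y else 1.0
        else:
            # Quantitative attribute: range-normalised absolute difference.
            total += abs(x - y) / r if r > 0 else 0.0
        compared += 1
    if compared == 0:
        return None  # no attribute could be compared
    return total / compared


# Hypothetical records: (age in years, sex, smoker)
patient_a = (60.0, "F", "yes")
patient_b = (40.0, "F", None)      # smoking status missing
attribute_ranges = (40.0, None, None)

print(gower_dissimilarity(patient_a, patient_b, attribute_ranges))  # 0.25
```

Averaging over only the attributes present in both records is one simple way to keep incomplete records comparable. The paper evaluates its own missing-information process, which may differ.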

MeSH terms (UI, term, description, entry terms)

D008490  Medical Informatics
    Description: The field of information science concerned with the analysis and dissemination of medical data through the application of computers to various aspects of health care and medicine.
    Entry terms: Clinical Informatics; Medical Computer Science; Medical Information Science; Computer Science, Medical; Health Informatics; Health Information Technology; Informatics, Clinical; Informatics, Medical; Information Science, Medical; Health Information Technologies; Informatics, Health; Information Technology, Health; Medical Computer Sciences; Medical Information Sciences; Science, Medical Computer; Technology, Health Information

D009369  Neoplasms
    Description: New abnormal growth of tissue. Malignant neoplasms show a greater degree of anaplasia and have the properties of invasion and metastasis, compared to benign neoplasms.
    Entry terms: Benign Neoplasm; Cancer; Malignant Neoplasm; Tumor; Tumors; Benign Neoplasms; Malignancy; Malignant Neoplasms; Neoplasia; Neoplasm; Neoplasms, Benign; Cancers; Malignancies; Neoplasias; Neoplasm, Benign; Neoplasm, Malignant; Neoplasms, Malignant

D011174  Portugal
    Description: A country in southwestern Europe, bordering the North Atlantic Ocean, west of Spain. The capital is Lisbon.
    Entry terms: Madeira Island; Portuguese Republic

D006801  Humans
    Description: Members of the species Homo sapiens.
    Entry terms: Homo sapiens; Man (Taxonomy); Human; Man, Modern; Modern Man

D000465  Algorithms
    Description: A procedure consisting of a sequence of algebraic formulas and/or logical steps to calculate or determine a given task.
    Entry terms: Algorithm

D001185  Artificial Intelligence
    Description: Theory and development of COMPUTER SYSTEMS which perform tasks that normally require human intelligence. Such tasks may include speech recognition, LEARNING; VISUAL PERCEPTION; MATHEMATICAL COMPUTING; reasoning, PROBLEM SOLVING, DECISION-MAKING, and translation of language.
    Entry terms: AI (Artificial Intelligence); Computer Reasoning; Computer Vision Systems; Knowledge Acquisition (Computer); Knowledge Representation (Computer); Machine Intelligence; Computational Intelligence; Acquisition, Knowledge (Computer); Computer Vision System; Intelligence, Artificial; Intelligence, Computational; Intelligence, Machine; Knowledge Representations (Computer); Reasoning, Computer; Representation, Knowledge (Computer); System, Computer Vision; Systems, Computer Vision; Vision System, Computer; Vision Systems, Computer

D016000  Cluster Analysis
    Description: A set of statistical methods used to group variables or observations into strongly inter-related subgroups. In epidemiology, it may be used to analyze a closely grouped series of events or cases of disease or other health-related phenomenon with well-defined distribution patterns in relation to time or place or both.
    Entry terms: Clustering; Analyses, Cluster; Analysis, Cluster; Cluster Analyses; Clusterings

D016208  Databases, Factual
    Description: Extensive collections, reputedly complete, of facts and data garnered from material of a specialized subject area and made available for analysis and application. The collection can be automated by various contemporary methods for retrieval. The concept should be differentiated from DATABASES, BIBLIOGRAPHIC which is restricted to collections of bibliographic references.
    Entry terms: Databanks, Factual; Data Banks, Factual; Data Bases, Factual; Data Bank, Factual; Data Base, Factual; Databank, Factual; Database, Factual; Factual Data Bank; Factual Data Banks; Factual Data Base; Factual Data Bases; Factual Databank; Factual Databanks; Factual Database; Factual Databases
