An application of least squares fit mapping to clinical classification (1992)

Y Yang and C G Chute
Section of Medical Information Resources, Mayo Clinic/Foundation, Rochester, Minnesota 55905.

This paper describes a unique approach, "Least Square Fit Mapping," to clinical data classification. We use large collections of human-assigned text-to-category matches as training sets to compute the correlations between physicians' terms and canonical concepts. A Linear Least Squares Fit (LLSF) technique is employed to obtain a mapping function which optimally fits the known matches given in a training set and probabilistically captures the unknown matches for arbitrary texts. We tested our method with 16,032 texts from the Mayo Clinic, and judged the results using human-assigned answers. In a test for comparison, the LLSF mapping achieved a precision rate of 89% at 100% recall, outperforming alternative approaches including string matching (36% precision), string matching enhanced by morphological parsing (51% precision), and statistical weighting (61% precision).
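The fitting step described in the abstract can be illustrated with a minimal sketch. It assumes binary term and category indicator vectors and solves the least squares problem with NumPy's pseudoinverse; the abstract does not specify the term weighting or the numerical solver, so both are assumptions here. The toy training pairs, the category labels, and the helper names `build_matrix` and `rank_categories` are hypothetical and are not taken from the Mayo Clinic data.

```python
import numpy as np

def build_matrix(items, vocab):
    """Return a binary matrix: one row per vocabulary entry, one column per item."""
    index = {w: i for i, w in enumerate(vocab)}
    M = np.zeros((len(vocab), len(items)))
    for j, words in enumerate(items):
        for w in words:
            if w in index:          # words outside the vocabulary are ignored
                M[index[w], j] = 1.0
    return M

# Hypothetical training pairs: physicians' terms and their human-assigned categories.
texts = [["heart", "attack"], ["myocardial", "infarction"],
         ["broken", "arm"], ["arm", "fracture"]]
cats = [["MI"], ["MI"], ["FRACTURE"], ["FRACTURE"]]

terms = sorted({w for t in texts for w in t})
categories = sorted({c for cs in cats for c in cs})

A = build_matrix(texts, terms)       # terms x training pairs
B = build_matrix(cats, categories)   # categories x training pairs

# W minimizes the Frobenius norm ||W A - B||; B @ pinv(A) is the
# minimum-norm least squares solution.
W = B @ np.linalg.pinv(A)

def rank_categories(words):
    """Map an arbitrary text to categories, ranked by descending score."""
    x = build_matrix([words], terms)[:, 0]
    scores = W @ x
    order = np.argsort(-scores)
    return [(categories[i], float(scores[i])) for i in order]

print(rank_categories(["heart", "attack"]))
print(rank_categories(["fracture"]))
```

In this toy example the mapping reproduces the training matches exactly, and a text that shares only some words with a training example (such as the single word "fracture") still receives a ranked list of candidate categories, which is the sense in which the fitted function extends to arbitrary texts.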

UI: D008432
MeSH Term: Mathematical Computing
Description: Computer-assisted interpretation and analysis of various mathematical functions related to a particular problem.
Entries: Statistical Computing; Computing, Statistical; Mathematic Computing; Statistical Programs, Computer Based; Computing, Mathematic; Computing, Mathematical; Computings, Mathematic; Computings, Mathematical; Computings, Statistical; Mathematic Computings; Mathematical Computings; Statistical Computings

UI: D013358
MeSH Term: Subject Headings
Description: Terms or expressions which provide the major means of access by subject to the bibliographic unit.
Entries: Descriptors; Descriptor; Heading, Subject; Headings, Subject; Subject Heading

UI: D016018
MeSH Term: Least-Squares Analysis
Description: A principle of estimation in which the estimates of a set of parameters in a statistical model are those quantities minimizing the sum of squared differences between the observed values of a dependent variable and the values predicted by the model.
Entries: Rietveld Refinement; Analysis, Least-Squares; Least Squares; Analyses, Least-Squares; Analysis, Least Squares; Least Squares Analysis; Least-Squares Analyses; Refinement, Rietveld

UI: D016247
MeSH Term: Information Storage and Retrieval
Description: Organized activities related to the storage, location, search, and retrieval of information.
Entries: Information Retrieval; Data Files; Data Linkage; Data Retrieval; Data Storage; Data Storage and Retrieval; Information Extraction; Information Storage; Machine-Readable Data Files; Data File; Data File, Machine-Readable; Data Files, Machine-Readable; Extraction, Information; Files, Machine-Readable Data; Information Extractions; Machine Readable Data Files; Machine-Readable Data File; Retrieval, Data; Storage, Data
