Technical challenges for big data in biomedicine and health: data sources, infrastructure, and analytics. 2014

N Peek, J H Holmes, and J Sun
Niels Peek, Centre for Health Informatics, The University of Manchester, Vaughan House, Portsmouth Street, Manchester M13 9GB, United Kingdom, E-mail: niels.peek@manchester.ac.uk.

OBJECTIVE: To review technical and methodological challenges for big data research in biomedicine and health.
METHODS: We discuss sources of big datasets, survey infrastructures for big data storage and big data processing, and describe the main challenges that arise when analyzing big data.
RESULTS: The life and biomedical sciences are massively contributing to the big data revolution through secondary use of data that were collected during routine care and through new data sources such as social media. Efficient processing of big datasets is typically achieved by distributing computation over a cluster of computers. Data analysts should be aware of pitfalls related to big data such as bias in routine care data and the risk of false-positive findings in high-dimensional datasets.
CONCLUSIONS: The major challenge for the near future is to transform analytical methods that are used in the biomedical and health domain to fit the distributed storage and processing model that is required to handle big data, while ensuring confidentiality of the data being analyzed.
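As an illustration of what fitting an analytical method to a distributed processing model can look like, the sketch below computes a mean and variance in a map/combine/finalize style. It is a minimal, single-process simulation under stated assumptions, not the authors' method: the function names (`map_partition`, `combine`, `finalize`) and the example values are hypothetical, and the "nodes" are plain Python lists standing in for cluster partitions.

```python
from functools import reduce
from typing import Iterable, Tuple

# Partial result produced by one node: (count, sum, sum of squares).
Partial = Tuple[int, float, float]


def map_partition(values: Iterable[float]) -> Partial:
    """Summarize one partition; this step sees only the node's local data."""
    n, s, ss = 0, 0.0, 0.0
    for v in values:
        n += 1
        s += v
        ss += v * v
    return n, s, ss


def combine(a: Partial, b: Partial) -> Partial:
    """Merge two partial results; associative, so the merge order is free."""
    return a[0] + b[0], a[1] + b[1], a[2] + b[2]


def finalize(p: Partial) -> Tuple[float, float]:
    """Turn the merged summary into a mean and a population variance.

    The sum-of-squares formula is kept for brevity; it can lose precision
    on large or poorly scaled data.
    """
    n, s, ss = p
    mean = s / n
    variance = ss / n - mean * mean
    return mean, variance


# Hypothetical measurements, split over three simulated nodes.
partitions = [[120.0, 130.0], [110.0], [140.0, 150.0, 130.0]]

partials = [map_partition(p) for p in partitions]
mean, variance = finalize(reduce(combine, partials))
print(mean, variance)
```

Only the per-partition summaries are exchanged between steps, so the raw records never leave their partition. That reduces data movement, but aggregate statistics are not by themselves a guarantee of confidentiality.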

UI | MeSH Term | Description | Entries
D008490 | Medical Informatics | The field of information science concerned with the analysis and dissemination of medical data through the application of computers to various aspects of health care and medicine. | Clinical Informatics,Medical Computer Science,Medical Information Science,Computer Science, Medical,Health Informatics,Health Information Technology,Informatics, Clinical,Informatics, Medical,Information Science, Medical,Health Information Technologies,Informatics, Health,Information Technology, Health,Medical Computer Sciences,Medical Information Sciences,Science, Medical Computer,Technology, Health Information
D003219 | Confidentiality | The privacy of information and its protection against unauthorized disclosure. | Patient Data Privacy,Privacy of Patient Data,Privileged Communication,Confidential Information,Secrecy,Communication, Privileged,Communications, Privileged,Data Privacy, Patient,Information, Confidential,Privacy, Patient Data,Privileged Communications
D006801 | Humans | Members of the species Homo sapiens. | Homo sapiens,Man (Taxonomy),Human,Man, Modern,Modern Man
D016208 | Databases, Factual | Extensive collections, reputedly complete, of facts and data garnered from material of a specialized subject area and made available for analysis and application. The collection can be automated by various contemporary methods for retrieval. The concept should be differentiated from DATABASES, BIBLIOGRAPHIC which is restricted to collections of bibliographic references. | Databanks, Factual,Data Banks, Factual,Data Bases, Factual,Data Bank, Factual,Data Base, Factual,Databank, Factual,Database, Factual,Factual Data Bank,Factual Data Banks,Factual Data Base,Factual Data Bases,Factual Databank,Factual Databanks,Factual Database,Factual Databases
D016247 | Information Storage and Retrieval | Organized activities related to the storage, location, search, and retrieval of information. | Information Retrieval,Data Files,Data Linkage,Data Retrieval,Data Storage,Data Storage and Retrieval,Information Extraction,Information Storage,Machine-Readable Data Files,Data File,Data File, Machine-Readable,Data Files, Machine-Readable,Extraction, Information,Files, Machine-Readable Data,Information Extractions,Machine Readable Data Files,Machine-Readable Data File,Retrieval, Data,Storage, Data
D057225 | Data Mining | Use of sophisticated analysis tools to sort through, organize, examine, and combine large sets of information. | Text Mining,Mining, Data,Mining, Text
D057286 | Electronic Health Records | Media that facilitate transportability of pertinent information concerning patient's illness across varied providers and geographic locations. Some versions include direct linkages to online CONSUMER HEALTH INFORMATION that is relevant to the health conditions and treatments related to a specific patient. | Electronic Health Record Data,Electronic Medical Record,Electronic Medical Records,Computerized Medical Record,Computerized Medical Records,Electronic Health Record,Medical Record, Computerized,Medical Records, Computerized,Health Record, Electronic,Health Records, Electronic,Medical Record, Electronic,Medical Records, Electronic
D061108 | Social Media | Platforms that provide the ability and tools to create and publish information accessed via the INTERNET. Generally these platforms have three characteristics with content user generated, high degree of interaction between creator and viewer, and easily integrated with other sites. | Folksonomy,Mobile Social Media,Mobile Social Networks,Social Media Messaging,Social Tagging,Social Medium,Twitter Messaging,Web 2.0,Folksonomies,Media Messaging, Social,Media, Social,Messaging, Social Media,Messaging, Twitter,Mobile Social Medias,Mobile Social Network,Social Media Messagings,Social Media, Mobile,Social Network, Mobile,Social Taggings,Tagging, Social,Web 2.0s
D019295 | Computational Biology | A field of biology concerned with the development of techniques for the collection and manipulation of biological data, and the use of such data to make biological discoveries or predictions. This field encompasses all computational methods and theories for solving biological problems including manipulation of models and datasets. | Bioinformatics,Molecular Biology, Computational,Bio-Informatics,Biology, Computational,Computational Molecular Biology,Bio Informatics,Bio-Informatic,Bioinformatic,Biologies, Computational Molecular,Biology, Computational Molecular,Computational Molecular Biologies,Molecular Biologies, Computational
