Whole genome sequence of a Turkish individual. 2014

Haluk Dogan, Handan Can, and Hasan H Otu
Department of Genetics and Bioengineering, Istanbul Bilgi University, Istanbul, Turkey.

Although whole human genome sequencing can be done with readily available technical and financial resources, the need for detailed analyses of genomes of certain populations still exists. Here we present, for the first time, sequencing and analysis of a Turkish human genome. We performed paired-end sequencing at 35x coverage; over 95% of sequencing reads mapped to the reference genome, covering more than 99% of its bases. The assembly of unmapped reads yielded 11,654 contigs, 2,168 of which did not reveal any homology to known sequences, resulting in ∼1 Mbp of unmapped sequence. Single nucleotide polymorphism (SNP) discovery resulted in 3,537,794 SNP calls with 29,184 SNPs identified in coding regions, of which 106 were nonsense and 259 were categorized as having a high-impact effect. The homozygous/heterozygous ratio (1,415,123∶2,122,671 or 1∶1.5) and the transition/transversion ratio (2,383,204∶1,154,590 or 2.06∶1) were within expected limits. Of the identified SNPs, 480,396 were potentially novel with 2,925 in coding regions, including 48 nonsense and 95 high-impact SNPs. Functional analysis of novel high-impact SNPs revealed various interaction networks, notably involving hereditary and neurological disorders or diseases. Assembly results indicated 713,640 indels (1∶1.09 insertion/deletion ratio), ranging from -52 bp to 34 bp in length and causing about 180 codon insertions/deletions and 246 frameshifts. Using paired-end- and read-depth-based methods, we discovered 9,109 structural variants and compared our variant findings with other populations. Our results suggest that whole genome sequencing is a valuable tool for understanding variations in the human genome across different populations. Detailed analyses of genomes of diverse origins greatly benefit research in genetics and medicine and should be conducted on a larger scale.
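The ratios quoted in the abstract can be recomputed from the reported counts. The following is a minimal sketch of that arithmetic only; the variable names are illustrative and are not taken from the authors' pipeline, and the novel-SNP fraction is a derived figure that the abstract does not state explicitly.

```python
# Counts as reported in the abstract (variable names are hypothetical).
total_snps = 3_537_794
homozygous, heterozygous = 1_415_123, 2_122_671
transitions, transversions = 2_383_204, 1_154_590
novel_snps = 480_396

# Both partitions of the SNP set should add up to the total call count.
zygosity_sum = homozygous + heterozygous
substitution_sum = transitions + transversions

# Heterozygous calls per homozygous call (abstract: 1:1.5).
het_per_hom = heterozygous / homozygous

# Transition/transversion ratio (abstract: 2.06:1).
ti_tv = transitions / transversions

# Share of SNP calls flagged as potentially novel (derived, not quoted).
novel_fraction = novel_snps / total_snps

print(f"hom:het = 1:{het_per_hom:.2f}")
print(f"Ti/Tv   = {ti_tv:.2f}:1")
print(f"novel   = {novel_fraction:.1%} of all SNP calls")
```

The two sums act as a consistency check: the zygosity split and the transition/transversion split each account for every one of the 3,537,794 SNP calls.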

UI | MeSH Term | Description | Entries
D002874 | Chromosome Mapping | Any method used for determining the location of and relative distances between genes on a chromosome. | Gene Mapping,Linkage Mapping,Genome Mapping,Chromosome Mappings,Gene Mappings,Genome Mappings,Linkage Mappings,Mapping, Chromosome,Mapping, Gene,Mapping, Genome,Mapping, Linkage,Mappings, Chromosome,Mappings, Gene,Mappings, Genome,Mappings, Linkage
D003062 | Codon | A set of three nucleotides in a protein coding sequence that specifies individual amino acids or a termination signal (CODON, TERMINATOR). Most codons are universal, but some organisms do not produce the transfer RNAs (RNA, TRANSFER) complementary to all codons. These codons are referred to as unassigned codons (CODONS, NONSENSE). | Codon, Sense,Sense Codon,Codons,Codons, Sense,Sense Codons
D006801 | Humans | Members of the species Homo sapiens. | Homo sapiens,Man (Taxonomy),Human,Man, Modern,Modern Man
D014421 | Turkey | Country in Southeastern Europe and Southwestern Asia bordering the Black Sea, between Bulgaria and Georgia, and bordering the Aegean Sea and the Mediterranean Sea, between Greece and Syria. The capital is Ankara. | Turkiye
D015894 | Genome, Human | The complete genetic complement contained in the DNA of a set of CHROMOSOMES in a HUMAN. The length of the human genome is about 3 billion base pairs. | Human Genome,Genomes, Human,Human Genomes
D016366 | Open Reading Frames | A sequence of successive nucleotide triplets that are read as CODONS specifying AMINO ACIDS and begin with an INITIATOR CODON and end with a stop codon (CODON, TERMINATOR). | ORFs,Protein Coding Region,Small Open Reading Frame,Small Open Reading Frames,sORF,Unassigned Reading Frame,Unassigned Reading Frames,Unidentified Reading Frame,Coding Region, Protein,Frame, Unidentified Reading,ORF,Open Reading Frame,Protein Coding Regions,Reading Frame, Open,Reading Frame, Unassigned,Reading Frame, Unidentified,Region, Protein Coding,Unidentified Reading Frames
D017422 | Sequence Analysis, DNA | A multistage process that includes cloning, physical mapping, subcloning, determination of the DNA SEQUENCE, and information analysis. | DNA Sequence Analysis,Sequence Determination, DNA,Analysis, DNA Sequence,DNA Sequence Determination,DNA Sequence Determinations,DNA Sequencing,Determination, DNA Sequence,Determinations, DNA Sequence,Sequence Determinations, DNA,Analyses, DNA Sequence,DNA Sequence Analyses,Sequence Analyses, DNA,Sequencing, DNA
D053263 | Gene Regulatory Networks | Interacting DNA-encoded regulatory subsystems in the GENOME that coordinate input from activator and repressor TRANSCRIPTION FACTORS during development, cell differentiation, or in response to environmental cues. The networks function to ultimately specify expression of particular sets of GENES for specific conditions, times, or locations. | Gene Circuits,Gene Modules,Gene Networks,Transcriptional Networks,Gene Module,Circuit, Gene,Circuits, Gene,Gene Circuit,Gene Network,Gene Regulatory Network,Module, Gene,Modules, Gene,Network, Gene,Network, Gene Regulatory,Network, Transcriptional,Networks, Gene,Networks, Gene Regulatory,Networks, Transcriptional,Regulatory Network, Gene,Regulatory Networks, Gene,Transcriptional Network
D054643 | INDEL Mutation | A mutation named with the blend of insertion and deletion. It refers to a length difference between two ALLELES where it is unknowable if the difference was originally caused by a SEQUENCE INSERTION or by a SEQUENCE DELETION. If the number of nucleotides in the insertion/deletion is not divisible by three, and it occurs in a protein coding region, it is also a FRAMESHIFT MUTATION. | INDELs Mutation,Insertions-Deletions Mutation,Insertion-Deletion Mutation,INDEL Mutations,INDELs Mutations,Insertion Deletion Mutation,Insertion-Deletion Mutations,Insertions Deletions Mutation,Insertions-Deletions Mutations,Mutation, INDEL,Mutation, INDELs,Mutation, Insertion-Deletion,Mutation, Insertions-Deletions
D058977 | Molecular Sequence Annotation | The addition of descriptive information about the function or structure of a molecular sequence to its MOLECULAR SEQUENCE DATA record. | Gene Annotation,Protein Annotation,Annotation, Gene,Annotation, Molecular Sequence,Annotation, Protein,Annotations, Gene,Annotations, Molecular Sequence,Annotations, Protein,Gene Annotations,Molecular Sequence Annotations,Protein Annotations,Sequence Annotation, Molecular,Sequence Annotations, Molecular
