Direct observation of genomic heterogeneity through local haplotyping analysis

BMC Genomics, 15:418

Human and rodent genomics


Background: Geneticists have long held that the genome of a multicellular organism can be analyzed under the assumption that a single individual has a uniform genome in all its cells. Despite some evidence to the contrary, most genome analysis software packages treat this assumption as axiomatic. In this paper we present observations in human whole genome data, human whole exome data, and mouse whole genome data that challenge this assumption. We show that heterogeneity is in fact ubiquitous and readily observable in ordinary next-generation sequencing (NGS) data.

Results: Starting with the assumption that a single NGS read or read pair must come from one haplotype, we built a procedure for directly observing haplotypes at a local level by examining 2 or 3 adjacent single nucleotide polymorphisms (SNPs) that are close enough on the genome to be spanned by individual reads. We applied this procedure to NGS data from three different sources: whole genome data from a Central European trio from the 1000 Genomes Project, whole genome data from laboratory-bred mouse strains, and whole exome data from a set of patients with head and neck tumors. In each genome, thousands of loci were found where reads spanning 2 or 3 SNPs displayed more than two haplotypes, indicating that the locus is heterogeneous. We show that such loci are ubiquitous in the genome and cannot be explained by segmental duplications. We explain them on the basis of cellular heterogeneity at the genomic level. Such heterogeneous loci were found in all normal and tumor genomes examined.
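The counting idea described in the Results can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes reads are given as ungapped `(start, sequence)` pairs in 0-based reference coordinates, it handles single reads only (not read pairs), and the function names and the `min_support` threshold are hypothetical choices made for illustration.

```python
from collections import Counter


def local_haplotypes(reads, snp_positions):
    """Count the haplotypes seen by reads that span every SNP position.

    reads: iterable of (start, sequence) pairs; assumed ungapped, 0-based start.
    snp_positions: 0-based reference coordinates of 2 or 3 nearby SNPs.
    """
    counts = Counter()
    first, last = min(snp_positions), max(snp_positions)
    for start, seq in reads:
        end = start + len(seq)  # exclusive end coordinate of the read
        if start <= first and last < end:
            # The read covers all SNPs, so it reports one local haplotype.
            hap = "".join(seq[p - start] for p in snp_positions)
            counts[hap] += 1
    return counts


def is_heterogeneous(counts, min_support=2):
    """Flag a locus where more than two haplotypes have enough read support.

    min_support is a hypothetical filter against sequencing errors.
    """
    supported = [h for h, n in counts.items() if n >= min_support]
    return len(supported) > 2


# Toy example: two SNPs at reference positions 102 and 107.
reads = [
    (100, "ACGTACGTAC"),
    (100, "ACGTACGTAC"),
    (101, "CATACGCAC"),
    (101, "CATACGCAC"),
    (99, "TACATACGTACG"),
    (99, "TACATACGTACG"),
    (104, "ACGTAC"),  # does not span position 102, so it is ignored
]
counts = local_haplotypes(reads, [102, 107])
print(counts, is_heterogeneous(counts))
```

A diploid genome that is uniform across cells can contribute at most two haplotypes at any locus, so a third well-supported haplotype is what marks the locus as a candidate for heterogeneity. Real data would additionally need base-quality and mapping-quality filters, which this sketch omits.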

Conclusions: Our results highlight the need for new methods to analyze genomic variation, because existing ones do not systematically consider local haplotypes. Identification of somatic mutations in cancer is complicated by tumor heterogeneity. It is further complicated if, as we show, normal tissues are also heterogeneous. Methods for biomarker discovery must consider contextual haplotype information rather than just whether a variant "is present".

Abbreviations

BAM: Binary SAM
CEU: Central European
GATK: Genome Analysis Toolkit
LHA: Local haplotyping analysis
NGS: Next-generation sequencing
SAM: Sequence Alignment/Map
SNP: Single nucleotide polymorphism
SRA: Sequence Read Archive
VCF: Variant Call Format

Electronic supplementary material: The online version of this article (doi:10.1186/1471-2164-15-418) contains supplementary material, which is available to authorized users.

Authors: Kamalakar Gulukota, Donald L Helseth Jr, Janardan D Khandekar