Sequencing genes in silico using single nucleotide polymorphisms





BMC Genetics, 13:6

Statistical and computational genetics

Abstract

Background

The advent of high-throughput sequencing technology has enabled the 1000 Genomes Project Pilot 3 to generate complete sequence data for more than 906 genes and 8,140 exons representing 697 subjects. The 1000 Genomes database provides a critical opportunity for further interpreting disease associations with single nucleotide polymorphisms (SNPs) discovered in genetic association studies. Currently, direct sequencing of candidate genes or regions in a large number of subjects remains both cost- and time-prohibitive.

Results

To accelerate the translation from discovery to functional studies, we propose an in silico gene sequencing method (ISS), which predicts phased sequences of intragenic regions using SNPs. The key idea underlying our method is to infer diploid sequences (a pair of phased sequences, or alleles) at every functional locus, utilizing the deep sequencing data from the 1000 Genomes Project and SNP data from the HapMap Project, and to build prediction models using flanking SNPs. Using this method, we have developed a database of prediction models for 611 known genes. Sequence prediction accuracy for these genes is 96.26% on average (range 79%-100%). This database of prediction models can be enhanced and scaled up to include new genes as the 1000 Genomes Project sequences additional genes in additional individuals. Applying our predictive model for the KCNJ11 gene to the Wellcome Trust Case Control Consortium (WTCCC) Type 2 diabetes cohort, we demonstrate how phased sequences predicted from GWAS SNP genotype data can be used to facilitate interpretation and to identify a probable functional mechanism, such as protein changes.
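The idea of predicting a gene's phased sequence from flanking SNPs can be illustrated with a toy sketch. This is not the authors' ISS model, whose details the abstract does not give; it is a simple frequency lookup that assumes phased reference haplotypes are already available. The function names (`fit_allele_model`, `predict_diploid`, `accuracy`), the three-SNP flanking haplotypes, and the allele labels are all hypothetical.

```python
from collections import Counter, defaultdict


def fit_allele_model(training_haplotypes):
    """Map each flanking-SNP haplotype to its most frequent gene allele.

    training_haplotypes: iterable of (flanking_haplotype, gene_allele) pairs,
    one pair per phased reference chromosome (hypothetical input format).
    """
    counts = defaultdict(Counter)
    for flank, allele in training_haplotypes:
        counts[flank][allele] += 1
    return {flank: c.most_common(1)[0][0] for flank, c in counts.items()}


def predict_diploid(model, flank_pair, unknown="?"):
    """Predict a pair of gene alleles from a subject's two flanking haplotypes."""
    return tuple(model.get(flank, unknown) for flank in flank_pair)


def accuracy(model, labelled_haplotypes):
    """Fraction of haplotypes whose predicted allele matches the known allele."""
    labelled = list(labelled_haplotypes)
    correct = sum(model.get(flank) == allele for flank, allele in labelled)
    return correct / len(labelled)


# Made-up reference panel: flanking SNP haplotype -> sequenced gene allele.
training = [
    ("ACG", "allele_1"),
    ("ACG", "allele_1"),
    ("ACG", "allele_2"),
    ("TCG", "allele_2"),
    ("TTG", "allele_3"),
]

model = fit_allele_model(training)

# A subject genotyped only at the flanking SNPs, with haplotypes ACG and TTG.
prediction = predict_diploid(model, ("ACG", "TTG"))
print(prediction)
print(accuracy(model, training))
```

In this sketch, accuracy is the share of haplotypes whose predicted allele equals the sequenced one. That is one plausible reading of a per-gene prediction accuracy, not necessarily the definition used in the paper.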

Conclusions

Prior to the general availability of routine sequencing of all subjects, the ISS method proposed here provides a time- and cost-effective approach to broadening the characterization of disease-associated SNPs and regions, and to facilitating the prioritization of candidate genes for more detailed functional and mechanistic studies.

Keywords

In silico, SNPs, 1000 Genomes Project, multi-allelic gene imputation

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2156-13-6) contains supplementary material, which is available to authorized users.




Authors: Xinyi Cindy Zhang, Bo Zhang, Shuying Sue Li, Xin Huang, John A Hansen, Lue Ping Zhao

Source: https://link.springer.com/article/10.1186/1471-2156-13-6






