Revising a Personal Genome by Comparing and Combining Data from Two Different Sequencing Platforms


For the robust practice of genomic medicine, sequencing results must be compatible regardless of the sequencing technologies and algorithms used. At present, genome sequencing is still an imprecise science, complicated by differences in chemistry, coverage, alignment, and variant-calling algorithms.

We identified ∼3.33 million single nucleotide variants (SNVs) and ∼3.62 million SNVs in the SJK genome using SOLiD and Illumina data, respectively. Approximately 3 million SNVs were concordant between the two platforms, while 68,532 SNVs were discordant; 219,616 SNVs were SOLiD-specific and 516,080 SNVs were Illumina-specific (i.e., platform-specific). Concordant, discordant, and platform-specific SNVs were further analyzed and characterized. Overall, a large portion of the heterozygous SNVs that were discordant with the genotype calls of single nucleotide polymorphism (SNP) chips were high-confidence calls. Approximately 70% of the platform-specific SNVs were located in regions containing repetitive sequences. Such platform specificity may arise from differences between the platforms in read length (36 bp and 72 bp vs. 50 bp), insert size (∼100–300 bp vs. ∼1–2 kb), sequencing chemistry (sequencing-by-synthesis using single nucleotides vs. ligation-based sequencing using oligomers), and sequencing quality. When data from the two platforms were merged for variant calling, the proportion of callable regions of the reference genome increased to 99.66%, which was 1.43% higher than the average callability of the two platforms, representing ∼40 million bases.

In this study, we compared the sequencing results of two sequencing platforms. Approximately 90% of the SNVs were concordant between the two platforms, yet ∼10% of the SNVs were either discordant or platform-specific, indicating that each platform had its own strengths and weaknesses. When data from the two platforms were merged, both the overall callability of the reference genome and the overall accuracy of the SNVs improved, demonstrating the likelihood that a re-sequenced genome can be revised using complementary data.
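
The four SNV categories used above (concordant, discordant, SOLiD-specific, Illumina-specific) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes each call set is a plain mapping from a site to a genotype string, and that "concordant" means both platforms call the same genotype at the same site. The function name `classify_snvs`, the type aliases, and the toy data are all hypothetical.

```python
from typing import Dict, Set, Tuple

Site = Tuple[str, int]   # (chromosome, 1-based position); assumed representation
Calls = Dict[Site, str]  # site -> genotype string such as "A/G"; assumed representation


def classify_snvs(solid: Calls, illumina: Calls) -> Dict[str, Set[Site]]:
    """Partition SNV sites from two call sets into four categories.

    Assumption: a site called by both platforms is concordant when the
    genotype strings are identical, and discordant otherwise.
    """
    shared = solid.keys() & illumina.keys()
    concordant = {site for site in shared if solid[site] == illumina[site]}
    return {
        "concordant": concordant,
        "discordant": shared - concordant,
        "solid_specific": solid.keys() - illumina.keys(),
        "illumina_specific": illumina.keys() - solid.keys(),
    }


# Toy call sets (made-up sites and genotypes, for illustration only).
solid_calls = {("chr1", 100): "A/G", ("chr1", 200): "C/T", ("chr2", 50): "G/G"}
illumina_calls = {("chr1", 100): "A/G", ("chr1", 200): "T/T", ("chr3", 75): "A/C"}

categories = classify_snvs(solid_calls, illumina_calls)
for name, sites in categories.items():
    print(name, sorted(sites))
```

In practice the call sets would be read from each platform's variant files, and genotypes would need to be normalized (for example, allele order) before comparison; both steps are left out of this sketch.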

Authors: Deokhoon Kim, Woo-Yeon Kim, Sun-Young Lee, Sung-Yeoun Lee, Hongseok Yun, Soo-Yong Shin, Jungyoun Lee, Yoojin Hong, Youngmi Won,


