The success of clinical genomics using next generation sequencing (NGS) requires the accurate and consistent identification of personal genome variants. Assorted variant calling methods have been developed, but their calls show low concordance. Hence, a systematic comparison of the variant callers could give important guidance to NGS-based clinical genomics. Recently, a set of high-confidence variant calls for one individual (NA12878) has been published by the Genome in a Bottle (GIAB) consortium, enabling performance benchmarking of different variant calling pipelines. Based on the gold standard reference variant calls from GIAB, we compared the performance of thirteen variant calling pipelines, testing combinations of three read aligners, namely BWA-MEM, Bowtie2 and Novoalign, and four variant callers, namely Genome Analysis Toolkit HaplotypeCaller (GATK-HC), Samtools mpileup, Freebayes and Torrent Variant Caller (TVC), on twelve data sets for the NA12878 genome sequenced by different platforms including Illumina2000, Illumina2500 and Ion Proton, with various exome capture systems and exome coverage. We observed different biases toward specific types of SNP genotyping errors by the different variant callers. The results of our study provide useful guidelines for reliable variant identification from deep sequencing of personal genomes.
Recent advances in next generation sequencing (NGS) technology have given many researchers access to affordable whole exome or genome sequencing (WES or WGS), leading to phenomenal achievements in genome sequencing projects such as the personal genome project 1, genome project 3. Genome sequencing projects for healthy and disease cohorts have identified numerous functional or disease-associated genomic variants, which can give clues about therapeutic targets or genomic markers for novel clinical applications. Many Mendelian disease studies have already employed NGS to identify causal genes based on patient-specific variants 4, 5, 6, 7, 8, 9, 10. Because Mendelian disease associated variants with high functional impact rarely occur in the healthy population, interpretation of patient-specific variants is relatively simple. This simplicity, which stems from the rarity of the variants, nevertheless entails a risk of false discoveries caused by sequencing errors and by false detection in variant calling methods. The accurate identification of genomic variants is therefore a critical factor for the success of clinical genomics based on NGS technology.
Genomic variants, such as single nucleotide polymorphisms (SNPs) and DNA insertions and deletions (also known as indels), are identified by analysis pipelines that combine a short read aligner with a variant caller. Among the many read aligners, BWA-MEM 11, Bowtie2 12 and Novoalign are popular, and among the many variant callers, the Genome Analysis Toolkit HaplotypeCaller (GATK-HC) 13, Samtools mpileup 14, Freebayes 15 and Torrent Variant Caller (TVC) are widely used in genomic variant analyses. The first three callers can be applied to both Illumina and Ion Proton sequencing data, whereas TVC was developed as an Ion Proton specific caller. Several studies have compared different variant calling pipelines and reported substantial disagreement among the variant calls made by different pipelines, suggesting a need for more cautious interpretation of called variants in genomic medicine 16, 17, 18. The low concordance of variant calling pipelines also prompted the clinical genomics community to seek standardization of performance benchmarking of the pipelines 19.
A systematic comparison of variant calling performance requires a gold standard set of reference variant calls. Recently, the Genome in a Bottle (GIAB) consortium published a set of high-confidence variant calls for one individual in the 1000 Genomes Project (sample NA12878) 19. They integrated fourteen variant data sets from five NGS technologies, seven read mappers and three variant calling methods, and manually arbitrated between discordant data sets. They also provided higher-confidence calls and regions by integrating the GIAB v2.19 calls, pedigree information for NA12878 and Illumina Platinum project variant calls. This highly accurate and presumably mostly unbiased set of SNP and indel genotype calls for NA12878 is the only gold standard variant genotype data set publicly available for systematic comparisons of variant callers.
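As a rough illustration of what benchmarking a pipeline against gold standard calls involves, the sketch below compares a set of called genotypes with a reference set and reports precision, recall and genotype concordance. It is a minimal sketch under stated assumptions, not the procedure used in this study: the function name `benchmark_calls`, the dictionary-based variant representation and the toy genotypes are all hypothetical, and a real comparison would also restrict the analysis to high-confidence regions and normalize variant representation first.

```python
from typing import Dict, Tuple

# Hypothetical representation: a variant is keyed by (chrom, pos, ref, alt)
# and mapped to its genotype string, e.g. "0/1" (heterozygous) or "1/1".
Variant = Tuple[str, int, str, str]


def benchmark_calls(truth: Dict[Variant, str],
                    calls: Dict[Variant, str]) -> Dict[str, float]:
    """Compare pipeline calls with gold standard calls (illustrative only)."""
    shared = truth.keys() & calls.keys()          # variants found in both sets
    tp = len(shared)                              # true positives
    fp = len(calls.keys() - truth.keys())         # called but not in the reference
    fn = len(truth.keys() - calls.keys())         # in the reference but missed
    # Among shared variants, count those whose genotype also agrees.
    gt_match = sum(1 for v in shared if truth[v] == calls[v])
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "genotype_concordance": gt_match / tp if tp else 0.0,
    }


# Toy data (made up for illustration, not taken from NA12878).
truth = {
    ("chr1", 1000, "A", "G"): "0/1",
    ("chr1", 2000, "C", "T"): "1/1",
    ("chr2", 1500, "G", "GA"): "0/1",
}
calls = {
    ("chr1", 1000, "A", "G"): "0/1",   # correct site and genotype
    ("chr1", 2000, "C", "T"): "0/1",   # correct site, wrong genotype
    ("chr3", 500, "T", "C"): "0/1",    # false positive
}

metrics = benchmark_calls(truth, calls)
print(metrics)
```

Separating site-level agreement (precision and recall) from genotype concordance makes it possible to distinguish a caller that misses variants from one that finds them but assigns the wrong zygosity.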