Introduction to quality assessment
Next-generation sequencing (NGS) has revolutionized the field of agrigenomics by enabling researchers and breeders to explore genetic variation at an unprecedented level of detail. However, even with declining sequencing costs, it remains expensive and time-consuming to sequence large populations at high coverage. As a result, imputation has become an essential step in genomic analyses. Imputation leverages the genetic relationships and linkage disequilibrium (LD) patterns observed in well-characterized reference samples to predict unobserved variants in target samples. In a previous blog post we described different statistical techniques used for imputation; in this one we focus on the metrics used to assess the quality of the imputed genotypes.
Quality assessment of imputation results helps researchers determine the reliability of the inferred genotypes. By evaluating and comparing imputed genotypes against known data or quality metrics, scientists can identify potential biases, minimize errors in downstream analyses, and optimize breeding strategies. Ultimately, robust quality assessment ensures that we are making well-informed decisions in plant and animal breeding programs, enhancing crop yields, disease resistance, and other economically important traits.
Metrics for evaluating imputation quality
No single metric provides a complete picture, so multiple measures are often used in combination.
INFO score The INFO score, commonly reported in human genetics, provides a summary measure of the variance of imputed allele dosages. It essentially compares the observed variance in the imputed genotypes to the variance expected under a perfect imputation scenario. High INFO scores (close to 1) indicate that imputation is well-calibrated and that the imputed genotypes closely match the expected distribution.
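The variance comparison described above can be sketched as a simple ratio: the empirical variance of the imputed dosages divided by the variance expected under Hardy-Weinberg equilibrium at the estimated allele frequency. This follows the MaCH/minimac-style Rsq formulation and is an assumption made for illustration; the INFO measure reported by IMPUTE-family tools is related but is computed from the full genotype probabilities. The function name is hypothetical.

```python
from statistics import fmean

def dosage_variance_ratio(dosages):
    """Observed dosage variance divided by the variance expected under HWE.

    dosages: expected alternate-allele counts per sample, each in [0, 2].
    Returns NaN for a monomorphic site, where the ratio is undefined.
    """
    if not dosages:
        raise ValueError("need at least one dosage")
    mean = fmean(dosages)
    p = mean / 2.0                      # estimated alternate-allele frequency
    expected_var = 2.0 * p * (1.0 - p)  # binomial variance of a diploid genotype
    if expected_var == 0.0:
        return float("nan")
    observed_var = fmean([(d - mean) ** 2 for d in dosages])
    return observed_var / expected_var
```

Confident calls such as `[0, 1, 1, 2]` give a ratio near 1, while dosages that collapse toward the mean, such as `[1, 1, 1, 1]`, give a ratio near 0.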
R-squared value The imputation R-squared (R²) is a correlation-based metric that evaluates how well the imputed genotype dosage correlates with true genotypes. An R² close to 1 suggests a strong agreement between imputed and true genotypes, reflecting high accuracy. Lower R² values signal that the imputation process may be less reliable, which can occur in genomic regions with low LD or inadequate reference panel representation.
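When true genotypes are available for a validation set, this metric is commonly computed as the squared Pearson correlation between imputed dosages and true allele counts. The sketch below assumes genotypes coded as alternate-allele counts (0, 1, 2); the function name is illustrative.

```python
def dosage_r2(imputed_dosages, true_genotypes):
    """Squared Pearson correlation between imputed dosages and true genotypes.

    Returns NaN when either input has no variance (correlation undefined).
    """
    n = len(imputed_dosages)
    if n != len(true_genotypes) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(imputed_dosages) / n
    mean_y = sum(true_genotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in imputed_dosages)
    syy = sum((y - mean_y) ** 2 for y in true_genotypes)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(imputed_dosages, true_genotypes))
    if sxx == 0.0 or syy == 0.0:
        return float("nan")
    return (sxy * sxy) / (sxx * syy)
```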
Concordance rate Concordance measures the proportion of imputed genotypes that match a set of known, experimentally validated genotypes (often derived from high-coverage sequencing or genotyping arrays). Concordance rates provide a direct, intuitive measure of accuracy. For example, a concordance rate of 99% means that out of every 100 genotypes imputed, 99 are correct compared to the ground truth. This metric is straightforward but may be influenced by the quality of the known reference data. This metric is reported in the CURIO Genomics Platform.
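As a minimal sketch, concordance can be computed by counting matching best-guess genotypes over the positions where both an imputed and a true call exist. The genotype coding (0, 1, 2, with `None` for a missing call) and the function name are assumptions made for this example.

```python
def concordance_rate(imputed, truth):
    """Fraction of comparable genotypes where the imputed call equals the truth.

    Pairs in which either call is None (missing) are skipped.
    Returns NaN if no pair can be compared.
    """
    compared = 0
    matches = 0
    for g_imp, g_true in zip(imputed, truth):
        if g_imp is None or g_true is None:
            continue
        compared += 1
        if g_imp == g_true:
            matches += 1
    if compared == 0:
        return float("nan")
    return matches / compared
```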
Imputation quality score (IQS) The imputation quality score (IQS) is designed to account for the effect of allele frequency when assessing imputation performance. When an allele is rare, a method can reach high concordance simply by always predicting the common homozygote, so raw agreement overstates accuracy for rare variants, which are in fact harder to impute than common ones. IQS corrects for this by discounting the agreement expected by chance given the genotype frequencies. This makes comparisons of imputation quality meaningful across variants with different population frequencies. This metric is reported in the CURIO Genomics Platform.
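One widely used formulation treats IQS like Cohen's kappa: observed agreement minus chance agreement, divided by one minus chance agreement. The sketch below assumes that formulation applied to best-guess genotypes coded 0, 1, 2; it is not necessarily how any particular platform computes the value, and the function name is illustrative.

```python
from collections import Counter

def imputation_quality_score(imputed, truth):
    """Chance-corrected agreement between imputed and true genotypes.

    IQS = (P_observed - P_chance) / (1 - P_chance), where P_chance is the
    agreement expected if calls were drawn independently from the observed
    genotype frequencies. Returns NaN when P_chance equals 1.
    """
    n = len(imputed)
    if n == 0 or n != len(truth):
        raise ValueError("inputs must be non-empty and of equal length")
    p_observed = sum(i == t for i, t in zip(imputed, truth)) / n
    imputed_counts = Counter(imputed)
    true_counts = Counter(truth)
    p_chance = sum(imputed_counts[g] * true_counts[g] for g in (0, 1, 2)) / (n * n)
    if p_chance == 1.0:
        return float("nan")
    return (p_observed - p_chance) / (1.0 - p_chance)
```

For a rare variant where every sample is imputed as the common homozygote, for example `imputed = [0] * 10` against `truth = [0] * 9 + [1]`, concordance is 90% but the chance-corrected score is 0, which is the behaviour the metric is meant to expose.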
Hellinger score The Hellinger score compares the imputed genotype probabilities with the true genotype distribution. It measures how similar the two distributions are, rather than focusing only on point estimates. A low Hellinger distance between them (and hence a high score) indicates that the imputed genotype probabilities closely match the true underlying distribution, so the metric captures the uncertainty and probabilistic nature of genotype inference.
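For a single genotype, the comparison can be sketched with the standard Hellinger distance between two discrete distributions over the three genotype classes. The example assumes the score is defined as one minus that distance, so that higher is better; the function name is illustrative.

```python
import math

def hellinger_score(imputed_probs, true_probs):
    """One minus the Hellinger distance between two genotype distributions.

    Each argument is a sequence of probabilities over the same genotype
    classes (for example P(0), P(1), P(2)) that sums to 1.
    """
    if len(imputed_probs) != len(true_probs):
        raise ValueError("distributions must have the same number of classes")
    # Bhattacharyya coefficient: overlap between the two distributions.
    overlap = sum(math.sqrt(p * q) for p, q in zip(imputed_probs, true_probs))
    # Clamp at 0 to guard against tiny negative values from rounding.
    distance = math.sqrt(max(0.0, 1.0 - overlap))
    return 1.0 - distance
```

Identical distributions score 1, distributions with no overlap score 0, and an uncertain call such as `(0.5, 0.5, 0.0)` against a true heterozygote `(0.0, 1.0, 0.0)` falls in between.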
Factors influencing imputation quality
Genetic structure and population diversity Population history and genetic structure play a significant role in imputation outcomes. In crops or livestock populations with minimal diversity or strong founder effects, it may be easier to impute missing genotypes accurately. Conversely, species with high genetic diversity and complex population stratification may require more sophisticated models and larger, more representative reference panels. Imputation accuracy often decreases when dealing with populations that differ significantly from the reference panel in terms of allele frequencies and LD patterns.
Reference panel selection The composition and size of the reference panel are among the most critical factors influencing imputation quality. Larger, diverse reference panels tend to produce more accurate results because they capture a broader range of haplotypes and LD patterns. In agrigenomics, where domesticated crops and livestock have undergone complex breeding histories, selecting a reference panel that represents the target population’s genetic diversity is paramount. Custom panels that incorporate local germplasm or region-specific breeding lines often outperform generic reference sets.
Choice of imputation algorithm
Several tools exist for imputation, each with its own algorithms, assumptions and computational efficiencies. Methods that integrate large, diverse reference panels and use LD-aware algorithms tend to yield the highest-quality imputation results. For instance, research in dairy cattle has demonstrated that incorporating multiple reference breeds and using iterative refinement strategies can improve the concordance rate and R² metrics. Similarly, in maize breeding programs, pipelines that combine Beagle for phasing and Minimac3 for imputation often achieve better accuracy than using a single tool.
Popular algorithms include IMPUTE2, Minimac3, and Beagle, which are based on a Bayesian framework. While IMPUTE2 and Minimac3 have a long track record in human genetics and have been adapted for agrigenomic contexts, Beagle has gained traction among breeders due to its speed and relative ease of use, and it is the basis of the CURIO Genomics Platform.
Performance comparisons have shown that while all three tools generally yield high-quality imputations, there can be differences in accuracy and run time depending on the species, marker density, and population structure. Trialing multiple tools, or combining them, may lead to improved overall performance.
Challenges in quality assessment
Limitations of current metrics While each quality metric provides valuable insights, none is perfect. High R² or INFO scores do not guarantee that rare variants are well imputed, and a strong concordance rate may be skewed by an abundance of common variants that are easy to predict. Over-reliance on a single metric could lead to misinterpretation of overall imputation quality. Balancing multiple metrics and understanding their individual biases is critical.
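One way to see whether common variants are masking poor performance on rare ones is to report a metric separately for each minor-allele-frequency (MAF) bin. The sketch below pools concordance within bins; the bin boundaries, the input layout, and the function name are all assumptions chosen for illustration.

```python
from bisect import bisect_left

def concordance_by_maf_bin(variants, upper_edges=(0.01, 0.05, 0.5)):
    """Pool genotype concordance within MAF bins.

    variants: iterable of (maf, imputed_genotypes, true_genotypes) tuples.
    upper_edges: ascending, inclusive upper bounds of the MAF bins.
    Returns {upper_edge: concordance} for every bin that received genotypes.
    """
    matches = [0] * len(upper_edges)
    totals = [0] * len(upper_edges)
    for maf, imputed, truth in variants:
        # Index of the first bin whose upper bound is >= maf.
        idx = bisect_left(upper_edges, maf)
        if idx == len(upper_edges):
            raise ValueError(f"MAF {maf} exceeds the largest bin edge")
        for g_imp, g_true in zip(imputed, truth):
            totals[idx] += 1
            if g_imp == g_true:
                matches[idx] += 1
    return {edge: matches[i] / totals[i]
            for i, edge in enumerate(upper_edges) if totals[i] > 0}
```

A result in which the lowest bin scores well below the others suggests that an overall figure is being carried by the common variants.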
Computational considerations Assessing imputation quality can be computationally intensive. For large datasets containing millions of markers, running multiple imputation software tools, performing cross-validation, or evaluating various quality metrics may require significant computational resources. Data handling, storage, and processing time can become bottlenecks, necessitating efficient pipeline design and parallelized computing solutions.
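The cross-validation mentioned above is typically set up by hiding a subset of known genotypes, imputing them, and scoring the imputed calls against the hidden truth. A minimal sketch of the masking step follows; the use of `None` for a masked call, the fixed seed, and the function name are assumptions made for this example, and the imputation itself is left to whichever tool is being evaluated.

```python
import random

def mask_genotypes(genotypes, fraction, seed=0):
    """Hide a random fraction of known genotypes for validation.

    Returns (masked_copy, masked_indices). The input list is not modified,
    so the original values at masked_indices serve as the ground truth.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)           # seeded for reproducible masks
    n_mask = round(len(genotypes) * fraction)
    masked_indices = sorted(rng.sample(range(len(genotypes)), n_mask))
    masked = list(genotypes)
    for i in masked_indices:
        masked[i] = None
    return masked, masked_indices
```

Repeating the procedure with different seeds gives several independent validation sets, at the cost of one imputation run per repeat.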
Future directions in quality assessment
Methodological advancements As agrigenomic datasets become richer and more complex, there is a growing need for more nuanced imputation quality assessment methods. Future approaches might incorporate machine learning models that integrate multiple metrics to produce a composite, context-dependent quality score. Additionally, novel statistical measures that better handle rare variants or structural variants could emerge as key tools for agrigenomics.
Integrating genomic and phenotypic data Ultimately, the goal of imputation in agrigenomics is to enable more effective breeding and management decisions. Future quality assessments may integrate phenotypic or environmental information to measure how accurately imputed genotypes predict traits of interest. By correlating imputed genotype quality with trait heritability or predictive accuracy in genomic selection, researchers can directly connect the fidelity of imputation to real-world outcomes.
Conclusion
Quality assessment of NGS imputation results is a foundational step in ensuring reliable and actionable genetic data in agrigenomics. By understanding the key metrics (INFO scores, R² values, concordance rates, IQS, and Hellinger scores), researchers can more accurately gauge imputation performance. The choice of reference panel and imputation software, together with an awareness of population structure, further determines the accuracy and reliability of imputed datasets. While current metrics and methodologies provide a solid baseline, future advancements will likely integrate more comprehensive assessments, blending genomic, phenotypic, and environmental data to yield even more informed breeding decisions. As the field continues to evolve, robust quality assessment will remain central to unlocking the full potential of genomic information in agriculture.