Whole-exome sequencing (WES) has gained significant traction as a tool for both scientific research and clinical practice in diagnosing genetic disorders. Although highly informative, the method has limitations: in particular, target regions may be incompletely covered or not covered at all. A clear understanding of how completely the exonic regions of the genes selected for analysis are covered is crucial when interpreting WES data.
This study included WES data from 71 patients, generated using the DNBSEQ-G50 genetic analyzer (MGI, China) and the Exome Capture V5 Probe Set (MGI, China) for DNA library preparation. Secondary data processing was performed using the ZLIMS software platform (MGI, China). The quality of sequencing results was assessed based on the exonic regions of 662 genes associated with the clinical manifestations of primary immunodeficiency. Coverage statistics were calculated using the bedtools software package (v2.27.1). Target regions for the analysis were extracted from the MGI Exome Capture V5 BED file corresponding to the human hg19/GRCh37 genome assembly.
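As an illustration of the kind of per-region coverage summary described above, the following is a minimal sketch, not the pipeline used in the study. It assumes tab-separated per-base depth records with the columns chrom, start, end, region name, position within the region, and depth, a layout similar to what `bedtools coverage -d` reports for a four-column BED file. The function names and the depth threshold parameter `min_depth` are hypothetical.

```python
from collections import defaultdict


def summarize_region_depth(lines, min_depth=30):
    """Summarize per-base depth records by target region.

    Assumed input (hypothetical layout): tab-separated lines of
    chrom, start, end, name, position-in-region, depth.
    Returns {region: (mean_depth, fraction_of_bases_with_depth >= min_depth)}.
    """
    depths = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        chrom, start, end, name, _pos, depth = line.split("\t")[:6]
        depths[(chrom, int(start), int(end), name)].append(int(depth))

    summary = {}
    for region, values in depths.items():
        mean_depth = sum(values) / len(values)
        frac_ok = sum(d >= min_depth for d in values) / len(values)
        summary[region] = (mean_depth, frac_ok)
    return summary


def low_coverage_regions(summary, min_depth=30):
    """Return regions whose mean depth is below min_depth, sorted by position."""
    return sorted(r for r, (mean, _frac) in summary.items() if mean < min_depth)


def uncovered_regions(summary):
    """Return regions in which every base has zero depth."""
    return sorted(r for r, (mean, _frac) in summary.items() if mean == 0)
```

Applied per sample, a region could then be reported as entirely uncovered only if it appears in `uncovered_regions` for every sample in the cohort. Whether a threshold of 30 should refer to mean depth, minimum depth, or the number of covered bases is a choice this sketch leaves to the caller.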
Among the genes associated with primary immunodeficiency, 339 regions within 210 genes (31% of the total gene list) exhibited coverage of fewer than 30 nucleotides. Additionally, 224 exons in 183 genes were entirely uncovered across all samples (n = 71). The absence of coverage was partially attributed to the omission of certain region coordinates from the BED file integrated into the ZLIMS system, which removed these regions from downstream analyses. Other factors contributing to the absence of coverage remain unresolved because of the proprietary nature of the system.
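One way to check whether target regions are missing from a pipeline's BED file is to compare the two interval sets directly. The sketch below is an assumption about how such a check could be done, not a description of the method used in the study. It takes intervals as `(chrom, start, end)` tuples in half-open BED coordinates; the function name and the example coordinates are hypothetical.

```python
def missing_targets(targets, pipeline):
    """Return target intervals that overlap no pipeline interval.

    Both arguments are lists of (chrom, start, end) tuples using
    BED-style half-open coordinates (start inclusive, end exclusive).
    """
    by_chrom = {}
    for chrom, start, end in pipeline:
        by_chrom.setdefault(chrom, []).append((start, end))

    missing = []
    for chrom, start, end in targets:
        # Two half-open intervals overlap when each starts before the other ends.
        overlaps = any(s < end and start < e for s, e in by_chrom.get(chrom, []))
        if not overlaps:
            missing.append((chrom, start, end))
    return missing


# Hypothetical coordinates, for illustration only.
example_targets = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 50, 80)]
example_pipeline = [("chr1", 150, 250), ("chr2", 80, 120)]
print(missing_targets(example_targets, example_pipeline))
```

The linear scan per target is adequate for a gene panel of a few thousand exons; on the command line, `bedtools intersect -v` reports the same kind of non-overlapping entries.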
Our analysis shows that the substantial number of low-coverage regions requires careful consideration of potential false-positive and false-negative results. Even when the statistical parameters are good, WES results should not be used as a basis for excluding a clinical diagnosis. Our study highlights that the identification of clinically significant variants may be limited not only by the inherent technical constraints of WES but also by deficiencies in the software used for data processing. Even high-cost commercial data processing tools may contain critical flaws.