Abstract
Purpose To develop and validate a deep learning-based model for automated evaluation of mammography phantom images, with the goal of improving inter-radiologist agreement and enhancing the efficiency of quality control within South Korea’s national accreditation system.
Materials and methods A total of 5,917 mammography phantom images were collected from the Korea Institute for Accreditation of Medical Imaging (KIAMI). After preprocessing, 5,813 images (98.2%) met quality standards and were divided into training, test, and evaluation datasets. Each image included 16 artificial lesions (fibers, specks, and masses) scored by certified radiologists. Images were preprocessed, standardized, and divided into 16 subimages. An EfficientNetV2_L-based model, selected for its balance of accuracy and computational efficiency, was used to predict both lesion existence and scoring adequacy (scores of 0.0, 0.5, or 1.0). Model performance was evaluated using accuracy, F1-score, area under the curve (AUC), and explainable AI techniques.
Results The model achieved classification accuracies of 87.84%, 93.43%, and 86.63% for fibers (F1: 0.7292; 95% bootstrap CI: 0.711, 0.747), specks (F1: 0.7702; 95% bootstrap CI: 0.750, 0.791), and masses (F1: 0.7594; 95% bootstrap CI: 0.736, 0.781), respectively. AUCs exceeded 0.97 for 0.0-score detection and 0.94 for 0.5-score detection. Notably, the model demonstrated strong discriminative capability in 1.0-score detection across all lesion types. Model interpretation experiments confirmed adherence to guideline criteria: fiber scoring reflected the “longest visible segment” rule; speck detection showed score transitions at two and four visible points; and mass evaluation prioritized circularity but showed some size-related bias. Saliency maps confirmed that the model attended to guideline-defined lesion features while ignoring irrelevant artifacts.
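The F1-scores above are reported with 95% bootstrap confidence intervals. The abstract does not state how those intervals were computed, so the following is only a minimal sketch of one common approach: a percentile bootstrap over a macro-averaged F1 for the three score classes (0.0, 0.5, 1.0). The function names `macro_f1` and `bootstrap_ci`, the choice of macro averaging, the number of resamples, and the toy labels are all assumptions made for illustration, not the authors' method or data.

```python
import random


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A class absent from both truth and prediction contributes 0.0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def bootstrap_ci(y_true, y_pred, labels, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for macro F1 (illustrative only)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_resamples):
        # Resample cases with replacement, keeping truth/prediction paired.
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(
            macro_f1([y_true[i] for i in idx], [y_pred[i] for i in idx], labels)
        )
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


# Hypothetical toy data: radiologist scores and model predictions.
SCORES = [0.0, 0.5, 1.0]
_rng = random.Random(42)
toy_true = [_rng.choice(SCORES) for _ in range(300)]
# Roughly 85% of toy predictions agree with the reference score.
toy_pred = [t if _rng.random() < 0.85 else _rng.choice(SCORES) for t in toy_true]

point = macro_f1(toy_true, toy_pred, SCORES)
low, high = bootstrap_ci(toy_true, toy_pred, SCORES)
print(f"macro F1 = {point:.4f}, 95% bootstrap CI = ({low:.3f}, {high:.3f})")
```

Resampling whole cases keeps each prediction paired with its reference score, so the interval reflects sampling variability in the evaluation set rather than variability in model training.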
Conclusion The proposed deep learning model accurately assessed mammography phantom images according to guideline criteria and achieved expert-level performance. By automating the evaluation process, the model can improve scoring consistency and significantly enhance the efficiency and scalability of quality control workflows.
| Original language | English |
|---|---|
| Article number | e0330091 |
| Journal | PLOS ONE |
| Volume | 20 |
| Issue number | 9 September |
| DOIs | |
| Publication status | Published - Sept 2025 |
Bibliographical note
Publisher Copyright: © 2025 Yun et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
ASJC Scopus subject areas
- General
Fingerprint
Dive into the research topics of 'AI-Driven quality assurance in mammography: Enhancing quality control efficiency through automated phantom image evaluation in South Korea'. Together they form a unique fingerprint.