The clinical feasibility of deep learning-based classification of amyloid PET images in visually equivocal cases

Hye Joo Son, Jungsu S. Oh, Minyoung Oh, Soo Jong Kim, Jae Hong Lee, Jee Hoon Roh, Jae Seung Kim

Research output: Contribution to journal › Article › peer-review

26 Citations (Scopus)


Purpose: Although most deep learning (DL) studies have reported excellent classification accuracy, these studies usually target typical Alzheimer’s disease (AD) and normal cognition (NC), for which conventional visual assessment already performs well. A clinically relevant issue is the selection, among equivocal cases, of high-risk subjects who need active surveillance. We validated the clinical feasibility of DL compared with visual rating or quantitative measurement for assessing the diagnosis and prognosis of subjects with equivocal amyloid scans.
Methods: 18F-florbetaben scans of 430 cases (85 NC, 233 mild cognitive impairment, and 112 AD) were assessed through visual rating-based, quantification-based, and DL-based methods. DL was trained using 280 two-dimensional PET images (80%) and tested on the remaining randomly assigned cases (70 cases, 20%) and on a clinical validation set of 54 equivocal cases. In the equivocal cases, we assessed the agreement among the visual rating, quantification, and DL and compared the clinical outcome according to the amyloid status assigned by each method.
Results: The visual reading was positive in 175 cases, equivocal in 54 cases, and negative in 201 cases. The composite SUVR cutoff value was 1.32 (AUC 0.99). The subject-level performance of DL on the test set was 100%. Among the 54 equivocal cases, 37 cases were classified as positive (Eq(deep+)) by DL, 40 by a second-round visual assessment, and 40 by quantification. The DL- and quantification-based classifications showed good agreement (83%, κ = 0.59). The composite SUVRs differed between Eq(deep+) (1.47 [0.13]) and Eq(deep−) (1.29 [0.10]; P < 0.001). DL, but not the visual rating, showed a significant difference in the Mini-Mental State Examination score change during follow-up (mean duration, 1.76 years) between Eq(deep+) (−4.21 [0.57]) and Eq(deep−) (−1.74 [0.76]; P = 0.023).
Conclusions: In visually equivocal scans, DL was more related to quantification than to visual assessment, and the negative cases selected by DL showed no decline in cognitive outcome. DL is useful for clinical diagnosis and prognosis assessment in subjects with visually equivocal amyloid scans.
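
As an illustration of how the reported agreement figures (83%, κ = 0.59) relate to each other, the following minimal sketch computes percent agreement and Cohen's kappa from a 2×2 table. The cell counts are not given in the abstract. They are reconstructed here under two assumptions: that 40 of the 54 equivocal cases were positive by quantification, and that 83% agreement corresponds to 45 of 54 cases. The function name `cohen_kappa` is hypothetical.

```python
def cohen_kappa(both_pos, a_only, b_only, both_neg):
    """Return (observed agreement, Cohen's kappa) for two binary raters."""
    n = both_pos + a_only + b_only + both_neg
    observed = (both_pos + both_neg) / n
    a_pos = (both_pos + a_only) / n  # fraction rated positive by rater A
    b_pos = (both_pos + b_only) / n  # fraction rated positive by rater B
    # Agreement expected by chance from the two raters' marginal rates
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    return observed, (observed - expected) / (1 - expected)


# Reconstructed (assumed) counts, rater A = DL, rater B = quantification:
# 34 both positive, 3 DL-only positive, 6 quantification-only positive,
# 11 both negative. Marginals: 37 DL-positive, 40 quantification-positive.
observed, kappa = cohen_kappa(34, 3, 6, 11)
print(f"agreement = {observed:.2f}, kappa = {kappa:.2f}")
# prints "agreement = 0.83, kappa = 0.59"
```

Because both methods rate most equivocal cases as positive, chance agreement is already about 59%, which is why an 83% raw agreement corresponds to a kappa of only about 0.59.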

Original language: English
Pages (from-to): 332-341
Number of pages: 10
Journal: European Journal of Nuclear Medicine and Molecular Imaging
Issue number: 2
Publication status: Published - 2020 Feb 1
Externally published: Yes


  • 18F-florbetaben PET
  • Alzheimer’s disease
  • Amyloid
  • Deep learning
  • Equivocal scan

ASJC Scopus subject areas

  • Radiology, Nuclear Medicine and Imaging

