Assessment of CNN Performance in Cases of Breast Cancer, Staging and Restaging- A Tumour Type Not Included in Algorithm Training
Searle, Julie; Fakhry-Darian, Daniel; Shimizu, Takeshi
Searle, Julie
Fakhry-Darian, Daniel
Shimizu, Takeshi
Glos Author
Date
2023-08-18
Type
Conference Abstract
Abstract
Aim/Introduction: To quantitatively assess the lesion classification performance of a convolutional neural network (CNN), using semi-automated segmentation, in metastatic breast cancer, a tumour type not included in the algorithm's training data. The CNN was designed to support the reading physician in calculating total disease burden.
Material(s) and Method(s): Thirty staging and restaging metastatic breast cancer F18 FDG PET/CT scans were analysed by 3 expert PET/CT readers and subsequently by a CNN algorithm. PERCIST criteria with a cut-off of 41% were used in both readings to segment foci and measure metabolic tumour volume (MTV). The initial human read involved manual segmentation using an isocontour tool, whereas the CNN algorithm classified foci produced by an automated segmentation algorithm. Agreement on foci between the two reads was recorded for sensitivity/specificity analysis; MTV agreement was assessed with Spearman's rank correlation and Bland-Altman plots. Foci classification was verified by further correlation with other imaging modalities and clinical follow-up.
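The two MTV agreement measures named above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the analysis code used in the study; the function names and the MTV values are hypothetical and chosen purely to show the calculation.

```python
from statistics import mean, stdev

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def bland_altman(reference, test):
    """Return (bias, lower limit, upper limit) of agreement for test - reference."""
    diffs = [t - r for r, t in zip(reference, test)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical per-patient MTV values (ml), NOT data from the study.
manual_mtv = [12.0, 45.5, 3.2, 88.1, 20.4]
cnn_mtv = [14.1, 47.0, 5.0, 95.3, 19.8]

rho = spearman_rho(manual_mtv, cnn_mtv)
bias, lower, upper = bland_altman(manual_mtv, cnn_mtv)
print(f"Spearman rho = {rho:.2f}")
print(f"Bland-Altman bias = {bias:.2f} ml, limits of agreement = [{lower:.2f}, {upper:.2f}] ml")
```

A positive bias in this convention means the test method (here the CNN) measures a larger MTV than the reference on average. Spearman's rho only reflects agreement in rank order, which is why a Bland-Altman analysis is needed alongside it to expose systematic offsets.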
Result(s): PERCIST segmentation criteria identified 1191 foci. CNN classification sensitivity, specificity, accuracy and precision for these foci were 72%, 97%, 93% and 81%, respectively, in agreement with previous testing on unseen datasets. Good correlation was observed between the ranked MTV as measured by the CNN and by the expert readers (Spearman's rho = 0.85, p<0.001); however, a positive bias was observed in the CNN-measured MTV relative to the manual measurement. This additional MTV was likely due to inclusion of physiological false positives (predominantly brown fat and bowel). The CNN classified 21 false positive (FP) and 14 false negative (FN) foci. The tendency to classify FP foci results in a precision score of 81%; however, these FP findings are common and were easily identified by users. In total, 89 corrections (7% of total foci) were required by the expert readers when using the CNN (30 FP, 50 FN, 9 not segmented), an average of 3 edits per patient using the current segmentation criteria. This demonstrates the robustness of the CNN to data from outside the training set.
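For reference, the four classification metrics quoted above follow from the standard confusion-matrix definitions. The sketch below is illustrative only: the function name and the example counts are hypothetical, since the abstract does not report the full confusion matrix. The final lines check the correction-burden arithmetic using the figures that are stated in the abstract (89 corrections, 1191 foci, 30 patients).

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard per-focus metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),            # true positive rate
        "specificity": tn / (tn + fp),            # true negative rate
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),              # positive predictive value
    }

# Hypothetical counts for illustration, NOT the study's confusion matrix.
example = classification_metrics(tp=80, fp=20, tn=180, fn=20)
for name, value in example.items():
    print(f"{name}: {value:.1%}")

# Correction burden, using figures stated in the abstract.
corrections = 30 + 50 + 9      # FP + FN + not segmented
total_foci = 1191
patients = 30
print(f"corrections: {corrections}")
print(f"share of foci corrected: {corrections / total_foci:.1%}")
print(f"edits per patient: {corrections / patients:.2f}")
```

Precision depends only on the foci the CNN labels as positive, so a modest number of physiological false positives lowers it even when specificity over all foci remains high.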
Conclusion(s): The automated segmentation plus CNN classification algorithm requires minimal human interaction for assessment of foci and calculation of MTV in a metastatic breast cancer cohort; minor discrepancies were clinically insignificant and would not change patient management. This confirms the value of the software in assisting clinical reads and its potential to improve diagnostic confidence and ongoing management.
Citation
Nawwar A., Darian D., Searle J., Shimizu T. & Lyburn I.D. (2023). Assessment of CNN Performance in Cases of Breast Cancer, Staging and Restaging- A Tumour Type Not Included in Algorithm Training. European Journal of Nuclear Medicine and Molecular Imaging, 50(Supplement 1), S730-S731. https://doi.org/10.1007/s00259-023-06333-x
