Abstract
This paper presents a novel investigation of machine learning performance by examining classifiers’ probability outputs in conjunction with classification accuracy (CA) and area under the curve (AUC). One of the main obstacles to deploying computer-aided detection/diagnosis (CAD) systems is clinicians’ lack of ‘trust’ in them, which increases the possibility of a system not being used. Whilst most authors evaluate the performance of their breast CAD systems on CA and AUC alone, we study the distribution of the classifiers’ probability outputs and use it as an additional confidence-level metric to indicate the reliability of a computer system. Experimental results suggest that although most classifiers produce similar results in terms of CA and AUC (less than 2% variation), their performance differs significantly once confidence level is considered (10 to 25% difference). This study may provide opportunities for refining radiologists’ interaction with CAD systems and for improving both the reliability of CAD systems and diagnostic decision making in medicine, by pairing high CA or AUC with a high degree of certainty.
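The idea of reading probability outputs alongside CA and AUC can be illustrated with a minimal sketch. The abstract does not define the confidence-level metric, so everything here is an assumption made for illustration: the function `confident_correct_rate`, the 0.9 confidence level, the 0.5 decision cutoff, and the toy scores for the two hypothetical classifiers `clf_a` and `clf_b`. It is not the authors' implementation.

```python
def accuracy(y_true, p_pos, cutoff=0.5):
    """Classification accuracy after thresholding the positive-class probability."""
    preds = [1 if p >= cutoff else 0 for p in p_pos]
    return sum(int(y == pred) for y, pred in zip(y_true, preds)) / len(y_true)


def auc(y_true, p_pos):
    """AUC as the chance that a positive case outscores a negative one (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, p_pos) if y == 1]
    neg = [p for y, p in zip(y_true, p_pos) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def confident_correct_rate(y_true, p_pos, level=0.9, cutoff=0.5):
    """Hypothetical confidence metric (an assumption, not the paper's definition):
    share of cases classified correctly with predicted-class probability >= level."""
    hits = 0
    for y, p in zip(y_true, p_pos):
        pred = 1 if p >= cutoff else 0
        confidence = p if pred == 1 else 1.0 - p
        if pred == y and confidence >= level:
            hits += 1
    return hits / len(y_true)


# Toy data (invented for illustration): both classifiers rank every case correctly,
# but clf_b's probabilities sit close to the 0.5 decision boundary.
y_true = [1, 1, 1, 0, 0, 0]
clf_a = [0.95, 0.92, 0.97, 0.05, 0.03, 0.08]
clf_b = [0.60, 0.55, 0.65, 0.40, 0.45, 0.35]

for name, scores in (("clf_a", clf_a), ("clf_b", clf_b)):
    print(name, accuracy(y_true, scores), auc(y_true, scores),
          confident_correct_rate(y_true, scores))
```

On this toy data the two classifiers are indistinguishable by CA and AUC, yet only `clf_a` makes its correct calls with high probability, which is the kind of gap a confidence-level metric is meant to expose.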
Original language | English |
---|---|
Title of host publication | Proceedings Volume 10718, 14th International Workshop on Breast Imaging (IWBI 2018) |
Editors | Elizabeth A Krupinski |
Publisher | SPIE |
Volume | 10718 |
Edition | 14th |
Publication status | Published (in print/issue) - 6 Jul 2018 |
Event | 14th International Workshop on Breast Imaging, Atlanta, Georgia, USA; Duration: 8 Jul 2018 → 11 Jul 2018; Conference number: 14th |
Conference
Conference | 14th International Workshop on Breast Imaging |
---|---|
Abbreviated title | IWBI |
Period | 8 Jul 2018 → 11 Jul 2018 |