TY - JOUR
T1 - The effect of confounding data features on a deep learning algorithm to predict complete coronary occlusion in a retrospective observational setting
AU - Brisk, Rob
AU - Bond, RR
AU - Finlay, D
AU - McLaughlin, James
AU - Jasinska-Piadlo, Alicja
AU - Leslie, Stephen James
AU - Gossman, David E
AU - Menown, Ian
AU - McEneaney, David J.
AU - Warren, Stafford
PY - 2021/2/20
Y1 - 2021/2/20
N2 - Aims: Deep learning (DL) has emerged in recent years as an effective technique in automated ECG analysis. Methods and results: A retrospective, observational study was designed to assess the feasibility of detecting induced coronary artery occlusion in human subjects earlier than experienced cardiologists using a DL algorithm. A deep convolutional neural network was trained using data from the STAFF III database. The task was to classify ECG samples as showing acute coronary artery occlusion, or no occlusion. Occluded samples were recorded after 60 s of balloon occlusion of a single coronary artery. For the first iteration of the experiment, non-occluded samples were taken from ECGs recorded in a restroom prior to entering theatres. For the second iteration of the experiment, non-occluded samples were taken in the theatre prior to balloon inflation. Results were obtained using a cross-validation approach. In the first iteration of the experiment, the DL model achieved an F1 score of 0.814, which was higher than any of three reviewing cardiologists or STEMI criteria. In the second iteration of the experiment, the DL model achieved an F1 score of 0.533, which is akin to the performance of a random chance classifier. Conclusion: The dataset was too small for the second model to achieve meaningful performance, despite the use of transfer learning. However, ‘data leakage’ during the first iteration of the experiment led to falsely high results. This study highlights the risk of DL models leveraging data leaks to produce spurious results.
AB - Aims: Deep learning (DL) has emerged in recent years as an effective technique in automated ECG analysis. Methods and results: A retrospective, observational study was designed to assess the feasibility of detecting induced coronary artery occlusion in human subjects earlier than experienced cardiologists using a DL algorithm. A deep convolutional neural network was trained using data from the STAFF III database. The task was to classify ECG samples as showing acute coronary artery occlusion, or no occlusion. Occluded samples were recorded after 60 s of balloon occlusion of a single coronary artery. For the first iteration of the experiment, non-occluded samples were taken from ECGs recorded in a restroom prior to entering theatres. For the second iteration of the experiment, non-occluded samples were taken in the theatre prior to balloon inflation. Results were obtained using a cross-validation approach. In the first iteration of the experiment, the DL model achieved an F1 score of 0.814, which was higher than any of three reviewing cardiologists or STEMI criteria. In the second iteration of the experiment, the DL model achieved an F1 score of 0.533, which is akin to the performance of a random chance classifier. Conclusion: The dataset was too small for the second model to achieve meaningful performance, despite the use of transfer learning. However, ‘data leakage’ during the first iteration of the experiment led to falsely high results. This study highlights the risk of DL models leveraging data leaks to produce spurious results.
KW - deep learning
KW - data leakage
KW - STEMI
KW - heart attacks
KW - acute myocardial infarction
KW - cardiology
KW - healthcare data science
KW - diagnostic algorithms
KW - decision support
KW - confounding factors
KW - curse of dimensionality
U2 - 10.1093/ehjdh/ztab002
DO - 10.1093/ehjdh/ztab002
M3 - Article
SN - 2634-3916
VL - 2
SP - 127
EP - 134
JO - European Heart Journal - Digital Health
JF - European Heart Journal - Digital Health
IS - 1
ER -