Explainability Metrics of Deep Convolutional Networks for Photoplethysmography Quality Assessment.

Title: Explainability Metrics of Deep Convolutional Networks for Photoplethysmography Quality Assessment.
Publication Type: Journal Article
Year of Publication: 2021
Authors: O Zhang, C Ding, T Pereira, R Xiao, K Gadhoumi, K Meisel, RJ Lee, Y Chen, and X Hu
Journal: IEEE Access
Volume: 9
Start Page: 29736
Pagination: 29736-29745
Date Published: 01/2021
Abstract

Photoplethysmography (PPG) is a noninvasive way to monitor various aspects of the circulatory system, and is becoming increasingly widespread in biomedical processing. Recently, deep learning methods for analyzing PPG have also become prevalent, achieving state-of-the-art results on heart rate estimation, atrial fibrillation detection, and motion artifact identification. Consequently, a need for interpretable deep learning has arisen within the field of biomedical signal processing. In this paper, we pioneer novel explanatory metrics which leverage domain-expert knowledge to validate a deep learning model. We visualize model attention over a whole test set using saliency methods and compare it to human expert annotations. Congruence, our first metric, measures the proportion of model attention that falls within expert-annotated regions. Our second metric, Annotation Classification, measures how much of the expert annotations our deep learning model pays attention to. Finally, we apply our metrics to compare a signal-based model and an image-based model for PPG signal quality classification. Both models are deep convolutional networks based on the ResNet architecture. We show that our signal-based one-dimensional model acts in a more explainable manner than our image-based model: on average, 50.78% of the one-dimensional model's attention is within expert annotations, whereas 36.03% of the two-dimensional model's attention is within expert annotations. Similarly, when thresholding the one-dimensional model's attention, one can more accurately predict whether each pixel of the PPG is annotated as artifactual by an expert. Through this test case, we demonstrate how our metrics can provide a quantitative and dataset-wide analysis of how explainable the model is.
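
The two metrics described in the abstract can be illustrated with a minimal sketch for a one-dimensional signal. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: the function names `congruence` and `annotation_accuracy` are hypothetical, attention is taken as the absolute saliency value per sample, and plain accuracy is used to score the thresholded prediction (the paper may use a different normalization or score).

```python
from typing import Sequence


def congruence(saliency: Sequence[float], annotated: Sequence[bool]) -> float:
    """Share of total absolute saliency that falls inside expert-annotated samples.

    Assumption: attention mass is the absolute saliency value per sample.
    """
    if len(saliency) != len(annotated):
        raise ValueError("saliency and annotation mask must have the same length")
    total = sum(abs(s) for s in saliency)
    if total == 0:
        return 0.0  # no attention anywhere, so nothing lies inside the annotations
    inside = sum(abs(s) for s, a in zip(saliency, annotated) if a)
    return inside / total


def annotation_accuracy(
    saliency: Sequence[float], annotated: Sequence[bool], threshold: float
) -> float:
    """Accuracy of predicting the annotation mask by thresholding absolute saliency.

    Assumption: a sample is predicted as annotated when |saliency| >= threshold.
    """
    if len(saliency) != len(annotated) or len(saliency) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    predicted = [abs(s) >= threshold for s in saliency]
    correct = sum(p == bool(a) for p, a in zip(predicted, annotated))
    return correct / len(annotated)


# Toy example: four samples, the middle two marked as artifact by an expert.
toy_saliency = [1.0, 4.0, 4.0, 1.0]
toy_mask = [False, True, True, False]
toy_congruence = congruence(toy_saliency, toy_mask)
toy_accuracy = annotation_accuracy(toy_saliency, toy_mask, threshold=2.0)
```

Under these assumptions, a dataset-wide figure like those reported in the abstract would be obtained by averaging the per-signal congruence over the test set; for an image-based model the same computation would run over flattened pixel maps.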

DOI: 10.1109/access.2021.3054613
Short Title: IEEE Access