Abstract
Healthcare systems are currently adapting to digital technologies, producing large quantities of novel data. Based on these data, machine-learning algorithms have been developed to support practitioners in labor-intensive workflows such as diagnosis, prognosis, triage or treatment of disease. However, their translation into medical practice is often hampered by a lack of careful evaluation in different settings. Efforts have started worldwide to establish guidelines for evaluating machine learning for health (ML4H) tools, highlighting the necessity to evaluate models for bias, interpretability, robustness, and possible failure modes. However, testing and adopting these guidelines in practice remains an open challenge. In this work, we target the paper-to-practice gap by applying an ML4H audit framework proposed by the ITU/WHO Focus Group on Artificial Intelligence for Health (FG-AI4H) to three use cases: diagnostic prediction of diabetic retinopathy, diagnostic prediction of Alzheimer’s disease, and cytomorphologic classification for leukemia diagnostics. The assessment comprises dimensions such as bias, interpretability, and robustness. Our results highlight the importance of fine-grained and case-adapted quality assessment, provide support for incorporating proposed quality assessment considerations of ML4H during the entire development life cycle, and suggest improvements for future ML4H reference evaluation frameworks.
| Original language | English |
|---|---|
| Pages (from-to) | 280-317 |
| Number of pages | 38 |
| Journal | Proceedings of Machine Learning Research |
| Volume | 136 |
| State | Published - 2020 |
| Externally published | Yes |
| Event | 6th Workshop on Machine Learning for Health: Advancing Healthcare for All, ML4H 2020, in conjunction with the 34th Conference on Neural Information Processing Systems, NeurIPS 2020 - Virtual, Online. Duration: 11 Dec 2020 → … |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
Keywords
- Health
- Machine Learning
- Testing
Fingerprint
Dive into the research topics of 'ML4H Auditing: From Paper to Practice'. Together they form a unique fingerprint.