New research suggests a way to determine the accuracy of predictive AI in healthcare and when human clinicians should take over. Artificial intelligence (AI) has the potential to enhance various industries, including healthcare. However, to integrate AI tools safely, we need to understand when they are most useful.
Knowing when AI is more accurate than humans is crucial in healthcare, where predictive AI is increasingly used to assist clinicians. In collaboration with Google Research, we have published a joint paper in Nature Medicine introducing CoDoC (Complementarity-driven Deferral-to-Clinical Workflow). This AI system learns when to rely on predictive AI tools and when to defer to a clinician for the most accurate interpretation of medical images.
CoDoC explores how human-AI collaboration in medical settings can deliver the best results. In one scenario, CoDoC reduced false positives by 25% on a large mammography dataset without missing any true positives. We have collaborated with healthcare organizations, including the United Nations Office for Project Services’ Stop TB Partnership, and have also open-sourced CoDoC’s code on GitHub.
CoDoC is an add-on tool: it can improve the performance of a predictive AI system without any modification to the underlying model. It is designed to be simple and usable, so that people who are not machine learning experts, such as healthcare providers, can deploy and run it on a single computer. It needs only a small amount of training data, and it works with proprietary AI models without access to their inner workings or training data.
The system learns the relative accuracy of predictive AI compared with clinicians’ interpretations. For each training case it uses three inputs: the predictive AI’s confidence score, the clinician’s interpretation, and the ground truth. In effect, it helps predictive AI systems “know when they don’t know.” Once trained, CoDoC can be inserted into a clinical workflow involving both an AI and a clinician, where it assesses for each case whether accepting the AI’s decision or deferring to the clinician will result in the most accurate interpretation.
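To make the idea of learned deferral concrete, here is a minimal Python sketch. It is a simplified illustration and not CoDoC's actual algorithm: it groups training cases by the AI's confidence score and defers to the clinician in any confidence range where the clinician was right more often than the AI. The function names (`fit_deferral_bins`, `should_defer`), the fixed 0.5 decision threshold, the five equal-width bins, and the toy data are all assumptions made for this example.

```python
from typing import List, Sequence


def fit_deferral_bins(
    ai_scores: Sequence[float],
    clinician_labels: Sequence[int],
    truths: Sequence[int],
    n_bins: int = 5,
    threshold: float = 0.5,
) -> List[bool]:
    """Return one flag per confidence bin: True means 'defer to the clinician'."""
    ai_correct = [0] * n_bins
    clinician_correct = [0] * n_bins
    for score, clinician, truth in zip(ai_scores, clinician_labels, truths):
        # Map a confidence score in [0, 1] to a bin index in [0, n_bins - 1].
        b = min(int(score * n_bins), n_bins - 1)
        ai_prediction = int(score >= threshold)
        ai_correct[b] += int(ai_prediction == truth)
        clinician_correct[b] += int(clinician == truth)
    # Defer only where the clinician was strictly more accurate in training.
    return [clinician_correct[b] > ai_correct[b] for b in range(n_bins)]


def should_defer(score: float, defer_bins: Sequence[bool]) -> bool:
    """Decide whether a new case with this AI confidence goes to the clinician."""
    b = min(int(score * len(defer_bins)), len(defer_bins) - 1)
    return defer_bins[b]


# Toy training data (invented for illustration): the AI is reliable at the
# extremes of its confidence range and unreliable in the middle.
ai_scores = [0.05, 0.10, 0.45, 0.55, 0.50, 0.90, 0.95]
truths = [0, 0, 1, 0, 1, 1, 1]
clinician_labels = [0, 1, 1, 0, 1, 1, 0]

defer_bins = fit_deferral_bins(ai_scores, clinician_labels, truths)
print(defer_bins)
print(should_defer(0.48, defer_bins), should_defer(0.97, defer_bins))
```

On this toy data, only the middle confidence range is routed to the clinician, while confidently scored cases stay with the AI. Note that the sketch uses only the AI's output scores, which is why an approach of this kind needs no access to the underlying model.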
Testing CoDoC with real-world datasets has shown that combining human expertise with predictive AI improves accuracy. In addition to reducing false positives in mammography, CoDoC significantly reduces the number of cases that need to be read by a clinician. It can also improve the triage of chest X-rays for tuberculosis testing.
This research demonstrates the potential of AI systems like CoDoC to adapt and improve performance across different populations, settings, equipment, and diseases. However, further evaluation and validation are necessary to bring CoDoC safely into real-world medical settings. Healthcare providers and manufacturers must understand how clinicians interact with AI differently and validate systems with specific medical AI tools and settings.