“Doctors correctly diagnose illness ‘twice as often as online symptom checkers’,” The Sun reports.
A US study ran a head-to-head comparison between doctors and a symptom checker platform called Human Dx using what are known as clinical vignettes.
Clinical vignettes have been used for many years to help hone trainee doctors’ diagnostic skills. They are essentially diagnostic puzzles based on real-life case reports designed to test training and clinical knowledge.
The researchers provided 45 clinical vignettes to more than 200 doctors. They found the doctors were twice as likely as Human Dx to give the correct diagnosis first time.
But these findings are not entirely reliable – vignettes can never fully replicate the real-life diagnosis of patients. And many of the doctors involved were still in training posts.
It’s often the case in the field of artificial intelligence that tasks computers find incredibly easy – like multiplying 30-digit prime numbers – humans find incredibly hard.
But the reverse is also true – tasks that are second nature to us, like understanding jokes, computers just cannot do.
It may be that diagnosis relies in part on intuition, and not just on an algorithmic approach to processing information.
That said, artificial intelligence has a great deal to offer medicine. For example, Google is working with the NHS to come up with software that can quickly and accurately scan radiotherapy images.
Symptom-checking applications may well become a diagnostic tool for doctors, rather than a replacement for them.
Where did the story come from?
The study was carried out by researchers from Harvard Medical School. No source of funding was reported in the paper.
It was published in the peer-reviewed JAMA Internal Medicine.
A conflict of interest was declared by Dr Nundy, who is an equity holder of the Human Diagnosis Project, the creators of Human Dx.
Symptom checkers are websites and apps that help patients with self-diagnosis. As these are becoming more popular, it is important that they are investigated thoroughly and the findings made public.
The media presented the facts of the study well, reporting the main findings accurately, although there was no discussion about the research’s limitations.
What kind of research was this?
This comparative study aimed to assess the diagnostic accuracy of doctors and computer algorithms known as symptom checkers.
This type of study is a useful way of drawing comparisons and highlighting areas for further research.
However, the small sample of scenarios assessed here cannot be representative of all the different combinations of signs and symptoms patients may have.
What did the research involve?
The researchers compared the diagnostic accuracy of Human Dx, a web and app-based symptom checker, with the diagnostic accuracy of doctors.
A total of 45 vignettes were used in the study, and included 26 common and 19 uncommon conditions.
The 234 physicians involved were hospital doctors specialising in general medicine, rather than other specialities such as surgery or paediatrics. They were asked to rank diagnoses for each case. Each vignette was solved by at least 20 physicians.
The responses were reviewed by another two doctors, who independently decided whether the diagnosis was correct or in the top three diagnoses. Discrepancies were resolved by a third member of the research team.
Each doctor’s accuracy was compared with the symptom checker’s accuracy for each of the vignettes.
What were the basic results?
The study found physicians listed the correct diagnosis first more often than symptom checkers across all vignettes (72.1% vs 34.0%). They also included the correct diagnosis among their top three more often (84.3% vs 51.2%).
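As a rough check on the "twice as often" headline, the reported percentages can be turned into simple ratios. The short Python sketch below uses only the figures quoted above. The variable and function names are illustrative and do not come from the study, and the calculation ignores statistical uncertainty, so it is an arithmetic illustration rather than a reanalysis.

```python
# Accuracy figures as quoted in this article (percentages).
# All names below are illustrative assumptions, not taken from the study.
first_listed = {"physicians": 72.1, "symptom_checkers": 34.0}
top_three = {"physicians": 84.3, "symptom_checkers": 51.2}


def accuracy_ratio(figures):
    """Return how many times more often physicians were correct."""
    return figures["physicians"] / figures["symptom_checkers"]


first_ratio = accuracy_ratio(first_listed)
top_three_ratio = accuracy_ratio(top_three)

# Share of cases where the correct diagnosis was not in the physicians' top three.
physician_miss_rate = 100 - top_three["physicians"]

print(f"Correct diagnosis listed first: {first_ratio:.2f} times as often")
print(f"Correct diagnosis in top three: {top_three_ratio:.2f} times as often")
print(f"Physicians missing the diagnosis in their top three: {physician_miss_rate:.1f}%")
```

On these figures the first-listed ratio is a little over 2, which matches the headline. The top-three ratio is closer to 1.6, and the physician miss rate works out at roughly 15.7%.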
Doctors were more likely to give the correct diagnosis across all severities of presentation, as well as for common and uncommon presentations.
How did the researchers interpret the results?
The researchers concluded that: “In what we believe to be the first direct comparison of diagnostic accuracy, physicians vastly outperformed computer algorithms in diagnostic accuracy (84.3% vs 51.2% correct diagnosis in the top three listed).
“Despite physicians’ superior performance, they provided the incorrect diagnosis in about 15% of cases, similar to prior estimates (10%-15%) for physician diagnostic error.”
They went on to say: “While in this project we compared diagnostic performance, future work should test whether computer algorithms can augment physician diagnostic accuracy.”
This study aimed to assess the diagnostic accuracy of the Human Dx symptom checker versus the accuracy of doctors.
The researchers found doctors were much more likely to accurately diagnose a condition than Human Dx.
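However, this research did have some limitations: clinical vignettes can never fully replicate the real-life diagnosis of patients, many of the doctors involved were still in training posts, and the small sample of scenarios cannot represent all the different combinations of signs and symptoms patients may have.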
That said, computer programs could help reduce diagnostic error – as long as the symptom checkers are accurate.
This research highlights the need for future work to improve the performance of these programs.
It will probably be many years until an application becomes sophisticated enough to replace your GP, but these types of applications could one day be a useful tool in a doctor’s (virtual) kitbag.
Links To The Headlines
Doctors correctly diagnose illness ‘TWICE as often as online symptom checkers’. The Sun, October 10 2016
Links To Science
Semigran HL, Levine DM, Nundy S. Comparison of Physician and Computer Diagnostic Accuracy. JAMA Internal Medicine. Published online October 10 2016