A Harvard study reveals advanced AI models, like OpenAI's o1, can outperform doctors in diagnosing patients and planning clinical management, particularly in emergency scenarios.
A new study finds that AI can outperform human doctors in certain tasks (AI-generated image)

A new Harvard study has found that advanced AI models can outperform doctors in diagnosing patients, including in high-pressure emergency room (ER) scenarios. The researchers pitted OpenAI's advanced o1 language model directly against hundreds of physicians across multiple diagnostic touchpoints, finding that the AI consistently outperformed doctors in both diagnosing conditions and planning clinical management.
What does the study find?
The new research, published in the journal Science, gave 76 clinical cases from the emergency room of Beth Israel Deaconess Medical Center to OpenAI's o1 model and two expert attending physicians.
The researchers found that o1 performed on par with or significantly better than the human experts on a variety of tasks:
- Initial triage: When the least information was available, o1 identified the exact or a very close diagnosis in 67.1% of cases, compared to 55.3% and 50% accuracy for the two physicians.
- ER evaluation: As more patient information became available during the physician’s initial assessment, the model’s accuracy improved to 72.4%, compared to 61.8% and 52.6% for the two doctors.
- Hospital admission: At the final stage, when patients were admitted to the medical floor or ICU, o1 reached 81.6% accuracy, again outperforming both physicians, who scored 78.9% and 69.7% respectively.
The study also found that the AI had a clear edge when asked to produce treatment plans, such as giving antibiotics or making end-of-life decisions. Across five case studies, the AI achieved a median score of 89%, substantially outperforming physicians, who scored around 34% when using conventional resources and 41% when using GPT-4.
“Although applying AI to assist with clinical decision support is sometimes viewed as a high-risk endeavor, greater use of these tools might serve to mitigate the human and financial costs of diagnostic error, delay, and lack of access,” the researchers said.
“Our findings suggest that LLMs have now eclipsed most benchmarks of clinical reasoning, motivating the urgent need for human-computer interaction studies and prospective clinical trials to rigorously assess the potential of AI systems to improve clinical practice and patient outcomes,” they added.
However, the researchers also warn that clinical medicine is filled with non-text inputs, such as a patient's level of physical distress or the interpretation of medical imaging. This, they suggest, means future research is needed to assess how humans and machines can collaborate effectively.
In a statement to The Guardian, Arjun Manrai, one of the lead authors of the study, noted, “I don’t think our findings mean that AI replaces doctors.”
“I think it does mean that we’re witnessing a really profound change in technology that will reshape medicine,” he added.
About the Author
Aman Gupta
Aman Gupta is a Digital Content Producer at LiveMint with over 3.5 years of experience covering the technology landscape. He specializes in artificial intelligence and consumer technology, reporting on everything from the ethical debates around AI models to shifts in the smartphone market.
His reporting is grounded in first-hand testing, independent analysis, and a focus on how technology impacts everyday users. He holds a PG Diploma in Radio and Television Journalism from the Indian Institute of Mass Communication, Delhi (Class of 2022).
Outside the newsroom, he spends his time reading biographies, hunting for the perfect coffee beans, or planning his next trip.

You can find Aman on LinkedIn (https://www.linkedin.com/in/aman-gupta-894180214) and on X at @nobugsfound (https://x.com/nobugsfound), or reach him via email at aman.gupta@htdigital.in.
