Putting AI to the Test: Assessing the Ethics of Chatbots
Artificial intelligence is sparking a cognitive revolution. The range of applications in biomedicine is especially rich: pinpointing mutations in tumors, scrutinizing medical images, identifying patients susceptible to medical emergencies and more.
The diagnostic and treatment capacity of these evolving technologies seems boundless.
But these benefits raise challenging ethical questions. Can AI be trusted with confidential patient information? How can we ensure that it’s bias-free? Who is responsible for mistakes?
In a new paper published in NEJM AI, a physician, an ethicist and a medical student from the University of Miami Miller School of Medicine explored how effectively AI addresses ethical dilemmas in real-life clinical scenarios.
Senior author Gauri Agarwal, M.D., associate professor of clinical medicine and associate dean for curriculum at the Miller School, and lead author Isha Harshe, a second-year medical student, tested five large language models (LLMs)—ChatGPT-4o mini, Claude 3.5 Sonnet, Microsoft Copilot, Meta’s LLaMA 3 and Gemini 1.5 Flash—to assess how they would respond to complex medical ethics cases.
Dr. Agarwal and Harshe compared the AI responses to the opinions of the piece’s third author, Kenneth Goodman, Ph.D., director emeritus of the Institute for Bioethics and Health Policy at the Miller School.
In the first case, a patient rejected standard-of-care robotic surgery, even though the human surgeon had lost confidence in her own non-robotic surgical skills.
Each of the LLMs offered a range of options, such as minimizing, but not eliminating, robotic involvement, or having the surgeon decline to perform the procedure altogether.
Even with the potential for harm to patients, all of the LLMs said that proceeding with standard, non-robotic surgery was a legitimate option.
Dr. Goodman disagreed. He maintained the patient should receive the standard of care or be transferred to another facility.
“The uniform response highlights a major limitation of LLMs: projecting contemporary ethical principles to future scenarios,” said Dr. Agarwal. “While such an answer is consistent with current norms, it doesn’t reflect the implications of evolving standards of care and the reduction of human skills over time due to lack of use.”
Scenario two explored the role of AI in determining whether end-stage care should be withdrawn from a patient who lacked both decision-making capacity and a designated surrogate.
All five models agreed that AI alone shouldn’t be relied on here. Suggestions included deferring to a hospital’s ethics committee and/or physicians involved in the patient’s care.
However, Dr. Goodman said such a decision must only be made by a surrogate decision maker, not the clinical team or an ethics committee. He noted that, if a patient fails to appoint a surrogate, hospitals are typically required by law to identify a stand-in from a list of family members and friends.
In the third scenario, the authors asked if a chatbot could operate as a patient’s medical surrogate.
Four models rejected the idea straightaway. The fifth refused to answer, changing the topic when asked. The human ethicist, however, offered a different take: if a chatbot could convey a patient’s likely wishes, rather than simply offering its own opinion, it might qualify.
The authors note that surrogates aren’t supposed to make decisions about a patient’s care based on their own preferences. Instead, they are obligated to communicate what they believe the patient would have wanted. This key difference raises the possibility of eventually employing chatbots as surrogates, perhaps drawing on previous chats with patients, their social network information and other factors as the basis for the LLM’s opinion.
“LLMs can support ethical thinking but they aren’t currently capable of making independent ethical decisions,” said Harshe. “They can, however, offer legitimate ethical points to consider. While we have principles and guidelines to assist us, critical medical ethics decisions require a type of intelligence that is uniquely human.”