Proper implementation of chatbots in healthcare requires diligence

By | July 28, 2020

While the technology for developing artificial intelligence-powered chatbots has existed for some time, a new viewpoint piece lays out the clinical, ethical and legal aspects that should be considered before applying them in healthcare. And while the emergence of COVID-19 and the social distancing that accompanies it have prompted more health systems to explore and apply automated chatbots, the authors of the paper, experts from Penn Medicine and the Leonard Davis Institute of Health Economics, still urge caution and thoughtfulness before proceeding.

Because of the relative newness of the technology, the limited data that exists on chatbots comes primarily from research as opposed to clinical implementation. That means the evaluation of new systems being put into place requires diligence before they enter the clinical space, and the authors caution that those operating the bots should be nimble enough to quickly adapt to feedback.


Chatbots are a tool used to communicate with patients via text message or voice. Many chatbots are powered by artificial intelligence. The paper specifically discusses chatbots that use natural language processing, an AI process that seeks to “understand” language used in conversations and draws threads and connections from them to provide meaningful and useful answers.

Within healthcare, those messages, and people’s reactions to them, carry tangible consequences. Since caregivers are often in communication with patients through electronic health records — from access to test results to diagnoses and doctors’ notes — chatbots can either enhance the value of those communications or cause confusion or even harm.

For instance, how a chatbot handles someone telling it something as serious as “I want to hurt myself” has many different implications.

In the self-harm example, several pertinent questions apply. The first and foremost concerns patient safety: Who monitors the chatbot, and how often? Another concerns trust and transparency: Would the patient actually take a response from a known chatbot seriously?

The example also, unfortunately, raises questions about who is accountable if the chatbot fails in its task. A further question is whether this is a task best suited for a chatbot, or one that should remain entirely human-operated.

The team believes they have laid out key considerations that can inform a framework for decision-making when it comes to implementing chatbots in healthcare. These could apply even when rapid implementation is required to respond to events like the spread of COVID-19.

Among the considerations are whether chatbots should extend the capabilities of clinicians or replace them in certain scenarios; and what the limits of chatbot authority should be in different scenarios, such as recommending treatments or probing patients for answers to basic health questions.


Data published this month from the Indiana University Kelley School of Business found that chatbots working for reputable organizations can ease the burden on medical providers and offer trusted guidance to those with symptoms.

Researchers conducted an online experiment with 371 participants who viewed a COVID-19 screening session between a hotline agent — chatbot or human — and a user with mild or severe symptoms.

They studied whether chatbots were seen as persuasive and as providing satisfying information that users would likely follow. The results showed a slight negative bias against chatbots’ ability, perhaps due to recent press reports cited by the authors.

When perceived ability was the same, however, participants viewed chatbots more positively than human agents, which is good news for healthcare organizations struggling to meet user demand for screening services. The main factor driving user response to screening hotlines was the perception of the agent’s ability.

Twitter: @JELagasse