
AI platforms for psychiatric patients show racial bias: Cedars-Sinai

Photo Credit: Cedars-Sinai Medical Center.

HQ Team

July 17, 2025: Artificial intelligence platforms for psychiatric patients are showing a pattern of racial bias by proposing different treatments for people of colour, a small Cedars-Sinai study finds.

Investigators studied four large language models (LLMs), a category of AI algorithms trained on enormous amounts of data that enables them to understand and generate human language, according to a statement from Cedars-Sinai Medical Center, a non-profit, tertiary, 915-bed teaching hospital and multi-specialty academic health science center in Los Angeles, California.

The LLMs, when presented with hypothetical clinical cases, often proposed different treatments for psychiatric patients when African American identity was stated or implied than for patients for whom race was not indicated.

Diagnoses, by comparison, were relatively consistent, according to the researchers.

LLMs are drawing interest in medicine for their ability to quickly evaluate individual patients and recommend diagnoses and treatments.

Schizophrenia, anxiety

“Most of the LLMs exhibited some form of bias when dealing with African American patients, at times making dramatically different recommendations for the same psychiatric illness and otherwise identical patient,” said Elias Aboujaoude, MD, corresponding author of the study.

“This bias was most evident in cases of schizophrenia and anxiety.”

The researchers found that two LLMs omitted medication recommendations for an attention-deficit/hyperactivity disorder case when race was explicitly stated, but suggested them when racial information was absent from the case.

Another LLM suggested guardianship for depression cases with explicit racial characteristics.

In another instance, an LLM showed increased focus on reducing alcohol use in anxiety cases only for patients explicitly identified as African American or given a common African American name.

Claude, ChatGPT

The four LLMs were Claude, ChatGPT, Gemini, and NewMes-15 (a local, medical-focused LLaMA 3 variant). 

Ten psychiatric patient cases representing five diagnoses were presented to these models under three conditions: race-neutral, race-implied, and race explicitly stated (noting that the patient is African American).

The models’ diagnostic recommendations and treatment plans were qualitatively evaluated by a clinical psychologist and a social psychologist, who scored 120 outputs for bias by comparing responses generated under race-neutral, race-implied, and race-explicit conditions.
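For readers who want a concrete picture of this design, the sketch below shows, in hypothetical Python, how such a three-condition evaluation could be assembled. The model list follows the study, but the vignette text, the implied-race name cue, and the query_model stub are invented for illustration and are not the authors' code.

```python
from itertools import product

# Hypothetical sketch of the study's three-condition design: each clinical
# vignette is presented race-neutral, race-implied, or with race explicitly
# stated, and the outputs are collected for human rating.
# The vignette text, name cue, and query_model stub are invented.

MODELS = ["Claude", "ChatGPT", "Gemini", "NewMes-15"]

CASES = {
    "adhd_01": "A 28-year-old patient reports lifelong inattention and impulsivity ...",
    # ...nine further vignettes covering the study's five diagnoses
}

CONDITIONS = {
    "race_neutral": lambda text: text,
    "race_implied": lambda text: text.replace(
        "A 28-year-old patient", "Jamal, a 28-year-old patient"),
    "race_explicit": lambda text: text.replace(
        "A 28-year-old patient", "A 28-year-old African American patient"),
}


def query_model(model: str, prompt: str) -> str:
    """Placeholder for an API call to the given LLM; returns a dummy string."""
    return f"[{model} diagnosis and treatment plan for: {prompt[:40]}...]"


outputs = []
for model, (case_id, text), (condition, rewrite) in product(
        MODELS, CASES.items(), CONDITIONS.items()):
    outputs.append({
        "model": model,
        "case": case_id,
        "condition": condition,
        "response": query_model(model, rewrite(text)),
    })

# With the full set of 10 cases this yields 4 x 10 x 3 = 120 outputs,
# matching the number the two raters scored for bias.
print(len(outputs))
```

Keeping each vignette identical across conditions, and varying only the racial cue, is what allows any difference in the returned treatment plan to be attributed to that cue.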

Aboujaoude said the LLMs showed racial bias because they reflected bias found in the extensive content used to train them. Future research, he said, should focus on strategies to detect and quantify bias in artificial intelligence platforms and training data, create LLM architecture that resists demographic bias, and establish standardised protocols for clinical bias testing.

Given the high burden of documentation, estimated to consume up to 40% of a provider’s time, LLMs that can quickly synthesise input data to generate custom patient reports have been seen as a solution. 

Optimise treatment

The ability of LLMs to “understand” plain text and audio can extend to automatically extracting and processing symptoms and other information from a clinical interview. 

By proposing diagnoses and interventions based on information gleaned from massive medical databases, LLMs can also optimise treatment.

The findings, published in the peer-reviewed journal npj Digital Medicine, also highlighted the need for oversight to prevent powerful AI applications from perpetuating inequality in healthcare.

“The findings of this important study serve as a call to action for stakeholders across the healthcare ecosystem to ensure that LLM technologies enhance health equity rather than reproduce or worsen existing inequities,” said David Underhill, PhD, chair of the Department of Biomedical Sciences at Cedars-Sinai and the Janis and William Wetsman Family Chair in Inflammatory Bowel Disease.

“Until that goal is reached, such systems should be deployed with caution and consideration for how even subtle racial characteristics may affect their judgment.”