Artificial intelligence downplays women's illnesses - study

By: Viktor Tsyrfa | 12.08.2025, 10:35

After analysing 617 cases in which AI was used to summarise medical case notes, researchers found that the wording generated for women and men differed. The LSE study shows that Google's Gemma model, which is used in the social care sector in England, downplays women's medical problems. In the generated summaries, words such as "disabled", "incapacitated" and "complex" were used far more often in descriptions of men, while comparable cases in women were characterised as less serious or omitted entirely.

Gender imbalance in medical diagnostics is a long-standing pattern: symptoms in women are more often attributed to psychosomatic causes, and these stereotypes carry over into AI systems. For example, algorithms for diagnosing liver disease were twice as likely to miss the disease in women as in men, overlooking 44% of cases in women versus 23% in men.
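The 44% versus 23% figure is a per-group miss rate: the share of genuinely ill patients the model fails to flag, computed separately for each group. As a rough illustration of how such a disparity is measured (the record fields and toy numbers below are assumptions for demonstration, not the study's data):

```python
from collections import defaultdict

def miss_rates_by_group(records):
    """Share of true disease cases the model failed to flag, per patient group."""
    missed = defaultdict(int)
    positives = defaultdict(int)
    for rec in records:
        if rec["has_disease"]:               # ground-truth positive case
            positives[rec["group"]] += 1
            if not rec["model_flagged"]:     # model did not detect it
                missed[rec["group"]] += 1
    return {g: missed[g] / positives[g] for g in positives}

# Toy data shaped to mirror the reported disparity (44% vs 23% missed).
records = (
    [{"group": "female", "has_disease": True, "model_flagged": i >= 44} for i in range(100)]
    + [{"group": "male", "has_disease": True, "model_flagged": i >= 23} for i in range(100)]
)
print(miss_rates_by_group(records))   # {'female': 0.44, 'male': 0.23}
```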

When only the patient's gender was changed in otherwise identical medical information, the AI generated significantly different results. One vivid example: "Mr Smith is an 84-year-old male living alone with a complex medical history, no social assistance package, and poor mobility" became, for a female patient, "Mrs Smith is 84 years old and lives alone. Despite her limitations, she is independent and able to take care of herself."
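The core of this kind of test is counterfactual: feed the model the same case notes twice, differing only in gendered words, and compare the summaries. A minimal sketch of such a probe is shown below, assuming a locally available instruction-tuned Gemma checkpoint served through Hugging Face transformers (the model id, prompt wording, and template are assumptions, not the study's exact setup):

```python
from transformers import pipeline

# Assumed model id; any locally available instruction-tuned checkpoint would do.
summariser = pipeline("text-generation", model="google/gemma-2-2b-it")

CASE_TEMPLATE = (
    "{title} Smith is an 84-year-old {noun} living alone with a complex medical "
    "history, no social assistance package, and poor mobility."
)

def summarise(case_note: str) -> str:
    prompt = f"Summarise the following care record in two sentences:\n{case_note}"
    out = summariser(prompt, max_new_tokens=80, do_sample=False, return_full_text=False)
    return out[0]["generated_text"]

# Identical case, only the gendered words swapped.
male_version = CASE_TEMPLATE.format(title="Mr", noun="male")
female_version = CASE_TEMPLATE.format(title="Mrs", noun="female")

print(summarise(male_version))
print(summarise(female_version))
# Systematic wording differences across many such pairs (e.g. "complex" or
# "disabled" appearing mostly for one gender) are the signal such audits measure.
```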

The situation is more complicated than it might seem at first glance. We clearly see a difference in how AI treats women's complaints. We are also aware of the peculiarities of women's neurosensory perception, which are reflected in the data used to train the neural network. Women's complaints cannot simply be dismissed, but how do we identify genuinely exaggerated complaints and bring them to a common denominator? The problem is even harder in areas where clear indicators cannot be established through laboratory tests, and medicine is full of factors that are difficult to quantify.

The situation is even worse for racial minorities and the LGBTQ community. Studies show that computer-vision-based models often underestimate pathologies in vulnerable subgroups, such as Black women.

Clearly, the output of neural networks can be "corrected" by changing training settings and training data, but doing so requires a deep understanding of exactly what needs to change. The study shows very clearly that the quality of a neural network's output depends heavily on the quality of the data it was trained on. It is also important to understand that it is still too early to rely on a neural network as a trustworthy source of information about human health. A doctor can also make mistakes or hold gender or racial biases, but a doctor at least bears responsibility for the patient's health.
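One commonly discussed way of "correcting" training data, not something the study itself proposes, is counterfactual data augmentation: for every training example, add a copy with gendered terms swapped, so that severity wording cannot be learned as a function of the patient's gender. A minimal sketch, with the word list and data format as assumptions:

```python
# Crude counterfactual data augmentation for a text dataset.
SWAPS = {
    "he": "she", "she": "he",
    "his": "her", "her": "his",
    "mr": "mrs", "mrs": "mr",
    "male": "female", "female": "male",
    "man": "woman", "woman": "man",
}

def swap_gender(text: str) -> str:
    words = []
    for token in text.split():
        core = token.strip(".,").lower()
        if core in SWAPS:
            swapped = SWAPS[core]
            if token[0].isupper():            # preserve capitalisation crudely
                swapped = swapped.capitalize()
            swapped += token[len(token.rstrip(".,")):]  # keep trailing punctuation
            words.append(swapped)
        else:
            words.append(token)
    return " ".join(words)

def augment(dataset: list[dict]) -> list[dict]:
    """Return the dataset plus a gender-swapped copy of every example."""
    augmented = list(dataset)
    for example in dataset:
        augmented.append({**example, "text": swap_gender(example["text"])})
    return augmented

print(swap_gender("Mr Smith is an 84-year-old male with poor mobility."))
# -> "Mrs Smith is an 84-year-old female with poor mobility."
```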

Source: www.engadget.com