AI in Mental Health: Emotional and Sentiment Analysis of Large Language Models' Responses to Depression, Anxiety, and Stress Queries

AI-generated keywords: Mental Health, Large Language Models, Emotional Responses, Model Selection, User Experience

AI-generated Key Points

  • Large Language Models (LLMs) are sought after for information on mental health issues like depression, anxiety, and stress.
  • Eight LLMs (Claude Sonnet, Copilot, Gemini Pro, GPT-4o, GPT-4o mini, Llama, Mixtral, and Perplexity) were studied in terms of their responses to twenty questions related to mental health.
  • Emotional analysis of the generated answers showed dominance of optimism, fear, and sadness with neutral sentiment consistently high.
  • Mixtral exhibited high levels of negative emotions while Llama showcased optimistic and joyful responses.
  • Different mental health conditions influenced emotional responses; anxiety prompts elicited high fear scores while depression prompts generated elevated sadness. Stress-related queries produced the most optimistic responses with increased joy and trust.
  • Demographic framing had minimal impact on emotional tone in the study.
  • Model-specific and condition-specific differences were confirmed through statistical analyses emphasizing the importance of model selection in mental health applications.

Authors: Arya VarastehNezhad, Reza Tavasoli, Soroush Elyasi, MohammadHossein LotfiNia, Hamed Farbeh

License: CC BY 4.0

Abstract: Depression, anxiety, and stress are widespread mental health concerns that increasingly drive individuals to seek information from Large Language Models (LLMs). This study investigates how eight LLMs (Claude Sonnet, Copilot, Gemini Pro, GPT-4o, GPT-4o mini, Llama, Mixtral, and Perplexity) reply to twenty pragmatic questions about depression, anxiety, and stress when those questions are framed for six user profiles (baseline, woman, man, young, old, and university student). The models generated 2,880 answers, which we scored for sentiment and emotions using state-of-the-art tools. Our analysis revealed that optimism, fear, and sadness dominated the emotional landscape across all outputs, with neutral sentiment maintaining consistently high values. Gratitude, joy, and trust appeared at moderate levels, while emotions such as anger, disgust, and love were rarely expressed. The choice of LLM significantly influenced emotional expression patterns. Mixtral exhibited the highest levels of negative emotions including disapproval, annoyance, and sadness, while Llama demonstrated the most optimistic and joyful responses. The type of mental health condition dramatically shaped emotional responses: anxiety prompts elicited extraordinarily high fear scores (0.974), depression prompts generated elevated sadness (0.686) and the highest negative sentiment, while stress-related queries produced the most optimistic responses (0.755) with elevated joy and trust. In contrast, demographic framing of queries produced only marginal variations in emotional tone. Statistical analyses confirmed significant model-specific and condition-specific differences, while demographic influences remained minimal. These findings highlight the critical importance of model selection in mental health applications, as each LLM exhibits a distinct emotional signature that could significantly impact user experience and outcomes.

Submitted to arXiv on 15 Aug. 2025

Results of the summarizing process for the arXiv paper: 2508.11285v1

Depression, anxiety, and stress are prevalent mental health concerns that drive individuals to seek information from Large Language Models (LLMs). This study examines how eight LLMs (Claude Sonnet, Copilot, Gemini Pro, GPT-4o, GPT-4o mini, Llama, Mixtral, and Perplexity) respond to twenty pragmatic questions about these conditions. The questions were framed for six user profiles: baseline, woman, man, young, old, and university student. The models generated a total of 2,880 answers, which were then scored for sentiment and emotions using state-of-the-art tools. Across all outputs, optimism, fear, and sadness dominated, with neutral sentiment consistently high. Gratitude, joy, and trust appeared at moderate levels, while anger, disgust, and love were rarely expressed. The choice of LLM significantly affected emotional expression: Mixtral exhibited the highest levels of negative emotions, including disapproval, annoyance, and sadness, while Llama produced the most optimistic and joyful responses. The type of mental health condition also shaped emotional responses: anxiety prompts elicited very high fear scores (0.974), depression prompts generated elevated sadness (0.686) and the highest negative sentiment, and stress-related queries produced the most optimistic responses (0.755), with elevated joy and trust. In contrast, demographic framing of queries produced only marginal variations in emotional tone. Statistical analyses confirmed significant model-specific and condition-specific differences, underscoring the importance of model selection in mental health applications. Each LLM has a distinct emotional signature that could significantly influence user experience and outcomes.
The study used a set of twenty questions covering aspects of mental health such as diagnosis, treatment, lifestyle changes, risk management, support, medication side effects, impact on daily life, prognosis, and improvement timelines. The questions were adapted to each demographic profile so that demographic influences could be explored systematically. Each query was submitted to all eight LLMs, using a method suited to each model's interface. Each combination of LLM, demographic profile, and mental health condition yielded 20 responses. The analysis shows how LLMs respond to individuals seeking information about mental health and how emotional patterns vary across models, conditions, and demographics, highlighting the role model selection plays in shaping user experience in mental health applications.
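The study design described above can be sketched as a full factorial grid. The following is a minimal illustration, not the authors' code: it assumes the 2,880 answers decompose as 8 models × 6 profiles × 3 conditions × 20 questions, which matches the reported total. The names `build_grid` and `QUESTIONS_PER_CELL` and the integer question IDs are hypothetical.

```python
from itertools import product

# Factors named in the abstract.
MODELS = [
    "Claude Sonnet", "Copilot", "Gemini Pro", "GPT-4o",
    "GPT-4o mini", "Llama", "Mixtral", "Perplexity",
]
PROFILES = ["baseline", "woman", "man", "young", "old", "university student"]
CONDITIONS = ["depression", "anxiety", "stress"]

# Assumption: 20 questions per (model, profile, condition) cell.
QUESTIONS_PER_CELL = 20


def build_grid():
    """Enumerate every (model, profile, condition, question) combination."""
    question_ids = range(1, QUESTIONS_PER_CELL + 1)  # placeholder IDs
    return [
        {"model": m, "profile": p, "condition": c, "question_id": q}
        for m, p, c, q in product(MODELS, PROFILES, CONDITIONS, question_ids)
    ]


if __name__ == "__main__":
    grid = build_grid()
    print(len(grid))  # 8 * 6 * 3 * 20 = 2880
```

Each dictionary in the grid stands for one prompt submission. Under this assumed decomposition, the number of entries equals the 2,880 answers reported in the abstract.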
Created on 28 Aug. 2025
