AI in Mental Health: Emotional and Sentiment Analysis of Large Language Models' Responses to Depression, Anxiety, and Stress Queries

AI-generated keywords: Mental Health, Large Language Models, Emotional Responses, Model Selection, User Experience

AI-generated Key Points

People increasingly turn to Large Language Models (LLMs) for information on mental health issues like depression, anxiety, and stress.
Eight LLMs (Claude Sonnet, Copilot, Gemini Pro, GPT-4o, GPT-4o mini, Llama, Mixtral, and Perplexity) were studied in terms of their responses to twenty questions related to mental health.
Emotional analysis of the generated answers showed dominance of optimism, fear, and sadness with neutral sentiment consistently high.
Mixtral exhibited high levels of negative emotions while Llama showcased optimistic and joyful responses.
Different mental health conditions influenced emotional responses; anxiety prompts elicited high fear scores while depression prompts generated elevated sadness. Stress-related queries produced the most optimistic responses with increased joy and trust.
Demographic framing had minimal impact on emotional tone in the study.
Model-specific and condition-specific differences were confirmed through statistical analyses emphasizing the importance of model selection in mental health applications.

Authors: Arya VarastehNezhad, Reza Tavasoli, Soroush Elyasi, MohammadHossein LotfiNia, Hamed Farbeh

arXiv: 2508.11285v1 (cs.CL)

License: CC BY 4.0

Abstract: Depression, anxiety, and stress are widespread mental health concerns that increasingly drive individuals to seek information from Large Language Models (LLMs). This study investigates how eight LLMs (Claude Sonnet, Copilot, Gemini Pro, GPT-4o, GPT-4o mini, Llama, Mixtral, and Perplexity) reply to twenty pragmatic questions about depression, anxiety, and stress when those questions are framed for six user profiles (baseline, woman, man, young, old, and university student). The models generated 2,880 answers, which we scored for sentiment and emotions using state-of-the-art tools. Our analysis revealed that optimism, fear, and sadness dominated the emotional landscape across all outputs, with neutral sentiment maintaining consistently high values. Gratitude, joy, and trust appeared at moderate levels, while emotions such as anger, disgust, and love were rarely expressed. The choice of LLM significantly influenced emotional expression patterns. Mixtral exhibited the highest levels of negative emotions including disapproval, annoyance, and sadness, while Llama demonstrated the most optimistic and joyful responses. The type of mental health condition dramatically shaped emotional responses: anxiety prompts elicited extraordinarily high fear scores (0.974), depression prompts generated elevated sadness (0.686) and the highest negative sentiment, while stress-related queries produced the most optimistic responses (0.755) with elevated joy and trust. In contrast, demographic framing of queries produced only marginal variations in emotional tone. Statistical analyses confirmed significant model-specific and condition-specific differences, while demographic influences remained minimal. These findings highlight the critical importance of model selection in mental health applications, as each LLM exhibits a distinct emotional signature that could significantly impact user experience and outcomes.

Submitted to arXiv on 15 Aug. 2025

Results of the summarizing process for the arXiv paper: 2508.11285v1

Comprehensive Summary

Depression, anxiety, and stress are prevalent mental health concerns that drive individuals to seek information from Large Language Models (LLMs). This study examines how eight LLMs (Claude Sonnet, Copilot, Gemini Pro, GPT-4o, GPT-4o mini, Llama, Mixtral, and Perplexity) respond to twenty pragmatic questions about these conditions. The questions were tailored for six user profiles: baseline, woman, man, young, old, and university student. The models generated a total of 2,880 answers, which were then scored for sentiment and emotions using state-of-the-art tools. Across all outputs, optimism, fear, and sadness dominated, with neutral sentiment consistently high. Gratitude, joy, and trust appeared at moderate levels, while anger, disgust, and love were rarely expressed. The choice of LLM significantly affected emotional expression: Mixtral exhibited the highest levels of negative emotions (disapproval, annoyance, and sadness), while Llama produced the most optimistic and joyful responses. The type of mental health condition also strongly shaped the responses: anxiety prompts elicited very high fear scores (0.974), depression prompts generated elevated sadness (0.686) and the highest negative sentiment, and stress-related queries produced the most optimistic responses (0.755), with elevated joy and trust. In contrast, demographic framing of queries had minimal impact on emotional tone. Statistical analyses confirmed significant model-specific and condition-specific differences, underscoring the importance of model selection in mental health applications. Each LLM has a distinct emotional signature that could significantly influence user experience and outcomes.
The twenty questions covered aspects of mental health such as diagnosis options, treatment, lifestyle changes, risk management, support, medication side effects, daily life impacts, and prognosis improvement timelines, among others. Each question was adapted to the demographic profiles so that demographic influences could be explored systematically. Every query was submitted to all eight LLMs, using different access methods depending on each model's interface. Each combination of LLM, demographic category, and condition yielded 20 responses, which accounts for the 2,880 answers in total (8 models × 6 profiles × 3 conditions × 20 questions). The analysis shows how LLMs respond to individuals seeking information about mental health, describes the emotional patterns associated with different models, conditions, and demographics, and highlights the role model selection plays in shaping user experience in mental health applications.
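
The study design described above can be sketched as a simple prompt grid. The following is a minimal illustration, not the authors' code: the model, profile, and condition names come from the abstract, while `build_grid` and the numeric question IDs are hypothetical placeholders, since the summary does not give the question wording.

```python
from itertools import product

# Names taken from the abstract of the paper.
MODELS = [
    "Claude Sonnet", "Copilot", "Gemini Pro", "GPT-4o",
    "GPT-4o mini", "Llama", "Mixtral", "Perplexity",
]
PROFILES = ["baseline", "woman", "man", "young", "old", "university student"]
CONDITIONS = ["depression", "anxiety", "stress"]

# Assumption: questions are represented by IDs 1..20 because their
# actual wording is not available in this summary.
N_QUESTIONS = 20


def build_grid():
    """Return one (model, profile, condition, question_id) tuple per expected response."""
    return list(product(MODELS, PROFILES, CONDITIONS, range(1, N_QUESTIONS + 1)))


grid = build_grid()
print(len(grid))  # 8 * 6 * 3 * 20 = 2880
```

Enumerating the grid this way makes the count easy to check: each of the 144 model, profile, and condition combinations contributes exactly 20 entries.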

Summary

Large Language Models (LLMs) are like big libraries that help us learn about feelings like being sad, worried, or stressed. Eight LLMs were studied to see how they talk about mental health. The researchers found that some models were more positive while others were more negative. Different emotions, like happiness and fear, showed up in their answers depending on the questions asked. The type of question and the model used can affect the emotional tone of the answers people get about mental health.

Definitions

- Large Language Models (LLMs): Computer programs trained on large amounts of text so they can understand and generate human-like text.
- Mental health: How we feel and think, including emotions like happiness, sadness, worry, and stress.
- Optimism: Feeling hopeful or positive about the future.
- Fear: A strong emotion caused by danger or uncertainty.
- Sadness: Feeling unhappy or sorrowful.
- Anxiety: A feeling of worry or nervousness about something uncertain or threatening.
- Depression: A condition where someone feels very sad, hopeless, and unmotivated for a long time.
- Stress: Pressure or tension caused by difficult situations.
- Demographic framing: Wording a question as if it came from a particular kind of person, for example by age, gender, or student status.
- Statistical analyses: Using numbers and data to study patterns and relationships in information.

Introduction

Mental health concerns such as depression, anxiety, and stress are widespread, and individuals increasingly turn to Large Language Models (LLMs) for information and support. These models use artificial intelligence to generate responses to the questions they are given, which makes them an accessible resource for people seeking information about mental health. A recent study examined how eight LLMs (Claude Sonnet, Copilot, Gemini Pro, GPT-4o, GPT-4o mini, Llama, Mixtral, and Perplexity) respond to twenty pragmatic questions about mental health issues. The study analyzed the emotional landscape of these responses to understand how model selection affects user experience in mental health applications.

The Study

The study involved developing a set of twenty carefully crafted questions covering various aspects of mental health, such as diagnosis options, treatment methods, lifestyle changes, risk management strategies, support systems, medication side effects, daily life impacts, and prognosis improvement timelines. These questions were tailored for six different user profiles: baseline, woman, man, young, old, and university student. Each query was submitted to all eight LLMs, using different methods depending on each model's interface. As a result, each combination of LLM, demographic category, and condition yielded 20 responses, with the same collection procedure applied throughout the study.

The Emotional Landscape

After collecting a total of 2,880 answers from the eight LLMs, the researchers used state-of-the-art tools to score the sentiment and emotions present in each response. Across all outputs, optimism, fear, and sadness dominated, with neutral sentiment consistently high. Gratitude, joy, and trust appeared at moderate levels, while anger, disgust, and love were rarely expressed. The choice of LLM significantly affected emotional expression patterns. For example, Mixtral exhibited the highest levels of negative emotions, including disapproval, annoyance, and sadness, while Llama produced the most optimistic and joyful responses. This highlights the importance of model selection in shaping user experiences in mental health applications.

The Influence of Mental Health Conditions

The type of mental health condition also played a significant role in the emotional tone of the LLMs' responses. Anxiety prompts elicited very high fear scores (0.974), while depression prompts generated elevated sadness (0.686) and the highest negative sentiment. Stress-related queries produced the most optimistic responses (0.755), with elevated joy and trust. This suggests that individuals may encounter a noticeably different emotional tone depending on which condition they ask about.
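
One way to arrive at condition-level figures like these is to average per-response emotion scores within each condition. The sketch below assumes that approach; the summary does not state how the authors aggregated scores, and the records in `scored` are toy values invented for illustration, not data from the paper. The helper names `mean_scores_by_condition` and `dominant_emotion` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical scored responses: (condition, {emotion: score}).
# Toy values for illustration only; these are NOT the paper's data.
scored = [
    ("anxiety", {"fear": 0.9, "sadness": 0.2, "optimism": 0.3}),
    ("anxiety", {"fear": 0.7, "sadness": 0.4, "optimism": 0.5}),
    ("depression", {"fear": 0.1, "sadness": 0.8, "optimism": 0.4}),
    ("depression", {"fear": 0.3, "sadness": 0.6, "optimism": 0.2}),
    ("stress", {"fear": 0.2, "sadness": 0.1, "optimism": 0.9}),
    ("stress", {"fear": 0.4, "sadness": 0.3, "optimism": 0.7}),
]


def mean_scores_by_condition(records):
    """Average each emotion score over all responses that share a condition."""
    buckets = defaultdict(lambda: defaultdict(list))
    for condition, scores in records:
        for emotion, value in scores.items():
            buckets[condition][emotion].append(value)
    return {
        condition: {emotion: mean(values) for emotion, values in emotions.items()}
        for condition, emotions in buckets.items()
    }


def dominant_emotion(means):
    """Pick the emotion with the highest mean score for each condition."""
    return {condition: max(scores, key=scores.get) for condition, scores in means.items()}


means = mean_scores_by_condition(scored)
print(dominant_emotion(means))
```

With the toy records above, fear comes out on top for anxiety, sadness for depression, and optimism for stress, which mirrors the direction of the reported pattern without reproducing its values. The same grouping could be keyed by model or demographic profile instead of condition.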

The Impact of Demographics

One notable finding from this study was that demographic framing of queries had only a marginal impact on emotional tone. This suggests that, regardless of the age or gender stated in the query, individuals seeking information about mental health are likely to receive responses with a similar emotional tone from these models. In contrast, statistical analyses confirmed significant model-specific and condition-specific differences, emphasizing the importance of considering both model selection and the specific mental health concern when using LLMs for support or information.

Conclusion

In conclusion, this analysis sheds light on how Large Language Models respond to individuals seeking information about mental health. The results highlight the critical role model selection plays in shaping user experiences in this domain. Each LLM has a distinct emotional signature that could significantly influence user experience and outcomes. The study describes the emotional patterns associated with different models, conditions, and demographics, and can help guide future research on improving the effectiveness of LLMs in mental health applications.

Limitations and Future Directions

While this study provides valuable insights into the emotional landscape of LLM responses to mental health queries, there are some limitations to consider. First, the study covered only eight specific models, so it may not be representative of all available LLMs. Second, it analyzed responses to only twenty predetermined questions, so further research is needed to explore a wider range of queries. Future studies could also incorporate user feedback and experiences with these models to better understand their impact on individuals seeking information about mental health. Investigating how these models respond across different languages and cultures could likewise inform global use. Despite these limitations, the study offers a foundation for future research on improving the effectiveness and user experience of Large Language Models in mental health.

Created on 28 Aug. 2025
