Machine Learning Approaches for Mental Illness Detection on Social Media: A Systematic Review of Biases and Methodological Challenges

AI-generated keywords: Machine learning, Mental illness, Social media data, Bias assessment, Depression detection

AI-generated Key Points

  • Systematic review on the use of machine learning (ML) models for detecting depression through social media data
  • Identified 47 relevant studies published after 2010
  • Used the Prediction model Risk Of Bias ASsessment Tool (PROBAST) to assess methodological quality and risk of bias
  • Significant biases found in studies, including heavy reliance on Twitter and English-language content, limiting diversity
  • Non-probability sampling methods used in 80% of studies, affecting representativeness
  • Only 23% of studies addressed linguistic nuances crucial for accurate sentiment analysis
  • Risks identified: inconsistent hyperparameter tuning, inadequate data partitioning, class imbalance issues
  • Future research recommendations: diversifying data sources, standardizing preprocessing methods, addressing class imbalance, enhancing reporting transparency

Authors: Yuchen Cao, Jianglai Dai, Zhongyan Wang, Yeyubei Zhang, Xiaorui Shen, Yunchong Liu, Yexin Tian

Journal of Behavioral Data Science, 5(1)
License: CC BY 4.0

Abstract: The global increase in mental illness requires innovative detection methods for early intervention. Social media provides a valuable platform to identify mental illness through user-generated content. This systematic review examines machine learning (ML) models for detecting mental illness, with a particular focus on depression, using social media data. It highlights biases and methodological challenges encountered throughout the ML lifecycle. A search of PubMed, IEEE Xplore, and Google Scholar identified 47 relevant studies published after 2010. The Prediction model Risk Of Bias ASsessment Tool (PROBAST) was utilized to assess methodological quality and risk of bias. The review reveals significant biases affecting model reliability and generalizability. A predominant reliance on Twitter (63.8%) and English-language content (over 90%) limits diversity, with most studies focused on users from the United States and Europe. Non-probability sampling (80%) limits representativeness. Only 23% explicitly addressed linguistic nuances like negations, crucial for accurate sentiment analysis. Inconsistent hyperparameter tuning (27.7%) and inadequate data partitioning (17%) risk overfitting. While 74.5% used appropriate evaluation metrics for imbalanced data, others relied on accuracy without addressing class imbalance, potentially skewing results. Reporting transparency varied, often lacking critical methodological details. These findings highlight the need to diversify data sources, standardize preprocessing, ensure consistent model development, address class imbalance, and enhance reporting transparency. By overcoming these challenges, future research can develop more robust and generalizable ML models for depression detection on social media, contributing to improved mental health outcomes globally.
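To make the data partitioning concern concrete, here is a minimal sketch of a stratified train/validation/test split in plain Python. It is an illustration only, not a method taken from the reviewed studies: the function name `stratified_split`, the 60/20/20 fractions, and the toy labels (90 negative, 10 positive users) are all assumptions chosen for the example.

```python
import random
from collections import defaultdict

def stratified_split(labels, fractions=(0.6, 0.2, 0.2), seed=0):
    """Split example indices into train/val/test, preserving class ratios.

    Hypothetical helper for illustration; `fractions` is (train, val, test).
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # shuffle within each class
        n_train = int(round(fractions[0] * len(indices)))
        n_val = int(round(fractions[1] * len(indices)))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test

# Toy, made-up labels: 0 = control user, 1 = user labeled as depressed.
labels = [0] * 90 + [1] * 10
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))
print(sum(labels[i] for i in test), "positive examples in the test set")
```

Under this kind of scheme, hyperparameters would be chosen on the validation indices only, and the test indices would be used once for the final estimate, so that tuning does not leak into the reported performance.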

Submitted to arXiv on 21 Oct. 2024


Results of the summarizing process for the arXiv paper: 2410.16204v3

This systematic review examines the use of machine learning (ML) models for detecting mental illness, specifically depression, through analysis of social media data. A comprehensive search of PubMed, IEEE Xplore, and Google Scholar identified 47 relevant studies published after 2010. The Prediction model Risk Of Bias ASsessment Tool (PROBAST) was used to assess methodological quality and risk of bias in these studies. The review found significant biases that may affect the reliability and generalizability of the ML models. These include a heavy reliance on Twitter (63.8%) and English-language content (over 90%), which limits diversity and concentrates the evidence on users from the United States and Europe. Additionally, non-probability sampling methods were used in 80% of the studies, which may limit representativeness. Only 23% of the studies explicitly addressed linguistic nuances such as negations, which are crucial for accurate sentiment analysis. Inconsistent hyperparameter tuning (27.7%) and inadequate data partitioning (17%) were also identified as risks for overfitting. While most studies used appropriate evaluation metrics for imbalanced data (74.5%), some relied solely on accuracy without addressing class imbalance, which could skew results. Reporting transparency varied across studies, with many lacking critical methodological details. To address these challenges, future research should focus on diversifying data sources, standardizing preprocessing methods, ensuring consistent model development practices, addressing class imbalance issues, and enhancing reporting transparency.
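As an illustration of what handling negations in preprocessing can mean, the sketch below applies one common heuristic: tokens that follow a negation cue are prefixed with `NOT_` until the next punctuation mark. This is an assumed example, not the approach of any specific study in the review; the cue list, the `NOT_` prefix, and the function name `mark_negation` are choices made here for illustration.

```python
import re

# Assumed, deliberately small cue list; real systems use richer lexicons.
NEGATION_CUES = {"not", "no", "never", "cannot"}
CLAUSE_END = {".", ",", ";", "!", "?"}

def mark_negation(text):
    """Prefix tokens inside a negated clause with NOT_ (simple heuristic)."""
    tokens = re.findall(r"[a-z']+|[.,;!?]", text.lower())
    out = []
    negated = False
    for tok in tokens:
        if tok in CLAUSE_END:
            negated = False  # negation scope ends at punctuation
            out.append(tok)
        elif tok in NEGATION_CUES or tok.endswith("n't"):
            negated = True  # start a negated span after the cue
            out.append(tok)
        elif negated:
            out.append("NOT_" + tok)
        else:
            out.append(tok)
    return out

print(mark_negation("I am not happy today, but I am fine."))
```

With this marking, a bag-of-words model sees "NOT_happy" and "happy" as different features, so "not happy" is less likely to be scored as positive sentiment. The scope rule (stop at punctuation) is crude and will mislabel some sentences.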
The review process began with title and abstract screening by two authors with expertise in machine learning and mental health research, followed by full-text screening to keep study selection unbiased. For each included study, the reviewers extracted author names, publication details, study design, the machine learning models used, the social media platforms analyzed, the performance metrics reported, potential biases, and stated limitations. Findings were then synthesized systematically across the stages of the machine learning lifecycle: sampling, data preprocessing, model construction, tuning, and evaluation, the last based on quantitative metrics such as accuracy, precision, recall, F1 score, and AUROC. The review emphasizes transparent and complete reporting throughout as a condition for the integrity, reproducibility, and reliability of each study's findings. By addressing these challenges and improving reporting standards, future work can develop more robust and generalizable ML models for depression detection on social media and contribute to improved mental health outcomes globally.
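The concern about relying on accuracy alone with imbalanced classes can be shown with a small worked example. The sketch below computes accuracy, precision, recall, and F1 from scratch for a trivial classifier that always predicts the majority class. The data are made up for illustration (95 negative and 5 positive labels), and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    return tp, fp, fn, tn

def metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1; undefined ratios are set to 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Toy, made-up labels: 95 negatives (0) and 5 positives (1).
y_true = [0] * 95 + [1] * 5
always_negative = [0] * 100  # a "model" that never flags anyone

print(metrics(y_true, always_negative))
```

Here the always-negative classifier scores 95% accuracy while its recall and F1 for the positive class are zero, which is why metrics such as precision, recall, F1, or AUROC are generally preferred when the positive class is rare.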
Created on 01 Jul. 2025
