Detection of Fake Users in SMPs Using NLP and Graph Embeddings

AI-generated keywords: Fake and Spam User Detection

AI-generated Key Points

  • Research focuses on detecting fake and spam user accounts on Twitter
  • Combination of Graph Representation Learning and Natural Language Processing techniques used
  • Large user base of social media platforms generates massive amount of data
  • Fake and spam accounts used by organizations for competitive advantage
  • Model's performance compared with other existing works in Table 2
  • Dataset size affects model's performance; older models rely on Twitter user IDs from 2018 or earlier
  • Researchers' database consists of 46k active Twitter accounts that passed Twitter's spam detection tool
  • Model achieves approximately 97% accuracy score, significant improvement over baseline
  • Approach utilizes optimized user metadata attributes, NLP features, and graph representation of user accounts
  • Proposed feature set demonstrates consistent performance across different models and class labels
  • Spam users tend to form communities by selectively following other users
  • Future studies will explore this observation further and investigate differences between legitimate human users vs. legitimate social bots, as well as human spammers vs. social bot spammers
  • Recent increases in the maximum tweet length call for further study of automated spam accounts' ability to generate intelligent content
  • Approach can be applied not only to Twitter but also to other social media platforms or real-time filtering applications for data collection pipelines
  • Study provides valuable insights and lays foundation for future research in this field

Authors: Manojit Chakraborty, Shubham Das, Radhika Mamidi

5 pages, 3 figures
License: CC BY 4.0

Abstract: Social Media Platforms (SMPs) like Facebook, Twitter, Instagram etc. have a large user base all around the world that generates a huge amount of data every second. This includes a lot of posts by fake and spam users, typically used by many organisations around the globe to gain a competitive edge over others. In this work, we aim at detecting such user accounts on Twitter using a novel approach. We show how to distinguish between Genuine and Spam accounts on Twitter using a combination of Graph Representation Learning and Natural Language Processing techniques.

Submitted to arXiv on 27 Apr. 2021

Results of the summarizing process for the arXiv paper: 2104.13094v1

This research focuses on detecting fake and spam user accounts on Twitter using a combination of Graph Representation Learning and Natural Language Processing techniques. The study acknowledges the large user base of social media platforms like Facebook, Twitter, and Instagram, which generates a massive amount of data every second. Many organizations use fake and spam accounts to gain a competitive edge over others.

The authors compare their model's performance with other existing works in Table 2. They note that as the dataset size increases, the model's performance tends to decrease. Most models in Table 2 rely on Twitter user IDs from 2018 or earlier when there was a surge in fake, spam, and bot accounts. However, since mid-2018, Twitter has implemented robust spam detection and deletion technologies to eliminate most of these accounts. The researchers emphasize that their database consists of 46k active Twitter accounts that have successfully passed Twitter's spam detection tool. These accounts are much harder to detect as spam compared to older databases used by other models. Despite this challenge, their model achieves an accuracy score of approximately 97%, which is a significant improvement over the baseline.

In conclusion, this work presents a novel and effective method for detecting spam users on social media platforms. The approach utilizes optimized user metadata attributes, features generated from NLP techniques, and graph representation of user accounts. The proposed feature set demonstrates consistent performance across different models and class labels. The researchers also observe that spam users tend to form communities by selectively following other users. They plan to explore this observation further in future studies.

Additionally, they highlight the need for investigating the clear differences between legitimate human users vs. legitimate social bots and human spammers vs. social bot spammers. Furthermore, the authors suggest studying the impact of increased tweet length on automated spam accounts' ability to generate intelligent content due to recent changes in maximum tweet length limits. Overall, this spam account detection approach can be applied not only to Twitter but also to other social media platforms or real-time filtering applications for data collection pipelines. The study provides valuable insights and lays the foundation for future research in this field.
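
The idea of combining metadata, text, and graph information, together with the observation that spam users selectively follow each other, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy follow graph, and the example feature values are all hypothetical, and a simple hand-computed ratio (the share of followed accounts that are labeled spam) stands in for the learned graph embeddings the paper describes.

```python
from collections import defaultdict


def follow_spam_ratio(follows, labels):
    """Return, per user, the fraction of followed accounts labeled "spam".

    follows: iterable of (follower, followee) pairs (hypothetical toy format).
    labels:  dict mapping user -> "spam" or "genuine".
    Users who follow no labeled account are left out of the result.
    """
    spam_followed = defaultdict(int)
    total_followed = defaultdict(int)
    for follower, followee in follows:
        if followee not in labels:
            continue  # skip edges to accounts with unknown labels
        total_followed[follower] += 1
        if labels[followee] == "spam":
            spam_followed[follower] += 1
    return {u: spam_followed[u] / total_followed[u] for u in total_followed}


def build_feature_vector(metadata, text_features, graph_features):
    """Concatenate the three feature groups into one flat list."""
    return list(metadata) + list(text_features) + list(graph_features)


# Toy follow graph (assumption): three spam accounts that mostly follow
# each other, and two genuine accounts.
follows = [
    ("s1", "s2"), ("s1", "s3"),
    ("s2", "s1"), ("s2", "s3"),
    ("s3", "s1"),
    ("g1", "g2"), ("g1", "s1"),
    ("g2", "g1"),
]
labels = {"s1": "spam", "s2": "spam", "s3": "spam",
          "g1": "genuine", "g2": "genuine"}

ratios = follow_spam_ratio(follows, labels)

# Hypothetical per-user inputs: [followers, following] as metadata and
# [share of tweets containing a URL] as a single text feature.
g1_vector = build_feature_vector([120, 80], [0.1], [ratios["g1"]])
print(ratios)
print(g1_vector)
```

In this toy graph the spam accounts follow only other spam accounts, while the genuine accounts follow few or none, which is the kind of community signal the summary describes. A real pipeline would replace the ratio with learned node embeddings and feed the concatenated vector to a classifier.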
Created on 26 Jan. 2024
