NLP Methods in Host-based Intrusion Detection Systems: A Systematic Review and Future Directions
Authors: Zarrin Tasnim Sworna, Zahra Mousavi, Muhammad Ali Babar
Abstract: Host based Intrusion Detection System (HIDS) is an effective last line of defense for defending against cyber security attacks after perimeter defenses (e.g., Network based Intrusion Detection System and Firewall) have failed or been bypassed. HIDS is widely adopted in the industry as HIDS is ranked among the top two most used security tools by Security Operation Centers (SOC) of organizations. Although effective and efficient HIDS is highly desirable for industrial organizations, the evolution of increasingly complex attack patterns causes several challenges resulting in performance degradation of HIDS (e.g., high false alert rate creating alert fatigue for SOC staff). Since Natural Language Processing (NLP) methods are better suited for identifying complex attack patterns, an increasing number of HIDS are leveraging the advances in NLP that have shown effective and efficient performance in precisely detecting low footprint, zero day attacks and predicting the next steps of attackers. This active research trend of using NLP in HIDS demands a synthesized and comprehensive body of knowledge of NLP based HIDS. Thus, we conducted a systematic review of the literature on the end to end pipeline of the use of NLP in HIDS development. For the end to end NLP based HIDS development pipeline, we identify, taxonomically categorize and systematically compare the state of the art of NLP methods usage in HIDS, attacks detected by these NLP methods, datasets and evaluation metrics which are used to evaluate the NLP based HIDS. We highlight the relevant prevalent practices, considerations, advantages and limitations to support the HIDS developers. We also outline the future research directions for the NLP based HIDS development.
Ask questions about this paper to our AI assistant
You can also chat with multiple papers at once here.
Assess the quality of the AI-generated content by voting
Score: 0
Why do we need votes?
Votes are used to determine whether we need to re-run our summarizing tools. If the count reaches -10, our tools can be restarted.
Some bits of the article are not summarized yet, you can re-run the summarizing process by clicking on the Run button below.
Look for similar papers (in beta version)
By clicking on the button above, our algorithm will scan all papers in our database to find the closest based on the contents of the full papers and not just on metadata. Please note that it only works for papers that we have generated summaries for and you can rerun it from time to time to get a more accurate result while our database grows.
Disclaimer: The AI-based summarization tool and virtual assistant provided on this website may not always provide accurate and complete summaries or responses. We encourage you to carefully review and evaluate the generated content to ensure its quality and relevance to your needs.