Intent Mining from past conversations for Conversational Agent

AI-generated keywords: Intent Mining, Conversational Systems, Dialog Act Classifier, ITER-DBSCAN, Universal Sentence Encoder

AI-generated Key Points

  • Framework for intent mining from past conversations to improve conversational systems
  • Chatbots used for round-the-clock support and customer engagement
  • Challenges in building and training intent models due to lack of diverse training data
  • Four-step intent discovery framework proposed by the authors
  • Extract textual utterances using a pre-trained Dialog Act Classifier
  • Automatically cluster similar user utterances using ITER-DBSCAN algorithm
  • Manual annotation of clusters with intent labels by subject matter experts
  • Propagation of intent labels to unmapped utterances
  • Introduction of a better sentence representation method using Universal Sentence Encoder
  • Effectiveness demonstrated using Microsoft application IT support data and publicly available datasets
  • Comprehensive framework improves intent coverage and accuracy and saves time compared to manual annotation methods
  • Potential applications beyond conversational systems in short text clustering and labeling tasks
  • Future work includes scalability for larger datasets and exploration in general-purpose clustering tasks
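
The clustering step in the key points above can be illustrated with a minimal sketch. This page does not describe how ITER-DBSCAN adapts its parameters, so the schedule used here is an assumption: DBSCAN is re-run on the points still marked as noise, each round with a larger radius and a smaller minimum cluster size, so that sparse (minority) intents can still form clusters. The function names, the parameter names (`eps_step`, `max_rounds`), and the toy 2-D vectors standing in for sentence embeddings are all hypothetical.

```python
import math

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0

def dbscan(points, eps, min_pts):
    """Plain DBSCAN over cosine distance; returns one label per point, -1 for noise."""
    labels = [None] * len(points)

    def neighbours(i):
        return [j for j in range(len(points))
                if cosine_distance(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1              # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster     # noise point becomes a border point
            if labels[j] is not None:
                continue                # already assigned
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable) # j is a core point, keep expanding
    return labels

def iterative_dbscan(points, eps=0.01, min_pts=4, eps_step=0.05, max_rounds=3):
    """Assumed schedule: re-cluster the remaining noise with relaxed parameters."""
    labels = [-1] * len(points)
    next_id = 0
    for _ in range(max_rounds):
        remaining = [i for i, lab in enumerate(labels) if lab == -1]
        if not remaining or min_pts < 2:
            break
        found = dbscan([points[i] for i in remaining], eps, min_pts)
        for i, lab in zip(remaining, found):
            if lab != -1:
                labels[i] = next_id + lab
        next_id += max(found) + 1
        eps += eps_step                 # widen the neighbourhood
        min_pts -= 1                    # accept smaller clusters
    return labels

# Toy data: a dense group of 5, a sparse group of 3, and one outlier.
points = [(1, 0), (1, 0.01), (1, 0.02), (1, -0.01), (1, -0.02),
          (0, 1), (0.3, 1), (-0.3, 1),
          (-1, -1)]
print(iterative_dbscan(points))  # → [0, 0, 0, 0, 0, 1, 1, 1, -1]
```

On this toy data a single pass with the initial parameters finds only the dense group; the sparse group is recovered in the second round, which is the kind of unbalanced case the key points refer to.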

Authors: Ajay Chatterjee, Shubhashis Sengupta

8 pages, 2 figures
License: CC BY 4.0

Abstract: Conversational systems are of primary interest in the AI community. Chatbots are increasingly being deployed to provide round-the-clock support and to increase customer engagement. Many commercial bot-building frameworks follow a standard approach that requires one to build and train an intent model to recognize user input. Intent models are trained in a supervised setting on a collection of textual utterance and intent label pairs. Gathering a substantial body of training data with wide coverage of different intents is a bottleneck in the bot-building process. Moreover, labeling hundreds to thousands of conversations with intents is time-consuming and laborious. In this paper, we present an intent discovery framework that generates intent training data from raw conversations in four primary steps: extraction of textual utterances from a conversation using a pre-trained, domain-agnostic Dialog Act Classifier (Data Extraction); automatic clustering of similar user utterances (Clustering); manual annotation of clusters with an intent label (Labeling); and propagation of intent labels to the utterances from the previous step that are not mapped to any cluster (Label Propagation). We introduce a novel density-based clustering algorithm, ITER-DBSCAN, for clustering unbalanced data. Subject matter experts (annotators with domain expertise) manually review the clustered user utterances and provide an intent label for each cluster. We conducted user studies to validate the effectiveness of the resulting trained intent model in terms of coverage of intents, accuracy, and time saved relative to manual annotation. Although the system was developed for building an intent model for a conversational system, the framework can also be used for short text clustering or as a labeling framework.

Submitted to arXiv on 22 May. 2020

Results of the summarizing process for the arXiv paper: 2005.11014v1

In this paper, the authors present a framework for intent mining from past conversations to improve conversational systems. They highlight that chatbots are increasingly being used to provide round-the-clock support and enhance customer engagement. However, building and training an intent model for these systems can be challenging due to the need for substantial and diverse training data.

To address these issues, the authors propose a four-step intent discovery framework. The first step involves extracting textual utterances from conversations using a pre-trained domain agnostic Dialog Act Classifier. Then, similar user utterances are automatically clustered using a novel density-based clustering algorithm called ITER-DBSCAN. Next, subject matter experts manually annotate the clusters with intent labels. Finally, intent labels are propagated to the utterances that are not mapped to any cluster. The authors also introduce a better sentence representation method using Universal Sentence Encoder for identifying imbalanced classes.

They demonstrate the effectiveness of their framework using internal Microsoft application IT support conversational data and publicly available datasets for intent classification and short text classification. Overall, this paper presents a comprehensive framework for intent mining from past conversations in conversational systems which shows promise in improving coverage of intents, accuracy, and saving time compared to manual annotation methods. Additionally, the framework has potential applications beyond conversational systems in short text clustering and labeling tasks. In terms of future work, the authors aim to make their algorithm more scalable for larger datasets as well as explore its application in general-purpose clustering tasks such as knowledge graph generation and other areas of data mining.
Created on 26 Dec. 2023
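
The label propagation step mentioned in the summary can be sketched as follows. The page does not say how the authors propagate labels, so this uses a simple stand-in that is purely an assumption: each unmapped utterance takes the intent of the nearest labelled cluster centroid, provided the cosine similarity clears a threshold. The function names, the `min_similarity` parameter, the intent names, and the toy 2-D vectors standing in for sentence embeddings are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def centroid(vectors):
    # Component-wise mean of the vectors in one cluster.
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def propagate_labels(labelled_clusters, unmapped, min_similarity=0.8):
    """Nearest-centroid stand-in: returns an intent name or None per unmapped vector."""
    centroids = {intent: centroid(vecs) for intent, vecs in labelled_clusters.items()}
    result = []
    for vec in unmapped:
        intent, score = max(
            ((name, cosine_similarity(vec, c)) for name, c in centroids.items()),
            key=lambda pair: pair[1],
        )
        # Leave the utterance unlabelled when no cluster is close enough.
        result.append(intent if score >= min_similarity else None)
    return result

# Clusters as annotated by a subject matter expert (hypothetical intent names).
labelled_clusters = {
    "reset_password": [[1.0, 0.0], [0.9, 0.1]],
    "install_software": [[0.0, 1.0], [0.1, 0.9]],
}
unmapped = [[0.8, 0.2], [0.2, 0.7], [-1.0, -1.0]]
print(propagate_labels(labelled_clusters, unmapped))
# → ['reset_password', 'install_software', None]
```

Keeping a similarity threshold means utterances far from every annotated cluster stay unlabelled instead of being forced into the wrong intent; the value 0.8 is illustrative only.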
