Hierarchical Verbalizer for Few-Shot Hierarchical Text Classification

AI-generated keywords: Hierarchical Text Classification, Prompt-based Learning, HierVerb, Pre-trained Language Models, Label Hierarchy

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • Performance in hierarchical text classification (HTC) is hindered by complex label hierarchy and high labeling cost, especially in low-resource or few-shot scenarios.
  • Researchers have explored the use of prompts on pre-trained language models (PLMs) for few-shot flat text classification tasks, but limited research exists on applying prompt-based learning to HTC problems with scarce training data.
  • The authors propose a path-based few-shot setting and establish a strict path-based evaluation metric to investigate few-shot HTC tasks further.
  • They introduce a framework called "HierVerb" that treats HTC as a single- or multi-label classification problem at multiple layers and learns vectors as verbalizers, constrained by the hierarchical structure and by hierarchical contrastive learning.
  • By fusing label hierarchy knowledge into the verbalizers, HierVerb outperforms existing methods that inject hierarchy through graph encoders, maximizing the benefits of PLMs in HTC tasks.
  • Extensive experiments on three popular HTC datasets under few-shot settings demonstrate that using prompts with HierVerb significantly improves HTC performance.
  • This approach provides an elegant solution to bridge the gap between large pre-trained models and downstream hierarchical classification tasks.
  • The code and few-shot dataset used in this study are publicly available at https://github.com/1KE-JI/HierVerb.
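
The key points mention a path-based few-shot setting without defining it. One plausible reading, offered here only as an assumption, is that K examples are sampled for every distinct root-to-leaf label path rather than for every individual label. The function name `sample_k_shot_by_path` and the toy data are hypothetical and are not taken from the paper or its repository.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

LabelPath = Tuple[str, ...]


def sample_k_shot_by_path(
    examples: Sequence[Tuple[str, Sequence[str]]], k: int, seed: int = 0
) -> List[Tuple[str, LabelPath]]:
    """Keep at most k examples for each distinct label path (assumed setting)."""
    rng = random.Random(seed)
    by_path: Dict[LabelPath, List[str]] = defaultdict(list)
    for text, path in examples:
        by_path[tuple(path)].append(text)

    subset: List[Tuple[str, LabelPath]] = []
    for path in sorted(by_path):  # sorted for a reproducible order
        texts = by_path[path]
        chosen = rng.sample(texts, min(k, len(texts)))
        subset.extend((text, path) for text in chosen)
    return subset


# Toy data: three documents on one path, one document on another.
toy_examples = [
    ("doc 1", ["News", "Sports"]),
    ("doc 2", ["News", "Sports"]),
    ("doc 3", ["News", "Sports"]),
    ("doc 4", ["Science", "Physics"]),
]
two_shot = sample_k_shot_by_path(toy_examples, k=2)
```

Grouping by the full path, instead of by each label separately, keeps every sampled document consistent with the hierarchy; whether the authors sample this way should be checked against the paper itself.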

Authors: Ke Ji, Yixin Lian, Jingsheng Gao, Baoyuan Wang

14 pages, 8 figures, Accepted by ACL 2023

Abstract: Due to the complex label hierarchy and intensive labeling cost in practice, hierarchical text classification (HTC) performs poorly, especially in low-resource or few-shot settings. Recently, there has been a growing trend of applying prompts to pre-trained language models (PLMs), which has proven effective in few-shot flat text classification tasks. However, limited work has studied the paradigm of prompt-based learning in the HTC problem when the training data is extremely scarce. In this work, we define a path-based few-shot setting and establish a strict path-based evaluation metric to further explore few-shot HTC tasks. To address the issue, we propose the hierarchical verbalizer ("HierVerb"), a multi-verbalizer framework that treats HTC as a single- or multi-label classification problem at multiple layers and learns vectors as verbalizers constrained by hierarchical structure and hierarchical contrastive learning. In this manner, HierVerb fuses label hierarchy knowledge into verbalizers and remarkably outperforms methods that inject hierarchy through graph encoders, maximizing the benefits of PLMs. Extensive experiments on three popular HTC datasets under few-shot settings demonstrate that prompting with HierVerb significantly boosts HTC performance, while indicating an elegant way to bridge the gap between large pre-trained models and downstream hierarchical classification tasks. Our code and few-shot dataset are publicly available at https://github.com/1KE-JI/HierVerb.
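
The abstract describes a multi-verbalizer framework that classifies at multiple layers and uses learned vectors as verbalizers, but it does not say how those vectors are scored. The sketch below is an illustration under stated assumptions: each hierarchy layer has its own [MASK] hidden vector and its own set of label vectors, logits are plain dot products, softmax is used for the single-label case, and a sigmoid for the multi-label case. All names, dimensions, and values are hypothetical; in a real system the vectors would be trainable parameters and the hidden states would come from a PLM.

```python
import math
from typing import Dict, List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: List[float]) -> List[float]:
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def score_layers(
    mask_vectors: List[Vector],
    verbalizers: List[Dict[str, Vector]],
    multi_label: bool = False,
) -> List[Dict[str, float]]:
    """Score every label of every layer against that layer's [MASK] vector."""
    per_layer = []
    for hidden, layer in zip(mask_vectors, verbalizers):
        labels = list(layer)
        logits = [dot(hidden, layer[label]) for label in labels]
        if multi_label:
            probs = [sigmoid(z) for z in logits]  # independent per label
        else:
            probs = softmax(logits)  # exactly one label per layer
        per_layer.append(dict(zip(labels, probs)))
    return per_layer


# Hypothetical two-layer hierarchy with 2-dimensional vectors.
toy_mask_vectors = [[1.0, 0.0], [0.0, 1.0]]
toy_verbalizers = [
    {"News": [2.0, 0.0], "Science": [0.0, 2.0]},
    {"Sports": [0.0, 3.0], "Politics": [3.0, 0.0]},
]
layer_probs = score_layers(toy_mask_vectors, toy_verbalizers)
predicted_path = [max(p, key=p.get) for p in layer_probs]
```

The sketch omits the hierarchical constraint and the hierarchical contrastive learning named in the abstract, since the metadata gives no detail on either.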

Submitted to arXiv on 26 May. 2023


Results of the summarizing process for the arXiv paper: 2305.16885v1

Hierarchical text classification (HTC) performs poorly in low-resource and few-shot scenarios because of its complex label hierarchy and high labeling cost. Prompting pre-trained language models (PLMs) has proven effective for few-shot flat text classification, but little work has applied prompt-based learning to HTC when training data is extremely scarce. In this study, the authors define a path-based few-shot setting and establish a strict path-based evaluation metric to investigate few-shot HTC further. They introduce "HierVerb", a multi-verbalizer framework that treats HTC as a single- or multi-label classification problem at multiple layers and learns vectors as verbalizers, constrained by the hierarchical structure and by hierarchical contrastive learning. By fusing label hierarchy knowledge into the verbalizers, HierVerb outperforms existing methods that inject hierarchy through graph encoders and maximizes the benefits of PLMs. Extensive experiments on three popular HTC datasets under few-shot settings show that prompting with HierVerb significantly improves HTC performance, and the authors present the approach as an elegant way to bridge the gap between large pre-trained models and downstream hierarchical classification tasks. The code and few-shot dataset are publicly available at https://github.com/1KE-JI/HierVerb. The paper was accepted by ACL 2023 and has 14 pages and 8 figures. The authors are Ke Ji, Yixin Lian, Jingsheng Gao, and Baoyuan Wang.
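
The summary refers to a strict path-based evaluation metric without giving its definition. As a hedged illustration only, the sketch below assumes that a prediction earns credit solely when an entire root-to-leaf label path matches a gold path, and it computes micro-averaged F1 over such paths. The function name `path_micro_f1` and the toy labels are hypothetical, and the paper's actual metric may differ.

```python
from typing import List, Tuple

LabelPath = Tuple[str, ...]


def path_micro_f1(
    gold: List[List[LabelPath]], pred: List[List[LabelPath]]
) -> float:
    """Micro F1 where only exactly matching full label paths count as correct."""
    tp = fp = fn = 0
    for gold_paths, pred_paths in zip(gold, pred):
        gold_set, pred_set = set(gold_paths), set(pred_paths)
        tp += len(gold_set & pred_set)  # whole path predicted correctly
        fp += len(pred_set - gold_set)  # predicted path not in gold
        fn += len(gold_set - pred_set)  # gold path that was missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Second document gets the top layer right but the leaf wrong,
# so under this strict reading it earns no credit at all.
toy_gold = [[("News", "Sports")], [("Science", "Physics")]]
toy_pred = [[("News", "Sports")], [("Science", "Biology")]]
strict_f1 = path_micro_f1(toy_gold, toy_pred)
```

A label-level F1 would give the second document partial credit for "Science"; a path-level score does not, which is what makes this reading strict.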
Created on 03 Jul. 2023
