CiteBench: A benchmark for Scientific Citation Text Generation

AI-generated keywords: scientific research

AI-generated Key Points

  • Rapidly evolving landscape of scientific research
  • Growing interest in automating the summarization and synthesis of research papers
  • Introduction of the CiteBench benchmark for standardized evaluation of citation text generation models
  • Evaluation of strong baselines and of their transferability between datasets
  • Common task formulation and evaluation framework provided by CiteBench
  • Advancement of automated literature review and promotion of collaboration within the scientific community

Authors: Martin Funkquist, Ilia Kuznetsov, Yufang Hou, Iryna Gurevych

License: CC BY-SA 4.0

Abstract: The publication rates are skyrocketing across many fields of science, and it is difficult to stay up to date with the latest research. This makes automatically summarizing the latest findings and helping scholars to synthesize related work in a given area an attractive research objective. In this paper we study the problem of citation text generation, where given a set of cited papers and citing context the model should generate a citation text. While citation text generation has been tackled in prior work, existing studies use different datasets and task definitions, which makes it hard to study citation text generation systematically. To address this, we propose CiteBench: a benchmark for citation text generation that unifies the previous datasets and enables standardized evaluation of citation text generation models across task settings and domains. Using the new benchmark, we investigate the performance of multiple strong baselines, test their transferability between the datasets, and deliver new insights into task definition and evaluation to guide the future research in citation text generation. We make CiteBench publicly available at https://github.com/UKPLab/citebench.

Submitted to arXiv on 19 Dec. 2022


Results of the summarizing process for the arXiv paper: 2212.09577v1

In the rapidly evolving landscape of scientific research, keeping up with the latest findings has become increasingly challenging. This has driven growing interest in automating the summarization and synthesis of research papers across fields. One specific area of focus is citation text generation, where a model generates a citation text from a set of cited papers and the context provided by the citing paper.

Previous research has approached citation text generation from various perspectives: extractive or abstractive summarization, a single cited paper or multiple cited papers as input, and a single sentence or a whole paragraph as output. Because these studies use different datasets and task definitions, there has been no common task formulation or evaluation framework, which has hindered direct comparison between models.

In response, the authors introduce CiteBench, a benchmark that unifies four existing datasets and enables standardized evaluation of citation text generation models across domains and task settings. Using the benchmark, the authors measure the performance of several strong baselines, assess their transferability between datasets, and offer insights into task definition and evaluation practices that can guide future research. Overall, the work contributes to automated literature review and lays a foundation for more systematic and comprehensive evaluation of citation text generation. The availability of CiteBench as an open-source resource further promotes collaboration and knowledge sharing within the scientific community.
Created on 02 May 2024
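
The task formulation described in the summary (one or more cited papers plus the citing context as input, a sentence or paragraph as output) could be represented roughly as follows. This is a minimal sketch under stated assumptions, not CiteBench's actual data format or API: the class `CitationExample`, its field names, the `[SEP]` separator, and the unigram-overlap score are all hypothetical stand-ins chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class CitationExample:
    """One task instance in a hypothetical unified format (field names are assumptions)."""
    cited_abstracts: List[str]  # text of one or more cited papers
    citing_context: str         # context taken from the citing paper
    target_text: str            # reference citation text (a sentence or a paragraph)


def build_input(example: CitationExample, sep: str = " [SEP] ") -> str:
    """Flatten context and cited papers into one model input string.

    The separator is an arbitrary choice made for this sketch.
    """
    return sep.join([example.citing_context, *example.cited_abstracts])


def unigram_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between two texts.

    A simple stand-in metric for illustration only; it is not claimed
    to be the metric used by the benchmark.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most as often
    # as it appears in both texts.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under these assumptions, a system would be run on `build_input(example)` and its output scored against `example.target_text`. Using one shared instance format is what would let the same model and the same scoring function be applied across datasets that originally differed in input and output granularity.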

