UPB at SemEval-2021 Task 8: Extracting Semantic Information on Measurements as Multi-Turn Question Answering

AI-generated keywords: MeasEval, SemEval-2021, cascade system, pretrained language model, performance evaluation

AI-generated Key Points

  • Authors' approach to solving all five subtasks of SemEval-2021 Task 8 (MeasEval)
  • Cascade system with individual subsystems for first two subtasks and single subsystem for last three subtasks
  • Steps involved in the approach:
      • Identifying quantities using a pretrained language model with CRF layer
      • Extracting measurement units and modifiers using Bidirectional LSTMs at character level
      • Identifying measured entities, properties, qualifiers, and relations using multi-turn question answering approach with hand-crafted questions specific to each relation type
  • Best performing model achieved an overlapping F1-score of 36.91% on the test set
  • Limitations highlighted regarding unit extraction and sensitivity to identified quantities' quality
  • Discussion on related work in span identification, measurement unit identification, and relation extraction, including models like CRFs, LSTM cells with CRF, BERT+CRF, SpanBERT, and different neural network-based models
  • Paper structured into sections discussing solutions for relation extraction, span identification, and measurement unit identification while outlining approaches taken for each subtask proposed by MeasEval competition
  • Performance evaluation of systems presented along with error analysis
  • Concluding remarks and suggestions for future improvements in this area of research

Authors: Andrei-Marius Avram, George-Eduard Zaharia, Dumitru-Clementin Cercel, Mihai Dascalu

5 pages, 3 figures, SemEval-2021 Workshop, ACL-IJCNLP 2021
License: CC BY 4.0

Abstract: Extracting semantic information on measurements and counts is an important topic in terms of analyzing scientific discourses. The 8th task of SemEval-2021: Counts and Measurements (MeasEval) aimed to boost research in this direction by providing a new dataset on which participants train their models to extract meaningful information on measurements from scientific texts. The competition is composed of five subtasks that build on top of each other: (1) quantity span identification, (2) unit extraction from the identified quantities and their value modifier classification, (3) span identification for measured entities and measured properties, (4) qualifier span identification, and (5) relation extraction between the identified quantities, measured entities, measured properties, and qualifiers. We approached these challenges by first identifying the quantities, extracting their units of measurement, classifying them with corresponding modifiers, and afterwards using them to jointly solve the last three subtasks in a multi-turn question answering manner. Our best performing model obtained an overlapping F1-score of 36.91% on the test set.
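
The multi-turn question answering step described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the question templates, the `run_turns` function, and the `answer_fn` callback (a stand-in for an extractive QA model) do not come from the paper, whose actual hand-crafted questions are not reproduced on this page. The sketch only shows the general idea that each turn asks about one span type and that a later question can reuse an earlier answer.

```python
from typing import Callable, Dict, Optional

# Hypothetical templates, one per span type. The paper's real questions
# are not given in this summary, so these are placeholders.
TEMPLATES = {
    "MeasuredEntity": "What entity has a quantity of {quantity}?",
    "MeasuredProperty": "What property of {entity} has a quantity of {quantity}?",
    "Qualifier": "What qualifies the quantity {quantity}?",
}

# answer_fn(question, context) returns an answer span or None.
AnswerFn = Callable[[str, str], Optional[str]]


def run_turns(context: str, quantity: str, answer_fn: AnswerFn) -> Dict[str, Optional[str]]:
    """Ask one question per span type, feeding earlier answers into later turns."""
    answers: Dict[str, Optional[str]] = {"Quantity": quantity}

    # Turn 1: find the measured entity for the already identified quantity.
    entity = answer_fn(TEMPLATES["MeasuredEntity"].format(quantity=quantity), context)
    answers["MeasuredEntity"] = entity

    # Turn 2: ask for the property only if an entity was found,
    # because the question text depends on the previous answer.
    prop = None
    if entity is not None:
        question = TEMPLATES["MeasuredProperty"].format(entity=entity, quantity=quantity)
        prop = answer_fn(question, context)
    answers["MeasuredProperty"] = prop

    # Turn 3: look for a qualifier attached to the quantity.
    answers["Qualifier"] = answer_fn(TEMPLATES["Qualifier"].format(quantity=quantity), context)
    return answers
```

In practice `answer_fn` would wrap a fine-tuned extractive QA model. Because each answer also implies a link back to the quantity or entity named in the question, span identification and relation extraction can be handled together, which matches the abstract's description of solving the last three subtasks jointly.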

Submitted to arXiv on 09 Apr. 2021

Results of the summarizing process for the arXiv paper: 2104.04549v1

The paper presents the authors' approach to all five subtasks of SemEval-2021 Task 8 (MeasEval), a competition that aims to advance research on extracting semantic information about measurements from scientific texts. The approach is a cascade system with individual subsystems for each problem in the first two subtasks and a single subsystem that jointly solves the last three subtasks.

First, quantities are identified using a pretrained language model with a Conditional Random Fields (CRF) layer. Next, measurement units and modifiers are extracted using Bidirectional LSTMs at the character level. Finally, measured entities, properties, and qualifiers are identified along with their relations using a multi-turn question answering approach with hand-crafted questions specific to each relation type. The best performing model achieved an overlapping F1-score of 36.91% on the test set. The authors also highlight limitations in unit extraction and the system's sensitivity to the quality of the identified quantities.

The paper discusses related work on span identification, measurement unit identification, and relation extraction, covering models such as CRFs, LSTM cells with CRF, BERT+CRF, SpanBERT, and other neural network-based models. It then outlines the approach taken for each MeasEval subtask, presents a performance evaluation together with an error analysis, and closes with concluding remarks and suggestions for future improvements.
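
As a small illustration of the first stage, quantity span identification is commonly framed as token-level sequence labeling, where a tagger such as a language model with a CRF layer emits one tag per token and the tags are then decoded into spans. The sketch below assumes a BIO tagging scheme with a single hypothetical label `QUANT`. The summary does not state which tagging scheme the authors used, so both the scheme and the function name `bio_to_spans` are assumptions.

```python
from typing import List, Tuple


def bio_to_spans(tags: List[str], label: str = "QUANT") -> List[Tuple[int, int]]:
    """Convert BIO tags into (start, end) token spans, with end exclusive."""
    begin, inside = f"B-{label}", f"I-{label}"
    spans: List[Tuple[int, int]] = []
    start = None  # index where the currently open span began, if any

    for i, tag in enumerate(tags):
        if tag == begin:
            # A new span starts; close any span that is still open.
            if start is not None:
                spans.append((start, i))
            start = i
        elif tag == inside:
            # Tolerate an I- tag without a preceding B- by opening a span.
            if start is None:
                start = i
        else:
            # Any other tag (for example "O") closes the open span.
            if start is not None:
                spans.append((start, i))
                start = None

    # Close a span that runs to the end of the sequence.
    if start is not None:
        spans.append((start, len(tags)))
    return spans


tokens = ["The", "mass", "was", "5", "g", "and", "12", "cm"]
tags = ["O", "O", "O", "B-QUANT", "I-QUANT", "O", "B-QUANT", "I-QUANT"]
quantities = [" ".join(tokens[s:e]) for s, e in bio_to_spans(tags)]
# quantities == ["5 g", "12 cm"]
```

Since every later stage in a cascade consumes these spans, a missed or badly bounded quantity propagates downstream, which is consistent with the sensitivity to quantity quality noted in the summary.
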
Created on 19 Nov. 2024
