RDU: A Region-based Approach to Form-style Document Understanding

AI-generated keywords: Region-based Document Understanding, Key Information Extraction, Form-style documents, Optical Character Recognition, Layout-aware BERT

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • Research paper "RDU: A Region-based Approach to Form-style Document Understanding" by Fengbin Zhu, Chao Wang, Wenqiang Lei, Ziyang Liu, and Tat Seng Chua
  • Reformulates Key Information Extraction (KIE) from form-style documents as a region prediction problem in two-dimensional space, given a target field, instead of sequence tagging
  • The proposed model, Region-based Document Understanding (RDU), uses a layout-aware BERT with a soft layout attention masking and bias mechanism to incorporate layout information into the representations
  • A Region Proposal Module inspired by object detection models generates candidate regions; a Region Categorization Module judges whether each proposed region is valid, and a Region Selection Module picks the region with the largest probability
  • Experiments on four types of form-style documents are reported as showing that the method is effective
  • RDU can be trained on different document types together, which the authors describe as especially helpful for low-resource documents

Authors: Fengbin Zhu, Chao Wang, Wenqiang Lei, Ziyang Liu, Tat Seng Chua

Work in progress

Abstract: Key Information Extraction (KIE) aims to extract structured information (e.g. key-value pairs) from form-style documents (e.g. invoices), which is an important step towards intelligent document understanding. Previous approaches generally tackle KIE by sequence tagging, which has difficulty processing non-flattened sequences, especially for table-text mixed documents. These approaches also require pre-defining a fixed set of labels for each document type and suffer from label imbalance. In this work, we assume Optical Character Recognition (OCR) has been applied to input documents, and reformulate the KIE task as a region prediction problem in the two-dimensional (2D) space given a target field. Following this new setup, we develop a new KIE model named Region-based Document Understanding (RDU) that takes as input the text content and corresponding coordinates of a document, and tries to predict the result by localizing a bounding-box-like region. Our RDU first applies a layout-aware BERT equipped with a soft layout attention masking and bias mechanism to incorporate layout information into the representations. Then, a list of candidate regions is generated from the representations via a Region Proposal Module inspired by computer vision models widely applied for object detection. Finally, a Region Categorization Module and a Region Selection Module are adopted to judge whether a proposed region is valid and to select the one with the largest probability from all proposed regions, respectively. Experiments on four types of form-style documents show that our proposed method can achieve impressive results. In addition, our RDU model can be trained with different document types seamlessly, which is especially helpful for low-resource documents.
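
The propose, categorize, select flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the names `Token`, `propose_regions`, `select_region`, and `numeric_fraction` are hypothetical, the proposal step simply enumerates short runs of consecutive OCR tokens, and a hand-written scoring function stands in for the learned modules.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1), as produced by OCR


@dataclass
class Token:
    text: str
    box: Box


def union(a: Box, b: Box) -> Box:
    """Smallest box that covers both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def propose_regions(tokens: List[Token], max_span: int = 3) -> List[Tuple[int, int, Box]]:
    """Assumed proposal step: every run of up to max_span consecutive tokens."""
    proposals = []
    for i in range(len(tokens)):
        box = tokens[i].box
        for j in range(i, min(i + max_span, len(tokens))):
            box = union(box, tokens[j].box)
            proposals.append((i, j, box))
    return proposals


def select_region(
    tokens: List[Token],
    score_fn: Callable[[List[Token], Box], float],
    valid_threshold: float = 0.5,
    max_span: int = 3,
) -> Optional[Tuple[str, Box, float]]:
    """Keep proposals scored as valid, then return the highest-scoring one."""
    best = None
    for i, j, box in propose_regions(tokens, max_span):
        span = tokens[i : j + 1]
        prob = score_fn(span, box)
        if prob < valid_threshold:  # stand-in for region categorization
            continue
        if best is None or prob > best[2]:  # stand-in for region selection
            best = (" ".join(t.text for t in span), box, prob)
    return best


def numeric_fraction(span: List[Token], box: Box) -> float:
    """Toy scorer for an 'amount' field: share of tokens that parse as numbers."""
    def is_number(text: str) -> bool:
        try:
            float(text)
            return True
        except ValueError:
            return False
    return sum(is_number(t.text) for t in span) / len(span)


tokens = [
    Token("Total:", (10, 10, 50, 20)),
    Token("42.00", (60, 10, 100, 20)),
    Token("USD", (110, 10, 140, 20)),
]
print(select_region(tokens, numeric_fraction, max_span=2))
```

In a real system the scorer would be a trained model conditioned on the target field, and the proposals would come from learned representations rather than exhaustive enumeration.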

Submitted to arXiv on 14 Jun. 2022


Results of the summarizing process for the arXiv paper: 2206.06890v1


The research paper "RDU: A Region-based Approach to Form-style Document Understanding" by Fengbin Zhu, Chao Wang, Wenqiang Lei, Ziyang Liu, and Tat Seng Chua introduces a method for extracting key information from form-style documents. The proposed model, Region-based Document Understanding (RDU), uses a layout-aware BERT with a soft layout attention masking and bias mechanism to incorporate layout information into its representations. A Region Proposal Module inspired by computer vision models generates candidate regions, a Region Categorization Module judges whether each proposed region is valid, and a Region Selection Module selects the region with the largest probability. Experiments on four types of form-style documents are reported as showing the effectiveness of RDU.
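
The summary does not specify how the soft layout attention bias is computed. As a rough illustration of the general idea only, the sketch below adds a distance-based penalty to attention logits so that tokens far apart on the page attend to each other less, without being masked out entirely. The functions `layout_bias` and `biased_attention`, the Euclidean-distance penalty, and the `scale` parameter are all assumptions made for this example and are not taken from the paper.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def center(box: Box) -> Tuple[float, float]:
    return ((box[0] + box[2]) / 2, (box[1] + box[3]) / 2)


def layout_bias(boxes: List[Box], scale: float = 100.0) -> List[List[float]]:
    """Assumed bias: negative distance between box centers, divided by scale."""
    centers = [center(b) for b in boxes]
    return [[-math.dist(ci, cj) / scale for cj in centers] for ci in centers]


def softmax(row: List[float]) -> List[float]:
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def biased_attention(
    logits: List[List[float]], boxes: List[Box], scale: float = 100.0
) -> List[List[float]]:
    """Add the layout bias to raw attention logits, then normalize each row."""
    bias = layout_bias(boxes, scale)
    return [
        softmax([l + b for l, b in zip(logit_row, bias_row)])
        for logit_row, bias_row in zip(logits, bias)
    ]


boxes = [(0, 0, 10, 10), (10, 0, 20, 10), (500, 500, 510, 510)]
logits = [[0.0] * 3 for _ in range(3)]  # uniform content scores
weights = biased_attention(logits, boxes)
# With uniform logits, the first token attends more to its neighbor
# than to the distant third token, but the distant weight stays above zero.
print(weights[0])
```

Because the penalty is finite, this acts as a soft mask: distant tokens are down-weighted rather than excluded, which is the property the term "soft" suggests.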
Created on 26 Mar. 2024
