TableFormer: Table Structure Understanding with Transformers

AI-generated keywords: Tables

AI-generated Key Points

  • Tables are crucial for organizing content in a concise manner and enhancing predictive capabilities of systems like search engines and Knowledge Graphs.
  • Identifying the structure of tables from images is challenging due to various shapes, sizes, and complexities.
  • A new table-structure identification model with an object detection decoder for table cells is introduced to improve on existing deep learning models.
  • The model allows accurate extraction of table content directly from programmatic PDFs without custom OCR decoders, enhancing accuracy and enabling non-English table handling.
  • The proposed TableFormer model outperforms existing methods, as measured by the tree-editing-distance-score (TEDS), on both simple and complex tables.
  • Post-processing techniques extract cell content by matching predicted bounding boxes to PDF cells based on overlap and spatial proximity.
  • The "SynthTabNet" dataset addresses characteristics missing from other datasets and should be valuable for future research in document understanding and table extraction.

Authors: Ahmed Nassar, Nikolaos Livathinos, Maksym Lysak, Peter Staar

License: CC BY 4.0

Abstract: Tables organize valuable content in a concise and compact representation. This content is extremely valuable for systems such as search engines, Knowledge Graphs, etc., since it enhances their predictive capabilities. Unfortunately, tables come in a large variety of shapes and sizes. Furthermore, they can have complex column/row-header configurations, multiline rows, a variety of separation lines, missing entries, etc. As such, the correct identification of the table structure from an image is a non-trivial task. In this paper, we present a new table-structure identification model. It improves the latest end-to-end deep learning model (i.e. the encoder-dual-decoder from PubTabNet) in two significant ways. First, we introduce a new object detection decoder for table cells. In this way, we can obtain the content of the table cells of programmatic PDFs directly from the PDF source and avoid training custom OCR decoders. This architectural change leads to more accurate table-content extraction and allows us to tackle non-English tables. Second, we replace the LSTM decoders with transformer-based decoders. This upgrade significantly improves the previous state-of-the-art tree-editing-distance-score (TEDS) from 91% to 98.5% on simple tables and from 88.7% to 95% on complex tables.
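
For readers unfamiliar with the metric, TEDS is commonly defined as in the PubTabNet work that the abstract refers to: both the predicted and the ground-truth table are represented as HTML trees, and the score is one minus their normalized tree edit distance. The symbols below (T_a, T_b, EditDist) follow that usual convention and are not taken from this page:

```latex
\mathrm{TEDS}(T_a, T_b) = 1 - \frac{\mathrm{EditDist}(T_a, T_b)}{\max\left(|T_a|, |T_b|\right)}
```

Here |T| is the number of nodes in tree T, so a score of 1 (100%) means the two trees are identical.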

Submitted to arXiv on 02 Mar. 2022

Results of the summarizing process for the arXiv paper: 2203.01017v2

Tables play a crucial role in organizing valuable content in a concise and compact manner, enhancing the predictive capabilities of systems such as search engines and Knowledge Graphs. However, tables come in various shapes and sizes, with complex configurations like multi-line rows, different separation lines, and missing entries. Identifying the structure of a table from an image is a challenging task.

In this paper, a new table-structure identification model is introduced to improve upon existing deep learning models. The new model incorporates a novel object detection decoder for table cells, allowing for accurate extraction of table content directly from programmatic PDFs without the need for custom OCR decoders. This architectural change enhances table-content extraction accuracy and enables the handling of non-English tables.

Table extraction from documents involves two challenges: locating tables on document pages and determining their structure. While table location has been effectively solved using object-detection networks like YOLO and Mask-RCNN, table-structure decomposition remains a longstanding problem in document understanding. In this study, we propose an innovative solution called TableFormer that overcomes limitations present in current approaches. Our approach is language agnostic and efficiently leverages data from original PDF documents while establishing direct links between table cells and their bounding boxes in images. The proposed model outperforms existing state-of-the-art methods by a wide margin thanks to its use of transformer-based decoders, which yield significant improvements in tree-editing-distance-score (TEDS) on both simple and complex tables. Qualitative analysis showcases the model's ability to predict bounding boxes for all table cells, including empty ones.

Post-processing techniques extract cell content by matching predicted bounding boxes to PDF cells based on overlap and spatial proximity. One of the key contributions of this study is the introduction of a dataset called "SynthTabNet," which addresses characteristics missing from other datasets. This dataset will be valuable for future research in document understanding and table extraction. In conclusion, our research team has developed an end-to-end transformer-based approach for predicting table structures and cell bounding boxes from images.
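
The matching of predicted bounding boxes to PDF cells described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `match_pdf_cells`, the `(x0, y0, x1, y1)` box layout, and the `min_overlap` threshold of 0.5 are all assumptions made for the example. Each PDF text cell is assigned to the predicted box that covers the largest share of its area, and falls back to the nearest box center when no box overlaps it enough.

```python
import math
from typing import List, Optional, Tuple

# Hypothetical box layout for this sketch: (x0, y0, x1, y1) with x0 <= x1, y0 <= y1.
Box = Tuple[float, float, float, float]


def area(box: Box) -> float:
    """Area of a box; degenerate boxes count as zero."""
    return max(box[2] - box[0], 0.0) * max(box[3] - box[1], 0.0)


def intersection_area(a: Box, b: Box) -> float:
    """Area of the rectangle shared by two boxes (zero if disjoint)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)


def center(box: Box) -> Tuple[float, float]:
    """Center point of a box."""
    return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)


def match_pdf_cells(
    predicted_boxes: List[Box],
    pdf_cells: List[Box],
    min_overlap: float = 0.5,  # assumed threshold, not from the paper
) -> List[Optional[int]]:
    """Return, for each PDF cell, the index of the predicted box it belongs to."""
    matches: List[Optional[int]] = []
    for cell in pdf_cells:
        if not predicted_boxes:
            matches.append(None)
            continue
        cell_area = area(cell)
        # Fraction of the PDF cell covered by each predicted box.
        overlaps = [
            intersection_area(cell, box) / cell_area if cell_area > 0 else 0.0
            for box in predicted_boxes
        ]
        best = max(range(len(predicted_boxes)), key=overlaps.__getitem__)
        if overlaps[best] < min_overlap:
            # Spatial-proximity fallback: pick the closest box center.
            cx, cy = center(cell)
            best = min(
                range(len(predicted_boxes)),
                key=lambda i: math.hypot(
                    cx - center(predicted_boxes[i])[0],
                    cy - center(predicted_boxes[i])[1],
                ),
            )
        matches.append(best)
    return matches
```

Once every PDF cell has an index, the text of all cells sharing the same index can be concatenated to form the content of that table cell. A real pipeline would likely also cap the fallback distance so that stray text far from the table is not absorbed.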
Created on 21 Apr. 2025

Disclaimer: The AI-based summarization tool and virtual assistant provided on this website may not always provide accurate and complete summaries or responses. We encourage you to carefully review and evaluate the generated content to ensure its quality and relevance to your needs.