PDFTriage: Question Answering over Long, Structured Documents

AI-generated keywords: Large Language Models, Document Question Answering, PDFTriage, Structured Documents, GPT-3.5

AI-generated Key Points

  • Challenges faced by Large Language Models (LLMs) in document question answering when the document exceeds the model's small context length
  • Existing approaches focus on retrieving relevant context from plain text documents, but struggle with structured documents like PDFs, web pages, and presentations
  • Introduction of PDFTriage, an approach leveraging both structure and content for context retrieval
  • Comparison of PDFTriage with retrieval baselines such as Page Retrieval and Chunk Retrieval in experiments
  • PDFTriage's use of PDF structure and GPT-3.5's interactive functions for more accurate answer extraction
  • User preference for PDFTriage over the baselines on multi-page tasks such as structure questions and table reasoning
  • Human evaluation studies showing that PDFTriage provides high-quality answers compared to retrieval baselines
  • Evaluation of attributes such as question difficulty, clarity, information needed for answering, and overall quality of answers by each system
  • Highlighting the effectiveness of PDFTriage in handling structured documents for QA tasks where existing models fall short

Authors: Jon Saad-Falcon, Joe Barrow, Alexa Siu, Ani Nenkova, Ryan A. Rossi, Franck Dernoncourt

License: CC BY 4.0

Abstract: Large Language Models (LLMs) have issues with document question answering (QA) in situations where the document is unable to fit in the small context length of an LLM. To overcome this issue, most existing works focus on retrieving the relevant context from the document, representing them as plain text. However, documents such as PDFs, web pages, and presentations are naturally structured with different pages, tables, sections, and so on. Representing such structured documents as plain text is incongruous with the user's mental model of these documents with rich structure. When a system has to query the document for context, this incongruity is brought to the fore, and seemingly trivial questions can trip up the QA system. To bridge this fundamental gap in handling structured documents, we propose an approach called PDFTriage that enables models to retrieve the context based on either structure or content. Our experiments demonstrate the effectiveness of the proposed PDFTriage-augmented models across several classes of questions where existing retrieval-augmented LLMs fail. To facilitate further research on this fundamental problem, we release our benchmark dataset consisting of 900+ human-generated questions over 80 structured documents from 10 different categories of question types for document QA.

Submitted to arXiv on 16 Sep. 2023


Results of the summarizing process for the arXiv paper: 2309.08872v1

In this study, we address the difficulty Large Language Models (LLMs) have with document question answering (QA) when a document does not fit in the model's small context length. Existing approaches retrieve relevant context from a plain-text representation of the document, but structured documents such as PDFs, web pages, and presentations are organized into pages, sections, and tables, and that organization is lost when they are flattened to text. To bridge this gap, we introduce PDFTriage, an approach that uses both structure and content for context retrieval.

Our dataset comprises 908 questions across 82 documents, with an average of 4,257 tokens per document. In our experiments, we compare PDFTriage with two retrieval baselines, Page Retrieval and Chunk Retrieval. PDFTriage uses the structure of the PDF together with GPT-3.5's interactive functions to extract answers more accurately than these baselines. User preferences indicate that PDFTriage outperforms the baselines on multi-page tasks such as structure questions and table reasoning.

Human evaluation studies conducted on Upwork with experienced annotators show that PDFTriage provides higher-quality answers than the retrieval baselines. The study evaluates attributes such as question difficulty, clarity, the information needed for answering, and the overall quality of the answers generated by each system. Overall, our research highlights the effectiveness of PDFTriage in handling structured documents for QA tasks where existing models fall short. We provide detailed descriptions of our methodology and results, along with a benchmark dataset for further research in this area.
Created on 14 Nov. 2024
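
The contrast between plain-text retrieval and structure-aware retrieval described in the summary can be illustrated with a small sketch. This is not the authors' implementation: the data model (`Document`, `Section`, `Table`), the helper names (`fetch_pages`, `fetch_section`, `fetch_table`, `dispatch`), and the word-overlap ranking used for the chunk baseline are all assumptions made for illustration. In a real system, a model such as GPT-3.5 would choose which helper to call; here the call request is passed in by hand.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Section:
    title: str
    page: int
    text: str


@dataclass
class Table:
    caption: str
    page: int
    rows: list[list[str]]


@dataclass
class Document:
    # Hypothetical structured representation of a parsed PDF.
    sections: list[Section] = field(default_factory=list)
    tables: list[Table] = field(default_factory=list)


def chunk_retrieval(doc: Document, question: str,
                    chunk_words: int = 50, top_k: int = 1) -> list[str]:
    """Plain-text baseline: flatten the document, cut fixed-size word chunks,
    and rank them by word overlap with the question (a stand-in for the
    embedding similarity a real retriever would use)."""
    words = " ".join(s.text for s in doc.sections).split()
    chunks = [" ".join(words[i:i + chunk_words])
              for i in range(0, len(words), chunk_words)]
    q_words = set(question.lower().split())
    ranked = sorted(chunks,
                    key=lambda c: len(q_words & set(c.lower().split())),
                    reverse=True)
    return ranked[:top_k]


# Structure-aware helpers (names are illustrative assumptions).
def fetch_pages(doc: Document, pages: list[int]) -> list[str]:
    """Return the text of every section that sits on one of the given pages."""
    return [s.text for s in doc.sections if s.page in pages]


def fetch_section(doc: Document, title: str) -> list[str]:
    """Return the text of sections whose title matches, ignoring case."""
    return [s.text for s in doc.sections if s.title.lower() == title.lower()]


def fetch_table(doc: Document, caption_keyword: str) -> list[list[list[str]]]:
    """Return the rows of tables whose caption contains the keyword."""
    return [t.rows for t in doc.tables
            if caption_keyword.lower() in t.caption.lower()]


TOOLS = {
    "fetch_pages": fetch_pages,
    "fetch_section": fetch_section,
    "fetch_table": fetch_table,
}


def dispatch(doc: Document, call: dict):
    """Route a function-call request of the form
    {"name": ..., "arguments": {...}} to the matching helper."""
    return TOOLS[call["name"]](doc, **call["arguments"])
```

With helpers like these, a question such as "what does the table on page 2 show?" can be answered by requesting the table directly, whereas the chunk baseline can only return whichever flattened text span happens to share words with the question.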

