Better Call GPT, Comparing Large Language Models Against Lawyers

AI-generated keywords: Large Language Models, Legal Contract Reviewers, Accuracy, Speed, Cost Efficiency

AI-generated Key Points

Comparison between Large Language Models (LLMs) and traditional legal contract reviewers
Objective: Determine if LLMs can outperform humans in accuracy, speed, and cost efficiency during contract review
Analysis structured around providing context for each contract scenario and using a standardized contract review playbook
Senior Lawyers evaluated contracts for adherence to predefined standards and identified specific influential sections
Ground truth data collected to create benchmarks for accuracy, speed, and cost efficiency
Review durations recorded for Senior Lawyers, Junior Lawyers, Legal Process Outsourcers (LPOs), and LLMs, then compared
Hourly rates for lawyers determined based on industry benchmark reports and market data held by Onit Inc.
Costs for LLMs obtained through commercial pricing provided by service suppliers
Prominent LLMs from OpenAI, Google, Anthropic, Amazon, and Meta considered for analysis
Advanced LLMs matched or exceeded human accuracy in determining legal issues
LLMs completed reviews in seconds compared to hours required by humans
LLMs operated at a fraction of the cost compared to traditional methods
Findings indicate a seismic shift in the legal industry with potential for enhanced accessibility and efficiency of legal services.

Authors: Lauren Martin (Onit AI Centre of Excellence), Nick Whitehouse (Onit AI Centre of Excellence), Stephanie Yiu (Onit AI Centre of Excellence), Lizzie Catterson (Onit AI Centre of Excellence), Rivindu Perera (Onit AI Centre of Excellence)

arXiv: 2401.16212v1 (cs.CY)

16 pages

License: CC BY 4.0

Abstract: This paper presents a groundbreaking comparison between Large Language Models and traditional legal contract reviewers, Junior Lawyers and Legal Process Outsourcers. We dissect whether LLMs can outperform humans in accuracy, speed, and cost efficiency during contract review. Our empirical analysis benchmarks LLMs against a ground truth set by Senior Lawyers, uncovering that advanced models match or exceed human accuracy in determining legal issues. In speed, LLMs complete reviews in mere seconds, eclipsing the hours required by their human counterparts. Cost wise, LLMs operate at a fraction of the price, offering a staggering 99.97 percent reduction in cost over traditional methods. These results are not just statistics, they signal a seismic shift in legal practice. LLMs stand poised to disrupt the legal industry, enhancing accessibility and efficiency of legal services. Our research asserts that the era of LLM dominance in legal contract review is upon us, challenging the status quo and calling for a reimagined future of legal workflows.

Submitted to arXiv on 24 Jan. 2024

Results of the summarizing process for the arXiv paper: 2401.16212v1

Comprehensive Summary

This paper compares Large Language Models (LLMs) with traditional legal contract reviewers, namely Junior Lawyers and Legal Process Outsourcers (LPOs). The objective was to determine whether LLMs can outperform humans in accuracy, speed, and cost efficiency during contract review. The analysis was structured around a two-factor approach: providing context for each contract scenario and using a standardized contract review playbook. This was intended to simulate real-world conditions and ensure practical relevance.

To establish ground truth data, Senior Lawyers evaluated each contract for adherence to predefined standards and identified the specific sections that influenced their judgments. Where a standard was not met because information was missing, Senior Lawyers explicitly recorded this at the end of their assessment. The collected data was aggregated to create benchmarks for accuracy, speed, and cost efficiency. Senior Lawyers also recorded the duration of each contract review so that it could be compared with the time taken by Junior Lawyers, LPOs, and LLMs. Hourly rates for lawyers were determined from industry benchmark reports and market data held by Onit Inc., while costs for LLMs were taken from the commercial pricing published by service suppliers.

In selecting models for analysis, prominent entities in the LLM space such as OpenAI, Google, Anthropic, Amazon, and Meta were considered. Preliminary evaluations were conducted on models developed by these organizations to assess their applicability within the legal domain. These tests focused on reasoning capabilities and on the ability to determine legal issues and locate them within contracts.

The results showed that advanced LLMs matched or exceeded human accuracy in determining legal issues. In terms of speed, LLMs completed reviews in seconds compared with the hours required by humans. LLMs also operated at a fraction of the cost, with the abstract reporting a 99.97 percent reduction in cost over traditional methods. The authors read these findings as a seismic shift in the legal industry, with LLMs poised to disrupt legal practice and improve the accessibility and efficiency of legal services. The research asserts that an era of LLM dominance in contract review has arrived, challenging the status quo and calling for a reimagined future of legal workflows.

Layman's Summary

This study compared large language models (LLMs) to traditional legal contract reviewers. The goal was to see whether LLMs could do a better job than humans in accuracy, speed, and cost efficiency when reviewing contracts. The researchers analyzed different contract scenarios and used a standardized playbook for review. Senior lawyers looked at the contracts and identified important sections. The researchers collected data to compare how long it took senior lawyers, junior lawyers, legal process outsourcers (LPOs), and LLMs to review contracts. The hourly rates for lawyers were determined from industry reports and market data, while the costs for LLMs were obtained from service suppliers. Advanced LLMs were as accurate as, or even better than, humans in finding legal issues. LLMs could complete reviews in seconds instead of the hours that humans needed, and using LLMs was much cheaper than traditional methods. This study suggests that using LLMs can make legal services more accessible and efficient.

Definitions

- Large Language Models (LLMs): Computer programs that can understand and generate human-like text.
- Accuracy: How correct or precise something is.
- Speed: How fast something can be done.
- Cost efficiency: Doing something in a way that saves money.
- Contracts: Agreements between people or companies that outline what they will do.
- Playbook: A set of rules or instructions for doing something.
- Senior Lawyers: Experienced lawyers who have been practicing law for a long time.
- Junior Lawyers: Lawyers who are less experienced.

Blog article

Introduction

In recent years, there has been a growing interest in the use of Large Language Models (LLMs) for various tasks such as language translation, text summarization, and question-answering. However, their potential application in the legal industry has not been extensively explored until now. This research paper presents a groundbreaking comparison between LLMs and traditional legal contract reviewers to determine whether LLMs can outperform humans in terms of accuracy, speed, and cost efficiency during contract review.

The Objective

The objective of this study was to assess the capabilities of LLMs in the context of contract review and compare them with human performance. The analysis was structured around a two-factor approach: providing context for each contract scenario and using a standardized contract review playbook. This aimed to simulate real-world situations and ensure practical relevance.

Establishing Ground Truth Data

To establish ground truth data, Senior Lawyers evaluated each contract for adherence to predefined standards. They also identified specific sections that influenced their judgments. In cases where standards were not met due to missing information, Senior Lawyers explicitly recorded this at the end of their assessment.

Data Aggregation

The collected data was aggregated to create benchmarks for accuracy, speed, and cost efficiency. Senior Lawyers also recorded the duration of each contract review to compare it with the time taken by Junior Lawyers, Legal Process Outsourcers (LPOs), and LLMs.
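One way to picture the accuracy benchmark is as a per-standard comparison between a reviewer's determinations and the Senior Lawyers' ground truth. The sketch below is only an illustration: this summary does not say which metric the authors used, so precision, recall, and F1 are an assumption here, and the standard names and the function `score_against_ground_truth` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scores:
    precision: float
    recall: float
    f1: float


def score_against_ground_truth(
    ground_truth: dict[str, bool], reviewer: dict[str, bool]
) -> Scores:
    """Compare one reviewer's issue flags with the ground-truth flags.

    Keys are hypothetical playbook standard names; True means the
    reviewer flagged an issue for that standard. A standard missing
    from the reviewer's answers counts as "no issue flagged".
    """
    tp = fp = fn = 0
    for standard, has_issue in ground_truth.items():
        flagged = reviewer.get(standard, False)
        if has_issue and flagged:
            tp += 1  # issue correctly identified
        elif not has_issue and flagged:
            fp += 1  # issue reported where none exists
        elif has_issue and not flagged:
            fn += 1  # issue missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return Scores(precision, recall, f1)


# Illustrative data only, not taken from the paper.
senior_lawyers = {
    "liability_cap": True,
    "governing_law": False,
    "termination_notice": True,
    "confidentiality": False,
}
candidate_review = {
    "liability_cap": True,
    "governing_law": True,
    "termination_notice": False,
    "confidentiality": False,
}
print(score_against_ground_truth(senior_lawyers, candidate_review))
```

The same function could be applied to a Junior Lawyer, an LPO, or an LLM, which is what makes a shared ground truth useful as a benchmark.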

Determining Costs

Hourly rates for lawyers were determined based on industry benchmark reports and market data held by Onit Inc., while costs for LLMs were obtained through commercial pricing provided by service suppliers.
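The two cost inputs described above can be combined into a per-review cost. The following is a minimal sketch under stated assumptions: human cost is taken as hourly rate times review time, and LLM cost is assumed to follow per-token pricing, which is common for commercial LLM APIs but is not confirmed by this summary. All function names, rates, and token counts are hypothetical.

```python
def human_review_cost(hourly_rate: float, review_minutes: float) -> float:
    """Cost of one human review: hourly rate times time spent."""
    if hourly_rate < 0 or review_minutes < 0:
        raise ValueError("rate and duration must be non-negative")
    return hourly_rate * review_minutes / 60.0


def llm_review_cost(
    input_tokens: int,
    output_tokens: int,
    price_in_per_1k: float,
    price_out_per_1k: float,
) -> float:
    """Cost of one LLM review under an assumed per-1,000-token price list."""
    if min(input_tokens, output_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if min(price_in_per_1k, price_out_per_1k) < 0:
        raise ValueError("prices must be non-negative")
    return (
        input_tokens / 1000 * price_in_per_1k
        + output_tokens / 1000 * price_out_per_1k
    )


# Illustrative numbers only; none of these come from the paper.
lawyer_cost = human_review_cost(hourly_rate=100.0, review_minutes=90)
model_cost = llm_review_cost(
    input_tokens=10_000,
    output_tokens=1_000,
    price_in_per_1k=0.01,
    price_out_per_1k=0.03,
)
print(f"human: {lawyer_cost:.2f}  llm: {model_cost:.2f}")
```

Keeping the two cost functions separate makes it easy to swap in a different pricing model for either side without touching the other.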

Selecting Models for Analysis

In selecting models for analysis, prominent entities in the LLM space such as OpenAI, Google, Anthropic, Amazon, and Meta were considered. Preliminary evaluations were conducted on models developed by these organizations to assess their applicability within the legal domain. These tests focused on reasoning capabilities and on the ability to determine legal issues and locate them within contracts.

Results

The results of the study showed that advanced LLMs matched or exceeded human accuracy in determining legal issues. In terms of speed, LLMs completed reviews in seconds compared with the hours required by humans. LLMs also operated at a fraction of the cost: the abstract reports a 99.97 percent reduction in cost over traditional methods.
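To put the 99.97 percent cost reduction reported in the abstract into perspective, it can be read as a relative saving. The formula below is an assumption, since this summary does not state how the authors computed the figure, and the symbols C_human and C_LLM (per-review costs) are introduced here only for illustration.

```latex
R = 1 - \frac{C_{\mathrm{LLM}}}{C_{\mathrm{human}}},
\qquad
R = 0.9997 \;\Longrightarrow\; C_{\mathrm{LLM}} = 0.0003 \, C_{\mathrm{human}}
```

Under that reading, every 100 dollars spent on human review would correspond to roughly 3 cents of LLM usage.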

Implications for the Legal Industry

These findings indicate a seismic shift in the legal industry, with LLMs poised to disrupt and enhance accessibility and efficiency of legal services. The research asserts that we are entering an era of LLM dominance in contract review, challenging the status quo and calling for a reimagined future of legal workflows.

Conclusion

This research paper presents compelling evidence that LLMs have the potential to outperform humans in terms of accuracy, speed, and cost efficiency during contract review. With advancements being made in natural language processing technology every day, it is only a matter of time before LLMs become an integral part of contract review processes across industries. This has far-reaching implications for not just the legal industry but also for businesses looking to streamline their operations and reduce costs associated with contract review. As we enter this new era of LLM dominance, it will be crucial for organizations to adapt and embrace these technologies to stay competitive in today's fast-paced business landscape.

Created on 01 Feb. 2024
