GeneGPT: Teaching Large Language Models to Use NCBI Web APIs

AI-generated keywords: GeneGPT

AI-generated Key Points

  • GeneGPT is a novel method for teaching large language models (LLMs) to use the Web APIs of the National Center for Biotechnology Information (NCBI) to answer genomics questions. It relies on in-context learning rather than additional training.
  • Codex is prompted with few-shot URL requests of NCBI API calls for in-context learning, and during inference, decoding is halted upon detecting a call request followed by making the API call with the generated URL.
  • GeneGPT achieves state-of-the-art results on seven of the nine GeneTuring tasks: three of four one-shot tasks and four of five zero-shot tasks.
  • GeneGPT achieves a macro-average score of 0.76, much higher than the New Bing (0.44), BioMedLM (0.08), BioGPT (0.04), GPT-3 (0.16), and ChatGPT (0.12).
  • External tools may offer better support than retrieved web pages when augmenting LLMs for genomics question answering.
  • Future research directions include fine-tuning LLMs using NCBI API calls instead of in-context learning and exploring multi-hop genomics question answering along with chain-of-thought prompting techniques.

Authors: Qiao Jin, Yifan Yang, Qingyu Chen, Zhiyong Lu

Work in progress
License: CC BY 4.0

Abstract: In this paper, we present GeneGPT, a novel method for teaching large language models (LLMs) to use the Web Application Programming Interfaces (APIs) of the National Center for Biotechnology Information (NCBI) and answer genomics questions. Specifically, we prompt Codex (code-davinci-002) to solve the GeneTuring tests with few-shot URL requests of NCBI API calls as demonstrations for in-context learning. During inference, we stop the decoding once a call request is detected and make the API call with the generated URL. We then append the raw execution results returned by NCBI APIs to the generated texts and continue the generation until the answer is found or another API call is detected. Our preliminary results show that GeneGPT achieves state-of-the-art results on three out of four one-shot tasks and four out of five zero-shot tasks in the GeneTuring dataset. Overall, GeneGPT achieves a macro-average score of 0.76, which is much higher than retrieval-augmented LLMs such as the New Bing (0.44), biomedical LLMs such as BioMedLM (0.08) and BioGPT (0.04), as well as other LLMs such as GPT-3 (0.16) and ChatGPT (0.12).
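The inference procedure described in the abstract (generate, stop at a call request, execute the URL, append the raw result, continue) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `[URL]->` call marker, the function names `answer_with_api_calls` and `fetch_url`, and the `max_calls` limit are all hypothetical, and `generate` stands in for any LLM completion function that stops right after emitting a call request or at the end of its answer.

```python
import re
import urllib.request
from typing import Callable

# Assumed call-request format: "[<URL>]->" at the end of the generated text.
# The marker is an illustrative convention, not taken from the summary above.
CALL_PATTERN = re.compile(r"\[(https?://[^\]\s]+)\]->$")


def fetch_url(url: str, timeout: float = 30.0) -> str:
    """Execute an API call and return the raw response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def answer_with_api_calls(
    prompt: str,
    generate: Callable[[str], str],
    call_api: Callable[[str], str] = fetch_url,
    max_calls: int = 5,  # hypothetical safety limit on API calls per question
) -> str:
    """Alternate between LLM generation and API execution.

    `generate(text)` must return the continuation of `text`, stopping either
    right after a call request ("[URL]->") or when the answer is complete.
    """
    text = prompt
    calls = 0
    while True:
        text += generate(text)
        match = CALL_PATTERN.search(text)
        if match is None or calls >= max_calls:
            break  # no pending call request, or the call budget is spent
        # Append the raw execution result so the model can read it next turn.
        text += f"[{call_api(match.group(1))}]\n"
        calls += 1
    return text[len(prompt):]
```

In practice `prompt` would hold the few-shot demonstrations of URL requests followed by the new question, and `generate` would wrap a completion endpoint configured with the call marker as a stop sequence. Passing `call_api` as a parameter keeps the loop testable without network access.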

Submitted to arXiv on 19 Apr. 2023


Results of the summarizing process for the arXiv paper: 2304.09667v1

In their paper titled "GeneGPT: Teaching Large Language Models to Use NCBI Web APIs," authors Qiao Jin, Yifan Yang, Qingyu Chen, and Zhiyong Lu introduce GeneGPT, a novel method for teaching large language models (LLMs) to use the Web Application Programming Interfaces (APIs) of the National Center for Biotechnology Information (NCBI) to answer genomics questions.

The approach prompts Codex (code-davinci-002) with few-shot URL requests of NCBI API calls as demonstrations for in-context learning. During inference, decoding is halted when a call request is detected, and the API call is made with the generated URL. The raw execution results returned by the NCBI APIs are then appended to the generated text, and generation continues until an answer is found or another API call is detected.

Preliminary results show that GeneGPT achieves state-of-the-art performance on seven of the nine tasks in the GeneTuring dataset: three of four one-shot tasks and four of five zero-shot tasks. GeneGPT achieves a macro-average score of 0.76, much higher than retrieval-augmented LLMs such as the New Bing (0.44), biomedical LLMs such as BioMedLM (0.08) and BioGPT (0.04), and general-purpose LLMs such as GPT-3 (0.16) and ChatGPT (0.12).

The study concludes that external tools may offer better support than retrieved web pages when augmenting LLMs for genomics question answering. Future research directions include fine-tuning LLMs on NCBI API calls instead of relying on in-context learning, and exploring multi-hop genomics question answering with chain-of-thought prompting to better address real-world information needs in genomics.

Overall, this work shows that GeneGPT can use NCBI Web APIs effectively to answer genomics questions, and that it outperforms existing LLMs, including the New Bing, on most GeneTuring tasks.
Created on 12 Sep. 2024

