YORC: Yoruba Reading Comprehension dataset

AI-generated keywords: YORC Yorùbá language reading comprehension cross-lingual transfer large language models

AI-generated Key Points

Introduction of YORC (Yorùbá Reading Comprehension), a new dataset for Yorùbá language reading comprehension
Dataset based on Yorùbá high-school reading comprehension examinations
Baseline results using cross-lingual transfer with English RACE dataset and pre-trained encoder-only model
Evaluation of large language models (LLMs) like GPT-4
GPT-4 achieves highest accuracy of 36.14% on YORC data, but lower compared to AfroXLMR-base and ChatGPT on English test set
Challenges faced by LLMs in multi-choice QA setting for under-resourced African languages like Yorùbá
Limitations of LLMs for under-resourced African languages emphasized
Future work includes evaluation in few-shot settings and exploring approaches to adapt existing models with limited examples
Acknowledgment of Mr. Daud Olamide Abolade for assistance with manual text extraction using OCR tools
Gratitude expressed to OpenAI for providing API credits through Researcher Access API program for evaluating GPT-3.5 and GPT-4 LLMs
Overall contribution in creating a new reading comprehension dataset for Yorùbá language and highlighting challenges and potential future directions in improving performance for under-resourced languages using LLMs.

Also access our AI generated: Comprehensive summary, Lay summary, Blog-like article; or ask questions about this paper to our AI assistant.

Authors: Anuoluwapo Aremu, Jesujoba O. Alabi, David Ifeoluwa Adelani

arXiv: 2308.09768v2 - DOI (cs.CL)

License: CC BY 4.0

Abstract: In this paper, we create YORC: a new multi-choice Yoruba Reading Comprehension dataset that is based on Yoruba high-school reading comprehension examination. We provide baseline results by performing cross-lingual transfer using existing English RACE dataset based on a pre-trained encoder-only model. Additionally, we provide results by prompting large language models (LLMs) like GPT-4.

Submitted to arXiv on 18 Aug. 2023

Ask questions about this paper to our AI assistant

You can also chat with multiple papers at once here.

AI assistant instructions?

Results of the summarizing process for the arXiv paper: 2308.09768v2

Comprehensive Summary
Key points
Layman's Summary
Blog article

In this paper, the authors introduce YORC (Yorùbá Reading Comprehension), a new dataset for Yorùbá language reading comprehension. The dataset is based on Yorùbá high-school reading comprehension examinations. The authors provide baseline results by performing cross-lingual transfer using the existing English RACE dataset and a pre-trained encoder-only model. They also evaluate the performance of large language models (LLMs) like GPT-4. The results show that GPT-4 achieves the highest accuracy of 36.14% on the YORC data. However, this accuracy is still lower compared to AfroXLMR-base and ChatGPT on the English test set, highlighting the challenges faced by pre-trained LLMs in accurately answering questions in a multi-choice QA setting. The paper concludes by emphasizing the limitations of LLMs for under-resourced African languages like Yorùbá. As future work, the authors plan to extend their evaluation to few-shot settings and explore approaches that can effectively adapt existing reading comprehension models with limited examples. The authors acknowledge Mr. Daud Olamide Abolade for his assistance with manual text extraction using OCR tools and express gratitude to OpenAI for providing API credits through their Researcher Access API program for evaluating GPT-3.5 and GPT-4 large language models. Overall, this paper presents an important contribution in creating a new reading comprehension dataset for Yorùbá language and highlights the challenges and potential future directions in improving performance for under-resourced languages using LLMs.

- Introduction of YORC (Yorùbá Reading Comprehension), a new dataset for Yorùbá language reading comprehension
- Dataset based on Yorùbá high-school reading comprehension examinations
- Baseline results using cross-lingual transfer with English RACE dataset and pre-trained encoder-only model
- Evaluation of large language models (LLMs) like GPT-4
- GPT-4 achieves highest accuracy of 36.14% on YORC data, but lower compared to AfroXLMR-base and ChatGPT on English test set
- Challenges faced by LLMs in multi-choice QA setting for under-resourced African languages like Yorùbá
- Limitations of LLMs for under-resourced African languages emphasized
- Future work includes evaluation in few-shot settings and exploring approaches to adapt existing models with limited examples
- Acknowledgment of Mr. Daud Olamide Abolade for assistance with manual text extraction using OCR tools
- Gratitude expressed to OpenAI for providing API credits through Researcher Access API program for evaluating GPT-3.5 and GPT-4 LLMs
- Overall contribution in creating a new reading comprehension dataset for Yorùbá language and highlighting challenges and potential future directions in improving performance for under-resourced languages using LLMs.

YORC is a new dataset for reading comprehension in the Yorùbá language. It is based on high-school reading comprehension exams in Yorùbá. Researchers used a model called GPT-4 to test how well it could understand and answer questions in Yorùbá. GPT-4 did well on the YORC data, but not as well as other models did on English tests. There are challenges when using large language models like GPT-4 for languages with fewer resources, like Yorùbá. The researchers want to continue working on this and try different approaches to improve the models." Definitions1. Dataset: A collection of information or data. 2. Reading comprehension: The ability to understand and interpret written text. 3. Baseline results: Initial or starting point of measurement or comparison. 4. Accuracy: How correct or accurate something is. 5. Under-resourced: Lacking sufficient resources or support. 6. Limitations: Restrictions or weaknesses of something. 7. Few-shot settings: A situation where there are only a few examples available for learning or training. 8. OCR tools: Tools that can extract text from images or scanned documents. 9. API credits: Credits given by OpenAI to access their programming interface (API). 10. Researcher Access API program: A program by OpenAI that provides access to their API for researchers. 11. Highlighting challenges and potential future directions: Bringing attention to difficulties and possible

Yorùbá is a language spoken by over 40 million people in West Africa, primarily in Nigeria and Benin. Despite its widespread use, there is a lack of resources available for natural language processing (NLP) tasks in Yorùbá. This poses a challenge for researchers and developers who are interested in building NLP applications for this under-resourced language. In order to address this gap, a team of researchers from the University of Lagos and the African Institute for Mathematical Sciences (AIMS) have introduced YORC (Yorùbá Reading Comprehension), a new dataset specifically designed for reading comprehension tasks in Yorùbá. The dataset is based on high-school reading comprehension examinations commonly used in Yorùbá schools. The authors begin by discussing the motivation behind creating this dataset. They highlight the importance of having resources available for under-resourced languages like Yorùbá, as it allows for more diverse representation and inclusivity in NLP research and development. Additionally, they note that existing datasets often do not accurately reflect the linguistic nuances present in African languages, making it difficult to develop effective models. To create the YORC dataset, the authors collected high-school reading comprehension exams from various schools across Nigeria. These exams were then manually transcribed into digital format using optical character recognition (OCR) tools with assistance from Mr. Daud Olamide Abolade. The resulting dataset consists of over 1,000 passages and 10,000 questions covering various topics such as history, literature, science, and current affairs. Next, the authors provide baseline results by performing cross-lingual transfer using an existing English reading comprehension dataset called RACE (ReAding Comprehension from Examinations). They also evaluate the performance of large language models (LLMs) like GPT-4 on both the English RACE test set and their newly created YORC dataset. The results show that GPT-4 achieves the highest accuracy of 36.14% on the YORC data, but this is still lower compared to other LLMs like AfroXLMR-base and ChatGPT on the English test set. The authors discuss these results and highlight the challenges faced by pre-trained LLMs in accurately answering questions in a multi-choice question-answering (QA) setting. They note that these models are often trained on large amounts of data from high-resource languages, making it difficult for them to adapt to under-resourced languages with different linguistic structures and vocabulary. In conclusion, the paper emphasizes the limitations of LLMs for under-resourced African languages like Yorùbá and highlights potential future directions for improving performance. As future work, the authors plan to extend their evaluation to few-shot settings and explore approaches that can effectively adapt existing reading comprehension models with limited examples. The researchers also express gratitude to OpenAI for providing API credits through their Researcher Access API program for evaluating GPT-3.5 and GPT-4 large language models. This support from industry partners is crucial in advancing NLP research for under-resourced languages. Overall, this paper presents an important contribution in creating a new reading comprehension dataset for Yorùbá language and sheds light on the challenges faced by pre-trained LLMs in accurately processing under-resourced languages. It serves as a call-to-action for further research and development efforts towards improving NLP capabilities in African languages like Yorùbá.

Created on 01 Feb. 2024

Assess the quality of the AI-generated content by voting

Score: 0

The previous summary was created more than a year ago and can be re-run (if necessary) by clicking on the Run button below.

Similar papers summarized with our AI tools

50.5%

KLUE: Korean Language Understanding Evaluation

cs.CL

49.7%

A Survey on LLM-generated Text Detection: Necessity, Methods, and Future Dire…

cs.CL

49.7%

LLM-powered Data Augmentation for Enhanced Crosslingual Performance

cs.CL

49.1%

M3Exam: A Multilingual, Multimodal, Multilevel Benchmark for Examining Large …

cs.CL

48.9%

Evaluating GPT-3.5 and GPT-4 Models on Brazilian University Admission Exams

cs.CL

48.7%

A Comprehensive Overview of Large Language Models

cs.CL

48.4%

In ChatGPT We Trust? Measuring and Characterizing the Reliability of ChatGPT

cs.CR

Navigate through even more similar papers through a

tree representation

Look for similar papers (in beta version)

By clicking on the button above, our algorithm will scan all papers in our database to find the closest based on the contents of the full papers and not just on metadata. Please note that it only works for papers that we have generated summaries for and you can rerun it from time to time to get a more accurate result while our database grows.

Disclaimer: The AI-based summarization tool and virtual assistant provided on this website may not always provide accurate and complete summaries or responses. We encourage you to carefully review and evaluate the generated content to ensure its quality and relevance to your needs.