Large Language Models Reflect Human Citation Patterns with a Heightened Citation Bias
AI-generated Key Points
⚠ The license of the paper does not allow us to build upon its content, so the key points are generated from the paper metadata rather than the full article.
- **Study Focus**:
  - The authors study the characteristics and potential biases of the references that Large Language Models (LLMs) like GPT-4 recommend, and what these mean for scientific knowledge dissemination.
- **Role of Citation Practices**:
  - Citation practices play a crucial role in shaping the structure of scientific knowledge.
  - These practices are often influenced by contemporary norms and biases.
- **Introduction of LLMs**:
  - LLMs like GPT-4 introduce a new dynamic to citation practices.
  - The study looks at references recommended purely from the model's parametric knowledge, not from search or retrieval-augmented generation.
- **Experiment Details**:
  - The dataset contains 166 papers from AAAI, NeurIPS, ICML, and ICLR, all published after GPT-4's knowledge cut-off date, with 3,066 references in total.
  - GPT-4 was tasked with suggesting scholarly references for the anonymized in-text citations within these papers.
- **Findings**:
  - Human and LLM citation patterns are remarkably similar.
  - GPT-4 shows a more pronounced bias toward highly cited papers than humans do, and this bias persists after controlling for publication year, title length, number of authors, and venue.
- **Model Behavior**:
  - The characteristics of GPT-4's existing and non-existent generated references are largely consistent, indicating that the model has internalized citation patterns.
  - Citation graph analysis shows that the references recommended by GPT-4 are embedded in the relevant citation context, suggesting an even deeper conceptual internalization of citation networks.
- **Implications**:
  - LLMs can aid in citation generation, but they may also amplify existing biases and introduce new ones, potentially skewing scientific knowledge dissemination.
- **Recommendations**:
  - The results underscore the need to identify the model's biases.
  - They also underscore the need to develop balanced methods for interacting with LLMs in general.
Authors: Andres Algaba, Carmen Mazijn, Vincent Holst, Floriano Tori, Sylvia Wenmackers, Vincent Ginis
Abstract: Citation practices are crucial in shaping the structure of scientific knowledge, yet they are often influenced by contemporary norms and biases. The emergence of Large Language Models (LLMs) like GPT-4 introduces a new dynamic to these practices. Interestingly, the characteristics and potential biases of references recommended by LLMs that entirely rely on their parametric knowledge, and not on search or retrieval-augmented generation, remain unexplored. Here, we analyze these characteristics in an experiment using a dataset of 166 papers from AAAI, NeurIPS, ICML, and ICLR, published after GPT-4's knowledge cut-off date, encompassing 3,066 references in total. In our experiment, GPT-4 was tasked with suggesting scholarly references for the anonymized in-text citations within these papers. Our findings reveal a remarkable similarity between human and LLM citation patterns, but with a more pronounced high citation bias in GPT-4, which persists even after controlling for publication year, title length, number of authors, and venue. Additionally, we observe a large consistency between the characteristics of GPT-4's existing and non-existent generated references, indicating the model's internalization of citation patterns. By analyzing citation graphs, we show that the references recommended by GPT-4 are embedded in the relevant citation context, suggesting an even deeper conceptual internalization of the citation networks. While LLMs can aid in citation generation, they may also amplify existing biases and introduce new ones, potentially skewing scientific knowledge dissemination. Our results underscore the need for identifying the model's biases and for developing balanced methods to interact with LLMs in general.
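The abstract's comparison of citation counts "after controlling for publication year" can be illustrated with a small sketch. This is not the authors' method or code: the function name `median_gap_by_year`, the stratify-by-year approach, and all numbers below are assumptions made up purely for illustration. It groups references by publication year and compares the median citation count of model-suggested references with that of human-cited references within each year.

```python
from collections import defaultdict
from statistics import median


def median_gap_by_year(human, model):
    """Compare citation counts within each publication year.

    human, model: lists of (year, citation_count) tuples.
    Returns {year: median(model counts) - median(human counts)}
    for the years that appear in both lists.
    """
    def group(pairs):
        by_year = defaultdict(list)
        for year, count in pairs:
            by_year[year].append(count)
        return by_year

    h, m = group(human), group(model)
    shared_years = sorted(h.keys() & m.keys())
    return {y: median(m[y]) - median(h[y]) for y in shared_years}


# Hypothetical toy data, NOT taken from the paper.
human_refs = [(2017, 100), (2017, 300), (2020, 20), (2020, 60)]
model_refs = [(2017, 500), (2017, 900), (2020, 80), (2020, 120), (2021, 5)]

gaps = median_gap_by_year(human_refs, model_refs)
for year, gap in gaps.items():
    print(year, gap)
```

A positive gap in a given year means the model-suggested references in that year are, at the median, more highly cited than the human-cited ones. Stratifying by year is only one simple way to hold a variable fixed; the abstract also mentions title length, number of authors, and venue, which this sketch does not handle.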