Questioning Representational Optimism in Deep Learning: The Fractured Entangled Representation Hypothesis

AI-generated keywords: Artificial Intelligence, Scaling Up, Internal Representations, Neural Networks, Fractured Entangled Representation

AI-generated Key Points

  • Growing interest in scaling up existing systems for improved performance in artificial intelligence
  • Debate on whether better performance always means better internal representations
  • Comparison between neural networks evolved through open-ended search processes and those trained using stochastic gradient descent (SGD)
  • Visualization of hidden neurons' functional behaviors as images to examine internal construction of output behavior
  • Significant differences found in internal representations between networks trained with SGD and evolved networks
  • Networks trained with SGD exhibit disorganization, potentially degrading core model capacities like generalization and creativity
  • Evolved networks tend towards a unified factored representation (UFR), which could be crucial for future representation learning
  • Acknowledgment of contributions from various individuals and research projects, including insights on vector rotation and discussions on Mixture of Experts
  • Highlighting previous studies on elegant representations found in Picbreeder CPPNs and support from funding sources like NSF GRFP Fellowship and Canada CIFAR AI Chairs program
  • Discussion on findings related to neural algorithms in models like GPT-3 and GPT-4, showcasing evidence of FER in certain contexts
  • References to studies on shortcut learning and heuristic-based approaches in model training, emphasizing the importance of understanding and mitigating FER for improved model performance

Authors: Akarsh Kumar, Jeff Clune, Joel Lehman, Kenneth O. Stanley

43 pages, 25 figures
License: CC BY 4.0

Abstract: Much of the excitement in modern AI is driven by the observation that scaling up existing systems leads to better performance. But does better performance necessarily imply better internal representations? While the representational optimist assumes it must, this position paper challenges that view. We compare neural networks evolved through an open-ended search process to networks trained via conventional stochastic gradient descent (SGD) on the simple task of generating a single image. This minimal setup offers a unique advantage: each hidden neuron's full functional behavior can be easily visualized as an image, thus revealing how the network's output behavior is internally constructed neuron by neuron. The result is striking: while both networks produce the same output behavior, their internal representations differ dramatically. The SGD-trained networks exhibit a form of disorganization that we term fractured entangled representation (FER). Interestingly, the evolved networks largely lack FER, even approaching a unified factored representation (UFR). In large models, FER may be degrading core model capacities like generalization, creativity, and (continual) learning. Therefore, understanding and mitigating FER could be critical to the future of representation learning.
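
The visualization idea in the abstract can be illustrated with a minimal sketch: evaluate a small coordinate-to-intensity network at every pixel of a grid and store each hidden neuron's activation as its own image. Everything concrete here is an assumption made for illustration rather than the authors' setup: the inputs (x, y, and distance from the center d), the fixed tanh layers, the layer widths, the random weights, and the helper names `coordinate_grid`, `forward_with_activations`, `neuron_images`, and `random_weights`. The networks in the paper are evolved or SGD-trained, not randomly initialized.

```python
import math
import random

def coordinate_grid(size):
    """Return a list of (x, y, d) inputs, row by row, with x and y in [-1, 1]."""
    axis = [-1.0 + 2.0 * i / (size - 1) for i in range(size)]
    return [(x, y, math.hypot(x, y)) for y in axis for x in axis]

def forward_with_activations(point, weights):
    """Run one input through a tanh network and return every layer's activations."""
    layers = []
    h = list(point)
    for w in weights:  # w[j] holds the incoming weights of neuron j
        h = [math.tanh(sum(wi * hi for wi, hi in zip(row, h))) for row in w]
        layers.append(h)
    return layers

def neuron_images(weights, size):
    """Evaluate the network on the whole grid; return images[layer][neuron][row][col]."""
    grid = coordinate_grid(size)
    per_point = [forward_with_activations(p, weights) for p in grid]
    images = []
    for layer_idx, w in enumerate(weights):
        layer_images = []
        for neuron_idx in range(len(w)):
            flat = [per_point[k][layer_idx][neuron_idx] for k in range(len(grid))]
            layer_images.append([flat[r * size:(r + 1) * size] for r in range(size)])
        images.append(layer_images)
    return images

def random_weights(layer_widths, seed=0):
    """Hypothetical stand-in for an evolved or trained network: random Gaussian weights."""
    rng = random.Random(seed)
    return [[[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
            for n_in, n_out in zip(layer_widths[:-1], layer_widths[1:])]

# Assumed architecture: 3 inputs (x, y, d), two hidden layers of 8 neurons, 1 output.
weights = random_weights([3, 8, 8, 1])
images = neuron_images(weights, size=16)
```

Here `images[0]` and `images[1]` hold one 16 x 16 image per hidden neuron, and `images[2][0]` is the output image. Comparing such per-neuron images between two networks with the same output is the kind of inspection the abstract describes.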

Submitted to arXiv on 16 May. 2025

Results of the summarizing process for the arXiv paper: 2505.11581v1

In the field of artificial intelligence, there is growing interest in the idea that scaling up existing systems leads to improved performance. But does better performance necessarily mean better internal representations? While some believe that it does, this position paper challenges that assumption. The paper compares neural networks evolved through an open-ended search process to networks trained using stochastic gradient descent (SGD) on the task of generating a single image. This setup allows each hidden neuron's functional behavior to be visualized as an image, revealing how the network's output behavior is constructed internally. The results show that while both types of networks produce the same output behavior, their internal representations differ significantly. Networks trained with SGD exhibit a form of disorganization termed fractured entangled representation (FER), which may degrade core model capacities such as generalization and creativity. Evolved networks, on the other hand, tend towards a unified factored representation (UFR), which could be crucial for future representation learning. The paper acknowledges contributions from various individuals and research projects, including insights on vector rotation and discussions on Mixture of Experts.
The paper also highlights previous studies on elegant representations found in Picbreeder CPPNs and mentions support from funding sources like the NSF GRFP Fellowship and the Canada CIFAR AI Chairs program. Additionally, the paper discusses findings related to neural algorithms in models like GPT-3 and GPT-4, showcasing evidence of FER in certain contexts. It also references studies on shortcut learning and heuristic-based approaches in model training, emphasizing the importance of understanding and mitigating FER for improved model performance. In summary, this analysis examines the challenges posed by FER in neural networks and emphasizes the significance of addressing this issue for advancing representation learning in artificial intelligence.
Created on 23 Jul. 2025
