Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary with Width and Depth

AI-generated keywords: Representations, Width, Depth, Block Structure, Error Patterns

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • Scaling models by varying architecture depth and width is a key factor in achieving high performance with deep neural networks
  • There is limited understanding of how depth and width affect the representations these models learn
  • Thao Nguyen, Maithra Raghu, and Simon Kornblith investigate this question in their paper "Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary with Width and Depth"
  • Larger-capacity (wider or deeper) models exhibit a block structure in their hidden representations when model capacity is large relative to the training set size
  • The block structure indicates that the underlying layers preserve and propagate the dominant principal component of their representations
  • Representations outside the block structure are often similar across architectures with varying widths and depths, but the block structure itself is unique to each model
  • Wide and deep models exhibit distinctive error patterns across classes even when overall accuracy is similar
  • The study offers insight into how neural network representations vary with width and depth, which may inform the design of deep learning models

Authors: Thao Nguyen, Maithra Raghu, Simon Kornblith

ICLR 2021

Abstract: A key factor in the success of deep neural networks is the ability to scale models to improve performance by varying the architecture depth and width. This simple property of neural network design has resulted in highly effective architectures for a variety of tasks. Nevertheless, there is limited understanding of effects of depth and width on the learned representations. In this paper, we study this fundamental question. We begin by investigating how varying depth and width affects model hidden representations, finding a characteristic block structure in the hidden representations of larger capacity (wider or deeper) models. We demonstrate that this block structure arises when model capacity is large relative to the size of the training set, and is indicative of the underlying layers preserving and propagating the dominant principal component of their representations. This discovery has important ramifications for features learned by different models, namely, representations outside the block structure are often similar across architectures with varying widths and depths, but the block structure is unique to each model. We analyze the output predictions of different model architectures, finding that even when the overall accuracy is similar, wide and deep models exhibit distinctive error patterns and variations across classes.
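
The abstract's claim that layers inside the block structure preserve and propagate the dominant principal component of their representations can be illustrated on synthetic data. The sketch below is not the authors' code. It assumes linear centered kernel alignment (CKA) as the layer-similarity index, since the metadata above does not name the measure used, and `synthetic_layer` is a hypothetical stand-in for real network activations. When two "layers" share one dominant direction across examples, their similarity is high and the first principal component explains most of the variance; a layer without that direction looks dissimilar.

```python
import numpy as np

def linear_cka(x, y):
    """Linear CKA between activation matrices of shape (n_examples, n_features)."""
    x = x - x.mean(axis=0, keepdims=True)  # center each feature
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))

def top_pc_variance_fraction(x):
    """Fraction of total variance explained by the first principal component."""
    x = x - x.mean(axis=0, keepdims=True)
    singular_values = np.linalg.svd(x, compute_uv=False)
    return float(singular_values[0] ** 2 / np.sum(singular_values ** 2))

def synthetic_layer(rng, shared_direction, width, strength):
    """Hypothetical layer: a rank-one signal shared across layers, plus noise.

    strength=0 gives a pure-noise layer with no dominant component.
    """
    n_examples = shared_direction.shape[0]
    loading = rng.standard_normal(width)
    noise = rng.standard_normal((n_examples, width))
    return strength * np.outer(shared_direction, loading) + noise

rng = np.random.default_rng(0)
shared = rng.standard_normal(500)  # one value per example

block_a = synthetic_layer(rng, shared, width=64, strength=10.0)
block_b = synthetic_layer(rng, shared, width=128, strength=10.0)
outside = synthetic_layer(rng, shared, width=128, strength=0.0)

print("top-PC variance fraction (block layer):", round(top_pc_variance_fraction(block_a), 3))
print("top-PC variance fraction (outside layer):", round(top_pc_variance_fraction(outside), 3))
print("CKA between two block layers:", round(linear_cka(block_a, block_b), 3))
print("CKA between block and outside layer:", round(linear_cka(block_a, outside), 3))
```

Computing `linear_cka` for every pair of layers in a real model would give a similarity heatmap, in which a group of mutually similar layers would appear as a contiguous block. The synthetic data here only mimics that situation and makes no claim about any particular architecture.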

Submitted to arXiv on 29 Oct. 2020

Results of the summarizing process for the arXiv paper: 2010.15327v2

Scaling deep neural networks by varying their architecture depth and width has been a key factor in achieving high performance on a variety of tasks. However, there is limited understanding of how depth and width affect the representations these models learn. In their paper, "Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary with Width and Depth," authors Thao Nguyen, Maithra Raghu, and Simon Kornblith investigate this fundamental question. They find that larger capacity (wider or deeper) models exhibit a characteristic block structure in their hidden representations when model capacity is large relative to the size of the training set. This block structure indicates that the underlying layers preserve and propagate the dominant principal component of their representations. Notably, representations learned outside this block structure are often similar across architectures with varying widths and depths; however, the block structure itself is unique to each model. The authors also analyze output predictions from different model architectures and find that even when overall accuracy is similar, wide and deep models exhibit distinctive error patterns across classes. Overall, the study offers insight into how neural network representations vary with width and depth, which may inform the design of deep learning models.
Created on 30 Apr. 2023
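
The observation that models with similar overall accuracy can differ in their per-class errors is easy to check for any pair of classifiers. The following is a minimal sketch under stated assumptions: the labels and the two prediction arrays are made-up toy data, not results from the paper, and the names `wide_predictions` and `deep_predictions` are purely illustrative.

```python
import numpy as np

def per_class_accuracy(labels, predictions, num_classes):
    """Accuracy computed separately for each class (NaN if a class is absent)."""
    labels = np.asarray(labels)
    predictions = np.asarray(predictions)
    accuracy = np.full(num_classes, np.nan)
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            accuracy[c] = np.mean(predictions[mask] == c)
    return accuracy

# Toy data (hypothetical): 3 classes, 4 examples each.
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2])
# Both hypothetical models make exactly two mistakes, but on different classes.
wide_predictions = np.array([1, 2, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2])
deep_predictions = np.array([0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 2, 2])

wide_overall = np.mean(wide_predictions == labels)
deep_overall = np.mean(deep_predictions == labels)
wide_by_class = per_class_accuracy(labels, wide_predictions, num_classes=3)
deep_by_class = per_class_accuracy(labels, deep_predictions, num_classes=3)

print("overall accuracy:", wide_overall, deep_overall)
print("per-class accuracy (wide):", wide_by_class)
print("per-class accuracy (deep):", deep_by_class)
print("per-class difference (deep - wide):", deep_by_class - wide_by_class)
```

With real models, the same comparison would typically be averaged over several training runs, so that differences between architectures can be separated from run-to-run variation.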