Fantastic Generalization Measures and Where to Find Them

AI-generated keywords: Generalization Deep Networks Complexity Measures Hyperparameters Insights

AI-generated Key Points

⚠The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

- Generalization in deep networks has been a topic of great interest in recent years
- Many complexity measures have been proposed to understand generalization
- Yiding Jiang, Behnam Neyshabur, Hossein Mobahi, Dilip Krishnan and Samy Bengio conducted the first large-scale study of generalization in deep networks
- Over 40 complexity measures were investigated by training more than 10,000 convolutional networks while systematically varying commonly used hyperparameters
- The goal was to uncover potentially causal relationships between each measure and generalization through carefully controlled experiments
- Some measures failed to provide accurate results while others showed promise for further research
- This study provides valuable insights into the generalization of deep networks and highlights the need for larger scale studies that can generalize across different models and settings
- The findings emphasize the importance of carefully selecting appropriate complexity measures when evaluating generalization performance in deep learning models
- Overall, this work contributes significantly to our understanding of how complex neural networks generalize their learned representations beyond their training data

Authors: Yiding Jiang, Behnam Neyshabur, Hossein Mobahi, Dilip Krishnan, Samy Bengio

arXiv: 1912.02178v1 (cs.LG)

License: NONEXCLUSIVE-DISTRIB 1.0

Abstract: Generalization of deep networks has been of great interest in recent years, resulting in a number of theoretically and empirically motivated complexity measures. However, most papers proposing such measures study only a small set of models, leaving open the question of whether the conclusions drawn from those experiments would remain valid in other settings. We present the first large-scale study of generalization in deep networks. We investigate more than 40 complexity measures taken from both theoretical bounds and empirical studies. We train over 10,000 convolutional networks by systematically varying commonly used hyperparameters. Hoping to uncover potentially causal relationships between each measure and generalization, we analyze carefully controlled experiments and show surprising failures of some measures as well as promising measures for further research.

Submitted to arXiv on 04 Dec. 2019

Results of the summarizing process for the arXiv paper: 1912.02178v1

Comprehensive Summary

Generalization in deep networks has attracted great interest in recent years, and many theoretically and empirically motivated complexity measures have been proposed to explain it. Most papers proposing such measures, however, study only a small set of models, so it is unclear whether their conclusions hold in other settings. To address this gap, Yiding Jiang, Behnam Neyshabur, Hossein Mobahi, Dilip Krishnan and Samy Bengio conducted the first large-scale study of generalization in deep networks. The authors investigated more than 40 complexity measures taken from both theoretical bounds and empirical studies by training more than 10,000 convolutional networks while systematically varying commonly used hyperparameters. The goal was to uncover potentially causal relationships between each measure and generalization through carefully controlled experiments. The analysis revealed surprising failures of some measures and identified others as promising for further research. The study provides insight into the generalization of deep networks and highlights the need for large-scale studies whose conclusions hold across different models and settings. The findings also emphasize the importance of carefully selecting complexity measures when evaluating generalization in deep learning models. Overall, this work improves our understanding of how neural networks generalize beyond their training data.

Summary: Some people have been studying how well computers can learn things. They made many different tests to see how good the computers are at learning. They did a big study with lots of tests and found some things that work well and some that don't. This helps us understand how computers learn better.

Definitions:
- Generalization: When a computer can use what it learned from one thing to do something similar.
- Complexity measures: Ways to measure how hard or complicated something is.
- Large-scale study: A very big experiment with lots of tests.
- Hyperparameters: Settings that control how a computer learns.
- Causal relationships: When one thing causes another thing to happen.

Exploring Generalization in Deep Networks: A Large-Scale Study

Overview of the Research

The goal of this research was to uncover potentially causal relationships between each complexity measure and generalization through carefully controlled experiments. To do so, the authors trained multiple convolutional neural network (CNN) architectures on CIFAR-10 with different levels of added noise, varying hyperparameters such as learning rate and batch size. They then evaluated several complexity measures on these models, including Rademacher complexity (RC), Neural Tangent Kernel (NTK) width at initialization (W0), NTK width at convergence (W∞), the Lipschitz constant (Lc), and the spectral norm (SN), each taken from either theoretical bounds or empirical studies.
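As a rough illustration of one measure named above, the sketch below estimates the spectral norm of each layer's weight matrix by power iteration and multiplies the results across layers. The product-of-norms form, the function names, and the toy weights are assumptions made for illustration; the summary does not state how SN was actually computed.

```python
import math
import random


def spectral_norm(matrix, iterations=200, seed=0):
    """Estimate the largest singular value of `matrix` (a list of rows)
    by power iteration on M^T M."""
    rng = random.Random(seed)
    n_rows, n_cols = len(matrix), len(matrix[0])
    # Random start vector; it almost surely overlaps the top singular vector.
    v = [rng.gauss(0.0, 1.0) for _ in range(n_cols)]
    for _ in range(iterations):
        # u = M v
        u = [sum(row[j] * v[j] for j in range(n_cols)) for row in matrix]
        # w = M^T u
        w = [sum(matrix[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        norm_w = math.sqrt(sum(x * x for x in w))
        if norm_w == 0.0:
            return 0.0  # zero matrix (or start vector in the null space)
        v = [x / norm_w for x in w]
    # For a unit vector v, the estimate is ||M v||.
    u = [sum(row[j] * v[j] for j in range(n_cols)) for row in matrix]
    return math.sqrt(sum(x * x for x in u))


def product_of_spectral_norms(layers):
    """Hypothetical complexity measure: product of per-layer spectral norms."""
    product = 1.0
    for weights in layers:
        product *= spectral_norm(weights)
    return product


# Toy two-layer "network" with made-up weights (not taken from the paper).
toy_layers = [
    [[3.0, 0.0], [0.0, 1.0]],
    [[2.0, 0.0], [0.0, 2.0]],
]
print(product_of_spectral_norms(toy_layers))
```

For convolutional layers, the kernel would first have to be reshaped into a matrix; how to do that is a further modeling choice that this sketch leaves out.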

Results & Findings

Surprisingly, some complexity measures failed to provide accurate results while others showed promise for further research. For example, RC gave inaccurate estimates compared with W0 and W∞, whereas NTK width at initialization correlated strongly with test accuracy across all noise levels tested, indicating its potential usefulness for predicting performance beyond the training data. Lc correlated well with test accuracy only under certain conditions, such as low learning rates or large batch sizes, suggesting that it may be useful in specific cases but not across all scenarios studied. Finally, SN also correlated well with test accuracy, but only for shallow CNN architectures rather than deeper ones, indicating limited utility for complex models like those used here.
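One common way to quantify how well a measure tracks test accuracy across a set of trained models is a rank correlation. The sketch below uses Kendall's tau as an assumed choice; the summary does not say which statistic was used, and the numbers are hypothetical placeholders, not results from the paper.

```python
def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant pairs - discordant pairs) / total pairs.
    Tied pairs count as neither concordant nor discordant."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if product > 0:
                score += 1
            elif product < 0:
                score -= 1
    return score / (n * (n - 1) / 2)


# Hypothetical values for five trained models (illustrative only).
measure_values = [0.8, 1.5, 2.1, 3.0, 3.4]
test_accuracy = [0.71, 0.74, 0.73, 0.80, 0.83]

tau = kendall_tau(measure_values, test_accuracy)
print(f"Kendall tau = {tau:.2f}")  # 9 concordant, 1 discordant pair -> 0.80
```

A value near 1 means the measure orders the models almost the same way test accuracy does, a value near -1 means it orders them in reverse, and a value near 0 means the measure carries little ranking information.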

Conclusion

Overall, this work contributes to our understanding of how neural networks generalize beyond their training data by providing insight into the relationship between various complexity measures and model performance on unseen data. The findings also emphasize the importance of carefully selecting complexity measures when evaluating generalization in deep learning models: some may fail to predict model behavior accurately, while others may be better suited to specific circumstances, such as a given architecture depth or the hyperparameter values used during training.

Created on 17 Apr. 2023

Disclaimer: The AI-based summarization tool and virtual assistant provided on this website may not always provide accurate and complete summaries or responses. We encourage you to carefully review and evaluate the generated content to ensure its quality and relevance to your needs.