Interpolating between Optimal Transport and MMD using Sinkhorn Divergences

AI-generated keywords: Data Sciences, Probability Distributions, Geometric Divergences, Maximum Mean Discrepancies (MMD), Optimal Transport Distances

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • Traditional norms and divergences such as total variation and relative entropy compare densities only point-wise, so they fail to capture the geometric nature of the problem.
  • Maximum Mean Discrepancies (MMD) and Optimal Transport distances (OT) offer more robust approaches by considering the geometry of the space and metrizing convergence in law.
  • Sinkhorn divergences bridge the gap between MMD and OT, representing a family of geometric divergences that provide theoretical guarantees such as positivity, convexity, and metrization of convergence in law.
  • The theoretical guarantees for Sinkhorn divergences rely on a new notion of geometric entropy introduced in the paper.
  • A numerical scheme outlined in the paper enables large-scale application of Sinkhorn divergences in machine learning tasks, with efficient computation on GPU platforms for batches containing up to a million samples.
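
For readers who want the object behind these key points in symbols, the block below writes out the usual definition of a Sinkhorn divergence from the entropic optimal transport literature. The notation (the cost C, the regularization strength ε, the coupling set Π(α, β)) is assumed here for illustration and is not quoted from the paper, since this page is built from the paper metadata only.

```latex
% Entropic optimal transport between probability measures \alpha and \beta,
% for a ground cost C and a regularization strength \varepsilon > 0:
\mathrm{OT}_\varepsilon(\alpha, \beta)
  = \min_{\pi \in \Pi(\alpha, \beta)}
    \int C \,\mathrm{d}\pi
    + \varepsilon \,\mathrm{KL}(\pi \,|\, \alpha \otimes \beta)

% Sinkhorn divergence: the entropic cost minus its two self-comparison terms.
\mathrm{S}_\varepsilon(\alpha, \beta)
  = \mathrm{OT}_\varepsilon(\alpha, \beta)
  - \tfrac{1}{2}\,\mathrm{OT}_\varepsilon(\alpha, \alpha)
  - \tfrac{1}{2}\,\mathrm{OT}_\varepsilon(\beta, \beta)

% The two ends of the interpolation:
\mathrm{S}_\varepsilon(\alpha, \beta)
  \xrightarrow{\;\varepsilon \to 0\;} \mathrm{OT}_0(\alpha, \beta),
\qquad
\mathrm{S}_\varepsilon(\alpha, \beta)
  \xrightarrow{\;\varepsilon \to +\infty\;}
  \tfrac{1}{2}\,\lVert \alpha - \beta \rVert^2_{-C}
```

Here the last norm denotes the MMD with kernel k = -C. The two self-comparison terms are what make S_ε vanish when α = β, which the entropic cost OT_ε alone does not do.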

Authors: Jean Feydy, Thibault Séjourné, François-Xavier Vialard, Shun-ichi Amari, Alain Trouvé, Gabriel Peyré

15 pages, 5 figures

Abstract: Comparing probability distributions is a fundamental problem in data sciences. Simple norms and divergences such as the total variation and the relative entropy only compare densities in a point-wise manner and fail to capture the geometric nature of the problem. In sharp contrast, Maximum Mean Discrepancies (MMD) and Optimal Transport distances (OT) are two classes of distances between measures that take into account the geometry of the underlying space and metrize the convergence in law. This paper studies the Sinkhorn divergences, a family of geometric divergences that interpolates between MMD and OT. Relying on a new notion of geometric entropy, we provide theoretical guarantees for these divergences: positivity, convexity and metrization of the convergence in law. On the practical side, we detail a numerical scheme that enables the large scale application of these divergences for machine learning: on the GPU, gradients of the Sinkhorn loss can be computed for batches of a million samples.
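
The interpolation described in the abstract can be tried on small point clouds. The following is a minimal NumPy sketch, assuming discrete measures with weights `a`, `b` on points `x`, `y`, a Euclidean distance cost, and the usual definition S_ε(α, β) = OT_ε(α, β) − ½ OT_ε(α, α) − ½ OT_ε(β, β). All function names are hypothetical, and this dense version is not the paper's GPU scheme: it stores full cost matrices, so it only suits small samples.

```python
import numpy as np

def logsumexp(z, axis):
    """Numerically stable log(sum(exp(z))) along one axis."""
    m = np.max(z, axis=axis, keepdims=True)
    out = m + np.log(np.sum(np.exp(z - m), axis=axis, keepdims=True))
    return np.squeeze(out, axis=axis)

def distance_cost(x, y):
    """Cost matrix C[i, j] = ||x_i - y_j|| (Euclidean cost: an assumed choice)."""
    diff = x[:, None, :] - y[None, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1))

def ot_eps(a, x, b, y, eps, n_iter=500):
    """Entropic OT value between sum_i a_i delta_{x_i} and sum_j b_j delta_{y_j}."""
    C = distance_cost(x, y)
    log_a, log_b = np.log(a), np.log(b)
    f, g = np.zeros(len(a)), np.zeros(len(b))  # dual potentials
    for _ in range(n_iter):
        # Log-domain Sinkhorn updates (soft-min of the cost, row- then column-wise).
        f = -eps * logsumexp(log_b[None, :] + (g[None, :] - C) / eps, axis=1)
        g = -eps * logsumexp(log_a[:, None] + (f[:, None] - C) / eps, axis=0)
    # Dual objective; after the g-update the transport plan has total mass 1,
    # so the exponential penalty term of the dual vanishes.
    return a @ f + b @ g

def sinkhorn_divergence(a, x, b, y, eps, n_iter=500):
    """S_eps = OT_eps(a, b) - 0.5 * OT_eps(a, a) - 0.5 * OT_eps(b, b)."""
    return (ot_eps(a, x, b, y, eps, n_iter)
            - 0.5 * ot_eps(a, x, a, x, eps, n_iter)
            - 0.5 * ot_eps(b, y, b, y, eps, n_iter))

def half_mmd_sq(a, x, b, y):
    """Large-eps limit of S_eps: 0.5 * MMD^2 with kernel k = -C."""
    return (a @ distance_cost(x, y) @ b
            - 0.5 * a @ distance_cost(x, x) @ a
            - 0.5 * b @ distance_cost(y, y) @ b)
```

A possible use: draw two small samples, give each point a uniform weight, and compare `sinkhorn_divergence(a, x, b, y, eps)` for a small and a very large `eps`. With a large `eps` the value should approach `half_mmd_sq(a, x, b, y)`, and comparing a sample with itself should return zero.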

Submitted to arXiv on 18 Oct. 2018

Results of the summarizing process for the arXiv paper: 1810.08278v1

Comparing probability distributions is a fundamental problem in data sciences. Simple norms and divergences such as the total variation and the relative entropy compare densities only point-wise and therefore miss the geometric nature of the problem. Maximum Mean Discrepancies (MMD) and Optimal Transport distances (OT), by contrast, take the geometry of the underlying space into account and metrize the convergence in law. This paper studies Sinkhorn divergences, a family of geometric divergences that interpolates between MMD and OT. Relying on a new notion of geometric entropy, the authors provide three theoretical guarantees for these divergences: positivity, convexity, and metrization of the convergence in law. On the practical side, the paper details a numerical scheme that enables the large-scale application of Sinkhorn divergences in machine learning: on the GPU, gradients of the Sinkhorn loss can be computed for batches of a million samples. The paper is authored by Jean Feydy, Thibault Séjourné, François-Xavier Vialard, Shun-ichi Amari, Alain Trouvé, and Gabriel Peyré. Its results are relevant to machine learning applications that need a loss measuring the discrepancy between distributions.
Created on 24 Jun. 2024
