DiSMEC - Distributed Sparse Machines for Extreme Multi-label Classification

AI-generated keywords: DiSMEC, extreme multi-label classification, power-law distribution, capacity control, prediction accuracy

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • DiSMEC is a large-scale distributed framework for extreme multi-label classification, i.e. supervised multi-label learning with hundreds of thousands or even millions of labels.
  • Datasets in this setting follow a power-law distribution: a large fraction of labels have very few positive instances.
  • Most state-of-the-art approaches embed the label matrix in a low-dimensional linear subspace to capture correlation among labels, but this low-rank assumption can easily be violated when the label space is extremely large, diverse, and power-law distributed.
  • Unlike these methods, DiSMEC makes no low-rank assumption on the label matrix; it learns one-versus-rest linear classifiers coupled with explicit capacity control to limit model size.
  • Using two layers of parallelization, DiSMEC can learn classifiers for datasets with hundreds of thousands of labels within a few hours.
  • The explicit capacity control mechanism filters out spurious parameters, which keeps the model compact without losing prediction accuracy.
  • In an empirical evaluation on publicly available real-world datasets with up to 670,000 labels, DiSMEC improved prediction accuracy on some datasets by 10% over SLEEC and 15% over FastXML, in absolute terms.
  • Overall, DiSMEC is presented as a promising approach to extreme multi-label classification that avoids low-rank assumptions and controls model size explicitly while maintaining high prediction accuracy.
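
The combination of one-versus-rest linear classifiers and capacity control described in the key points can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the squared hinge loss, the plain gradient descent, the pruning threshold `delta`, and all function names are assumptions introduced here, since this page is based on the paper metadata only.

```python
# Sketch only: one-vs-rest linear classifiers with weight pruning.
# Loss, optimizer, hyperparameters, and the threshold `delta` are
# illustrative assumptions, not values taken from the paper.

def train_binary(X, y, dim, epochs=200, lr=0.1, reg=0.01):
    """Gradient descent on an L2-regularized squared hinge loss; y in {+1, -1}."""
    w = [0.0] * dim
    for _ in range(epochs):
        grad = [reg * wj for wj in w]
        for x, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, x))
            if margin < 1.0:
                # Gradient of (1 - margin)^2 w.r.t. w is -2 * (1 - margin) * y * x.
                coeff = -2.0 * (1.0 - margin) * yi / len(X)
                for j, xj in enumerate(x):
                    grad[j] += coeff * xj
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

def prune(w, delta):
    """Capacity control (assumed form): drop weights with |w_j| < delta, store the rest sparsely."""
    return {j: wj for j, wj in enumerate(w) if abs(wj) >= delta}

def train_one_vs_rest(X, label_sets, num_labels, dim, delta=0.05):
    """Train one independent binary classifier per label, then prune it."""
    models = {}
    for label in range(num_labels):
        y = [1 if label in labels else -1 for labels in label_sets]
        models[label] = prune(train_binary(X, y, dim), delta)
    return models

def score(model, x):
    """Dot product between a sparse model and a dense feature vector."""
    return sum(wj * x[j] for j, wj in model.items())

# Toy data: feature 0 signals label 0, feature 1 signals label 1,
# and feature 2 is a constant that carries no information.
X = [[1.0, 0.0, 0.1], [0.9, 0.1, 0.1], [0.0, 1.0, 0.1], [0.1, 0.9, 0.1]]
label_sets = [{0}, {0}, {1}, {1}]
models = train_one_vs_rest(X, label_sets, num_labels=2, dim=3)
```

In this toy setup the weight on the uninformative feature stays near zero and is removed by `prune`, so each stored model shrinks while the sign of its score on the training points is unchanged. With hundreds of thousands of labels, the same pruning step would be applied to every per-label weight vector.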

Authors: Rohit Babbar, Bernhard Schölkopf

Abstract: Extreme multi-label classification refers to supervised multi-label learning involving hundreds of thousands or even millions of labels. Datasets in extreme classification follow a power-law distribution, i.e. a large fraction of labels have very few positive instances in the data distribution. Most state-of-the-art approaches for extreme multi-label classification attempt to capture correlation among labels by embedding the label matrix in a low-dimensional linear subspace. However, in the presence of power-law distributed, extremely large and diverse label spaces, structural assumptions such as low rank can be easily violated. In this work, we present DiSMEC, a large-scale distributed framework for learning one-versus-rest linear classifiers coupled with explicit capacity control to control model size. Unlike most state-of-the-art methods, DiSMEC does not make any low-rank assumptions on the label matrix. Using a double layer of parallelization, DiSMEC can learn classifiers for datasets consisting of hundreds of thousands of labels within a few hours. The explicit capacity control mechanism filters out spurious parameters, which keeps the model compact in size without losing prediction accuracy. We conduct extensive empirical evaluation on publicly available real-world datasets consisting of up to 670,000 labels. We compare DiSMEC with recent state-of-the-art approaches, including SLEEC, which is a leading approach for learning sparse local embeddings, and FastXML, which is a tree-based approach optimizing a ranking-based loss function. On some of the datasets, DiSMEC can significantly boost prediction accuracies: 10% better compared to SLEEC and 15% better compared to FastXML, in absolute terms.
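
The abstract does not spell out what the two layers of parallelization are. Because one-versus-rest classifiers for different labels share no parameters, they can be trained independently, so one plausible reading (an assumption here, not a detail confirmed by this page) is that labels are split into batches handled by separate machines, with several workers per machine. The sketch below imitates both layers with thread pools on a single machine; `train_label` is a hypothetical placeholder for fitting one binary classifier.

```python
from concurrent.futures import ThreadPoolExecutor

def make_batches(num_labels, batch_size):
    """Split label ids 0..num_labels-1 into consecutive batches."""
    return [range(start, min(start + batch_size, num_labels))
            for start in range(0, num_labels, batch_size)]

def train_label(label):
    """Hypothetical placeholder for training one one-vs-rest classifier."""
    return label, f"model-{label}"

def train_batch(labels, workers_per_batch):
    """Inner layer: labels within one batch are trained concurrently."""
    with ThreadPoolExecutor(max_workers=workers_per_batch) as pool:
        return dict(pool.map(train_label, labels))

def train_all(num_labels, batch_size, batch_workers=2, workers_per_batch=4):
    """Outer layer: batches are independent; threads stand in for machines here."""
    models = {}
    batches = make_batches(num_labels, batch_size)
    with ThreadPoolExecutor(max_workers=batch_workers) as pool:
        results = pool.map(lambda b: train_batch(b, workers_per_batch), batches)
        for part in results:
            models.update(part)
    return models

models = train_all(num_labels=10, batch_size=4)
```

Threads keep the sketch short and runnable; for CPU-bound training in Python, processes or separate cluster jobs would be the more realistic choice for either layer.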

Submitted to arXiv on 08 Sep. 2016

Results of the summarizing process for the arXiv paper: 1609.02521v1

This paper's license doesn't allow us to build upon its content and the summarizing process is here made with the paper's metadata rather than the article.

DiSMEC is a large-scale distributed framework for extreme multi-label classification, which involves supervised learning with hundreds of thousands or even millions of labels. Datasets in this setting follow a power-law distribution, in which a large fraction of labels have very few positive instances. Most state-of-the-art approaches attempt to capture correlation among labels by embedding the label matrix in a low-dimensional linear subspace. However, this low-rank assumption can easily be violated when the label space is extremely large, diverse, and power-law distributed. Unlike most state-of-the-art methods, DiSMEC does not make any low-rank assumptions on the label matrix and instead learns one-versus-rest linear classifiers coupled with explicit capacity control to limit model size. Using two layers of parallelization, DiSMEC can learn classifiers for datasets with hundreds of thousands of labels within a few hours. The explicit capacity control mechanism filters out spurious parameters, which keeps the model compact without losing prediction accuracy. The authors conducted an extensive empirical evaluation on publicly available real-world datasets with up to 670,000 labels and compared DiSMEC with recent state-of-the-art approaches such as SLEEC and FastXML. On some of the datasets, DiSMEC significantly boosted prediction accuracy: 10% better than SLEEC and 15% better than FastXML, in absolute terms. Overall, DiSMEC is presented as a promising approach to extreme multi-label classification that does not rely on low-rank assumptions and provides explicit capacity control while maintaining high prediction accuracy.
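
To make the power-law claim concrete, the following sketch builds a synthetic label-frequency profile and measures how many labels fall in the tail. The Zipf-like form, the exponent, and the counts are assumptions chosen for illustration; they are not statistics from the datasets used in the paper.

```python
def zipf_label_counts(num_labels, top_count, exponent=1.0):
    """Synthetic profile: the label at rank r has about top_count / r**exponent positives."""
    return [max(1, round(top_count / rank ** exponent))
            for rank in range(1, num_labels + 1)]

def tail_fraction(counts, max_positives):
    """Fraction of labels with at most `max_positives` positive instances."""
    return sum(1 for c in counts if c <= max_positives) / len(counts)

# 1,000 hypothetical labels; the most frequent one has 500 positive instances.
counts = zipf_label_counts(num_labels=1000, top_count=500)
frac = tail_fraction(counts, max_positives=5)
print(f"{frac:.0%} of labels have at most 5 positive instances")
```

Under these assumed numbers, roughly nine in ten labels have five or fewer positive instances, which is the kind of long tail the summary refers to.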
Created on 26 Jun. 2023
