The Relational Data Borg is Learning

AI-generated keywords: Relational Data, Machine Learning, Performance, Techniques, Algebraic

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • The paper treats machine learning over relational data as a database problem
  • Justification for this approach is based on feature extraction queries and computing group-by aggregates
  • The approach has been applied to various supervised and unsupervised learning tasks
  • Techniques leveraging knowledge about the underlying data can significantly enhance runtime performance of machine learning
  • The paper explores theoretical developments related to the algebraic, combinatorial, and statistical structure of relational data processing
  • Systems development involving code specialization, low-level computation sharing, and parallelization are explored to reduce complexity and constant factors in learning time
  • The work is the outcome of extensive collaboration between the author and colleagues from RelationalAI and the FDB research project
  • The author acknowledges support from industry: Amazon Web Services, Google, Infor, LogicBlox, Microsoft Azure, and RelationalAI
  • Funding came from EPSRC, ERC, and the European Union's Horizon 2020 research and innovation programme

Authors: Dan Olteanu

14 pages, 11 figures, VLDB 2020 keynote

Abstract: This paper overviews an approach that addresses machine learning over relational data as a database problem. This is justified by two observations. First, the input to the learning task is commonly the result of a feature extraction query over the relational data. Second, the learning task requires the computation of group-by aggregates. This approach has been already investigated for a number of supervised and unsupervised learning tasks, including: ridge linear regression, factorisation machines, support vector machines, decision trees, principal component analysis, and k-means; and also for linear algebra over data matrices. The main message of this work is that the runtime performance of machine learning can be dramatically boosted by a toolbox of techniques that exploit the knowledge of the underlying data. This includes theoretical development on the algebraic, combinatorial, and statistical structure of relational data processing and systems development on code specialisation, low-level computation sharing, and parallelisation. These techniques aim at lowering both the complexity and the constant factors of the learning time. This work is the outcome of extensive collaboration of the author with colleagues from RelationalAI, in particular Mahmoud Abo Khamis, Molham Aref, Hung Ngo, and XuanLong Nguyen, and from the FDB research project, in particular Ahmet Kara, Milos Nikolic, Maximilian Schleich, Amir Shaikhha, Jakub Zavodny, and Haozhe Zhang. The author would also like to thank the members of the FDB project for the figures and examples used in this paper. The author is grateful for support from industry: Amazon Web Services, Google, Infor, LogicBlox, Microsoft Azure, RelationalAI; and from the funding agencies EPSRC and ERC. This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 682588.
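The two observations in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the relations `sales` and `stores`, their columns, and all function names are hypothetical, and it fits plain one-feature least squares rather than the ridge regression mentioned above. Under those assumptions, it shows that the sums a linear model needs over a feature extraction query (here a join on `store`) can be obtained from per-key group-by aggregates of each relation, without materialising the join.

```python
from collections import defaultdict

# Hypothetical toy relations, invented for illustration (not from the paper).
sales = [("s1", 10.0), ("s1", 12.0), ("s2", 7.0), ("s2", 9.0), ("s2", 8.0)]  # (store, units)
stores = [("s1", 50.0), ("s2", 30.0)]                                        # (store, area)

KEYS = ("count", "sum_x", "sum_y", "sum_xx", "sum_xy")


def aggregates_via_join(sales, stores):
    """Baseline: materialise the join, then aggregate over its rows.

    x is the feature (area) and y is the label (units).
    """
    joined = [(area, units)
              for (s, units) in sales
              for (t, area) in stores
              if s == t]
    return {
        "count": float(len(joined)),
        "sum_x": sum(x for x, _ in joined),
        "sum_y": sum(y for _, y in joined),
        "sum_xx": sum(x * x for x, _ in joined),
        "sum_xy": sum(x * y for x, y in joined),
    }


def aggregates_pushed_down(sales, stores):
    """Same aggregates, computed from group-by aggregates of each relation."""
    per_sales = defaultdict(lambda: [0.0, 0.0])        # store -> [count, sum_units]
    for s, units in sales:
        per_sales[s][0] += 1.0
        per_sales[s][1] += units

    per_stores = defaultdict(lambda: [0.0, 0.0, 0.0])  # store -> [count, sum_area, sum_area_sq]
    for s, area in stores:
        per_stores[s][0] += 1.0
        per_stores[s][1] += area
        per_stores[s][2] += area * area

    out = dict.fromkeys(KEYS, 0.0)
    # Combine the partial aggregates for each join key present in both relations.
    for s in per_sales.keys() & per_stores.keys():
        c_sales, y = per_sales[s]
        c_stores, x, xx = per_stores[s]
        out["count"] += c_sales * c_stores
        out["sum_x"] += x * c_sales
        out["sum_y"] += y * c_stores
        out["sum_xx"] += xx * c_sales
        out["sum_xy"] += x * y
    return out


def fit_line(agg):
    """Closed-form least squares for y = slope * x + intercept from the aggregates."""
    n = agg["count"]
    denom = n * agg["sum_xx"] - agg["sum_x"] ** 2
    slope = (n * agg["sum_xy"] - agg["sum_x"] * agg["sum_y"]) / denom
    intercept = (agg["sum_y"] - slope * agg["sum_x"]) / n
    return slope, intercept


if __name__ == "__main__":
    print(aggregates_via_join(sales, stores))
    print(aggregates_pushed_down(sales, stores))
    print(fit_line(aggregates_pushed_down(sales, stores)))
```

Both functions return the same aggregates on this toy input; the second touches each relation once and never builds the joined rows, which is the kind of saving the abstract attributes to exploiting knowledge of the underlying data.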

Submitted to arXiv on 18 Aug. 2020

Results of the summarizing process for the arXiv paper: 2008.07864v1

The paper "The Relational Data Borg is Learning" by Dan Olteanu presents an approach that treats machine learning over relational data as a database problem. The author justifies this with two observations. First, the input to the learning task is commonly the result of a feature extraction query over the relational data. Second, the learning task requires computing group-by aggregates. The approach has been investigated for a number of supervised and unsupervised learning tasks, including ridge linear regression, factorization machines, support vector machines, decision trees, principal component analysis, and k-means, and also for linear algebra over data matrices.

The main message of the work is that the runtime performance of machine learning can be dramatically boosted by a toolbox of techniques that exploit knowledge of the underlying data. These include theoretical developments on the algebraic, combinatorial, and statistical structure of relational data processing, and systems development on code specialization, low-level computation sharing, and parallelization. The techniques aim to lower both the complexity and the constant factors of the learning time.

The work is the outcome of extensive collaboration between the author and colleagues from RelationalAI (Mahmoud Abo Khamis, Molham Aref, Hung Ngo, and XuanLong Nguyen) and from the FDB research project (Ahmet Kara, Milos Nikolic, Maximilian Schleich, Amir Shaikhha, Jakub Zavodny, and Haozhe Zhang). The author also thanks the members of the FDB project for the figures and examples used in the paper, and is grateful for support from industry: Amazon Web Services, Google, Infor, LogicBlox, Microsoft Azure, and RelationalAI.

Funding from EPSRC (Engineering and Physical Sciences Research Council) and ERC (European Research Council) is also acknowledged. The project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 682588. Overall, the paper highlights the potential for improving machine learning performance with techniques that exploit characteristics of relational data, and it covers both theoretical and practical advances in this area.
Created on 07 Aug. 2023

