A cross-study analysis of drug response prediction in cancer cell lines

AI-generated keywords: Machine Learning

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • Machine learning models can predict drug response for personalized cancer treatment
  • Cross-validation within a single study may provide an overly optimistic estimate of prediction performance on independent test sets
  • Researchers used machine learning to analyze five publicly available cell line-based data sets to assess model generalizability between different studies
  • A multitasking deep neural network achieved the best cross-study generalizability
  • Differences in viability assays can limit model generalizability across studies
  • Drug diversity, more than tumor diversity, is crucial for raising model generalizability in preclinical screening
  • Screening a more diverse set of drugs could improve model generalizability across studies, which would support the use of such models in personalized cancer treatment
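
The "partial data" idea behind the drug-diversity point can be illustrated with a minimal sketch. It assumes the screen is stored as a list of (cell line, drug, response) records; the `subsample_by` helper, the record layout, and the toy 10 x 8 screen are all hypothetical and are not taken from the paper. The sketch only shows the mechanics of building two training subsets of equal size, one with fewer drugs and one with fewer cell lines, which could then be compared.

```python
import random

def subsample_by(records, key, fraction, seed=0):
    """Keep records whose `key` value lies in a random subset of the unique values."""
    values = sorted({r[key] for r in records})
    k = max(1, round(fraction * len(values)))
    kept = set(random.Random(seed).sample(values, k))
    return [r for r in records if r[key] in kept]

# Hypothetical toy screen: 10 cell lines x 8 drugs, responses left as placeholders.
records = [
    {"cell": f"cell_{c}", "drug": f"drug_{d}", "response": None}
    for c in range(10)
    for d in range(8)
]

# Same number of training records, but different kinds of diversity removed.
half_drugs = subsample_by(records, "drug", 0.5)   # all cell lines, half the drugs
half_cells = subsample_by(records, "cell", 0.5)   # all drugs, half the cell lines

print(len(half_drugs), len(half_cells))
```

Training the same model on each subset and testing on a different study would then indicate which kind of diversity matters more for that model and data.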

Authors: Fangfang Xia, Jonathan Allen, Prasanna Balaprakash, Thomas Brettin, Cristina Garcia-Cardona, Austin Clyde, Judith Cohn, James Doroshow, Xiaotian Duan, Veronika Dubinkina, Yvonne Evrard, Ya Ju Fan, Jason Gans, Stewart He, Pinyi Lu, Sergei Maslov, Alexander Partin, Maulik Shukla, Eric Stahlberg, Justin M. Wozniak, Hyunseung Yoo, George Zaki, Yitan Zhu, Rick Stevens

arXiv: 2104.08961v2 (q-bio.QM)
Accepted by Briefings in Bioinformatics

Abstract: To enable personalized cancer treatment, machine learning models have been developed to predict drug response as a function of tumor and drug features. However, most algorithm development efforts have relied on cross validation within a single study to assess model accuracy. While an essential first step, cross validation within a biological data set typically provides an overly optimistic estimate of the prediction performance on independent test sets. To provide a more rigorous assessment of model generalizability between different studies, we use machine learning to analyze five publicly available cell line-based data sets: NCI60, CTRP, GDSC, CCLE and gCSI. Based on observed experimental variability across studies, we explore estimates of prediction upper bounds. We report performance results of a variety of machine learning models, with a multitasking deep neural network achieving the best cross-study generalizability. By multiple measures, models trained on CTRP yield the most accurate predictions on the remaining testing data, and gCSI is the most predictable among the cell line data sets included in this study. With these experiments and further simulations on partial data, two lessons emerge: (1) differences in viability assays can limit model generalizability across studies, and (2) drug diversity, more than tumor diversity, is crucial for raising model generalizability in preclinical screening.
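
The contrast between within-study cross validation and cross-study testing can be sketched on synthetic data. Everything below is an assumption made for illustration: the three studies are simulated, a one-feature least-squares line stands in for the paper's models, and the `offset` parameter is a crude stand-in for an assay-specific shift. It is not the paper's pipeline and uses none of its data.

```python
import random
from statistics import mean

def make_study(seed, n=200, slope=1.0, offset=0.0, noise=0.3):
    """Synthetic (feature, response) pairs; `offset` mimics an assay-specific shift."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    ys = [slope * x + offset + rng.gauss(0, noise) for x in xs]
    return xs, ys

def fit_line(xs, ys):
    """Closed-form least squares for y = b * x + a; returns (b, a)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return b, my - b * mx

def r2(model, xs, ys):
    """Coefficient of determination of `model` on the given data."""
    b, a = model
    my = mean(ys)
    ss_res = sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

def within_study_cv(xs, ys, k=5):
    """Mean R^2 over k interleaved folds of a single study."""
    n = len(xs)
    scores = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i not in test_idx]
        test = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i in test_idx]
        scores.append(r2(fit_line(*zip(*train)), *zip(*test)))
    return mean(scores)

# Hypothetical studies; study_C has a large shift relative to study_A.
studies = {
    "study_A": make_study(seed=1, offset=0.0),
    "study_B": make_study(seed=2, offset=0.1),
    "study_C": make_study(seed=3, offset=0.8),
}

# Rows: training study. Columns: test study. Diagonal: within-study CV.
matrix = {}
for src, (xs, ys) in studies.items():
    model = fit_line(xs, ys)
    matrix[src] = {
        tgt: within_study_cv(xs, ys) if tgt == src else r2(model, *studies[tgt])
        for tgt in studies
    }

for src, row in matrix.items():
    print(src, {tgt: round(score, 2) for tgt, score in row.items()})
```

In this toy setup the diagonal (within-study) scores look healthy, while the score for a model trained on one study and tested on a study with a shifted response scale drops sharply, which is the kind of optimism gap the abstract describes.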

Submitted to arXiv on 18 Apr. 2021


Results of the summarizing process for the arXiv paper: 2104.08961v2

This paper's license does not allow us to build upon its content, so this summary was generated from the paper's metadata rather than the full article.

Machine learning models have been developed to predict drug response as a function of tumor and drug features, with the goal of enabling personalized cancer treatment. However, most algorithm development efforts rely on cross-validation within a single study to assess model accuracy, which typically gives an overly optimistic estimate of prediction performance on independent test sets. To assess model generalizability between studies more rigorously, the researchers used machine learning to analyze five publicly available cell line-based data sets: NCI60, CTRP, GDSC, CCLE, and gCSI.

Based on observed experimental variability across studies, the researchers explored estimates of prediction upper bounds. They reported performance results for a variety of machine learning models and found that a multitasking deep neural network achieved the best cross-study generalizability. By multiple measures, models trained on CTRP yielded the most accurate predictions on the remaining testing data, and gCSI was the most predictable of the cell line data sets included in the study.

From these experiments and further simulations on partial data, two lessons emerged: (1) differences in viability assays can limit model generalizability across studies, and (2) drug diversity, more than tumor diversity, is crucial for raising model generalizability in preclinical screening. The researchers also noted that drug sensitivity prediction is challenging due to biological complexity and experimental variability. The findings suggest that screening a more diverse set of drugs could improve model generalizability across studies, which would support the use of such models in personalized cancer treatment.
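
One way to think about a prediction upper bound derived from experimental variability is sketched below. It assumes that two studies measure the same cell line and drug pairs, each adding independent noise to a shared underlying response, and it uses the correlation between the two sets of measurements as a rough ceiling for a model trained on one study and tested on the other. The noise levels and the `pearson` helper are illustrative assumptions; this is not necessarily how the paper computes its bounds.

```python
import random
from statistics import mean

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

rng = random.Random(0)
n_pairs = 2000                  # overlapping (cell line, drug) pairs, assumed
true_sd, noise_sd = 1.0, 0.5    # assumed biological spread and measurement noise

# Shared underlying response, measured independently by two hypothetical studies.
true_response = [rng.gauss(0, true_sd) for _ in range(n_pairs)]
study_1 = [t + rng.gauss(0, noise_sd) for t in true_response]
study_2 = [t + rng.gauss(0, noise_sd) for t in true_response]

# Agreement between replicated measurements across the two studies.
observed_r = pearson(study_1, study_2)

# Under this noise model the expected correlation is var_true / (var_true + var_noise).
expected_r = true_sd**2 / (true_sd**2 + noise_sd**2)

print(round(observed_r, 3), round(expected_r, 3))
```

Under these assumptions, a model that reproduced one study's measurements perfectly would still agree with the other study only up to this correlation, so cross-study scores should be read against such a ceiling rather than against a perfect score.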
Created on 31 May 2023

