Masked Autoencoders are Scalable Learners of Cellular Morphology

AI-generated keywords: Biological research

AI-generated Key Points

  • Deep vision models capture biological signal better than hand-crafted features
  • CNN- and ViT-based masked autoencoders significantly outperform weakly supervised baselines
  • A ViT-L/8 model trained on over 3.5 billion unique crops achieved relative improvements of up to 28% over the best weakly supervised baseline at inferring known biological relationships
  • The researchers adapted U-Nets for masked autoencoding (MU-Nets), with promising results
  • The study includes figures showing StringDB recall as a function of training FLOPs, and recall across different cosine similarity percentiles for each database

Authors: Oren Kraus, Kian Kenyon-Dean, Saber Saberian, Maryam Fallah, Peter McLean, Jess Leung, Vasudev Sharma, Ayla Khan, Jia Balakrishnan, Safiye Celik, Maciej Sypetkowski, Chi Vicky Cheng, Kristen Morse, Maureen Makes, Ben Mabey, Berton Earnshaw

Spotlight at NeurIPS 2023 Generative AI and Biology (GenBio) Workshop
License: CC BY-NC-SA 4.0

Abstract: Inferring biological relationships from cellular phenotypes in high-content microscopy screens provides significant opportunity and challenge in biological research. Prior results have shown that deep vision models can capture biological signal better than hand-crafted features. This work explores how self-supervised deep learning approaches scale when training larger models on larger microscopy datasets. Our results show that both CNN- and ViT-based masked autoencoders significantly outperform weakly supervised baselines. At the high end of our scale, a ViT-L/8 trained on over 3.5 billion unique crops sampled from 93 million microscopy images achieves relative improvements as high as 28% over our best weakly supervised baseline at inferring known biological relationships curated from public databases. Relevant code and select models released with this work can be found at: https://github.com/recursionpharma/maes_microscopy.
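
To make the idea of masked autoencoding concrete, here is a minimal NumPy sketch of the input side of the technique: hide a random subset of image patches, then score a reconstruction only on the hidden patches. It is an illustration under stated assumptions, not the authors' implementation. The function names, the 75% mask ratio, and the zero-fill of masked patches are all assumptions; the patch size of 8 follows the usual reading of the "/8" in "ViT-L/8".

```python
import numpy as np


def random_patch_mask(image, patch_size=8, mask_ratio=0.75, seed=0):
    """Zero out a random subset of non-overlapping square patches.

    image: array of shape (H, W, C); H and W must be divisible by patch_size.
    Returns the masked image and a boolean patch-grid mask (True = masked).
    mask_ratio=0.75 is an assumed value, not taken from the summary above.
    """
    h, w, _ = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    grid_h, grid_w = h // patch_size, w // patch_size
    n_patches = grid_h * grid_w
    n_masked = int(round(mask_ratio * n_patches))

    # Pick which patches to hide, without replacement.
    rng = np.random.default_rng(seed)
    masked_idx = rng.permutation(n_patches)[:n_masked]
    mask = np.zeros(n_patches, dtype=bool)
    mask[masked_idx] = True
    mask = mask.reshape(grid_h, grid_w)

    # Expand the patch-level mask to pixel resolution and blank those pixels.
    pixel_mask = np.repeat(np.repeat(mask, patch_size, axis=0), patch_size, axis=1)
    masked = image.copy()
    masked[pixel_mask] = 0.0
    return masked, mask


def masked_mse(original, reconstruction, mask, patch_size=8):
    """Mean squared error computed only over the masked patches."""
    pixel_mask = np.repeat(np.repeat(mask, patch_size, axis=0), patch_size, axis=1)
    diff = (original - reconstruction)[pixel_mask]
    return float(np.mean(diff ** 2))
```

In a real training loop, a model (for example a ViT or U-Net) would take the masked input and produce the reconstruction; here the loss function can be exercised with any array of the same shape as the original image.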

Submitted to arXiv on 27 Sep. 2023

Results of the summarizing process for the arXiv paper: 2309.16064v2

Inferring biological relationships from cellular phenotypes in high-content microscopy screens presents both significant opportunities and challenges for biological research. Previous studies have shown that deep vision models capture biological signal better than hand-crafted features. This study examines how self-supervised deep learning approaches scale when larger models are trained on larger microscopy datasets.

The results show that both CNN- and ViT-based masked autoencoders outperform weakly supervised baselines. At the largest scale examined, a ViT-L/8 model trained on over 3.5 billion unique crops sampled from 93 million microscopy images achieved relative improvements of up to 28% over the best weakly supervised baseline at inferring known biological relationships curated from public databases. The researchers also adapted U-Nets for masked autoencoding (MU-Nets) by training them to reconstruct masked sections of input images; the results reported for MU-Net-M are described as promising and in line with previous studies. The study includes figures showing StringDB recall as a function of training FLOPs, and recall across different cosine similarity percentiles for each database.

The study, by Oren Kraus, Kian Kenyon-Dean, Saber Saberian, Maryam Fallah, Peter McLean, Jess Leung, Vasudev Sharma, Ayla Khan, Jia Balakrishnan, Safiye Celik, Maciej Sypetkowski, Chi Vicky Cheng, Kristen Morse, Maureen Makes, Ben Mabey, and Berton Earnshaw, was a spotlight at the NeurIPS 2023 Generative AI and Biology (GenBio) Workshop. Overall, the work offers insight into applying self-supervised deep learning to the analysis of cellular morphology in high-content microscopy screens.

Relevant code and select models released with this research are available at https://github.com/recursionpharma/maes_microscopy.
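
The summary mentions recall of known relationships at different cosine similarity percentiles. One plausible way to compute such a metric is sketched below in NumPy. It is an illustration, not the paper's evaluation code: the function names are hypothetical, and the choice to count a pair as predicted when its similarity falls in either tail (at or below the p-th percentile, or at or above the (100 - p)-th percentile) is an assumption.

```python
import numpy as np


def pairwise_cosine(embeddings):
    """Cosine similarity between every pair of rows."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / norms
    return unit @ unit.T


def recall_at_percentile(embeddings, known_pairs, percentile=5.0):
    """Fraction of known pairs whose similarity lies in the extreme tails.

    embeddings: (n_items, dim) array, one row per perturbation (e.g. a gene).
    known_pairs: list of (i, j) row-index pairs annotated in a reference database.
    percentile: tail width in percent; two-sided selection is an assumption.
    """
    sims = pairwise_cosine(np.asarray(embeddings, dtype=float))

    # Thresholds come from the distribution over all distinct pairs (i < j).
    upper = np.triu_indices(sims.shape[0], k=1)
    all_sims = sims[upper]
    low = np.percentile(all_sims, percentile)
    high = np.percentile(all_sims, 100.0 - percentile)

    hits = sum(1 for i, j in known_pairs if sims[i, j] <= low or sims[i, j] >= high)
    return hits / len(known_pairs)
```

For example, with four embeddings and `percentile=10.0`, only the most similar and the most dissimilar pair are selected, so recall is the fraction of annotated pairs that land in those two slots.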
Created on 19 Mar. 2024
