Learning to Generate Image Embeddings with User-level Differential Privacy

AI-generated keywords: Image Embeddings, User-level Differential Privacy, Federated Learning, DP-FedEmb, Utility

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • Authors: Zheng Xu, Maxwell Collins, Yuxiao Wang, Liviu Panait, Sewoong Oh, Sean Augenstein, Ting Liu, Florian Schroff, H. Brendan McMahan
  • Title: "Learning to Generate Image Embeddings with User-level Differential Privacy"
  • Challenge addressed: Training large image-to-embedding feature extractors with user-level differential privacy (DP)
  • Proposed solution: DP-FedEmb as a variant of federated learning algorithms incorporating per-user sensitivity control and noise addition
  • Techniques used: Virtual clients, partial aggregation, private local fine-tuning, public pretraining
  • Applications: Training image embedding models for faces, landmarks, and natural species on benchmark datasets (DigiFace, EMNIST, GLD, iNaturalist)
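  • Experiment results: Superior utility under the same privacy budget on the benchmark datasets; strong user-level DP guarantees of epsilon less than 4 with the utility drop kept within 5% when millions of users can participate in training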
  • Contribution: Enhancing privacy protection in large-scale image processing tasks while maintaining high levels of utility

Authors: Zheng Xu, Maxwell Collins, Yuxiao Wang, Liviu Panait, Sewoong Oh, Sean Augenstein, Ting Liu, Florian Schroff, H. Brendan McMahan

CVPR camera ready. Addressed reviewer comments. Switched from add-or-remove-one DP to substitute-one DP
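
As background for the two adjacency notions named in the comment above, the standard definition of $(\epsilon,\delta)$-differential privacy is reproduced here. This is the general textbook definition, not a formula taken from the paper; the symbols $\mathcal{M}$ (the randomized training mechanism), $D$, $D'$, $S$ and $\delta$ are notation assumed for this note. At user level, $D$ and $D'$ are adjacent when they differ in all the data of one user. Under add-or-remove-one adjacency, $D'$ is obtained by adding or removing one user's data; under substitute-one adjacency, it is obtained by replacing one user's data with another user's, so both datasets contain the same number of users.

```latex
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\epsilon}\, \Pr[\mathcal{M}(D') \in S] + \delta
\quad \text{for all adjacent } D, D' \text{ and all measurable sets } S.
```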

Abstract: Small on-device models have been successfully trained with user-level differential privacy (DP) for next word prediction and image classification tasks in the past. However, existing methods can fail when directly applied to learn embedding models using supervised training data with a large class space. To achieve user-level DP for large image-to-embedding feature extractors, we propose DP-FedEmb, a variant of federated learning algorithms with per-user sensitivity control and noise addition, to train from user-partitioned data centralized in the datacenter. DP-FedEmb combines virtual clients, partial aggregation, private local fine-tuning, and public pretraining to achieve strong privacy utility trade-offs. We apply DP-FedEmb to train image embedding models for faces, landmarks and natural species, and demonstrate its superior utility under same privacy budget on benchmark datasets DigiFace, EMNIST, GLD and iNaturalist. We further illustrate it is possible to achieve strong user-level DP guarantees of $\epsilon<4$ while controlling the utility drop within 5%, when millions of users can participate in training.
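
The "per-user sensitivity control and noise addition" named in the abstract can be illustrated with a generic clip-and-noise aggregation step. The following is a minimal sketch, assuming a DP-FedAvg-style round written in NumPy; it is not the authors' DP-FedEmb implementation, and it omits virtual clients, partial aggregation, private local fine-tuning and public pretraining. The function names and the parameters `clip_norm` and `noise_multiplier` are hypothetical, and turning a noise multiplier into a concrete $\epsilon$ requires a privacy accountant that is not shown.

```python
import numpy as np


def clip_update(update, clip_norm):
    """Scale one user's update so that its L2 norm is at most clip_norm."""
    norm = np.linalg.norm(update)
    # Updates already inside the ball are left unchanged (scale == 1.0).
    scale = min(1.0, clip_norm / max(norm, 1e-12))
    return update * scale


def private_aggregate(user_updates, clip_norm, noise_multiplier, rng):
    """Average clipped per-user updates after adding Gaussian noise to their sum.

    Assumption: the noise standard deviation on the sum is
    noise_multiplier * clip_norm. The resulting privacy guarantee depends
    on the adjacency notion and on accounting that is not done here.
    """
    clipped = [clip_update(u, clip_norm) for u in user_updates]
    total = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return (total + noise) / len(user_updates)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical model updates from three users (one vector per user).
    updates = [rng.normal(size=4) for _ in range(3)]
    noisy_mean = private_aggregate(updates, clip_norm=1.0,
                                   noise_multiplier=0.5, rng=rng)
    print(noisy_mean)
```

Clipping bounds how much any single user can move the aggregate, which is what makes the added noise meaningful at user level rather than at example level.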

Submitted to arXiv on 20 Nov. 2022

Results of the summarizing process for the arXiv paper: 2211.10844v2

In their paper "Learning to Generate Image Embeddings with User-level Differential Privacy," Zheng Xu, Maxwell Collins, Yuxiao Wang, Liviu Panait, Sewoong Oh, Sean Augenstein, Ting Liu, Florian Schroff, and H. Brendan McMahan address the challenge of training large image-to-embedding feature extractors with user-level differential privacy (DP). They propose DP-FedEmb, a variant of federated learning algorithms that incorporates per-user sensitivity control and noise addition. The approach trains from user-partitioned data centralized in the datacenter and combines virtual clients, partial aggregation, private local fine-tuning, and public pretraining to achieve strong privacy-utility trade-offs. The authors apply DP-FedEmb to train image embedding models for faces, landmarks, and natural species, and report superior utility under the same privacy budget on the benchmark datasets DigiFace, EMNIST, GLD, and iNaturalist. They further show that strong user-level DP guarantees of $\epsilon<4$ are achievable while keeping the utility drop within 5% when millions of users can participate in training. The work thus indicates that user-level privacy protection for large image embedding models need not come at a large cost in utility.
Created on 04 Dec. 2024
