ITEm: Unsupervised Image-Text Embedding Learning for eCommerce

Résumés déjà disponibles dans d'autres langues : en

Auteurs : Baohao Liao, Michael Kozielski, Sanjika Hewavitharana, Jiangbo Yuan, Shahram Khadivi, Tomer Lancewicki

arXiv: 2311.02084v2 - DOI (cs.CV)

Résumé : Product embedding serves as a cornerstone for a wide range of applications in eCommerce. The product embedding learned from multiple modalities shows significant improvement over that from a single modality, since different modalities provide complementary information. However, some modalities are more informatively dominant than others. How to teach a model to learn embedding from different modalities without neglecting information from the less dominant modality is challenging. We present an image-text embedding model (ITEm), an unsupervised learning method that is designed to better attend to image and text modalities. We extend BERT by (1) learning an embedding from text and image without knowing the regions of interest; (2) training a global representation to predict masked words and to construct masked image patches without their individual representations. We evaluate the pre-trained ITEm on two tasks: the search for extremely similar products and the prediction of product categories, showing substantial gains compared to strong baseline models.

Soumis à arXiv le 22 Oct. 2023

Posez des questions sur cet article à notre assistant IA

Vous pouvez aussi discutez avec plusieurs papiers à la fois ici.

Instructions pour utiliser l'assistant IA ?

Résultats du processus de synthèse de l'article arXiv : 2311.02084v2

Résumé Complet
Points clés
Résumé vulgarisé
Article de blog

Le résumé n'est pas encore prêt

Les points clés ne sont pas encore prêts

Le résumé vulgarisé n'est pas encore prêt

L'article de blog n'est pas encore prêt

Créé le 08 Jui. 2024

Disponible dans d'autres langues : en

Évaluez la qualité du contenu généré par l'IA en votant

Note : 0

Le résumé précédent a été créé il y a plus d'un an et peut être réexécuté (si nécessaire) en cliquant sur le bouton Exécuter ci-dessous.

ITEm: Unsupervised Image-Text Embedding Learning for eCommerce

Posez des questions sur cet article à notre assistant IA

Résultats du processus de synthèse de l'article arXiv : 2311.02084v2

Articles similaires résumés avec nos outils d'IA