Learning from What is Already Out There: Few-shot Sign Language Recognition with Online Dictionaries
AI-generated Key Points
⚠The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.
- Authors Matyáš Boháček and Marek Hrúz address challenges in current sign language recognition models:
  - Models require large training datasets of laboratory-like videos, which are difficult and costly to collect.
  - Few systems are publicly available, and localization for less-populated sign languages is limited.
- Proposal to overcome these limitations and democratize the technology:
  - Utilize online text-to-video dictionaries, which contain annotated data on various attributes and sign languages.
  - Introduce the UWB-SL-Wild few-shot dataset, sourced from dictionary-scraped videos, to reflect the actual distribution of online sign language data.
- Approach presented in the study:
  - Select glosses overlapping with the existing datasets WLASL100 and ASLLVD, and share their class mappings, for transfer learning experiments.
  - Introduce a novel approach to training sign language recognition models in a few-shot scenario.
- Results of the proposed method:
  - State-of-the-art results on the ASLLVD-Skeleton and ASLLVD-Skeleton-20 datasets, with top-1 accuracy of $30.97~\%$ and $95.45~\%$, respectively.
- Contribution to advancing sign language recognition technology:
  - Addressing challenges related to training data availability.
  - Making the technology more inclusive across diverse linguistic communities.
Authors: Matyáš Boháček, Marek Hrúz
Abstract: Today's sign language recognition models require large training corpora of laboratory-like videos, whose collection involves an extensive workforce and financial resources. As a result, only a handful of such systems are publicly available, not to mention their limited localization capabilities for less-populated sign languages. Utilizing online text-to-video dictionaries, which inherently hold annotated data of various attributes and sign languages, and training models in a few-shot fashion hence poses a promising path for the democratization of this technology. In this work, we collect and open-source the UWB-SL-Wild few-shot dataset, the first of its kind training resource consisting of dictionary-scraped videos. This dataset represents the actual distribution and characteristics of available online sign language data. We select glosses that directly overlap with the already existing datasets WLASL100 and ASLLVD and share their class mappings to allow for transfer learning experiments. Apart from providing baseline results on a pose-based architecture, we introduce a novel approach to training sign language recognition models in a few-shot scenario, resulting in state-of-the-art results on ASLLVD-Skeleton and ASLLVD-Skeleton-20 datasets with top-1 accuracy of $30.97~\%$ and $95.45~\%$, respectively.
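The gloss-selection step mentioned in the abstract (keeping only glosses that overlap with an existing dataset and reusing its class mapping) can be illustrated with a minimal sketch. This is not the authors' code: the function name `overlapping_class_map`, the case-insensitive matching rule, and the toy glosses and class ids below are all assumptions made purely for illustration.

```python
def overlapping_class_map(dictionary_glosses, reference_class_map):
    """Keep glosses found in both sources, reusing the reference class ids.

    Hypothetical helper: matching is done on lower-cased, stripped strings,
    which is an assumption and not a rule stated in the paper.
    """
    normalized = {g.strip().lower() for g in dictionary_glosses}
    return {
        gloss: class_id
        for gloss, class_id in reference_class_map.items()
        if gloss.strip().lower() in normalized
    }


# Toy data (hypothetical, not taken from WLASL100 or ASLLVD).
reference_class_map = {"book": 0, "drink": 1, "computer": 2, "chair": 3}
dictionary_glosses = ["Book", "chair ", "apple"]

shared = overlapping_class_map(dictionary_glosses, reference_class_map)
print(shared)  # {'book': 0, 'chair': 3}
```

Reusing the reference dataset's class ids, rather than assigning new ones, is what would let a model trained on the scraped videos be evaluated or fine-tuned on the existing dataset without remapping labels.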