emg2qwerty: A Large Dataset with Baselines for Touch Typing using Surface Electromyography

AI-generated keywords: Surface electromyography, Wearable sensors, emg2qwerty dataset, Human-computer interaction, Automatic Speech Recognition

AI-generated Key Points

  • Surface electromyography (sEMG) is a non-invasive method for measuring muscle activity signals.
  • sEMG is sensitive enough to detect individual spinal neurons and rich enough to identify dozens of gestures and their nuances.
  • Wearable wrist-based sEMG sensors offer low-friction, subtle, information-rich inputs for human-computer interactions.
  • The emg2qwerty dataset is a large-scale collection of non-invasive electromyographic signals recorded at the wrists while touch typing on a QWERTY keyboard.
  • The dataset includes ground-truth annotations and reproducible baselines, with 1,135 sessions involving 108 users and totaling 346 hours of recording.
  • The dataset shows non-trivial but well-defined hierarchical relationships in the generative process, from neurons to muscles and muscle combinations, as well as domain shift across users and user sessions.
  • Standard modeling techniques from Automatic Speech Recognition (ASR) have shown strong baseline performance in predicting key-presses using sEMG signals alone.
  • Acknowledgments are extended to individuals who contributed to data collection, shared expertise in ASR techniques, provided engineering support, offered feedback on the research paper, engaged in discussions, and supported the work's vision and sponsorship.
  • The richness of both the task at hand and the emg2qwerty dataset is expected to drive progress in machine learning and neuroscientific communities.

Authors: Viswanath Sivakumar, Jeffrey Seely, Alan Du, Sean R Bittner, Adam Berenzweig, Anuoluwapo Bolarinwa, Alexandre Gramfort, Michael I Mandel

Submitted to NeurIPS 2024 Datasets and Benchmarks Track
License: CC BY-NC-SA 4.0

Abstract: Surface electromyography (sEMG) non-invasively measures signals generated by muscle activity with sufficient sensitivity to detect individual spinal neurons and richness to identify dozens of gestures and their nuances. Wearable wrist-based sEMG sensors have the potential to offer low friction, subtle, information rich, always available human-computer inputs. To this end, we introduce emg2qwerty, a large-scale dataset of non-invasive electromyographic signals recorded at the wrists while touch typing on a QWERTY keyboard, together with ground-truth annotations and reproducible baselines. With 1,135 sessions spanning 108 users and 346 hours of recording, this is the largest such public dataset to date. These data demonstrate non-trivial, but well defined hierarchical relationships both in terms of the generative process, from neurons to muscles and muscle combinations, as well as in terms of domain shift across users and user sessions. Applying standard modeling techniques from the closely related field of Automatic Speech Recognition (ASR), we show strong baseline performance on predicting key-presses using sEMG signals alone. We believe the richness of this task and dataset will facilitate progress in several problems of interest to both the machine learning and neuroscientific communities. Dataset and code can be accessed at https://github.com/facebookresearch/emg2qwerty.
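The headline figures in the abstract (1,135 sessions, 108 users, 346 hours) imply some rough per-user and per-session averages. The short sketch below works them out; the function name and dictionary keys are illustrative only, and the results are plain means, since the abstract gives no information about how sessions or hours are actually distributed across users.

```python
# Figures quoted in the abstract.
SESSIONS = 1135
USERS = 108
HOURS = 346


def dataset_averages(sessions, users, hours):
    """Return simple means derived from the dataset totals.

    These are averages only; the real per-user distribution is not
    stated in the abstract and may be uneven.
    """
    return {
        "sessions_per_user": sessions / users,
        "minutes_per_session": hours * 60 / sessions,
        "hours_per_user": hours / users,
    }


averages = dataset_averages(SESSIONS, USERS, HOURS)
for name, value in averages.items():
    print(f"{name}: {value:.1f}")
```

On these totals, the means come out to roughly 10.5 sessions per user, 18.3 minutes per session, and 3.2 hours of recording per user.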

Submitted to arXiv on 26 Oct. 2024


Results of the summarizing process for the arXiv paper: 2410.20081v1

Surface electromyography (sEMG) is a non-invasive method for measuring the signals generated by muscle activity. It is sensitive enough to detect individual spinal neurons and rich enough to identify various gestures and their nuances. Wearable wrist-based sEMG sensors could therefore provide low-friction, subtle, information-rich inputs for human-computer interaction.

The emg2qwerty dataset was introduced for this purpose. It is a large-scale collection of non-invasive electromyographic signals recorded at the wrists while touch typing on a QWERTY keyboard, together with ground-truth annotations and reproducible baselines. It comprises 1,135 sessions from 108 users, totaling 346 hours of recording, which makes it the largest public dataset of its kind to date. The data show non-trivial but well-defined hierarchical relationships in the generative process, from neurons to muscles and muscle combinations, and they show domain shift across users and user sessions. Applying standard modeling techniques from Automatic Speech Recognition (ASR) yields strong baseline performance in predicting key-presses from sEMG signals alone.

The authors acknowledge individuals who contributed to the implementation of the data collection protocol, shared expertise in ASR techniques, provided engineering support, offered feedback on the paper, engaged in helpful discussions, and supported the vision and sponsorship behind this work.

The richness of both the task and the emg2qwerty dataset is expected to drive progress on problems of interest to the machine learning and neuroscientific communities. The dataset and code are available at https://github.com/facebookresearch/emg2qwerty. The paper, submitted to the NeurIPS 2024 Datasets and Benchmarks Track, is available at http://arxiv.org/abs/2410.20081v1.
Created on 03 Nov. 2024
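The summary does not say which ASR techniques the baselines use. As one illustrative assumption, the sketch below shows greedy decoding in the style of Connectionist Temporal Classification (CTC), a common ASR approach for turning per-frame scores into a symbol sequence, applied here to key-presses. The blank index, the toy alphabet, and the score values are all hypothetical and are not taken from the paper or its code.

```python
BLANK = 0  # hypothetical index reserved for the CTC "blank" symbol


def ctc_greedy_decode(frame_scores, alphabet, blank=BLANK):
    """Collapse per-frame scores into a key sequence, CTC-style.

    For each frame, take the highest-scoring symbol, then drop
    consecutive repeats and blanks. A blank between two identical
    symbols keeps both, which is how a repeated key survives.
    """
    decoded = []
    previous = blank
    for scores in frame_scores:
        best = max(range(len(scores)), key=scores.__getitem__)
        if best != blank and best != previous:
            decoded.append(alphabet[best])
        previous = best
    return "".join(decoded)


# Toy example: index 0 is blank, 1 is "h", 2 is "i" (made-up scores).
alphabet = ["", "h", "i"]
frame_scores = [
    [0.1, 0.8, 0.1],    # h
    [0.2, 0.7, 0.1],    # h again (repeat, collapsed)
    [0.9, 0.05, 0.05],  # blank
    [0.1, 0.1, 0.8],    # i
    [0.1, 0.2, 0.7],    # i again (repeat, collapsed)
]
typed = ctc_greedy_decode(frame_scores, alphabet)
print(typed)
```

In a real pipeline the per-frame scores would come from a model run over windows of the sEMG signal; here they are hard-coded so that the collapsing rule can be read in isolation.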

