Gaze-Informed Multi-Objective Imitation Learning from Human Demonstrations
AI-generated Key Points
⚠ The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.
- Teaching learning agents from human demonstrations via supervised learning is widely explored in human-robot interaction.
- The study introduces a novel approach that incorporates eye gaze information to enhance agent performance.
- The proposed imitation learning architecture aims to improve task completion rates and optimize path efficiency by leveraging insights into where the demonstrator directs their visual attention.
- A Gaze-Informed Multi-Objective Imitation Learning framework is introduced, which learns from human action demonstrations and eye tracking data concurrently.
- The approach is designed for tasks where human gaze information provides contextual cues for effective decision-making.
- Tested in a visual navigation scenario with an unmanned quadrotor, the model achieves significantly higher task completion rates and generates more efficient navigation paths compared to a baseline model.
- The model also learns to predict human visual attention patterns, showing that it can learn from an additional human input modality alongside the action demonstrations.
- The work emphasizes the importance of integrating visual attention information into agent training and encourages adopting such approaches when training agents for visuomotor tasks.
Authors: Ritwik Bera, Vinicius G. Goecks, Gregory M. Gremillion, Vernon J. Lawhern, John Valasek, Nicholas R. Waytowich
Abstract: In the field of human-robot interaction, teaching learning agents from human demonstrations via supervised learning has been widely studied and successfully applied to multiple domains such as self-driving cars and robot manipulation. However, the majority of the work on learning from human demonstrations utilizes only behavioral information from the demonstrator, i.e. what actions were taken, and ignores other useful information. In particular, eye gaze information can give valuable insight towards where the demonstrator is allocating their visual attention, and leveraging such information has the potential to improve agent performance. Previous approaches have only studied the utilization of attention in simple, synchronous environments, limiting their applicability to real-world domains. This work proposes a novel imitation learning architecture to learn concurrently from human action demonstration and eye tracking data to solve tasks where human gaze information provides important context. The proposed method is applied to a visual navigation task, in which an unmanned quadrotor is trained to search for and navigate to a target vehicle in a real-world, photorealistic simulated environment. When compared to a baseline imitation learning architecture, results show that the proposed gaze augmented imitation learning model is able to learn policies that achieve significantly higher task completion rates, with more efficient paths, while simultaneously learning to predict human visual attention. This research aims to highlight the importance of multimodal learning of visual attention information from additional human input modalities and encourages the community to adopt them when training agents from human demonstrations to perform visuomotor tasks.
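The idea of learning concurrently from action demonstrations and eye tracking data can be illustrated with a combined training objective. The sketch below is only an assumption about how such an objective might look: the summary does not state the paper's actual loss, network, or weighting, so the weighted sum of two mean-squared-error terms, the function names, and the `gaze_weight` coefficient are all hypothetical.

```python
from typing import Sequence

Vector = Sequence[float]


def mean_squared_error(predicted: Vector, target: Vector) -> float:
    """Average squared difference between two equal-length, non-empty vectors."""
    if len(predicted) != len(target) or not target:
        raise ValueError("predicted and target must be non-empty and equal in length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def multi_objective_loss(
    pred_action: Vector,
    demo_action: Vector,
    pred_gaze: Vector,
    demo_gaze: Vector,
    gaze_weight: float = 0.5,  # hypothetical trade-off coefficient, not from the paper
) -> float:
    """Weighted sum of an action-imitation term and a gaze-prediction term."""
    action_loss = mean_squared_error(pred_action, demo_action)
    gaze_loss = mean_squared_error(pred_gaze, demo_gaze)
    return action_loss + gaze_weight * gaze_loss


# One demonstration step (illustrative values): a 4-D control command
# and a normalized (x, y) gaze point on the camera image.
loss = multi_objective_loss(
    pred_action=[0.5, 0.0, 0.0, 0.0],
    demo_action=[1.0, 0.0, 0.0, 0.0],
    pred_gaze=[0.5, 0.5],
    demo_gaze=[0.5, 1.0],
)
print(loss)
```

In a setup like this, both terms would typically be computed from the same shared visual features, so the gaze term acts as an auxiliary signal about where the demonstrator was looking; setting `gaze_weight` to zero reduces the objective to plain behavior cloning, which corresponds to the kind of baseline the abstract compares against.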