ParaHome: Parameterizing Everyday Home Activities Towards 3D Generative Modeling of Human-Object Interactions

AI-generated keywords: Machine learning, 3D motion, Human-object interactions, Large-scale dataset, ParaHome system

AI-generated Key Points

  • Rich data encompassing 3D motion of humans and objects is crucial for machines to learn human interaction with the physical world
  • Scarcity of large-scale datasets capturing 3D motions of both humans and objects in causal interactions
  • Existing datasets focus on limited aspects, such as human motion without objects or hand-object interactions in static postures
  • Introduction of ParaHome system to capture and parameterize dynamic 3D movements of humans and objects in a home environment
  • System includes multi-view setup with 70 synchronized RGB cameras and wearable motion capture devices
  • Collection of a novel large-scale dataset of human-object interaction with advancements over existing datasets:
      • Capturing 3D body and dexterous hand manipulation motion alongside 3D object movement in a contextual home environment during natural activities
      • Encompassing human interaction with multiple objects in various episodic scenarios with corresponding descriptions in texts
      • Including articulated objects with multiple parts expressed with parameterized articulations
  • Participants perform sequences of actions involving the manipulation of one or two objects, including cooking-related actions and small actions that can occur in a room environment
  • Dataset captured from 30 participants (15 females and 15 males) interacting with objects
  • Each scenario consists of atomic actions divided into two sessions, with session durations ranging from [duration range]
  • Total [number] captures resulting in a total of [number]
  • Introduction of new research tasks for building generative models for learning and synthesizing human-object interactions using this dataset

Authors: Jeonghwan Kim, Jisoo Kim, Jeonghyeon Na, Hanbyul Joo

License: CC BY 4.0

Abstract: To enable machines to learn how humans interact with the physical world in our daily activities, it is crucial to provide rich data that encompasses the 3D motion of humans as well as the motion of objects in a learnable 3D representation. Ideally, this data should be collected in a natural setup, capturing the authentic dynamic 3D signals during human-object interactions. To address this challenge, we introduce the ParaHome system, designed to capture and parameterize dynamic 3D movements of humans and objects within a common home environment. Our system consists of a multi-view setup with 70 synchronized RGB cameras, as well as wearable motion capture devices equipped with an IMU-based body suit and hand motion capture gloves. By leveraging the ParaHome system, we collect a novel large-scale dataset of human-object interaction. Notably, our dataset offers key advancement over existing datasets in three main aspects: (1) capturing 3D body and dexterous hand manipulation motion alongside 3D object movement within a contextual home environment during natural activities; (2) encompassing human interaction with multiple objects in various episodic scenarios with corresponding descriptions in texts; (3) including articulated objects with multiple parts expressed with parameterized articulations. Building upon our dataset, we introduce new research tasks aimed at building a generative model for learning and synthesizing human-object interactions in a real-world room setting.

Submitted to arXiv on 18 Jan. 2024


Results of the summarizing process for the arXiv paper: 2401.10232v1

To enable machines to learn how humans interact with the physical world in our daily activities, it is crucial to provide rich data that encompasses the 3D motion of humans as well as the motion of objects in a learnable 3D representation. However, there is a scarcity of large-scale datasets captured in natural and casual settings that include the 3D motions of both humans and objects occurring in causal interactions. Existing datasets primarily focus on limited aspects of these challenges, such as capturing human motion without objects or focusing on hand-object interactions in static postures or relatively simple and short interactions. To address these limitations, the authors introduce the ParaHome system designed to capture and parameterize dynamic 3D movements of humans and objects within a common home environment. The system consists of a multi-view setup with 70 synchronized RGB cameras, as well as wearable motion capture devices equipped with an IMU-based body suit and hand motion capture gloves. By leveraging this system, they collect a novel large-scale dataset of human-object interaction. The dataset offers key advancements over existing datasets in three main aspects: (1) capturing 3D body and dexterous hand manipulation motion alongside 3D object movement within a contextual home environment during natural activities; (2) encompassing human interaction with multiple objects in various episodic scenarios with corresponding descriptions in texts; (3) including articulated objects with multiple parts expressed with parameterized articulations. The participants perform sequences of actions composed of small atomic actions involving the manipulation of one or two objects. A total of [number] actions are performed by participants, including cooking-related actions and small actions that can occur in a room environment.
Each participant performs [number] scenarios consisting of non-cooking actions and cooking-related actions placed in semi-arbitrary order, with a corresponding verbal instruction for each action. The dataset was captured from 30 participants (15 females and 15 males) interacting with [number] objects. Each scenario performed by the participants consists of [number] atomic actions, divided into two capture sessions due to storage limits. The duration of each session ranges from [duration range]. In total, [number] captures were recorded, resulting in a total of [number]. The authors introduce new research tasks aimed at building a generative model for learning and synthesizing human-object interactions in a real-world room setting using this dataset. This work addresses the limitations of existing datasets and provides valuable data for advancing the understanding and modeling of human-object interactions in natural and casual settings.
Created on 21 Jan. 2024
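
The summary mentions articulated objects "expressed with parameterized articulations" but does not say how they are parameterized. As a minimal sketch of the general idea, and not ParaHome's actual data format, a moving part can be described by a single scalar per joint: an angle for a hinge (for example a door or laptop lid) or a displacement for a slider (for example a drawer). The class and function names below (RevoluteJoint, PrismaticJoint, articulate_revolute, articulate_prismatic) are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class RevoluteJoint:
    """Hypothetical hinge: a unit axis passing through a pivot point,
    both expressed in the base part's coordinate frame."""
    pivot: Vec3
    axis: Vec3


@dataclass(frozen=True)
class PrismaticJoint:
    """Hypothetical slider: a unit direction along which the part translates."""
    axis: Vec3


def rotate_about_axis(v: Vec3, axis: Vec3, angle: float) -> Vec3:
    """Rodrigues' rotation formula for a unit axis k:
    v*cos(a) + (k x v)*sin(a) + k*(k . v)*(1 - cos(a))."""
    kx, ky, kz = axis
    vx, vy, vz = v
    c, s = math.cos(angle), math.sin(angle)
    dot = kx * vx + ky * vy + kz * vz
    cross = (ky * vz - kz * vy, kz * vx - kx * vz, kx * vy - ky * vx)
    return tuple(
        vi * c + ci * s + ki * dot * (1.0 - c)
        for vi, ci, ki in zip(v, cross, axis)
    )


def articulate_revolute(point: Vec3, joint: RevoluteJoint, angle: float) -> Vec3:
    """Move a point on the rotating part, given the hinge angle in radians."""
    rel = tuple(p - o for p, o in zip(point, joint.pivot))
    rot = rotate_about_axis(rel, joint.axis, angle)
    return tuple(r + o for r, o in zip(rot, joint.pivot))


def articulate_prismatic(point: Vec3, joint: PrismaticJoint, distance: float) -> Vec3:
    """Move a point on the sliding part, given the displacement along the axis."""
    return tuple(p + distance * a for p, a in zip(point, joint.axis))


if __name__ == "__main__":
    # Illustrative values only: a hinge along z through (1, 0, 0), opened by 90 degrees.
    hinge = RevoluteJoint(pivot=(1.0, 0.0, 0.0), axis=(0.0, 0.0, 1.0))
    print(articulate_revolute((2.0, 0.0, 0.0), hinge, math.pi / 2))
```

Under this kind of representation, the state of an articulated object at each frame reduces to the rigid pose of its base part plus one number per joint, which is one plausible reading of why such a parameterization would be convenient for generative modeling.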

