Format Specification

LeRobot Data is the EgoSuite export format intended for robot learning workflows. It organizes egocentric episodes into a training-friendly dataset layout for downstream policy learning, evaluation, and data loading.

Overview

This dataset contains egocentric data captured from a head-mounted device and optional wrist-mounted cameras. Each episode records a single continuous human activity, including synchronized RGB video streams, head pose, hand pose, optional body pose in world coordinates, and action-level semantic annotations. All camera videos are lens-distortion corrected (undistorted).

Folder Structure

Each episode is stored as a self-contained LeRobot v3.0 dataset:

{task_name}/{episode_uuid}/
├── data/
│   └── chunk-000/
│       └── file-000.parquet       # per-frame state data
├── videos/
│   └── observation.images.{cam}/
│       └── chunk-000/
│           └── file-000.mp4       # video per camera
├── meta/
│   ├── info.json                  # dataset-level metadata
│   ├── tasks.parquet              # task index mapping
│   ├── subtasks.parquet           # subtask index mapping
│   └── episodes/
│       └── chunk-000/
│           └── file-000.parquet   # episode-level metadata & stats
├── pointcloud/
│   └── frame_*.pcd                # sparse point clouds (optional)
└── annotation.json                # original action-level semantic annotation file
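
As a minimal sketch of the layout above, the helper below lists the files a single-chunk episode folder is expected to contain. The function name `expected_episode_files` and the camera key `head_left` are illustrative assumptions, not part of the specification; the optional `pointcloud/` folder is left out.

```python
from pathlib import PurePosixPath


def expected_episode_files(task_name, episode_uuid, cams):
    """Return relative paths of the files in one single-chunk episode folder."""
    root = PurePosixPath(task_name) / episode_uuid
    paths = [
        root / "data" / "chunk-000" / "file-000.parquet",
        root / "meta" / "info.json",
        root / "meta" / "tasks.parquet",
        root / "meta" / "subtasks.parquet",
        root / "meta" / "episodes" / "chunk-000" / "file-000.parquet",
        root / "annotation.json",
    ]
    # One video folder per camera; camera keys are assumed, e.g. "head_left".
    for cam in cams:
        paths.append(
            root / "videos" / f"observation.images.{cam}" / "chunk-000" / "file-000.mp4"
        )
    return [str(p) for p in paths]


if __name__ == "__main__":
    for path in expected_episode_files("task", "uuid", ["head_left", "head_right"]):
        print(path)
```

Comparing this list against the files actually on disk is a quick way to spot an incomplete export before loading any parquet data.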

Data Parquet Schema

data/chunk-*/file-*.parquet: one row per frame. All poses are expressed in world coordinates and stored as 32-bit floats (fp32).

#	Column	Type	Note
0	`index`	int64	Global frame index across the full shard
1	`episode_index`	int64	Episode index within the shard
2	`frame_index`	int64	Frame index within the episode
3	`timestamp`	float32	Time in seconds since episode start
4	`task_index`	int64	Maps to `meta/tasks.parquet`
5	`subtask_index`	int64 (nullable)	Maps to `meta/subtasks.parquet`; null if not annotated
6	`observation.state.hand_left_world`	float32	Left hand joint positions in world space. Shape (21, 3). See Hand Joint Convention below
7	`observation.state.hand_left_world_rotation`	float32	Left hand joint rotations in world space. Shape (21, 4), quaternion (qw, qx, qy, qz)
8	`observation.state.hand_right_world`	float32	Right hand joint positions in world space. Shape (21, 3)
9	`observation.state.hand_right_world_rotation`	float32	Right hand joint rotations in world space. Shape (21, 4), quaternion (qw, qx, qy, qz)
10	`observation.state.body_world`	float32	Body joint positions in world space. Shape (22, 3) for full-body data or (14, 3) for upper-body data. Optional: present only when the episode includes body data
11	`observation.state.body_world_rotation`	float32	Body joint rotations in world space. Shape (22, 4) for full-body data or (14, 4) for upper-body data, quaternion (qw, qx, qy, qz). Optional: present only when the episode includes body data
12	`observation.state.head_world`	float32	Head position in world space. Shape (1, 3)
13	`observation.state.head_world_rotation`	float32	Head rotation in world space. Shape (1, 4), quaternion (qw, qx, qy, qz)
14	`observation.state.head_left_camera_position`	float32	Head-left camera position in world space. Shape (3,)
15	`observation.state.head_left_camera_rotation`	float32	Head-left camera rotation in world space. Shape (4,), quaternion (qw, qx, qy, qz)
16	`observation.state.head_right_camera_position`	float32	Head-right camera position in world space. Shape (3,)
17	`observation.state.head_right_camera_rotation`	float32	Head-right camera rotation in world space. Shape (4,), quaternion (qw, qx, qy, qz)
18	`observation.state.left_wrist_camera_position`	float32	Left wrist camera position in world space. Shape (3,). Optional: present only when the episode includes wrist camera data
19	`observation.state.left_wrist_camera_rotation`	float32	Left wrist camera rotation in world space. Shape (4,), quaternion (qw, qx, qy, qz). Optional: present only when the episode includes wrist camera data
20	`observation.state.right_wrist_camera_position`	float32	Right wrist camera position in world space. Shape (3,). Optional: present only when the episode includes wrist camera data
21	`observation.state.right_wrist_camera_rotation`	float32	Right wrist camera rotation in world space. Shape (4,), quaternion (qw, qx, qy, qz). Optional: present only when the episode includes wrist camera data

Notes:

All position values are in meters.
All quaternions use scalar-first order: (qw, qx, qy, qz).
Camera columns follow the pattern observation.state.{cam}_camera_position / {cam}_camera_rotation for each camera present in the episode. Wrist camera columns are optional.
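
Because the quaternions are scalar-first, they usually need reordering or conversion before use with libraries that expect scalar-last (qx, qy, qz, qw) input. The sketch below converts one (qw, qx, qy, qz) quaternion to a 3x3 rotation matrix in plain Python. It assumes the common Hamilton convention for unit quaternions, which the specification does not state explicitly; the function names are illustrative.

```python
import math


def quat_wxyz_to_matrix(q):
    """Convert a scalar-first quaternion (qw, qx, qy, qz) to a 3x3 rotation matrix.

    Assumes the Hamilton convention; the input is normalized first.
    """
    w, x, y, z = q
    norm = math.sqrt(w * w + x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("zero-norm quaternion")
    w, x, y, z = w / norm, x / norm, y / norm, z / norm
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def wxyz_to_xyzw(q):
    """Reorder (qw, qx, qy, qz) to scalar-last (qx, qy, qz, qw)."""
    w, x, y, z = q
    return (x, y, z, w)


if __name__ == "__main__":
    # The identity quaternion maps to the identity matrix.
    print(quat_wxyz_to_matrix((1.0, 0.0, 0.0, 0.0)))
```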

Hand Joint Convention

# Ordered list: entry i names row i of the (21, 3) and (21, 4) arrays.
LEFT_HAND_JOINTS = [
    'leftWrist',
    'leftThumbMCP',
    'leftThumbPIP',
    'leftThumbDIP',
    'leftThumbTip',
    'leftIndexMCP',
    'leftIndexPIP',
    'leftIndexDIP',
    'leftIndexTip',
    'leftMiddleMCP',
    'leftMiddlePIP',
    'leftMiddleDIP',
    'leftMiddleTip',
    'leftRingMCP',
    'leftRingPIP',
    'leftRingDIP',
    'leftRingTip',
    'leftLittleMCP',
    'leftLittlePIP',
    'leftLittleDIP',
    'leftLittleTip',
]

# Ordered list: entry i names row i of the (21, 3) and (21, 4) arrays.
RIGHT_HAND_JOINTS = [
    'rightWrist',
    'rightThumbMCP',
    'rightThumbPIP',
    'rightThumbDIP',
    'rightThumbTip',
    'rightIndexMCP',
    'rightIndexPIP',
    'rightIndexDIP',
    'rightIndexTip',
    'rightMiddleMCP',
    'rightMiddlePIP',
    'rightMiddleDIP',
    'rightMiddleTip',
    'rightRingMCP',
    'rightRingPIP',
    'rightRingDIP',
    'rightRingTip',
    'rightLittleMCP',
    'rightLittlePIP',
    'rightLittleDIP',
    'rightLittleTip',
]
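
The hand layout is regular: one wrist joint followed by four joints (MCP, PIP, DIP, Tip) for each of five fingers. The sketch below rebuilds the name list from that pattern and derives a name-to-row lookup. It assumes that the listing order above is the row order of the (21, 3) and (21, 4) arrays; the helper names are illustrative.

```python
FINGERS = ["Thumb", "Index", "Middle", "Ring", "Little"]
SEGMENTS = ["MCP", "PIP", "DIP", "Tip"]


def hand_joint_names(side):
    """Return the 21 joint names for side 'left' or 'right', wrist first."""
    names = [f"{side}Wrist"]
    for finger in FINGERS:
        for segment in SEGMENTS:
            names.append(f"{side}{finger}{segment}")
    return names


LEFT = hand_joint_names("left")
# Name -> row index, assuming listing order equals array row order.
LEFT_INDEX = {name: i for i, name in enumerate(LEFT)}
# Rows of the five fingertips, thumb to little finger.
LEFT_FINGERTIPS = [LEFT_INDEX[f"left{finger}Tip"] for finger in FINGERS]

if __name__ == "__main__":
    print(LEFT_FINGERTIPS)
```

With such a lookup, selecting the fingertip positions from one frame reduces to indexing the hand array with `LEFT_FINGERTIPS`.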

Body Joint Convention

Full-body data uses the 22-joint convention:

# Ordered list: entry i names row i of the (22, 3) and (22, 4) arrays.
BODY_JOINTS = [
    'Pelvis',
    'leftHip',
    'rightHip',
    'Spine1',
    'leftKnee',
    'rightKnee',
    'Spine2',
    'leftAnkle',
    'rightAnkle',
    'Spine3',
    'leftFoot',
    'rightFoot',
    'Neck',
    'leftCollar',
    'rightCollar',
    'Head',
    'leftShoulder',
    'rightShoulder',
    'leftElbow',
    'rightElbow',
    'leftWrist',
    'rightWrist',
]

Upper-body data uses 14 joints. It removes the lower-limb joints from the 22-joint layout and reindexes the remaining joints compactly:

# Ordered list: entry i names row i of the (14, 3) and (14, 4) arrays.
UPPER_BODY_JOINTS = [
    'Pelvis',
    'Spine1',
    'Spine2',
    'Spine3',
    'Neck',
    'leftCollar',
    'rightCollar',
    'Head',
    'leftShoulder',
    'rightShoulder',
    'leftElbow',
    'rightElbow',
    'leftWrist',
    'rightWrist',
]
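
The relationship between the two layouts can be sketched as follows: dropping the hip, knee, ankle, and foot joints from the 22-joint list yields the 14-joint list in the same relative order. The code assumes that listing order equals array row order; `UPPER_TO_FULL` and `full_to_upper` are illustrative names.

```python
BODY_JOINTS = [
    "Pelvis", "leftHip", "rightHip", "Spine1", "leftKnee", "rightKnee",
    "Spine2", "leftAnkle", "rightAnkle", "Spine3", "leftFoot", "rightFoot",
    "Neck", "leftCollar", "rightCollar", "Head", "leftShoulder",
    "rightShoulder", "leftElbow", "rightElbow", "leftWrist", "rightWrist",
]
LOWER_LIMB_SUFFIXES = ("Hip", "Knee", "Ankle", "Foot")

# Remove lower-limb joints and keep the remaining order.
UPPER_BODY_JOINTS = [n for n in BODY_JOINTS if not n.endswith(LOWER_LIMB_SUFFIXES)]
# For each upper-body row, the matching row in the full-body layout.
UPPER_TO_FULL = [BODY_JOINTS.index(n) for n in UPPER_BODY_JOINTS]


def full_to_upper(full_body_rows):
    """Select the 14 upper-body rows from a 22-row full-body array (list of rows)."""
    if len(full_body_rows) != len(BODY_JOINTS):
        raise ValueError("expected 22 rows")
    return [full_body_rows[i] for i in UPPER_TO_FULL]


if __name__ == "__main__":
    print(UPPER_TO_FULL)
```

This makes it possible to train on a mix of full-body and upper-body episodes by reducing the former to the shared 14-joint subset.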

Task and Subtask

Location	Purpose
`meta/tasks.parquet`	Maps `task_index` (int) to the corresponding task name (string)
`meta/subtasks.parquet`	Maps `subtask_index` (int) to the corresponding subtask name (string)
`data/chunk-*/file-*.parquet`	Includes per-frame columns `task_index` and `subtask_index`

meta/tasks.parquet

One row per unique task.

Field	Storage	Type	Description
Task name	Pandas index	str	Human-readable task description
`task_index`	Column	int	Integer ID referenced by `task_index` in data parquets

Example:

                                                          task_index
task
Coffee Table Snack Setup Preparation: The person...            0

Some parquet readers or viewers show the physical parquet columns instead of the pandas index view:

{"task_index": 0, "task": "Coffee Table Snack Setup Preparation: The person..."}

Index.name must be "task" and Index.dtype must be str.
The DataFrame must contain exactly one column: task_index.

meta/subtasks.parquet

One row per unique subtask.

Field	Storage	Type	Description
Subtask name	Pandas index	str	Human-readable subtask description
`subtask_index`	Column	int	Integer ID referenced by `subtask_index` in data parquets

Example:

                                        subtask_index
subtask
Walked to the cabinet.                              0
Squat down.                                         1
Open the drawer.                                    2

Some parquet readers or viewers show the physical parquet columns instead of the pandas index view:

{"subtask_index": 0, "subtask": "Walked to the cabinet."}
{"subtask_index": 1, "subtask": "Squat down."}
{"subtask_index": 2, "subtask": "Open the drawer."}

Index.name must be "subtask" and Index.dtype must be str.
The DataFrame must contain exactly one column: subtask_index.
Every subtask_index value that appears in data/chunk-*/file-*.parquet must have a corresponding row in this file.
Frames with no annotation have subtask_index = null.
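
The rules above can be sketched as a small lookup over the physical records (the JSON-like view shown earlier). The sketch maps each per-frame `subtask_index` to its name, passes null through as `None`, and raises on an index with no matching row. The function names are illustrative, and reading the parquet files into these plain Python structures is left to whichever parquet reader is in use.

```python
def build_subtask_lookup(records):
    """Map subtask_index -> subtask name from physical parquet records."""
    lookup = {}
    for record in records:
        index = record["subtask_index"]
        if index in lookup:
            raise ValueError(f"duplicate subtask_index {index}")
        lookup[index] = record["subtask"]
    return lookup


def resolve_subtasks(frame_indices, lookup):
    """Resolve per-frame subtask_index values; None marks an unannotated frame."""
    names = []
    for index in frame_indices:
        if index is None:
            names.append(None)
        elif index in lookup:
            names.append(lookup[index])
        else:
            raise KeyError(f"subtask_index {index} missing from meta/subtasks.parquet")
    return names


if __name__ == "__main__":
    records = [
        {"subtask_index": 0, "subtask": "Walked to the cabinet."},
        {"subtask_index": 1, "subtask": "Squat down."},
        {"subtask_index": 2, "subtask": "Open the drawer."},
    ]
    print(resolve_subtasks([0, 0, None, 2], build_subtask_lookup(records)))
```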

meta/episodes/*.parquet

One row per episode.

Field	Type	Description
`episode_index`	int64	Zero-based episode index within the shard
`tasks`	list<string>	Task name(s) for this episode
`length`	int64	Total number of frames in the episode
`data/chunk_index`	int64	Chunk index of the data parquet file
`data/file_index`	int64	File index within the data chunk
`videos/observation.images.{cam}/chunk_index`	int64	Chunk index of the video file
`videos/observation.images.{cam}/file_index`	int64	File index within the video chunk
`videos/observation.images.{cam}/from_timestamp`	double	Start timestamp (seconds) of the video segment
`videos/observation.images.{cam}/to_timestamp`	double	End timestamp (seconds) of the video segment
`stats/{col}/min`	list<double>	Per-column min statistic
`stats/{col}/max`	list<double>	Per-column max statistic
`stats/{col}/mean`	list<double>	Per-column mean statistic
`stats/{col}/std`	list<double>	Per-column std statistic
`stats/{col}/count`	list<int64>	Frame count used for stats
`dataset_from_index`	int64	Global starting frame index of this episode within the shard
`dataset_to_index`	int64	Global ending frame index (exclusive) of this episode
`meta/episodes/chunk_index`	int64	Chunk index of the episodes parquet file that contains this row
`meta/episodes/file_index`	int64	File index within the episodes chunk
`camera_intrinsics/{cam}`	list<float>[8]	Camera intrinsics `[fx, fy, cx, cy, k1, k2, k3, k4]` using undistorted intrinsics. One column per camera
`episode_uuid`	string	Unique identifier for this episode
`environment_id`	string	Environment identifier
`scene_id`	string	Scene identifier
`operator_id`	string	Operator identifier
`batch_version`	string	Batch version identifier
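
As an illustration of the `camera_intrinsics/{cam}` layout, the sketch below unpacks the 8-element list into a 3x3 pinhole matrix plus the four remaining coefficients, and projects a camera-frame point to pixel coordinates. It assumes a plain pinhole model applies to the undistorted frames and that the camera looks along +z; neither is stated in the specification, and the coefficients k1..k4 are returned but not applied. The function names are illustrative.

```python
def unpack_intrinsics(intrinsics):
    """Split [fx, fy, cx, cy, k1, k2, k3, k4] into a 3x3 matrix K and [k1..k4]."""
    if len(intrinsics) != 8:
        raise ValueError("expected 8 intrinsics values")
    fx, fy, cx, cy = intrinsics[:4]
    K = [
        [fx, 0.0, cx],
        [0.0, fy, cy],
        [0.0, 0.0, 1.0],
    ]
    return K, list(intrinsics[4:])


def project_pinhole(point_cam, K):
    """Project a camera-frame point (x, y, z), z > 0, to pixel coordinates (u, v).

    Assumes an ideal pinhole camera looking along +z (an assumption of this sketch).
    """
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must be in front of the camera")
    u = K[0][0] * x / z + K[0][2]
    v = K[1][1] * y / z + K[1][2]
    return u, v


if __name__ == "__main__":
    K, coeffs = unpack_intrinsics([500.0, 500.0, 320.0, 240.0, 0.0, 0.0, 0.0, 0.0])
    print(project_pinhole((1.0, 0.0, 2.0), K))
```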

meta/info.json

Dataset-level metadata following the LeRobot v3.0 schema.

Field	Type	Description
`codebase_version`	string	`"v3.0"` for LeRobot v3.0
`robot_type`	null	Not applicable for egocentric data
`total_episodes`	int	Total number of episodes in the dataset
`total_frames`	int	Total number of frames across all episodes
`total_tasks`	int	Number of unique tasks
`chunks_size`	int	Max episodes per chunk (default 1000)
`fps`	float	Frame rate of the head camera
`splits`	object	Train/val split definitions
`data_path`	string	Path template for data parquet files
`video_path`	string	Path template for video files
`features`	object	Feature schema for all state and image columns
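
The path templates combine with the chunk and file indices from `meta/episodes/*.parquet` to locate an episode's files. The sketch below shows that lookup. The template strings are assumptions chosen to reproduce the folder structure in this document (the actual values should be read from `meta/info.json`), and `resolve_episode_paths` and the camera key `head_left` are illustrative names.

```python
# Assumed template values; read the real ones from meta/info.json.
EXAMPLE_INFO = {
    "data_path": "data/chunk-{chunk_index:03d}/file-{file_index:03d}.parquet",
    "video_path": "videos/{video_key}/chunk-{chunk_index:03d}/file-{file_index:03d}.mp4",
}


def resolve_episode_paths(info, episode_row, cam):
    """Fill the path templates using one row of the episodes metadata."""
    video_key = f"observation.images.{cam}"
    data_path = info["data_path"].format(
        chunk_index=episode_row["data/chunk_index"],
        file_index=episode_row["data/file_index"],
    )
    video_path = info["video_path"].format(
        video_key=video_key,
        chunk_index=episode_row[f"videos/{video_key}/chunk_index"],
        file_index=episode_row[f"videos/{video_key}/file_index"],
    )
    return data_path, video_path


if __name__ == "__main__":
    row = {
        "data/chunk_index": 0,
        "data/file_index": 0,
        "videos/observation.images.head_left/chunk_index": 0,
        "videos/observation.images.head_left/file_index": 0,
    }
    print(resolve_episode_paths(EXAMPLE_INFO, row, "head_left"))
```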
