Annotation

EgoSuite MCAP files can include annotation topics that store labeled segments (e.g. skill steps or task phases), each with a caption, label, description, and skill name. Each segment has frame bounds, time bounds, or both. Every segment is published as its own message, so an episode may contain many such messages.

Example: segment overlay for a "fold grey shirt" episode, showing the description ("Fold the grey shirt from top to bottom"), the skill ("Fold"), and the hand-tracking pose.

Topic & Message Type

  • Topic: /annotation/segments
  • Message type: annotation.segments.AnnotationSegment
  • Encoding: protobuf

Defined in annotation/segments.proto; all messages include common.header.Header.
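
As a rough sketch of how this topic might be read in Python: the snippet below assumes the `mcap` and `mcap-protobuf-support` packages are installed and that the MCAP file embeds the protobuf schema for the channel, so no generated classes are needed. `iter_annotation_segments` is a hypothetical helper name used for illustration, not part of EgoSuite.

```python
from typing import Any, Iterator

# Topic name taken from the section above.
ANNOTATION_TOPIC = "/annotation/segments"


def iter_annotation_segments(path: str) -> Iterator[Any]:
    """Yield the decoded SkillSegment from each message on the annotation topic.

    Hypothetical helper: assumes the file at `path` carries the protobuf
    schema for annotation.segments.AnnotationSegment.
    """
    # Imported lazily so this module still loads where the MCAP packages
    # are not installed; the imports run on the first iteration.
    from mcap.reader import make_reader
    from mcap_protobuf.decoder import DecoderFactory

    with open(path, "rb") as stream:
        reader = make_reader(stream, decoder_factories=[DecoderFactory()])
        # Each item is (schema, channel, raw message, decoded protobuf message).
        for _schema, _channel, _message, decoded in reader.iter_decoded_messages(
            topics=[ANNOTATION_TOPIC]
        ):
            # One AnnotationSegment message wraps exactly one SkillSegment.
            yield decoded.segment
```

Iterating over `iter_annotation_segments("episode.mcap")` (a placeholder path) would then give one segment per message, with fields such as `segment.skill` and `segment.description` available as attributes.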

Message Definition

syntax = "proto3";

package annotation.segments;

import "common/header.proto";

message SkillSegment {
  // annotations with description, skill, start_frame, end_frame
  string description = 1;
  string skill = 2;
  uint32 start_frame = 3;
  uint32 end_frame = 4;
  // tier1/tier2: label, start_time/end_time in seconds, tier1.caption on first segment
  string label = 5;
  double start_time = 6;
  double end_time = 7;
  string caption = 8;
}

// Main message for topic /annotation/segments (one message per segment)
message AnnotationSegment {
  common.header.Header header = 1;
  SkillSegment segment = 2;
}

The proto supports both frame-based (start_frame/end_frame) and time-based (start_time/end_time, in seconds) bounds; which fields are populated depends on the pipeline that produced the annotations. When caption is present, use it for tier1/tier2-style text.
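
Because these are plain proto3 scalar fields with no `optional` keyword, an unset field reads back as 0 or an empty string, so a consumer cannot directly tell "not populated" from "starts at frame 0". The sketch below shows one possible heuristic for picking whichever bounds look populated. The `SkillSegment` dataclass is a plain-Python stand-in that mirrors the proto fields, the helper names are hypothetical, and the fallback order in `display_text` is an assumption rather than documented EgoSuite behavior. The frame and time values in the usage note are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SkillSegment:
    """Stand-in mirroring the proto fields, with proto3 default values."""
    description: str = ""
    skill: str = ""
    start_frame: int = 0
    end_frame: int = 0
    label: str = ""
    start_time: float = 0.0
    end_time: float = 0.0
    caption: str = ""


def frame_bounds(seg) -> Optional[Tuple[int, int]]:
    """Return (start_frame, end_frame), or None if they look unset."""
    # Heuristic: an unset pair reads as (0, 0), so require a non-zero end.
    if seg.end_frame > 0 and seg.end_frame >= seg.start_frame:
        return (seg.start_frame, seg.end_frame)
    return None


def time_bounds(seg) -> Optional[Tuple[float, float]]:
    """Return (start_time, end_time) in seconds, or None if they look unset."""
    if seg.end_time > 0.0 and seg.end_time >= seg.start_time:
        return (seg.start_time, seg.end_time)
    return None


def display_text(seg) -> str:
    """Prefer caption when present; falling back to description, then label,
    is an assumption made for this sketch."""
    return seg.caption or seg.description or seg.label
```

For instance, a segment built as `SkillSegment(description="Fold the grey shirt from top to bottom", skill="Fold", start_frame=120, end_frame=310)` yields frame bounds but no time bounds, while one carrying only `start_time`/`end_time` and a `caption` yields the reverse. The same functions work on decoded protobuf messages, since they only read attributes.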