
Semantic Annotation

EgoSuite MCAP files can include annotation topics that store action-level semantic annotations. Each annotated segment carries a task, a subtask, and a skill name, together with the segment's time bounds. Every message holds exactly one segment, so an episode may contain many such messages.

Example (clean the mirror): camera image overlaid with a semantic annotation (e.g. event "The person cleaned the mirror, comb, and clips", action "Press the nozzle") and hand pose.

Topic & Message Type

  • Topic: /annotation/segments
  • Message type: annotation.segments.AnnotationSegment
  • Encoding: protobuf

The message is defined in annotation/segments.proto; each message contains exactly one segment plus a common.header.Header.
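As a rough illustration of how this topic could be read, the sketch below uses the open-source `mcap` and `mcap-protobuf-support` Python packages. It assumes the file embeds the protobuf schema for the channel (which the protobuf decoder needs), and the helper names `segment_to_dict` and `iter_segments` are hypothetical, not part of EgoSuite.

```python
from typing import Any, Dict, Iterator

SEGMENTS_TOPIC = "/annotation/segments"


def segment_to_dict(msg: Any) -> Dict[str, Any]:
    """Flatten a decoded AnnotationSegment into a plain dict."""
    seg = msg.segment
    return {
        "task": seg.task,
        "subtask": seg.subtask,
        "skill": seg.skill,
        "start_time": seg.start_time,
        "end_time": seg.end_time,
    }


def iter_segments(path: str) -> Iterator[Dict[str, Any]]:
    """Yield one dict per message on the annotation topic of an MCAP file."""
    # Imported lazily so segment_to_dict stays usable without the MCAP packages.
    from mcap.reader import make_reader
    from mcap_protobuf.decoder import DecoderFactory

    with open(path, "rb") as f:
        # Assumption: the file carries the protobuf schema for this channel.
        reader = make_reader(f, decoder_factories=[DecoderFactory()])
        for _schema, _channel, _message, proto_msg in reader.iter_decoded_messages(
            topics=[SEGMENTS_TOPIC]
        ):
            yield segment_to_dict(proto_msg)
```

With both packages installed, `for seg in iter_segments("episode.mcap"): print(seg)` would print one dict per annotated segment; the file name here is a placeholder.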

Message definition

syntax = "proto3";

package annotation.segments;

import "common/header.proto";

message SkillSegment {
  string subtask = 1;
  string task = 2;
  string skill = 3;
  double start_time = 4;
  double end_time = 5;
}

// Main message for topic /annotation/segments (one message per segment)
message AnnotationSegment {
  common.header.Header header = 1;
  SkillSegment segment = 2;
}

Each segment is bounded in time by start_time and end_time. The remaining fields are:

  • task: the global task description for the session.
  • subtask: the action performed during the segment's time range; a session can contain multiple action segments over time.
  • skill: the skill associated with that action segment.
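To make the time bounds concrete, here is a minimal sketch that mirrors SkillSegment as a Python dataclass and looks up which segments cover a given timestamp. It assumes that start_time and end_time share a clock with the query timestamp and that the interval is half-open (start inclusive, end exclusive); the source does not state either. The `Segment` class, the `segments_at` helper, and all sample values are illustrative only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    """Plain-Python mirror of the SkillSegment fields."""
    task: str
    subtask: str
    skill: str
    start_time: float
    end_time: float


def segments_at(segments: List[Segment], t: float) -> List[Segment]:
    """Return the segments whose [start_time, end_time) interval contains t."""
    return [s for s in segments if s.start_time <= t < s.end_time]


# Made-up sample data: one task, two consecutive action segments.
episode = [
    Segment("clean the mirror", "Pick up the spray bottle", "pick", 0.0, 2.0),
    Segment("clean the mirror", "Press the nozzle", "press", 2.0, 3.5),
]

print([s.subtask for s in segments_at(episode, 2.4)])  # ['Press the nozzle']
```

Returning a list rather than a single segment keeps the helper correct even if segments overlap in time, which the message definition does not rule out.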