Keypoints and Pose Estimation¶

../../../_images/horse-keypoints-dark.png

Pose estimation is a computer vision task that identifies and tracks the spatial configuration of objects by detecting keypoints.

Working with Keypoints in 3LC¶

3LC Tables store keypoint data in a hierarchical structure of object instances. A typical keypoint structure includes 2D points with a skeleton topology, per-point visibility, a bounding box, and a label.

The structure of a keypoints column is defined by its Schema, which is created with the Keypoints2DSchema class.

Creating Keypoint Tables¶

To get started with keypoint data, you’ll need to create keypoint Tables:

Create Keypoint Tables from YOLO Format: Learn to create keypoint Tables from YOLO-format files
Create Keypoint Tables from COCO Format: Learn to create keypoint Tables from COCO-format files
Create Custom Keypoint Tables: Learn to create keypoint Tables from scratch

When working with keypoint Tables (for example, during custom data loading or prediction writing), the Keypoints2DInstances helper class simplifies conversion between Table rows and numpy arrays.

This helper class provides:

Reading from Tables: Convert a Table row to structured numpy arrays with from_row()
Writing to Tables: Convert numpy arrays back to Table row format with to_row()
Building from scratch: Create empty instances with create_empty() and add data incrementally with add_instance()

from tlc import Keypoints2DInstances

# Reading: Convert Table row to numpy arrays
kpts = Keypoints2DInstances.from_row(table_row)
kpts.keypoints.shape         # (num_instances, num_keypoints, 2)
kpts.instance_labels.shape   # (num_instances,)

# Writing: Convert numpy arrays back to Table format
updated_row = kpts.to_row()

Reading Keypoint Metadata¶

3LC provides the KeypointHelper class to extract keypoint metadata and schema information from Tables.

Shapes and index flattening are handled internally; these helpers return standard numpy arrays and Python lists.

from tlc import KeypointHelper

# Get keypoint shape (number of keypoints, channels)
shape = KeypointHelper.get_keypoint_shape_from_table(table)
shape  # (17, 3)

# Get keypoint names and attributes
keypoint_attrs = KeypointHelper.get_keypoint_attributes_from_table(table)
keypoint_attrs  # [{'internal_name': 'nose'}, {'internal_name': 'left_eye'}, ...]

# Get skeleton connections
skeleton = KeypointHelper.get_lines_from_table(table)
skeleton  # [0, 1, 0, 2, 1, 3, ...]

# Get flip indices for data augmentation
flip_indices = KeypointHelper.get_flip_indices_from_table(table)
flip_indices  # [0, 2, 1, 4, 3, ...]

# Get OKS sigmas for evaluation
oks_sigmas = KeypointHelper.get_oks_sigmas_from_table(table)
oks_sigmas  # [0.026, 0.025, 0.025, ...]

# Modify or set the OKS sigmas (returns a new derived Table; the original is unchanged)
edited_table = KeypointHelper.edit_oks_sigmas(table, [0.025, 0.025, 0.025, 0.025])

# Modify or set the default keypoint coordinates
edited_table = KeypointHelper.edit_default_keypoints(table, [0.025, 0.025, 0.025, 0.025])

# Modify or set the default skeleton connections
edited_table = KeypointHelper.edit_default_lines(table, [0, 1, 0, 2, 1, 3])
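The flat lists returned by these helpers usually need a little reshaping before use. The following is a minimal sketch, not part of the 3LC API: the helper names `pair_skeleton` and `hflip_keypoints` are hypothetical, the skeleton list is assumed to hold consecutive (start, end) index pairs as the example output above suggests, and the flip indices are assumed to name, for each output slot, the source keypoint to read after a horizontal flip.

```python
def pair_skeleton(flat_lines):
    """Group a flat index list [a0, b0, a1, b1, ...] into (a, b) edges."""
    if len(flat_lines) % 2 != 0:
        raise ValueError("expected an even number of indices")
    return [(flat_lines[i], flat_lines[i + 1]) for i in range(0, len(flat_lines), 2)]


def hflip_keypoints(keypoints, flip_indices, image_width):
    """Mirror (x, y) keypoints horizontally, then swap left/right keypoints."""
    mirrored = [(image_width - x, y) for x, y in keypoints]
    return [mirrored[src] for src in flip_indices]


edges = pair_skeleton([0, 1, 0, 2, 1, 3])

# Three keypoints: nose, left_eye, right_eye (illustrative values only).
flipped = hflip_keypoints([(50, 20), (40, 15), (65, 16)], [0, 2, 1], image_width=100)
```

After the flip, the slot for left_eye holds the mirrored position of the original right_eye, which is what keeps the keypoint names meaningful in augmented images.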

Keypoint Visibility¶

When working with keypoint ground truths, 3LC uses a three-state integer channel for keypoint visibility (COCO standard):

Value	Meaning	Description
0	Not labeled	Keypoint is not annotated or its location is unknown. Keypoint coordinates are typically (0, 0)
1	Labeled but not visible	Keypoint exists but is occluded or not visible in the image
2	Labeled and visible	Keypoint is visible and annotated in the image
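As an illustration of how these values are typically consumed, here is a small sketch that groups keypoints by visibility. It assumes keypoints are available as plain (x, y, v) tuples; the constant and function names are hypothetical and not part of 3LC.

```python
NOT_LABELED, LABELED_NOT_VISIBLE, LABELED_VISIBLE = 0, 1, 2


def split_by_visibility(keypoints):
    """Group (x, y, v) keypoints by their visibility value."""
    groups = {NOT_LABELED: [], LABELED_NOT_VISIBLE: [], LABELED_VISIBLE: []}
    for x, y, v in keypoints:
        groups[v].append((x, y))
    return groups


keypoints = [(120.0, 80.0, 2), (0.0, 0.0, 0), (98.5, 140.0, 1)]
groups = split_by_visibility(keypoints)

# Keypoints with a usable location: occluded ones are still annotated.
labeled = groups[LABELED_NOT_VISIBLE] + groups[LABELED_VISIBLE]
```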

OKS Sigmas¶

Object Keypoint Similarity (OKS) sigmas define the expected spatial variance for each keypoint. They are critical for evaluating keypoint detection quality.

OKS sigmas are set at Table creation time and remain immutable throughout the Table’s lifetime. Each keypoint has an associated sigma value reflecting how precisely that keypoint can typically be localized. For example:

Highly visible keypoints (like eyes or nose) have smaller sigmas
Harder-to-localize keypoints (like hips or elbows) have larger sigmas

New derived Tables with modified sigmas can be created using the KeypointHelper.edit_oks_sigmas() method.
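To see how a sigma affects the score, the sketch below uses the COCO definition of per-keypoint similarity, exp(-d² / (2 · s² · k²)) with k = 2 · sigma and s² the object area. This is an assumption for illustration: the source does not spell out the exact formula 3LC uses, and the function name is hypothetical. The sigma values are the COCO defaults for an eye (0.025) and a hip (0.107).

```python
import math


def keypoint_similarity(distance, object_area, sigma):
    """COCO-style similarity for one keypoint; 1.0 means a perfect match."""
    k = 2.0 * sigma  # COCO uses k_i = 2 * sigma_i
    return math.exp(-(distance ** 2) / (2.0 * object_area * k ** 2))


# The same 5-pixel error on an object covering 100 x 100 pixels.
tight = keypoint_similarity(5.0, 100.0 * 100.0, sigma=0.025)  # about 0.61
loose = keypoint_similarity(5.0, 100.0 * 100.0, sigma=0.107)  # about 0.97
```

The same localization error costs far more for the keypoint with the smaller sigma, which is why changing sigmas changes every OKS-based metric.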

Training vs Evaluation¶

During runs, OKS sigmas serve two independent purposes:

Loss Computation: Used during training to weight keypoint predictions
Evaluation Metrics: Used to compute mAP and other metrics

Important

While you can use different sigma values for loss computation during training, all evaluation metrics computed in the 3LC Dashboard always use the Table’s OKS sigmas. This ensures consistency when comparing model performance across different Runs. If the Table does not provide sigmas, a uniform default of 1/num_keypoints is used for every keypoint.

Warning

Avoid comparing evaluation metrics between Tables with different OKS sigmas, as the metrics will not be directly comparable. Always use Tables with consistent OKS sigmas when benchmarking multiple models or Runs.

Dashboard Workflows¶

Accepting Predictions as Ground Truth¶

When accepting or updating ground truths based on model predictions in the 3LC Dashboard, visibility values are automatically assigned according to the following rules:

Filtered-in keypoints receive visibility value 2 (visible)
Filtered-out keypoints receive visibility value 1 (not visible)
Not labeled keypoints (value 0) must be set explicitly through manual editing

Since predictions typically contain confidence values rather than visibility flags, the keypoint confidence filter is commonly used to determine which keypoints are filtered in or out. For example, setting a confidence threshold filters out low-confidence predictions, which will then receive visibility value 1 when accepted as ground truth.
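The rule can be pictured with the following sketch. It is illustrative only: the Dashboard applies the rule internally, the function name is hypothetical, and treating a confidence equal to the threshold as filtered in is an assumption.

```python
def visibility_from_confidence(confidences, threshold):
    """Return 2 for keypoints that pass the filter and 1 for those that do not."""
    return [2 if c >= threshold else 1 for c in confidences]


visibility = visibility_from_confidence([0.91, 0.12, 0.55], threshold=0.5)
```

Note that this mapping never produces 0, matching the rule that "not labeled" must be set manually.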

Tip

Keypoint attributes including visibility can always be manually edited in the Dashboard. The automatic visibility assignment rules above only apply when using model predictions to update the ground truth set.

For more information about working with keypoints in the 3LC Dashboard, see the how-to article.

Framework Integration¶

3LC keypoint Tables can be used with the following frameworks:

SuperGradients: Full integration with the YOLO-NAS pose estimation models
Ultralytics YOLO: Full integration with the YOLO-pose models
Custom PyTorch Models: Use direct Table access for custom training loops

Examples and Tutorials¶

Train YOLO Pose Estimator: Complete training example with Ultralytics YOLO-pose
Train SuperGradients Pose Estimator: Complete training example with SuperGradients

Visualization¶

Pose and keypoints refer to ways of describing the position and orientation of objects, and the terms are used interchangeably in 3LC. A set of predefined keypoints, lines, and triangles can be displayed and edited in the Dashboard. A bounding box (BB) is often part of a pose set. Individual elements, such as keypoints, cannot be added or removed, but an entire pose object (including all of its keypoints, lines, and BB) can be.

Create an image+keypoints chart¶

Creating an image+keypoints chart is essentially the same as creating an image+BB chart. Select the IMAGE and KEYPOINTS columns and press 2.

Display keypoint properties¶

When you click a keypoint in a chart, all of its available properties, such as the vertex role and visibility, are displayed next to it. To select multiple keypoints, either hold Ctrl while clicking the keypoints or use one of the polygon selection tools.

BBs (instances), keypoints, lines, and triangles each have their own properties. The display of these properties can be changed in the context menu as shown below. These display properties are set in the same way as those for BBs. Learn more on this page.

Filter on pose/keypoints¶

The main difference between BBs and pose/keypoints is that a pose consists of a BB, a set of keypoints, and a set of lines. The filters for pose/keypoints therefore extend the standard BB filters: you can filter on any metric or property associated with BBs, keypoints, or lines. For example, if you filter on a BB metric such as the predicted BB confidence, the BBs and their associated keypoints and lines are filtered together. If you filter on a keypoint metric or property such as the vertex role, the filter applies only to the keypoints. Otherwise, filters for pose/keypoints work the same way as those for BBs.

Editing¶

Edit pose/keypoints¶

Each keypoint is usually represented in (x, y, v) format, where x and y are the 2D coordinates and v is the visibility. You can edit the (x, y) coordinates by moving the keypoint in a keypoints chart, and change the visibility from the context menu in the chart. When keypoints are moved, the associated lines move automatically.

You can edit a pose or a keypoint in the following ways:

To move a pose, click and drag the BB
To resize a pose, click and drag on an edge or a corner of the BB (all keypoints will also be resized proportionally)
To move a keypoint, click and drag that keypoint
To edit the visibility of a keypoint, click the keypoint and then change the visibility in the context menu (as shown below)
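The proportional resize mentioned in the list above can be pictured as a simple rescaling of each keypoint relative to the BB. This is a sketch of the geometry only, not the Dashboard's code; the function name and the (x_min, y_min, width, height) box layout are assumptions.

```python
def resize_pose(bbox, keypoints, new_bbox):
    """Map (x, y) keypoints from bbox to new_bbox, keeping relative positions.

    Boxes are (x_min, y_min, width, height).
    """
    x0, y0, w, h = bbox
    nx0, ny0, nw, nh = new_bbox
    sx, sy = nw / w, nh / h
    return [(nx0 + (x - x0) * sx, ny0 + (y - y0) * sy) for x, y in keypoints]


# Dragging the bottom-right corner doubles the box in both directions.
resized = resize_pose(
    (10, 10, 100, 50),
    [(10, 10), (60, 35), (110, 60)],
    (10, 10, 200, 100),
)
```

A keypoint at the center of the old box ends up at the center of the new one, and keypoints on the corners stay on the corners.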

Work on a pose/keypoints Run¶

Working on a pose/keypoints Run is very similar to working on a BB Run. Some standard metrics, such as predicted pose/keypoints, loss, and confidence, are collected with 3lc-ultralytics during training. Below is an example showing the collected metrics in the table and two charts with ground truth and predicted pose/keypoints, respectively. As with BB detection, the ground truth is drawn with solid lines and the predictions with dashed lines. Note that the table has two confidence columns: one for the BB predictions and one for each individual keypoint.

Similar to IOU for BBs, you can derive OKS (Object Keypoint Similarity) virtual columns for the ground truth and the predictions. You can also derive per-keypoint OKS virtual columns. The former is the standard OKS, which averages the OKS of all visible keypoints in a pose, while the latter gives a value for each keypoint. To bring up the context menu shown below, select the ground truth and predicted keypoint columns and right-click one of the column headers.

In a keypoints chart, the OKS metric will be displayed below the BB, and the per-keypoint OKS will be next to the keypoint.
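The relationship between the two kinds of OKS column can be sketched as follows. The function name is hypothetical, and the cutoff for which keypoints count as "visible" is an assumption: COCO averages over all labeled keypoints (visibility above 0), so that is the default here, with a parameter to restrict the average to visibility 2.

```python
def pose_oks(per_keypoint_oks, visibility, min_visibility=1):
    """Average per-keypoint OKS over keypoints with visibility >= min_visibility."""
    kept = [s for s, v in zip(per_keypoint_oks, visibility) if v >= min_visibility]
    return sum(kept) / len(kept) if kept else 0.0


# The third keypoint is not labeled, so it does not affect the pose score.
score = pose_oks([0.9, 0.5, 0.7], [2, 2, 0])
```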

In derived TP/FP/FN virtual columns, the TP/FP/FN counts are based on OKS in the same way that they are based on IOU for BBs. You can adjust the OKS threshold (default 0.5) used for these counts.
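How a threshold turns OKS values into counts can be sketched with a COCO-style greedy matching. The source does not describe the Dashboard's exact matching procedure, so treat this as an assumption; the function name is hypothetical, and predictions are assumed to be sorted by descending confidence.

```python
def count_tp_fp_fn(oks_matrix, num_gt, threshold=0.5):
    """Greedily match predictions to ground truths by OKS.

    oks_matrix[p][g] is the OKS between prediction p and ground truth g.
    Each ground truth can be matched by at most one prediction.
    """
    matched_gt = set()
    tp = 0
    for row in oks_matrix:
        best_g, best_oks = None, threshold
        for g, oks in enumerate(row):
            if g not in matched_gt and oks >= best_oks:
                best_g, best_oks = g, oks
        if best_g is not None:
            matched_gt.add(best_g)
            tp += 1
    fp = len(oks_matrix) - tp  # predictions without a match
    fn = num_gt - tp           # ground truths without a match
    return tp, fp, fn


oks_matrix = [[0.8, 0.1], [0.6, 0.3], [0.2, 0.1]]
default_counts = count_tp_fp_fn(oks_matrix, num_gt=2)
lenient_counts = count_tp_fp_fn(oks_matrix, num_gt=2, threshold=0.3)
```

Lowering the threshold lets the second prediction match the second ground truth, turning one FP and one FN into a TP.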

Convert predicted pose/keypoints¶

All operations for converting predicted pose/keypoints to ground truth are the same as those for converting predicted BBs. Learn more on this page.