Convert Semantic to Instance Segmentation

Convert a semantic segmentation dataset into the instance segmentation format used by 3LC, using the simplest possible mapping: every semantic class becomes a single “instance”.

We load ADE20K’s semantic masks and store them as SegmentationMasks rows by collapsing all pixels of a given class into one mask per class. This fits a semantic-segmentation dataset into 3LC’s instance-segmentation column type for visualization and analysis; the trade-off is that individual objects of the same class are merged together, so instance identity is lost. Recovering it would require an extra step such as connected-components labeling or a model that predicts instance ids.
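
As a rough illustration of the connected-components step mentioned above, the sketch below splits one per-class binary mask into one mask per 4-connected region. It is not part of this notebook's pipeline: `split_into_components` and the toy `class_mask` are hypothetical names, and the flood fill is written with plain NumPy only to keep the example self-contained. In practice a library routine such as `cv2.connectedComponents` or `scipy.ndimage.label` would likely be a better fit.

```python
from collections import deque

import numpy as np


def split_into_components(mask: np.ndarray) -> np.ndarray:
    """Split a binary (H, W) mask into one mask per 4-connected region.

    Returns an array of shape (H, W, K), where K is the number of regions.
    """
    h, w = mask.shape
    visited = np.zeros((h, w), dtype=bool)
    components = []
    for y, x in zip(*np.nonzero(mask)):
        if visited[y, x]:
            continue
        # Breadth-first flood fill starting from an unvisited foreground pixel
        component = np.zeros((h, w), dtype=np.uint8)
        queue = deque([(y, x)])
        visited[y, x] = True
        while queue:
            cy, cx = queue.popleft()
            component[cy, cx] = 1
            for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not visited[ny, nx]:
                    visited[ny, nx] = True
                    queue.append((ny, nx))
        components.append(component)
    if not components:
        return np.zeros((h, w, 0), dtype=np.uint8)
    return np.stack(components, axis=-1)


# Toy class mask (hypothetical data) with two separate blobs of the same class
class_mask = np.array(
    [
        [1, 1, 0, 0],
        [1, 0, 0, 1],
        [0, 0, 1, 1],
    ],
    dtype=np.uint8,
)
instance_masks = split_into_components(class_mask)
print(instance_masks.shape)  # (3, 4, 2)
```

Note that this only separates objects that do not touch. Two adjacent objects of the same class would still end up in a single component, which is why a model that predicts instance ids may be needed for crowded scenes.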

Project Setup

[ ]:
PROJECT_NAME = "3LC Tutorials - Create Tables"
DATASET_NAME = "ADE20k_toy_dataset"
DOWNLOAD_PATH = "../../transient_data"
INSTALL_DEPENDENCIES = True

Install dependencies

[ ]:
if INSTALL_DEPENDENCIES:
    %pip install -q 3lc
    %pip install -q huggingface-hub
    %pip install -q git+https://github.com/3lc-ai/3lc-examples.git
    %pip install -q matplotlib

Imports

[ ]:
import json
from pathlib import Path

import cv2
import numpy as np
import tlc
from huggingface_hub import hf_hub_download

from tlc_tools.common import download_and_extract_zipfile

Download the dataset

[ ]:
DATASET_ROOT = (Path(DOWNLOAD_PATH) / "ADE20k_toy_dataset").resolve()

if not DATASET_ROOT.exists():
    print("Downloading data...")
    download_and_extract_zipfile(
        url="https://www.dropbox.com/s/l1e45oht447053f/ADE20k_toy_dataset.zip?dl=1",
        location=DOWNLOAD_PATH,
    )

Fetch the label map from the Hugging Face Hub

[ ]:
# Load id2label mapping from a JSON on the hub. The JSON is 0-indexed
# ({"0": "wall", "1": "building", ...}), but the ADE20K mask PNGs are 1-indexed
# with 0 reserved for "unknown", so a mask pixel value of N maps to id2label[N-1].
with open(
    hf_hub_download(
        repo_id="huggingface/label-files",
        filename="ade20k-id2label.json",
        repo_type="dataset",
    )
) as f:
    id2label = {int(k): v for k, v in json.load(f).items()}
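
To make the off-by-one convention concrete, here is a minimal sketch of the lookup described in the comment above. The helper `pixel_value_to_label` and the two-entry `id2label_excerpt` are hypothetical and only mirror the first two entries quoted in the comment; the real mapping is the `id2label` dictionary loaded from the hub.

```python
# Hypothetical two-entry excerpt of the 0-indexed id2label mapping
id2label_excerpt = {0: "wall", 1: "building"}


def pixel_value_to_label(pixel_value: int, id2label: dict[int, str]) -> str:
    """Map a 1-indexed ADE20K mask pixel value to its class name."""
    if pixel_value == 0:
        # 0 is reserved for "unknown" and has no entry in id2label
        return "unknown"
    return id2label[pixel_value - 1]


print(pixel_value_to_label(0, id2label_excerpt))  # unknown
print(pixel_value_to_label(1, id2label_excerpt))  # wall
print(pixel_value_to_label(2, id2label_excerpt))  # building
```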

Load the images and segmentation maps

[ ]:
# sorted() already returns a list, so no extra list() call is needed
image_paths = sorted(DATASET_ROOT.glob("**/images/training/*.jpg"))
segmentation_map_paths = sorted(DATASET_ROOT.glob("**/annotations/training/*.png"))
[ ]:
# Call .to_relative() to ensure aliases are applied
image_paths = [tlc.Url(p).to_relative().to_str() for p in image_paths]
print(image_paths[0])
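
The images and segmentation maps are later combined by position, so it may be worth checking that the two sorted lists line up. The sketch below assumes that an image and its mask share the same file stem and differ only in directory and extension; that naming rule, the helper `check_alignment`, and the example file names are assumptions for illustration, not something this notebook verifies.

```python
from pathlib import Path


def check_alignment(image_paths, mask_paths):
    """Raise ValueError if the two lists do not pair up by file stem."""
    if len(image_paths) != len(mask_paths):
        raise ValueError(f"{len(image_paths)} images but {len(mask_paths)} segmentation maps")
    for image_path, mask_path in zip(image_paths, mask_paths):
        # Compare names without directory or extension, e.g. "sample_0001"
        if Path(str(image_path)).stem != Path(str(mask_path)).stem:
            raise ValueError(f"Mismatched pair: {image_path} vs {mask_path}")


# Hypothetical file names, used only to demonstrate the check
example_images = ["images/training/sample_0001.jpg", "images/training/sample_0002.jpg"]
example_masks = ["annotations/training/sample_0001.png", "annotations/training/sample_0002.png"]
check_alignment(example_images, example_masks)  # passes silently
```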

Transform the segmentation maps to instance segmentation masks

[ ]:
def single_channel_map_to_per_class_masks(seg_map: np.ndarray) -> tuple[np.ndarray, list[int]]:
    """Convert a single channel segmentation map to a stack of per-class masks.

    The input map uses ADE20K's 1-indexed convention (0 = unknown, 1..150 = real
    classes). Returned ``labels`` are 0-indexed to match the Hugging Face id2label
    JSON, so we subtract 1.

    Args:
        seg_map: A numpy array of shape (H, W) representing a single channel segmentation map.

    Returns:
        - A stack of per-class masks of shape (H, W, N), where N is the number of classes present.
        - A list of length N with 0-indexed class IDs.
    """
    masks = []
    labels = []
    for class_id in np.unique(seg_map):
        # Skip the "unknown" class, which has no entry in id2label
        if class_id == 0:
            continue
        mask = (seg_map == class_id).astype(np.uint8)
        masks.append(mask)
        labels.append(int(class_id) - 1)
    if not masks:
        # No labeled pixels: return an empty stack with the right spatial shape
        h, w = seg_map.shape
        return np.zeros((h, w, 0), dtype=np.uint8), []
    return np.stack(masks, axis=-1), labels
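
For comparison, the same per-class split can be expressed with NumPy broadcasting instead of an explicit loop. This is only an alternative sketch, not the function used in the rest of the notebook; `per_class_masks_vectorized` and `toy_map` are hypothetical names, and the toy input makes the expected shapes easy to verify by hand.

```python
import numpy as np


def per_class_masks_vectorized(seg_map: np.ndarray) -> tuple[np.ndarray, list[int]]:
    """Broadcasting variant of the per-class split (1-indexed map in, 0-indexed labels out)."""
    class_ids = np.unique(seg_map)
    class_ids = class_ids[class_ids != 0]  # drop the "unknown" class
    # (H, W, 1) compared with (N,) broadcasts to (H, W, N)
    masks = (seg_map[..., None] == class_ids).astype(np.uint8)
    labels = [int(class_id) - 1 for class_id in class_ids]
    return masks, labels


# Toy 2x3 map containing "unknown" (0) and the pixel values 1 and 4
toy_map = np.array([[0, 1, 1], [4, 4, 0]], dtype=np.uint8)
masks, labels = per_class_masks_vectorized(toy_map)
print(masks.shape)  # (2, 3, 2)
print(labels)  # [0, 3]
```

An all-zero map yields a stack of shape (H, W, 0) without any special casing, because the comparison against an empty array of class ids broadcasts to an empty last axis.
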
[ ]:
# Build the column of instance segmentations in the format required by 3LC
instances = []

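for mask_path in segmentation_map_paths:
    # Pass a string path to cv2.imread, which returns None if the file cannot be read
    map_np = cv2.imread(str(mask_path), cv2.IMREAD_GRAYSCALE)
    if map_np is None:
        raise FileNotFoundError(f"Could not read segmentation map: {mask_path}")
    h, w = map_np.shape
    masks, labels = single_channel_map_to_per_class_masks(map_np)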

    instances.append(
        tlc.data_types.SegmentationMasks(
            image_height=h,
            image_width=w,
            masks=masks,
            labels=labels,
        )
    )
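
One way to sanity-check the conversion is a round trip: rebuild the 1-indexed label map from the mask stack and the 0-indexed labels, then compare it with the original map. The sketch below uses a hand-built toy example; `masks_to_label_map` is a hypothetical helper and is not part of the 3LC API.

```python
import numpy as np


def masks_to_label_map(masks: np.ndarray, labels: list[int]) -> np.ndarray:
    """Rebuild a 1-indexed (H, W) label map from (H, W, N) masks and 0-indexed labels."""
    h, w, _ = masks.shape
    label_map = np.zeros((h, w), dtype=np.uint8)
    for i, label in enumerate(labels):
        # Undo the "- 1" applied during conversion
        label_map[masks[:, :, i] == 1] = label + 1
    return label_map


# Toy map and the masks/labels the conversion would produce for it
toy_map = np.array([[0, 1, 1], [4, 4, 0]], dtype=np.uint8)
toy_masks = np.stack([toy_map == 1, toy_map == 4], axis=-1).astype(np.uint8)
toy_labels = [0, 3]

reconstructed = masks_to_label_map(toy_masks, toy_labels)
print(np.array_equal(reconstructed, toy_map))  # True
```

Because the per-class masks are disjoint, the order in which they are written back does not matter, and pixels covered by no mask stay at 0 ("unknown").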

Write the instance segmentation masks to a table

[ ]:
table_writer = tlc.TableWriter(
    table_name="ADE20K-instance-segmentation",
    dataset_name=DATASET_NAME,
    project_name=PROJECT_NAME,
    schema={
        "image": tlc.schemas.ImageSchema(),
        "instances": tlc.data_types.SegmentationMasks.schema(classes=id2label),
    },
    if_exists="rename",
)
[ ]:
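# Add all rows (images and instance segmentations) to the table in one go
table_writer.add_batch(
    {
        "image": image_paths,
        "instances": instances,
    }
)

# Finalize the writer to create the table that is read back in the visualization below
table = table_writer.finalize()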

Visualize a sample instance segmentation mask

[ ]:
import matplotlib.pyplot as plt

example_mask = table[0]["instances"].masks[:, :, 0]

plt.imshow(example_mask, cmap="gray")
plt.axis("off")
plt.show()