Sample View

The sample view returns training-ready Python objects. Where the row view stores serializable primitives and URL references, the sample view gives you PIL Images, NumPy arrays, dataclass instances, and so on.

Accessing Samples

sample = table[0]
# {"image": <PIL.Image ...>, "label": 0, "bounding_boxes": BoundingBoxes2D(...)}

table[i] always returns a dict with one entry per visible column. The shape matches the row view; only the values are different — bulk data is loaded and per-column sample types are applied.

When you access table[i], the system:

  1. Loads bulk data — file URLs are read and deserialized; chunked data is unpacked.

  2. Applies each column’s sample type — converts row data to Python objects (e.g., nested lists to numpy.ndarray, dicts to BoundingBoxes2D).
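The two steps above can be sketched in plain Python. This is only an illustration of the idea, not the library's implementation: `Box`, `load_bulk`, `SAMPLE_TYPES`, and `to_sample` are hypothetical names, and a JSON file stands in for whatever bulk storage is actually used.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Box:
    """Hypothetical stand-in for a sample-view object such as BoundingBoxes2D."""
    x: float
    y: float
    w: float
    h: float


def load_bulk(value):
    """Step 1: if the stored value is a file reference, read and deserialize it."""
    if isinstance(value, str) and value.endswith(".json"):
        return json.loads(Path(value).read_text())
    return value  # already a plain primitive


# Step 2: one converter per column (hypothetical registry, keyed by column name).
SAMPLE_TYPES = {
    "label": int,
    "boxes": lambda dicts: [Box(**d) for d in dicts],
}


def to_sample(row):
    """Turn a row-view dict into a sample-view dict with the same keys."""
    sample = {}
    for name, value in row.items():
        loaded = load_bulk(value)
        convert = SAMPLE_TYPES.get(name, lambda v: v)
        sample[name] = convert(loaded)
    return sample


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "boxes.json"
    path.write_text(json.dumps([{"x": 0.0, "y": 1.0, "w": 2.0, "h": 3.0}]))

    row = {"label": 0, "boxes": str(path)}  # row view: primitives + file reference
    sample = to_sample(row)                 # sample view: loaded + converted

print(sample)
```

The row dict holds only a file path for `boxes`, while the resulting sample dict holds dataclass instances under the same key.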

Row view vs sample view

Row view (table.table_rows[i]): URL references + serializable primitives, as stored on disk.

Sample view (table[i]): Bulk data loaded, transforms applied, ready for training.

Both return a dict keyed by column name. For visible columns the keys are the same and only the values differ; hidden columns appear only in the row view.

Column Sample Types

A column’s sample type controls how stored data is converted to a Python object. Convenience schemas set this automatically — for example, ImageSchema configures the "pil_image" sample type, so table[i]["image"] returns a PIL.Image. See Column Types for what each schema produces, or Custom Sample Types to define your own.

Hidden Columns

Columns with sample_type="hidden" are kept in the row view (and on disk) but excluded from the sample view dict — useful for bookkeeping fields like sample weights that you don’t want in your training loop.
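As a rough sketch of the effect (not the library's code), hiding a column amounts to filtering it out when the sample dict is built. The `sample_types` mapping, the `"int"` type name, the `visible_columns` helper, and the example values are hypothetical; only the `"hidden"` and `"pil_image"` names come from the text above.

```python
# Row view: every column is present, exactly as stored.
row = {"image": "images/0001.png", "label": 3, "weight": 0.25}

# Hypothetical per-column sample types; "weight" is marked hidden.
sample_types = {"image": "pil_image", "label": "int", "weight": "hidden"}


def visible_columns(row, sample_types):
    """Keep only the columns whose sample type is not "hidden"."""
    return {
        name: value
        for name, value in row.items()
        if sample_types.get(name) != "hidden"
    }


sample_view = visible_columns(row, sample_types)
print(sorted(row))          # ['image', 'label', 'weight']
print(sorted(sample_view))  # ['image', 'label']
```

The `weight` value stays available through the row dict, so bookkeeping code can still read it without it reaching the training loop.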

Selective Loading

Only columns that participate in the sample view are loaded. Hidden columns are skipped, keeping memory usage low.
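One way to picture this is a loader that is simply never called for hidden columns. The snippet below is a hedged illustration under that assumption: `load_bulk`, `build_sample`, and the column names are hypothetical, and a string stands in for the real loaded data.

```python
load_calls = []  # records which columns actually triggered a load


def load_bulk(column, reference):
    """Pretend to read bulk data; here we only record that a load happened."""
    load_calls.append(column)
    return f"loaded:{reference}"


def build_sample(row, hidden):
    """Load only the columns that take part in the sample view."""
    return {
        column: load_bulk(column, reference)
        for column, reference in row.items()
        if column not in hidden
    }


row = {"image": "img.png", "mask": "mask.png", "debug_dump": "dump.bin"}
sample = build_sample(row, hidden={"debug_dump"})

print(load_calls)  # ['image', 'mask']
```

Because the hidden column is filtered out before the loader runs, its bulk data is never read into memory in this sketch.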