Tabular
Tabular columns hold simple data: numbers, strings, booleans, and lists thereof. Use the convenience schemas to describe them — each one sets the right data type and is recognized by the Dashboard.
Scalars
import tlc
schema = {
    "score": tlc.schemas.Float32Schema(),
    "count": tlc.schemas.Int32Schema(),
    "is_valid": tlc.schemas.BoolSchema(),
    "name": tlc.schemas.StringSchema(),
}
Arrays (the shape parameter)
All scalar schemas accept a shape parameter to describe arrays:
schema = {
    # Fixed-size list of 10 floats
    "features": tlc.schemas.Float32Schema(shape=10),
    # Variable-length list of integers
    "token_ids": tlc.schemas.Int32Schema(shape=(-1,)),
    # Variable 2D array (e.g. variable number of rows, 3 columns)
    "points": tlc.schemas.Float32Schema(shape=(-1, 3)),
    # Fixed 4x4 matrix
    "transform": tlc.schemas.Float32Schema(shape=(4, 4)),
}
Use -1 for variable-size dimensions (numpy convention). shape=10 is shorthand for shape=(10,).
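To make the shape convention concrete, here is a minimal pure-Python sketch that does not use tlc. The helpers normalize_shape and matches_shape are hypothetical names introduced for illustration. They only mimic the two rules stated above (an int is shorthand for a 1-tuple, and -1 matches any length) and should not be read as how the library validates data.

```python
from typing import Tuple, Union

Shape = Union[int, Tuple[int, ...]]


def normalize_shape(shape: Shape) -> Tuple[int, ...]:
    """Hypothetical helper: turn the shorthand shape=10 into (10,)."""
    return (shape,) if isinstance(shape, int) else tuple(shape)


def matches_shape(value, shape: Shape) -> bool:
    """Hypothetical helper: check a nested list against a shape.

    A dimension of -1 matches any length.
    """
    dims = normalize_shape(shape)
    if not dims:
        # No dimensions left: the value must be a scalar, not a list
        return not isinstance(value, (list, tuple))
    if not isinstance(value, (list, tuple)):
        return False
    if dims[0] != -1 and len(value) != dims[0]:
        return False
    return all(matches_shape(item, dims[1:]) for item in value)


print(normalize_shape(10))                                            # (10,)
print(matches_shape([[0.0, 0.0, 1.0], [1.0, 2.0, 3.0]], (-1, 3)))     # True
print(matches_shape([[0.0, 1.0]], (-1, 3)))                           # False
```

The second call passes because the row count is free while every row has exactly three entries. The third fails because its only row has two.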
Tip
Scalar arrays vs NumPy/Torch arrays: Float32Schema(shape=...) stores Python lists and returns them as-is in sample view. To get numpy.ndarray or torch.Tensor objects in sample view, pass sample_type="numpy_array" or sample_type="torch_tensor". For file-backed storage of large arrays, use ExternalNumpyArraySchema or ExternalTorchTensorSchema instead. See Embeddings for details.
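The practical difference between the two sample views can be illustrated without tlc. The sketch below assumes NumPy is installed and only shows what a consumer of the data sees in each case: a nested Python list versus a numpy.ndarray. How the library performs the conversion internally is not covered here, and the variable names are illustrative.

```python
import numpy as np

# A plain array column returns a nested Python list in sample view
points_as_list = [[0.0, 0.0, 1.0], [1.0, 2.0, 3.0]]

# Roughly what an array-typed sample view would hand back instead
points_as_array = np.asarray(points_as_list, dtype=np.float32)

print(type(points_as_list).__name__)  # list
print(points_as_array.shape)          # (2, 3)
print(points_as_array.dtype)          # float32

# Arrays support vectorized operations that lists need explicit loops for
column_means = points_as_array.mean(axis=0)
```

If downstream code expects tensors or arrays, asking for them in sample view avoids repeating this conversion in every training loop.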
Categorical
Categorical columns map integer values to named classes. This is the most common label representation in ML. See Categorical for the full schema, Dashboard editing, and prediction assignment workflows.
schema = {
    "label": tlc.schemas.CategoricalLabelSchema(classes=["cat", "dog", "bird"]),
}
The classes parameter accepts multiple formats:
# List of names (0-indexed)
tlc.schemas.CategoricalLabelSchema(classes=["cat", "dog"])
# Dict mapping indices to names
tlc.schemas.CategoricalLabelSchema(classes={0: "cat", 1: "dog"})
# Single class
tlc.schemas.CategoricalLabelSchema(classes="binary")
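One way to think about the three formats is that each reduces to the same index-to-name mapping. The sketch below is pure Python and does not call tlc. The function normalize_classes is a hypothetical helper, and treating a bare string as a single class at index 0 is an assumption based on the "Single class" comment above rather than confirmed library behavior.

```python
from typing import Dict, List, Union

ClassSpec = Union[str, List[str], Dict[int, str]]


def normalize_classes(classes: ClassSpec) -> Dict[int, str]:
    """Hypothetical helper: reduce each accepted format to {index: name}."""
    if isinstance(classes, str):
        # Assumption: a bare string is one class at index 0
        return {0: classes}
    if isinstance(classes, dict):
        return dict(classes)
    # A list is 0-indexed by position
    return dict(enumerate(classes))


from_list = normalize_classes(["cat", "dog"])
from_dict = normalize_classes({0: "cat", 1: "dog"})
print(from_list == from_dict)  # True

# Decoding stored integer labels back to class names
labels = [1, 0, 1]
print([from_list[i] for i in labels])  # ['dog', 'cat', 'dog']
```

The dict form is the one to reach for when class indices are not contiguous or do not start at 0, since a list can only express positions 0, 1, 2, and so on.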