Objects, URLs, and Schemas

This page briefly introduces the concepts of Objects, URLs, and Schemas in the context of the tlc Python package.

Objects

Object is the base class of all objects in the tlc Python package. An Object is a simple container for keys and values. All objects are required to have the attributes type, schema and created.

In practice, you will never need to create an Object directly. Instead, you will create instances of the two main subclasses: Table and Run.

Still, knowing the functionality common to every Object makes it easier to work with the tlc package.

The main functionality provided by Object is the ability to serialize an object to JSON, and to construct objects from JSON. The JSON representation of an object is guaranteed to be sufficient to recreate the object, thereby providing a simple way to store and retrieve objects from persistent storage. We can think of the JSON representation of an object as a “recipe” for creating the object.
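The "recipe" idea can be illustrated without tlc at all. The sketch below uses a hypothetical MiniObject dataclass (not part of tlc) that carries the three required attributes and round-trips through JSON using only the standard library. It is meant to convey the concept, not to show how Object is actually implemented.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class MiniObject:
    """Hypothetical stand-in for an object; NOT the tlc implementation."""

    type: str = "MiniObject"
    schema: dict = field(default_factory=dict)
    created: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        # Serialize every attribute into a JSON "recipe"
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, text: str) -> "MiniObject":
        # Rebuild an equivalent object from the recipe
        return cls(**json.loads(text))


original = MiniObject(schema={"values": {}})
restored = MiniObject.from_json(original.to_json())

# The recipe alone is sufficient to recreate an equal object
assert restored == original
```

Because the JSON text contains everything needed to rebuild the object, it can be written to any storage location and loaded again later.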

Urls

Instances of type Object can be serialized to JSON and stored in various locations. The specific location of an object is defined by a Url, which could represent a local file path, or a path to some remote object storage.

Let’s proceed to create an Object and save it to a local file path.

[ ]:
import tlc

# Create a relative filepath Url
url = tlc.Url("./my_object")

# Create an object with the given url and write it to the url
my_object = tlc.Object(url=url)
my_object.write_to_url()

Serialized objects can be read back into memory using Object.from_url().

Inspired by the pathlib module, Url provides a simple and intuitive way to work with URLs:

[ ]:
# Create a relative filepath Url:
url = tlc.Url("./my_object")

# Create an absolute filepath Url:
url = tlc.Url("/path/to/my_object")

# Create an S3 Url:
url = tlc.Url("s3://bucket/my_object")

To create a Url for an object located within the 3LC project structure, the methods Url.create_table_url() and Url.create_run_url() can be used.

[ ]:
my_table_url = tlc.Url.create_table_url("my-table", "my-dataset", "my-project")
my_run_url = tlc.Url.create_run_url("my-run", "my-project")

The Url class provides a number of useful methods for working with URLs, such as exists(), join(), to_absolute(), etc. See the Url API documentation for more details.
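Since Url is inspired by pathlib, the standard-library analogues of these operations may help build intuition. The sketch below uses pathlib only, not tlc, and the exact behavior of the corresponding Url methods may differ, so treat it as an analogy rather than a description of the Url API.

```python
from pathlib import Path

# Standard-library pathlib only; tlc.Url is not used here.
relative = Path("my_object")

# Joining path segments (the idea behind a join-style method)
joined = Path("my-project") / "my-dataset" / "my-table"

# Converting a relative path to an absolute one
absolute = relative.absolute()

print(joined.as_posix())       # my-project/my-dataset/my-table
print(absolute.is_absolute())  # True

# Checking whether something exists at a location
print(Path(".").exists())      # True
```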

Schemas

All tlc Objects are described by schemas. A Schema is a tree-like structure that describes the layout of an object.

A schema describes all serializable attributes of an object. Minimally, this includes describing the data type. In addition, the dimensionality, size, display name, number role, and more can be described by the schema.

An object attribute can be either atomic or composite. This is signalled by whether its schema carries the value attribute or the values attribute. If the value attribute is present, the schema is atomic, and the data is described by the ScalarValue subclass stored in value. If the values attribute is present, the schema is composite, and it can be traversed recursively by following the sub-schemas stored in values, which is of type dict[str, Schema].
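The atomic/composite distinction amounts to a tree walk. The following is a minimal sketch using a hypothetical MiniSchema dataclass (not tlc.Schema), in which value is reduced to a plain type name. It recurses through composite nodes and collects the atomic leaves.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MiniSchema:
    """Hypothetical stand-in for a schema node; NOT tlc.Schema."""

    # Set for atomic nodes (here just a type name)
    value: str | None = None
    # Set for composite nodes: sub-attribute name -> sub-schema
    values: dict[str, MiniSchema] = field(default_factory=dict)


def leaf_paths(schema: MiniSchema, prefix: str = "") -> dict[str, str]:
    """Recurse through composite nodes and collect the atomic leaves."""
    if schema.value is not None:
        return {prefix: schema.value}
    leaves: dict[str, str] = {}
    for name, child in schema.values.items():
        path = f"{prefix}.{name}" if prefix else name
        leaves.update(leaf_paths(child, path))
    return leaves


root = MiniSchema(
    values={
        "label": MiniSchema(value="int32"),
        "point": MiniSchema(
            values={
                "x": MiniSchema(value="float32"),
                "y": MiniSchema(value="float32"),
            }
        ),
    }
)

print(leaf_paths(root))
# {'label': 'int32', 'point.x': 'float32', 'point.y': 'float32'}
```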

tlc does not have a separate notion of list- or array-schemas. Instead, the dimensionality and shape of an attribute are described by the size0, size1, …, size5 attributes of Schema, which are of type DimensionNumericValue.
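To illustrate how per-dimension size attributes can stand in for a dedicated array schema, here is a small sketch built on hypothetical MiniDim and MiniSchema classes (not the tlc types, and limited to two dimensions for brevity). The number of size attributes that are set determines the dimensionality, and each one bounds the extent of its dimension. How tlc orders or interprets the dimensions is not assumed here.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class MiniDim:
    """Hypothetical dimension description; NOT tlc.DimensionNumericValue."""

    value_min: int
    value_max: int


@dataclass
class MiniSchema:
    """Hypothetical schema node with optional per-dimension sizes."""

    value: str
    size0: MiniDim | None = None
    size1: MiniDim | None = None


def dimensions(schema: MiniSchema) -> list[tuple[int, int]]:
    """Collect the (min, max) extent of each dimension that is set."""
    sizes = [schema.size0, schema.size1]
    return [(s.value_min, s.value_max) for s in sizes if s is not None]


scalar = MiniSchema(value="float32")
array = MiniSchema(value="float32", size0=MiniDim(1, 10))
grid = MiniSchema(value="uint8", size0=MiniDim(64, 64), size1=MiniDim(64, 64))

print(len(dimensions(scalar)))  # 0
print(dimensions(array))        # [(1, 10)]
print(len(dimensions(grid)))    # 2
```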

Enough theory, let’s look at some examples!

[ ]:
# Examples of creating schemas for object attributes

# A schema describing an integer value between 0 and 100.
int_schema = tlc.Schema(
    display_name="Integer Value Schema",
    description="This is an example schema",
    writable=False,  # The value described by this schema is not writable
    value=tlc.Int32Value(value_min=0, value_max=100),
)

# A schema describing a variable-sized array of floats (between 1 and 10 elements).
float_array_schema = tlc.Schema(
    display_name="Float Array Schema",
    description="This is an example schema",
    writable=True,  # The value described by this schema is writable
    value=tlc.Float32Value(unit="ms"),
    size0=tlc.DimensionNumericValue(value_min=1, value_max=10),
)

# Create a composite schema from the atomic schemas
composite_schema = tlc.Schema(
    values={"float_array": float_array_schema, "int_value": int_schema},
)