tlc.client.torch.samplers.sampler

Factory functions for creating PyTorch samplers from 3LC Tables.

Provides explicit sampler constructors based on a table’s sample weight column, as well as a general-purpose create_sampler function that dispatches to the appropriate concrete sampler based on boolean flags.

Module Contents

Classes

  • RangeSampler – Samples elements sequentially from a range.

  • RepeatByWeightSampler – Repeats elements based on their weight.

  • SubsetSequentialSampler – Samples elements sequentially from a given list of indices.

Functions

  • create_random_sampler – Create a uniform random sampler from a table.

  • create_repeat_by_weight_sampler – Create a sampler that repeats indices based on their weight.

  • create_sampler – Create a PyTorch sampler from a table’s weight column.

  • create_sequential_sampler – Create a sequential sampler from a table.

  • create_weighted_sampler – Create a weighted random sampler from a table’s weight column.

API

class RangeSampler(
end: int,
start: int = 0,
step: int = 1,
)

Bases: torch.utils.data.sampler.Sampler[int]

Samples elements sequentially from a range.
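As a rough illustration of the behaviour described above, here is a minimal stdlib-only sketch. RangeSamplerSketch is a hypothetical stand-in, not the library class: it does not subclass torch.utils.data.Sampler, and it assumes that end is exclusive, as in Python's built-in range.

```python
from collections.abc import Iterator


class RangeSamplerSketch:
    """Hypothetical stand-in that yields range(start, end, step) in order."""

    def __init__(self, end: int, start: int = 0, step: int = 1) -> None:
        # Assumption: `end` is exclusive, mirroring Python's range().
        self._range = range(start, end, step)

    def __iter__(self) -> Iterator[int]:
        return iter(self._range)

    def __len__(self) -> int:
        return len(self._range)


print(list(RangeSamplerSketch(10, start=2, step=3)))  # [2, 5, 8]
```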

class RepeatByWeightSampler(
weights: list[float],
shuffle: bool = True,
random_state: Random | None = None,
)

Bases: torch.utils.data.sampler.Sampler[int]

Repeats elements based on their weight.

class SubsetSequentialSampler(
indices: list[int],
)

Bases: torch.utils.data.sampler.Sampler[int]

Samples elements sequentially from a given list of indices.
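The following stdlib-only sketch shows one way to read "sequentially from a given list of indices": the indices are yielded in the order they were supplied, without sorting or shuffling. SubsetSequentialSketch is a hypothetical name used for illustration; the real class subclasses torch.utils.data.Sampler and may differ in detail.

```python
from collections.abc import Iterator


class SubsetSequentialSketch:
    """Hypothetical stand-in that yields the given indices in list order."""

    def __init__(self, indices: list[int]) -> None:
        # Copy the list so later changes by the caller do not affect iteration.
        self._indices = list(indices)

    def __iter__(self) -> Iterator[int]:
        return iter(self._indices)

    def __len__(self) -> int:
        return len(self._indices)


sampler = SubsetSequentialSketch([4, 1, 3])
print(list(sampler))  # [4, 1, 3]
```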

create_random_sampler(
table: Table,
exclude_zero_weights: bool = True,
) → RandomSampler | SubsetRandomSampler

Create a uniform random sampler from a table.

Samples indices uniformly at random without using weights for probability. If exclude_zero_weights is True, only non-zero-weight rows are included.

Parameters:
  • table – The table to create a sampler from. Must have a sample weight column if exclude_zero_weights is True.

  • exclude_zero_weights – If True, rows with a weight of zero are excluded from sampling.

Returns:

A RandomSampler or SubsetRandomSampler depending on whether zero-weight rows are excluded.

Raises:

ValueError – If exclude_zero_weights is True and the table has no sample weight column or all weights are zero.

create_repeat_by_weight_sampler(
table: Table,
shuffle: bool = True,
) → RepeatByWeightSampler

Create a sampler that repeats indices based on their weight.

Each row appears a number of times proportional to its weight (fractional weights are handled probabilistically). Zero-weight rows are always excluded. The epoch length equals the sum of the weights.

Parameters:
  • table – The table to create a sampler from. Must have a sample weight column with float values.

  • shuffle – If True, the repeated indices are shuffled. If False, they appear in sequential order.

Returns:

A RepeatByWeightSampler based on the table’s weight column.

Raises:

ValueError – If the table has no sample weight column, all weights are zero, or weights are not floats.

create_sampler(
table: Table,
exclude_zero_weights: bool = True,
weighted: bool = True,
shuffle: bool = True,
repeat_by_weight: bool = False,
) → Sampler[int]

Create a PyTorch sampler from a table’s weight column.

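This is a general-purpose dispatch function that selects the appropriate sampler based on the combination of boolean flags. For a cleaner API, consider using the explicit factory functions instead: create_random_sampler, create_sequential_sampler, create_weighted_sampler, or create_repeat_by_weight_sampler.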

Parameters:
  • table – The table to create a sampler from.

  • exclude_zero_weights – If True, rows with a weight of zero will be excluded from the sampler. This is useful for reducing the length of the sampler for datasets with zero-weighted samples, and thus the length of an epoch when using a PyTorch DataLoader.

  • weighted – If True, the sampler will use sample weights (beyond the exclusion of zero-weighted rows) to ensure that the distribution of the sampled rows matches the distribution of the weights. When weighted is set to True, you are no longer guaranteed that every row in the table will be sampled in a single epoch, even if all weights are equal.

  • shuffle – If False, the valid indices are returned in sequential order. shuffle=False cannot be combined with weighted=True.

  • repeat_by_weight – If True, the sampler repeats each index according to its weight. This makes the distribution of the sampled rows match the distribution of the weights, while still sampling every row with a weight of at least 1 in a single epoch. The number of repeats for rows with fractional weights is determined probabilistically. The length of the sampler (and thus of an epoch) becomes the sum of the weights. This flag requires both weighted and exclude_zero_weights to be True.

Returns:

A Sampler based on the table’s weight column.

Raises:

ValueError – If an invalid combination of arguments is provided, or the table lacks a weight column.
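The flag constraints listed above can be summarised as a small decision function. The sketch below is inferred from the parameter descriptions only, not from the library source: choose_factory_sketch is a hypothetical helper that returns the name of the explicit factory a given flag combination appears to correspond to. How the real function treats repeat_by_weight=True together with shuffle=False is not stated here, so this sketch applies the documented shuffle/weighted rule literally and rejects that combination.

```python
def choose_factory_sketch(
    exclude_zero_weights: bool = True,
    weighted: bool = True,
    shuffle: bool = True,
    repeat_by_weight: bool = False,
) -> str:
    """Return the name of the explicit factory matching the flags (assumed mapping)."""
    if repeat_by_weight and not (weighted and exclude_zero_weights):
        raise ValueError(
            "repeat_by_weight requires weighted=True and exclude_zero_weights=True"
        )
    if weighted and not shuffle:
        # Assumption: applied even when repeat_by_weight is True.
        raise ValueError("shuffle=False cannot be combined with weighted=True")
    if repeat_by_weight:
        return "create_repeat_by_weight_sampler"
    if weighted:
        return "create_weighted_sampler"
    if shuffle:
        return "create_random_sampler"
    return "create_sequential_sampler"


print(choose_factory_sketch())  # create_weighted_sampler
print(choose_factory_sketch(weighted=False, shuffle=False))  # create_sequential_sampler
```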

create_sequential_sampler(
table: Table,
exclude_zero_weights: bool = True,
) → RangeSampler | SubsetSequentialSampler

Create a sequential sampler from a table.

Returns indices in sequential order. If exclude_zero_weights is True, only non-zero-weight rows are included.

Parameters:
  • table – The table to create a sampler from. Must have a sample weight column if exclude_zero_weights is True.

  • exclude_zero_weights – If True, rows with a weight of zero are excluded from the sequence.

Returns:

A RangeSampler or SubsetSequentialSampler depending on whether zero-weight rows are excluded.

Raises:

ValueError – If exclude_zero_weights is True and the table has no sample weight column or all weights are zero.
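The choice between the two return types can be pictured with the stdlib-only sketch below. The function sequential_indices_sketch is hypothetical and works on a plain list of weights: without exclusion it covers every row index in order (the RangeSampler case), and with exclusion it keeps only the non-zero-weight indices in their original order (the SubsetSequentialSampler case).

```python
def sequential_indices_sketch(
    weights: list[float],
    exclude_zero_weights: bool = True,
) -> list[int]:
    """Return one epoch of row indices in ascending order."""
    if not exclude_zero_weights:
        # Every row is visited; weights are not consulted at all.
        return list(range(len(weights)))
    kept = [i for i, w in enumerate(weights) if w != 0]
    if not kept:
        raise ValueError("all weights are zero")
    return kept


weights = [1.0, 0.0, 2.5, 0.0, 0.3]
print(sequential_indices_sketch(weights))  # [0, 2, 4]
print(sequential_indices_sketch(weights, exclude_zero_weights=False))  # [0, 1, 2, 3, 4]
```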

create_weighted_sampler(
table: Table,
exclude_zero_weights: bool = True,
) → WeightedRandomSampler

Create a weighted random sampler from a table’s weight column.

Samples indices with probability proportional to their weight, with replacement. This means not all rows are guaranteed to be sampled in a single epoch, even if all weights are equal.

Parameters:
  • table – The table to create a sampler from. Must have a sample weight column.

  • exclude_zero_weights – If True, rows with a weight of zero are excluded, reducing the epoch length to the number of non-zero-weight rows.

Returns:

A WeightedRandomSampler based on the table’s weight column.

Raises:

ValueError – If the table has no sample weight column or all weights are zero.