tlc.client.torch.samplers.sampler
Factory functions for creating PyTorch samplers from 3LC Tables.
Provides explicit sampler constructors based on a table’s sample weight column,
as well as a general-purpose create_sampler() function that dispatches to the
appropriate concrete sampler based on boolean flags.
Module Contents
Classes
| Class | Description |
|---|---|
| RangeSampler | Samples elements sequentially from a range. |
| RepeatByWeightSampler | Repeats elements based on their weight. |
| SubsetSequentialSampler | Samples elements sequentially from a given list of indices. |
Functions
| Function | Description |
|---|---|
| create_random_sampler | Create a uniform random sampler from a table. |
| create_repeat_by_weight_sampler | Create a sampler that repeats indices based on their weight. |
| create_sampler | Create a PyTorch sampler from a table’s weight column. |
| create_sequential_sampler | Create a sequential sampler from a table. |
| create_weighted_sampler | Create a weighted random sampler from a table’s weight column. |
API
- class RangeSampler( )
Bases: torch.utils.data.sampler.Sampler[int]
Samples elements sequentially from a range.
- class RepeatByWeightSampler( )
Bases: torch.utils.data.sampler.Sampler[int]
Repeats elements based on their weight.
- class SubsetSequentialSampler( )
Bases: torch.utils.data.sampler.Sampler[int]
Samples elements sequentially from a given list of indices.
- create_random_sampler( ) → RandomSampler | SubsetRandomSampler
Create a uniform random sampler from a table.
Samples indices uniformly at random without using weights for probability. If exclude_zero_weights is True, only non-zero-weight rows are included.
- Parameters:
table – The table to create a sampler from. Must have a sample weight column if exclude_zero_weights is True.
exclude_zero_weights – If True, rows with a weight of zero are excluded from sampling.
- Returns:
A RandomSampler or SubsetRandomSampler depending on whether zero-weight rows are excluded.
- Raises:
ValueError – If exclude_zero_weights is True and the table has no sample weight column or all weights are zero.
- create_repeat_by_weight_sampler( ) → RepeatByWeightSampler
Create a sampler that repeats indices based on their weight.
Each row appears a number of times proportional to its weight (fractional weights are handled probabilistically). Zero-weight rows are always excluded. The epoch length equals the sum of the weights.
- Parameters:
table – The table to create a sampler from. Must have a sample weight column with float values.
shuffle – If True, the repeated indices are shuffled. If False, they appear in sequential order.
- Returns:
A RepeatByWeightSampler based on the table’s weight column.
- Raises:
ValueError – If the table has no sample weight column, all weights are zero, or weights are not floats.
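The repeat-by-weight behaviour described above can be illustrated with a small standalone sketch. It does not use 3LC or PyTorch and is not the library's implementation: `repeat_indices_by_weight` and its `seed` parameter are hypothetical names, and the way fractional weights are rounded (one extra repeat with probability equal to the fractional part) is an assumption about what "handled probabilistically" could mean.

```python
import math
import random


def repeat_indices_by_weight(weights, shuffle=True, seed=None):
    """Hypothetical helper: expand row indices according to float weights."""
    if not any(w > 0 for w in weights):
        raise ValueError("All weights are zero.")
    rng = random.Random(seed)
    indices = []
    for index, weight in enumerate(weights):
        whole = math.floor(weight)   # repeats that always happen
        fraction = weight - whole    # assumed chance of one extra repeat
        repeats = whole + (1 if rng.random() < fraction else 0)
        indices.extend([index] * repeats)  # zero-weight rows add nothing
    if shuffle:
        rng.shuffle(indices)
    return indices


# Whole-number weights give a deterministic result when shuffle is False.
print(repeat_indices_by_weight([2.0, 0.0, 1.0, 3.0], shuffle=False))
# [0, 0, 2, 3, 3, 3]
```

In this sketch the output length equals the sum of the weights exactly for whole-number weights, and only in expectation when fractional weights are present.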
- create_sampler(table: Table, exclude_zero_weights: bool = True, weighted: bool = True, shuffle: bool = True, repeat_by_weight: bool = False)
Create a PyTorch sampler from a table’s weight column.
This is a general-purpose dispatch function that selects the appropriate sampler based on the combination of boolean flags. For a cleaner API, consider using the explicit factory functions instead:
- create_weighted_sampler() for probability-proportional sampling
- create_random_sampler() for uniform random sampling
- create_sequential_sampler() for deterministic sequential iteration
- create_repeat_by_weight_sampler() for repeating rows by weight
- Parameters:
table – The table to create a sampler from.
exclude_zero_weights – If True, rows with a weight of zero are excluded from the sampler. This shortens the sampler for datasets with zero-weighted samples, and thus shortens an epoch when using a PyTorch DataLoader.
weighted – If True, the sampler uses sample weights (beyond the exclusion of zero-weighted rows) so that the distribution of the sampled rows matches the distribution of the weights. When weighted is True, there is no guarantee that every row in the table is sampled in a single epoch, even if all weights are equal.
shuffle – If False, the valid indices are returned in sequential order. A value of False cannot be combined with weighted set to True.
repeat_by_weight – If True, the sampler repeats indices based on their weights. This keeps the distribution of the sampled rows in line with the distribution of the weights while still sampling every row in the table (with weight > 1) in a single epoch. The number of repeats for samples with fractional weights is determined probabilistically. A value of True sets the length of the sampler (and thus an epoch) to the sum of the weights. This flag requires both weighted and exclude_zero_weights to be True.
- Returns:
A Sampler based on the weights column of the table.
- Raises:
ValueError – If an invalid combination of arguments is provided, or the table lacks a weight column.
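One possible reading of the flag rules above is sketched below as plain Python. It only returns the name of the explicit factory that a given flag combination would map to. `choose_factory` is a hypothetical helper, not part of the module, and the order of the checks (in particular, checking repeat_by_weight before the shuffle/weighted conflict) is an assumption rather than documented behaviour.

```python
def choose_factory(exclude_zero_weights=True, weighted=True,
                   shuffle=True, repeat_by_weight=False):
    """Hypothetical helper: map flag combinations to a factory name."""
    if repeat_by_weight:
        # Documented requirement: both flags must be True.
        if not (weighted and exclude_zero_weights):
            raise ValueError(
                "repeat_by_weight requires weighted and exclude_zero_weights"
            )
        return "create_repeat_by_weight_sampler"
    if weighted:
        # Documented conflict: shuffle=False excludes weighted sampling.
        if not shuffle:
            raise ValueError("shuffle=False cannot be combined with weighted=True")
        return "create_weighted_sampler"
    if shuffle:
        return "create_random_sampler"
    return "create_sequential_sampler"


print(choose_factory())                               # create_weighted_sampler
print(choose_factory(weighted=False))                 # create_random_sampler
print(choose_factory(weighted=False, shuffle=False))  # create_sequential_sampler
```

Because the defaults are weighted=True and shuffle=True, requesting sequential order through create_sampler() means passing weighted=False as well as shuffle=False.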
- create_sequential_sampler( ) → RangeSampler | SubsetSequentialSampler
Create a sequential sampler from a table.
Returns indices in sequential order. If exclude_zero_weights is True, only non-zero-weight rows are included.
- Parameters:
table – The table to create a sampler from. Must have a sample weight column if exclude_zero_weights is True.
exclude_zero_weights – If True, rows with a weight of zero are excluded from the sequence.
- Returns:
A RangeSampler or SubsetSequentialSampler depending on whether zero-weight rows are excluded.
- Raises:
ValueError – If exclude_zero_weights is True and the table has no sample weight column or all weights are zero.
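The difference between the two return types can be pictured with a plain list of weights. The sketch below is illustrative only: `sequential_indices` is a hypothetical helper that works on a Python list rather than a 3LC Table, and it mirrors the documented behaviour (full range versus non-zero-weight subset) without claiming to match the library's code.

```python
def sequential_indices(weights, exclude_zero_weights):
    """Hypothetical helper: row indices in sequential order."""
    if not exclude_zero_weights:
        # Comparable to iterating over the full range of rows.
        return list(range(len(weights)))
    # Comparable to iterating over a fixed subset of indices, in order.
    kept = [index for index, weight in enumerate(weights) if weight != 0]
    if not kept:
        raise ValueError("All weights are zero.")
    return kept


weights = [1.0, 0.0, 0.5, 0.0, 2.0]
print(sequential_indices(weights, exclude_zero_weights=False))  # [0, 1, 2, 3, 4]
print(sequential_indices(weights, exclude_zero_weights=True))   # [0, 2, 4]
```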
- create_weighted_sampler( ) → WeightedRandomSampler
Create a weighted random sampler from a table’s weight column.
Samples indices with probability proportional to their weight, with replacement. This means not all rows are guaranteed to be sampled in a single epoch, even if all weights are equal.
- Parameters:
table – The table to create a sampler from. Must have a sample weight column.
exclude_zero_weights – If True, rows with a weight of zero are excluded, reducing the epoch length to the number of non-zero-weight rows.
- Returns:
A WeightedRandomSampler based on the table’s weight column.
- Raises:
ValueError – If the table has no sample weight column or all weights are zero.
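To see why sampling with replacement does not guarantee that every row appears in an epoch, the following standard-library sketch draws indices with probability proportional to weight. `weighted_draws` and `seed` are hypothetical names, and using the total row count as the epoch length when exclude_zero_weights is False is an assumption; the source only states the epoch length for the True case.

```python
import random


def weighted_draws(weights, exclude_zero_weights, seed=None):
    """Hypothetical helper: one epoch of weighted draws with replacement."""
    if not any(w > 0 for w in weights):
        raise ValueError("All weights are zero.")
    rng = random.Random(seed)
    if exclude_zero_weights:
        # Documented: epoch length is the number of non-zero-weight rows.
        num_draws = sum(1 for w in weights if w != 0)
    else:
        # Assumption: otherwise one draw per row in the table.
        num_draws = len(weights)
    # Zero-weight rows have zero probability, so they are never drawn.
    return rng.choices(range(len(weights)), weights=weights, k=num_draws)


draws = weighted_draws([1.0, 0.0, 0.5, 0.0, 2.0], exclude_zero_weights=True, seed=0)
print(len(draws))         # 3
print(sorted(set(draws)))  # subset of [0, 2, 4]; duplicates are possible
```

Since each draw is independent, the same index can appear more than once in an epoch while another non-zero-weight index is missed entirely.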