tlc.core.objects.tables.system_tables.indexing_table#

The base class for tables which are populated by scanning the contents of a URL.

Module Contents#

Classes#

Class            Description

IndexingTable    The base class for tables which are populated by scanning the contents of a URL.

API#

class tlc.core.objects.tables.system_tables.indexing_table.IndexingTable(url: tlc.core.url.Url | None = None, created: str | None = None, row_cache_url: tlc.core.url.Url | None = None, row_cache_populated: bool | None = None, project_scan_urls: list[tlc.core.url.Url] | None = None, extra_scan_urls: list[tlc.core.url.Url] | None = None, scan_urls: list[tlc.core.objects.tables.system_tables.indexing._ScanUrl] | None = None, constrain_to_type: str | None = None, scan_wait: float | None = None, file_extensions: list[str] | None = None, create_default_dirs: bool | None = None, init_parameters: Any = None)#

Bases: tlc.core.objects.table.Table

The base class for tables which are populated by scanning the contents of a URL.

The scanning can be limited to a particular object type (e.g. Run).

Parameters:
  • url – The URL of the table.

  • created – The creation timestamp of the table.

  • row_cache_url – The URL of the row cache.

  • row_cache_populated – Indicates whether the row cache is populated.

  • scan_urls – The URLs to be scanned.

  • constrain_to_type – The type of objects to be included in the table.

  • init_parameters – Any additional initialization parameters.

Parameters:
  • url – The URL of the table.

  • created – The creation time of the table.

  • description – The description of the table.

  • row_cache_url – The URL of the row cache.

  • row_cache_populated – Whether the row cache is populated.

  • override_table_rows_schema – The schema to override the table rows schema.

  • init_parameters – The initial parameters of the table.

  • input_tables – A list of Table URLs that are considered direct predecessors in this table’s lineage. This parameter serves as an explicit mechanism for tracking table relationships beyond the automatic lineage tracing typically managed by subclasses.

property running: bool#

Whether the indexing table is currently running.

add_scan_url(scan_url: tlc.core.objects.tables.system_tables.indexing._ScanUrl) None#

Adds a Scan URL to the indexing table.

The URL will be added to the list of URLs scanned to populate the table. Any new content is added to the table on the next indexing cycle.

remove_scan_url(scan_url: tlc.core.objects.tables.system_tables.indexing._ScanUrl) None#

Removes a Scan URL from the indexing table.

The URL will be removed from the list of URLs scanned to populate the table. Content found under that URL is removed from the table on the next indexing cycle.
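The deferred effect of add_scan_url and remove_scan_url can be illustrated with a small stand-in. This is a minimal sketch, not the tlc implementation: ScanListSketch, run_cycle, and the storage mapping are hypothetical names, and plain strings are used where the real methods take _ScanUrl objects.

```python
class ScanListSketch:
    """Hypothetical stand-in for the scan-list behaviour described above."""

    def __init__(self):
        self.scan_urls = []  # URLs visited on each indexing cycle
        self.rows = []       # what a reader of the table currently sees

    def add_scan_url(self, url):
        # Only the scan list changes here; rows are untouched until the next cycle.
        if url not in self.scan_urls:
            self.scan_urls.append(url)

    def remove_scan_url(self, url):
        if url in self.scan_urls:
            self.scan_urls.remove(url)

    def run_cycle(self, storage):
        # storage (an assumption) maps a scan URL to the object URLs found under it.
        self.rows = [obj for url in self.scan_urls for obj in storage.get(url, [])]


storage = {"s3://bucket/a": ["a/run1", "a/run2"], "s3://bucket/b": ["b/run1"]}
table = ScanListSketch()

table.add_scan_url("s3://bucket/a")
rows_before_cycle = list(table.rows)   # still empty: no cycle has run yet
table.run_cycle(storage)
rows_after_add = list(table.rows)      # content of "a" is now visible

table.add_scan_url("s3://bucket/b")
table.remove_scan_url("s3://bucket/a")
rows_before_second = list(table.rows)  # unchanged until the next cycle
table.run_cycle(storage)
rows_after_second = list(table.rows)   # only content of "b" remains
```

In the sketch, changes to the scan list never touch the rows directly; only the next cycle does, which matches the "next indexing cycle" wording above.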

add_extra_scan_urls(scan_urls: list[tlc.core.url.Url | str]) None#

Adds extra scan URLs to this indexing table.

If the indexing table is running, changes will be propagated to worker threads.

add_project_scan_urls(project_scan_urls: list[tlc.core.url.Url | str]) None#

Adds project scan URLs to this indexing table.

If the indexing table is running, changes will be propagated to worker threads.

consider_indexing_object(obj: tlc.core.object.Object, url: tlc.core.url.Url, event_type: tlc.core.object_registry._IndexerCallbackEventType) bool#

add_indexing_object(obj: tlc.core.object.Object, url: tlc.core.url.Url) bool#

Adds a URL to the wait list if it qualifies for consideration.

should_consider_url(url: tlc.core.url.Url) bool#

Whether the indexer should consider the given URL for indexing.

should_consider_object(obj: tlc.core.object.Object) bool#

Only considers registered types that are derived from constrain_to_type.
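The type constraint can be pictured as a subclass check against a registry of known types. The following is a minimal sketch under stated assumptions: the class hierarchy, REGISTERED_TYPES, and should_consider_object_sketch are hypothetical and do not reflect how tlc stores or resolves registered types. Rejecting everything for an unregistered type name is likewise a choice made for this sketch only.

```python
# Hypothetical type hierarchy standing in for registered object types.
class Object: ...
class Table(Object): ...
class Run(Object): ...
class EditedTable(Table): ...

# Hypothetical registry mapping type names to classes.
REGISTERED_TYPES = {
    "Object": Object,
    "Table": Table,
    "Run": Run,
    "EditedTable": EditedTable,
}


def should_consider_object_sketch(obj, constrain_to_type):
    """Accept obj only if its type is registered and derives from the constraint."""
    base = REGISTERED_TYPES.get(constrain_to_type)
    if base is None:
        # Assumption for this sketch: an unregistered constraint name rejects everything.
        return False
    if type(obj) not in REGISTERED_TYPES.values():
        return False
    return isinstance(obj, base)


accepts_subclass = should_consider_object_sketch(EditedTable(), "Table")
rejects_other = should_consider_object_sketch(Run(), "Table")
accepts_exact = should_consider_object_sketch(Run(), "Run")
```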

start() None#

ensure_dependent_properties() None#

The rows of an IndexingTable are considered dependent properties, and this is where the table is populated with the objects from the indexed URLs.

IndexingTable deviates from the immutability of the Table class, and repeated calls to this function will re-populate the table with the latest indexed data.

A call to this function is a no-op if no new data is available; when the table is queried, it will simply return the last populated index.

If new data is available, from indexing or “fast-track”, it will re-populate the table with the new data.
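The no-op versus re-populate behaviour described above can be sketched with a dirty flag. This is an illustration under stated assumptions, not the tlc implementation: RepopulatingTableSketch, publish_index, and the _dirty flag are hypothetical, and the counter attribute only mimics the documented counter property.

```python
class RepopulatingTableSketch:
    """Hypothetical stand-in: rows are refreshed only when new data has arrived."""

    def __init__(self):
        self._latest_index = []  # most recent result produced by the indexer
        self._dirty = False      # True when _latest_index is newer than rows
        self.rows = []           # what readers of the table see
        self.counter = 0         # bumped each time rows are refreshed

    def publish_index(self, index):
        # Called by the (hypothetical) indexer when a cycle yields new data.
        self._latest_index = list(index)
        self._dirty = True

    def ensure_dependent_properties(self):
        if not self._dirty:
            return  # no new data: keep serving the last populated rows
        self.rows = list(self._latest_index)
        self._dirty = False
        self.counter += 1


sketch = RepopulatingTableSketch()
sketch.ensure_dependent_properties()      # nothing published yet: no-op
counter_initial = sketch.counter

sketch.publish_index(["run-1", "run-2"])
sketch.ensure_dependent_properties()      # new data: rows are re-populated
rows_first = list(sketch.rows)
counter_first = sketch.counter

sketch.ensure_dependent_properties()      # repeated call without new data: no-op
counter_repeat = sketch.counter
```

A caller can compare the counter value before and after a call to tell whether the rows changed, without diffing the rows themselves.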

append_row(row: Any, location_index: int) None#

Registers the row in the owned row list.

stop() None#

wait_for_complete_index(timeout: float | None = None) bool#

Waits for a complete indexing cycle to finish.

Parameters:

timeout – timeout in seconds. The function will block until the next indexing cycle is complete or the timeout is reached unless timeout is None, in which case the function will block indefinitely.

Returns:

True if the next index is available, False if timed out
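The blocking and timeout semantics above resemble a wait on a condition variable. The sketch below shows that pattern with the standard library only; IndexCycleSignal and mark_cycle_complete are hypothetical names, and nothing here is taken from the tlc source.

```python
import threading


class IndexCycleSignal:
    """Hypothetical stand-in for 'block until the next indexing cycle completes'."""

    def __init__(self):
        self._cond = threading.Condition()
        self._completed_cycles = 0

    def mark_cycle_complete(self):
        # Called by the (hypothetical) indexing worker at the end of each cycle.
        with self._cond:
            self._completed_cycles += 1
            self._cond.notify_all()

    def wait_for_complete_index(self, timeout=None):
        # Returns True once a cycle finishes after this call began, False on timeout.
        # With timeout=None the call blocks until a cycle completes.
        with self._cond:
            seen = self._completed_cycles
            return self._cond.wait_for(
                lambda: self._completed_cycles > seen, timeout=timeout
            )


signal = IndexCycleSignal()

# No worker is running, so the wait times out.
timed_out_result = signal.wait_for_complete_index(timeout=0.05)

# A timer plays the role of a worker finishing a cycle shortly afterwards.
worker = threading.Timer(0.2, signal.mark_cycle_complete)
worker.start()
completed_result = signal.wait_for_complete_index(timeout=5.0)
worker.join()
```

Capturing the cycle count on entry is what makes the call wait for the next cycle rather than return immediately because an earlier cycle already finished.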

property counter: int#

A counter that is incremented every time the table is updated with new data.