tlc.core.objects.tables.system_tables.indexing_table
The base class for tables which are populated by scanning the contents of a URL.
Module Contents
Classes
| Class | Description |
|---|---|
| IndexingTable | The base class for tables which are populated by scanning the contents of a URL. |

API
- class tlc.core.objects.tables.system_tables.indexing_table.IndexingTable(url: tlc.core.url.Url | None = None, created: str | None = None, row_cache_url: tlc.core.url.Url | None = None, row_cache_populated: bool | None = None, project_scan_urls: list[tlc.core.url.Url] | None = None, extra_scan_urls: list[tlc.core.url.Url] | None = None, scan_urls: list[tlc.core.objects.tables.system_tables.indexing._ScanUrl] | None = None, constrain_to_type: str | None = None, scan_wait: float | None = None, file_extensions: list[str] | None = None, create_default_dirs: bool | None = None, init_parameters: Any = None)
Bases: tlc.core.objects.table.Table
The base class for tables which are populated by scanning the contents of a URL.
The scanning can be limited to a particular object type (e.g. Run).
- Parameters:
url – The URL of the table.
created – The creation timestamp of the table.
row_cache_url – The URL of the row cache.
row_cache_populated – Indicates whether the row cache is populated.
scan_urls – The URLs to be scanned.
constrain_to_type – The type of objects to be included in the table.
init_parameters – Any additional initialization parameters.
- Parameters:
url – The URL of the table.
created – The creation time of the table.
description – The description of the table.
row_cache_url – The URL of the row cache.
row_cache_populated – Whether the row cache is populated.
override_table_rows_schema – The schema to override the table rows schema.
init_parameters – The initial parameters of the table.
input_tables – A list of Table URLs that are considered direct predecessors in this table’s lineage. This parameter serves as an explicit mechanism for tracking table relationships beyond the automatic lineage tracing typically managed by subclasses.
- add_scan_url(scan_url: tlc.core.objects.tables.system_tables.indexing._ScanUrl) → None
Adds a scan URL to the indexing table.
The URL is added to the list of URLs scanned to populate the table. Any new content is added to the table on the next indexing cycle.
- remove_scan_url(scan_url: tlc.core.objects.tables.system_tables.indexing._ScanUrl) → None
Removes a scan URL from the indexing table.
The URL is removed from the list of URLs scanned to populate the table. Content found under it is removed from the table on the next indexing cycle.
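The deferred effect of adding and removing scan URLs can be illustrated with a small stand-in. This is a sketch only: `ToyIndexingTable`, `run_indexing_cycle`, the `storage` dictionary, and the example URLs are hypothetical names invented for illustration and are not part of tlc. It assumes that changes to the scan list become visible in the rows only after the next cycle completes.

```python
class ToyIndexingTable:
    """Stand-in for illustration only; not the tlc implementation."""

    def __init__(self, storage):
        self._storage = storage  # maps a scan URL to the object URLs found under it
        self._scan_urls = []     # URLs scanned on each cycle
        self.rows = []           # contents as of the last completed cycle

    def add_scan_url(self, scan_url):
        # Only the scan list changes here; rows are untouched until the next cycle.
        if scan_url not in self._scan_urls:
            self._scan_urls.append(scan_url)

    def remove_scan_url(self, scan_url):
        if scan_url in self._scan_urls:
            self._scan_urls.remove(scan_url)

    def run_indexing_cycle(self):
        # Rebuild the rows from every URL currently in the scan list.
        self.rows = [
            obj for url in self._scan_urls for obj in self._storage.get(url, [])
        ]


storage = {"s3://bucket/a": ["a/run1", "a/run2"], "s3://bucket/b": ["b/run1"]}
table = ToyIndexingTable(storage)

table.add_scan_url("s3://bucket/a")
rows_before_cycle = list(table.rows)          # still empty: no cycle has run
table.run_indexing_cycle()
rows_after_add = list(table.rows)             # content under "a" is now present

table.add_scan_url("s3://bucket/b")
table.remove_scan_url("s3://bucket/a")
rows_before_second_cycle = list(table.rows)   # unchanged until the next cycle
table.run_indexing_cycle()
rows_after_second_cycle = list(table.rows)    # only content under "b" remains
```

The real table runs its cycles on worker threads; the explicit `run_indexing_cycle` call here only makes the timing visible.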
- add_extra_scan_urls(scan_urls: list[tlc.core.url.Url | str]) → None
Adds extra scan URLs to this indexing table.
If the indexing table is running, the changes are propagated to its worker threads.
- add_project_scan_urls(project_scan_urls: list[tlc.core.url.Url | str]) → None
Adds project scan URLs to this indexing table.
If the indexing table is running, the changes are propagated to its worker threads.
- consider_indexing_object(obj: tlc.core.object.Object, url: tlc.core.url.Url, event_type: tlc.core.object_registry._IndexerCallbackEventType) → bool
- add_indexing_object(obj: tlc.core.object.Object, url: tlc.core.url.Url) → bool
Adds a URL to the wait list if it qualifies for consideration.
- should_consider_url(url: tlc.core.url.Url) → bool
Returns whether the indexer should consider the given URL for indexing.
- should_consider_object(obj: tlc.core.object.Object) → bool
Returns whether the object should be considered. Only registered types derived from constrain_to_type are considered.
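The type constraint described for should_consider_object can be sketched as a registry lookup followed by a subclass check. Everything below is an assumption made for illustration: the `TYPE_REGISTRY` dictionary, the stand-in classes, and the free-function form are hypothetical and do not reflect how tlc registers types internally.

```python
class Object:
    """Stand-in base class; not tlc.core.object.Object."""


class Table(Object):
    pass


class Run(Object):
    pass


class DerivedRun(Run):
    pass


class Unregistered(Run):
    pass


# Hypothetical registry mapping type names to classes.
TYPE_REGISTRY = {cls.__name__: cls for cls in (Object, Table, Run, DerivedRun)}


def should_consider_object(obj, constrain_to_type):
    """Accept only registered types that derive from the constraining type."""
    base = TYPE_REGISTRY.get(constrain_to_type)
    if base is None:
        return False  # unknown constraint: nothing qualifies
    is_registered = TYPE_REGISTRY.get(type(obj).__name__) is type(obj)
    return is_registered and isinstance(obj, base)


run_ok = should_consider_object(Run(), "Run")
derived_ok = should_consider_object(DerivedRun(), "Run")
table_ok = should_consider_object(Table(), "Run")
unregistered_ok = should_consider_object(Unregistered(), "Run")
```

In this sketch a `Run` and a `DerivedRun` pass a `"Run"` constraint, while a `Table` (wrong branch of the hierarchy) and an `Unregistered` instance (derived but not registered) are rejected.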
- ensure_dependent_properties() → None
Populates the table with the objects from the indexed URLs. The rows of an IndexingTable are considered dependent properties.
IndexingTable deviates from the immutability of the Table class: repeated calls to this function re-populate the table with the latest indexed data.
A call is a no-op if no new data is available; when queried, the table simply returns the last populated index.
If new data is available, from indexing or “fast-track”, the table is re-populated with it.
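The "no-op unless new data is available" behaviour resembles a dirty-flag pattern. The following is a minimal sketch under that assumption; `ToyIndex`, `publish`, and `populate_count` are hypothetical names used for illustration, and the actual tlc bookkeeping may differ.

```python
class ToyIndex:
    """Stand-in for illustration only; not the tlc implementation."""

    def __init__(self):
        self._indexed = []           # latest data delivered by the indexer
        self._has_new_data = False   # set when the indexer delivers something new
        self.rows = []               # the last populated index
        self.populate_count = 0      # how many times rows were rebuilt

    def publish(self, objects):
        # Called by the (hypothetical) indexer when a cycle yields new data.
        self._indexed = list(objects)
        self._has_new_data = True

    def ensure_dependent_properties(self):
        if not self._has_new_data:
            return  # no-op: queries keep seeing the last populated index
        self.rows = list(self._indexed)
        self._has_new_data = False
        self.populate_count += 1


index = ToyIndex()
index.publish(["run1"])
index.ensure_dependent_properties()
index.ensure_dependent_properties()   # second call does nothing
count_after_repeat = index.populate_count
rows_after_repeat = list(index.rows)

index.publish(["run1", "run2"])
index.ensure_dependent_properties()   # new data triggers a re-population
count_after_new_data = index.populate_count
rows_after_new_data = list(index.rows)
```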
- wait_for_complete_index(timeout: float | None = None) → bool
Waits for a complete indexing cycle to finish.
- Parameters:
timeout – Timeout in seconds. The function blocks until the next indexing cycle is complete or the timeout is reached. If timeout is None, it blocks indefinitely.
- Returns:
True if the next index is available, False if the wait timed out.
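The blocking and timeout semantics described above can be sketched with a condition variable from the standard library. This is not the tlc implementation: `ToyIndexer` and `complete_cycle` are hypothetical names, and the sketch assumes that "the next indexing cycle" means the first cycle to finish after the call begins.

```python
import threading


class ToyIndexer:
    """Stand-in for illustration only; not the tlc implementation."""

    def __init__(self):
        self._cond = threading.Condition()
        self._completed_cycles = 0

    def complete_cycle(self):
        # Called by the (hypothetical) worker thread at the end of each cycle.
        with self._cond:
            self._completed_cycles += 1
            self._cond.notify_all()

    def wait_for_complete_index(self, timeout=None):
        with self._cond:
            # Wait for a cycle that finishes after this call started.
            target = self._completed_cycles + 1
            return self._cond.wait_for(
                lambda: self._completed_cycles >= target, timeout=timeout
            )


indexer = ToyIndexer()

# No cycle completes within the timeout, so the wait reports False.
timed_out_result = indexer.wait_for_complete_index(timeout=0.05)

# A cycle completes shortly after the wait begins, so the wait reports True.
timer = threading.Timer(0.05, indexer.complete_cycle)
timer.start()
completed_result = indexer.wait_for_complete_index(timeout=5.0)
timer.join()
```

With `timeout=None`, `Condition.wait_for` blocks until the predicate holds, which mirrors the documented indefinite wait.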