tlc.url¶
Public URL API.
tlc.Url is hoisted to the top level; the surrounding URL surface lives here.
See URLs and Custom URL Adapters for narrative usage.
Package Contents¶
Classes¶
Class |
Description |
|---|---|
Structured metadata for a registered URL adapter. |
|
An Enum indicating what to do if content already exists at the target URL |
|
String constants for URL schemes (convenience only). |
|
The base class for all URL adapters. |
|
A directory entry returned by URL adapter operations. |
|
Interface a URL adapter must implement to be registered with the registry. |
Functions¶
Function |
Description |
|---|---|
Return the path of a registered URL alias. |
|
Return the registered URL aliases. |
|
List all registered URL adapters with structured metadata. |
|
List the URL schemes of all registered adapters. |
|
Class decorator that registers a |
|
Register an alias for a URL. |
|
Unregister a programmatically-registered URL alias. |
API¶
- class AdapterInfo¶
Bases:
typing.TypedDictStructured metadata for a registered URL adapter.
Initialize self. See help(type(self)) for accurate signature.
- source: tlcurl.url_adapters._registry.AdapterSource = None¶
- class IfExistsOption¶
-
An Enum indicating what to do if content already exists at the target URL
from tlcurl import UrlAdapterRegistry, IfExistsOption from tlcurl.url import Url url = Url("s3://my-bucket/my-file.txt") UrlAdapterRegistry.write_string_content_to_url(url, "Hello World!", IfExistsOption.OVERWRITE)
Initialize self. See help(type(self)) for accurate signature.
- OVERWRITE = overwrite¶
If content already exists at the target URL, overwrite the content (default).
- RAISE = raise¶
If content already exists at the target URL, raise an exception.
- RENAME = rename¶
If content already exists at the target URL, leave that content, and write the new content to a new URL that is similar to the original target URL.
- REUSE = reuse¶
If content already exists at the target URL, reuse the existing content.
- class Scheme¶
String constants for URL schemes (convenience only).
This class provides string constants for the schemes that ship with 3LC. It exists purely for convenience (IDE autocomplete, avoiding typos) and documentation purposes.
The actual available schemes are determined entirely by what adapters are registered with :class:
~tlcurl.url_adapters._registry.UrlAdapterRegistry. Custom schemes work automatically when an adapter is registered - no need to modify this class.Schemes that ship with 3LC::
Scheme.FILE = "file" # file:// - Local filesystem Scheme.S3 = "s3" # s3:// - Amazon S3 Scheme.GS = "gs" # gs:// - Google Cloud Storage Scheme.ABFS = "abfs" # abfs:// - Azure Blob Storage Scheme.API = "api" # api:// - In-memory API objects Scheme.HTTP = "http" # http:// - HTTP URLs Scheme.HTTPS = "https" # https:// - HTTPS URLs
Internal/virtual schemes (not backed by standalone adapters)::
Scheme.RELATIVE = "relative" # Relative paths (handled by FileUrlAdapter) Scheme.ALIAS = "alias" # URL aliases (handled by UrlAliasRegistry)
Example usage::
from tlcurl.url import Scheme, Url url = Url("/path/to/file") if url.scheme == Scheme.FILE: print("This is a local file") # Custom schemes work automatically when an adapter is registered url = Url("mydb://database/table/key") print(url.scheme) # "mydb".. note::
These are string constants, not an enum. Comparisons use string equality:
url.scheme == Scheme.FILEis equivalent tourl.scheme == "file".
- class UrlAdapter¶
Bases:
abc.ABCThe base class for all URL adapters.
URL adapters provide I/O operations for specific URL schemes. All operations are synchronous — for async usage, the registry runs these methods in a thread pool via the async wrapper functions exported from
tlc.url.Subclassing this ABC is the recommended path — it ships sensible defaults for the optional methods and the string-encoding forwarders. Adapters that need to bypass the ABC entirely can implement
UrlAdapterProtocoldirectly; the registry accepts anything that satisfies that structural protocol.To implement a custom adapter, subclass this and implement only the required abstract methods:
schemes()- Return the URL schemes this adapter handlesread_binary_content_from_url()- Read binary content from a URLexists()- Check if a URL exists
Custom adapters can be registered automatically via Python entry points. Declare the adapter in your package’s
pyproject.toml::[project.entry-points."tlc.url_adapters"] myscheme = "mypackage.adapters:MyAdapter"
The adapter will be discovered and registered at startup when your package is installed.
All other methods have sensible default implementations or raise helpful
NotImplementedErrormessages indicating which method to override for full functionality.Optional methods to override:
write_binary_content_to_url()- Write binary content (enables writes)list_dir()- List directory contentsis_dir()- Check if URL is a directorydelete_url()- Delete contentcopy_url()- Copy content between URLs (default reads+writes)make_dirs()- Create directories (default no-op for flat hierarchies)get_file_size()- Get file size (default reads entire file)is_writable()- Check if URL is writable (default False)stat()- Get file metadatatouch()- Update modification time
.. seealso::
Complete examples in the `3lc-examples <https://github.com/3lc-ai/3lc-examples>`_ repository: ``examples/http-image-url-adapter``, ``examples/kitti-virtual-table``, ``examples/nifti-virtual-table``.
- copy_url( ) None¶
Copy content from one URL to another.
Default implementation reads and writes. Override for more efficient server-side copies.
- Parameters:
source – The source URL.
destination – The destination URL.
- create_unique_url(
- url: Url,
Create a unique version of the URL by finding the next available name.
This method generates a unique URL by appending a numeric suffix (e.g., _0000, _0001) to the stem of the URL if the original URL already exists.
Override for scheme-specific uniquification behavior.
- Parameters:
url – The URL to make unique.
- Returns:
A unique URL (may be the same URL if it doesn’t exist).
- default_project_scan_urls() list[str | dict[str, object]]¶
Default project scan URLs contributed by this adapter.
Override to provide locations the indexer should scan for 3LC projects when this adapter is registered. Each entry can be a plain URL string (
"project"layout is applied automatically) or a dict with explicit fields (e.g.,{"url": "mydb://projects", "layout": "project"}).- Returns:
List of URL strings or scan-URL dicts (empty by default).
- default_run_scan_urls() list[str | dict[str, object]]¶
Default run scan URLs contributed by this adapter.
Override to provide flat locations the indexer should scan for Run objects. Each entry can be a plain URL string (
"flat"layout and"run"object type are applied automatically) or a dict with explicit fields.- Returns:
List of URL strings or scan-URL dicts (empty by default).
- default_table_scan_urls() list[str | dict[str, object]]¶
Default table scan URLs contributed by this adapter.
Override to provide flat locations the indexer should scan for Table objects. Each entry can be a plain URL string (
"flat"layout and"table"object type are applied automatically) or a dict with explicit fields.- Returns:
List of URL strings or scan-URL dicts (empty by default).
- delete_url(
- url: Url,
Delete content at a URL.
Override to enable deletion.
- Parameters:
url – The URL to delete.
- Raises:
NotImplementedError – By default.
- async delete_url_async(
- url: Url,
Delete content at a URL asynchronously.
Default implementation wraps :meth:
delete_urlin a thread pool. Override for native async I/O.- Parameters:
url – The URL to delete.
- abstract exists(
- url: Url,
Check if a URL exists.
This method MUST be overridden by subclasses.
- Parameters:
url – The URL to check.
- Returns:
True if the URL exists.
- async exists_async(
- url: Url,
Check if a URL exists asynchronously.
Default implementation wraps :meth:
existsin a thread pool. Override for native async I/O.- Parameters:
url – The URL to check.
- Returns:
True if the URL exists.
- force: ClassVar[bool] = False¶
Class-level registration parameter. Set to
Truein a subclass to replace an already-registered adapter for the same scheme(s), including built-ins. Read by entry point discovery and the@register_url_adapterdecorator.
- format_url( ) str¶
Format a URL for display/serialization.
Default implementation returns scheme://path[?query].
- Parameters:
scheme – The URL scheme.
path – The normalized path.
query – Optional query string.
- Returns:
The formatted URL string.
- generate_self_authenticated_url( ) str | None¶
Generate a self-authenticated URL that browsers can fetch without authorization headers.
Self-authenticated URLs carry their own authentication embedded in the URL itself, allowing browsers to load content in
<img>,<video>, and<audio>tags without customAuthorizationheaders. The embedded authentication is time-limited.The mechanism is adapter-specific (e.g. presigned URL for S3).
Default implementation returns
None, indicating this adapter does not support self-authenticated URLs and the caller should fall back to proxying through the object service.- Parameters:
url – The URL to generate a self-authenticated URL for.
expiration_seconds – How long the self-authenticated URL should remain valid (default: :data:
DEFAULT_SELF_AUTHENTICATED_URL_EXPIRY_SECONDS).
- Returns:
A self-authenticated URL string, or
Noneif not supported.
- async generate_self_authenticated_url_async( ) str | None¶
Generate a self-authenticated URL that browsers can fetch without authorization headers asynchronously.
Self-authenticated URLs carry their own authentication embedded in the URL itself, allowing browsers to load content in
<img>,<video>, and<audio>tags without customAuthorizationheaders. The embedded authentication is time-limited.The mechanism is adapter-specific (e.g. presigned URL for S3).
Default implementation wraps :meth:
generate_self_authenticated_urlin a thread pool, which itself defaults to returningNone, indicating this adapter does not support self-authenticated URLs and the caller should fall back to proxying through the object service.Derived adapters can override the sync method to support self-authenticated URLs or this method to support native async I/O (e.g., using aiobotocore, aiofiles).
- Parameters:
url – The URL to generate a self-authenticated URL for.
expiration_seconds – How long the self-authenticated URL should remain valid.
- Returns:
A self-authenticated URL string, or
Noneif this adapter does not support self-authenticated URLs.
- get_file_size(
- url: Url,
Get the size of the file at the URL.
Default implementation reads the entire file to get its size. Override for efficiency with large files.
- Parameters:
url – The URL to check.
- Returns:
The file size in bytes.
- get_scheme_for_url(
- url_string: str,
Get the scheme for the given URL string if this adapter can handle it.
Default implementation checks if the URL string starts with any of the adapter’s prefixes. Override for complex logic (e.g., FILE scheme handling Windows paths).
- Parameters:
url_string – The URL string to check.
- Returns:
The scheme string if this adapter can handle it, None otherwise.
- get_url_prefixes() list[str]¶
Get URL prefixes this adapter handles (e.g., [‘s3://’]).
Default implementation derives prefixes from schemes().
- Returns:
List of URL prefixes handled by this adapter.
- is_absolute_scheme() bool¶
Whether this adapter’s schemes represent absolute URLs.
Default is True. Override to return False for relative/virtual schemes.
- Returns:
True if URLs with this scheme are inherently absolute.
- is_dir(
- url: Url,
Check if a URL is a directory.
Default implementation returns False (assumes files only).
- Parameters:
url – The URL to check.
- Returns:
True if the URL is a directory.
- is_file_hierarchy_flat() bool¶
Determine if the file hierarchy is flat for this adapter.
Flat hierarchies (like cloud storage) treat paths as opaque keys, so
make_dirsis a no-op. Hierarchical file systems (like local disk) require explicit directory creation.Default is True (flat), consistent with the no-op default of
make_dirs. Override to return False for hierarchical file systems that needmake_dirs.- Returns:
True if the file hierarchy is flat.
- is_writable(
- url: Url,
Check if a URL is writable.
Default implementation returns False. Override and return True if writes are supported.
- Parameters:
url – The URL to check.
- Returns:
True if the URL is writable.
- list_dir(
- url: Url,
List directory contents.
Override to enable directory listing.
- Parameters:
url – The directory URL to list.
- Returns:
An iterator of directory entries.
- Raises:
NotImplementedError – By default.
- make_dirs( ) None¶
Create a leaf directory and all intermediate ones.
Default is a no-op (assumes flat hierarchy like cloud storage). Override if your storage requires directory creation.
- Parameters:
url – The directory URL to create.
exist_ok – If True, don’t raise if directory exists.
- abstract read_binary_content_from_url(
- url: Url,
Read binary content from a URL.
This method MUST be overridden by subclasses.
- Parameters:
url – The URL to read from.
- Returns:
The binary content at the URL.
- async read_binary_content_from_url_async(
- url: Url,
Read binary content from a URL asynchronously.
Default implementation wraps :meth:
read_binary_content_from_urlin a thread pool. Override for native async I/O (e.g., using aiobotocore, aiofiles).- Parameters:
url – The URL to read from.
- Returns:
The binary content at the URL.
- read_string_content_from_url( ) str¶
Read string content from a URL.
Concrete forwarding method — calls {meth}
read_binary_content_from_urland decodes withencoding(default UTF-8). Subclasses should not normally override this; override {meth}read_binary_content_from_urlinstead.- Parameters:
url – The URL to read from.
encoding – The text encoding (default: utf-8).
- Returns:
The string content at the URL.
- async read_string_content_from_url_async( ) str¶
Read string content from a URL asynchronously.
Default implementation wraps :meth:
read_string_content_from_urlin a thread pool. Override for native async I/O.- Parameters:
url – The URL to read from.
encoding – The text encoding (default: utf-8).
- Returns:
The string content at the URL.
- abstract schemes() list[str]¶
Get URL schemes this adapter handles.
This method MUST be overridden by subclasses.
- Returns:
List of scheme strings this adapter handles (e.g., [“file”], [“s3”]).
- stat(
- url: Url,
Get file metadata.
Default implementation creates a simple entry from available info. Override for more accurate metadata.
- Parameters:
url – The URL to get metadata for.
- Returns:
A directory entry with metadata.
- Raises:
FileNotFoundError – If the URL doesn’t exist.
- supports_relativization() bool¶
Whether URLs with this adapter’s schemes can be relativized.
Default is True. Override to return False for schemes like API where relativization doesn’t make sense.
- Returns:
True if URLs can be made relative.
- touch(
- url: Url,
Update the modification timestamp or create an empty file.
Default implementation creates an empty file if it doesn’t exist.
- Parameters:
url – The URL to touch.
- write_binary_content_to_url( ) Url¶
Write binary content to a URL.
Override this method to enable write support.
- Parameters:
url – The URL to write to.
content – The binary content to write.
- Returns:
The URL the content was written to. Normally the input
url; adapters that map writes onto a different location may return the resolved URL.- Raises:
NotImplementedError – By default.
- async write_binary_content_to_url_async( ) Url¶
Write binary content to a URL asynchronously.
Default implementation wraps :meth:
write_binary_content_to_urlin a thread pool. Override for native async I/O.- Parameters:
url – The URL to write to.
content – The binary content to write.
- Returns:
See :meth:
write_binary_content_to_url.
- write_string_content_to_url( ) Url¶
Write string content to a URL.
Concrete forwarding method — encodes
contentwithencoding(default UTF-8) and calls :meth:write_binary_content_to_url. Subclasses should not normally override this; override- Meth:
write_binary_content_to_urlinstead.- Parameters:
url – The URL to write to.
content – The string content to write.
encoding – The text encoding (default: utf-8).
- Returns:
The URL the content was written to. Normally the input
url; adapters that map writes onto a different location may return the resolved URL.
- async write_string_content_to_url_async( ) Url¶
Write string content to a URL asynchronously.
Default implementation wraps :meth:
write_string_content_to_urlin a thread pool. Override for native async I/O.- Parameters:
url – The URL to write to.
content – The string content to write.
encoding – The text encoding (default: utf-8).
- Returns:
See :meth:
write_string_content_to_url.
- class UrlAdapterDirEntry( )¶
A directory entry returned by URL adapter operations.
This class can be used directly or subclassed for custom implementations that wrap native objects (e.g., os.DirEntry, cloud storage listing info).
Initialize a directory entry.
- Parameters:
name – The entry’s base filename.
path – The entry’s full path.
is_dir – Whether this entry is a directory.
mtime – The modification time (defaults to now).
size – The file size in bytes.
- class UrlAdapterProtocol¶
Bases:
typing.ProtocolInterface a URL adapter must implement to be registered with the registry.
The methods that follow are the minimal required surface — they map 1:1 to the abstract methods of
UrlAdapterand to the operations the registry calls directly.Adapters that do not support a given operation (e.g. an HTTP read-only adapter and writes) should raise :class:
NotImplementedErrorfrom the corresponding method.- list_dir(
- url: Url,
List the immediate children of a directory URL as
UrlAdapterDirEntryobjects.
- write_binary_content_to_url( ) Url¶
Write binary content to a URL and return the URL the content landed at.
Most adapters write where asked and return
urlverbatim. Adapters whose backing store may rewrite the target on write (e.g. a versioning store that lands the content at a successor URL) return the resolved URL — callers that need to read or reference the content afterwards must use that returned URL.
- get_registered_url_aliases() dict[str, str]¶
Return the registered URL aliases.
- Returns:
A dictionary mapping alias tokens to paths.
- list_url_adapters() list[AdapterInfo]¶
List all registered URL adapters with structured metadata.
Entry-point plugins are discovered at package import time, so the list reflects everything installed via
pipin addition to the built-in adapters.- Returns:
One
AdapterInfodict per registered scheme.
- list_url_schemes() list[str]¶
List the URL schemes of all registered adapters.
Use
list_url_adapters()for full structured metadata.- Returns:
Sorted list of recognized scheme strings. Includes built-in schemes even when their backend packages are not installed, so URLs using those schemes can always be parsed.
- register_url_adapter(
- cls: type[UrlAdapter],
Class decorator that registers a
UrlAdaptersubclass.Instantiates the class with no arguments and registers it for all schemes returned by
schemes(). If instantiation or registration fails, the error is logged and the class is returned unregistered.This means adapters with optional dependencies can raise
ImportErrorin__init__— the decorator will catch it gracefully.Built-in adapters (those with all schemes in
_BUILTIN_SCHEMES) are recorded withsource="builtin"; all others getsource="runtime".- Parameters:
cls – The UrlAdapter subclass to register.
- Returns:
The class, unchanged.
- register_url_alias( ) None¶
Register an alias for a URL.
Writes the alias into the API-tier slice of the unified configuration store (
tlc.config). File / env-tier alias entries are not touched, so later reloads of those sources continue to take effect normally.- Parameters:
token – The alias token to register. Must match the regex
[A-Z][A-Z0-9_]*.path – The path to alias.
force – If True, force the registration of the alias even if it is already registered.
- Raises:
ValueError – If the token or path is invalid, or if the alias already exists (in the merged view) with a different path and
forceis False.
- unregister_url_alias(
- token: str,
Unregister a programmatically-registered URL alias.
Removes the token from the API-tier slice only. If the same token is also defined at a lower-precedence source (env, file), that value resurfaces in the merged view via the normal precedence merge.
- Parameters:
token – The alias token to unregister. Must match the regex
[A-Z][A-Z0-9_]*.- Raises:
KeyError – If the token is not present in the API-tier slice — file / env entries cannot be removed through this API; edit the underlying source to take them out.