harvester.filescheme package

Submodules

harvester.filescheme.filematcher module

class harvester.filescheme.filematcher.FileMatcherScheme(filesystem_config: FilesystemConfig, root_path: str, asset_regex_map: Dict[str, str], id_regex: str, datetime_regex: str)

Bases: FileScheme

harvest() → Iterator[dict]

Starts the harvesting of the resource, returning an iterator of the harvested items.
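The reference gives no description of the three regex arguments, so the following is only a minimal sketch of how regex-based matching of this kind could work. It assumes that id_regex and datetime_regex each capture their value in the first group and that asset_regex_map maps asset names to patterns tested against each path. The function match_files, the item layout, and the sample paths are hypothetical and are not part of the harvester package.

```python
import re
from typing import Dict, Iterator, List


def match_files(
    paths: List[str],
    asset_regex_map: Dict[str, str],
    id_regex: str,
    datetime_regex: str,
) -> Iterator[dict]:
    """Group file paths into items keyed by the ID captured by id_regex.

    Hypothetical helper: the real FileMatcherScheme may behave differently.
    """
    items: Dict[str, dict] = {}
    for path in paths:
        id_match = re.search(id_regex, path)
        if id_match is None:
            continue  # path does not belong to any item
        item_id = id_match.group(1)
        item = items.setdefault(
            item_id, {"id": item_id, "datetime": None, "assets": {}}
        )
        # Assumption: the datetime is taken from the first capture group.
        dt_match = re.search(datetime_regex, path)
        if dt_match is not None:
            item["datetime"] = dt_match.group(1)
        # Assign the path to every asset whose pattern matches it.
        for asset_name, asset_regex in asset_regex_map.items():
            if re.search(asset_regex, path):
                item["assets"][asset_name] = path
    yield from items.values()


paths = [
    "/data/S2A_20230115_T32.tif",
    "/data/S2A_20230115_T32.xml",
    "/data/README.txt",
]
items = list(
    match_files(
        paths,
        asset_regex_map={"data": r"\.tif$", "metadata": r"\.xml$"},
        id_regex=r"/([A-Z0-9]+_\d{8}_T\d+)\.",
        datetime_regex=r"_(\d{8})_",
    )
)
```

In this sketch the two files that share an ID end up as assets of a single item, while README.txt is skipped because id_regex does not match it.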

harvester.filescheme.filematcher.create_filematcherscheme(resource_config: ResourceConfig, filesystem_configs: Dict[str, FilesystemConfig]) → FileScheme

harvester.filescheme.stac_catalog module

class harvester.filescheme.stac_catalog.STACCatalogScheme(filesystem_config: FilesystemConfig, root_path: str, collection_id: str | None = None, deduplicate: bool = False)

Bases: FileScheme

FileScheme for STAC catalogs. Recurses into sub-catalogs and harvests all items it finds along the way.

Parameters:
  • filesystem_config (FilesystemConfig) – configuration of the filesystem to search

  • root_path (str) – path at which the search starts; sub-catalogs below it are searched recursively

  • deduplicate (bool, optional) – Whether to deduplicate the harvested items. Defaults to False.

harvest() → Iterator[dict]

Starts the harvesting of the resource, returning an iterator of the harvested items.
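The recursion described for STACCatalogScheme can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: the catalogs live in an in-memory dictionary instead of a real filesystem, sub-catalogs and items are found through links with rel "child" and rel "item" as in the STAC specification, and deduplication is assumed to be by item ID. The names FILES and harvest_catalog are hypothetical.

```python
from typing import Dict, Iterator, Optional, Set

# In-memory stand-in for a filesystem: path -> parsed JSON document.
FILES: Dict[str, dict] = {
    "/root/catalog.json": {
        "type": "Catalog",
        "id": "root",
        "links": [
            {"rel": "child", "href": "/root/a/catalog.json"},
            {"rel": "item", "href": "/root/item1.json"},
        ],
    },
    "/root/a/catalog.json": {
        "type": "Catalog",
        "id": "a",
        "links": [
            {"rel": "item", "href": "/root/a/item2.json"},
            # Same item linked a second time, to demonstrate deduplication.
            {"rel": "item", "href": "/root/item1.json"},
        ],
    },
    "/root/item1.json": {"type": "Feature", "id": "item1"},
    "/root/a/item2.json": {"type": "Feature", "id": "item2"},
}


def harvest_catalog(
    path: str,
    deduplicate: bool = False,
    _seen: Optional[Set[str]] = None,
) -> Iterator[dict]:
    """Yield every item reachable from the catalog at path."""
    seen = _seen if _seen is not None else set()
    catalog = FILES[path]
    for link in catalog.get("links", []):
        if link["rel"] == "child":
            # Recurse into the sub-catalog, sharing the set of seen IDs.
            yield from harvest_catalog(link["href"], deduplicate, seen)
        elif link["rel"] == "item":
            item = FILES[link["href"]]
            if deduplicate:
                if item["id"] in seen:
                    continue  # assumption: duplicates are detected by ID
                seen.add(item["id"])
            yield item


all_ids = [item["id"] for item in harvest_catalog("/root/catalog.json")]
unique_ids = [
    item["id"]
    for item in harvest_catalog("/root/catalog.json", deduplicate=True)
]
```

Because the function is a generator, items are produced lazily while the catalog tree is walked, which matches the Iterator[dict] return type of harvest().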

harvester.filescheme.stac_catalog.create_staccatalogscheme(resource_config: ResourceConfig, filesystem_configs: Dict[str, FilesystemConfig]) → FileScheme

Module contents

class harvester.filescheme.FileMatcherScheme(filesystem_config: FilesystemConfig, root_path: str, asset_regex_map: Dict[str, str], id_regex: str, datetime_regex: str)

Bases: FileScheme

harvest() → Iterator[dict]

Starts the harvesting of the resource, returning an iterator of the harvested items.

class harvester.filescheme.STACCatalogScheme(filesystem_config: FilesystemConfig, root_path: str, collection_id: str | None = None, deduplicate: bool = False)

Bases: FileScheme

FileScheme for STAC catalogs. Recurses into sub-catalogs and harvests all items it finds along the way.

Parameters:
  • filesystem_config (FilesystemConfig) – configuration of the filesystem to search

  • root_path (str) – path at which the search starts; sub-catalogs below it are searched recursively

  • deduplicate (bool, optional) – Whether to deduplicate the harvested items. Defaults to False.

harvest() → Iterator[dict]

Starts the harvesting of the resource, returning an iterator of the harvested items.

harvester.filescheme.create_filematcherscheme(resource_config: ResourceConfig, filesystem_configs: Dict[str, FilesystemConfig]) → FileScheme

harvester.filescheme.create_staccatalogscheme(resource_config: ResourceConfig, filesystem_configs: Dict[str, FilesystemConfig]) → FileScheme

harvester.filescheme.get_filescheme(resource_config: ResourceConfig, filesystem_configs: Dict[str, FilesystemConfig]) → FileScheme | None

Retrieves the file scheme for the given resource from SCHEME_MAP, if a matching entry is found.

Parameters:

  • resource_config (ResourceConfig) – Resource configuration

  • filesystem_configs (Dict[str, FilesystemConfig]) – Filesystem configurations

Returns:

Initialized file scheme from SCHEME_MAP, or None if no matching entry is found

Return type:

Optional[FileScheme]
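The lookup behaviour described for get_filescheme can be sketched as a dictionary of factory functions. This is a minimal illustration only: plain dictionaries stand in for ResourceConfig and FilesystemConfig, the "type" key used to select the factory is an assumption, and DemoScheme, create_demo_scheme, and get_scheme are hypothetical names. Only SCHEME_MAP is named in the documentation above, and its real contents are not shown there.

```python
from typing import Callable, Dict, Iterator, Optional


class DemoScheme:
    """Hypothetical stand-in for a FileScheme subclass."""

    def __init__(self, root_path: str) -> None:
        self.root_path = root_path

    def harvest(self) -> Iterator[dict]:
        # A real scheme would walk the filesystem; this one yields nothing.
        return iter(())


def create_demo_scheme(
    resource_config: dict, filesystem_configs: Dict[str, dict]
) -> DemoScheme:
    """Factory with the same argument shape as the create_* functions."""
    return DemoScheme(resource_config["root_path"])


# Assumed layout: scheme type name -> factory function.
SCHEME_MAP: Dict[str, Callable[[dict, Dict[str, dict]], DemoScheme]] = {
    "demo": create_demo_scheme,
}


def get_scheme(
    resource_config: dict, filesystem_configs: Dict[str, dict]
) -> Optional[DemoScheme]:
    """Return an initialized scheme, or None when the type is not mapped."""
    factory = SCHEME_MAP.get(resource_config.get("type", ""))
    if factory is None:
        return None
    return factory(resource_config, filesystem_configs)


found = get_scheme({"type": "demo", "root_path": "/data"}, {})
missing = get_scheme({"type": "unknown"}, {})
```

Returning None instead of raising lets the caller decide how to handle a resource with no matching scheme, which is consistent with the FileScheme | None return type.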