harvester package
Subpackages
- harvester.endpoint package
- Submodules
- harvester.endpoint.oads module
IndexFileParser
IndexFileParser.apply_assets()
IndexFileParser.apply_eo_extension()
IndexFileParser.apply_oads_extension()
IndexFileParser.apply_properties()
IndexFileParser.apply_sar_extension()
IndexFileParser.apply_sat_extension()
IndexFileParser.apply_version_extension()
IndexFileParser.apply_view_extension()
IndexFileParser.parse()
IndexFileParser.parse_bbox()
IndexFileParser.parse_geometry()
OADSEndpoint
camel_to_snake_case()
create_oads_endpoint()
pairwise()
pairwise_iterative()
parse_coord_list()
tripletwise()
- harvester.endpoint.opensearch module
- harvester.endpoint.query module
- harvester.endpoint.request module
- harvester.endpoint.stacapi module
- Module contents
- harvester.filescheme package
Submodules
harvester.app module
app.py
Contains functionality for running the application, which waits on Redis messages.
harvester.cli module
cli.py
Contains the command-line interface.
harvester.exceptions module
harvester.filter module
harvester.harvester module
- harvester.harvester.init_resource(harvest_config: HarvesterConfig, filesystem_config: Dict[str, FilesystemConfig]) → Resource [source]
- harvester.harvester.main(config: HarvesterAppConfig, harvester_name: str)[source]
harvester.model module
- class harvester.model.FileMatcherConfig(root_path: str, filesystem: str, asset_regex_map: Dict[str, str], id_regex: str, datetime_regex: str)[source]
Bases: object
- asset_regex_map: Dict[str, str]
- datetime_regex: str
- filesystem: str
- id_regex: str
- root_path: str
- class harvester.model.FilterConfig(context: Dict = <factory>, expression: Dict = <factory>)[source]
Bases: object
- context: Dict
- expression: Dict
- class harvester.model.FormatConfig(type: str, json: harvester.model.JSONConfig | NoneType = None, atom_xml: harvester.model.ATOMXMLConfig | NoneType = None)[source]
Bases: object
- atom_xml: ATOMXMLConfig | None = None
- json: JSONConfig | None = None
- type: str
- class harvester.model.HarvesterAppConfig(harvesters: Dict[str, harvester.model.HarvesterConfig] = <factory>, redis: harvester.model.RedisConfig = RedisConfig(host='vs-redis-master', port=6379), filesystems: Dict[str, vs_common.model.FilesystemConfig] = <factory>)[source]
Bases: object
- filesystems: Dict[str, FilesystemConfig]
- harvesters: Dict[str, HarvesterConfig]
- redis: RedisConfig = RedisConfig(host='vs-redis-master', port=6379)
- class harvester.model.HarvesterConfig(resource: harvester.model.ResourceConfig, filter: Union[harvester.model.FilterConfig, NoneType] = None, output: harvester.model.OutputType = <OutputType.queue: 'queue'>, queue: Union[str, NoneType] = None, postprocessors: Union[List[harvester.model.PostprocessorConfig], NoneType] = None)[source]
Bases: object
- filter: FilterConfig | None = None
- output: OutputType = 'queue'
- postprocessors: List[PostprocessorConfig] | None = None
- queue: str | None = None
- resource: ResourceConfig
- class harvester.model.JSONConfig(property_mapping: Dict[str, str])[source]
Bases: object
- property_mapping: Dict[str, str]
- class harvester.model.OADSConfig(url: str, use_oads_ext: bool = False)[source]
Bases: object
- url: str
- use_oads_ext: bool = False
- class harvester.model.OpenSearchConfig(url: str, query: harvester.model.QueryConfig, format: harvester.model.FormatConfig)[source]
Bases: object
- format: FormatConfig
- query: QueryConfig
- url: str
- class harvester.model.OutputType(value)[source]
Bases: str, Enum
An enumeration.
- console = 'console'
- queue = 'queue'
- class harvester.model.PostprocessorConfig(type: harvester.model.PostprocessorType, process: str, kwargs: Dict[str, Any] = <factory>)[source]
Bases: object
- kwargs: Dict[str, Any]
- process: str
- type: PostprocessorType
- class harvester.model.PostprocessorType(value)[source]
Bases: str, Enum
An enumeration.
- builtin = 'builtin'
- external = 'external'
- class harvester.model.QueryConfig(time: harvester.model.TimeConfig, bbox: str, collection: Union[str, NoneType] = None, extra_params: Dict[str, str] = <factory>)[source]
Bases: object
- bbox: str
- collection: str | None = None
- extra_params: Dict[str, str]
- time: TimeConfig
- class harvester.model.RedisConfig(host: str = 'vs-redis-master', port: int = 6379)[source]
Bases: object
- host: str = 'vs-redis-master'
- port: int = 6379
- class harvester.model.ResourceConfig(type: harvester.model.ResourceType, stacapi: harvester.model.STACAPIConfig | NoneType = None, staccatalog: harvester.model.STACCatalogConfig | NoneType = None, filematcher: harvester.model.FileMatcherConfig | NoneType = None, oads: harvester.model.OADSConfig | NoneType = None, opensearch: harvester.model.OpenSearchConfig | NoneType = None)[source]
Bases: object
- filematcher: FileMatcherConfig | None = None
- oads: OADSConfig | None = None
- opensearch: OpenSearchConfig | None = None
- stacapi: STACAPIConfig | None = None
- staccatalog: STACCatalogConfig | None = None
- type: ResourceType
- class harvester.model.ResourceType(value)[source]
Bases: str, Enum
An enumeration.
- FileMatcher = 'filematcher'
- OADS = 'oads'
- OpenSearch = 'opensearch'
- STACAPI = 'stacapi'
- STACCatalog = 'staccatalog'
- class harvester.model.STACAPIConfig(url: str, query: harvester.model.QueryConfig)[source]
Bases: object
- query: QueryConfig
- url: str
- class harvester.model.STACCatalogConfig(root_path: str, filesystem: str, collection_id: str | NoneType = None, deduplicate: bool = False)[source]
Bases: object
- collection_id: str | None = None
- deduplicate: bool = False
- filesystem: str
- root_path: str
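The configuration classes above read as plain dataclasses, and the enumerations subclass both `str` and `Enum`. The following minimal sketch uses local stand-in classes that mirror a few of the documented signatures. It does not import `harvester`, and the `build_resource` helper is hypothetical. It shows how a string value maps onto an enum member and how a `ResourceConfig` can carry the optional section named by its `type`.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


# Local stand-ins mirroring the documented signatures; these are NOT
# imported from the harvester package.
class ResourceType(str, Enum):
    OADS = "oads"
    STACAPI = "stacapi"


@dataclass
class OADSConfig:
    url: str
    use_oads_ext: bool = False


@dataclass
class ResourceConfig:
    type: ResourceType
    oads: Optional[OADSConfig] = None


def build_resource(raw: dict) -> ResourceConfig:
    """Hypothetical helper: turn a plain dict into the stand-in dataclasses."""
    resource_type = ResourceType(raw["type"])  # look up the member by its value
    oads = OADSConfig(**raw["oads"]) if "oads" in raw else None
    return ResourceConfig(type=resource_type, oads=oads)


config = build_resource(
    {"type": "oads", "oads": {"url": "https://example.com/oads"}}
)
```

Because the enum members are also `str` instances, `config.type == "oads"` holds, which is convenient when values come from parsed YAML or JSON.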
harvester.output module
- harvester.output.create_console_handler(config: HarvesterAppConfig, harvester_name: str) → OutputHandler [source]
- harvester.output.create_queue_handler(config: HarvesterAppConfig, harvester_name: str) → OutputHandler [source]
- harvester.output.get_output_handler(config: HarvesterAppConfig, harvester_name: str) → OutputHandler [source]
harvester.postprocess module
- harvester.postprocess.apply_postprocessing(items: Iterator[dict], postprocessors: List[PostprocessorConfig]) → Iterator[dict] [source]
Wrapper to correctly handle errors in postprocessing.
- Parameters:
items (Iterator[dict]) – Items to apply postprocessing to
postprocessors (List[PostprocessorConfig]) – List of postprocess configurations
- Yields:
Iterator[dict] – Items with postprocessing applied to them
- harvester.postprocess.get_postprocessor(config: PostprocessorConfig) → Callable[[...], Dict] [source]
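As an illustration of the error-handling wrapper described for `apply_postprocessing`, here is a minimal sketch. The documentation does not say what happens when a postprocessor fails; the sketch assumes the failing item is logged and skipped. It also takes plain callables rather than `PostprocessorConfig` objects, and the names `apply_postprocessing_sketch`, `add_flag`, and `require_id` are hypothetical.

```python
import logging
from typing import Callable, Dict, Iterable, Iterator, List

logger = logging.getLogger(__name__)

Postprocessor = Callable[[Dict], Dict]


def apply_postprocessing_sketch(
    items: Iterable[Dict], postprocessors: List[Postprocessor]
) -> Iterator[Dict]:
    """Lazily apply each postprocessor in order; skip items that fail.

    Assumption: failures are logged and the item is dropped. The real
    function may behave differently.
    """
    for item in items:
        try:
            for process in postprocessors:
                item = process(item)
        except Exception:
            logger.exception("Postprocessing failed, skipping item")
            continue
        yield item


def add_flag(item: Dict) -> Dict:
    # Return a copy with an extra key instead of mutating the input.
    return {**item, "processed": True}


def require_id(item: Dict) -> Dict:
    if "id" not in item:
        raise ValueError("item has no id")
    return item


results = list(
    apply_postprocessing_sketch(
        [{"id": "a"}, {"title": "no id"}], [require_id, add_flag]
    )
)
```

Keeping the wrapper a generator means items are processed one at a time, so a single bad item does not stop the rest of the harvest from being yielded.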
harvester.resource module
- class harvester.resource.Endpoint(url: str)[source]
Bases: Resource
Endpoints are resources that use a search protocol (or something similar) to harvest items. Thus, they are always associated with a specific URL.
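The relationship between `Resource` and `Endpoint` might look roughly like the sketch below. Only the `Endpoint(url: str)` signature and its `Resource` base come from this page; the `harvest` method name, its return type, and the `StaticEndpoint` subclass are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterator


class Resource(ABC):
    """Stand-in base class; the `harvest` method name is an assumption."""

    @abstractmethod
    def harvest(self) -> Iterator[Dict]:
        ...


class Endpoint(Resource):
    """A resource bound to a URL, matching the documented Endpoint(url: str)."""

    def __init__(self, url: str) -> None:
        self.url = url


class StaticEndpoint(Endpoint):
    """Hypothetical subclass that yields canned items instead of querying."""

    def harvest(self) -> Iterator[Dict]:
        yield {"id": "item-1", "source": self.url}


endpoint = StaticEndpoint("https://example.com/search")
items = list(endpoint.harvest())
```

In this sketch `Endpoint` itself stays abstract, so each search protocol would supply its own `harvest` implementation while sharing the URL handling.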