preprocessor package

Subpackages

Submodules

preprocessor.archive module

preprocessor.archive.filter_filenames(filenames: List[str], glob: str, case: bool = False) → List[str]
preprocessor.archive.is_tarfile(archive_file) → bool

Helper to detect whether a path or a file object is referencing a valid TAR file.

preprocessor.archive.open_tarfile(archive_file: str | bytes) → TarFile

Open a TAR file from either a path or a file object.
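The path-or-file-object check described for is_tarfile can be sketched with the standard library alone. This is an illustration, not the package's implementation: looks_like_tar is a hypothetical name, and restoring the stream position afterwards is an assumption about what a caller would want.

```python
import tarfile


def looks_like_tar(archive_file) -> bool:
    """Return True if a path or a seekable binary file object holds a TAR archive.

    Hypothetical stand-in for the helper documented above.
    """
    if isinstance(archive_file, (str, bytes)):
        # Paths can be handed to the standard library check directly.
        return tarfile.is_tarfile(archive_file)

    # For file objects, remember the position so the caller can keep reading.
    position = archive_file.tell()
    try:
        tarfile.open(fileobj=archive_file).close()
        return True
    except tarfile.TarError:
        return False
    finally:
        archive_file.seek(position)
```

Because the position is restored, the same file object can be passed on to tarfile.open(fileobj=...) after the check.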

preprocessor.archive.unpack_files(archive_path, target_dir: str, glob=None, case=None, filenames=None, recursive=False) → List[str]

Unpacks the contents of the specified ZIP or TAR archive to the given target directory. Optionally, only a given list of filenames will be extracted. When a glob is passed, all filenames (either given or from the archive) will be filtered and only the matching files will be extracted.
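The glob filtering applied to the candidate filenames can be sketched as follows. filter_names is a hypothetical stand-in for filter_filenames, and the meaning of case (True for case-sensitive matching) is an assumption that this page does not confirm.

```python
import fnmatch
from typing import List


def filter_names(filenames: List[str], glob: str, case: bool = False) -> List[str]:
    """Keep only the names that match the glob (hypothetical stand-in)."""
    if case:
        # fnmatchcase never normalizes case, regardless of platform.
        return [name for name in filenames if fnmatch.fnmatchcase(name, glob)]
    # Assumed behaviour: compare lower-cased names for case-insensitive matching.
    pattern = glob.lower()
    return [name for name in filenames if fnmatch.fnmatchcase(name.lower(), pattern)]
```

With this sketch, "*.tif" selects both "IMG_01.TIF" and "img_02.tif" by default, and only the latter when case=True.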

preprocessor.cli module

preprocessor.cli.setup_logging(debug=False)
preprocessor.cli.validate_config(config)

preprocessor.config module

preprocessor.config.constructor_env_variables(loader, node)

Extracts the environment variable from the node’s value.

Parameters:
    loader (yaml.Loader): the YAML loader
    node: the current node in the YAML document

Returns:
    the parsed string that contains the value of the environment variable
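The substitution such a constructor performs can be sketched without the YAML wiring. The ${NAME} placeholder syntax, the ENV_PATTERN constant and the expand_env_variables name are all assumptions made for illustration; the pattern the package actually recognizes is not documented on this page.

```python
import os
import re

# Assumed placeholder syntax: ${NAME}. The real pattern may differ.
ENV_PATTERN = re.compile(r"\$\{([^}]+)\}")


def expand_env_variables(value: str) -> str:
    """Replace each ${NAME} with os.environ[NAME], leaving unknown names untouched."""

    def substitute(match):
        # Fall back to the original placeholder when the variable is not set.
        return os.environ.get(match.group(1), match.group(0))

    return ENV_PATTERN.sub(substitute, value)
```

In a YAML loader, a function like this would be applied to the scalar value of the node before the string is returned.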

preprocessor.config.load_config(input_file: TextIO)

preprocessor.daemon module

preprocessor.daemon.run_daemon(config, host, port, listen_queue, write_queue)

Run the preprocessing daemon, listening on a Redis queue for files to be preprocessed. After preprocessing, the filenames of the preprocessed files are pushed to the output queue.
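The listen, preprocess, publish cycle can be sketched with in-memory queues standing in for Redis. Everything here is hypothetical: run_queue_loop is not part of the package, queue.Queue replaces the Redis lists, and the loop stops when the input is empty, whereas a daemon would block and wait.

```python
import queue
from typing import Callable, List


def run_queue_loop(
    listen_queue: "queue.Queue[str]",
    write_queue: "queue.Queue[str]",
    preprocess: Callable[[str], List[str]],
) -> None:
    """Consume file paths until the input queue is empty, publishing each result."""
    while True:
        try:
            file_path = listen_queue.get_nowait()
        except queue.Empty:
            break  # a real daemon would block here instead of stopping
        # Push the name of every file produced by the preprocessing step.
        for output_name in preprocess(file_path):
            write_queue.put(output_name)
```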

preprocessor.metadata module

preprocessor.metadata.evaluate_xpath(root, xpath)
preprocessor.metadata.extract_footprint(metadata_files, footprint_extractor: str = '//gml:target/eop:Footprint/gml:multiExtentOf\n/gml:MultiSurface/gml:surfaceMembers/gml:Polygon')
preprocessor.metadata.extract_metadata_for_stac(metadata_files: Dict[str, str], product_type: str, product_level: str | None)

Temporary function that extracts the metadata necessary to create a minimal STAC item. For now, the XPaths are hardcoded here.

preprocessor.metadata.extract_product_types_and_levels(metadata_files: List, config: dict)

Extracts product_types and product_levels found in the metadata based on configured XML xpath extractors.
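Extraction driven by configured XPath expressions can be sketched with xml.etree.ElementTree, which supports only a subset of XPath. The function name, the extractor mapping and the example namespace are hypothetical, and the package may well use a different XML library and configuration layout.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional


def extract_values(
    xml_text: str,
    extractors: Dict[str, str],
    namespaces: Dict[str, str],
) -> Dict[str, Optional[str]]:
    """Evaluate one XPath per field and return the first matching text, or None."""
    root = ET.fromstring(xml_text)
    result: Dict[str, Optional[str]] = {}
    for field, xpath in extractors.items():
        element = root.find(xpath, namespaces)
        has_text = element is not None and element.text
        result[field] = element.text.strip() if has_text else None
    return result
```

Returning None for a missing field lets the caller decide whether an absent product level is an error or simply not applicable.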

preprocessor.metadata.parse_polygons_gsc(elem)
preprocessor.metadata.parse_ring(string)
preprocessor.metadata.serialize_coord_list(coords)
preprocessor.metadata.update_config_by_product_types_and_levels(metadata_files: List, config: dict)

Extracts product_type and product_level based on the config, then updates the config dict with the type-based configuration.

preprocessor.preprocess module

preprocessor.preprocess.copy_files(source, target, move=False)
preprocessor.preprocess.custom_postprocessor(source_dir, target_dir, preprocess_config, path, data_file_globs: List[str] = [], args=None, kwargs=None)

Postprocessing step for custom postprocessing.

preprocessor.preprocess.custom_preprocessor(source_dir, target_dir, preprocess_config, path, data_file_globs: List[str] = [], args=None, kwargs=None)

Preprocessing step for custom preprocessing.

preprocessor.preprocess.preprocess_browse(config: dict, browse_type: str, browse_report: dict, browse: dict, use_dir: str | None = None)
preprocessor.preprocess.preprocess_file(config: dict, file_path: str, use_dir: str | None = None)

Runs the preprocessing of a single file.

preprocessor.preprocess.preprocess_internal(preprocess_config, previous_step='unpack')

preprocessor.preprocessor_fix_core_dimap_image_ref module

preprocessor.preprocessor_fix_core_dimap_image_ref.attempt_to_create_hdr(source_dir, target_dir, preprocess_config, output: str = 'imagery.hdr')
preprocessor.preprocessor_fix_core_dimap_image_ref.attempt_to_set_projection(source_dir, target_dir, preprocess_config)
preprocessor.preprocessor_fix_core_dimap_image_ref.rename_reference_dimap(source_dir: PathLike, target_dir: PathLike, preprocess_config: dict, search: str, replace: str, output_file_name: str = 'imagery.dim')

preprocessor.preprocessor_iceye_helpers module

preprocessor.preprocessor_iceye_helpers.extract_rpc_metadata(root: Element)
preprocessor.preprocessor_iceye_helpers.extract_rpc_to_text_file(source_dir: PathLike, target_dir: PathLike, preprocess_config: dict, glob: str = '*ICEYE.xml', output_file_name: str = 'ICEYE.rpc')
preprocessor.preprocessor_iceye_helpers.get_rpc_value(tag, root)

preprocessor.stac module

class preprocessor.stac.HrefSortableAsset(href: str, title: str | None = None, description: str | None = None, media_type: str | None = None, roles: List[str] | None = None, extra_fields: Dict[str, Any] | None = None)

Bases: Asset

Helper class that enables sorting Assets by href.

description: str | None

A description of the Asset providing additional details, such as how it was processed or created. CommonMark 0.29 syntax MAY be used for rich text representation.

extra_fields: Dict[str, Any]

Optional, additional fields for this asset. This is used by extensions as a way to serialize and deserialize properties on asset object JSON.

href: str

Link to the asset object. Relative and absolute links are both allowed.

media_type: str | None

Optional description of the media type. Registered Media Types are preferred. See MediaType for common media types.

owner: Item | Collection | None

The Item or Collection that this asset belongs to, or None if it has no owner.

roles: List[str] | None

Optional semantic roles (e.g. thumbnail, overview, data, metadata) of the asset.

title: str | None

Optional displayed title for clients and users.
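Sorting by href, as HrefSortableAsset is described as enabling, only requires a less-than comparison on that attribute. The dataclass below is a hypothetical stand-in used to show the idea without depending on pystac; it is not the class documented above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SortableAsset:
    """Minimal asset-like record that orders itself by href (hypothetical)."""

    href: str
    title: Optional[str] = None

    def __lt__(self, other: "SortableAsset") -> bool:
        # sorted() and list.sort() only need __lt__ to order items.
        return self.href < other.href


assets = [SortableAsset("b/image.tif"), SortableAsset("a/meta.xml", title="Metadata")]
ordered = sorted(assets)
```

Ordering assets this way keeps the serialized output stable between runs, which is one plausible reason for such a helper.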

class preprocessor.stac.STATS_APPROX(value)

Bases: IntEnum

An enumeration.

APPROX_OK = 1
APPROX_OVERVIEW = 2
NO_APROX = 0
preprocessor.stac.create_simple_stac_item(preprocessor_config: dict, root_config: dict, upload_files: Dict[str, str], extra_files: Dict[str, str], product_type: str, product_level: str | None)

Temporary method that creates a minimal STAC item from information about the uploaded products and the uploaded metadata files.

Accepts:
    upload_files: dictionary of uploaded files (images), where each key is a local path and each value is the remote path.
    extra_files: dictionary of extra files (sidecar or metadata), where each key is a local path and each value is the remote path.

The metadata file that is read to create the STAC information is assumed to be the first one returned by the iterator.

preprocessor.stac.create_stac_asset(local_path: str, remote_path: str, root_config: Dict[str, Any], asset_config: Dict[str, Any], name: str = '', aggregator: Dict = {}, is_image: bool = False, compute_statistics: bool = False, approx: STATS_APPROX = STATS_APPROX.APPROX_OK, force_histogram_min_value: float | None = None, force_histogram_max_value: float | None = None)

Helper function creating a STAC asset and filling it with image/metadata properties based on config.

preprocessor.stac.extract_asset_config_by_glob(local_path: str, stac_item_structure: Dict[str, Any], config: Dict[str, Any])

preprocessor.util module

class preprocessor.util.Timer

Bases: object

Helper timer class to allow logging of timing values

property elapsed
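A timer with an elapsed property is commonly written as a context manager. The sketch below assumes that shape; SimpleTimer is a hypothetical name, and whether Timer itself is used in a with statement is not stated on this page.

```python
import time


class SimpleTimer:
    """Context manager that records how long its block took (hypothetical)."""

    def __init__(self):
        self.start = None
        self.end = None

    def __enter__(self):
        self.start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.end = time.perf_counter()
        return False  # never swallow exceptions raised inside the block

    @property
    def elapsed(self) -> float:
        """Seconds since entering the block, measured up to exit or up to now."""
        if self.start is None:
            return 0.0
        end = self.end if self.end is not None else time.perf_counter()
        return end - self.start
```

A caller would wrap a step in "with SimpleTimer() as timer:" and log timer.elapsed afterwards.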
preprocessor.util.apply_gdal_config_options(preprocessor_config)

Applies the config-specific GDAL configuration options for a given preprocessing step. Returns the original values so that they can be restored once preprocessing is done.
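The apply-and-remember pattern described above can be sketched independently of GDAL by passing the getter and setter in. apply_options is a hypothetical name; with the GDAL Python bindings, gdal.GetConfigOption and gdal.SetConfigOption would be the natural accessors, but a plain dict is used here so the sketch runs without GDAL installed.

```python
from typing import Callable, Dict, Optional


def apply_options(
    options: Dict[str, str],
    get_option: Callable[[str], Optional[str]],
    set_option: Callable[[str, Optional[str]], None],
) -> Dict[str, Optional[str]]:
    """Set every option and return the values that were in effect before."""
    original: Dict[str, Optional[str]] = {}
    for key, value in options.items():
        original[key] = get_option(key)  # None when the option was unset
        set_option(key, value)
    return original


# A dict stands in for the GDAL configuration store in this example.
store = {"GDAL_CACHEMAX": "256"}
previous = apply_options(
    {"GDAL_CACHEMAX": "1024", "CPL_DEBUG": "ON"}, store.get, store.__setitem__
)
```

Passing the returned mapping back through the same function restores the earlier state once the step has finished.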

preprocessor.util.convert_unit(size_in_bytes, unit='B')

Convert the size from bytes to other units such as KB, MB, GB or TB.
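A unit conversion of this kind can be sketched as a division by a power of 1024. The base of 1024 (rather than 1000), the accepted unit names and the convert_size name are assumptions for illustration, not the documented behaviour of convert_unit.

```python
# Assumed binary multiples: 1 KB = 1024 B, 1 MB = 1024 KB, and so on.
UNIT_EXPONENTS = {"B": 0, "KB": 1, "MB": 2, "GB": 3, "TB": 4}


def convert_size(size_in_bytes: float, unit: str = "B") -> float:
    """Convert a byte count to the requested unit (hypothetical stand-in)."""
    try:
        exponent = UNIT_EXPONENTS[unit.upper()]
    except KeyError:
        raise ValueError(f"unsupported unit: {unit!r}") from None
    return size_in_bytes / (1024 ** exponent)
```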

preprocessor.util.flatten(llist)
preprocessor.util.get_all_data_files(source_dir, preprocessor_config, data_file_globs=[])

Gets all unique data file paths in the folder that match any of the globs in the ‘data_file_globs’ configuration.
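Collecting unique matches across several globs can be sketched with the glob module and a set. collect_data_files is a hypothetical name, and searching subdirectories recursively is an assumption; the page does not say how deep get_all_data_files looks.

```python
import glob
import os
from typing import Iterable, List


def collect_data_files(source_dir: str, data_file_globs: Iterable[str]) -> List[str]:
    """Return the sorted, de-duplicated files under source_dir matching any glob."""
    matches = set()
    for pattern in data_file_globs:
        # "**" with recursive=True also matches files directly in source_dir.
        candidates = glob.glob(os.path.join(source_dir, "**", pattern), recursive=True)
        matches.update(path for path in candidates if os.path.isfile(path))
    return sorted(matches)
```

Using a set means a file matched by two patterns is reported only once.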

preprocessor.util.get_size_in_bytes(file_path, unit)

Get size of file at given path in bytes

preprocessor.util.pairwise(col)
preprocessor.util.replace_ext(filename: str, new_ext: str, force_dot: bool = True) → str
preprocessor.util.set_gdal_options(config_options)

Sets a key/value dictionary of config options in GDAL.

preprocessor.util.workdir(config: dict, use_dir: str = None)

Module contents