Operating on Kubernetes
VS software is shipped as a set of Helm Charts, where each component has a meaningful default set of configuration values. A set of values needs to be created for each deployment of VS.
The important part of the initialization is the configuration. The values.yaml file is structured in YAML as detailed below. It can contain sections for each component, as well as a global section accessible by all individual components.
A full deployment configuration schema, enabling strict validation of the configuration against it, will be released soon.
The following section contains just an extract of available keys and example values for the vs-deployment chart. To find out all possible configurations, please refer to the Helm configuration reference.
To go straight to creating your first earth observation data collection in VS, follow section Create a new collection step by step.
Global configurations
Values under the global key contain mainly parameters that more than one component needs to set up its behavior. Examples are: database configuration, collections, product types and layers.
database and django
global:
  env:
    DJANGO_MAIL: office@eox.at
    DJANGO_PASSWORD: 7xtMd62&bY#I
    DJANGO_USER: vs_admin
    DB_NAME: vs_db
    DB_PORT: "5432"
    DB_PW: Go-J_eOUvj2k
    DB_USER: vs_user
collections
In the collections section, the collections are set up and it is defined which products, based on product_type, will be inserted into them. The product_types list must contain only types defined in the productTypes section, and the coverage_types allowed for a product_type must be a subset of those configured for the whole collection. More information about main EOxServer models.
global:
  collections:
    COLLECTION:
      product_types:
        - PL00
      coverage_types:
        - int16_grayscale
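The two consistency rules described above (product types must be defined, coverage types must be a subset of the collection's) can be checked before deploying. The following is a minimal sketch in Python, assuming the global section has already been loaded into plain dictionaries. The function name check_collections and the product_type_coverages argument (product type name mapped to the coverage types it uses) are hypothetical helpers for illustration and are not part of VS.

```python
def check_collections(collections, product_type_coverages):
    """Return a list of human-readable problems found in `collections`.

    collections: mapping of collection name -> {"product_types": [...],
        "coverage_types": [...]}, shaped like global.collections.
    product_type_coverages: hypothetical mapping of product type name ->
        list of coverage types used by that product type.
    """
    problems = []
    for name, conf in collections.items():
        allowed = set(conf.get("coverage_types", []))
        for ptype in conf.get("product_types", []):
            if ptype not in product_type_coverages:
                problems.append(f"{name}: unknown product type {ptype!r}")
                continue
            # Coverage types of the product type must be a subset of
            # those configured for the whole collection.
            extra = set(product_type_coverages[ptype]) - allowed
            if extra:
                problems.append(
                    f"{name}: {ptype!r} uses coverage types not allowed "
                    f"by the collection: {sorted(extra)}"
                )
    return problems


collections = {
    "COLLECTION": {
        "product_types": ["PL00"],
        "coverage_types": ["int16_grayscale"],
    }
}
print(check_collections(collections, {"PL00": ["int16_grayscale"]}))
```

An empty list means that no inconsistency was found under these assumptions.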
productTypes
This section defines product_type related information. It is a list of possible product types, where each entry defines:
filter - used to register new products into the correct product_type by matching against STAC Item properties (metadata),
browses - which renderings will be generated,
coverages - the mapping of the differently named STAC assets to coverage_types,
defaultBrowse - selects one of the existing browses and sets it as the default rendering,
collections - the names of the collections into which a new product will be ingested.
masks registration is not yet fully implemented in View Server 2.
More information about main EOxServer models.
global:
  productTypes:
    - name: HRA_MS4_1C
      defaultBrowse: TRUE_COLOR
      filter:
        product_type: HRA_MS4_1C
      collections:
        - Deimos-HRA_MS4_1C
      coverages:
        RGBNir:
          assets:
            - ms
        Pan:
          assets:
            - pan
      masks:
        - name: validity
          validity: true
      browses:
        TRUE_COLOR:
          asset: browse # optional name of asset to register as Browse
          red:
            expression: red
            range: [0, 1000]
            nodata: 0
          green:
            expression: green
            range: [0, 1000]
            nodata: 0
          blue:
            expression: blue
            range: [0, 1000]
            nodata: 0
        FALSE_COLOR:
          red:
            expression: nir
            range: [0, 1800]
            nodata: 0
          green:
            expression: red
            range: [0, 1000]
            nodata: 0
          blue:
            expression: green
            range: [0, 1000]
            nodata: 0
        PAN:
          grey:
            expression: pan
            range: [0, 1600]
            nodata: 0
        NDVI:
          grey:
            expression: (nir-red)/(nir+red)
            range: [-1, 1]
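To illustrate how a filter could select the product type for an incoming STAC Item, here is a minimal sketch in Python. It assumes that a filter matches when every key/value pair in it equals the corresponding Item property; the actual matching logic of the registrar may differ. The function name matching_product_types and the sample item properties are hypothetical.

```python
def matching_product_types(product_types, item_properties):
    """Return the names of product types whose filter matches the item.

    Assumption: a filter matches when each of its key/value pairs is
    equal to the STAC Item property of the same name. A product type
    with an empty or missing filter therefore matches every item.
    """
    matches = []
    for ptype in product_types:
        flt = ptype.get("filter", {})
        if all(item_properties.get(key) == value for key, value in flt.items()):
            matches.append(ptype["name"])
    return matches


product_types = [
    {"name": "HRA_MS4_1C", "filter": {"product_type": "HRA_MS4_1C"}},
]
# Hypothetical STAC Item properties, for illustration only.
item_properties = {"product_type": "HRA_MS4_1C", "eo:cloud_cover": 12}
print(matching_product_types(product_types, item_properties))
```

Under these assumptions, the sample item is assigned to the HRA_MS4_1C product type, while an item with a different product_type property matches nothing.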
coverageTypes
Allows defining a new coverage_type when it is not contained in the list of predefined ones.
By default, View Server contains the following coverage_types:
Sentinel 2 data - coverage_type named S2_RGBNir
More information about main EOxServer models.
global:
  coverageTypes:
    - data_type: "Uint16"
      name: "BGR"
      bands:
        - definition: "http://www.opengis.net/def/property/OGC/0/Radiance"
          description: "Blue Channel"
          gdal_interpretation: "BlueBand"
          identifier: "blue"
          name: "blue"
          nil_values:
            - reason: "http://www.opengis.net/def/nil/OGC/0/unknown"
              value: 0
          uom: "W.m-2.Sr-1"
          significant_figures: 5
          allowed_value_ranges:
            -
              - 0
              - 65535
        - definition: "http://www.opengis.net/def/property/OGC/0/Radiance"
          description: "Red Channel"
          gdal_interpretation: "RedBand"
          identifier: "red"
          name: "red"
          nil_values:
            - reason: "http://www.opengis.net/def/nil/OGC/0/unknown"
              value: 0
          uom: "W.m-2.Sr-1"
          significant_figures: 5
          allowed_value_ranges:
            -
              - 0
              - 65535
        - definition: "http://www.opengis.net/def/property/OGC/0/Radiance"
          description: "Green Channel"
          gdal_interpretation: "GreenBand"
          identifier: "green"
          name: "green"
          nil_values:
            - reason: "http://www.opengis.net/def/nil/OGC/0/unknown"
              value: 0
          uom: "W.m-2.Sr-1"
          significant_figures: 5
          allowed_value_ranges:
            -
              - 0
              - 65535
storage
Here, the three relevant storages can be configured: the source, data and cache storages.
The source storage defines the locations from which the original files will be downloaded to be preprocessed. Preprocessed images and metadata will then be uploaded to the data storage, which is also used by the registrar during registration. The cache service will cache images on the cache storage.
Each storage definition uses the same structure and can target various types of storages, such as OpenStack Swift, s3 or local.
These storage definitions will be used in the appropriate sections.
global:
  source:
    type: swift
    username:
    password:
    project_name:
    project_id:
    region_name:
    auth_url:
    user_domain_name:
    user_domain_id:
    project_domain_name:
    project_domain_id:
  data:
    public:
      type: swift
      ...
  cache:
    type: swift
    ...
layers
This section defines how the layers shall be cached and how they are configured in the client.
There is a difference between the concepts of parentLayers and subLayers.
If the layer.parentLayer value is equal to layer.id, all of its properties and values are treated as a full layer in the client configuration.
If layer.parentLayer and layer.id are not equal, a new cache tileset is created with the given id and grids definitions. In the client, such a subLayer is represented only as a display option of its parentLayer.
The subLayer definitions correspond to the browses defined in the product_type values. Each WMTS subLayer tileset created in the cache references the WMS layer of a collection in the renderer with the same name. The subLayer.id should therefore be composed in the following manner: collection.name__browse.name. The double underscore is the default separator and is configurable.
Full configuration schema of client - search for layers.
global:
  layers:
    - id: VHR_IMAGE_2018_Level_1
      title: VHR IMAGE 2018 Level 1
      displayColor: "#eb3700"
      parentLayer: VHR_IMAGE_2018_Level_1
      maxZoom: 18
      visible: false
      grids: &defaultGridOptions
        - name: WGS84
          zoom: 16
      search: &defaultSearch
        parameters:
          - type: "eo:cloudCover"
            title: "Cloud Coverage in percent"
            name: "Cloud Coverage"
            max: 100
            min: 0
            range: true
          - type: "geo:uid"
            title: "Product ID"
            privileged: true
    - id: VHR_IMAGE_2018_Level_1__TRUE_COLOR
      title: VHR Image 2018 Level 1 True color
      parentLayer: VHR_IMAGE_2018_Level_1
      grids: *defaultGridOptions
    - id: VHR_IMAGE_2018_Level_1__NDVI
      title: VHR Image 2018 Level 1 NDVI
      parentLayer: VHR_IMAGE_2018_Level_1
      style: earth
      grids: *defaultGridOptions
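The parentLayer/subLayer distinction and the subLayer id naming rule can be sketched in a few lines of Python. This is only an illustration of the rules stated in this section; the helper names sublayer_id and split_layers are hypothetical and not part of VS.

```python
SEPARATOR = "__"  # default separator according to this section; configurable


def sublayer_id(collection_name, browse_name, separator=SEPARATOR):
    """Compose a subLayer id as <collection name><separator><browse name>."""
    return f"{collection_name}{separator}{browse_name}"


def split_layers(layers):
    """Split layer definitions into full layers and subLayer ids per parent.

    A layer whose parentLayer equals its own id is a full layer; any other
    layer is a subLayer of the layer named in parentLayer.
    """
    parents, sublayers = {}, {}
    for layer in layers:
        if layer["parentLayer"] == layer["id"]:
            parents[layer["id"]] = layer
        else:
            sublayers.setdefault(layer["parentLayer"], []).append(layer["id"])
    return parents, sublayers


layers = [
    {"id": "VHR_IMAGE_2018_Level_1", "parentLayer": "VHR_IMAGE_2018_Level_1"},
    {"id": sublayer_id("VHR_IMAGE_2018_Level_1", "TRUE_COLOR"),
     "parentLayer": "VHR_IMAGE_2018_Level_1"},
    {"id": sublayer_id("VHR_IMAGE_2018_Level_1", "NDVI"),
     "parentLayer": "VHR_IMAGE_2018_Level_1"},
]
parents, sublayers = split_layers(layers)
print(sorted(parents), sublayers)
```

With the layer ids from the example above, one full layer is found and the two browse-based ids are grouped as its subLayers.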
overlayLayers
This section defines the overlayLayers definitions in the client and the cache.
The following example configures a pre-seeded full coverage mosaic layer with limited European extent served as an overlay.
Full configuration schema of client - search for overlayLayers.
global:
  overlayLayers:
    - id: VHR_IMAGE_2018_Level_3__outlines
      title: VHR Image 2018 Level_3 outlines
      description: "WMS rendering of Level 3 product footprints for current time range."
    - id: VHR_IMAGE_2018_Level_3__masked_validity__Full
      title: VHR Image 2018 Level 3 True Color with masked validity Full Coverage
      protocol: WMTS
      urls: baseUrlsWMTS
      synchronizeTime: false
      source: "VHR_IMAGE_2018_Level_3__masked_validity"
      description: "<p>Pre-seeded Full coverage mosaic layer of VHR_IMAGE_2018 Level 3 products with their validity masks applied to mask out the final True Color rendering. Products composing the rendered tiles were sorted by time, placing newest products on top.</p><p>This mosaic does not have any search or time dimension functionality enabled.</p>"
      grids: &defaultFullGridOptions
        - name: WGS84
          zoom: 16
          restricted_extent: "-24.7 27.5 45 71.3"
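The restricted_extent value is a string of four space-separated numbers. As a minimal sketch, the following Python helper parses and sanity-checks such a string. It assumes the order is min x, min y, max x, max y (longitude/latitude for the WGS84 grid), which is consistent with a European extent but is not stated in this section; the function name parse_extent is hypothetical.

```python
def parse_extent(extent):
    """Parse an extent string into (min_x, min_y, max_x, max_y).

    Assumption: the four values are ordered min x, min y, max x, max y.
    """
    parts = [float(part) for part in extent.split()]
    if len(parts) != 4:
        raise ValueError(f"expected 4 values, got {len(parts)}: {extent!r}")
    min_x, min_y, max_x, max_y = parts
    if min_x >= max_x or min_y >= max_y:
        raise ValueError(f"extent is empty or inverted: {extent!r}")
    return min_x, min_y, max_x, max_y


print(parse_extent("-24.7 27.5 45 71.3"))
```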
ingress
Global definition of the Kubernetes ingress controller, from which many services can take their URL access patterns. Optional.
global:
  ingress:
    enabled: true
    hosts:
      - host: collection.remoteurl.com
    tls:
      - hosts:
          - collection.remoteurl.com
        secretName: secret
Component specific
preprocessor-v2
Here, the preprocessing can be configured in detail. Example of a preprocessing configuration with defaults and a special configuration for a single product_type:
preprocessor-v2:
config:
type_extractor:
xpath:
- /gsc:report/gsc:opt_metadata/gml:metaDataProperty/gsc:EarthObservationMetaData/eop:productType/text()
- /gsc:report/gsc:sar_metadata/gml:metaDataProperty/gsc:EarthObservationMetaData/eop:productType/text()
level_extractor:
xpath: ''
metadata_glob: "*GSC*.xml"
stac_output: true
preprocessing:
defaults:
stac_item_structure:
statistics:
compute_statistics: true
stats_approx: 2
assets:
pan: &cog_stac_asset
description: 'Product image converted into a COG'
title: 'Preprocessed image'
media_type: 'image/tiff; application=geotiff; profile=cloud-optimized'
roles:
- data
globs:
- '*.tif'
ms: *cog_stac_asset
gsc_metadata: &gsc_metadata_stac_asset
globs:
- '*.xml'
description: 'GSC metadata file from source archive'
title: 'GSC Metadata file'
media_type: 'application/xml'
roles:
- metadata
move_files: true
nested: true
output:
options: &default_output_options
format: COG
dstSRS: 'EPSG:4326'
dstNodata: 0
multithread: True
warpMemoryLimit: 3000
creationOptions:
- BLOCKSIZE=512
- COMPRESS=DEFLATE
- NUM_THREADS=8
- BIGTIFF=YES
- OVERVIEWS=AUTO
- PREDICTOR=YES
types:
SKY_CBU_3A:
data_file_globs:
- "*analytic_clip.tif"
- "*analytic.tif"
- "*panchromatic_clip.tif"
- "*panchromatic.tif"
output:
group_by: "(.*)"
options: *default_output_options
stac_item_structure:
statistics:
compute_statistics: true
stats_approx: 2
force_histogram_min_value: 2
assets:
pan:
<<: *cog_stac_asset
globs:
- '*_panchromatic*'
ms:
<<: *cog_stac_asset
globs:
- '*_analytic*'
gsc_metadata: *gsc_metadata_stac_asset
client
This section contains client configuration other than the layer definitions, which are configured under the layers key.
Full configuration schema of client.
client:
  config:
    eoxserverDownloadEnabled: true
    leftPanelTabIndex: 0
    timeDomain:
      - "2010-01-01T00:00:00Z"
      - today
    displayTimeDomain:
      - "2017-01-01T00:00:00Z"
      - "2019-12-31T23:59:59Z"
    selectedTimeDomain:
      - "2018-08-01T00:00:00Z"
      - "2018-08-31T23:59:59Z"
    maxZoom: 17
    displayInterval: P1096D
registrar
This section defines registrar-specific configurations, for setting up specific registration routes:
registrar:
  config:
    routes:
      collections:
        path: registrar.route.stac.CollectionRoute
        queue: register-collections
        backends:
          - path: registrar.backend.eoxserver.CollectionBackend
          - path: registrar_pycsw.backend.CollectionBackend
harvester
This section configures the harvester service, including its filtering capabilities and the queue to which it pushes the harvested results.
harvester:
config:
redis:
host: redis # docker swarm only, otherwise do not override default
harvesters:
Deimos-HRA_MS4_1C:
filter:
eq:
- property: "oads:product_type"
- HRA_MS4_1C
resource:
type: OADS
oads:
url: https://tpm-ds.eo.esa.int/oads/meta/Kompsat2/index/
use_oads_ext: true
output: queue
queue: register_queue
postprocessors:
- type: builtin
process: static
kwargs:
values:
properties:
collection: Deimos-HRA_MS4_1C
preprocessor
This section configures the preprocessor, a Mapchete-enabled preprocessor. Each config in configs targets a specific set of products based on the metadata value collection.
preprocessor:
replicaCount: 1
limits:
cpu: 2
memory: 6Gi
requests:
cpu: 0.1
memory: 1Gi
config:
filesystems:
s3:
type: s3
s3:
access_key_id: access
secret_access_key: key
region: eu
processors:
p1:
type: local
local:
process: preprocessor.processes.local.browse_to_geotiff
paths:
output_path: s3://
collections:
SPOT6-7:
filesystems:
target: s3
data:
- input:
type: http
http:
asset_map:
- key: band_1
band: browse
output:
path: output_path
asset: data
processors:
- p1
Deploying using Helm
It is generally expected that a user will deploy the VS helm chart into the Kubernetes cluster. This can be done via the Flux Helm Operator, which takes care of the installation of helm charts as well as applying subsequent changes to the configuration automatically.
However, it is also possible to deploy manually using the helm command line tool directly:
helm install -f values.yaml vs chart-location
This will install VS with the configuration specified in values.yaml. To apply changes to values.yaml, the following command can be used:
helm upgrade -f values.yaml vs chart-location
Finally, VS can also be uninstalled using the following command:
helm uninstall vs
Helm configuration reference
In this section, the variables for a Helm deployment are outlined, starting with the main values file:
global:
  env:
  storage:
    data: {}
    source: {}
    cache:
      type: local
  collections: {}
  productTypes: []
  defaultLayer:
  layers: []
  overlayLayers: []
  coverageTypes: []
  metadata: {}
database: {}
redis: {}
client: {}
cache: {}
renderer: {}
registrar: {}
harvester: {}
scheduler: {}
seeder: {}
preprocessor: {}
Global Configuration
Environment variables - env
Environment variables noted in other sections are added to this object as key:value pairs, e.g.
global:
  env:
    GDAL_PAM_ENABLED: "NO"
Any environment variable added in global.env will get passed to each service of VS.
Note
The following global.env variables must be set this way in values.yaml for docker swarm deployments, because their default values are set exclusively for k8s deployments. Any values.yaml for a docker swarm deployment should include the following values:
global:
  ingress:
    tls: false
  env:
    DB_HOST: "database"
    RENDERER_HOST: "database"
Storage configuration - storage
The storage section handles all data storage-related configuration.
data - key-value mapping in the form name:{config} for the registrar and preprocessor services. Multiple name:{config} mappings may exist.
  for swift storage:
    type: swift
    username - service username
    password - service password
    project_name - name of project
    project_id - id of project
    region_name - name of region
    auth_url - authentication url
    auth_url_short - short version of auth_url
    auth_version - authentication version, defaults to 3
    user_domain_name - user domain name
    streaming - if the streaming version of the /vsi file accessor is used
  for s3 storage:
    type: s3
    bucket - name of S3 bucket
    endpoint_url - url endpoint
    access_key_id - access key identifier
    secret_access_key - secret access key
    public - default “false”
    region_name - aws s3 region
    validate_bucket_name - if the bucket name should be validated, defaults to true
    streaming - if the streaming version of the /vsi file accessor is used
  for local storage:
    type: local
    root_directory - directory with data (must be accessible inside the containers of the services that access it)
  for http storage:
    type: http
    endpoint_url - url endpoint
    streaming - if the streaming version of the /vsi file accessor is used
source - optional data source for the preprocessor. Configuration parameters are the same as for data.
cache - configuration for the data source of the cache. Configuration parameters are the same as for data. Can be type: local, in which case a local sqlite3 database is created.
Data Collections - collections
name:{config} pairs where the name of the collection is mapped to the product and coverage types:
product_types - list of product types for the collection
coverage_types - list of coverage types for the collection
Product types - productTypes
List of product type objects with the following configs:
name - product type name
defaultBrowse - name of the default browse type
coverages - mapping of coverage names to assets
  assets - list of assets
browses - mapping of browse types to definitions
collections - collections to which the product type belongs
masks - masks of the product type
Layers - layers
Overlay layers - overlayLayers
Full configuration schema of client - search for overlayLayers.
Coverage Types - coverageTypes
List of coverage types to add to the backend.
bands - list of band definitions
  definition - ogc link to band definition
  description - description of band
  identifier - identifier of band
  name - name of band
  nil_values - list of nil (no-data) values
    reason - ogc reason
    value - value that is considered nil
  uom - unit of measure
  wavelength - wavelength
data_type - type of data
name - name of the coverage type
Service Metadata - metadata
Metadata values used by services.
title - title of the service
header - client header
abstract - abstract of the service
url - override service url; if not set, the links announced in Capabilities documents will depend on the hostname used in the request
keywords - list of keywords
accessConstraints - access constraints
fees - fees
contactName - name of contact person
contactPhone - phone of contact person
contactFacsimile - facsimile of contact person
contactOrganization - contact person organization
contactCity - city of contact person
contactStateOrProvince - state or province of contact person
contactPostcode - postcode of contact
contactCountry - country of contact
contactElectronicMailAddress - contact email
contactPosition - contact position
providerName - name of provider
providerUrl - url of provider
inspireProfile - inspire profile
inspireMetadataUrl - inspire metadata url
defaultLanguage - default language of service
language - language of service
Database configuration - database
Database configuration. See https://artifacthub.io/packages/helm/bitnami/postgresql for a comprehensive guide.
Redis configuration - redis
Redis configuration. See https://artifacthub.io/packages/helm/bitnami/redis for comprehensive configuration.
Common service configuration
Here is a list of common configurations across services.
replicaCount - number of pods to spawn
nameOverride - override the short name
fullNameOverride - override the full name
image - image mapping
  repository - repository of image
  pullPolicy - pull policy
  tag - tag; if unset, defaults to latest
service - service mapping. Available only for forwarded services
  type - type of network service
  port - port to forward
resources - resource mapping
  limits - resource limits
    cpu
    memory
  requests - request resources
    cpu
    memory
affinity - affinity configuration
livenessProbe - liveness tests
ingress - ingress trigger
global - global settings
All non-global configuration relevant to the services is located in the config section of each service, e.g.
cache:
  config:
    # cache configuration values go here
client:
  config:
    # client configuration values go here
Client configuration - client
Cache configuration - cache
wmsEnabled - wms enable switch
wmtsEnabled - wmts enable switch
connectionTimeout - timeout in seconds for connection
timeout - timeout for upstream connection
expires - tile expiry in number of seconds
key - cache path scheme with keys
Renderer configuration - renderer
Currently accepts no additional custom configuration.
Registrar configuration - registrar
disableDefaultRoute - disables default route for eoxserver if true
eoxserverInstanceBasePath - the default backend instance path
eoxserverInstanceName - the default backend instance name
defaultQueue - the name of the queue that the registrar listens on, default “register”
defaultSuccessQueue - queue that the registrar sends successfully registered items to
defaultErrorQueue - queue that the registrar sends failed items to
defaultReplace - if set to true, replaces existing items during registration, default “true”
defaultBackends - list of backend definitions
defaultHandlers - list of handler definitions
routes - mapping of custom routes.
routes:
  <route-name>:
    path: <import-path>
    queue: <queue>
    backends:
      - path: <backend-import-path>
        kwargs: <backend-keyword-arguments>
Example configuration: https://gitlab.eox.at/vs/core/-/blob/main/registrar/config-sample.yaml
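The path values of routes and backends are dotted import paths. How the registrar resolves them internally is not described here; the following Python sketch only illustrates the general idea of turning such a path into an object with the standard library. The helper name import_by_path is hypothetical, and a standard library class is used in the demonstration so that the example runs without the registrar package installed.

```python
import importlib
import json


def import_by_path(path):
    """Resolve a dotted path such as 'package.module.Name' to an object."""
    module_name, _, attribute = path.rpartition(".")
    if not module_name:
        raise ValueError(f"not a dotted import path: {path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


# Stand-in for a path like 'registrar.route.stac.CollectionRoute'.
decoder_class = import_by_path("json.JSONDecoder")
print(decoder_class is json.JSONDecoder)
```

A misspelled module raises ImportError and a misspelled class name raises AttributeError, which is a cheap way to catch typos in a route definition before deployment.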
Harvester configuration - harvester
One-to-one mapping of the original configuration. More info: https://gitlab.eox.at/vs/harvester/-/blob/main/src/harvester/config-schema.json
Scheduler configuration - scheduler
One-to-one mapping of the original configuration. More info: https://gitlab.eox.at/vs/scheduler/-/blob/main/config-sample.yaml
Preprocessor configuration - preprocessor
One-to-one mapping of the original configuration. More info: https://gitlab.eox.at/vs/vs/-/blob/main/preprocessor/src/preprocessor/config-schema.json
Preprocessor-v2 configuration - preprocessor-v2
One-to-one mapping of the original configuration. More info: https://gitlab.eox.at/vs/preprocessor/-/blob/main/preprocessor/config-schema.yaml
Seeder configuration - seeder
minzoom - minimum zoom from which to seed layers
maxzoom - maximum zoom to which to seed layers
collection_grids - dictionary of collection:grids mappings, used if only selected grids should be seeded for a certain collection
Ingestor configuration - ingestor
Currently accepts no additional custom configuration.
Scaling
For Kubernetes deployments, advanced scaling configurations are available.
Renderer
The renderer uses a fixed number of 8 workers for each replica. By default, the replicas have the following resource settings:
Limits:
  cpu: 1500m
  memory: 6Gi
Requests:
  cpu: 500m
  memory: 512Mi
This can be customized by setting the following helm values:
renderer:
  resources:
    requests:
      cpu: 1
    [...]
Scaling
The default replica count is set to 1 which can be customized by this helm value:
renderer:
  replicaCount: 2
Alternatively, horizontal autoscaling is supported based on the CPU metric. If enabled, the default minimum and maximum value for the replicas are 1 and 3 respectively. It can be further customized using the following helm values:
renderer:
  hpa:
    enabled: true
    minReplicas: 1
    maxReplicas: 3
Note that the horizontal auto-scaler uses a target CPU utilization of 100%, which refers to 100% of the requested CPU resources (the requests value), not of the limits.
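To make the effect of the 100% target concrete, the following Python sketch applies the replica formula given in the Kubernetes Horizontal Pod Autoscaler documentation, desired = ceil(current replicas * current utilization / target utilization), where utilization is measured relative to the CPU request. It is a simplification that ignores the autoscaler's tolerance and stabilization behavior; the function and parameter names are illustrative.

```python
import math


def desired_replicas(current_replicas, avg_cpu_usage_millicores,
                     cpu_request_millicores, target_utilization=100,
                     min_replicas=1, max_replicas=3):
    """Estimate the replica count the autoscaler would aim for.

    Utilization is the average CPU usage per pod as a percentage of the
    pod's CPU request. The result is clamped to [min_replicas, max_replicas].
    """
    utilization = 100 * avg_cpu_usage_millicores / cpu_request_millicores
    desired = math.ceil(current_replicas * utilization / target_utilization)
    return max(min_replicas, min(max_replicas, desired))


# One replica using 900m on average with the default 500m request
# runs at 180% utilization, so a second replica is requested.
print(desired_replicas(1, 900, 500))
```

With the default request of 500m, scaling up starts as soon as the average usage per replica exceeds 500m, well below the 1500m limit.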
Service management
This subchapter documents k8s specific management steps.
Running commands in VS services
For administration, it can be necessary to run commands directly in one of the services that make up VS.
Most VS services correspond to a deployment in Kubernetes, so they can be accessed like this:
kubectl exec -it deployment/vs-preprocessor -- bash
However, stateful components such as redis or postgres map to StatefulSets in Kubernetes and can be accessed using the following command:
kubectl exec -n demo-eocat-multiple -it statefulset/vs-redis-master -- bash
Note that the command given above launches a shell inside the container. If only one command needs to be run inside the services, the command can be directly given instead of bash.
Purge database
The database structure and models are created as a first step of the deployment of View Server and are not updated afterward if the values used change.
Warning
The following step deletes all added contents of the database. ALL products then have to be re-registered!
To clean the database so that it can be recreated from scratch after the values have changed, run:
kubectl exec -it deployment/vs-registrar -- bash -c 'python3 $INSTANCE_DIR/manage.py flushdb'
helm uninstall name
helm install name --values ... # triggers database structure recreate
For platform-agnostic management and operations steps, visit chapter Operations and management.
Or continue to the section Data Ingestion to see how data can be ingested to a VS.