Create a new collection step by step
This tutorial guides you through creating and updating a set of configurations (and the database) for an imaginary new dataset, exposed as a layer. Wherever possible, it links to other parts of the documentation for further reference.
The tutorial uses elements of the EOxServer Data Model, which you should understand, although defining the values usually does not require direct interaction with the EOxServer database models:
Examples of public configurations
The following examples of public View Server configuration values can be used as a further reference:
Analyze the data
First, analyze the Earth observation data that you will serve through View Server. The types of information needed to create the configurations are described below:
Data format
For ideal viewing performance, images should be formatted as Cloud Optimized GeoTIFF (COG) or should at least have internal overviews and internal tiling. If the data meet either of these conditions, proceed to the next point.
If internal overviews are missing, rendering even a 1x1 pixel image causes the whole image file to be read, which severely degrades rendering performance for large EO data files.
An important attribute of the raster data is their data type (UInt16, Int32, and others). Although View Server will generally be able to read any data type that GDAL can read, having this information is necessary for further steps.
To convert the data to COG, it is suggested to:
either manually use GDAL tools before ingesting the data into View Server (in this case the preprocessor component is not necessary)
or configure and use the preprocessor (refer to Preprocessor configuration) and reference the preprocessed data instead.
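As a rough sketch of the manual route, the snippet below builds a gdal_translate command that writes a COG. It assumes GDAL 3.1 or newer (which ships the COG output driver) is installed and on the PATH; the function names and the DEFLATE compression choice are illustrative and not part of View Server.

```python
import subprocess


def build_cog_command(src, dst, compress="DEFLATE"):
    """Build a gdal_translate argument list that writes `dst` as a COG.

    Assumes GDAL >= 3.1, which provides the COG output driver.
    """
    return [
        "gdal_translate",
        "-of", "COG",                   # Cloud Optimized GeoTIFF output driver
        "-co", f"COMPRESS={compress}",  # creation option; illustrative choice
        str(src),
        str(dst),
    ]


def convert_to_cog(src, dst):
    """Run the conversion; raises CalledProcessError if gdal_translate fails."""
    subprocess.run(build_cog_command(src, dst), check=True)
```

The COG driver writes internal tiling and overviews itself, so no separate overview-building step should be needed with this approach.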
Metadata format
It is also important to check the format of metadata files (sidecar files) next to the raster data. View Server uses SpatioTemporal Asset Catalog (STAC) items internally as a metadata format, both for storage and for messaging between components.
In an ideal case, the STAC items describing the data and metadata are already generated and should be used.
Having STAC items generated is not a prerequisite for all data. View Server will understand some other metadata formats during ingestion. More on that later.
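To make the shape of such metadata concrete, here is a minimal sketch of a STAC item as a Python dictionary, together with a helper that reports missing core fields. The item id, geometry, and asset href are made-up placeholders, and the check covers only a handful of core STAC fields; it is not the validation View Server performs.

```python
# Core fields checked here; a simplified subset of the STAC Item specification.
REQUIRED_ITEM_FIELDS = (
    "type", "stac_version", "id", "geometry", "properties", "links", "assets",
)

# Hypothetical example item; all values are placeholders.
example_item = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "example-product-001",
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[10, 45], [11, 45], [11, 46], [10, 46], [10, 45]]],
    },
    "bbox": [10, 45, 11, 46],
    "properties": {"datetime": "2018-06-01T10:00:00Z"},
    "links": [],
    "assets": {
        "image": {
            "href": "s3://example-bucket/example-product-001.tif",
            "roles": ["data"],
        }
    },
}


def missing_stac_fields(item):
    """Return the names of core fields that are absent from `item`."""
    missing = [field for field in REQUIRED_ITEM_FIELDS if field not in item]
    if "datetime" not in item.get("properties", {}):
        missing.append("properties.datetime")
    return missing


def data_assets(item):
    """Return the names of assets that carry the 'data' role."""
    return [
        name
        for name, asset in item.get("assets", {}).items()
        if "data" in asset.get("roles", [])
    ]
```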
Data storage
The next step is to clarify where the raster data and metadata are stored.
The data do not have to be on the same infrastructure where View Server is going to be deployed. However, keeping the data close to the deployment (at the same cloud provider, for example) should significantly speed up data access.
View Server supports the following access and storage protocols (the fsspec library should enable further extensions):
s3
OpenStack Swift
local path
http
Bands and rendering
Depending on the type of data (optical, radar, other) and the number of bands in the raster data, different types of rendering can be configured. In this step, clarify how many bands the raster data contain and which wavelengths (or general type of information) each band represents. The band structure determines which types of rendering can be defined later on.
Groups of products
Products can or should be grouped or separated using a shared metadata property.
Examples of ideal separation would be:
Processing levels: Level 1 products should not be visualized in one layer together with Level 3 products
SAR products: Single Look Complex (SLC) and Ground Range Detected (GRD) should be separated
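The grouping idea can be sketched as a small helper that buckets STAC items by one metadata property. The property name `processing:level` comes from the STAC processing extension; whether your metadata actually carries it is an assumption, and the item ids in the usage comment are invented.

```python
from collections import defaultdict


def group_by_property(items, key, default="unknown"):
    """Group STAC item ids by the value of one `properties` entry."""
    groups = defaultdict(list)
    for item in items:
        value = item.get("properties", {}).get(key, default)
        groups[value].append(item["id"])
    return dict(groups)


# Usage sketch with invented items:
#   items = [
#       {"id": "a", "properties": {"processing:level": "L1"}},
#       {"id": "b", "properties": {"processing:level": "L3"}},
#       {"id": "c", "properties": {"processing:level": "L1"}},
#   ]
#   group_by_property(items, "processing:level")
```

Each resulting group is a candidate for its own product type or collection in the steps that follow.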
Generate configurations
Let’s continue to create View Server configuration values based on the knowledge about the products that have been gathered.
For all possible settings of any configuration key in the values, refer to the Helm configuration reference.
As a foundation for a new set of values, the default vs-deployment config values can be used as an empty template to be filled.
Warning
The database structure and models are created as the first step of a View Server deployment and are not updated afterward if the used values change (no database migrations are performed between deploys).
Therefore you have to create a consistent working configuration. This might be an iterative process that involves deleting the persistent database storage between redeploys of updated values whenever the changes affect the database models. Refer to Purge database for the how-to.
Further steps in this cookbook will contain a note if the configuration is used in the database structure or not.
Coverage Types
Changes involve database structure: YES
The first concept to focus on during values creation is coverage_types. The objective of this step is to:
either map the raster data type and band order to an existing coverage_type definition
or alternatively, define a new coverage_type
The possible values and meaning of the coverage_type are described in Global.coverageTypes.
If there already is an existing coverage_type with the same type of bands, just in a different order (for example, the near-infrared band of the data is not the last band, as in the RGBNir coverage_type, but the first), then for the sake of clarity it is better to create a new coverage_type. This is not strictly necessary, because the band order (which band corresponds to which RGB color) can be changed in the rendering step.
Note
Pay attention to the following keys when defining a new coverage_type:
coverageTypes[i].name - needed for collections definition
coverageTypes[i].bands[i].identifier - needed for browses definition
Product Types
Changes involve database structure: YES for the global.productTypes key and all its values except for the following: filter, coverages.
The second, even more important step is to create productType definitions. Each productType represents an EOxServer Product Type model and some of its links to other models:
BrowseType EOxServer model - specifies renderings (one to many) via the browses key - refer to Browse Types for guidance on how to fill this key
Which data assets will map to which EOxServer Coverage Type model - the coverages key. There can be multiple named data assets (STAC Item entries) for multiple coverages.
To which collections the products of the product_type will be added. One product can be added to multiple collections if the product_type is allowed for those collections. Refer to Collections.
The possible values and meaning of the product_type are described in Global.productTypes.
Browse Types
Changes involve database structure: YES
The third important step is to define browses (rendering) definitions for each productType. Each browses entry represents an EOxServer Browse Type model and thereby adds an available WMS layer to the renderer service.
Simple band expressions and pre-made functions can be used in the band.expression value; refer to the full list of usable functions.
The band specifications inside the expression (red, pan, gray) need to match those defined in the selected coverage_type and correspond to the meaning of the raster data itself.
The color names in the browse type definition (red, green, blue, grey) are to be used as-is; they refer to the channels of the RGB (or grayscale) WMS output image into which the values are stretched.
If the browse.asset key is set to the name of a STAC asset, that asset is used as a Browse model. This is a way to register an asset without a ‘data’ role. It is preferred when a viewing-ready browse has already been pregenerated, rather than trying to fit it into a Coverage model. A Browse behaves slightly differently from a Coverage: for example, it cannot be used with WCS, but it also does not need exact georeferencing of the image, only that the footprint is extracted correctly in the original STAC item.
Some examples of configured expressions are:
percentile rendering of the 2-98% range of a precomputed histogram, stretched to 1-256, with configured defaults used if an individual STAC Item does not contain computed statistics in its metadata. It additionally masks out pixels in the range 1-10 as extra no data.
TRUE_COLOR:
  red:
    expression: "interpolate(red, percentile(red, var('percmin', 2), 1), percentile(red, var('percmax', 98), 10), 1, 256, var('clip', True),[var('nodata_start',1),var('nodata_end',10)])"
    range:
      - 1
      - 256
pansharpening operation on the source RGBNir Pan coverages
TRUE_COLOR_PANSHARPEN:
  red:
    expression: pansharpen(pan, red, green, blue, nir)[0]
    range: [0, 1000]
    nodata: 0
  green:
    expression: pansharpen(pan, red, green, blue, nir)[1]
    range: [0, 1000]
    nodata: 0
  blue:
    expression: pansharpen(pan, red, green, blue, nir)[2]
    range: [0, 1000]
    nodata: 0
hillshade rendering of DEM height data in EPSG:4326 with some parameters of the formula specified as “rendering variables” - allowing the WMS client to specify values
hillshade:
  grey:
    expression: hillshade(gray, var('zfactor', 5), 111120, var('azimuth', 315), var('altitude', 45), var('alg', 'Horn'))
    range: [0, 255]
    nodata: 0
Default unnamed browse type with 0-255 color range on 4 bands mapped to STAC Item Asset with name browse.
"":
  asset: browse
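The arithmetic behind the percentile stretch in the TRUE_COLOR example above can be illustrated with a plain linear interpolation. This is a sketch of the general technique only, not EOxServer's implementation of interpolate; the function name and the sample numbers are invented, and no-data masking is left out.

```python
def stretch(value, src_min, src_max, dst_min=1, dst_max=256, clip=True):
    """Linearly map `value` from [src_min, src_max] to [dst_min, dst_max].

    `src_min` and `src_max` stand in for the 2nd and 98th percentile of a band
    and must differ. With `clip=True`, results outside the target range are
    clamped to its bounds.
    """
    scaled = dst_min + (value - src_min) * (dst_max - dst_min) / (src_max - src_min)
    if clip:
        scaled = max(dst_min, min(dst_max, scaled))
    return scaled


# Usage sketch with invented percentiles 0 and 100:
#   stretch(50, 0, 100)               -> midpoint of the 1-256 range
#   stretch(200, 0, 100)              -> clamped to 256
#   stretch(200, 0, 100, clip=False)  -> extrapolated beyond 256
```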
Collections
Changes involve database structure: YES
The fourth step is to define all collections grouping the Products. For each collection, it is necessary to add its allowed product_types and coverage_types.
Example configuration for creating three collections: Level_1, Level_3 and a shared one:
collections:
  VHR_IMAGE_2018:
    product_types:
      - DOV_MS_L1A
      - DOV_MS_L3A
    coverage_types:
      - RGBNir
  VHR_IMAGE_2018_Level_1:
    product_types:
      - DOV_MS_L1A
    coverage_types:
      - RGBNir
  VHR_IMAGE_2018_Level_3:
    product_types:
      - DOV_MS_L3A
    coverage_types:
      - RGBNir
The part of the productType values corresponding to the collections key added above could be, for example:
productTypes:
  - name: DOV_MS_L1A
    collections:
      - VHR_IMAGE_2018
      - VHR_IMAGE_2018_Level_1
  - name: DOV_MS_L3A
    collections:
      - VHR_IMAGE_2018
      - VHR_IMAGE_2018_Level_3
The possible values and meaning of the collections are described in Global.collections.
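Because collections and product types reference each other, a small consistency check can catch mismatches before a deployment creates the database. The sketch below mirrors the two example configurations as Python dictionaries; the helper name and the message wording are illustrative, and the check is not something View Server runs itself.

```python
collections = {
    "VHR_IMAGE_2018": {
        "product_types": ["DOV_MS_L1A", "DOV_MS_L3A"],
        "coverage_types": ["RGBNir"],
    },
    "VHR_IMAGE_2018_Level_1": {
        "product_types": ["DOV_MS_L1A"],
        "coverage_types": ["RGBNir"],
    },
    "VHR_IMAGE_2018_Level_3": {
        "product_types": ["DOV_MS_L3A"],
        "coverage_types": ["RGBNir"],
    },
}

product_types = [
    {"name": "DOV_MS_L1A", "collections": ["VHR_IMAGE_2018", "VHR_IMAGE_2018_Level_1"]},
    {"name": "DOV_MS_L3A", "collections": ["VHR_IMAGE_2018", "VHR_IMAGE_2018_Level_3"]},
]


def find_inconsistencies(collections, product_types):
    """List product type to collection references that do not line up."""
    problems = []
    for product_type in product_types:
        name = product_type["name"]
        for collection in product_type.get("collections", []):
            if collection not in collections:
                problems.append(f"{name}: collection {collection!r} is not defined")
            elif name not in collections[collection].get("product_types", []):
                problems.append(f"{name}: not allowed in collection {collection!r}")
    return problems
```

An empty result means every collection named by a product type exists and lists that product type as allowed.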
Displaying data
Changes involve database structure: NO
The fifth step influences how the layers can be displayed via the client service and which tilesets will be exposed by the cache service.
The possible values and meaning of the layers and overlayLayers are described in Global.layers and Global.overlayLayers.
External access
Changes involve database structure: NO
The sixth step is to define external access to View Server components. If the values are going to be deployed on Kubernetes, it is possible to use View Server’s ingress configuration - refer to ingress.
If there is already an external setup configured in the system (external ingress, traefik, etc.), the View Server ingress configurations should be completely disabled by:
ingress:
  tls: false
How to get the data in
Changes involve database structure: NO
Storage
The seventh step in the workflow is to determine where the data are located, so that View Server can reference them correctly and ingest the Product data and metadata into the database.
The possible values and meaning of the storage key are described in storage.
For successful ingestion, at least the data key (location of the data) needs to be filled in according to the protocol used to access the data on the storage.
There are currently three ways to ingest data into View Server, and they might require further configuration.
Optionally, the preprocessor can be used to convert the data format beforehand. Refer to Preprocessor configuration schema.
Local storage
Warning
If the files to be ingested are on local storage, the storage folder(s) need to be mounted into the containers of the services that need access to them. For direct registration without preprocessing, these services are the registrar and the renderer.
The mounting needs to be configured at the level of the helm release or the docker-compose templates. Each node (master or worker) that may host such a service needs to have access to the data folder as well.
An example docker compose configuration mounting the folder /data/test1 into the renderer container at path /data is as follows:
renderer:
  volumes:
    - type: bind
      source: /data/test1
      target: /data
Global data storage configuration in values.yaml for using this folder would look like:
global:
  storage:
    data:
      directory-data:
        type: "directory"
        root_directory: "/data/"
Ingestion
Direct ingestion of STAC Item JSON strings to the redis register_queue.
This process is suitable if the STAC items of Products already exist and for one-off ingestion campaigns - collections that do not require any regular updates or additions.
No special configuration except for the storage.data key is necessary.
Using the harvester service for a pulling approach
If you configured the harvester, it will harvest new or updated data from various endpoints and protocols and convert the metadata and data to STAC internally and then push it to other components (preprocessor, registrar).
Harvester-specific configuration is required. Refer to Harvester configuration schema.
Ingestor for legacy Browse Reports - pushing approach
Ingestor-specific configuration is required. Refer to Ingestor configuration schema.
Optionally, refer to the Data Ingestion chapter for more information.
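For the first option, direct ingestion, the sketch below serializes a STAC item and pushes it onto register_queue. It assumes the queue is a plain Redis list that accepts LPUSH; the helper name is invented, and the host name in the usage comment is a placeholder for your deployment. The client is passed in, so any object with a redis-py style lpush method works.

```python
import json


def enqueue_stac_item(client, item, queue="register_queue"):
    """Serialize `item` to a JSON string and push it onto the Redis list `queue`.

    `client` is expected to behave like a redis-py `Redis` instance.
    Returns the JSON payload that was pushed.
    """
    payload = json.dumps(item)
    client.lpush(queue, payload)
    return payload


# Usage sketch (requires the third-party `redis` package; the host name is
# a placeholder and depends on your deployment):
#   import redis
#   client = redis.Redis(host="redis", port=6379)
#   enqueue_stac_item(client, stac_item)
```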
Global env
The last important step is to modify the global.env key, which lists all environment variables and their values that all services have access to.
It specifies database and Django passwords, which should be changed as well.
Refer to Global configurations for more information.
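When replacing the default passwords, random values can be generated with Python's standard secrets module. This is a minimal sketch; the helper names are invented, and the variable names in the usage comment are assumptions, so check the actual keys under global.env in your values.

```python
import secrets


def generate_secret(num_bytes=24):
    """Return a URL-safe random string built from `num_bytes` random bytes."""
    return secrets.token_urlsafe(num_bytes)


def build_env_overrides(variable_names):
    """Map each given environment variable name to a fresh random secret."""
    return {name: generate_secret() for name in variable_names}


# Usage sketch (variable names are assumptions, not confirmed keys):
#   build_env_overrides(["POSTGRES_PASSWORD", "DJANGO_PASSWORD"])
```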
Individual service configurations
Additionally, most View Server services are configurable using their keys in the values. Refer to Individual service configurations for more information.