Over the past 10 months, we’ve worked with a group of organisations from Germany, including the Lower Saxon State Department for Waterway, Coastal and Nature Conservation (NLWKN), the Federal Agency for Nature Conservation (BfN), and the Federal Institute for Hydrology (BAFG), to build a benthic information system called BenINFOS. Now that the system is available, we’d like to introduce the project to a wider audience.

The Fach AG Benthos is tasked with implementing the requirements of the Marine Strategy Framework Directive for assessing the status of seabed fauna within the framework of the Federal/State Working Group North and Baltic Sea (BLANO).

In this project, benthic data from these decentralized structures is digitally combined for the first time.

The BenINFOS Project

In summer 2020, people from the German expert group on benthic information reached out to us to discuss a potential project. After a tendering procedure, wetransform and AquaEcology were tasked with the implementation of a first version of a Benthic Information System.

This system is intended to help the experts of the “Fach AG Benthos” implement the requirements of the Marine Strategy Framework Directive (2008/56/EC, MSFD) for assessing the status of seabed fauna within the framework of the Federal/State Working Group North and Baltic Sea (BLANO). Its main objectives were to:

  • Aggregate and consolidate benthic data from decentralized structures and provide it in a standardized BenINFOS data model
  • Implement the calculation of the two indices required for the assessment of the state of the benthic ecosystem (M-AMBI and BQI), with transparent presentation of the calculation steps
  • Export the calculated indices together with all log files and source data, to ensure full transparency and repeatability
  • Provide further data integration options for additional stakeholders and processes
  • Make the resulting uniform assessment accessible by means of an online application (BenINFOS specialist application) and through download and view services

For us, a project like this is particularly interesting because it is about actually putting data from different sources and organisations to use. As expected, a lot of challenges were hiding in that area.

Challenges to Overcome

The different stakeholders in the domain working group had already adopted a common schema for their data, called ICES (developed by a working group of the organisation of the same name). However, even with the same GML application schema, individual data sets still contained significant heterogeneity. We had to find solutions for problems such as:

  • Data too incomplete to apply the methodologies, such as missing salinity or depth information for individual samples
  • Mismatched classification systems, e.g., for species names
  • Errors in spelling of species’ names and other properties
  • Minor technical issues in the format and encoding

Over the course of several months, we iterated over the data, the R scripts that perform the index calculations, and the web application used to manage and visualise individual index calculation runs. The teams at wetransform and AquaEcology worked intensively with domain experts to find practical solutions and to ensure a high-quality result.

During these iterations, there were still some doubts as to whether such a system could be applied to all existing data sets, but towards the end of the pilot project, most stakeholders were able to get the expected results from the integrated data.

The Results and Next Steps

In the end, the pilot project came to a positive conclusion. The resulting system offers an easy-to-use, straightforward process to integrate data sets and to configure index calculation runs.

Image of the BenINFOS platform for the Marine Strategy Framework Directive
Configuration of index calculation runs.

The platform also lets you visualise and download the results of such runs, as shown below.

Image of the BenINFOS platform for the Marine Strategy Framework Directive
Visualisation of the results of a calculation run.

The pre-processing steps and calculation scripts handle much of the required contextual information and create very detailed logs and outputs. All of this was implemented on top of an existing hale»connect on-premise deployment that is operated by plangis on behalf of the Federal Waterways Engineering and Research Institute (BAW). This project added a custom microservice for executing the M-AMBI and BQI R scripts, custom workflows, and a web application based on the hale»connect feature explorer.

There is still work to do to make the system fully operational, such as further improving the individual source data sets and adding the option to automatically publish result data sets as services.

A second project phase will likely start in late 2021. In it, we also hope to bring more organisations around the Baltic Sea and the North Sea on board. In the coming months, we will also write and submit a scientific paper that explains what was done in greater detail.

If you are interested in staying up to date about this development, sign up for our newsletter here.

FOSS4G Buenos Aires/Online

One of the largest global gatherings for geospatial software enthusiasts, FOSS4G, will take place from September 27th to October 2nd. The physical event will take place in Buenos Aires, with a virtual portion available worldwide.

FOSS4G brings together developers, users, decision-makers, and observers from a broad spectrum of organisations and fields of operation. Through six days of workshops, presentations, discussions, and cooperation, FOSS4G participants create effective and relevant geospatial products, standards, and protocols.

wetransform will showcase hale»studio, the #1 open-source ETL tool for the harmonisation of complex structured data with open standards such as INSPIRE, CityGML, and AIXM.

At the last FOSS4G, we focused on hale»studio’s use in creating interoperable, high-value datasets.

This year, we will bring in our expertise to help understand, explain, and promote the role of free and open-source software in terms of sustainability. True to the theme, we will focus on one of hale»studio’s most exciting use cases: how hale»studio is helping save Europe’s forests. In this talk, we will show how harmonised data compliant with international standards such as INSPIRE can be used to introduce scalable solutions to urgent problems such as climate change. If you want to learn more, you can also check out our previous blog post on this topic.

The presentation will take place on Friday, 1st October 2021 at 15:00 CET.

You can register for the event here.

Virtual INSPIRE Conference

This year’s INSPIRE conference will take place virtually from 25th October to 29th October. The focus will be on how INSPIRE will develop and move towards a common European Green Deal data space for environment and sustainability.

The INSPIRE conference has always been one of the key events for the INSPIRE community, and we’ve always done our best to provide that community with unique perspectives. Check out our previous contributions here.

Image of INSPIRE Helsinki 2019 event

A snippet from the INSPIRE Helsinki 2019 Event

The programme is already out, with wetransform contributing to two sessions.

We will speak about exciting developments in the INSPIRE ecosystem such as the new activities of the IDSA and the role of the GAIA-X initiative. We will also shed light on our perspective on how INSPIRE has developed and added value, and how the directive can be developed further. You can find more information on the sessions below:

wetransform: INSPIRE Conference Sessions

Session Name | Date | Time
Architectures, infrastructures and technological enablers for environmental data sharing | Tuesday, 26.10.2021 | 10:00-11:30 CET
Past, present and future of INSPIRE: an industry perspective | Wednesday, 27.10.2021 | 15:30-17:00 CET

Register now!

We look forward to seeing you there!

If you have been creating INSPIRE GML, you have almost certainly encountered so-called codelists. They are an important part of the INSPIRE data specifications and contribute substantially to interoperability. They are, however, not as straightforward as a simple enumeration. This post explains what codelists are, how you use them, and why they are important.

In general, a codelist contains several terms whose definitions are universally agreed upon and understood. Codelists support data interoperability and form a shared vocabulary for a community. They can even be multilingual.

Managing Codelists and Codelist Registries

INSPIRE Codelists are commonly managed and maintained in codelist registers which provide search capabilities, so that both end users and client applications can easily access codelist values for reference. Registers provide unique and persistent identifiers for the published codelist values and ensure consistent versioning. There are many different INSPIRE registers which manage the identifiers of different resources commonly used in INSPIRE.

Codelists used in INSPIRE are maintained in the INSPIRE codelist registry, in the codelist registry of a member state, or by an acknowledged external third party that maintains a domain-specific codelist.

To add a new codelist, you will have to either set up your own registry or work with the administrators of one of the existing registries to get your codelist published. This can be quite an involved process, which is designed to prevent uncontrolled growth of codelists.

Extending Codelists

One special feature of codelists in INSPIRE is that they may be extensible. If a codelist is extensible, it only contains a small set of common terms, but you can add your own. With respect to extensibility, INSPIRE distinguishes four types of codelists:

  • None (Not extensible): A codelist that is not extensible includes only the values specified in the INSPIRE Implementing Rules (IR).
  • Narrower (Narrower extensible): A codelist that is narrower extensible includes the values specified in the IR and narrower values defined by the data providers.
  • Open (Freely extensible): A freely extensible codelist includes the values specified in the IR and additional values defined by data providers.
  • Empty (Any values allowed): An empty codelist can contain any values defined by the data providers.

You can recognize which type a codelist is by either looking at the UML model, where they appear as tagged values (“extensibility”), or by looking into their definitions in the respective registry. For example, the Anthropogenic Geomorphologic Feature codelist is shown below.

Codelists have maintenance processes which enable codelist values to be updated. Even codelists of the type “Not extensible” can be updated, with new values being included in the next version. Codelists of the type “Freely extensible” can include extended codelist values, but only if those values are managed in a register. Codelists of the type “Empty” often pose a challenge to users, as there is not always a readily applicable codelist available. In some cases, empty codelists suggest the use of a standard external codelist commonly used in the domain.

Codelist Encoding

The conceptual schema language rules in the INSPIRE Generic Conceptual Model contain guidance on how to include codelists in INSPIRE GML application schemas, some of which you may recognize:

  • Code lists should use the stereotype codeList.
  • The name of the codelist or enumeration should include the suffix Value.
  • The documentation field of the codeList classes in the UML application schemas shall include the -- Name --, -- Definition --, and -- Description -- information.
  • The natural language name of the code list (given in the -- Name -- section) should not include the term Value.
  • The type of code list shall be specified using the tagged value extensibility on the codeList class.
  • For each code list, a tagged value called vocabulary shall be specified. The value of the tagged value shall be a persistent URI identifying the values of the code list.
  • A code list may also be used as a super-class for a number of specific codelists whose values may be used to specify the attribute value.
  • Values of INSPIRE-governed code lists and enumerations shall be in lowerCamelCase notation.

In UML, the usage of an extended code list is indicated by substituting the existing code list. The extended codelist is represented by a sub-type of the original codelist.

Codelist values are encoded in GML application schemas using gml:ReferenceType, which means that there is no formal link between the new subtype in the GML application schema and the extended codelist. The codelist itself must be published in a register, and the register should be published in the INSPIRE register federation; however, the application schema does not need to be adapted to use the extended or profiled codelist.
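
As a minimal sketch (the property name here is illustrative, not taken from a specific INSPIRE theme), such a codelist-valued property declaration typically looks like this in an application schema:

    <!-- The property is typed gml:ReferenceType, so the codelist value is
         carried entirely in the xlink:href attribute; using an extended
         codelist therefore requires no change to the schema itself. -->
    <xs:element name="conditionOfFacility" type="gml:ReferenceType" minOccurs="0"/>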

Using INSPIRE codelists in hale»studio

Both INSPIRE GML and the INSPIRE metadata – which describe harmonized datasets and network services – include references to codelists in the form of xlinks. XLink is a recommendation by the World Wide Web Consortium for the definition of references in or across XML documents. Simple xlinks are the standard method for object references in GML. Attributes encoded using xlink require a URI to the remote object, or an internal document reference, in xlink:href.

It is standard practice to refer to items in the INSPIRE registry using HTTP URIs.

If you are using hale»studio to create your harmonization project, you can load INSPIRE codelists directly from the INSPIRE registry for use in your project. The INSPIRE codelists are referenced using http in the exported GML data.
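
For example, a codelist reference in the exported GML looks like the following sketch (the property comes from the INSPIRE Transport Networks theme; the href is the value’s persistent URI in the INSPIRE registry):

    <!-- The codelist value "functional" is referenced via its persistent
         registry URI rather than being encoded as element content. -->
    <tn:currentStatus
        xlink:href="http://inspire.ec.europa.eu/codelist/ConditionOfFacilityValue/functional"/>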

To import an INSPIRE codelist into your hale»studio project, select “File” » “Import” » “Codelist”.

Next, select “From INSPIRE registry”. A list of all INSPIRE codelists will appear and you can either filter by name or search by INSPIRE theme. The selected codelist will be added to your project.

If all the target instances in your dataset will use the same codelist value, select the href attribute in the target property and apply the Assign function. In the Assign function dialog, select the icon with the yellow arrows to assign a codelist value from the codelist you loaded into your project.

Next steps

Codelists are a fundamental building block of any INSPIRE implementation: they promote data interoperability through the effective reuse of stable, persistent identifiers for universally defined concepts. INSPIRE harmonization projects are often stalled by empty codelists and missing values. wetransform has supported numerous customers with the UML encoding of custom codelist extensions and with the development and maintenance of codelist registries. If you are interested in moving ahead with your project and overcoming these obstacles, please get in touch with our support team at support@wetransform.to.

If you’re interested in learning more about such topics, feel free to check out our post on INSPIRE IDs or our news page!

2021 has been an eventful year, and we’ve had many exciting developments including further support for the increasingly popular GeoPackage format, more CSW capabilities, and a host of other improvements.

Here’s what’s new:

Information for Users

New Features

  • Users uploading data to the hale»connect platform via URL can now add Authorization headers to HTTP(S) requests to provide the required authentication, as shown below.
  • Organisations that have their own CSW configured can now edit values in the CSW capabilities documents through use of variables on the organization profile page.
    Note: Activation of the CSW_INSPIRE_METADATA_CONFIG feature toggle is required.
  • The text filter of the dataset resource list can now filter by organisation.
  • hale»connect now supports GeoPackage as source data input to online transformations via URL.
  • Service publishing is now enabled with custom SLDs that include multiple feature types in one layer.

Changes

  • Usage statistics graphs now use the same color for the same user agent across multiple graphs, as shown below.
  • Metadata input fields that allow you to select predefined values as well as enter free text now save the free text when you exit the field (and not only after pressing Enter).
  • In the WMS service settings, there is now the option of restricting rendering in view services to the bounding box in the metadata. Whether the option is activated by default depends on the system configuration. If the option is activated, data may not be displayed in a view service if the bounding box in the metadata is incorrect or the axes are swapped.
    Note: Currently, this setting for data set series cannot be adjusted via the user interface.
  • Data fields that the file analysis interprets as numbers are no longer treated as floating point numbers if all values are integers.
  • To improve overall performance, the system-wide display level configuration for raster layers is now checked earlier.
  • Terms of use / useConstraints in the metadata: Descriptions for given code list values can now be adapted.
  • Use restrictions / useLimitations in the metadata: GDI-DE-specific rules now only apply if the country specified in the metadata is Germany. If no country is specified and the data set is an INSPIRE data set (“INSPIRE” category), the GDI-DE default value is no longer set, so that INSPIRE Monitoring correctly recognizes the metadata as conforming to TG 2.0.
  • Outdated GDI-DE test suite tests were removed from hale»connect.

Fixes

  • Some unnecessary error messages that occurred when a user did not have sufficient privileges to access certain information have been removed.
  • An error that caused status messages to be displayed incorrectly on the dataset overview page was fixed.
  • An error was fixed that left password protection active on services after the password had been removed.
  • File names of uploaded shape files can now start with a number.
  • Fixed a bug with downloading files that need to be converted before being added to a dataset.
  • The automatic process that fills metadata now waits for any outstanding attribute coverage calculations to finish, so that it can access their results.
  • Services of datasets in a dataset series no longer count towards the capacity points (only the services of the series itself do).
  • An error that deleted uploaded files not associated with a dataset (e.g. uploaded logos for organizations) was fixed.
  • Several “priority dataset” keywords are now correctly represented in the metadata when published.
  • When using the hybrid mode, no geometries were saved if no geometry was referenced in the SLD. This has been fixed: the system now always tries to identify a default geometry in a feature type.
  • The Mapproxy cache for raster layers in a series no longer resets every time the series is changed. This now only happens when changes are made to the relevant individual data set.
  • An error has been corrected which caused the WFS to deliver invalid XML with missing namespace definitions. This also affected corresponding GetFeatureInfo queries.
  • GetFeatureInfo requests now return complete XML when the INFO_FORMAT parameter is of type text/xml.
  • GetFeatureInfo requests now return results for raster/vector datasets.
  • An issue was fixed that was related to schema location when the same schema definition file is referenced directly in a combined schema and imported by a schema contained in the same combined schema file.
  • The AuthorityUrl.name element can now only contain valid values for the data type NMTOKEN.
  • Added redirection handling for INSPIRE schemas.
  • A fix to prevent global capacity points updates from running during the day was implemented.
  • A fix was implemented to use string representations of number values as autofill results, when available.

Information for Systems Administrators

Mapproxy: Adjustments to the Docker Image

Until now, Mapproxy could become a bottleneck when processing WMS requests, as the previous configuration handled many parallel requests poorly. The runtime environment in which Mapproxy runs in the Docker container has been adapted, as has the procedure for deleting caches. As a result, Mapproxy no longer runs as the root user within the container, while caches created earlier are still owned by the root user. To ensure access to these caches, the permissions must be adjusted so that the mapproxy user of the container has read and write access. This can be done, for example, via a shell in the new mapproxy container:

    chown -R mapproxy:mapproxy /mapproxy/cache/

Note: As an alternative, there is also the possibility to keep Mapproxy running as root, but this should only be used as an interim solution - if you are interested, we can provide the appropriate configuration option.

Mapproxy: Extended configuration options

Mapproxy acts as a buffer in the system that intercepts GetMap requests to view services and, where possible, serves them from its cache. It thereby determines which requests are processed by deegree. The behavior of Mapproxy can now be adjusted in several respects. The configuration options are currently only available at the system configuration level, with the exception of the setting that restricts rendering to the bounding box of the metadata.

Important: Changes to the configuration are not automatically applied to existing publications. The new actions on the debug page of the service-publisher should be used for this purpose:

  • To update the mapproxy configuration only:
    1. “Update mapproxy configuration for all publications” for all existing publications
    2. “update-mapproxy” for a single publication
  • To update the mapproxy configuration and to reset the cache (e.g. when changing the cache backend)
    1. “Update mapproxy configuration and clear cache for all publications” for all existing publications
    2. “reset-mapproxy” for a single publication

The new configuration options are described below. More information on the individual options can also be found in the Mapproxy documentation.

Reduced restart times of unresponsive WMS/WFS services

With many publications, the initialization of the OWS services can take a long time. If the feature toggle to divide the configuration workspace into sub-workspaces per organization is used, the configurations are initialized in parallel. This significantly accelerates the start of WMS/WFS services after a failure.

Before / after examples from our systems:

  • 10k+ services: approx. 5 minutes before, approx. 90 seconds after
  • 60k+ services: 30 to 50 minutes before, 5 to 8 minutes after

If you are not yet using sub-workspaces in your deployment and are interested in them, please contact us. Start-up time only improves significantly if the publications in the system are well distributed among different organizations.

Cache backend

By default, Mapproxy saves cached tiles as individual files in a specific directory structure. This can quickly lead to a cache consisting of millions of files, which can become a problem once the file system’s limit on the maximum number of files (inodes) is reached. Reaching the limit is particularly critical if data other than the caches resides on the same file system, as no more files can then be created. It is now possible to change the backend used for the caches. The options are as follows:

  • file - the default setting, with storage as individual files
  • sqlite - saves each zoom level in an SQLite file
  • geopackage - saves each zoom level in a GeoPackage file

Recommendation: We recommend using the sqlite backend, which we already use in production. You should check whether the number of files in the file system could become a problem (e.g. with df -i). Currently, we do not support any mechanism to migrate caches between different backends, so the old cache should be deleted when updating the configuration for existing publications. In principle, however, Mapproxy provides a tool with which a migration can be carried out.

Restriction of the cache to certain zoom levels

In hale»connect, Mapproxy uses a uniform tile grid for all publications, based on EPSG:3857:

GLOBAL_WEBMERCATOR:
Configuration:
    bbox*: [-20037508.342789244, -20037508.342789244, 20037508.342789244, 20037508.342789244]
    origin: 'nw'
    srs: 'EPSG:3857'
    tile_size*: [256, 256]
Levels: Resolutions, # x * y = total tiles
    00:  156543.03392804097,  #      1 * 1      =          1
    01:  78271.51696402048,   #      2 * 2      =          4
    02:  39135.75848201024,   #      4 * 4      =         16
    03:  19567.87924100512,   #      8 * 8      =         64
    04:  9783.93962050256,    #     16 * 16     =        256
    05:  4891.96981025128,    #     32 * 32     =       1024
    06:  2445.98490512564,    #     64 * 64     =       4096
    07:  1222.99245256282,    #    128 * 128    =      16384
    08:  611.49622628141,     #    256 * 256    =      65536
    09:  305.748113140705,    #    512 * 512    =     262144
    10:  152.8740565703525,   #   1024 * 1024   =      1.05M
    11:  76.43702828517625,   #   2048 * 2048   =      4.19M
    12:  38.21851414258813,   #   4096 * 4096   =     16.78M
    13:  19.109257071294063,  #   8192 * 8192   =     67.11M
    14:  9.554628535647032,   #  16384 * 16384  =    268.44M
    15:  4.777314267823516,   #  32768 * 32768  =   1073.74M
    16:  2.388657133911758,   #  65536 * 65536  =   4294.97M
    17:  1.194328566955879,   # 131072 * 131072 =  17179.87M
    18:  0.5971642834779395,  # 262144 * 262144 =  68719.48M
    19:  0.29858214173896974, # 524288 * 524288 = 274877.91M

Mapproxy can now be configured not to cache tiles from a certain zoom level onwards, but to always forward those requests to deegree:

service_publisher:
    map_proxy:
        # Don't cache but use direct access beginning with the given level
        # (negative value to disable)
        # For example: a value of 18 means levels 0-17 are cached but levels >= 18 are not
        use_direct_from_level: -1

Restricting queries and cache to the bounding box of the metadata

Since the data of a view service rarely covers the whole world, it makes sense to spatially limit the cache and the requests to deegree. It is now possible to do this using the bounding box from the metadata. When activated, requests that fall outside this bounding box automatically return an empty image, without a request being made to deegree and without the cache having to be expanded to include that area. In addition to activating the restriction, a buffer around the bounding box can be configured to avoid content being cut off (which can happen, for example, with raster data):

map_proxy:
    # limit mapproxy cache and source requests to metadata bounding box
    # otherwise the cache may encompass the whole world-wide grid (see above)
    coverage:
        enabled: true
        buffer: 0.01 # buffer for the WGS 84 bounding box (e.g. to compensate for rasters that exceed the vector bounding box); 0.01 ~ 1 km

Monitoring: Alerts on file systems

The existing alerts on file systems, which are intended to give a warning when a file system is almost full or no more file handles are available, were unfortunately not fully functional due to a change in the names of the metrics. These alerts have been revised and extended to also detect when a file system is close to its maximum number of files. The default limit is 10% of available space / inodes, but it can be adjusted:

alerts:
    filesystem:
        # default limit in percent of available space / inodes, must be an integer value
        available_limit: 10

Wind turbines, helipads, industrial plants, and campsites - wondering what’s common among these?

According to §21 of Germany’s Air Traffic Regulations and the 2021 EU Drone Regulation, they are recognized as drone flight prohibition zones. However, not all such drone no-fly zones have been formally identified and communicated to the public.

In 2018, drones caused 158 disruptions to air traffic in Germany - almost double the number from the previous year. Drone flight clearly comes with safety and security risks, and not all no-fly zones are formally defined. The expected growth in drone usage (commercial drone usage is expected to increase by 200% between 2020 and 2025) will worsen the situation and lead to more such incidents - unless no-fly zones are actively identified, monitored, and communicated.

Moreover, precise and safe drone navigation will expand the possible uses of drone mobility and add value to society. For example, drones can reduce the effort and cost of supply lines and operational logistics. They could be used to transport vaccines or medical equipment such as oxygen canisters more effectively. Robotic camera drones could be used for high-voltage power line inspections or gas pipeline maintenance. The applications of safe drone mobility are vast and diverse.

Eliminating the current drone navigation problems and making the most of drone transport is no small feat. Germany has 357,386 km² of terrain that needs to be analysed, classified, and visualized as fly or no-fly zones, and some of these terrain features have not yet been mapped at all. Tackling these problems requires both new data and further strategic development.

Deutsche Flugsicherung (DFS), Fraunhofer IGD, and wetransform have joined forces to tackle these problems through the fAIRport (Flight Area Artificial Intelligence Retrieval Portal) project. The three-year (2020 - 2023), 1.2 million EUR project is supported by the Federal Ministry of Transport and Digital Infrastructure (BMVI) as part of the mFUND (Modernity Fund) research initiative.

The main aim of fAIRport is to provide a comprehensive high-precision geodatabase for no-fly zones in Germany by merging existing datasets and creating new information through Fraunhofer IGD’s AI-based methods for orthoimage object detection.

wetransform is developing the centrepiece of the project: the fAIRport municipal portal. It is also creating the dataflows needed to establish the portal. The portal will collate static data, dynamic data, local information, and real-time information. Fraunhofer IGD is creating the 2D and 3D visualisations of no-fly zones that will make the portal more intuitive and easier to use. The portal will allow local authorities to access, upload, maintain, and manage no-fly zones within their jurisdictions.

The project is in its initial stages, and we recently hosted the second user workshop on the functionalities of the portal for the City of Langen, an important project stakeholder. The goal of the workshop was to get a deeper understanding of the required workflows and the portal requirements. A first draft of the portal, based on these user requirements, was presented at the workshop.

This draft received positive feedback, but there’s still a long way to go. Nationwide data must be collected and classified as fly or no-fly zones, and then this data needs to be made accessible at scale and visualized effectively.

After project completion in early 2023, there will be clear rules about where drones can fly across all of Germany, and all no-fly zones will be easy to visualize and maintain. Moreover, community members will be able to add zones themselves, so spontaneous activities that lead to temporary no-fly zones can also be added to the fAIRport portal quickly. These initiatives will open up more flight corridors, as only dedicated areas will be restricted. The improved definition of zones will also let local authorities across Germany manage the no-fly zones in their jurisdictions effectively.

We’re excited to see how this project will shake up industries. If you’re interested in similar initiatives, you can also read our article about how wetransform is harnessing the power of open data to save Germany’s forests.

Interested in staying updated about the latest happenings in the world of data interoperability? Sign up for our newsletter here.
