Aligning the German ALKIS-AFIS-ATKIS standard with INSPIRE Annex I

Over the past 15 years, Germany has developed its own set of GML-based spatial data exchange standards, known as ALKIS-AFIS-ATKIS (3A NAS for short). Surveying organisations in all states have implemented the standards, which provides a common foundation for an INSPIRE implementation.

In 2016, the Arbeitsgemeinschaft der Vermessungsverwaltungen der Länder der Bundesrepublik Deutschland (AdV) commissioned wetransform to create formal documentation of the data transformation, with 3A NAS and the “Hauskoordinaten” as the source and 12 INSPIRE Annex I GML schemas as the target. This documentation was to be generated from hale»studio alignments and validated against data sets from multiple German states.

This project has recently been completed, resulting in the first full, formal and executable data transformation specification. These results helped German authorities achieve a breakthrough in the provision of harmonised INSPIRE data sets.

An additional challenge was the high complexity of the source data models, which are much larger than the individual INSPIRE Annex data specifications. Furthermore, the 3A NAS schemas use many special constructs to link features, and the German states have implemented individual variants due to different processes and legal requirements.

In this article, we explain how we created these highly complex alignments with up to 450 cells using hale»studio, which methodologies we applied, and how implementers such as the state of Rheinland-Pfalz have already picked up the results to create INSPIRE-compliant data sets from their 3A NAS production databases.

The baseline for the project was a massive collection of Excel matching tables, equivalent to more than 200 A3 pages when printed out. We used these Excel tables to create the initial alignments. We also worked with the AdV to define common rules for the transformation and for the resulting INSPIRE data sets, such as patterns for gml:id and gml:identifier elements.

ALKIS Data transformed to INSPIRE using hale»studio

Base Alignments and Custom Functions

During the initial analysis of the data models, we saw the need for specific functions and common mappings shared by all the alignments. As both the source and the target models are object-oriented models with rich inheritance hierarchies, we could define the common mappings in one alignment and then import them into all others. These so-called base alignments are reusable components that we imported into all Annex I alignments:

  • base-functions: Common functions for all themes (extended also by the other base alignments)
  • base-tn: Common functions and mappings for rail transport, road transport, water transport, air transport, cable transport
  • au-basis: Common mappings for all variants of the Administrative Units Alignment

The custom functions we wrote for this project included the following:

  • Creation of Geographical Name objects
  • Specific simplification rules for geometries
  • Generation of local IDs and Identifiers according to the AdV identifier rules
  • Conversion of units of measurement

Using the custom functions, we avoided a lot of redundancy in the alignments and reduced their complexity.
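To give an impression of what such helper functions do, here is a minimal Python sketch of identifier generation and unit conversion. It is not the code used in the project: the ID pattern, the namespace handling and the conversion table are hypothetical, and the actual AdV identifier rules are not reproduced here. The only rule the sketch enforces is restricting the local ID to letters, digits, underscore, dot and hyphen.

```python
import re

# Hypothetical conversion factors to metres; not taken from the AdV rules.
TO_METRES = {"m": 1.0, "km": 1000.0, "cm": 0.01}


def to_metres(value: float, unit: str) -> float:
    """Convert a length to metres; raises KeyError for unknown units."""
    return value * TO_METRES[unit]


def local_id(feature_type: str, source_id: str) -> str:
    """Build a local ID as '<feature type>_<source id>' (hypothetical pattern)."""
    # Replace anything outside A-Z, a-z, 0-9, '_', '.', '-' with '_'.
    cleaned = re.sub(r"[^A-Za-z0-9_.-]", "_", source_id)
    return f"{feature_type}_{cleaned}"


def identifier(namespace: str, feature_type: str, source_id: str) -> str:
    """Combine a namespace and a local ID into one identifier string."""
    return f"{namespace.rstrip('/')}/{local_id(feature_type, source_id)}"
```

Keeping rules like these in one shared function means that a change to the identifier pattern has to be made only once, instead of in every mapping cell that produces an ID.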

The Annex I Alignments

The core task in the project was to create the 14 concrete alignments used to generate the formal documentation. We applied the following development process:

  1. Implement the alignment according to the provided mapping tables, collecting any ambiguous points and posting questions on the project issue tracker
  2. Provide the alignments, the generated documents and the transformed data for review to the AdV stakeholders
  3. Implement improvements and fixes as suggested

In this project, we learned that the highly detailed matching tables captured only about 30% of the transformation cells in the final projects correctly or completely. Most of the work went into the review and improvement iterations that followed the initial implementation. The AdV stakeholders provided a lot of very important input, which allowed us to improve the alignments until they reached sufficient quality in all aspects. The following links lead you to the interactive mapping documentation for some of these:

  1. Hauskoordinaten to INSPIRE Addresses
  2. 3A to INSPIRE Addresses
  3. 3A Flurstücke to INSPIRE Cadastral Parcels
  4. 3A Flurstücke to INSPIRE Administrative Units
  5. 3A Gebiete to INSPIRE Administrative Units
  6. 3A kommunale Gebiete to INSPIRE Administrative Units
  7. 3A to INSPIRE Air Transport Network
  8. 3A to INSPIRE Cable Transport Network
  9. 3A to INSPIRE Road Transport Network
  10. 3A to INSPIRE Railway Transport Network
  11. 3A to INSPIRE Water Transport Network
  12. 3A to Geographical Names
  13. 3A to Hydro-Physical Waters
  14. 3A to Hydrography Network

These alignments are currently in the final resolution process of the AdV.

Variants and Derived Alignments

You might have noticed that three alignments have Administrative Units as their target schema. In 3A, the geometry of an Administrative Unit is derived by creating the union of a set of land parcels. This approach reduces redundancy in the data, but can be computationally expensive. As a consequence, we developed one alignment that creates these aggregated geometries for all levels of Administrative Units, and two variants that allow an additional data source with the respective pre-aggregated geometries to be specified.
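The idea of deriving a unit's outline from its parcels can be illustrated with a small, self-contained Python sketch. It assumes that the parcels form a clean partition in which neighbouring parcels share identical edges, so that the outer boundary consists of exactly those edges that belong to only one parcel. This is a simplified illustration, not the geometry union that hale»studio performs, and all names in it are hypothetical.

```python
from collections import Counter

Point = tuple[float, float]
Ring = list[Point]


def boundary_edges(parcels: list[Ring]) -> set[frozenset]:
    """Return the undirected edges that belong to exactly one parcel ring.

    Assumes non-overlapping parcels whose shared borders use identical
    vertices; such shared edges occur twice and are therefore dropped.
    """
    counts: Counter = Counter()
    for ring in parcels:
        # Pair each vertex with its successor, closing the ring at the end.
        for a, b in zip(ring, ring[1:] + ring[:1]):
            counts[frozenset((a, b))] += 1
    return {edge for edge, n in counts.items() if n == 1}
```

Even in this simplified form, every edge of every parcel has to be visited, which hints at why aggregating many thousands of parcels for each administrative level is expensive and why pre-aggregated geometries are an attractive alternative.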

We set up a process to generate derived alignments for subsets of the 3A data models based on the “Modellart”. The “Modellart” is a combination of model and scale: for example, there are landscape models at scales from 1:25,000 to 1:1,000,000. Each “Modellart” includes only a subset of the total 3A model, so the transformation needs to cover only that subset, and some information is not available. We used annotations on the mapping cells to indicate which cell is relevant for which model. Because hale's mappings are declarative, the derived alignments can be created easily by excluding the mappings for feature types that are not part of the respective model.
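Conceptually, deriving such an alignment is a filter over annotated cells. The following Python sketch shows the principle with a hypothetical in-memory cell structure; it does not reflect the hale alignment file format, and the assignment of feature types to a “Modellart” in the usage example is illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """A mapping cell with the models it is relevant for (hypothetical structure)."""
    source_type: str
    target_type: str
    modellarten: frozenset = frozenset()


def derive_alignment(cells: list[Cell], modellart: str) -> list[Cell]:
    """Keep only the cells annotated as relevant for the given Modellart."""
    return [cell for cell in cells if modellart in cell.modellarten]
```

A usage example with illustrative annotations:

```python
cells = [
    Cell("AX_Flurstueck", "CadastralParcel", frozenset({"DLKM"})),
    Cell("AX_Strassenachse", "RoadLink", frozenset({"Basis-DLM", "DLM50"})),
]
dlm50_cells = derive_alignment(cells, "DLM50")  # keeps only the road cell
```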

We also set up another automated generation process to derive modified alignments that use the PostNAS database system instead of 3A XML as the source schema. One of the big advantages of a declarative system is that it makes such derivation processes and the reuse of transformation mappings feasible.
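Because the mappings are data rather than code, switching the source schema largely amounts to rewriting the source references of each cell. The Python sketch below illustrates this with a hypothetical dictionary-based cell structure. It assumes, for illustration only, that a 3A feature type corresponds to a database table with the lower-cased type name; the real derivation process also has to deal with structural differences between the XML and the database schema.

```python
def to_table_name(feature_type: str) -> str:
    """Map a 3A feature type to a table name (assumed lower-case convention)."""
    return feature_type.lower()


def derive_database_cells(cells: list[dict]) -> list[dict]:
    """Return copies of the cells with rewritten sources; targets stay unchanged."""
    return [{**cell, "source": to_table_name(cell["source"])} for cell in cells]
```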

Continuous Testing

For any kind of complex data processing, continuous testing is necessary. We set up an automated process that transformed and validated more than a dozen different data sets after each change to the mappings. This process was implemented with a Gradle script invoking the hale Command Line Interface. This interface has grown in capabilities with each release and can be used to control almost all aspects of hale – be it the transformation, the generation of artifacts such as the formal documentation or the validation of the results.
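The project used a Gradle script for this; the following Python sketch only illustrates the same loop in a compact form. The executable name, the file naming and the exact command line flags are assumptions and should be checked against the hale Command Line Interface documentation before use. The `runner` parameter exists so that the loop can be tried out without hale installed.

```python
import subprocess
from pathlib import Path


def build_command(hale_exe: str, project: str, source: str, target: str) -> list[str]:
    """Assemble one transformation call (flag names are assumptions)."""
    return [
        hale_exe, "transform",
        "-project", str(project),
        "-source", str(source),
        "-target", str(target),
    ]


def run_all(hale_exe, project, sources, out_dir, runner=subprocess.run):
    """Transform every source data set and return the ones that failed."""
    failures = []
    for source in sources:
        # Hypothetical naming scheme for the transformed output files.
        target = Path(out_dir) / (Path(source).stem + "-inspire.gml")
        result = runner(build_command(hale_exe, project, source, str(target)), check=False)
        if result.returncode != 0:
            failures.append(str(source))
    return failures
```

Running such a loop after every change to the mappings makes regressions visible immediately, because a non-empty list of failures can be used to fail the build.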

Documentation

The final deliverable of the project was the formal documentation. For a long time, hale»studio has had the capability to generate both matching tables and HTML documentation. Over the last releases, we have continuously improved the HTML documentation feature, so that the documentation now offers a lot more than any static document could provide. It includes a graphical representation of the mapping, a verbal description, and information on the related schema entities, notes and other information. It is also interactive: search and filter options make it possible to choose what information to display.

Interactive HTML documentation generated from the Hydro Network Alignment

Collaboration

This project was a relatively complex undertaking, with more than 20 stakeholders reviewing the mappings and the transformed data to ensure completeness and correctness of the formal documentation. In the initial project, we used GitLab as an issue tracker and collaboration platform. GitLab is a very useful general-purpose project and source code management platform, much like GitHub. However, we also found some issues with using GitLab for this specific use case:

  • Due to the size of the alignments, it was hard to establish context (e.g. which cell, which data) for any reported issues; reviewers used screenshots of the documentation
  • It was hard to keep track of changes made to the same mapping cells, so that repetitive and competing solutions were implemented
  • Standard diffs don’t work well for communicating changes to a mapping cell, and its history, to the domain experts involved in the projects

We thus implemented collaboration features as part of the documentation itself. These collaboration features enable efficient teamwork in larger groups with diverse backgrounds:

  • Tasks: Create and assign tasks to users of the platform if something needs to be changed in the transformation project.
  • Comments: Start a discussion visible to anybody with access to the transformation project in scope of a single mapping function or in scope of the entire alignment.
  • Notes: Add a private note for yourself to the alignment or any single cell.

These additional features require a central service to function, which we deployed as part of haleconnect.com. Over the last months, we evaluated the use of hale connect to manage our internal transformation projects. We are now starting to use the same processes with our customers to build better transformation projects faster.