Together, Spatineo and Wetransform provide integrated publishing and monitoring of spatial data through web services.

Spatineo is well known in the community for providing monitoring and analytics for spatial web services. Wetransform is best known for data transformation and publishing solutions for spatial data providers. Wetransform’s INSPIRE GIS platform is an easy-to-use solution that makes it effortless to publish, manage and update geospatial data. Keeping this data to a high standard is crucial, since INSPIRE mandates minimum uptime and defines reporting requirements. Spatineo Monitor helps achieve these goals by constantly testing services, keeping data owners up to date, and automating reporting to INSPIRE.

Doesn’t that sound like a logical partnership? We thought so too. After the 2016 edition of the INSPIRE conference, we worked together to build a joint offering that provides more value to our clients.

Directly publish services from INSPIRE GIS to Spatineo Monitoring and Catalogues

This offering focuses on monitoring spatial data services created by wetransform’s INSPIRE GIS platform and on integrating statistics into the platform, but also goes beyond that. Wetransform’s founder, Thorsten Reitz, explains: “In our industry, the big players have a strong partner network that enables them to work effectively and to reach large numbers of customers. We think that this is also possible when startups and SMEs collaborate.”

Spatineo and Wetransform started integrating their platforms this spring, and the first customers are now using the integration. Spatineo Monitor and Wetransform’s INSPIRE GIS platform strengthen each other, and in the near future this teamwork will bring more sophisticated services to our clients.

As of today, the integration provides the following functionality:

  • Automated registration and updating of new services in Spatineo Monitor
  • Optional automated publishing in the Spatineo Directory
  • Display of availability and performance data in INSPIRE GIS

Spatineo’s Managing Director Sampo Savolainen is excited about the cooperation: “Combining Spatineo Monitor’s quality assurance with Wetransform’s excellent INSPIRE GIS platform gives our customers a strong and reliable platform to build successful services that support the application ecosystems using the data.”

INSPIRE Pilots and Data Harmonisation Case Studies

To support INSPIRE implementation and understanding, the JRC coordinates several large-scale INSPIRE pilot projects, such as the Marine Pilot, the Transportation Pilot and the Danube Reference Data and Services Infrastructure (DRDSI).

DRDSI is an initiative that aims to support the implementation of the European Union Strategy for the Danube Region (EUSDR) in close cooperation with key scientific partners. The initiative covers many use cases and datasets that are inherently cross-border, such as:

  • River Basin District Management
  • Assessment of Water Resources
  • Environmental Impact Analysis

Data sets being transformed to INSPIRE using hale studio

In the scope of the DRDSI project, we implemented a data mapping and transformation pilot. The pilot involved all steps of a data harmonisation project, from source data analysis through transformation to data publishing, and was conducted within a short timeframe and on a relatively small budget. The data harmonisation pilot was commissioned by the JRC and executed by wetransform.

Through this work, we created harmonised data that adds content and value to the existing DRDSI. The work aimed at filling gaps in regional datasets by creating harmonised data for bordering countries and documenting results for use in the DRDSI platform.

Analysis

As a first step, we wanted to know whether the existing data was fit for INSPIRE harmonisation. We received seven data sets from Moldova and five data sets from Ukraine. For all data sets, we performed a quick quality analysis, which included the following checks:

  • Completeness: Are all fields in the source data filled?
  • Consistency: Are there many inconsistent values, such as overlapping geometries, or different spellings of the same names?
  • Coverage: Can we likely get the minimum required information to fulfill INSPIRE requirements from the source data sets?
  • Encoding: Is the encoding clear and correct?

Based on the analysis, we decided to use data sets for three different INSPIRE themes for the pilot: Administrative Units, Hydro-Physical Waters and Railway Transport Networks.
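
In code, the kinds of checks described above might look like the following sketch. The record structure and field names are invented for illustration (they are not the actual Moldovan or Ukrainian source schemas), and real consistency checks such as detecting overlapping geometries would need a spatial library on top.

```python
# Illustrative sketch of simple completeness and consistency checks.
# Field names and records are hypothetical.

def completeness(records, fields):
    """Share of records in which all of the given fields are filled."""
    filled = sum(
        all(r.get(f) not in (None, "") for f in fields) for r in records
    )
    return filled / len(records)

def inconsistent_spellings(records, field):
    """Values that differ only in case or whitespace - a consistency smell."""
    seen = {}
    for r in records:
        value = r.get(field, "")
        seen.setdefault(value.strip().lower(), set()).add(value)
    return {k: v for k, v in seen.items() if len(v) > 1}

records = [
    {"name": "Dniester", "type": "river"},
    {"name": "dniester ", "type": "river"},   # same river, sloppy spelling
    {"name": "Prut", "type": ""},             # type field not filled
]
print(completeness(records, ["name", "type"]))  # 2 of 3 records complete
print(inconsistent_spellings(records, "name"))  # flags the two Dniester variants
```

Checks like these are cheap to run up front and quickly tell you whether a source data set can plausibly deliver the minimum information INSPIRE requires.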

Transformation

In many INSPIRE implementation projects, there are two steps: conceptual mapping and transformation development. With hale studio, both steps can be combined into one. There are several functions we used to make sure we got the mapping right – both conceptually and technically. In particular, we used hale studio’s real-time validation features, based on the loaded source data, to assess whether our target data set is schema compliant. For the review of the mapping by the data providers, we generated the interactive documentation and worked on improvements together. You can check out two example transformation projects we created here:

Publishing

When the transformation projects were completed, the next step was to publish the data as INSPIRE View and Download services.

We generally provide two options for delivering services: either as Docker containers, or as public cloud services. As the data providers and research partners in the project didn’t have resources to host the services, we agreed to use haleconnect.com to publish the data sets. However, we also provided instructions on how the project partners could set up services based on deegree directly.

Conclusions

The objective of this project was to quickly implement INSPIRE data sets and services to enable cross-border use cases for the Danube Reference Data and Services Infrastructure. We were able to work very effectively with the data stakeholders, who helped us with the analysis and the mapping through their profound understanding of the data. Using hale studio and hale connect, we acquired, analysed, transformed and published six INSPIRE data sets with a total effort of about 10 person days.

The next version of hale studio, your friendly data transformation tool, is here! This time, our focus was on the integration of hale connect and hale studio. We believe that as a community, we could work together effectively by sharing transformation projects in a dedicated environment. The new capabilities enable you to browse and download shared projects and to upload your own projects directly from hale studio.

As usual, we also improved the software in many other aspects:

  • Arbitrary SQL Statements as Schema or Data Source: If you don’t have the option to define a view on a database you’re accessing, you can now use an arbitrary SQL statement both as schema and as data source.
  • Spatial Operations: We’ve made the spatial index hale creates when loading data accessible through Groovy scripting, and provided helper functions to make usage in scripts easier. We also added a Spatial Join function that supports all modes of the 9-intersection model.
  • Updated mapping documentation for the example projects in the help
  • Import/Export hale schema definitions as JSON
  • DMG image for macOS installation

hale studio 3.3.0: Connect with your Community!

We’ve also fixed several bugs and added smaller improvements. The whole list is available in the changelog.

Get the latest version, and let us know what you think of it!


Share Transformation Projects on haleconnect.com

In infrastructure projects such as INSPIRE, we’re all working with the same set of target data models. A request we often hear from implementers is that they want to see how it’s done: they want to build on how others have created compliant INSPIRE data sets. With the hale studio/hale connect integration, we’ve now made it really easy to share transformation projects and to work on them together. The integration currently enables you to do three things:

  1. Log in to your haleconnect.com account from within hale studio
  2. Browse and download hale transformation projects from haleconnect.com
  3. Share your own hale transformation projects with the community

If you have a private cloud or on-premise installation of hale connect or INSPIRE GIS, you can also log in to that by changing the application settings.

On haleconnect.com itself, you can discuss transformation projects with others, such as project partners or customers. They can create tasks or just leave comments on any level of the transformation project, even on an individual cell.

Going forward, we will provide more capabilities, such as working with schemas shared on haleconnect.com and sharing custom functions. We’re looking forward to your feedback on this!


Arbitrary SQL Statements as Source Schema and Source Data

When you’re working with databases as sources, it can be highly efficient to combine views with hale studio’s transformation capabilities. The main upsides are that some processing, filtering and joining can be done close to the data, and that only the data really required for the transformation is transferred to hale studio. The issue is that sometimes you don’t have the option to create views on the database. In these cases, you can now define an equivalent query using an arbitrary SQL statement, which can include joins, WHERE clauses and anything else your database will allow you to pack into said statement.

You can also use that statement to load exactly the right data for your transformation from the database.
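
To illustrate the idea outside of hale studio, the following Python/sqlite3 sketch shows a statement with a join and a WHERE clause standing in for a view that a read-only user could not have created on the server. The tables and column names are hypothetical; in hale studio, the equivalent SELECT would be entered in the import wizard instead.

```python
import sqlite3

# Hypothetical source database with two related tables.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE river (id INTEGER PRIMARY KEY, name TEXT, basin_id INTEGER);
    CREATE TABLE basin (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO basin VALUES (1, 'Danube');
    INSERT INTO river VALUES (1, 'Prut', 1), (2, 'Mures', 1), (3, 'Rhine', NULL);
""")

# The "arbitrary SQL statement": join plus filter, defined ad hoc
# instead of as a stored view on the database.
statement = """
    SELECT r.name AS river, b.name AS basin
    FROM river r JOIN basin b ON r.basin_id = b.id
    WHERE b.name = 'Danube'
    ORDER BY r.id
"""
rows = con.execute(statement).fetchall()
print(rows)  # only the Danube-basin rivers, already joined and filtered
```

Because the join and filter run inside the database, only the rows actually needed for the transformation cross the wire.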

Thanks to the Bundesanstalt für Wasserbau for funding this work.


Spatial Operations Support

Have you ever wanted to join objects based on the spatial relationship of their geometries, or to perform other typical spatial operations in hale? Now you finally can. We’ve added several features at different levels for this purpose:

  1. hale studio has an internal spatial index that is now accessible through an API and exposed to Groovy Scripting as well as Transformation Functions.
  2. To make usage of the index in Groovy Scripts and Custom Functions easy, we’ve added two helper functions called spatialIndexQuery and boundaryCovers.
  3. There is also a new cousin to the existing Join function, called Spatial Join, which lets you join source objects based on the spatial relationship between their geometries. The Spatial Join function supports the relation types contains, covered by, covers, crosses, equals, intersects, overlaps, touches, and within.
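
Conceptually, a spatial join pairs each source object with every target object whose geometry satisfies a chosen predicate. The toy Python sketch below uses axis-aligned rectangles and a "within" test only to show the shape of the operation; hale studio's actual implementation works on real geometries, supports all nine-intersection relation types, and uses the spatial index rather than the brute-force loop shown here.

```python
# Toy spatial join: rectangles stand in for geometries.
# A rectangle is (xmin, ymin, xmax, ymax).

def within(inner, outer):
    """True if rectangle `inner` lies entirely inside rectangle `outer`."""
    return (inner[0] >= outer[0] and inner[1] >= outer[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])

def spatial_join(sources, targets, predicate):
    """Pair each source with every target whose geometry satisfies `predicate`."""
    return [
        (s["id"], t["id"])
        for s in sources for t in targets
        if predicate(s["bbox"], t["bbox"])
    ]

stations = [{"id": "st1", "bbox": (2, 2, 3, 3)},
            {"id": "st2", "bbox": (8, 8, 9, 9)}]
districts = [{"id": "d1", "bbox": (0, 0, 5, 5)}]
print(spatial_join(stations, districts, within))  # st1 falls inside d1; st2 does not
```

Swapping the predicate (intersects, touches, overlaps, …) changes which pairs the join produces, which is exactly the choice the Spatial Join function exposes.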

Thanks to swisstopo for supporting the work on this feature.


Working with hale schema definition files

We’ve also made another internal capability of hale available to you: the ability to export any data model in a common format, called a hale schema definition. This can be exported as either XML or JSON. So if you’ve ever looked for a tool to convert an XML Schema to JSON, or a shapefile schema to an XML descriptor, look no further.


One thing we’re doing a lot for our customers is to create INSPIRE data sets from their original data. Usually these data sets are available in a specific national or organisation-specific schema and need to be restructured substantially to meet the INSPIRE requirements. This harmonisation process is one of the things that has given INSPIRE a bad reputation, as it is a complex and time-intensive endeavour.

Recently, we passed the 100-datasets-harmonised mark. As we usually track the effort needed for each of these projects, we now start to have a meaningful sample size to judge how much time the development of each of these transformation projects took – time to look at some numbers!

The data that we collect for every project includes the source schema, the target schema, the time spent, and a few additional variables such as schema complexity. In this post, we’re going to look at the mean time spent per target data model, the correlation between source model complexity and effort, and some simple counts.

The dataset

Out of all the projects we’ve done, 68 have time tracking records, and are related to INSPIRE – either they use one of the 34 core data specifications, or an extension of one of those.

Data set counts by required time for transformation project development

As the graph shows, almost exactly half of the projects can be completed in 8 hours or less, while only very few projects took more than 64 hours to complete. 64 hours equal about 10 productive person days when we factor in some overhead.

After looking at the general effort distribution, we wanted to dig a bit deeper – which INSPIRE Annex themes create a lot of effort for us?

Efforts to create a transformation project by target schema

The range the graph shows is pretty wide. While Addresses, Transport Networks and Hydrography Networks are all in the 30+ hour range, most of the other themes show mean times of 5 to 20 hours of required effort. As the orange line in the graph indicates, the number of datasets we’ve included for a given target data model is in many cases very small (1-3), so these numbers are certainly not stable.

Maybe we need to look at the dataset from a different angle. As we often work on a fixed-price basis, we want to make sure the estimates we give are reliable, so it is important for us to know what drives effort up. Thus, the next thing we look at is source data model complexity. We measure complexity using an arbitrary set of measures that tests for the existence of certain model features (such as foreign key relationships and inheritance) as well as for model size, giving a number between 1 (e.g. a single shapefile) and 10 (a massive model with every modelling feature you can imagine).

Effort required for transformation project development by Source Model Complexity

This graph shows an interesting, though not really unexpected, relationship. On the x-axis, we see the source model complexity; on the y-axis, the time spent on the projects. We indicate effort and complexity for each project with a blue dot, and the trendline with an orange dotted line. The relationship is pretty clear: the more complex the model, the higher the mean effort. The trendline is actually almost linear, and shows a growth from about 3 to 28 hours over the complexity range from 1 to 10, which is a factor close to 10.
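
A trendline like this is an ordinary least-squares fit. The sketch below computes one in plain Python over synthetic (complexity, hours) sample points that merely resemble the graph; they are not our actual project records, so only the method, not the numbers, should be taken literally.

```python
# Ordinary least-squares trendline over (complexity, hours) pairs.
# The sample points below are synthetic illustrations, NOT the
# original project data.

def linear_fit(points):
    """Return (slope, intercept) of the least-squares line through points."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return slope, (sy - slope * sx) / n

samples = [(1, 4), (2, 6), (4, 12), (6, 16), (8, 23), (10, 28)]
slope, intercept = linear_fit(samples)
# Predicted effort at the two ends of the complexity range:
print(round(slope * 1 + intercept, 1))   # roughly 3-4 hours at complexity 1
print(round(slope * 10 + intercept, 1))  # roughly 28 hours at complexity 10
```

With points shaped like the graph, the fitted line predicts a few hours of effort at complexity 1 and close to 28 hours at complexity 10, matching the near-tenfold growth described above.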

Our conclusions?

  • Source model complexity is so far the best indicator for expected effort in a project;
  • Effort varies a lot across different INSPIRE themes;
  • Overall, more than half of the INSPIRE harmonisation projects can be completed in less than a day (caveat: we are quite experienced, so a person who knows less about INSPIRE and hale studio will need more time).

What are your experiences? How much time did you spend on transformation project setup?

Michael Lutz and Athina Trakas decided to spice up the OGC Europe forum slot at the Delft Technical Meeting by asking for position papers around the question “What if we would start implementation of INSPIRE again today?”

Participants of the workshop engage in discussions. Photo by Michael Lutz

More than 40 participants joined the workshop and, to kick things off, saw eight three-minute presentations with a wide range of suggestions:

  • Satish Sankaran (Esri Inc.) asked “What are the right metrics to measure success?”, and suggested that adoption rates could be improved if people could contribute to the infrastructure without requiring full compliance.
  • Paul van Genuchten (GeoCAT) highlighted the potential of INSPIRE linked data as explored in the GeoNovum testbed Geo4Web.
  • Thijs Brentjes (Geonovum) suggested a set of data specifications with a simple base and no mandatory extensions, built on top of the existing (INSPIRE) SDI. He also suggested using additional encodings with the objective of making INSPIRE usable by web developers.
  • Sylvain Grellet (BRGM) also suggested that adoption would be easier with alternative encodings (SF, JSON, …). He was the first presenter to suggest different levels or labels for compliance. Sylvain also said that joint funding of development should be organised from the start, instead of leaving everything to the implementers, and suggested better organising aspects such as trainings and hackathons.
  • Clemens Portele (Interactive Instruments) explained how important stability and reliability are for a major infrastructure project like INSPIRE. He suggested improving specifications through small, iterative changes; it should be possible to make these changes in an agile, fast, usage-driven way. He briefly outlined the work done by Geonovum and Interactive Instruments on making data accessible to web applications, and suggested putting facades or proxies on top of the existing INSPIRE infrastructure.
  • Thorsten Reitz (wetransform) focused on the user experience of applications built directly to manage and explore INSPIRE data and services. He explained that, with most of the investment going into backend infrastructure, these applications leave a lot to be desired, even though they would be essential to show the value of the infrastructure.
  • A representative of Natural Resources Canada explained the objectives of the Maps for HTML standards working group and explained that adding capabilities such as MapML would foster usage and adoption.
  • Peter Baumann (rasdaman GmbH) asked “What if our services could talk?”, but wasn’t able to join in person.

These three recurring topics emerged:

  1. tiered or more flexible compliance,
  2. usage of web standards and improvements to data usability and
  3. adding proxy layers on top of the INSPIRE infrastructure.

Participants of the workshop engage in discussions. Photo by Michael Lutz

Following this agenda-setting, we split up into four groups to discuss several key questions, using the World Café methodology:

  • What standards and technologies should the infrastructure be based on?
  • What architectural pattern would you recommend? What should be the main components of the infrastructure?
  • How would you organise the implementation process and make it cost-efficient?
  • How would you ensure a wide adoption and use of the infrastructure?

Athina and Michael asked me to facilitate the group discussion around the third question: “How would you organise the implementation process and make it cost-efficient?” Our objective was to define two or three recommendations and to suggest follow-up actions. As facilitator, I asked the following questions, and a lively discussion ensued:

What is the very first thing that should happen in the implementation process?

  • Define criteria for success early on (not “only” for compliance)
  • Define interoperability (data is interoperable when it is usable in clients); “general” interoperability isn’t the goal, usage is!
  • Provide end product specifications for high value use cases to large numbers of users
  • Define success for concrete users in addition to the overall objective

If you had one year to implement INSPIRE from scratch, how would you do it?

  • Start with simple data models, manage complexity (necessary vs. unnecessary, e.g. in ISO and metadata), expand over time with new use cases
  • Follow pragmatic approaches
  • Use a well-defined set of interfaces that are already proven (mainstream IT industry)
  • Follow trend in industry towards RESTful mechanisms
  • Focus on pushing data out (e.g. as done in Copernicus)

What is the process to continuously coordinate implementation and make re-use possible?

  • Use agile methods such as iterative development
  • Keep users closely involved
  • Identify anything
  • Keep existing infrastructure, build agile infrastructure on top, then let the market choose what works
  • Transparency, make it available
  • The metric could be: how many users do you have? Users drive continual improvement

How can implementation of key components be coordinated in an efficient way?

  • Coordinate development of core components early on, in particular validation and registries (e.g. code lists, extensions)
  • Common components should be coordinated; Harmonisation may be such an issue
  • No mandatory components, but all components for a reference implementation should be available
  • Make sure that requirements that are specific to INSPIRE are really understood well - Value vs. effort/costs on very specific INSPIRE requirements?
  • Clarify the business case for the implementation coordinator - is that organisation paid for by taxes?
  • Countries see lots of liability and low central investment
  • Central funding is low compared to the overall investment required for implementation, and compared to Copernicus
  • Harmonisation across INSPIRE and Copernicus currently looks like a Godzilla vs. King Kong fight and will be difficult to achieve effectively

Participants of the workshop engage in discussions. Photo by Michael Lutz

We then consolidated recommendations:

  1. We should treat INSPIRE not as separate infrastructure, but rather as integrated with existing products and processes, e.g. by extending national models to meet additional INSPIRE requirements
  2. To be cost-effective, INSPIRE should not be something specific, but a general infrastructure and a natural part of what we are doing
  3. We would reframe INSPIRE in the context of the Open Data Movement to limit competition between INSPIRE and Open Data / Linked Open Data
  4. We should also make sure products are designed from the user experience first
  5. We have to orient implementation guidance towards implementers’ questions and problems (“How do I provide a bridge in INSPIRE?” currently can’t be answered by a domain professional)
  6. Have a library of reference implementations to describe how it’s done for all annexes

… and Follow-Up actions:

  1. Collect reference implementations as concrete guides and publish those
  2. Provide compliance levels (?) as a means to get in easily
  3. Find out how to react to new use cases in an agile way

A side discussion that came up in our group, as well as in at least two of the other three groups, was what it really means to be compliant, but I’ll leave that for another post ☺.

All in all, I really enjoyed the highly interactive format of the “What if…?” workshop and the productive discussions, which did not just rehash previously discussed issues. Thanks to Athina and Michael!
