Data models and GML application schemas: Key to interoperability

OGC Beat: People involved in geospatial data sharing are familiar with geospatial data models. Because the OGC Geography Markup Language (GML) is so widely implemented in GIS and other geospatial software products and solutions, data sharing decision makers also need to understand GML application schemas and profiles.

Data models and data coordination
An information community is an industry, profession, academic discipline or other domain that shares a set of spatial information communication requirements. The data model used by an information community is an expression of those requirements.

Information communities have been doing ‘data coordination’ for decades to develop common data models. A data model details how to make information about real world objects useful as digital data. Domain experts in fields such as agriculture, weather or hydrology create data models by representing geospatial features and feature relationships in a conceptual language, such as the Unified Modeling Language (UML). The conceptual model is then used to design database tables and encodings.

Different information communities work on different kinds of problems, and so they have different data models; yet they still need to share information with other organisations.

GML: International data encoding standard
The OGC Geography Markup Language (GML) Encoding Standard, a widely implemented international open standard, was developed to support the sharing of vector data as well as raster images and other kinds of spatial data. OGC Web Service interface standards, such as the OGC Web Feature Service (WFS) Interface Standard, provide the basis for operations to request GML-encoded data and to respond to such requests. The WFS standard is implemented in nearly all commercial GIS products and applications.
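To make the request side concrete, here is a minimal sketch of how a client might build a WFS GetFeature request that asks a server for GML-encoded features. It assumes WFS 2.0.0 key-value-pair encoding; the endpoint `https://example.org/wfs` and the feature type `hydro:River` are hypothetical placeholders, not a real service.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_getfeature_url(base_url, type_name, max_features=10):
    """Build a WFS 2.0.0 GetFeature URL using key-value-pair encoding."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,  # WFS 2.0 parameter name (1.x uses typeName)
        "count": max_features,   # WFS 2.0 parameter name (1.x uses maxFeatures)
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and feature type, for illustration only.
url = build_getfeature_url("https://example.org/wfs", "hydro:River", 5)
print(url)
```

A server that implements the standard would answer such a request with a feature collection, typically encoded in GML according to the application schema that defines the requested feature type.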

Figure: GML core schemas

GML provides the basis for domain- or community-specific ‘application schemas’ and ‘profiles’ that support data interoperability within a community of interest. These tailored, streamlined versions of GML require no change in any system that already implements OGC Web Service interface standards.

A profile is an implementation case of a more general standard or set of standards. An application schema is a profile that only implements one standard. The difference between application schemas and profiles, however, is not important for the purpose of this article.

GML contains a rich set of constructs for encodings: feature, geometry, coordinate reference system, time, dynamic feature, coverage (including geographic images), unit of measure, map presentation styling rules, etc. Many of these constructs are not needed for a particular data model, and so they can be excluded from an application schema or profile. As an extreme example, the GML Point Profile contains only a single GML geometry, namely the <gml:Point> object type.
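A short sketch shows how little a consumer needs to handle when the only geometry is a point. The sample instance below is illustrative: it assumes the GML 3.2 namespace and a `gml:pos` coordinate string, and the identifier, coordinates and helper name `read_point` are made up for this example.

```python
import xml.etree.ElementTree as ET

# GML 3.2 namespace; earlier GML versions use a different namespace URI.
GML_NS = {"gml": "http://www.opengis.net/gml/3.2"}

# Illustrative instance; the id and coordinates are invented.
POINT_XML = """<gml:Point xmlns:gml="http://www.opengis.net/gml/3.2"
    gml:id="p1" srsName="urn:ogc:def:crs:EPSG::4326">
  <gml:pos>51.5 -0.12</gml:pos>
</gml:Point>"""


def read_point(xml_text):
    """Return (srsName, coordinate tuple) for a single gml:Point element."""
    root = ET.fromstring(xml_text)
    pos = root.find("gml:pos", GML_NS)
    coords = tuple(float(value) for value in pos.text.split())
    return root.get("srsName"), coords


srs, coords = read_point(POINT_XML)
print(srs, coords)
```

The axis order of the values in `gml:pos` is defined by the coordinate reference system named in `srsName`, so a real consumer would look that up rather than assume an order.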

In designing an application schema or profile, an information community includes only what is necessary to encode its information model. This results in much simpler, ‘lighter weight’ GML encodings that require less storage, less bandwidth and less processing time. Developers also need less time to learn and implement the encodings. More than 30 GML profiles and application schemas across multiple communities are documented in a list on the OGC Network, but that list has not been updated recently, and the actual number of profiles and application schemas is now much larger.

When an information community implements their data model in GML using a GML application schema and/or profile, they realise tremendous gains in interoperability. Data encoded in two different GML application schemas by two different communities of interest may be integrated in a map or further conflated in ways that account for semantic inconsistencies.
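The geometry-level part of this integration can be sketched briefly. The two instances below belong to two hypothetical application schemas (the `http://example.org/roads` and `http://example.org/hydro` namespaces and all element names in them are invented), yet both reuse the same `gml:Point` construct, so one routine can collect locations from either. Reconciling semantic differences between the two models is a separate task that this sketch does not attempt.

```python
import xml.etree.ElementTree as ET

GML = "http://www.opengis.net/gml/3.2"

# Two invented application schemas that both reuse gml:Point.
ROADS = """<rd:Junction xmlns:rd="http://example.org/roads"
    xmlns:gml="http://www.opengis.net/gml/3.2" gml:id="j1">
  <rd:location>
    <gml:Point gml:id="j1p" srsName="urn:ogc:def:crs:EPSG::4326">
      <gml:pos>52.0 0.5</gml:pos>
    </gml:Point>
  </rd:location>
</rd:Junction>"""

WELLS = """<hy:Well xmlns:hy="http://example.org/hydro"
    xmlns:gml="http://www.opengis.net/gml/3.2" gml:id="w1">
  <hy:depthMetres>40</hy:depthMetres>
  <hy:position>
    <gml:Point gml:id="w1p" srsName="urn:ogc:def:crs:EPSG::4326">
      <gml:pos>52.1 0.6</gml:pos>
    </gml:Point>
  </hy:position>
</hy:Well>"""


def collect_points(*documents):
    """Gather every gml:Point, whatever feature type contains it."""
    points = []
    for doc in documents:
        root = ET.fromstring(doc)
        for point in root.iter(f"{{{GML}}}Point"):
            pos = point.find(f"{{{GML}}}pos")
            coords = tuple(float(value) for value in pos.text.split())
            points.append((root.tag, point.get("srsName"), coords))
    return points


for feature_tag, srs, coords in collect_points(ROADS, WELLS):
    print(feature_tag, srs, coords)
```

Because the routine matches only on the shared GML namespace, it needs no knowledge of either community's feature types to place both datasets on the same map.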

The Ordnance Survey of Great Britain and the U.S. Census Bureau (in its TIGER data) provide data encoded according to GML application schemas. Also, the International Association of Oil & Gas Producers’ (OGP) Seabed Survey Data Model (SSDM) is implemented as a GML application schema. In Europe, the Cultural Heritage data model built under the auspices of the European INSPIRE Directive uses a GML application schema to capture the content and structure of the georeferenced cultural heritage data set.

Testing GML application schemas
The OGC Compliance Program provides a free online testing facility that information communities can use to validate GML 3.2.1 instances or GML application profiles. Several reference implementations are also available in the Compliance wiki. Implementations that pass a compliance test can be certified through the OGC certification process.

The OGC Compliance Program enables governments to easily and effectively mandate GML in procurement language. It also provides a way for developers to gain market visibility.

Summing up
After information communities create agreed-upon data models, they can then create GML application schemas and profiles based on those data models. OGC validation tools are available that communities can use to be sure their schemas and profiles are implemented correctly. Because the OGC GML Encoding Standard is implemented in the software of so many geospatial product providers and solution providers, it is an extremely useful encoding standard and an essential enabler for intra-community and inter-community interoperability.