
Visualization and analysis of deeply geotemporal data

Using OmniSci to interactively explore structure-level flood risk.

While both spatial and temporal data have been part of the digital landscape for years, their combination at high granularity is relatively new. At Geodesign Technologies, we call this “deeply geotemporal data” (DGT) to distinguish it from time series and conventional GIS data. Increasingly, such datasets serve the planning needs of both enterprise and government customers.

Here we will explore what is possible and useful with new-generation tools, including cloud-based imagery software and a GPU-enabled database. We will consider a case study that uses DGT data to address flooding at community scale by assessing individual structures.

Thinking About Data Volume

First, let us consider data volume, and why it is an issue to be managed. If you only work with summary data, or with very limited primary data, then your personal computer or laptop is likely sufficient. For example, if you deal with “grab samples” of water quality, you can fit a lifetime of observations on a typical laptop. Similarly, if you only work with one parcel of land at a time, or with generalized data, conventional tools like Autodesk’s AutoCAD or ESRI’s ArcGIS may be both appropriate and sufficient. In my experience, it is only when you are dealing with millions to billions of observations that you need to start considering alternatives.

Millions to billions of observations may seem like big numbers, but several technological trends push you into this territory more quickly than you might expect. The first is “transactional” data, which can be anything from point-of-sale records to sensor readings. For example, your business might want to understand how sales relate to local market conditions or store location parameters. If so, you may quickly join the DGT club. Similarly, many of my local and state government clients are accumulating multiple years of permit data. To make useful projections from such data, they increasingly need to build models that consider both time and space.

Spatial Data Gets Temporal

Conventional spatial data sources are rapidly acquiring temporal characteristics. These trends are driven by a combination of three technologies: microsatellites and drones, which are lowering the cost of image acquisition, and machine learning techniques, which are automating map generation from primary data. All three are relatively new and evolving quickly.

Currently, there is a three-way race between satellite, aerial and drone imagery providers. Each platform has inherent advantages and disadvantages, but it is safe to say that this competition will result in increased resolution, band depth and image frequency.

“Remote sensing” from satellites or aircraft has traditionally involved detailed analysis of a single scene, or perhaps paired “before and after” scenes. Yet open public data available today images the entire planet weekly at 10-60m resolution. According to the European Space Agency, one full Sentinel-2 coverage requires about 3 terabytes of storage, so a year’s worth of data would be roughly 52 times that, or about 156 TB. That’s a pile of hard drives! Commercial satellites, airphotos and drones are imaging the entire planet daily, and parts of it in incredible detail.

Case Study: Using Analysis-Ready Satellite Data to Estimate Flood Risk

An exciting recent development in satellite data is the availability of analysis-ready data (ARD). In conventional workflows, end users are responsible for image selection, atmospheric correction, masking, clipping and mosaicking, and all of that data manipulation must happen before they can get to their actual objective, which is typically index generation or classification.

Two examples of such services are Google Earth Engine (GEE) and Tesselo, LLC. The former is a platform that offers a wide range of satellite data but requires JavaScript or Python programming to use. The latter is a startup focused primarily on the European Space Agency’s Sentinel-1 and Sentinel-2 satellites, and does not require programming to use.
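
To give a concrete sense of what that programming looks like, here is a minimal sketch of building a cloud-filtered Sentinel-2 composite with the GEE Python API. It assumes the earthengine-api package is installed and authenticated; the bounding box, dates and cloud threshold are illustrative placeholders rather than values from our project.

```python
# Minimal GEE sketch: a cloud-filtered Sentinel-2 median composite.
# Assumes `earthengine-api` is installed and you have already authenticated.
import ee

ee.Initialize()

# Illustrative area of interest (lon/lat bounding box).
aoi = ee.Geometry.Rectangle([-80.2, 33.8, -80.0, 34.0])

# Sentinel-2 surface reflectance, one month, mostly cloud-free scenes.
collection = (
    ee.ImageCollection('COPERNICUS/S2_SR')
      .filterBounds(aoi)
      .filterDate('2019-06-01', '2019-07-01')
      .filter(ee.Filter.lt('CLOUDY_PIXEL_PERCENTAGE', 20))
)

# Median composite, clipped to the area of interest.
composite = collection.median().clip(aoi)
print('Scenes in composite:', collection.size().getInfo())
```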

For a South Carolina town experiencing serious “nuisance flooding” not well predicted by conventional FEMA floodplains, I worked with colleagues at EcoAccumen to develop a new approach based on analysis-ready data. We leveraged DGT tools and an amazing new open dataset from Microsoft, which has used machine learning techniques on airphoto data to extract building footprints for almost every built structure in the continental U.S.
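
As a rough sketch of how such a footprint layer enters an analysis, the snippet below reads one state’s footprints into GeoPandas and clips them to a study area. The filename and bounding box are illustrative placeholders; it assumes the zipped GeoJSON has already been downloaded from Microsoft’s USBuildingFootprints release.

```python
# Sketch: load Microsoft US Building Footprints for one state with GeoPandas.
# The filename and bounding box below are placeholders, not project values.
import geopandas as gpd

footprints = gpd.read_file("SouthCarolina.geojson")   # statewide polygons

# Clip to an illustrative study-area bounding box (lon/lat).
town = footprints.cx[-80.2:-80.0, 33.8:34.0]
print(f"{len(town):,} structures in the study area")
```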

We used GEE to compute a “Height Above Nearest Drainage” (HAND) value for each individual building footprint. HAND is a relatively simple measure compared to traditional hydrological modeling approaches, since it considers only elevation relative to the nearest water feature. However, it has the considerable virtue of being widely applicable, with no data requirements beyond terrain, water features and the structures at risk.
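
To illustrate the idea (this is not our production GEE workflow), here is a simplified sketch in Python. It approximates “nearest drainage” by straight-line distance rather than along flow paths, and assumes a DEM array and a boolean drainage mask are already in hand.

```python
# Simplified HAND sketch: each cell's elevation minus the elevation of its
# nearest drainage cell. A true HAND model follows flow paths; this
# straight-line version only illustrates the concept.
import numpy as np
from scipy.ndimage import distance_transform_edt

def hand_grid(dem, is_drainage):
    # For every cell, find the indices of the nearest drainage (zero) cell.
    _, (rows, cols) = distance_transform_edt(~is_drainage, return_indices=True)
    return dem - dem[rows, cols]

# Toy 4x4 terrain: a channel runs down the left-hand column.
dem = np.array([[5.0, 6.0, 8.0, 9.0],
                [4.0, 6.0, 7.0, 9.0],
                [3.0, 5.0, 7.0, 8.0],
                [2.0, 4.0, 6.0, 8.0]])
is_drainage = np.zeros_like(dem, dtype=bool)
is_drainage[:, 0] = True          # the water feature

print(hand_grid(dem, is_drainage))  # 0 along the channel, rising to the east
```

Per-building values can then be sampled from this grid at each footprint’s location.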

The result is simple and easy to use, while still being quite powerful. We found a very large range of risk, with complex spatial variation. The dashboard above is fully interactive and scaled to flood stage height in feet. This allows town managers to review a number of scenarios, and provides a useful complement to traditional practices.
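
Behind an interactive stage-height slider, the underlying query is straightforward. The sketch below uses pymapd, the Python client for OmniSci, to count structures whose HAND value falls below a chosen stage; the connection details, table name and column names are illustrative placeholders rather than our actual schema.

```python
# Sketch of the count-by-threshold query behind a flood-stage slider.
# Connection details and the `building_hand` table/columns are placeholders.
from pymapd import connect

con = connect(user="omnisci", password="...", host="localhost", dbname="omnisci")

def structures_at_risk(stage_ft):
    # Hypothetical table with one row per footprint and its HAND value in feet.
    cursor = con.execute(
        f"SELECT COUNT(*) FROM building_hand WHERE hand_ft <= {float(stage_ft)}"
    )
    return cursor.fetchone()[0]

print(structures_at_risk(3.0))   # structures within 3 ft of nearest drainage
```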

Temporal Data Goes Big

Classic time-series data comes from relatively few fixed observation points. For example, consider automated weather station data. The US government has provided authoritative data since 1997 for approximately 2,000 stations across the US. This kind of data can now be supplemented with “crowdsourced” weather stations, which currently provide more than 15 times as many locations.

What does this mean for a typical county? We looked at Placer County, California, using Synoptic Data’s “Mesonet” API to load six months of daily wind speed data. We found 45 weather stations and 860k observations. If we wanted a long-term period of record, we could go back to 1997, which would quickly get us to millions of records. Thanks to GPUs, we can interactively explore this data for any weather variable, station or date range.
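
As a rough sketch of what such a data pull looks like, the snippet below calls Synoptic’s v2 time-series endpoint with the requests library. The token is a placeholder, and the parameter names should be checked against Synoptic’s current documentation.

```python
# Sketch: pull county-level wind observations from Synoptic Data's Mesonet API.
# The token is a placeholder; verify parameter names against Synoptic's docs.
import requests

resp = requests.get(
    "https://api.synopticdata.com/v2/stations/timeseries",
    params={
        "token": "YOUR_API_TOKEN",   # placeholder
        "state": "CA",
        "county": "Placer",
        "vars": "wind_speed",
        "start": "201901010000",     # YYYYMMDDhhmm
        "end": "201906300000",
        "obtimezone": "local",
    },
    timeout=60,
)
data = resp.json()
print("Stations returned:", len(data.get("STATION", [])))
```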

Summary: Managing for DGT

Ignoring current ‘best available data’ is never a wise strategy. In business, your competitors will eat your lunch; in government, your projects will potentially be stalled for ages. New data and tools are always a mix of opportunity and challenge, but big data in general and DGT in particular are not going away. By doing some early planning and testing, you can position yourself and your organization to get the best value from these exciting developments in data.

Note: This is a guest blog by Dr. Michael Flaxman, Founder, Geodesign Technologies