Tech lead: GIS – The Road Ahead

Prof. Arup Dasgupta
Managing Editor

GIS has undergone, and is still undergoing, a metamorphosis that matches the evolution of the technologies that feed into it. Here is a take on the current trends in GIS and the course it is likely to take in future.

The evolution of GIS is the story of the co-creative growth of information and communication technologies (ICT) and their application to geographical studies. Beginning with the use of computer graphics for automated cartography in the early 1970s, GIS has benefitted from every new development in the field of ICT. In fact, it has also contributed to these technologies through the evolution of unique instruments, systems and applications. The basic building blocks of a GIS remain the same, but their characteristics have undergone a sea change, influenced by emerging technologies and new ways of addressing different aspects of the individual building blocks.

GIS data sources have expanded along with technology. We have seen the automation of conventional survey instruments and the emergence of data logging systems. GPS systems which were originally developed for satellite tracking have become important survey instruments. Aerial photography has advanced to aerial and satellite imaging sensors that go beyond the conventional visible and infrared bands into the microwave regions. The next step is GIS-ready imagery.

GIS-ready Imagery
Remote sensing has become an important source of data, but imagery requires processing such as geo-referencing and ortho-rectification before it can be integrated into a GIS database. Prof. Dr.-Ing. Christian Heipke of the Institute of Photogrammetry and Geoinformation, Hannover, concurs and points out that such pre-processing is largely done automatically today, the only exception being the generation of true ortho imagery. According to Heipke, GIS-ready imagery is characterised by up-to-date acquisition, rectification to the projection used in the GIS so that there are no geometric errors, and true ortho imagery in the case of urban areas. Such data, which typically provides a circular error at 90 percent confidence (CE90) of 4.8 m, does not come cheap and may cost up to USD 90 per sq km. However, Heipke opines that such imagery finds a ready market. Paul Ramsey of OpenGeo, on the other hand, feels that ‘GIS ready’ is a prejudicial term used by vendors to insist that certain levels of image processing are a hard minimum. In fact, the required quality is determined by the end use case and user requirements.
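To make the CE90 figure quoted above concrete, here is a minimal sketch of how such a value might be estimated empirically from check points. It assumes CE90 is taken as the radius that contains 90 percent of the horizontal errors, computed with a nearest-rank percentile; the function name and the sample residuals are hypothetical and are not drawn from any vendor's specification.

```python
import math


def ce90(residuals):
    """Empirical CE90: radius containing 90% of horizontal check-point errors.

    residuals: list of (dx, dy) offsets in metres between image-derived and
    surveyed positions. The data passed in below is hypothetical.
    """
    if not residuals:
        raise ValueError("need at least one check point")
    # Horizontal (radial) error of each check point, smallest first.
    radial = sorted(math.hypot(dx, dy) for dx, dy in residuals)
    # Nearest-rank percentile using integer arithmetic: ceil(0.9 * n).
    rank = (9 * len(radial) + 9) // 10
    return radial[rank - 1]


# Hypothetical residuals for ten check points (metres).
sample_residuals = [
    (1.0, 0.5), (-2.0, 1.5), (0.3, -0.4), (3.0, 4.0), (-1.2, 0.9),
    (0.8, -2.1), (2.5, 2.5), (-0.6, 0.2), (1.9, -1.1), (0.1, 0.1),
]

value = ce90(sample_residuals)
print(f"CE90 = {value:.2f} m, within 4.8 m: {value <= 4.8}")
```

Whether a given CE90 is good enough is, as Ramsey argues, a matter of the end use: the same check could be run against a looser or tighter threshold.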

This is borne out by a novel approach to GIS-ready imagery provided by TerraLook, a joint project of the U.S. Geological Survey (USGS) and the National Aeronautics and Space Administration (NASA) Jet Propulsion Laboratory (JPL). TerraLook provides satellite images that can be used to see changes in the earth’s surface over time. Each TerraLook product is a user-specified collection of satellite images selected from imagery archived at the USGS Earth Resources Observation and Science (EROS) Center. Images are bundled with standards-compliant metadata, a world file, and an outline of each image’s ground footprint, enabling their use in geographic information systems (GIS), image processing software and Web mapping applications. Heipke avers that it would make sense to have critical features such as transportation and drainage pre-extracted, since a GIS can be queried only on the basis of such features. But he adds that whether line mapping, which is what “pre-extraction of critical features” was called in the past, is possible is a question of money and time. At the moment, image classification, change detection and feature recognition are possible through image processing tools.
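The world file bundled with each image, mentioned above, is what lets GIS software place pixels on the ground. As a rough illustration, the sketch below parses the six coefficients of a world file and applies the affine transform from pixel (column, row) to map coordinates. The coefficient values are hypothetical and describe an imaginary 30 m, north-up image; they are not taken from any TerraLook product.

```python
def parse_world_file(text):
    """Parse the six numbers of a world file (one per line).

    File order is A, D, B, E, C, F, where
      x = A*col + B*row + C
      y = D*col + E*row + F
    and (C, F) is the map position of the centre of the upper-left pixel.
    """
    values = [float(token) for token in text.split()]
    if len(values) != 6:
        raise ValueError("a world file holds exactly six numbers")
    a, d, b, e, c, f = values
    return a, b, c, d, e, f


def pixel_to_map(coeffs, col, row):
    """Convert a pixel position to map coordinates with the affine transform."""
    a, b, c, d, e, f = coeffs
    return a * col + b * row + c, d * col + e * row + f


# Hypothetical world file: 30 m pixels, no rotation, y decreasing downwards.
example_world_file = "30.0\n0.0\n0.0\n-30.0\n500015.0\n4199985.0\n"

coeffs = parse_world_file(example_world_file)
print(pixel_to_map(coeffs, 0, 0))
print(pixel_to_map(coeffs, 10, 20))
```

The world file carries no projection information of its own, which is why products of this kind pair it with separate metadata.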

Of late, very large imagery contracts have been placed with the leading satellite and aerial imagery providers. According to Heipke, in many cases these include GIS-ready imagery. However, Ramsey feels that big imagery contracts will be less and less the norm as consumer sensors become capable of gathering ‘acceptable for use’ imagery. Why fly a plane when you can launch a balloon? Or stand on a building? In a new twist to the concept of GIS-ready imagery, Ramsey states that a cellphone with a good A-GPS unit, gyroscope and compass can gather imagery which can be made GIS-ready using the right software and base imagery to match against. He cites the examples of Google Street View and Microsoft Photosynth.

Sensor Networks
Sensor networks are becoming increasingly important as a source of live data. According to Heipke, this depends on what we define as a sensor network. GPS networks, for example, have existed for a long time, and so have other networks such as data collection platforms (DCP), which consist of a number of connected sensors delivering data to a central processing unit. The interesting part of sensor networks is the decentralised processing in self-organising ad hoc networks, which makes the network scalable. Traffic is just one example in which such set-ups can be of major help.

Ramsey states that we are a sensor network. A billion people are walking around with sensors in their pockets. Right now they are capturing location, time, pictures and sometimes audio. As time goes on, they will capture even more information. The unique application is real-time demographics. You don’t need to run a decennial census when you know where everyone is right now. Indeed, data from cellphones is being used to study traffic flows and spot areas of potential congestion. There are issues of standardisation which, according to Ramsey, relate to time scales and location frames. After that, most interoperability issues can be handled with clever (or not-so-clever) computer code. However, there are privacy issues. Ramsey opines that they are not uniquely geo; they are part of the larger gestalt of online sharing that is being promoted by social networking sites like Facebook.

Crowd-sourcing is now a legitimate source of data, but issues like standards, validation and privacy remain. On the emerging standards for crowd-sourced data, Ramsey says that the most prominent standard has been the ‘no standard’ standard pushed by OpenStreetMap: little structure, and categorisation defined by the crowd itself. The drawbacks of this approach are evident in the OpenStreetMap data, but the community continues its work, and the next round will likely be something similar to the ‘grammar bots’ of Wikipedia. Map bots will cruise the OpenStreetMap corpus, finding and fixing inconsistencies automatically. Systems akin to page-ranking are also being proposed for establishing the quality of such data.

The report of the 1st EuroSDR Workshop on Crowd Sourcing for the Updating of National Databases, held at the Federal Office of Topography (swisstopo), Wabern, Switzerland, on 20-21 August 2009, tried to classify the ‘crowd’ into groups based on their effort and the utility of the data they produce. As Figure 1 shows, they range from low value casual users to high value experts and open mappers. An interesting group is the passive mappers, who really are the sensors of the sensor networks discussed earlier. We see a blending of crowd-sourcing and sensor networks. According to Ramsey, when the sensor network is made of people carrying devices, the network is the crowd and vice versa. The question is what incentive people have to give up their privacy, their time, or both. So far, the incentive of seeing their local area well mapped has moved things along.

Figure 1: Classification of the ‘crowd’ into groups based on their effort and the utility of the data they produce. Courtesy: EuroSDR
The other interesting outcome of the workshop concerned the issues of data quality and reliability. Figure 2 shows a way to merge the quality requirements of national mapping agencies (NMAs) with the modest capabilities of the crowd. There is also a suggestion to maintain dual databases, one comprising crowd-sourced data and the other the official data, and to merge the two as the confidence in the former approaches that in the latter.
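The dual-database suggestion can be sketched in a few lines. The workshop report, as summarised here, does not prescribe how confidence should be measured, so the scoring rule below (share of positive reviews, damped by one), the 0.8 threshold and all names are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class CrowdFeature:
    feature_id: str
    geometry: tuple          # (lon, lat); a point keeps the sketch simple
    confirmations: int = 0   # independent users who agreed with the feature
    rejections: int = 0      # independent users who disputed it

    @property
    def confidence(self):
        # Hypothetical score: damped so that one confirmation alone
        # cannot reach full confidence.
        return self.confirmations / (self.confirmations + self.rejections + 1)


def merge_trusted(crowd_db, official_db, threshold=0.8):
    """Move features whose confidence meets the threshold into official_db.

    Both databases are plain dicts keyed by feature id in this sketch.
    Returns the ids that were promoted.
    """
    promoted = []
    for fid, feature in list(crowd_db.items()):
        if feature.confidence >= threshold:
            official_db[fid] = feature.geometry
            del crowd_db[fid]
            promoted.append(fid)
    return promoted


crowd_db = {
    "cafe-1": CrowdFeature("cafe-1", (7.45, 46.93), confirmations=8),
    "path-7": CrowdFeature("path-7", (7.46, 46.92), confirmations=2, rejections=1),
}
official_db = {}

print(merge_trusted(crowd_db, official_db))
```

A production system would also need to track who confirmed what and to reconcile geometry conflicts, which this sketch ignores.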

While Google Map Maker is a very visible face of public crowd sourcing, Ramsey points out that the ‘success’ of Map Maker is derived from the ubiquity of Google itself. Many other services, such as Bing Maps, Wikimapia, Waze and OpenAerialMap, have adopted crowd sourcing as a primary data source.


Figure 2: Ways to merge the quality requirements of NMAs with the capabilities of the crowd. Courtesy: EuroSDR
GIS stands apart from simple CAD or automated cartographic systems because of the analytical tools which provide the means to analyse and model systems and situations and provide decision alternatives. These tools have evolved over time and have adopted many of the developments in the ICT world. Ramsey feels that the adoption of advanced computational tools into GIS is 5 to 10 years behind the IT world but it is catching up fast.

Databases and Data Mining
GIS data has moved into the RDBMS world, thereby leveraging the power of SQL. Spatial extensions of SQL can now manage graphical data efficiently and raise the power of querying maps to a new level. With the explosion of data in terms of high resolution imagery as well as data from sensor networks and crowd-sourcing, there is a new set of issues. While data volumes as such pose no problem, there are challenges in areas like data mining. Heipke concurs and points to new classification methods such as support vector machines and graphical models (Markov and Conditional Random Fields), to name only a few new developments. Ramsey adds that the same tools that are being used to analyse Web data, like ‘map-reduce’ technologies for massive data crunching and ‘shared nothing’ storage solutions, apply to geographic data.
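Ramsey's point that map-reduce applies to geographic data can be illustrated with a toy example: counting point observations per grid cell. The sketch below runs in a single process and only mimics the pattern; on a real shared-nothing cluster the map step would run on separate data partitions and the reduce step would merge their partial results. The function names, the one-degree cell size and the sample points are assumptions for illustration.

```python
import math
from collections import defaultdict
from functools import reduce


def map_point(point, cell_size=1.0):
    """Map step: emit (grid cell, 1) for one (lon, lat) observation."""
    lon, lat = point
    cell = (math.floor(lon / cell_size), math.floor(lat / cell_size))
    return cell, 1


def reduce_counts(totals, pair):
    """Reduce step: fold one (cell, count) pair into the running totals."""
    cell, count = pair
    totals[cell] += count
    return totals


def count_per_cell(points, cell_size=1.0):
    """Chain the map and reduce steps over an iterable of points."""
    mapped = (map_point(p, cell_size) for p in points)
    return dict(reduce(reduce_counts, mapped, defaultdict(int)))


# Hypothetical observations as (lon, lat) pairs.
observations = [(72.5, 23.0), (72.9, 23.4), (77.2, 28.6), (-0.1, 51.5)]

print(count_per_cell(observations))
```

Because each point is mapped independently and the reduction is a simple sum, the work can be split across machines without any shared state, which is what makes the pattern attractive for very large point sets.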

The satellite image mosaics published by Microsoft and Google are a direct example of these new techniques being put into practice: imagery for the whole world, mosaicked and processed in massive quantities. As the tools of mass computation are commoditised through services like Amazon and AppEngine, we will see more use of them by mainstream GIS professionals. One such example is Oracle Exadata, which can perform spatial operations on up to 2 terabytes of Database System Global Area memory.

On the issues of archival, retrieval and maintenance, Heipke says that retrieval in particular is difficult, as it amounts to interpreting data with a particular semantic background. Turning images into GIS features is just one example; exploiting point cloud data to generate urban structures is another; integrating different data sets and thus exploiting their synergy is yet another. While visual interpretation and overlap look simple, automating these steps requires a very detailed understanding of both the employed algorithmic framework (including, for example, machine learning approaches) and the application domain.

Open Source
Open source systems have established their relevance in the geospatial world. Yet, there is a sense of competition with proprietary systems. If industry could cooperate in evolving open geospatial standards, why is there a lack of such cooperation when it comes to open source solutions? Ramsey feels that proprietary vendors have a vested interest in a binary on/off view of the problem, because it keeps their existing customers more firmly on the ranch, paying support. There is no technical reason customers cannot build hybrid systems, particularly since open source software places a premium on interoperability with existing systems.

Proprietary systems are promoting cloud-based offerings in the form of Software as a Service (SaaS) to reduce the cost of ownership and upgrades. Do these initiatives pose any problem for open source? Is there a strategy to meet this challenge? Ramsey says that open source is not an entity, so it doesn’t have a ‘strategy’. Open source is likely to end up inside those SaaS entities, doing all the heavy work. Unless you already own the proprietary code, building a SaaS on top of proprietary software is not economical. Horizontally scaling services can only be built on software infrastructure that doesn’t have a large capital cost of deployment: open source software fits that bill. Indeed, we do see situations where open source provides much better solutions as network servers, map servers and internet servers for intranets and internet map services.

Photogrammetry has moved from huge, complex mechanical instruments using aerial photographs to elegant and complex software using digital imagery for determining 3D measurements from 2D stereo data. Optical imagery has been further complemented with interferometric SAR, which overcomes the problems of cloud cover. LiDAR imaging also provides such 3D measurements. So what will dominate the digital photogrammetric world? Will the ubiquitous stereo pair disappear? Heipke feels that images similar to what we humans perceive will always play a major role, because humans can easily interpret such representations; this is not the case, e.g., for radar images. Also, images have a higher resolution than LiDAR data, in both the geometric and the spectral dimension (colour instead of a monochromatic channel only). In particular for 3D structures (urban areas, forests), automatic interpretation of images and of point clouds yields complementary results and should therefore be done in an integrated way. Satellite images with an ever more detailed ground resolution have an increasing importance, but there are also more and more applications for images with about 5 cm pixel size. Thus, aerial images remain necessary, also because acquisition is more flexible and in a number of cases more economical.

One of the key developments in GIS has been its integration with other information systems. We know that GIS grew out of the convergence of CAD, automated cartography, DBMS and later remote sensing. Now we are observing another kind of convergence where GIS is adding spatial value to other systems like design and engineering, business and enterprise management.

Design and Engineering
There is an artificial disconnect between GIS and CAD applications for an engineering project. While a GIS database shows the project site in its true geospatial location, the CAD drawings of the project elements show the details of the elements but not in their geospatial context. Taken further, a GIS may show objects like buildings as extrusions in the correct geographic locations but fail to show the inner details of the buildings. On the other hand, the CAD drawings bring out the inner details. Thus by bringing GIS and CAD together, we can achieve the best of both worlds. Some have begun to call this Geodesign. Geodesign results in a very detailed database which can be used for multiple purposes from routing of utility lines to marking out emergency escape routes. It does form an important information source for designing sustainable infrastructures.

Enterprise Resources Planning (ERP)
Since large enterprises have geographically distributed assets, suppliers and clients, it was a matter of time before GIS was brought in to support ERP activities. This is one area where integration is happening at a rapid pace and has achieved a level of maturity.

Figure 4: Integrating different systems using a Distribution Management System (DMS). Courtesy: www.gita.org
The key issue is the ability of ERP and GIS to access each other’s databases. The most obvious solution has been to provide connectors at both ends, but the most obvious is not always the best. The trend is to design a database that can be accessed by both ERP and GIS. The main difficulty is the nature of the databases: GIS databases are large but relatively static, while ERP databases are lighter but dynamic as they cater to transactions in real time. Roger Langsdon of Logica (Figure 4) suggests an integrated model which uses a Distribution Management System as a common element between the different systems to balance out the disparate requirements.

Business Intelligence
Business Intelligence (BI) has many overlaps with ERP. In fact, that part of ERP which is client driven can be considered to constitute BI. The issues are similar to those of ERP, and the solutions so far are basically through data connectors. There is a large disconnect between BI users and GIS users because BI users think that GIS is too complicated a solution for their tasks, while GIS users think BI is just about making maps and charts. The answer lies in between, and that is the place to be explored.

BI depends on data warehousing and mining, online analytical processing (OLAP), which can be made spatially oriented, reporting tools and a dashboard which offers a graphical picture of the various reports and processes. Extract, transform and load (ETL) is the key to any BI system. Integration can be attempted by making the ETL, data warehouse and OLAP spatially enabled.
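What "spatially enabled" ETL and OLAP might mean in practice can be shown with a toy pipeline: the transform step attaches a spatial dimension (a region) to each record, and a roll-up then aggregates along that dimension. The regions are crude bounding boxes and the records are invented; all names here are hypothetical, and a real system would use proper polygon geometries and a spatial database rather than Python dictionaries.

```python
# Hypothetical sales regions as (min_lon, min_lat, max_lon, max_lat) boxes.
REGIONS = {
    "west": (68.0, 15.0, 76.0, 25.0),
    "north": (74.0, 26.0, 80.0, 32.0),
}


def region_of(lon, lat):
    """Return the first region whose bounding box contains the point."""
    for name, (min_lon, min_lat, max_lon, max_lat) in REGIONS.items():
        if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat:
            return name
    return "unassigned"


def transform(rows):
    """Transform step: add a 'region' attribute derived from lon/lat."""
    return [{**row, "region": region_of(row["lon"], row["lat"])} for row in rows]


def roll_up(facts):
    """OLAP-style roll-up: total amount per region."""
    totals = {}
    for fact in facts:
        totals[fact["region"]] = totals.get(fact["region"], 0) + fact["amount"]
    return totals


# Extract step stand-in: records as they might arrive from a transactional system.
extracted = [
    {"store": "A", "lon": 72.5, "lat": 23.0, "amount": 120},
    {"store": "B", "lon": 77.2, "lat": 28.6, "amount": 80},
    {"store": "C", "lon": 72.8, "lat": 19.0, "amount": 50},
    {"store": "D", "lon": 88.3, "lat": 22.5, "amount": 40},
]

warehouse = transform(extracted)   # load step: kept in memory for the sketch
print(roll_up(warehouse))
```

The point of the design is that the spatial lookup happens once, during ETL, so that downstream BI tools can treat region as an ordinary dimension.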

GIS deployment has also undergone change, from the ubiquitous desktop to the Web and now the Cloud. Mainframes gave way to servers, then to large distributed systems like the GRID, and now to the Cloud. Access is now possible from various platforms like tablets and mobiles.

GIS on the cloud has been described in detail in the May 2011 edition of Geospatial World. One of the applications picking up is Data as a Service (DaaS). John Oeschle, Executive Vice President – Strategy and Product, DigitalGlobe, mentions that their cloud services platform helps give customers fast and easy online access to existing and newly collected imagery so that they have the visual information needed to make critical decisions. It is also a critical part of the evolving technology infrastructure that is helping DigitalGlobe close the gap between the time that an image is collected and processed and the time that a government or business can use it to make more informed decisions.

Figure 5: Geospatial Business Intelligence Courtesy:
At the same time, since the DigitalGlobe cloud is primarily distribution oriented, there is no computing or processing on behalf of customers. Therefore their cloud is unusual in that it doesn’t contain any information that customers would directly upload or consider proprietary. However, efforts like ArcGIS.com, a SaaS realisation where users can upload and analyse data, are picking up interest and it will be interesting to observe the developments. Cloud is also a great opportunity for open source, particularly for PaaS and as back-end engines for SaaS.

Portable devices like tablets and mobile phones have become multi-functional and are used for several spatial applications like navigation, finding points of interest, field data collection and field operations. Mobile devices, when used with the Cloud, can give extremely powerful solutions for field use. Most devices are now powerful enough to handle imagery and, combined with SaaS, can be configured for on-site image analysis for, say, crop monitoring.

GIS has undergone and is undergoing a metamorphosis that matches the evolution of the technologies that feed into it. There are many areas, like natural language spatial relations and interaction by sketching, which were proposed by Mike Goodchild many years ago but have yet to become mainstream. The SixthSense technology demonstrated by Pranav Mistry of the MIT Media Lab is a wearable technology which, when integrated with GIS, can revolutionise person-data interaction and bring new applications to the fore. GIS as a system may metamorphose into a component in a larger system and applications will become more holistic.