Associate Professor Civil Engineering Department
Sultan Qaboos University MUSCAT (OMAN)
R. Siva Kumar
Director, Military Operations Directorate of the Army HQ. India
Uncertainty is a significant problem in GIS because spatial data tend to be used for purposes for which they were never intended, and because accuracy problems in GIS require consideration of both object-oriented and field-oriented views of geographic variation. Map accuracy is a relatively minor issue in cartography, and map users are rarely aware of the problem. Maps use the very constrained technology of pen and paper to communicate a view of the world to their users. Cartographers feel little need to communicate information on accuracy, except indirectly through map quality statements or detailed legends. But when the same map is digitised and input to a GIS, the mode of use changes. The new uses extend well beyond the domain for which the original map was intended and designed.
A quantitative analysis of maps is not new to GIS, but GIS brings accuracy issues into focus. Moreover, machines used to make measurements in GIS (digital computers) are inherently far more precise than the machines of conventional map analysis. The bitter truth is that all geographical data are inherently inaccurate, and these inaccuracies will propagate through GIS operations in ways that are difficult to predict.
Some laboratory experiments on maps and their accuracy are discussed in the following paragraphs. Error analysis in spatial databases has a direct bearing on the accuracy of GIS and hence requires due consideration.
Map accuracy is an important issue with two aspects, viz. (i) attributes, and (ii) spatial position, whether relative or absolute. The accuracy of a spatial database is derived from that of the map, and all results obtained are compared with the accuracy of the map. The map data, raster data and vector data need to be compared for all three graphic primitives, viz. point, line and area. One of the major requirements in the Digital Topographic Data Base (DTDB) of a large country is the consideration of a variety of features. These features may vary according to the topographical conditions of the regions of the country, and in a large country there could be many types of topography. In the case of India, for example, there are three different types: snow-clad high hills, vast plains and deserts.
Maintenance of the digital topographic database is an aspect that needs to be investigated. With the knowledge now available, digital stereo photogrammetry may help in extracting cartographic features with greater accuracy. Where digital elevation data are available in the form of contours, digital mono plotting can be considered a quicker method of plotting the data for maintenance of the DTDB. Keeping in mind the methodology and the availability of technology, some specific applications for the printing of maps are to be exploited.
Map as a Basic Input for DTDB
Considering the available sources of data for developing a DTDB, topographical maps are the most suitable in view of their rich information content. Along with maps, aerial photography and satellite images are additional sources for collecting data. However, maps remain the main source of data in view of the abundant labelled information that is an inherent part of maps. The scale of a map is a very important aspect since the information content depends mainly on the scale. It has been found that 1:50,000 maps are the most suitable for making a DTDB in Indian conditions, as these maps are readily available and their information content is sufficient to serve many purposes.
In order to digitise these maps, the easiest method was found to be scanning of paper maps. However, it has its own pitfalls. As the maps printed in India are as old as 70 years, a thorough investigation was made using paper maps. For this purpose six copies of the same map, printed in 1929, 1947, 1953, 1966, 1971 and 1975, were used.
Paper used in India for map printing is appropriately called map litho paper. Certain specifications for the paper were decided by the national mapping agency so that the line work printed on a map appears well and there is minimum distortion in the dimensions of the paper due to weathering. Usually the weight of the paper used should be 95 grams per square metre (GSM). The other specifications are:
- Thickness : 95 µm
- Moisture : 8% maximum
- Ash content : 15% maximum
- Brightness : 75 to 80%
- Opacity : 80 to 90%
- Dimensional stability : 0.4% maximum, with change of humidity from 20% to 75%
- pH : 5.5 to 6.5
These specifications are standardised in the Indian Standard IS 12765:1989, Specifications for paper used for printing of maps.
The study was carried out in two ways. Firstly, the dimensions of each map were checked against the theoretical dimensions. Secondly, the map was scanned and the raster data was brought to the theoretical dimensions by rubber sheeting/warping. Then ten each of the different primitives, viz. points, lines and polygons, were measured and the results are tabulated below:
Dimension of the Map: The theoretical dimension of the map on scale 1” = 1 mile, is given in Fig.1.
The actual dimensions of the six maps are tabulated below. As the FPS system was being used in India then, the dimensions are given in inches.
From the above, it is observed that the maximum change in X1 was 0.063” in the maps printed in 1966 and 1975. The maximum change in X2 was 0.057” in the map printed in 1947. The maximum change in the diagonal was 0.060” in the map printed in 1947.
As seen from above, the changes in dimension do not form a pattern, owing to the vagaries of nature, the quality of the paper and the mode of preservation. Hence the error pattern cannot be modelled. The maximum distortion in all four dimensions is of the order of 0.06”, which amounts to about 76 metres in ground terms at a scale of 1:50,000.
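The conversion from a distortion measured on the sheet to a ground distance is simple arithmetic, and a minimal sketch is given here. The function name is illustrative only. The figure of about 76 metres corresponds to a scale of 1:50,000; the value for the 1” = 1 mile scale (1:63,360) is shown alongside for comparison.

```python
MM_PER_INCH = 25.4


def ground_distance_m(map_distance_in: float, scale_denominator: float) -> float:
    """Ground distance in metres for a distance measured on the map in inches."""
    map_distance_m = map_distance_in * MM_PER_INCH / 1000.0
    return map_distance_m * scale_denominator


# Maximum distortion of about 0.06 inch found on the six editions
d_50k = ground_distance_m(0.06, 50_000)        # about 76.2 m
d_inch_mile = ground_distance_m(0.06, 63_360)  # about 96.6 m

print(f"1:50,000 -> {d_50k:.1f} m")
print(f"1:63,360 -> {d_inch_mile:.1f} m")
```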
Effect of Graphic Primitive after Warping:
The maps were scanned at a resolution of 300 dots per inch (dpi) and the raster data was warped/rubber sheeted to the theoretical dimensions. Ten well-distributed points, lines and polygons were chosen. The co-ordinates, lengths and areas of the points, lines and polygons are tabulated in Tables 2, 3 and 4 respectively. For brevity, only the significant values of the co-ordinates, i.e. the hundreds, are shown. All the values given in the following tables are in metres.
Point : The co-ordinates X and Y were read from the monitor for the same 10 points on the six editions of one map. From the results tabulated in Table 2, it is noticed that the error is larger in the Y direction, where it is 300 metres in ground terms.
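To give a sense of scale for the 300 dpi scan, the ground footprint of one pixel can be worked out as sketched here. The scale of 1:50,000 is an assumption taken from the DTDB scale discussed earlier, and the function name is illustrative.

```python
MM_PER_INCH = 25.4


def pixel_ground_size_m(dpi: int, scale_denominator: int) -> float:
    """Ground size in metres covered by one scanned pixel."""
    pixel_on_paper_m = MM_PER_INCH / dpi / 1000.0
    return pixel_on_paper_m * scale_denominator


# Assumed scale: 1:50,000
size = pixel_ground_size_m(300, 50_000)  # about 4.2 m per pixel

# A paper distortion of 0.06 inch expressed in pixels at 300 dpi
distortion_px = 0.06 * 300               # about 18 pixels

print(f"pixel size: {size:.2f} m, distortion: {distortion_px:.0f} pixels")
```

On these assumptions a sheet distortion of 0.06” spans many pixels, so it is well within what the scan resolves.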
Line : Lengths of different lines in metres were measured on the vector data and the results are given in Table 3. The variation in lengths of lines measured is at times even as high as about 100m in a length of about 300 metres.
Areas : A similar exercise was carried out for measurement of areas/polygons and the outcome is shown in Table 4. The areas are in sq.m.
From the results tabulated above, it is noticed that the areas are in error by up to 40%. The errors are due not only to the dimensional instability of the paper, but also to other causes. These errors cannot be modelled and hence cannot be corrected. Another problem in scanning paper maps is that colour separation is not possible, and hence all the details appear on the same scanned dataset. This further increases the time required for digitisation. Hence, ideally, paper should not be used. For transferring the image from the standing negatives, film-based materials have to be used. Other inexpensive materials such as sensitised synthetic paper may be tried out to see whether the image transferred is of the required quality.
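The percentage errors quoted for lengths and areas can be computed as in the following minimal sketch. The co-ordinates are hypothetical and are not taken from Tables 3 or 4; the area is computed with the standard shoelace formula. The example also shows why area errors are larger than linear ones: stretches of 10% and 20% along the two axes combine to a 32% error in area.

```python
def polygon_area(vertices):
    """Planar polygon area by the shoelace formula; vertices are (x, y) in metres."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def percent_error(measured, reference):
    """Signed error of a measured value as a percentage of the reference value."""
    return 100.0 * (measured - reference) / reference


# Hypothetical parcel: true shape, and the same parcel digitised from a
# sheet stretched by 10% in X and 20% in Y
reference = [(0, 0), (100, 0), (100, 100), (0, 100)]
measured = [(0, 0), (110, 0), (110, 120), (0, 120)]

area_ref = polygon_area(reference)               # 10000.0 sq.m
area_meas = polygon_area(measured)               # 13200.0 sq.m
area_err = percent_error(area_meas, area_ref)    # 32.0 %

# A line of about 300 m measured about 100 m too long
line_err = percent_error(400.0, 300.0)           # about 33.3 %

print(area_ref, area_meas, area_err, round(line_err, 1))
```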
Results of Investigation in Scanning:
There is a feeling among a number of professional database experts that one should scan at a higher resolution to capture all the details. The problem with higher scan resolutions is that the quantity of data becomes voluminous and additional noise is picked up, so the effort required to clean up the data is much greater. A balance therefore has to be struck for scan resolution. Another crucial aspect of scanning is choosing the threshold value so that unwanted information and noise are not picked up. Different threshold values were tried to find a suitable figure. The threshold values and scan resolution depend upon the method of digitisation and the features being digitised. After scanning over 200 film positives, the following values were found to be ideal.
One of the problems associated with spatial data arises in polygon overlay. It is common to find that different themes for the same area share common lines. For example, the shore line of a lake will appear both on the layer showing the limits of land and on the hydrology layer. When these themes are overlaid on each other, the two versions of the shore line should match precisely. In practice, it is often found that they do not. Even if they match perfectly on the input documents, i.e. the film positives, their digital representations in the database differ. As the overlay operation is carried out with high precision in the digital domain, the differences are treated as real, thus forming spurious polygons, more commonly known as sliver polygons.
During the course of this work, a method to overcome this problem yielded good results in minimising sliver polygons. In the MicroStation environment, the limit/line is digitised only once, in one particular theme. Subsequently, the same line/limit segment is copied to the other themes from one level to another. When the data pertaining to four sheets on scale 1:50,000 was compared by overlaying all the layers of data, not a single sliver polygon was detected, for the reason stated above. However, this method is not feasible if different layers are digitised by different persons. In that case, at the time of merging the layers, one of the two lines is deleted and replaced by a copy from the other level. The results are given in Table 8.
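The effect of copying a shared line rather than digitising it twice can be illustrated with a small sketch. The shore line co-ordinates are hypothetical, and the helper names are illustrative; they do not represent MicroStation functions. The sliver is measured as the area enclosed between two versions of a line that share end points.

```python
def ring_area(ring):
    """Planar area of a closed ring by the shoelace formula."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def sliver_area(line_a, line_b):
    """Area enclosed between two versions of a line with common end points.

    If the two lines cross each other, signed areas partly cancel, so the
    result is then only a lower bound on the total sliver area.
    """
    ring = list(line_a) + list(reversed(line_b))
    return ring_area(ring)


# Hypothetical shore line digitised separately on two themes (metres)
shore_on_land_layer = [(0, 0), (50, 2), (100, 0)]
shore_on_hydro_layer = [(0, 0), (50, -1), (100, 0)]
area_separate = sliver_area(shore_on_land_layer, shore_on_hydro_layer)  # 150.0

# Digitised once and copied to the other theme
shore_copied = list(shore_on_land_layer)
area_copied = sliver_area(shore_on_land_layer, shore_copied)            # 0.0

print(area_separate, area_copied)
```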
The following alternatives were also tried where two different lines appeared after digitisation. Any number of lines lying within the tolerance distance of each other are assumed to be the same line. A similar approach was used for snap joining lines if the distance between them was less than the set tolerance. However, this at times gives rise to another problem, a conflict between geometry and topology. Because of the tolerance distance, all objects or features smaller than the tolerance become ambiguous or indistinguishable. This problem was overcome by allowing either the topology to correct the geometry or the geometry to correct the topology. Only two such cases were noticed in the pilot database created during the work, and they were corrected simply by allowing the geometry to correct the topology. However, this may have implications for other objects whose properties may change, thereby seriously affecting the integrity of the database. The clean command of the software had both an automatic mode and an interactive mode. Fully clean data, without jeopardising the other aspects, can be achieved only by using the clean command in interactive mode, which flags the errors when invoked.
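The tolerance problem described above can be reproduced with a minimal sketch of greedy point snapping. This is an illustration only and not the clean command of the software used in the study; the tolerance of 0.5 m and all co-ordinates are hypothetical.

```python
import math


def snap_points(points, tolerance):
    """Merge each point with the first already kept point within the tolerance.

    Returns the kept points and, for every input point, the index of the
    kept point it was snapped to.
    """
    kept = []
    mapping = []
    for p in points:
        for i, q in enumerate(kept):
            if math.dist(p, q) <= tolerance:
                mapping.append(i)
                break
        else:
            kept.append(p)
            mapping.append(len(kept) - 1)
    return kept, mapping


points = [
    (0.0, 0.0), (0.3, 0.1),        # two line ends that should be joined
    (10.0, 10.0), (10.3, 10.0),    # corners of a real feature, 0.3 m square,
    (10.3, 10.3), (10.0, 10.3),    # which is smaller than the tolerance
]

kept, mapping = snap_points(points, tolerance=0.5)
print(kept)      # [(0.0, 0.0), (10.0, 10.0)]
print(mapping)   # [0, 0, 1, 1, 1, 1]
```

The two line ends are joined as intended, but the small square collapses to a single point, which is the kind of conflict that the interactive mode allows an operator to review.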
This study reveals that accuracy aspects depend upon source data, mainly the source from which the digitisation is undertaken.
- Paper maps were not found suitable for achieving the desired accuracy, since the ageing of maps affects their dimensional stability and hence there is a great deal of variation.
- Points, lines and areas show different accuracy figures when digitised from maps. Hence it is necessary to try other materials such as synthetic sheets and plastic sheets for digitisation, and not paper prints.
- Old maps should not be used for digitisation.
- On overlaying, it is necessary that sliver polygons are removed from the data. As this is difficult, suitable methods need to be evolved. The method of copying the line from one theme to another appears to be satisfactory, provided the digitising of all the layers of a map is carried out by the same person.