Home Articles Comparative study on image data model for integrating multiple resolution image/raster data

Comparative study on image data model for integrating multiple resolution image/raster data

 

Taizou Yamamoto and Ryosuke Shibasaki
Institute of Industrial Science, University of Tokyo
7-22-1, Roppongi, Minato-Ku Tokyo 106, Japan
Tel: (81)-3-3402-6231 Fax: (81)-3-3479-2762
E-mail:[email protected]
 

Abstract
As a wide variety of satellite images with different resolutions become available and various types of geographic information is being accumulated, it is rather common among users to integrate those data for their application s. To conduct consistent and efficient processing of a wide range of geographic information, it is essential to develop data models for geographic information together with efficient operation method. However, we have no sufficient sets of data models and operations for handling image data as geographic or spatial data. This insufficiency leads to the generation of artifacts or the degradation of image data quality in overlying image data or transferring it to different coordinate systems by resampling. This study reveals possible degradation in the overlay and resampling operations, and makes comparison of data models and operations in term of computational load and its degradation. The results implies a possible future image data model.

1.Introduction
As a great variety of image data become available due to the launch of many remote sensing satellites, techniques of extracting necessary information by integrating these data are increasingly important. And as the development of GIS data and remote sensing image data are becoming very common. GIS is now an indispensable system as a platform of integrating information. For GIS, accurate and consistent processing are achieved by representing geographic information using data models and data handling.

So, what kinds of models should be used for imaged data? Image data is intuitively handled as raster data because in practice it is a two dimensional array of pixels which correspond with a certain unit area on the ground surface. But image data may not be raster data exactly all the time since the shape of each pixel neither be square nor rectangular due to viewing angle effects and topographic distortions. This means that we need to devise a new model to represent foot print more accurately, and a set of operations to handle them efficiently.

The objective of this study is to propose and make comparison of image data models together with operations consistent with those models. The performance of those models is evaluated in terms of computational complexity and size of the models. The shape of foot prints may be different, appending on the types of sensors. We paid attention here only to optical sensors whose foot prints are either square or rectangular.

2.Image data models

2.1 Necessity of image data model
Conventionally, pixel position is represented by its central points, while the shape of the pixel is only ambiguously represented by resolution. The shape of the pixel on the ground (foot print) is not represented explicitly. This kind of representation of image data is not sufficient for mapping foot print. We need to represent the foot print more clearly and explicitly. The ambiguity of the foot print representation cause several serious problems in "resampling" process. Resampling is an operation of projecting an image data onto grid cells defined on a given two dimensional coordinate system (Fig.1). Usually, for the implementation of the resampling, centre position of each grid cell is transferred to the image plane, and pixel value at the transferred location is given with interpolation. In many cases, since the size of the grid cells are set to be almost the same with that of the pixel, each pixel of the image data is mapped to a single grid cell.

Fig.1 Resampling
Fig.2 summarizes the possible errors, which may occur in the resampling process. Error percentage denotes the ratio of unoverlapped portion to the original footprint. Fig.3 shows that the error percentage are strongly affected by the relative size of grid cells, rotation angles between the image coordinate systems (line and pixel directions) and the grid-cell is as large as the pixel, maximum error could be 90%. This kind of errors also affect severely to the quality of product, when we overlay high resolution image data on low resolution image data which are resampled to the same coordinate system of the high resolution image data. This problem comes from the same source a the shape of footprints of pixels are not properly represented and no sufficient algorithms for operating those data models are developed. By applying smaller grid cells, of course, this problem can be solved with a conventional resampling algorithm. In those cases, however, computational load of the resampling would be proportional to the numbers of grid cells, which means the load may become quite heavy if "high quality resampling" is required. As will see later, a data model which represent explicitly the footprint boundary can mitigate the increase of computational load rather effectively.

Fig.2 Degradation of Image data in Resampling.

Fig.3 Average Percentage of the Unoverlapping area by the rotation angle* and grid size.
See* Fig.2

 

 

ACRS97

 

 

2.2 Proposal of image data model
1) Fundamental configuration of image data model

The image data model consists of 2-dimensional pixel model and additional footprint boundary models.

2) 2-dimensional pixel model
2-dimensional model have two components; one is 2-doimensional array of pixel values and mapping function which relate the image coordinate values with the ground coordinate values. The mapping function is denoted by f(i,j) , {(i,j) = (line# and pixel#)} for transferring the image coordinate values to the corresponding ground values. The model also has an inverse function f1(x,y,z) , {(x,y,z): ground coordinate values}. These functions are usually non linear function due to the topographic distortions and so on. The boundary of footprint is indirectly or implicitly represented as the boundary of square a rectangle pixel on the image plane. Although operation algorithms consistent with model are not sufficient by provided, this model is conventionally used.

3) Footprint boundary models
The footprint boundary models represent the shape or the boundary of pixel on the ground surfaces clearly. This model is not used independently, but is always used as additional models to the basic 2-dimensional pixel model. We propose two types of model as follows, under this category.

  1. Grid line model
    This model represents the line boundary and the pixel boundary of footprint the parallel straight lines with regular intervals on the 2-dimensional plane. This model has a directional vector, distance of each boundary, and the origin of the initial line and pixel boundary as parameter.
  2. Line segment model
    This model approximates the shape of the footprint (the boundary of line and pixel) by line segments. To represent this model, we should make a table of coordinate values of segment end points (ground coordinate values). Operations based on this model can be conducted efficiently by generating an interim file as follows.

    Ymn £y*We assume each line segment is denoted by a series of end points with coordinate values as (Xpij,Ypij), (XLij,YLij), (XL**,YL**) denotes the boundary of pixels, while (XL**, YL**) is boundary of line, where I is boundary # (pixel#, line#), and j is the serial # of ends points of line segments. In Fig.4 the image data has pixel boundaries (P0,P1,P2,…..) and line boundaries (L0,L1,L2,…). The location of P0 is represented by (Xpoj, Ypoj), {j=0,1,2…..M}.

    Fig.4 Line segment model
    We will show how the intersection of the line (y=y*) between the boundary of pixel (Pm) can be computed. First, find n satisfying the following inequalities. And, the solution of the following equations (Xm(y=y*), y*) is the point of intersection.
     

    Xm(y=y*)= y*-ymn
    ——————-
    Ym(n+1)-ymn
    .(xm(n+1)-xm)+ xmn

    We will compute all x coordinate values of intersection points between the pixel boundaries and y=y* where (y*=a0, a1, a2,…: sorted y coordinates of end of all pixel boundaries descending order) , with the above method. The same procedure is applied to compute all x coordinate values of intersection points with all line boundaries. With this process , we can generate the interim file shown as Table .1. When m is the number of pixel boundaries, n is line boundaries, and the total number of end points is M and N respectively for all the pixel boundaries and all the line boundaries, the maximum computational load is O (M=n+m=N).

3.The operation methods of the image data model
At this section, we list the operations necessary for the integrated use of multi-resolution image data with vector data, examine the computational load of the operations and the possible data quality degradation by resampling and overlaying.

Table 1. Intrim file for Line Segment Model

Line#
Y
0 1 2
b 0 X 0 0 X 0 1 X 0 2  
b 1 X 1 0 X 1 1    
b 2 X 2 0      
:        
 
pixel#
Y
0 1 2
a 0 X 0 0 X 0 1 X 0 2  
a 1 X 1 0 X 1 1    
a 2 X 2 0      
:        

 


ACRS 1997
 
 

3.1 Transformation to raster data (Resampling)
We define that the operation conduct resampling of an image of an image data and transfer resampled data to grid cells on a ground surface. In case grid cells is fully included the pixel, the grid cell will inherit the pixel value. In case the grid cell is larger than the pixel, grid cell will have an average value of all pixel fully and partially included by the grid cell. Here we assume the total number of grid cells is N*N.

1) Two dimensional pixel model
Coordinate values of four corners of grid cells on the ground are mapped to the image plane. Pixels are identified which include the mapped four corners respectively. If all the four corners are included in the same pixels , it is concluded that the grid cell is included in the pixel. Otherwise, a polygon is generated on the image plane by connecting the four grid corners. The grid cell will have an average value of pixels which are fully and partially included by the polygon. When the of the grid cell is relatively larger than that of the pixel, the computational load is proportional to the number of grid cells * the average number of pixels on the boundary of the grid cell (O(N*N*Np); NP is the average number of the pixels on the grid cell boundary). If the grid cell size is relative smaller than that of the pixels, computational load will be proportional to the total number of the grid cells (O(N*N)). The quality of the operation is relatively high when the grid cell is much larger or smaller than that of the pixels, while it is lower in case the grid cell size is almost equal to the pixel size. It should be noted that the convectional algorithm which maps only the center points of grid cells fail to identify pixels which are included by the mapped grid cell and that the quality of the transformation to the raster data is severely degraded.

2) Footprint boundary model

  1. Grid line model
    In Fig. 5 where the interval of pixel boundary and line boundaries is dA, dB respectively, the origin, (x0,y0) and the inclination angle , a ,b, the line of n-th pixel boundary and n-th line boundary (Pn,Ln)are represented by the following equations;

    Fig.5 Grid line model

    Pn:y=tan a .x+(y0-tan a .x0) +dA
    ———-
    cos a
    .n
    Ln:y=tan b.x+(y0-tan b.x0)+ dB
    ———-
    cos b
    .n

    When y is fixed (y=y*, y*: one of the grid boundaries parallel with x-axis), the range of the grid cell corresponding pixel (i,j) is represented by the following equations.

X0+ 1
—–
taba
(y*-y0- dA
——
cosa
.i )£x0+ 1
——
tana
{ y*-y0- dA
—–
cosa
.(i+1)}
 
X0+ 1
—–
tabb
(y*-y0- dB
——
cosb
.i )£x0+ 1
——
tanb
{ y*-y0- dB
—–
cosb
.(i+1)}

 

  1. And when the number of pixel and line is m and n , computational load is O(m+n) for a line parallel with x axis. Since the number of grid cells in a column along y-axis is N, the total computational load is O(N*(m+n)).
     

    • Line segment model
      By using the interim file (Table. 1), when y is fixed (y=y*), we seek m as follows. The first search of m needs the computing time by binary seek method.

      am£y*<am+1
      The range of x fulfilling the next equation provides range or pixel number (obtained by dividing x range by the grid all size) of the grid cell. Repeating the same procedure to line boundary direction, we can figure out which grid cells are included by which pixel. If the interim file is already generated, when the number of pixel and line boundary is m and n, the computational load of this operation with is O(m+n)for a line parallel with x axis. The total computational load is O(N(m+n)).

Y*-am
——-
am+1-am
.(x(m+1)i-xmi) + xnm£x <
y*-am
——-
am+1-am
.(x(m+1)(i+1)-xm(i+1))+xm(i+1)

3.2 Overlaying with vector data
Overlay is an operation of picking up pixels overlapping with or being included by given vector data (point, line segment, polygon ). In short, in the case of point data, pixels including point data should b extracted. In the case of line segment, all pixels on its line segment are to be selected. In the case of polygon, pixels being included by polygon and on boundary of polygon should b chosen. To do this operation, we do not need to use footprint boundary model explicitly, but the 2-dimensional pixel model. We project the point data (end or intermediate points of line segments) to image plane, pick out pixels on line or included by polygon. In this case, we assume the image mapping function F1 to map the line on the ground coordinate system to a line on the image plane.
 


 

 
 

3.3 Overlaying image data
It is an operation of identifying relations between different pixels. In short, it identifies inclusion, overlaying, detached relations between pixels of image A and pixels of image B as shown in Fig.6..

Fig.6 Overlaying
1) Two dimensional pixel model
We can generate F the function of mapping image A to image B and inverse projection as shown in Fig.6, to find out the relation of inclusion between the pixel of image A and image B by the same algorithm as described at 3.1,1)

2) Footprint boundary model

  1. Grid line model
    Similarly as 2-dimensional pixel model, we look for the mapping function F (from image A to image B) and its inverse function. Then, image A can be regarded as grid cell data. After Remote Sensing-generating grid lines from the mapping function F and F1, the same algorithm as 3.1,2) can b applied. An amount of calculation is proportional to the product of the number of the boundary of image A by the number of line boundary of image B plus the number of pixel boundary of image B. So, it is advantageous, image B is an image which has number of pixel and line boundary less than image A.
  2. Line segment model
    It is the same as grid line model, but as this model can not reuse the interim file, it needs time to conduct the same operation to update the interim file.

3. Conclusion
In transforming an image data to a raster data on the ground or overlaying multi-resolution image data, the quality of the results from those operations may be severely degraded, if the footprint of the pixels are not explicitly considered nor represented. To represent the footprint boundary explicitly, we need a new model of image data together with consistent operation methods to handle those models. The authors propose the footprint boundary models. Grid line model, the first type of the footprint model, can be applied to the limited situations, while line segment model, the second type can be applied to rather generic situations. Although it is still problematic how to generate the interim file for those models (that means how efficiently footprint boundaries can be approximated by line segments), once the interim file is generated, the computational load is O(N*(m+n)), where N is the number of lines of grid cells on the ground, m and n are the number of the line boundaries and pixels boundaries respectively). Considering that the computational load with the convectional model and algorithm is N*M(M: is the number of the columns of the grid cells), the line segment model can improve the computational efficiency drastically, when the grid cell size is much smaller than that of the pixels. However, when the ground coordinate system used for defining grid cells is changed frequently, the improvement of the efficiency by the footprint model is reduced, because the interim files have to be re-generated. It will be appropriate that when the grid cell size is not so small compared with the pixel size (the quality of the transformed image is low), the convectional 2-dimensional pixels model should be applied, while the footprint boundary models are better for the grid cells relatively smaller than the pixel (in this case, quality of the transformed image data is high.).

5.Future Prospects
To check the relevance of our ideas, it is important to implement the models and operations. To do so, we need to make an experiment on the computational load and time using actual image data for selecting suitable algorithms. It maybe affected not only by software such as DBMS but also by hardware as workstation or personal-PC. As for the selection of software, we have plan to develop the system with Oracle and SDE(Spatial Database Engine) considering flexibility of the system expansion and improvement. And as for hardware, since the computational load is heavy, we need to study on parallel computing, paying special attention to disk I/O. In addition, we will improve the system to provide the larger quantities of data rather speedy by migrating the data among storage devices considering the frequency of demand for data access.