Hong-qiao Wu, Tian-he Chi, Zhang Xin, Jian-bang He
Institute of Geographic Sciences and Natural Resources Research (IGSNRR)
Chinese Academy of Sciences (CAS), Beijing 100101, China
Institute of Computer Technology (ICT),
Chinese Academy of Sciences, Beijing 100080, China
Remote sensing (RS) image data is becoming a main source of spatial information as earth observation and space flight technology develop. Meanwhile, the storage and management of RS image data strongly affects spatial information sharing and the implementation of “Digital Earth”. The ways in which spatial information is stored, managed, published and used have changed dramatically with the development of networks and distributed computing. Since the appearance of large spatial database management systems that support spatial graphics, the distributed storage and management of RS image data has become an active research field.
Storing and managing RS image data with a current Relational Database Management System (RDBMS) has several advantages. RS image data carries more and more spatial information and keeps growing in volume, and an RDBMS can meet the requirements of distributed storage of large data and information sharing. RDBMS is currently the mainstream commercial DBMS, with a long development history, powerful functions, stable performance and well-established industrial standards. Current enterprise RDBMS such as ORACLE, SQL Server, SYBASE and Informix adopt a Client/Server architecture and provide security, integrity and multi-user sharing mechanisms. All these properties are necessary for distributed storage and information sharing.
However, current RDBMS are inadequate for storing and managing complex data types such as graphics, images and sound, which are held as Binary Large Objects (BLOBs), although they handle numbers and text strings well. To the RDBMS such BLOBs carry no meaning, so it cannot support content-based queries on them or analyze them. Furthermore, in a distributed environment these BLOBs must be downloaded before they can be used, which increases the amount of data transmitted. Object-oriented databases (OODB) have been proposed to solve these problems, but the results so far are not satisfactory: such OODB cannot optimize the storage and access of BLOBs. At present, an extended RDBMS (object-relational database) can query BLOB data with SQL, but for multi-resolution, multi-spectral and multi-epoch RS data there are still problems and efficiency is low.
Sharing RS data under a distributed architecture is difficult because RS images are multi-resolution, multi-spectral, multi-epoch and large in volume. Parallel computing techniques can alleviate this situation. Currently, a parallel platform based on distributed shared memory (DSM) has been used successfully in image processing. This paper has studied
Organization of Parallel Platform
In terms of parallel platform structure, PC clusters and heterogeneous systems are different branches of network computing, and both are key research fields. Usually, PC clusters are connected by a LAN, while heterogeneous systems are connected over the Internet. Scientific American (May 1997) claimed that the fastest computer systems would be PC clusters connected by a LAN or the Internet.
In terms of programming interface and memory management, parallel PC clusters are divided into message passing systems, typified by the Message Passing Interface (MPI), and distributed shared memory (DSM) systems. In an MPI system, each processor can access only its own memory, and communication between processors is carried out by passing messages, which requires a special programming style. In a DSM system, the distributed memory is organized into a uniformly addressed memory space shared by all processors under a fixed programming model. Programmers see one huge linear memory space and need not change their programming style much.
This paper adopts the Linux-based DSM platform JIAJIA. The following figure shows the architecture of JIAJIA.
In this architecture, the shared memory is distributed across all nodes; each page has a home node, and each node has a cache. Each node's page table records only the information about pages held in its cache. When a processor accesses a shared page whose home is the local node, the access is handled locally. Otherwise, it either accesses the page directly in its cache or takes a page fault. When a page fault occurs, the fault handler fetches the page from the corresponding home node and places it in the local cache.
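The home-based access path described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not JIAJIA's actual implementation or API: the names `ToyDSM`, `Node`, `home_of` and `write_home`, the 4096-byte page size and the round-robin home assignment are all hypothetical, and cache coherence is omitted entirely.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # assumed page size in bytes (not taken from the paper)


@dataclass
class Node:
    node_id: int
    home_pages: dict = field(default_factory=dict)  # page_no -> bytearray, pages homed on this node
    cache: dict = field(default_factory=dict)       # page_no -> bytearray, copies of remote pages
    faults: int = 0                                 # number of simulated page faults


class ToyDSM:
    """Toy model of a home-based shared address space (no coherence protocol)."""

    def __init__(self, num_nodes, num_pages):
        self.nodes = [Node(i) for i in range(num_nodes)]
        for page_no in range(num_pages):
            home = self.nodes[self.home_of(page_no)]
            home.home_pages[page_no] = bytearray(PAGE_SIZE)

    def home_of(self, page_no):
        # Assumption: homes are assigned round-robin across the nodes.
        return page_no % len(self.nodes)

    def write_home(self, addr, value):
        # Write one byte into the home copy of the page containing addr.
        page_no, offset = divmod(addr, PAGE_SIZE)
        self.nodes[self.home_of(page_no)].home_pages[page_no][offset] = value

    def read(self, node_id, addr):
        page_no, offset = divmod(addr, PAGE_SIZE)
        node = self.nodes[node_id]
        if page_no in node.home_pages:
            # Home page: handled as a purely local access.
            return node.home_pages[page_no][offset]
        if page_no not in node.cache:
            # Simulated page fault: fetch the page from its home node into the cache.
            node.faults += 1
            home = self.nodes[self.home_of(page_no)]
            node.cache[page_no] = bytearray(home.home_pages[page_no])
        return node.cache[page_no][offset]
```

In this sketch, the first read of a remote page by a node counts one fault and later reads of the same page are served from that node's cache, while reads of home pages never fault.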
The Pattern of Distributed Parallel Storage
This paper studies the distributed storage, management and query of large image data on the DSM platform JIAJIA. A logic database of image data is constructed from the solid index of ‘pyramid, block, layer, section’, and the pattern has been implemented with a current RDBMS. Figure 2 illustrates the pattern.
This pattern has the following characteristics:
Distributed storage. In this pattern, the original image data is stored in a distributed file system. The logic database is mapped to the data files by setting up a logical connection between a field of a database table and the real data file.
Parallel processing of data query and transaction. Because the image data is large, client queries incur heavy transaction processing and computation and put severe pressure on the server. These transactions and computations can be assigned to the PC cluster by programming against the interfaces provided by the DSM platform JIAJIA. The server's burden is thus alleviated and responses are faster.
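The two characteristics can be sketched together in a few lines. The example below is a minimal illustration under assumptions, not the system's code: the table `image_file`, its `file_path` field and the helper names are hypothetical, SQLite stands in for the enterprise RDBMS, and a thread pool stands in for the PC cluster that the paper programs through JIAJIA.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def build_catalog(paths):
    """Create a logic table whose file_path field points at the real data files."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE image_file (id INTEGER PRIMARY KEY, file_path TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO image_file (file_path) VALUES (?)", [(p,) for p in paths]
    )
    conn.commit()
    return conn


def file_checksum(path):
    """Stand-in for per-file query work: the sum of all byte values in the file."""
    with open(path, "rb") as f:
        return sum(f.read())


def parallel_query(conn, workers=4):
    """Resolve the logic records to files, then process the files concurrently."""
    rows = conn.execute("SELECT file_path FROM image_file ORDER BY id")
    paths = [row[0] for row in rows]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with the table rows.
        return list(pool.map(file_checksum, paths))
```

The database holds only the logical records; the image bytes stay in the files, and the per-file work is what gets distributed.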
Storage and Organization of Multi-Source Image Data
Considering the difficulty of storing multi-resolution, multi-band and multi-epoch image data in an RDBMS, this paper adopts the solid index mechanism of “pyramid, block, layer, section” together with a logic database to store and manage image data with an RDBMS. This takes full advantage of mature RDBMS techniques.
Solid Index Mechanism of “Pyramid, Block, Layer, Section”
Storage of RS image data must take into account its multi-resolution, multi-band and multi-epoch characteristics. Following the solid index mechanism of “pyramid, block, layer, section”, a solid spatial database based on spatial relations and image data characteristics is constructed. Figure 3 shows the organization pattern based on “pyramid, block, layer, section”.
‘Pyramid’ represents the spatial resolution of the image; different pyramid levels represent different spatial resolutions. The top of the pyramid represents the lowest (coarsest) spatial resolution.
‘Block’ represents a region of the image, i.e. a distinct geographical region in reality.
‘Layer’ represents spectral resolution, i.e. the band of the image.
‘Section’ represents the acquisition time of the image. Image data of the same location from the same band of a sensor may differ because of acquisition time.
‘Pyramid’ organization arranges image data at multiple scales and optimizes data transmission according to the requirements of different users. ‘Block’ organization helps the system quickly locate the corresponding region of the data files, reducing access time and the amount of data read. ‘Layer’ and ‘Section’ organization retains the properties peculiar to RS data: spectral resolution and temporal character. When a user queries image data, the system offers options for selecting image data of different times and different bands.
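As a rough illustration of the four index dimensions, the sketch below models one index entry as a four-part key and supports queries that fix any subset of the dimensions. The class names, the field encodings (integer level, row/column block, band label, date string) and the file paths are hypothetical choices for the example; the paper does not specify them.

```python
from typing import NamedTuple, Optional, Tuple


class SolidKey(NamedTuple):
    pyramid: int             # resolution level; 0 = finest here (assumed convention)
    block: Tuple[int, int]   # (row, col) of the geographic block
    layer: str               # band label
    section: str             # acquisition date


class SolidIndex:
    """Maps (pyramid, block, layer, section) keys to image file paths."""

    def __init__(self):
        self._entries = {}

    def add(self, key: SolidKey, file_path: str) -> None:
        self._entries[key] = file_path

    def query(self,
              pyramid: Optional[int] = None,
              block: Optional[Tuple[int, int]] = None,
              layer: Optional[str] = None,
              section: Optional[str] = None):
        """Return the sorted paths whose key matches every dimension that is given."""
        wanted = (pyramid, block, layer, section)
        return sorted(
            path
            for key, path in self._entries.items()
            if all(w is None or w == k for w, k in zip(wanted, key))
        )
```

For example, fixing only `block` and `layer` returns every resolution level and acquisition time available for that region and band, which corresponds to the optional choices the system offers at query time.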
The solid index of ‘pyramid, block, layer, section’ organizes multi-source image data and forms an index database of the image data, so there is a logical relation between this database and the real image files. Compared with the image files, this database is abstract, virtual and logical, but it can be managed with a current, mature RDBMS.
The logic database is constructed according to the solid index mechanism and the query items provided to clients. A theme refers to the image data from a particular sensor. Different themes are stored as different tables in the RDBMS. Different bands from the same sensor are also stored as different layers, which share a common index field indicating the same source. Image data from the same sensor and the same band but of different times is likewise stored in different tables. This table structure lets the logic database answer queries on the original information according to the peculiarities of RS image data, while avoiding a single large table that would slow response.
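One possible reading of this table layout is sketched below, with SQLite standing in for the RDBMS. The naming scheme, the column set and the `source_id` field are assumptions made for illustration; the paper does not give the actual schema.

```python
import re
import sqlite3


def table_name(sensor, band, epoch):
    """Build a hypothetical per-sensor, per-band, per-epoch table name."""
    name = f"{sensor}_b{band}_{epoch}".lower()
    # Identifiers cannot be bound as SQL parameters, so validate before use.
    if not re.fullmatch(r"[a-z][a-z0-9_]*", name):
        raise ValueError(f"unsafe table name: {name!r}")
    return name


def create_theme_table(conn, sensor, band, epoch):
    """Create one small table for a single sensor/band/epoch combination."""
    name = table_name(sensor, band, epoch)
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {name} ("
        "tile_id INTEGER PRIMARY KEY, "
        "source_id TEXT NOT NULL, "   # shared field marking the common source
        "pyramid INTEGER NOT NULL, "
        "block_row INTEGER NOT NULL, "
        "block_col INTEGER NOT NULL, "
        "file_path TEXT NOT NULL)"
    )
    return name


def list_tables(conn):
    """Return the names of all tables in the logic database, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]
```

Each query then touches only the small table for the requested sensor, band and time, which is the stated reason for avoiding one large table.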
Considering the large size of the original files, it is necessary to cut each original file into regular tiles. When clients query image data, the server returns the corresponding tiles. Before image data is put into the logic database, it is cut into many tiles according to its size in pixels, and these tiles are stored in a separate database. These tiles are not real image data; they are logical tiles that indicate the corresponding part of the real image. In practice, the server looks up the corresponding image file and reads part of the image data according to the logical tile records in the database.
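The idea of logical tiles can be sketched as follows: the tile records store only pixel extents, and the pixels are read from the image file on demand. The example assumes a headerless, single-band, 8-bit, row-major raw file and a default tile size of 256 pixels; the names `LogicalTile`, `make_tiles` and `read_tile` are hypothetical, and real RS formats would need a proper reader.

```python
from typing import BinaryIO, List, NamedTuple


class LogicalTile(NamedTuple):
    row: int      # tile row index
    col: int      # tile column index
    x0: int       # left pixel column of the tile in the image
    y0: int       # top pixel row of the tile in the image
    width: int    # tile width in pixels (edge tiles may be narrower)
    height: int   # tile height in pixels (edge tiles may be shorter)


def make_tiles(img_w: int, img_h: int, tile: int = 256) -> List[LogicalTile]:
    """Cut an img_w x img_h image into logical tiles; no pixel data is copied."""
    tiles = []
    for r, y0 in enumerate(range(0, img_h, tile)):
        for c, x0 in enumerate(range(0, img_w, tile)):
            tiles.append(
                LogicalTile(r, c, x0, y0, min(tile, img_w - x0), min(tile, img_h - y0))
            )
    return tiles


def read_tile(f: BinaryIO, img_w: int, t: LogicalTile) -> List[bytes]:
    """Read only the tile's pixels, one scan line at a time, from a raw file."""
    rows = []
    for y in range(t.y0, t.y0 + t.height):
        f.seek(y * img_w + t.x0)   # one byte per pixel, row-major layout
        rows.append(f.read(t.width))
    return rows
```

Storing `LogicalTile` records in the database and calling something like `read_tile` at query time keeps the database small while still letting the server return only the requested part of the image.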
Based on the image data management pattern above, which adopts the solid index of ‘pyramid, block, layer, section’ and a logic database, a distributed parallel image database system (PARIDS) was constructed to provide storage, management and query of large image data. In applying the system, we used ETM bands 1, 2, 3, 4, 5, 7 and 8, the panchromatic band of SPOT and ten bands of MODIS as experimental data.
The whole system includes geo-information extraction from RS images, construction of the logic database, storage, and setting up the logical relationships among the system's logic tables. In the design, the application system is divided into three layers: the physical layer, the logic layer and the application layer. The architecture and function flow of the system are shown in Figure 4.
The application layer is the interface between the system and its users. Communication between them takes place in the browser over HTTP. The parallel background servers process the parameters submitted to the web servers and call the logic tables.
The logic layer contains the logic database constructed by the solid index mechanism of ‘pyramid, block, layer, section’, and it is the gateway between the application layer and the physical layer. It is connected to the application layer by the web server, and to the physical layer by image processing procedures and algorithms. These procedures are called after the web server has called the logic database.
The physical layer stores the large volume of image data, and every image data file is classified by the solid index of ‘pyramid, block, layer, section’. A strict file catalog is then constructed on the file system, organized by the system itself.
This paper studies distributed parallel storage, management and query of RS image data with a current RDBMS, based on the solid index of ‘pyramid, block, layer, section’ and a logic database. We construct a pattern and apply it to the design of a parallel image database system (PARIDS), which illustrates that the pattern is viable.
This pattern is built on the parallel platform JIAJIA, which also serves as a background server cluster that improves server performance. Porting the pattern from the parallel platform to a stand-alone server should be feasible in theory, at the cost of server-side performance. Considering the rapid improvement of hardware, this study will become more meaningful.
The parallel platform improves the speed and efficiency of the background server. Adding some processing functions on the client side would therefore further demonstrate the system's advantages.