Mehmet I. Saglam
Istanbul Technical University, Informatics Institute
Advanced Technologies in Engineering
Satellite Communications and Remote Sensing Program, Maslak
Email: [email protected]
Istanbul Technical University, Faculty of Electrical and Electronic Engineering
Electronics and Communication Dept., Maslak
Email: [email protected]
Okan K. Ersoy
Purdue University, School of Electrical and Computer Engineering
West Lafayette, Indiana, USA
Email: [email protected]
High-resolution imaging sensors are very important in modern remote sensing technology. These sensors produce multispectral data, yielding one image per wavelength band. With growing dimensionality and higher spectral resolution, a large number of classes can be identified. When pattern recognition methods are applied to remote sensing problems, the small size of the training data used to design the classifier is an inherent problem. The complex statistical distribution of a large number of classes constitutes another important problem.
The main purpose of developing a special classifier in this thesis was to design a system that performs better than similar classifiers, such as existing support vector machines (SVMs), specifically on complex remote sensing classification problems. A new support vector learning algorithm, called the Linear Support Vector Machine Decision Tree with Self-Organizing Map (SOM-LSVMDT), was developed for remote sensing to address these problems. The SOM-LSVMDT consists of a clustering part and a binary tree structure with linear support vector machines in all tree nodes. It simplifies the model selection problem inherent in SVM design. In addition, the SOM-LSVMDT has built-in properties for dealing with classes that can be considered rare events. Rare events occur for two reasons. The first is natural: a class simply has a low probability of occurrence. The second is structural: classification in the decision tree nodes leads to rare-event classes. To solve the rare-event problem, the randomly-adding-vector method is used to prevent all data vectors from lying on only one side of the hyperplane during training; this problem naturally occurs in the deeper tree nodes. In the SOM-LSVMDT, the SOM part divides the remote sensing data into a number of partitions. As a consequence, smaller decision trees are generated, and the rare-event problem is thereby reduced in scope.
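The randomly-adding-vector idea can be illustrated with a minimal sketch: when every training vector at a tree node carries the same binary label, no separating hyperplane can be fit, so a few random vectors with the opposite label are injected. The sampling scheme below (uniform within the data's bounding box) and the function name are illustrative assumptions, not the thesis's exact procedure.

```python
import numpy as np

def augment_single_sided_node(X, y, n_synthetic=5, rng=None):
    """If all vectors at a node share one binary label (+1 or -1),
    add a few random vectors with the opposite label so an LSVM
    hyperplane can still be trained. Sampling uniformly inside the
    data's bounding box is an assumption for illustration only."""
    rng = np.random.default_rng(rng)
    labels = np.unique(y)
    if len(labels) > 1:
        return X, y                              # node already has both classes
    lo, hi = X.min(axis=0), X.max(axis=0)
    X_new = rng.uniform(lo, hi, size=(n_synthetic, X.shape[1]))
    y_new = np.full(n_synthetic, -labels[0])     # opposite binary label
    return np.vstack([X, X_new]), np.concatenate([y, y_new])
```

With both labels present after augmentation, training at deep, single-class nodes no longer degenerates.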
Before training with the LSVM, the training data is partitioned using the SOM. Multiple linear hyperplanes are then constructed while traversing down the tree with the LSVM. In the testing phase, the output of the current node is selected by the votes of the binary classifiers. This process is repeated until a leaf is reached; the SOM-LSVMDT stores the final class label at the leaf assigned to the data vector reaching it. The computer experiments show that the SOM-LSVMDT achieves better performance than the linear support vector machine decision tree (LSVMDT). The eight-, ten-, and thirteen-class Colorado data sets, which are well-known and very complex remote sensing classification problems, are used to obtain the experimental results. The number of samples in each data set is shown in Table 1.
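The test-time traversal described above can be sketched as follows. Internal nodes hold linear hyperplanes (w, b); each casts a vote from the sign of w·x + b, and the majority side is followed until a leaf is reached. The class names and the exact tie-breaking rule here are illustrative assumptions, not the thesis's verbatim algorithm.

```python
import numpy as np

class TreeNode:
    """One node of the SOM-LSVMDT binary tree: internal nodes hold
    linear hyperplanes (w, b) trained by the LSVM; leaves store the
    final class label."""
    def __init__(self, label=None, hyperplanes=(), left=None, right=None):
        self.label = label              # class label, set only at leaves
        self.hyperplanes = hyperplanes  # iterable of (w, b) pairs
        self.left, self.right = left, right

def classify(node, x):
    """Traverse from the root: at each internal node every binary
    linear classifier votes (left branch if w.x + b >= 0, right
    otherwise); the majority side is followed until a leaf is
    reached, whose stored label is returned."""
    while node.label is None:
        votes = sum(1 if np.dot(w, x) + b >= 0 else -1
                    for w, b in node.hyperplanes)
        node = node.left if votes >= 0 else node.right
    return node.label
```

A data vector thus receives the label of the unique leaf it reaches, matching the leaf-assignment rule in the text.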
Table 1. Number of samples in each Colorado data set.
| 8-Class Colorado | 10-Class Colorado | 13-Class Colorado |
Table 2. Performance of the SOM-LSVMDT on the Colorado data sets (test error).
| Data set | LSVMDT | SOM-LSVMDT II | SOM-LSVMDT III | SOM-LSVMDT IV |
| 8-Class Colorado | 7.17% | 3.40% | 0.97% | 1.90% |
| 10-Class Colorado | 48.62% | 35.02% | 30.69% | 35.98% |
| 13-Class Colorado | 29.48% | 22.64% | 20.08% | 18.40% |
For example, the best previous result offered by support vector machines for the ten-class Colorado problem was around 51% accuracy (49% error). The results are shown in Table 2. The experiment is repeated three times. First, the Colorado data is divided into two clusters. For the 10-class Colorado data set, the testing error decreases from 48.62% to 35.02%. The classification errors of the 8- and 13-class Colorado sets decrease as well; they are shown in the SOM-LSVMDT II column. Next, the self-organizing map divides the Colorado data sets into three clusters (SOM-LSVMDT III), and the classification errors again decrease for all three data sets. In the last column (SOM-LSVMDT IV, four clusters), the performance of the SOM-LSVMDT remains better than that of the LSVMDT, but the classification errors increase except for the 13-class Colorado data set. Whenever the Colorado data sets are clustered, the statistical distribution of the classes changes: the number of classes and samples per cluster decreases as more clusters are obtained with the SOM. Furthermore, the classes are already nonlinearly separated. For example, the Douglas fir/Ponderosa pine/Aspen class has only 25 samples, while water has 408 samples in the 10-class Colorado data set. Two or three clusters are appropriate for a data set with at most 10 classes; to obtain more clusters with the SOM, the number of samples and classes should increase. The classification error of the 13-class Colorado data set also decreases with four clusters. The most important point, however, is that the performance of every SOM-LSVMDT variant is better than that of the LSVMDT.
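The clustering step that produces the SOM-LSVMDT II/III/IV variants (2, 3, and 4 partitions) can be sketched with a minimal one-dimensional SOM. The hyperparameters and the decay schedules below are illustrative assumptions, not the settings used in the thesis.

```python
import numpy as np

def som_partition(X, n_clusters, n_iter=500, lr=0.5, seed=0):
    """Minimal 1-D self-organizing map used only to split the data
    into n_clusters partitions, one LSVM decision tree being trained
    per partition afterward. Learning-rate and neighborhood schedules
    are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    # initialize neuron weights from randomly chosen training samples
    W = X[rng.choice(len(X), n_clusters, replace=False)].astype(float)
    for t in range(n_iter):
        x = X[rng.integers(len(X))]
        bmu = int(np.argmin(np.linalg.norm(W - x, axis=1)))  # best-matching unit
        alpha = lr * (1.0 - t / n_iter)           # decaying learning rate
        sigma = 1.0 * (1.0 - t / n_iter) + 0.1    # shrinking neighborhood width
        for j in range(n_clusters):
            h = np.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
            W[j] += alpha * h * (x - W[j])
    # assign every sample to its best-matching neuron (its partition)
    return np.argmin(np.linalg.norm(X[:, None] - W[None], axis=2), axis=1)
```

A separate LSVM decision tree is then trained on each partition; because every partition contains fewer classes and samples, the resulting trees are smaller and rare-event nodes occur less often, consistent with the error reductions in Table 2.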