**Ali Ghafouri**

Faculty of Geodesy and Geomatics Eng.,

KN Toosi University of Technology,

Vali_Asr Street,

Mirdamad Cross, Tehran, Iran

[email protected]

**Abstract**

Any subpixel classification process, such as Linear Spectral Unmixing, needs endmembers to decompose mixed pixels and determine the fraction of each component. Roberts et al. (1998) indicate that a wrong selection of endmembers can cause fractions to appear negative or superpositive (more than 100%). In an n-component space it is assumed that the endmembers occur at the vertices of the hyper-solid, defined as the geometric shape bounding the pixel values in that space.

Two different methods for selecting the endmembers from the edges of this component space were employed. The first method sought to maximally include pixels from the edges of the two-component distribution; that is, it selected and averaged the reflectance of a larger number of pixels occurring at the edges of the distribution. The second method included only the extreme values of the distribution edge; it is termed minimally inclusive and averages fewer pixels. Both procedures were applied to an ASTER scene and the results are evaluated.

The correlation coefficient between some corresponding endmembers of the two methods was calculated to be between 0.96 and 0.99. The error of the estimate for both inversions is consistently below 0.02 (95%). For the RMS error, the minimally inclusive inversion has a somewhat higher mean and a higher standard deviation. For the fractions, the two inversions differ by 0.01 in their means and 0.03 in their standard deviations, with the minimally inclusive inversion having the lower mean and standard deviation. It also has a higher minimum and a lower maximum, which shows that the minimally inclusive selection produces results that better constrain the fractions to a unity range.

ASTER satellite data were used to study the spectral features and classification of land cover, especially agricultural fields, in north-eastern Markazi province, Iran. Ground investigation data allowed a better evaluation of the results.

**1. Introduction**

Satellite Remote Sensing has made information collection available where field surveying has fallen short because of prohibiting factors such as cost, timing and terrain difficulties [8]. As a natural consequence, Remote Sensing science is developing models for identifying spatial patterns in a larger spatial and temporal context with relatively high accuracy. A satellite remote sensing based approach to identifying land covers and quantifying vegetation cover can prove very important in the fight against land degradation and vegetation loss [6]. Numerous studies have suggested ways to infer vegetation quantities from satellite imagery, most of which take advantage of the high absorbance of vegetation in the visible spectrum and its high reflectance in the near infrared.

This study employed satellite Remote Sensing and the Linear Spectral Unmixing approach to evaluate the feasibility of the approach in an agricultural area. Four endmembers were selected from the image, corresponding to the four main land covers: pomegranate, sunflower, cotton and grass/tree. The linear unmixing model proved promising and gave reasonable results, although the unmixing results were found to be highly sensitive to the endmember selection approach. On this basis, further experiments were carried out to identify a single endmember selection method that could be recommended as robust.

**2. Endmembers as Pure Reflectance Spectra**

An endmember is the pure reflectance spectrum of a specific target material, with no mixing with any other material. Determining the endmembers is the fundamental stage of any subpixel classification, because endmember selection is comparable to the training procedure in supervised image classification; neglecting this primary stage can severely degrade the results. In linear mixture modelling, the matrix *E* is made up of the endmembers in the equation

$$\mathit{DN} = E \, \mathit{frac} + \varepsilon \qquad (1)$$

in which *frac* denotes the *c×1* fractions vector with the proportions of the different ground cover types, *DN* represents an *n×1* pixel vector or multispectral observation, and ε is the error vector [4]. Each column of the *n×c* matrix *E* contains the spectrum of a so-called endmember, which is the reflectance typical for a resolution cell containing nothing but the cover type of interest. The i-th endmember spectrum is equal to the mean vector *ei* of class *i*, although other views exist as well.

The endmembers can be selected either from spectral libraries or from the image itself. Selecting the components from the image has the advantage of being simple, easily done, and on the same scale of measurement as the data [7]. Alternatively, they can be obtained from field measurements calibrated for each component involved, or selected from a spectral library [4].
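As a minimal sketch of the linear mixture model described above, the following computes a mixed pixel from an endmember matrix and a fractions vector. The function name `mix_pixel` and all reflectance values are hypothetical and chosen only for illustration; they are not taken from the ASTER scene.

```python
def mix_pixel(E, frac, noise=None):
    """Return DN = E * frac + noise for an n-band pixel.

    E     : n x c list of lists, one endmember spectrum per column
    frac  : c fractions (expected to sum to 1)
    noise : optional n-element error vector (defaults to zero)
    """
    n, c = len(E), len(frac)
    if noise is None:
        noise = [0.0] * n
    return [sum(E[b][i] * frac[i] for i in range(c)) + noise[b]
            for b in range(n)]


# Hypothetical 3-band spectra for two cover types
# (first column: vegetation, second column: soil).
E = [
    [0.05, 0.20],
    [0.04, 0.25],
    [0.50, 0.30],
]
frac = [0.25, 0.75]          # 25% vegetation, 75% soil
dn = mix_pixel(E, frac)
print([round(v, 4) for v in dn])
```

Each band of the mixed pixel is simply the fraction-weighted sum of the endmember reflectances in that band, which is the relationship that unmixing later inverts.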

In the case that they are selected from the image itself, the endmembers can be identified as the extreme values in a plot of two bands. This holds true provided that the components are spectrally distinct enough and the bands plotted are sufficiently uncorrelated. The latter condition can be especially problematic, since an examination of the ASTER bands shows a high degree of correlation, particularly between the three visible bands.
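One way to picture the selection of extreme values from a two-band plot is sketched below. It assumes that "extremeness" can be expressed by a scoring function and that the endmember is the average of the `k` highest-scoring pixels; the function `edge_endmember`, the score, the parameter `k` and the pixel values are all hypothetical. A small `k` corresponds to a selection that uses only the extreme values, and a larger `k` to one that averages more edge pixels, which are the two selection strategies this paper compares.

```python
def edge_endmember(pixels, score, k):
    """Average the k pixels with the highest score.

    pixels : list of spectra (each a list of band values)
    score  : function mapping a spectrum to a number; large = near the edge
    k      : how many extreme pixels to average (hypothetical parameter)
    """
    ranked = sorted(pixels, key=score, reverse=True)[:k]
    n_bands = len(ranked[0])
    return [sum(p[b] for p in ranked) / len(ranked) for b in range(n_bands)]


# Hypothetical (red, near-infrared) reflectances for six pixels.
pixels = [
    [0.20, 0.25], [0.15, 0.30], [0.10, 0.38],
    [0.06, 0.45], [0.05, 0.50], [0.04, 0.52],
]


def veg_score(p):
    # Assumption: vegetation lies toward high NIR and low red.
    return p[1] - p[0]


minimal = edge_endmember(pixels, veg_score, k=1)   # extreme pixel only
maximal = edge_endmember(pixels, veg_score, k=3)   # broader edge average
print(minimal, maximal)
```

In this toy data the broader average pulls the vegetation endmember toward the interior of the distribution, so its near-infrared reflectance is lower than that of the single extreme pixel.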

**3. Ways to Determine Image Endmembers**

The endmember spectra, which make up the columns of the matrix *E* used throughout this paper, must be known before the decomposition process can take place. As mentioned above, the endmembers can be selected either from spectral libraries or from the image itself. Several methods exist for extracting endmembers from the image, the best known and most commonly used being *Principal Component Analysis (PCA)*. These two approaches are explained in detail below, after which the choice of method is discussed.

**3.1. Spectral Libraries**

One possibility for determining the image endmembers is to extract them from a spectral library. Such a library, whose use is usually related to geological applications [4], contains endmember spectra that have been measured either in a laboratory or in the field. For applications in the agricultural domain, however, these libraries are less suitable because they need to account for all processes and factors influencing the spectra. For example, several spectra corresponding to the same green vegetation type over different backgrounds may have to be included, since multiple scattering between leaves and a bright soil background increases the near-infrared reflectance of the leaves [7]. Additional spectra may also have to be included to cover all stages of a crop’s growth or to compensate for the effects of other biological processes, such as fluorescence in response to stress. Thus, unless extensive field work is done at the time of image acquisition, access to a very large library is required. After the calibration coefficients have been determined by field work, the image is usually converted to match the library. Conversely, however, the reference endmember spectra in the library could be transformed to match the current image.

In recent years, spectral libraries have been used primarily to identify the composition of image endmembers [2]. First, the image endmembers themselves are determined using a method such as principal component analysis (see Section 3.2). Since the spectra found need not correspond to pure materials in the scene, their identity must subsequently be inferred from reference spectra of known materials. Smith et al. [8] express each image endmember as a linear combination of reference endmember spectra, just as mixed pixels are expressed as linear combinations of image endmembers. In their two-step method, a function incorporating the calibration described earlier and the aforementioned alignment (unmixing) is repeatedly evaluated for different candidate groups of reference spectra until a suitable representation of the image endmembers determined in the first step is found.

The direct spectral library method was not used here because, as mentioned above, no appropriate spectral library exists for the agricultural fields imaged in this study. Instead, the technique of Smith et al. [8] was put into practice, and the Results section is based on this implementation.

**3.2. Principal Component Analysis**

Traditionally, the objective of Principal Component Analysis (PCA) is to reduce the dimensionality of a data set while retaining as much of the relevant information as possible. This goal can be achieved by rotating the coordinate system such that most of the variation in the data is found along a limited number of axes, the so-called principal components. The axes along which the data show little or no variation are disregarded, which corresponds to restricting the original feature space to a smaller linear subspace. As was described earlier in Equation (1), linear mixtures of *c* endmembers lie in the subspace spanned by the *c* endmember spectra.

Smith et al. [8] developed a method based on PCA to determine endmembers from remote sensing data. In summary, each pixel *x* is represented using a new coordinate system,

$$x = \sum_{i=1}^{n} z_i u_i$$

where the *ui* form a set of orthonormal basis vectors spanning the original feature space. Next, the dimensionality of the data is reduced by replacing several of the *zi* by values that are identical for each pixel (the corresponding *ui* are effectively disregarded), and the sum of squared errors between the original pixels and their simplified counterparts is calculated. It can be derived that this error is minimal if the basis vectors satisfy

$$N u_i = \lambda_i u_i$$

where *N* denotes the variance-covariance matrix of the data after subtraction of their mean vector and λi is the eigenvalue belonging to *ui*. In other words, the eigenvectors of *N* define the coordinate system in which most of the variation of the data is found along a minimum number of axes.
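The role of the variance-covariance matrix and its eigenvectors can be illustrated with a small sketch. It builds the covariance matrix of mean-centred pixels and approximates the leading eigenpair by power iteration; power iteration is a stand-in chosen here for brevity, not the procedure used by Smith et al. or by ENVI, and the function names and pixel values are hypothetical.

```python
def covariance_matrix(pixels):
    """Variance-covariance matrix of the mean-centred pixel spectra."""
    m, n = len(pixels), len(pixels[0])
    mean = [sum(p[b] for p in pixels) / m for b in range(n)]
    centred = [[p[b] - mean[b] for b in range(n)] for p in pixels]
    return [[sum(row[i] * row[j] for row in centred) / (m - 1)
             for j in range(n)] for i in range(n)]


def leading_eigenpair(N, iterations=200):
    """Power iteration: approximate the largest eigenvalue and eigenvector.

    Assumes N is non-zero and the start vector is not orthogonal
    to the leading eigenvector.
    """
    n = len(N)
    u = [1.0] * n
    for _ in range(iterations):
        v = [sum(N[i][j] * u[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in v) ** 0.5
        u = [x / norm for x in v]
    Nu = [sum(N[i][j] * u[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(Nu[i] * u[i] for i in range(n))  # Rayleigh quotient
    return eigenvalue, u


# Hypothetical two-band pixels lying exactly along the direction (1, 2).
pixels = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
N = covariance_matrix(pixels)
lam, u1 = leading_eigenpair(N)
print(lam, u1)
```

Because these toy pixels vary along a single direction, all of the variance falls on the first principal component and the second eigenvalue is zero, which is the situation in which one axis can be discarded without loss.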

**Figure 1. The PCA approach to determine three endmember spectra. (a) Determination of the principal components u1 and u2, which span the plane containing most of the pixels. (b) Selection of the endmember spectra at the vertices of a triangle that coincides with the triangular shape of the cloud of pixels.**

The eigenvalue λi represents the variance of the data on the axis ui and is commonly regarded as either a primary or a secondary eigenvalue. Whereas the primary eigenvalues are higher in magnitude and account for the variation due to spectral mixing, the sum of the secondary eigenvalues should be equivalent to the variance arising from the instrumentation. In many cases, the primary eigenvalues can be clearly separated from the secondary ones, which provides a criterion to test the dimensionality of the data and consequently also the number of endmembers. After discarding the information associated with the secondary eigenvalues, the linear subspace spanned by the endmembers is found.

According to the theory of convex geometry, this means that the ei must be chosen such that the simplex they define contains all elements in the data set. If mixtures of only two classes are considered, then the minimum and maximum values of the first principal component can be taken as the endmember spectra e1 and e2. For mixtures of three classes, a scatterplot of the first against the second principal component can be made. As shown in Figure 1, this scatterplot can be used to determine the smallest triangle (a simplex with three vertices) containing all pixels in the image; the vertices of this triangle define the three endmember spectra [2].
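For the two-class case described above, the selection can be sketched as follows: each pixel is projected onto the first principal component, and the pixels with the smallest and largest projections are taken as the two endmembers. The function name, the pixel values and the assumed unit vector `u1` are hypothetical; in practice `u1` would come from a PCA of the image.

```python
def endmembers_from_first_pc(pixels, u1):
    """Return the pixels with the smallest and largest projection onto u1."""
    m, n = len(pixels), len(u1)
    mean = [sum(p[b] for p in pixels) / m for b in range(n)]
    # Score of each pixel on the first principal component.
    scores = [sum((p[b] - mean[b]) * u1[b] for b in range(n)) for p in pixels]
    e1 = pixels[scores.index(min(scores))]
    e2 = pixels[scores.index(max(scores))]
    return e1, e2


# Hypothetical two-band pixels lying along the direction (-1, 2).
pixels = [[0.30, 0.10], [0.10, 0.50], [0.20, 0.30], [0.25, 0.20]]
u1 = [-0.4472135955, 0.894427191]   # assumed first principal component
e1, e2 = endmembers_from_first_pc(pixels, u1)
print(e1, e2)
```

Taking single extreme pixels makes the result sensitive to noise and outliers, which is one reason averaging several edge pixels, as discussed in this paper, may be preferred.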

When dealing with mixtures of more than three classes, several approaches are possible. Bryant used a scatterplot of the first two principal components and found that the data had a pentagonal shape. By selecting one pixel near each of the five vertices, five endmember spectra were defined. This approach may fail, however, if for example one of the endmembers is hidden by the other four because its distinguishing features lie along the other three principal components. Bateson proposed a visualization technique based on parallel coordinates that allows the set of endmembers to be extended one spectrum at a time. Although this method solves the problem of Bryant’s approach, it requires a lot of human effort to judge the acceptability of the endmembers. Furthermore, it is quite possible that different users will come up with different endmember spectra [2].

**4. Selection of Endmembers from the Component Space**

Two different methods for selecting the endmembers from the edges of the component space described above were employed in this study. The first method sought to maximally include pixels from the edges of the two-component distribution; that is, it selected and averaged the reflectance of a larger number of pixels occurring at the edge of the distribution. The second method included only the extreme values of the edge of the distribution; it is termed minimally inclusive and averages fewer pixels [9]. An examination of the reflectance plots for the two ways of selecting the endmembers showed that the vegetation endmember is particularly sensitive to the way of selection. In Band 5, the reflectance of the maximally inclusive vegetation endmember is about 30%, whereas that of the minimally inclusive selection in the same band is 39%. The other endmembers do not show any particular sensitivity in any of the bands; the approximate upward shift in the minimally inclusive selection is on the order of 0.1 to 0.2. As both plots show, the vegetation endmember is spectrally distinct from the other endmembers, which is reassuring for this study since the main concern is to estimate vegetation fractions.

**5. Linear Spectral Unmixing to Achieve Better Classification Accuracy**

More accurate classification results can be attained by taking into account the subpixel spectral contributions recorded by the sensor. Although sensor images are recorded pixel by pixel, and the combination of objects within a pixel is not known in advance, the pixel information can be decomposed using the spectral capability of the images. The higher the spectral resolution of a sensor, the better the subpixel analysis that is possible, and hence the better the classification results. The pixel decomposition approach used in this paper is based on linear mixture modelling, which was briefly described in Section 2. The mathematical formulation of Linear Spectral Unmixing for a single band is [4]:

$$\mathit{DN} = \sum_{i=1}^{n} E_i \, \mathit{frac}_i + \varepsilon$$

Here, *DN* is the digital number of the pixel in one of the spectral bands, *Ei* is the reflectance of component *i* in this band, and *fraci* is the fraction of that component in the mixed pixel. Writing this equation for all bands gives the matrix form [4]:

$$\mathit{DN} = E \, \mathit{frac} + \varepsilon$$

Here *E* is an *m×n* matrix of endmembers, where *n* is the number of endmembers and *m* is the number of spectral bands; *DN* is an *m×1* vector and *frac* an *n×1* vector. The error vector ε holds the residual for each band.

If the number of endmembers exceeds the number of spectral bands, an infinite number of solutions is possible and the problem becomes ill-posed. When there are more bands than endmembers, the system is overdetermined, and an error term is included in the equations that also accounts for errors in the measurements. This error is represented by the ε component of the equation, and the total error is minimized by least squares adjustment to obtain the equation of best fit. The solution is [4]:

$$\mathit{frac} = (E^{T} E)^{-1} E^{T} \, \mathit{DN}$$
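The least squares step can be sketched in a few lines. The example below solves the normal equations (EᵀE) frac = Eᵀ DN with a small Gauss-Jordan routine; it is an unconstrained illustration under assumed data, not the ENVI implementation, and the function names and reflectance values are hypothetical.

```python
def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination
    with partial pivoting (A is assumed to be non-singular)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def unmix(E, dn):
    """Unconstrained least-squares fractions: solve (E^T E) frac = E^T dn.

    E  : m x n endmember matrix (m bands, n endmembers, m >= n)
    dn : m-element pixel vector
    """
    m, n = len(E), len(E[0])
    EtE = [[sum(E[b][i] * E[b][j] for b in range(m)) for j in range(n)]
           for i in range(n)]
    Etd = [sum(E[b][i] * dn[b] for b in range(m)) for i in range(n)]
    return solve(EtE, Etd)


# Hypothetical 3-band endmembers (columns: vegetation, soil) and a pixel
# that is an exact 25% / 75% mixture of them.
E = [[0.05, 0.20], [0.04, 0.25], [0.50, 0.30]]
dn = [0.1625, 0.1975, 0.35]
frac = unmix(E, dn)
print([round(f, 4) for f in frac])
```

Because nothing in this solution forces the fractions to be non-negative or to sum to one, poorly chosen endmembers can produce negative or superpositive fractions, which is the effect examined in the Results section.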

**6. Results**

The theoretical processes discussed above were implemented using specialized modules in ENVI 4.2, and sample results are presented as figures and tables.

The subset of the scene was inverted with the methodology discussed above for both ways of selecting the endmembers. Both inversions produced spatially similar results. The fraction maps of Linear Spectral Unmixing are presented in Figure 2.

**Figure 2. Four fraction maps resulting from the Linear Spectral Unmixing**

The high correlation of the two vegetation estimates is shown in the scatterplot of the two vegetation fraction images. The correlation coefficient between the two vegetation fraction images was calculated to be 0.99.

**Figure 2. The plot on the left compares the vegetation fractions resulting from the two ways of selecting endmembers. Minimally inclusive is on the x-axis. Correlation coefficient is 0.99.**
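The correlation coefficient between two fraction images can be computed pixel by pixel, as in the following sketch of Pearson's r. The function name and the two short lists of fractions are hypothetical stand-ins for the flattened fraction images; they are not values from this study.

```python
def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical vegetation fractions of the same four pixels
# from the two inversions.
minimal_frac = [0.10, 0.30, 0.50, 0.70]
maximal_frac = [0.12, 0.33, 0.52, 0.75]
r = pearson_r(minimal_frac, maximal_frac)
print(round(r, 4))
```

A coefficient close to one indicates that the two inversions rank pixels almost identically, even if their absolute fraction values differ by an offset or scale.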

The error of the estimate for both inversions is consistently below 0.02 (95%). As the results show, the mean error is very similar for the two inversions. The minimally inclusive inversion has a somewhat higher mean error and a higher standard deviation (Table 1).

| RMS Statistics | Minimum | Maximum | Mean | Standard Deviation |
|---|---|---|---|---|
| Maximally inclusive selection | 0.000035 | 0.017201 | 0.002255 | 0.001261 |
| Minimally inclusive selection | 0.000047 | 0.018267 | 0.002676 | 0.001459 |

Table 1. Root Mean Square statistical results
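A per-pixel RMS error of this kind can be sketched as the root mean square, over the bands, of the difference between the observed pixel and the spectrum reconstructed from its fractions. Averaging over the number of bands is an assumption about the definition; the function name and the values are hypothetical.

```python
def rms_error(E, frac, dn):
    """Root mean square residual between an observed pixel and its
    reconstruction E * frac, averaged over the bands."""
    m = len(dn)
    residual = [dn[b] - sum(E[b][i] * frac[i] for i in range(len(frac)))
                for b in range(m)]
    return (sum(r * r for r in residual) / m) ** 0.5


# Hypothetical 3-band endmembers (columns: vegetation, soil).
E = [[0.05, 0.20], [0.04, 0.25], [0.50, 0.30]]
frac = [0.25, 0.75]
# Observed pixel that differs from the exact mixture by 0.01 in each band.
dn = [0.1725, 0.1875, 0.36]
rms = rms_error(E, frac, dn)
print(round(rms, 4))
```

Computing this value for every pixel yields an error image whose minimum, maximum, mean and standard deviation can then be summarized as in Table 1.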

The errors indicate that unmixing with the four selected components is feasible and promising. Without a ground-truth assessment of the results, however, the true quality of the inversion cannot be established. The validity of the inversion is nevertheless supported by a reasonable correlation of the unmixing results with the Tasseled Cap Greenness and the Normalized Difference Vegetation Index [9]. In addition, the results indicate that the error consistently decreases with increasing vegetation fraction. A careful examination of the results, however, reveals a difference between the two inversions for the vegetation fractions.

The two inversions differ by 0.01 in their means and 0.03 in their standard deviations, with the minimally inclusive inversion having the lower mean and standard deviation. It also has a higher minimum and a lower maximum. The latter shows that the minimally inclusive selection produces results that better constrain the vegetation fractions to a unity range (Table 2 / Figure 3).

| Vegetation Fractions | Minimum | Maximum | Mean | Standard Deviation |
|---|---|---|---|---|
| Maximally inclusive selection | -0.352281 | 1.321318 | 0.330505 | 0.1613333 |
| Minimally inclusive selection | -0.275903 | 1.516285 | 0.305255 | 0.1392121 |

Table 2. Vegetation Fractions statistical results

**Figure 3. Comparison of histograms of the two inversions. The maximally inclusive endmember selection is shown in green color.**

The maximally inclusive vegetation fractions show the highest number of negative-valued cells: approximately 2.39% of the corresponding unmixed image has negative values, compared to 0.93% in the minimally inclusive unmixed image.
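The share of physically implausible fractions can be counted directly from a fraction image, as in the following sketch. The function name and the eight sample fractions are hypothetical and serve only to illustrate the calculation.

```python
def out_of_range_share(fractions):
    """Return the percentage of negative and of superpositive (> 1) fractions."""
    n = len(fractions)
    negative = sum(1 for f in fractions if f < 0.0)
    superpositive = sum(1 for f in fractions if f > 1.0)
    return 100.0 * negative / n, 100.0 * superpositive / n


# Hypothetical vegetation fractions for eight pixels of an unmixed image.
fractions = [-0.05, 0.10, 0.35, 0.60, 0.98, 1.20, 0.45, 0.30]
neg_pct, super_pct = out_of_range_share(fractions)
print(neg_pct, super_pct)
```

Comparing these percentages between the two inversions gives a simple measure of how well each endmember selection confines the fractions to the range from zero to one.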

A re-examination of the fraction values in the maximally inclusive unmixed image, for the cells that were included in the minimally inclusive selection, shows a consistent overestimation for those cells. The mean fraction value of those pixels is higher than one.

A reconstruction of the reflectance spectrum from the unmixing results, using the vegetation endmember of the minimally inclusive selection, showed the true and the reconstructed reflectance to be almost identical. The minimally inclusive selection model seems to honor the endmember regions almost perfectly.

An additional assessment of the minimally inclusive endmember selection model was attempted. In this case, pixels with a high error component were selected and averaged from the unmixed images. Their true reflectance values were then compared to the spectra reconstructed from the fraction images. This was done to gain insight into what is causing the error. The results are shown in Figure 4:

**Figure 4. Comparison of observed and reconstructed Spectra in Bands 1, 2, 3, 4, 5, and 7 for high error areas. The actual observed spectrum is shown in black. Most of the error is concentrated in Bands 2, 5, and 6.**

Figure 4 indicates that most of the error misfit is in Bands 2, 5 and 6. Band 2 contributes most to the error misfit. It seems encouraging that the reconstructed spectra replicate the observed values of Band 4 and Band 3 well because most of the vegetation information is conveyed in those two bands.

**7. Conclusion**

A linear spectral unmixing approach appears to be tractable in areas with highlights and shadows. The underlying geology and vegetation can be adequately discriminated and accounted for in terms of their percentages. Vegetation shows high sensitivity to the way its pure spectra are selected, with the minimally inclusive selection giving the best results. The results need to be validated with ground truth sites. Modeling the shadowed areas as a mixing endmember seems to provide good results, although there is no guarantee that shadowed areas are not vegetated. As a result, some of the vegetation content may be lost by this approach.

**8. References**

1. ENVI 4.0 Help System, (2003), Research Systems Incorporated
2. Klein Gebbinck, M.S., (1998) Decomposition of mixed pixels in remote sensing images to improve the area estimation of agricultural fields, PhD Thesis, University of Reading
3. Ghafouri, A., (2005) Accuracy Assessment of Sub-Pixel Classification Results, Map India 2006
4. Ghafouri, A., Mobasheri, M.R., (2006) Mixed pixels classification on multispectral & hyperspectral images for accuracy improvement of classification results, ISPRS Mid-term Symposium 2006 / Commission VII, WG III/7
5. Green, A.A., Berman, M., Switzer, P., and Craig, M.D., (1988) A transformation for ordering multispectral data in terms of image quality with implications for noise removal, IEEE Transactions on Geoscience and Remote Sensing, 26, 1, 65-74
6. Hadjiioannou, L., (1998) The phenomenon of desertification in Cyprus, Proceedings of the Seminar for briefing on the Convention of the United Nations for Combating Desertification, Ministry of Agriculture and Natural Resources, Nicosia, Cyprus
7. Roberts, A., R., Batista, T., G., Pereira, L., G., J., Waller, K., E., and Nelson, W., B., (1998) Change Identification Using Multitemporal Spectral Mixture Analysis, Remote Sensing Change Detection Environmental Monitoring Methods and Applications, Ann Arbor Press: Michigan
8. Smith, O., M., Ustin, L., S., Adams, B., J., and Gillespie, R., A., (1990) Vegetation in Deserts: I. A
9. Theophilides, C., (2000) Assessing the Linear Spectral Unmixing Approach in a variable topography environment