
Spatial Regression Method, A New Way of Valuation

Oliver Valentine Eboy
Universiti Malaysia Sabah, Malaysia

Universiti Sains Malaysia, Malaysia

Property tax is one of the main sources of revenue for the Government. By law, however, property ratings need to be revalued every five years to reflect current market values. Find out how the Spatial Regression Modelling (SRM) method was adapted as an alternative to the Ordinary Least Squares (OLS) method in Kota Kinabalu, Sabah, Malaysia.

Alongside parking fees and licence fees, property tax is one of the most important sources of revenue for local government. The rating, or assessment, of property funds services such as the development and maintenance of public amenities and the cleaning of residential, commercial and industrial areas. However, the assessment needs to be updated from time to time to keep up with current market values. To do this, a revaluation of the rates needs to be conducted every five years, in accordance with the Local Government Act 1976. Unfortunately, revaluation has normally been carried out only after 10 or 20 years (Dzurllkanian Daud et al., 2008). Table 1 below shows the pending revaluation exercises by local governments in Malaysia: only 16 local authorities had performed a revaluation within 1 to 5 years after the end of the last revaluation period, while 29 others conducted the revaluation after 6 years or more.

Table 1: The Pending Revaluation Exercise by Local Governments in Malaysia.

Pending Revaluation (After Last Revaluation) / Percentage (%)

1-5 years after 5-year end of last revaluation
6-10 years after 5-year end of last revaluation
10-15 years after 5-year end of last revaluation
More than 15 years after 5-year end of last revaluation

Source: Dzurllkanian Daud et al. (2012)

Local governments probably delayed the revaluation exercise because manual property valuation is time-consuming and costly (Tretton, 2007; Mustafa Omar, 2004). Consequently, the rating values of properties generally lag behind current market values. Although computers have been used in most local governments to produce property rating maps and to run daily administrative operations such as tax collection, they are not used for analyzing or calculating property ratings.

Property valuation models were then introduced to overcome this problem. Such a model can value a large number of properties for taxation purposes in a very short time, enabling the authority to carry out a faster and cheaper revaluation with accurately predicted property values. The technique also provides uniformity and consistency in ad valorem valuations, particularly when a large number of parcels is revalued at the same time (Deddis, 2002). Such an approach could therefore help local authorities to speed up the revaluation process and reduce its cost.

Unfortunately, this approach has yet to be put into practice in Malaysia, where it is still being developed and tested at the academic level, even though it has been adopted in various countries and regions such as the United Kingdom, Australia, the U.S., Africa, New Zealand and Europe (Dzurllkanian et al., 2006:2). New methods and new studies are therefore needed to convince local authorities in Malaysia to adopt this approach.

This paper examines the capability of the Spatial Regression Model in developing a property value model for tax purposes within a local authority's jurisdiction. The study focuses on residential properties, excluding apartments and condominiums.

Property Valuation Model

In normal practice, five methods of valuation are consistently used, namely the comparable method, cost method, residual method, investment method and income method (Scarrett, 2008; Richmond, 1985; Ismail Omar, 1992; Appraisal Institute, 1992). There is, however, another valuation method that has gained momentum in this age of technology: the regression method (Brown, 1974; Gloudemans and Miller, 1978; Mark and Goldberg, 1988; Cannaday, 1989; Ismail Omar, 1992).

The traditional regression technique, also known as Ordinary Least Squares (OLS), is a statistical technique that can model, examine and explore spatial relationships, help explain the factors behind observed spatial patterns, and predict outcomes based on that understanding. It is suitable for valuers carrying out a revaluation exercise to estimate property ratings for many properties over a large area (Dzurllkanian Daud et al., 2008). This approach, however, is spatially limited, cannot be mapped and is not well suited to analysis within GIS (Fotheringham et al., 2002:6). Most importantly, it produces model error, specifically spatial autocorrelation error (Wilhelmsson, 2002; Suriatini Ismail, 2005; Kulczycki and Ligas, 2007), which can result in misspecification or incorrect property value estimates in the model (Suriatini Ismail, 2005; Orford, 1999).

Therefore, another modelling method based on the regression technique, namely Spatial Regression Modelling (SRM), is used specifically to address the spatial autocorrelation error (Suriatini Ismail, 2005; Löchl and Axhausen, 2010). It handles spatial autocorrelation in two different forms, namely the spatial error model and the spatial lag model, and the Lagrange Multiplier (LM) test is used to identify which form is present (Wilhelmsson, 2002).

A spatial lag model, or mixed regressive, spatial autoregressive model, is appropriate when the focus of interest is the assessment of the existence and strength of spatial interaction. In this model, the property value is estimated partly from nearby or neighboring observations of other property values. The model assumes that the value of each property is affected by the property values in its neighborhood in the form of a spatially weighted average (Suriatini Ismail, 2005). This is in addition to the other variables, representing the property and neighbourhood characteristics, that have an indirect effect on the property value.

The spatial error model is used when the spatial autocorrelation arises from the error term of the model. The spatial error model can therefore correct the potential bias caused by spatial autocorrelation in spatial data. It helps to find the most suitable coefficient estimates for the model and ensures that the correct inference is made. It is, however, not appropriate for a model that indicates no spatial interaction (Suriatini Ismail, 2005). SRM has provided good estimates in several property value modelling studies (Suriatini Ismail, 2005; Löchl and Axhausen, 2010) and can potentially eliminate this model error.

Study Area

The study area comprises properties within the jurisdiction of Dewan Bandaraya Kota Kinabalu (DBKK), or Kota Kinabalu City Hall, the city council that administers the city and district of Kota Kinabalu in the state of Sabah, Malaysia. Figure 1 shows the location of Kota Kinabalu in Sabah. The DBKK jurisdiction covers a large area made up of many zones. However, owing to data constraints, only selected zones in the city and urban area were used for modelling: Kota Kinabalu, Luyang, Luyang Timur, Teluk Likas, Sembulan, Tanjung Aru, Damai, Kolam, Ridge, Kepayan, Dah Yeh and Signal Hill.

Figure 1: Location of Kota Kinabalu in Sabah

DBKK was chosen as the study area for three reasons. First, among Malaysian states, Sabah recorded the highest property price index from 2011 to 2012 (JPPH, 2012). Second, Kota Kinabalu district, which is under the jurisdiction of DBKK, saw a rapid increase in property tax collection from 1998 to 2010 (DBKK, 2011). Finally, data availability is essential for completing a study of this kind. Data for property valuation, especially for modelling, is often difficult to obtain (Suriatini Ismail, 2005), but data for the Kota Kinabalu area is obtainable at a reliable level of accuracy.

Methodology and Data

A modeling framework was outlined for this study to produce the property rating valuation model, as shown in Figure 2 below. The first stage involved acquiring the property values together with their contributing factors and the spatial elements needed for valuation. The non-spatial data comprised attributes covering the physical building, geographical aspects, neighborhood, external facilities and legality. The spatial data consisted of the location factor, derived using GIS by measuring the distance from each property to the nearest location factor such as a bank, tourist attraction, market or school. The selected data were then verified, cleaned and converted to prepare a database suitable for analysis. In the second stage, the data were analyzed using the OLS or SRM method. The SRM analysis was conducted only if the OLS results indicated spatial autocorrelation error. The resulting model was then assessed to obtain a property rating valuation model suitable for the residential properties in the study area.
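The location factor described above is a nearest-facility distance. The article derives it in GIS; the sketch below only illustrates the idea in plain Python, assuming projected coordinates in metres and straight-line distance. The function name `nearest_distance` and all coordinates are hypothetical and are not taken from the DBKK data.

```python
import math

def nearest_distance(prop, facilities):
    """Straight-line distance from one property to its nearest facility."""
    px, py = prop
    return min(math.hypot(px - fx, py - fy) for fx, fy in facilities)

# Hypothetical projected coordinates in metres (illustrative only).
properties = [(0.0, 0.0), (300.0, 400.0)]
facilities = [(30.0, 40.0), (1000.0, 1000.0)]  # e.g. a bank and a school

# One location-factor value per property.
loc_fac = [nearest_distance(p, facilities) for p in properties]
print(loc_fac)
```

A GIS package would normally use network or geodesic distance and a spatial index; the brute-force minimum here is only practical for small samples.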

Figure 2: Property Rating Valuation Modeling Framework

Before the assessment can start, a spatial autocorrelation test needs to be conducted in the OLS and SRM analysis stage, as shown in Figure 3 below. The test is run once the OLS analysis has been processed. If the OLS output indicates that spatial autocorrelation is not present in the model, the OLS model proceeds to the assessment stage. If spatial autocorrelation exists, however, the SRM analysis needs to be applied. The SRM falls into two models, namely the spatial error model and the spatial lag model. Using the Lagrange Multiplier (LM) test, the model that attains a significant result, or the higher test value if both are significant, is selected as the property rating model for this study.
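The selection procedure described above can be sketched as a small decision function. This is a minimal sketch rather than the study's implementation: the function name `choose_model`, the 0.05 significance level and the fallback to robust LM statistics when both plain LM tests are significant are assumptions, and the numbers in the example call are made up.

```python
def choose_model(autocorr_p, lm_error, lm_lag, robust_error, robust_lag, alpha=0.05):
    """Pick a model from spatial diagnostics.

    autocorr_p is the p-value of the spatial autocorrelation test on the
    OLS residuals. Each LM argument is a (statistic, p_value) tuple.
    """
    if autocorr_p >= alpha:
        return "OLS"  # no significant spatial autocorrelation

    error_sig = lm_error[1] < alpha
    lag_sig = lm_lag[1] < alpha
    if error_sig and not lag_sig:
        return "spatial error"
    if lag_sig and not error_sig:
        return "spatial lag"
    if not error_sig and not lag_sig:
        return "OLS"

    # Both plain LM tests are significant: consult the robust forms.
    robust_error_sig = robust_error[1] < alpha
    robust_lag_sig = robust_lag[1] < alpha
    if robust_error_sig and not robust_lag_sig:
        return "spatial error"
    if robust_lag_sig and not robust_error_sig:
        return "spatial lag"
    # Still tied: the larger robust statistic prevails.
    return "spatial error" if robust_error[0] > robust_lag[0] else "spatial lag"

# Hypothetical diagnostics (statistic, p-value); not results from the study.
choice = choose_model(0.001, (50.0, 0.0), (40.0, 0.0), (30.0, 0.0), (5.0, 0.02))
print(choice)
```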

Figure 3: The OLS and SRM Analysis Process

Data for this study was obtained from the 1997 DBKK revaluation exercise, whose values are still valid and currently applied in DBKK (DBKK, 2012). Originally, the study collected 14,520 observations for the whole DBKK area by selecting residential property valuation data, excluding apartments, flats and condominiums, within the urban area. After cleaning and the removal of missing or incomplete data, only 5,524 records were retained for the analysis.

Model Development

At the start of the analysis, the first model was developed using Ordinary Least Squares (OLS). This model can be written as equation (1) below (Charlton and Fotheringham, 2009):

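$$y_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik} + \varepsilon_i \quad \text{for } i = 1 \dots n \qquad (1)$$

where k is the number of independent variables and

y – the vector of observed values,

β – the vector of estimated parameters,

x – the design matrix which contains the values of the independent variables,

ε – the model error value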

In this study, the dependent variable y was the property rating value, and the independent variables x were chosen from the database developed. The model error ε was obtained after the OLS model was estimated.
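As an illustration of how an OLS model of this kind is estimated, the sketch below fits a least-squares regression with NumPy on a few made-up records. The data, the choice of two predictors and the variable names are assumptions for illustration only; the study's own software and dataset are not reproduced here.

```python
import numpy as np

# Hypothetical records: floor area and land area in square feet.
features = np.array([
    [1000.0, 2000.0],
    [1200.0, 2500.0],
    [1500.0, 2600.0],
    [1800.0, 4000.0],
    [2000.0, 4200.0],
])
# Hypothetical rating values in RM.
y = np.array([175.0, 192.0, 228.0, 266.0, 293.0])

# Design matrix with a leading column of ones for the intercept.
X = np.column_stack([np.ones(len(features)), features])

# Least-squares estimate of the parameter vector.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Model error: observed minus fitted values.
residuals = y - X @ beta_hat
print(beta_hat, residuals)
```

The residuals are the quantity that is later checked for spatial autocorrelation.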

If spatial autocorrelation exists and cannot be eliminated from the model, the SRM is then applied. Two types of SRM can be produced, namely the spatial lag model and the spatial error model. A spatial lag model can be expressed as equation (2) (Anselin, 2001:316):

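$$y = \rho W y + x\beta + \varepsilon \qquad (2)$$

where

y = dependent variable

ρ = spatial coefficient

Wy = spatially lagged dependent variable (the weight matrix W applied to y)

x = matrix of observations on the independent variables

β = vector of regression coefficients

ε = vector of error terms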
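To make the spatially lagged term Wy concrete, the sketch below builds a row-standardised inverse-distance weight matrix for four made-up properties and computes the spatial lag of their values. It then generates values that satisfy the usual spatial lag form y = ρWy + xβ (error term omitted). The inverse-distance weighting, the coordinates, ρ = 0.5 and the coefficients are all assumptions for illustration; the article does not state which weight matrix was used.

```python
import numpy as np

# Hypothetical property coordinates in metres.
coords = np.array([[0.0, 0.0], [0.0, 100.0], [100.0, 0.0], [500.0, 500.0]])
n = len(coords)

# Pairwise straight-line distances.
diff = coords[:, None, :] - coords[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=2))

# Inverse-distance weights; adding the identity avoids dividing by zero
# on the diagonal, which is then reset to zero (no self-neighbours).
W = 1.0 / (dist + np.eye(n))
np.fill_diagonal(W, 0.0)
W = W / W.sum(axis=1, keepdims=True)  # row-standardise

# Spatial lag: each entry is a weighted average of the other properties' values.
y = np.array([1000.0, 1200.0, 1100.0, 3000.0])
Wy = W @ y

# Values consistent with y = rho*W*y + X*beta, solved in reduced form.
rho = 0.5
X = np.column_stack([np.ones(n), [1200.0, 1500.0, 1300.0, 2800.0]])
beta = np.array([100.0, 0.4])
y_lag = np.linalg.solve(np.eye(n) - rho * W, X @ beta)
print(Wy, y_lag)
```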

A spatial error model can be written as equation (3) (Lehner, 2011:5):

$$y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + u, \qquad u = \lambda W u + \varepsilon \qquad (3)$$

where

y = vector of dependent variable

β0 = constant term

β1x1 … βkxk = independent variable components

u = vector of spatially correlated error

λ = spatial autoregressive coefficient

W = spatial weight matrix

ε = random error
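A short worked derivation may help show how the spatial error model removes autocorrelation from the error term. It assumes the usual form of the model, in which the error follows u = λWu + ε, writes the regression part compactly as Xβ (a notational assumption), and assumes that I − λW is invertible, which holds for example when W is row-standardised and |λ| < 1.

```latex
\begin{aligned}
u &= \lambda W u + \varepsilon \\
(I - \lambda W)\,u &= \varepsilon \\
u &= (I - \lambda W)^{-1}\varepsilon \\
y &= X\beta + (I - \lambda W)^{-1}\varepsilon \\
(I - \lambda W)\,y &= (I - \lambda W)\,X\beta + \varepsilon
\end{aligned}
```

The last line shows that, once both sides are filtered by (I − λW), the remaining error is the random term ε alone, which is why the coefficient estimates and their inference are no longer distorted by the spatially correlated part.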

The variables used to determine the property rating value are discussed in the following section.

Model Variable Selection

In this study, after data preprocessing and cleaning, five independent variables were used to estimate the dependent variable, as shown in Table 2.

Table 2: Final Selection of Variable for the Property Valuation Model

Dependent Variable

• Property Tax Value

Independent Variables

• RCA

• Land Area

• Building Type

• Building Quality

• Location Factor

The dependent variable is the property rating value imposed by DBKK on the property owner, measured in Ringgit Malaysia (RM). The five independent variables chosen for the model, also called property value influence factors, were RCA, land area, building type, building quality and location factor. The Reduced Coverage Area (RCA) represents the main floor area of the property, recalculated to be better suited for valuation purposes, while the land area refers to the size of the land parcel. Both variables were measured in square feet. The location factor was obtained from the GIS analysis and measured in metres from the property to the nearest location factor, which consisted of public institutions, tourism centers, public recreation areas, public facilities, commercial areas, government offices and religious centers. Building type represents the type of residential property (semi-detached, terrace, town house or kampung house) and was measured on a nominal scale. Building quality describes the condition of the building (bad, average, good or excellent) and was measured on an ordinal scale. The variables selected for the model are summarised in Table 3 below.

Table 3: Summary Description of the OLS Model Variables

Variable Name

Current Property Rating Value
Building type
Reduced Covered Area Estimation
Building Quality
Land area estimation
Distance from property to the nearest location factor

Once the variables had been selected and analyzed, the outputs from OLS and SRM were assessed and compared to determine which model best represents the DBKK area for property rating purposes.

Results and Discussion

To test for spatial autocorrelation formally, this study adopted the Moran’s I spatial statistic. The test distinguishes the two forms of spatial autocorrelation, positive and negative. The Moran’s I value of the OLS model indicates positive spatial autocorrelation (Z score = 258.234, p-value = 0.00), meaning that similar residuals cluster together. This suggests that the detected spatial autocorrelation most likely arises from missing variables for important property characteristics. Therefore, SRM analysis needs to be conducted.
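For readers unfamiliar with the statistic, the sketch below computes Moran's I for a handful of residuals using its standard definition, I = (n / S0) · Σ w_ij z_i z_j / Σ z_i², where z are deviations from the mean and S0 is the sum of all weights. The four-property chain of neighbours and the residual values are made up, and the significance test (the Z score reported in the article) is not reproduced.

```python
def morans_i(values, weights):
    """Moran's I for a list of values and a square weight matrix (list of lists)."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(zi * zi for zi in z)

# Hypothetical neighbours: four properties in a row, each adjacent to the next.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

# Similar residuals next to each other give a positive value.
clustered = morans_i([10.0, 8.0, -8.0, -10.0], chain)
# Alternating residuals give a negative value.
alternating = morans_i([10.0, -10.0, 10.0, -10.0], chain)
print(clustered, alternating)
```

A positive value, as in the clustered case, corresponds to the situation reported for the OLS residuals in this study.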

Table 4 shows that both LM (Error) and LM (Lag) were significant (p-value of 0.000). Because no decision can be made on this basis, the robust forms of the statistics need to be considered. However, both robust LM (Error) and robust LM (Lag) were also significant. When both robust LM statistics are significant, the model with the higher value prevails (Anselin, 2005). In this case, robust LM (Error) achieved a higher value of 1420.9258 compared with 31.3767 for robust LM (Lag). The spatial autocorrelation error detected suggests that some relevant variables are missing from the model. These might be variables that were removed because of missing records or because they produced multicollinearity. As a result, the SRM’s spatial error model was used in this study as the residential property rating valuation model for the entire Kota Kinabalu area.
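The plain LM statistics can be computed directly from OLS residuals. The sketch below uses the standard textbook forms of the LM-error and LM-lag statistics, each of which is compared with a chi-square distribution with one degree of freedom (5% critical value about 3.84). The robust variants used for the final decision are not shown. The grid of properties, the rook-contiguity weights and the simulated data with λ = 0.8 are all assumptions for illustration and do not reproduce the values in Table 4.

```python
import numpy as np

def rook_weights(rows, cols):
    """Row-standardised rook-contiguity weights for a rows x cols grid."""
    n = rows * cols
    W = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    W[i, rr * cols + cc] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def lm_tests(y, X, W):
    """Return (LM-error, LM-lag) computed from the OLS fit of y on X."""
    n = len(y)
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b                          # OLS residuals
    s2 = e @ e / n                         # ML estimate of the error variance
    T = np.trace(W.T @ W + W @ W)
    lm_error = (e @ W @ e / s2) ** 2 / T
    WXb = W @ X @ b
    M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)  # residual-maker matrix
    D = WXb @ M @ WXb / s2 + T
    lm_lag = (e @ W @ y / s2) ** 2 / D
    return lm_error, lm_lag

# Simulated data with spatially correlated errors (hypothetical parameters).
rng = np.random.default_rng(0)
W = rook_weights(10, 10)
n = W.shape[0]
X = np.column_stack([np.ones(n), rng.normal(size=n)])
u = np.linalg.solve(np.eye(n) - 0.8 * W, rng.normal(size=n))
y = X @ np.array([1.0, 2.0]) + u

lm_error, lm_lag = lm_tests(y, X, W)
print(lm_error, lm_lag)
```

In practice these diagnostics are usually taken from a spatial econometrics package rather than coded by hand; the sketch is only meant to show what the numbers in a table such as Table 4 measure.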

Table 4: Output from the LM spatial autocorrelation test of the study area

Lagrange Multiplier (Error)
Lagrange Multiplier (Lag)
Robust Lagrange Multiplier (Error)
Robust Lagrange Multiplier (Lag)

The model developed to estimate property rating was then assessed to measure its performance using the R-squared value: the higher the value, the better the fit of the model. In this study, the R-squared of 0.78 indicates that the SRM model explains approximately 78% of the variation in the dependent variable, which points to good estimation accuracy.
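The R-squared measure can be illustrated with a few lines of Python using its usual definition, one minus the ratio of the residual sum of squares to the total sum of squares. The observed and predicted values are made up. Note also, as an assumption about common practice rather than something stated in the article, that software for spatial error models often reports a pseudo R-squared that is not computed in exactly this way.

```python
def r_squared(observed, predicted):
    """Share of the variation in the observed values explained by the predictions."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical rating values in RM and the corresponding model estimates.
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

r2 = r_squared(observed, predicted)
print(round(r2, 3))
```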

To determine the strength and type of relationship between each independent variable and the property rating value, the coefficient of each independent variable was examined. Table 5 shows the coefficient of each independent variable, also called a property value influence factor. The coefficient reflects the expected change in the property rating value for every one-unit change in the influence factor. For example, the coefficient of 443.656 for building quality (BLDQ) may be interpreted as an increase of RM443.656 in the property rating value for each one-unit increase in building quality. This shows that BLDQ strongly increases the residential property value in the study. Another independent variable with a large positive effect is building type (BLD_TYPE), with a coefficient of 249.069. The other factors, RCA (RCA), land area (LAND AREA) and location factor (LOC_FAC), also had positive but smaller coefficients of 0.1095, 0.005 and 0.267 respectively. All the independent variables (BLDQ, BLD_TYPE, RCA, LAND AREA and LOC_FAC), as well as the intercept, were statistically significant at the 95% confidence level, which means that all the coefficients can be used to explain the model.
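The coefficient interpretation above can be turned into a small calculation. The sketch below uses the coefficients quoted in the text to estimate how the rating value would change for a given change in the influence factors, holding everything else constant. Because the intercept value is not given in the text, only differences are computed, not absolute rating values. The function name `rating_change` and the example scenario are hypothetical.

```python
# Coefficients as quoted in the text (RM per one-unit change).
COEFFICIENTS = {
    "BLDQ": 443.656,
    "BLD_TYPE": 249.069,
    "RCA": 0.1095,
    "LAND AREA": 0.005,
    "LOC_FAC": 0.267,
}

def rating_change(changes):
    """Expected change in rating value (RM) for the given changes per factor."""
    return sum(COEFFICIENTS[name] * delta for name, delta in changes.items())

# Hypothetical scenario: building quality improves by one level and the
# RCA grows by 100 square feet.
example = rating_change({"BLDQ": 1, "RCA": 100})
print(round(example, 3))
```

Since building type is a nominal variable, a one-unit change in BLD_TYPE depends on how the categories were coded, which the text does not specify.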

Table 5: Type of relationship of the property influence factor with the property rating value

Property Value Influence Factor | Coefficient (B) | Relationship With Property Rating Value
Intercept | negative | Moderate negative relationship
BLDQ | 443.656 | Strong positive relationship
BLD_TYPE | 249.069 | Strong positive relationship
RCA | 0.1095 | Weak positive relationship
LAND AREA | 0.005 | Weak positive relationship
LOC_FAC | 0.267 | Weak positive relationship

As Figure 4 below shows, the distribution of the property rating values estimated by the SRM’s spatial error model can be clearly visualized using a GIS tool. The map shows that parts of the Bukit Padang and Tanjung Aru zones (dark color) have the highest property values in the area. Based on the results in Table 5, the high values most probably reflect the strong influence of building quality and building type in those areas. The location factor could also contribute, as these zones are situated near attractive places such as hillside views, recreational parks and the beach. In contrast, large parts of the Ridge and Kepayan zones (light color) have the lowest values in the area. The SRM model cannot explain this, as none of the variables included in the model has a negative effect except the intercept. The negative intercept suggests that variables with a negative influence on value in the area are missing from the model.

Based on discussion with the DBKK authority, the spatial autocorrelation error in the DBKK valuation data was probably due to two factors. First, properties were recorded under the same category despite differences in the type of building structure. For example, the structure of a detached house can be temporary, semi-permanent or permanent, but the house was still categorised simply as a detached house. Inconsistency in recording the type of structure would have contributed to the error in the model. Second, some residential properties were used for commercial purposes, which distorts the model because a property with a large area may have a low value while one with a small area may have a high value. These residential properties were mainly used as play schools or as showrooms cum offices. Most of these houses were located along the main road or were clearly visible from it.

Figure 4: Property Rating Map using SRM Spatial Error Model for DBKK area


This paper presented an example of a property rating model for tax purposes developed using SRM. The model is capable of estimating property values on a large scale in the area. Using a sample of property attributes from DBKK, the model estimated the property values and displayed them in a value map using a GIS tool. The model also performs well, with an R-squared of 0.78, so it can serve as a rough reference or guideline for the authority when applying rating values in the area. The study also takes the spatial autocorrelation test into account and shows the relevance of using SRM as the property rating model. Although much remains to be done, especially to overcome the spatial autocorrelation problem in the DBKK data, this could be one of the early steps towards a property valuation model for DBKK. The study has therefore shown that a spatial regression model could be used to assist the local authority in determining the property rating values to be applied in the area. It would also be a major contribution to improving the revaluation exercise, so that accurate property ratings can be obtained while minimising cost, time and manpower.


The authors would like to acknowledge Universiti Malaysia Sabah for providing financial support to undertake this research and Kota Kinabalu City Hall (DBKK) for providing the data and information used in this study.