Mohamad Nor Said
Department of Geoinformatics
Faculty of Geoinformation Science & Engineering
Universiti Teknologi Malaysia
81310, UTM Skudai, Johor, Malaysia
Tel: +6075530720, Fax: +6075566163
Muhammad Zaly Shah
Department of Urban and Regional Planning,
Faculty of Built Environment
Universiti Teknologi Malaysia
81310, UTM Skudai, Johor, Malaysia
Tel: +6075530720, Fax: +6075566163
ABSTRACT
As a measure of accessibility, road density is an important indicator of urbanization. Highly accessible areas are those with a high road density. However, providing an adequate level of road density requires considerable planning as well as substantial financial resources. It is therefore critical to provide only the road density needed to sustain continued growth, and nothing more. Recognizing this, the Highway Planning Unit of the Malaysian Ministry of Works awarded a research contract to Universiti Teknologi Malaysia to study the relationship between transportation (road density) and land use, with the expectation that a generalized empirical model could serve as a guide in road planning. In accomplishing the modeling tasks, a Geographical Information System (GIS) was identified as the tool for preparing the land use datasets required to test and run the model, that is, for determining the significant parameters that contribute to road density. GIS is also used as a graphical display tool to show the variation of existing and forecasted road density across the study sites.
1.0 Introduction
One of the most productive and powerful innovations in Geographical Information Systems (GIS) has been the incorporation of modeling. This involves joining the GIS database to a computer-driven model of some process or procedure. The GIS can then combine pieces of data for every object, put them through the model process, and get back a new piece of information. This allows spatial data to be processed in mass quantities using powerful, complex formulas.
There are two applicable types of modeling, namely simulation and predictive models. Simulation modeling uses the GIS to simulate a complex natural phenomenon; it generally requires a high degree of technical expertise and can vary in the degree to which it is linked to the GIS. Once the GIS and the model are linked, however, they can be used to evaluate different features of the data, whether spatial or non-spatial. Predictive modeling, on the other hand, is a more powerful tool in which an expert acquires data, uses it to build a statistical model, and tests the model by regression analysis. Once the model has been tested on known data, it is applied to new data in order to predict results. This type of modeling has been used to predict processes such as flooding, groundwater contamination, and soil loss. Similarly, it is used here to carry out the land use – transportation (road density) interaction study that is the subject of this paper. The ability to link GIS to these models has greatly increased the usefulness of GIS as a scientific tool.
For the land use – transportation (road density) interaction modeling, predictive modeling is applied. The proposed model uses an open-source urban simulation application called UrbanSim, in which as many as eighteen independent variables for two major towns (Johor Bahru and Kuantan) are tested to determine the most significant parameters that contribute to urban road density.
To assist the modeling process, a GIS (ArcView) is used to establish the geospatial database, which includes data of various forms and from various sources. The model is fitted by least-squares regression and calibrated to assess its level of acceptability. Using the future land use plan, the model is then used to predict the road density level, and GIS is again applied to import the forecasted data and display their distribution graphically. The results demonstrate that only five variables have significant effects on road density, with each town having its own combination of these parameters.
2.0 Simulating Land Use-Transportation Interaction
2.1 Functional Relationship Between Land Use And Transportation
The demand for transportation is well understood to be derived from the demand for urban activity. Urban activity, in turn, is a function of land use. Interestingly, the provision of a transportation system (i.e. services and infrastructure) may also influence the development of land use. Thus, there is a clear interaction, or cause-and-effect relationship, between transportation and land use (see Figure 1).
Figure 1: The interrelationship between demand for transportation and development of land use
A land use-transportation interaction model, then, is a simulation model that combines theories, data and algorithms to represent the functioning of the land use and transportation systems. Once calibrated against a known scenario, this simulation model can be used to make predictions about the future. In urban and transportation planning, this ability to anticipate the likely future is important because urban and transportation policies affect people’s lives. More importantly, the interaction model helps create sustainable cities, since land use and transportation developments have their share of adverse impacts, e.g. social ills and environmental pollution.
Existing transportation economic theories consider transportation as a derived demand of land uses (Blunden and Black, 1984). This means that transportation does not exist for its own sake; it exists to provide accessibility to land uses or to serve as a medium for moving goods and services between land uses. Rarely do people travel for the sake of travelling. More often than not, people travel to obtain some perceived benefit at the destination, which is offered by the different land uses. Hence, the modeling of the interaction between land use and road transportation starts with the premise that future road transportation requirements are based on the land use changes that will take place (Miller, 2003). This relationship between land use and transportation activities is described graphically in Figure 2.
Figure 2: Relationship between land use and transportation activities (Source: Rosenbaum and Koenig, 1997)
Mathematically, this relationship between road transportation and land uses can be expressed by the following functional relationship:
$\hat{T}_{t+1} = f(\hat{L}_{t+1})$ (1)
where:
$\hat{T}_{t+1}$ = Estimated road transportation requirements (e.g. road density) at time t + 1
$\hat{L}_{t+1}$ = Estimated land uses at time t + 1
The dependent variable in Eq. (1), i.e. $\hat{T}_{t+1}$, which represents the road transportation variable, can be substituted by any other variable of interest, including road density. If other socioeconomic and environmental factors are considered, functional relationships similar to Eq. (1) can be developed that link each of these factors to the estimated future land uses $\hat{L}_{t+1}$.
On further examination, the estimated future land uses $\hat{L}_{t+1}$ is itself sensitive to other variables such as employment level, population size and the characteristics of the property market. The relationship between these variables is represented by:
$\hat{L}_{t+1} = g(E, P, M)$ (2)
where E is the employment level, P is the population size and M denotes the characteristics of the property market.
In fact, the functional relationships between the independent variables in Eq. (2) are not mutually exclusive. For example, the population size is dependent upon the availability of employment as well as on the availability of housing facilities. Similarly, the employment level is in turn dependent upon the availability of labour/manpower provided by the population. Also, for businesses/industries to operate (and offer employment), there must be appropriate building facilities available. Invariably, the multicollinearity between these variables makes mathematical modelling increasingly complex. Fortunately, there is a practical solution to the issue of multicollinearity between the independent variables of Eq. (2).
For the time being, if we substitute Eq. (2) into Eq. (1), the resulting functional relationship is as shown below:
$\hat{T}_{t+1} = f(g(E, P, M))$ (3)
In Eq. (3), two separate functional relationships can be distinguished:
• The inner function g(·) representing the land use dynamics, and
• The outer function f(·) representing the transportation dynamics
The modelling strategy for this study is therefore based on empirically solving these two separate but interrelated functions, g(·) and f(·), sequentially as shown in Figure 3:
Figure 3: Modeling sequence
2.2 Land Use – Transportation Models
The last 20 years saw a surge in the development of land use-transportation models. This is partly due to the availability of cheap computing power that was not available previously. Some of these interaction models, e.g. DRAM/EMPAL, are now outdated as the theoretical understanding of the interaction between land use and transportation matures. Most state-of-the-art land use-transportation interaction models are currently grounded in discrete choice theory. At present, there are three popular discrete-choice-based interaction models (Noth, et al., 2000) that are being implemented in various ongoing projects around the world. These models are:
• MEPLAN by Marcel Echenique & Partners
• TRANUS by Modellistica, Inc.
• UrbanSim by University of Washington, Seattle, USA.
Table 1 provides the theoretical and technical comparisons between these land use-transportation interaction models. Due to their similarities, MEPLAN and TRANUS are grouped together. Another model, DRAM/EMPAL, which uses an older spatial interaction approach, is included for the sake of comparison. The models differ in terms of structure, data requirements, geographical basis and a number of other criteria, as shown in Table 1.
Miller (2003) pointed out that UrbanSim is by far the most representative of current practice in land use and transportation interaction modeling. The fact that UrbanSim has been operationally implemented in several cities lends it further credibility as the preferred land use-transportation interaction model. It is also important to note that all the models except UrbanSim are based on proprietary licences. With UrbanSim, users have access to the source code, making customisation feasible and inexpensive.
Models
Criteria  DRAM/EMPAL  MEPLAN/TRANUS  UrbanSim
Model structure  Spatial Interaction  Spatial Input-Output  Discrete Choice
Household location choice  Modeled  Modeled  Modeled
Household classification  Aggregate  Aggregate  Disaggregate
Employment location choice  Modeled  Modeled  Modeled
Employment classification  Aggregate (8 categories)  Aggregate  Disaggregate (20 sectors)
Real estate development  Not modeled  Modeled  Modeled
Real estate classification  4 land uses  Aggregate, user defined  24 dev. types
Real estate measures  Acres  Acres, unit floor space  Acres, unit floor space
Real estate prices  Not modeled  Modeled  Modeled
Geographic basis  Census tracts  User defined zones  Grid cells
Temporal basis  Quasi-dynamic  Cross sectional  Annual, dynamic
Interaction with travel model  Yes  Yes  Yes
Software access  Proprietary  Proprietary  Open source
Table 1: Comparison between different land use-transportation interaction models
In UrbanSim, the interaction between land use and transportation is modeled on the premise that spatial decision processes shape the physical form of the urban area over time. These decision processes ultimately result in physical flows of people, goods and services within this area. These spatial decision processes include (Miller, 2003):
• Decisions to develop or redevelop land for various purposes
• Location and relocation decisions of firms
• Residential location and relocation decisions of households
• Labour market decisions of workers and employers
• Activity-travel decisions of persons and households
• Economic interactions among firms that result in the flow of goods and services among them
Miller (2003) goes on to show that UrbanSim provides a means to model these spatial decision processes through a set of complex modelling methods and theoretical constructs, which include, among others:
• Models of spatial interaction and accessibility
• Models of land and real estate markets
• Models of intraregional economic interaction
3.0 Methodology
This study mainly involves establishing a land use – road density model by regressing the observed data of eighteen independent variables for the towns of Johor Bahru and Kuantan. The simulation is performed using open-source software known as UrbanSim, whereas data input and display operations are carried out using ArcView GIS.
3.1 Data Modeling With UrbanSim
By design, modeling the interaction between land use and transportation is data intensive. The methods by which these data are collected and prepared must therefore be rigorously identified and defined, as the validity and reliability of the results depend on them. At this juncture, it is crucial to state the data requirements so that UrbanSim can function as it was designed to, namely simulating urban growth. To do this, an understanding of the architecture of UrbanSim is essential.
In general, the data required for running UrbanSim, based on Figure 4, can be divided into the following categories:
• Land use (spatial distribution) data
• Household data
• Employment data
• Property data
• Transportation data
Invariably, the above data must be obtained for the two study sites, Johor Bahru (Johor) and Kuantan (Pahang). The sources of each data type and the methods used for collecting them are indicated in Table 2.
Data Type  Sources  Method 
Land Use 


Household 


Employment 


Property Development 


Transportation 


Table 2: Sources and method for data collection
Thus, from Table 2, we see that GIS is an integral part of the model, as it is used to provide the land use data required for UrbanSim to perform the simulation of urban growth. The next section provides a detailed explanation of the role of GIS in providing the input data required for predicting road density.
Figure 4: Data modeling in UrbanSim
Source: Waddell (2002)
3.2 Geospatial Mapping And Analysis
Modeling with UrbanSim requires certain parameters from base-year data. Some data are spatial in nature, while others are purely numbers representing a certain quantity. The spatial datasets may be extracted from many sources such as maps, aerial photographs and satellite images.
Some of the basic parameters required as input to UrbanSim are as follows:
• A map subdivided into square grid cells (in this study, a 1 km x 1 km grid is used)
• Total number of households within a grid cell
• Percentage of occupied area for housing within a grid cell
• Percentage of occupied area for roads within a grid cell (road density)
• Distance from a grid cell (centre of the cell) to the nearest major highways or arterial roads
In preparing the above datasets, a geospatial database representing mainly the land use data of the two study towns is established. A number of analytical functions, including map overlay, distance measurement and attribute data manipulation, are applied to extract the required information. The GIS’s cartographic tools are later used (after the model is established) to display the actual, predicted and forecasted road density levels of each town.
3.2.1 Development of Geospatial Database
A geospatial database is a database that contains objects with locational information. Maps are an example of a medium that stores geospatial objects such as land uses (residential areas, roads, etc). With the advent of information technology tools such as those available in GIS, this geospatial information can be stored in a digital database, which can then be easily and quickly retrieved and analyzed for certain purposes.
In this study, all hardcopy maps representing themes related to the study are digitized and transformed into ground coordinates. These data are obtained from various sources, mainly from the Department of Surveying and Mapping Malaysia (JUPEM) and the associated local authorities. The datasets available include:
• Boundary of the study areas
• Land use
• Road network
• Boundary of housing estates (housing area)
• Boundary of planning blocks
• 1 km x 1 km grid map
To complement the maps as the basic source documents for extracting land uses, aerial photographs and remote sensing images are also used. The processing of these image data consists mainly of coordinate transformation and image interpretation.
The activities involved in developing this database are as follows:
• Data collection and compilation
• Map layer classification
• Determination of ground control points for coordinate transformation (establishment of configuration and acquisition of coordinate values)
• Scanning and on-screen digitizing
• Error cleaning and topology construction
• Coordinate transformation
• Attribute tagging
3.2.2 Feature Clipping
Since UrbanSim requires parameters to be entered by grid cell location, all features in a given map layer have to be clipped by cell. For this purpose, another layer representing the grid map (1 km x 1 km square cells) is created. This layer is used to clip each feature layer, so that statistical information about the features within any particular grid cell boundary can be extracted (e.g. the total length of the road network, the percentage of households, etc.).
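The clipping step can be illustrated with a minimal Python sketch. It is not the ArcView procedure used in the study: it assumes roads are stored as straight line segments in metre coordinates, uses Liang-Barsky clipping as a stand-in for the GIS clip function, and the 10 m road width used to turn length into a percentage of cell area is a purely hypothetical value. All function names are illustrative.

```python
from collections import defaultdict
from math import floor, hypot

CELL = 1000.0  # grid cell size in metres (1 km x 1 km, as in the study)

def clip_segment(p, q, xmin, ymin, xmax, ymax):
    """Liang-Barsky clipping: part of segment p-q inside the box, or None."""
    (x0, y0), (x1, y1) = p, q
    dx, dy = x1 - x0, y1 - y0
    t0, t1 = 0.0, 1.0
    for d, lo, hi, start in ((dx, xmin, xmax, x0), (dy, ymin, ymax, y0)):
        if d == 0:
            if start < lo or start > hi:
                return None  # parallel to this pair of edges and outside them
        else:
            ta, tb = (lo - start) / d, (hi - start) / d
            if ta > tb:
                ta, tb = tb, ta
            t0, t1 = max(t0, ta), min(t1, tb)
            if t0 > t1:
                return None
    return (x0 + t0 * dx, y0 + t0 * dy), (x0 + t1 * dx, y0 + t1 * dy)

def road_length_per_cell(segments, cell=CELL):
    """Sum the clipped road length per grid cell, keyed by (column, row)."""
    totals = defaultdict(float)
    for p, q in segments:
        cols = range(floor(min(p[0], q[0]) / cell), floor(max(p[0], q[0]) / cell) + 1)
        rows = range(floor(min(p[1], q[1]) / cell), floor(max(p[1], q[1]) / cell) + 1)
        for c in cols:
            for r in rows:
                part = clip_segment(p, q, c * cell, r * cell, (c + 1) * cell, (r + 1) * cell)
                if part:
                    length = hypot(part[1][0] - part[0][0], part[1][1] - part[0][1])
                    if length > 0:
                        totals[(c, r)] += length
    return dict(totals)

def road_density_percent(length_m, road_width_m=10.0, cell=CELL):
    """Percent of cell area under roads; road_width_m is a hypothetical value."""
    return 100.0 * length_m * road_width_m / (cell * cell)

# Illustrative data only: one road crossing two cells, one inside a single cell.
roads = [((200.0, 500.0), (1800.0, 500.0)),
         ((500.0, 100.0), (500.0, 900.0))]
lengths = road_length_per_cell(roads)
density = {cell_id: road_density_percent(m) for cell_id, m in lengths.items()}
```

A segment lying exactly on a shared cell edge would be counted in both neighbouring cells in this sketch; a production workflow would need a rule for such cases.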
3.2.3 Analysis of Overall Land Use Pattern
One of the maps digitized and stored in the database is a land use map that comprises polygons representing a certain type of land use. The type of land use is stored as an attribute in the feature’s attribute table that can be related to its spatial location automatically. This enables the land uses of the study area to be displayed and analyzed, for example to study the overall land use pattern.
The land use pattern can be analyzed visually or statistically. Visualization is done by portraying the land use types in different colors, keyed to the map legend. Statistical analysis, on the other hand, is done by selecting the records that represent a certain type of land use and computing the required attribute (such as the total area). Further computations, such as unit conversion and the percentage (proportion) of area by land use type, can be made if necessary.
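As a hedged illustration of the statistical analysis described above, the following Python sketch totals polygon areas by land use type and converts them to hectares and percentages. It assumes each record is a (land use, polygon ring) pair with coordinates in metres; the record layout, the function names and the sample parcels are assumptions for illustration, not the study's attribute tables.

```python
from collections import defaultdict

def polygon_area(ring):
    """Shoelace formula: area of a simple polygon given as a list of (x, y)."""
    s = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        s += x0 * y1 - x1 * y0
    return abs(s) / 2.0

def land_use_shares(polygons):
    """Return {land_use: (area_in_hectares, percent_of_total_area)}."""
    area = defaultdict(float)
    for land_use, ring in polygons:
        area[land_use] += polygon_area(ring)
    total = sum(area.values())
    return {k: (a / 10_000.0, 100.0 * a / total) for k, a in area.items()}

# Illustrative parcels covering one 1 km x 1 km cell.
parcels = [
    ("residential", [(0, 0), (600, 0), (600, 500), (0, 500)]),
    ("commercial", [(600, 0), (1000, 0), (1000, 500), (600, 500)]),
    ("residential", [(0, 500), (1000, 500), (1000, 1000), (0, 1000)]),
]
shares = land_use_shares(parcels)
```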
3.2.4 Extraction of Spatial Parameters
As mentioned earlier, this study is carried out using UrbanSim. The package requires input parameters, some of which are spatial in nature, such as the total number of households within a particular cell or the distance from a cell centre to a highway. To provide this information, spatial analysis tools available in the GIS are used, including map overlay and neighborhood search. Before the analysis, the associated datasets are given topology (i.e. information about the relationships between features).
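The distance parameter can be sketched in a few lines of Python. This is an illustrative stand-in for the GIS distance function, assuming highways are stored as straight segments in metre coordinates and that grid cells are indexed by (column, row) from a local origin; the names and the sample highway are hypothetical.

```python
from math import hypot

def point_segment_distance(pt, a, b):
    """Shortest distance from point pt to the segment a-b."""
    px, py = pt
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return hypot(px - ax, py - ay)  # degenerate segment
    # Project pt onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_to_nearest_highway(col, row, highways, cell=1000.0):
    """Distance from the centre of grid cell (col, row) to the closest highway segment."""
    centre = ((col + 0.5) * cell, (row + 0.5) * cell)
    return min(point_segment_distance(centre, a, b) for a, b in highways)

# Illustrative highway running east-west along y = 2000 m.
highways = [((0.0, 2000.0), (5000.0, 2000.0))]
d_near = distance_to_nearest_highway(1, 0, highways)  # centre at (1500, 500)
d_far = distance_to_nearest_highway(6, 2, highways)   # centre beyond the eastern end
```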
3.3 Development of Regression Model
Using the 1999 base-year data, a multivariate regression model is constructed. This regression model defines the interaction between the dependent variable, road density, and the independent variables, which are the land use variables.
However, in order to be mathematically correct, any collinearity between the variables must be removed. Collinearity in this case is defined as the simultaneous interaction between the independent variables. The removal of collinear variables is necessary to ensure that the final set of independent variables is truly independent of each other, so that the only interaction that exists is that between the independent variables and the dependent variable.
The removal of collinear independent variables is done through a statistic called the Variance Inflation Factor (VIF) where a large VIF value, i.e. VIF > 4, indicates the existence of collinearity in a specific variable. Thus, any independent variable that produces a VIF value greater than 4 is removed from the final regression equation. The computations of the VIF for each of the independent variables are done using SPSS statistical software.
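The VIF screening can be sketched in Python as follows. The study computed VIF in SPSS; this sketch instead uses the standard identity that the VIF of predictor j equals the j-th diagonal element of the inverse of the predictors' correlation matrix, which is equivalent to 1 / (1 - R_j^2). The variable names and values are hypothetical and chosen so that two predictors are nearly collinear.

```python
from math import sqrt

def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long data columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centred = [[v - m for v in c] for c, m in zip(columns, means)]
    norms = [sqrt(sum(v * v for v in c)) for c in centred]
    return [[sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
             for cj, nj in zip(centred, norms)]
            for ci, ni in zip(centred, norms)]

def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("perfectly collinear predictors")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vif(columns):
    """Variance Inflation Factor of each predictor column."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]

# Hypothetical predictors: the third is almost a multiple of the first.
names = ["fraction_residential", "avg_household_income", "commercial_sqft"]
data = [[1.0, 1.0, 3.0, 3.0],
        [1.0, 3.0, 1.0, 3.0],
        [11.0, 9.0, 29.0, 31.0]]
vifs = dict(zip(names, vif(data)))
flagged = [name for name, v in vifs.items() if v > 4]  # threshold used in the study
```

When several variables exceed the threshold together, a common practice is to drop the worst one and recompute, since removing one collinear variable usually lowers the VIF of its partner.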
As a final step in obtaining the regression equation, the independent variables that are not significant at the α = 0.05 level are removed. The rule adopted is that if a variable’s significance value p is greater than α = 0.05, the variable is not significant and is removed from the final regression equation. Otherwise, if p < α, the variable is considered significant and is retained. Thus, the final regression equation contains only variables that are significant and not collinear with other independent variables.
3.4 Model Calibration
In this step, the predicted road density for a base year (in this case, year 2000) is compared to the actual road density for the same year. This gives the prediction error, which forms the basis for model calibration. Several common accuracy measures can be applied to gauge the validity and reliability of the regression function, including the Mean Error (ME), Mean Squared Error (MSE) and Root Mean Squared Error (RMSE). This is followed by a statistical test of equality using the t-test. These accuracy measures establish the degree of confidence in the regression models developed for the study sites.
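The accuracy measures named above can be computed as in the following Python sketch. Two choices here are assumptions rather than details given in the study: the error is taken as predicted minus actual, and the t statistic is that of a paired test on the per-cell errors (with n - 1 degrees of freedom). The sample values are illustrative only.

```python
from math import sqrt

def error_measures(actual, predicted):
    """ME, MSE, RMSE and a paired t statistic for per-cell prediction errors."""
    errors = [p - a for a, p in zip(actual, predicted)]  # assumed sign convention
    n = len(errors)
    me = sum(errors) / n
    mse = sum(e * e for e in errors) / n
    # Sample standard deviation of the errors (n - 1 in the denominator).
    sd = sqrt(sum((e - me) ** 2 for e in errors) / (n - 1))
    t_stat = me / (sd / sqrt(n))  # tests whether the mean error differs from zero
    return {"ME": me, "MSE": mse, "RMSE": sqrt(mse), "t": t_stat}

# Illustrative road density values (percent of cell area) for four grid cells.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
measures = error_measures(actual, predicted)
```

The t statistic would then be compared against the critical value of the t distribution for the chosen significance level.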
3.5 Road Density Forecasting
Having established the model for the two study towns, the road density of any future year can be forecasted. As a test, the variables for the year 2010 are used to compute the percentage of space required for road development of each square grid. The numerical results of this computation are then exported into ArcView GIS where graphical representation of the road density level is displayed.
4.0 Results
The major outcome of the study is the establishment of a road density model in relation to the variation of land uses. The regressed models for Johor Bahru and Kuantan are shown in the following sections, together with graphical representations of the current (year 2000) and forecasted road density levels.
4.1 Johor Bahru
The actual road density for Johor Bahru in the year 2000 is shown in Figure 5. Based on the data input collected between 1990 and 1999, a regression model was developed that relates land use and road density for Johor Bahru. The regression model for Johor Bahru is given as:
Road Density = 4.072 + 7.329E5*P7 + 27.026*P4 + 1.387E3*H3 (4)
where:
P7 = Commercial Improvement Value
P4 = Fraction Residential Land
H3 = Average household income
Figure 5: Actual road density of Johor Bahru (year 2000)
Now, UrbanSim simulates the changes in land uses for Johor Bahru for the year 2000. Incorporating these predicted land use changes of 2000 into Eq. (4), a predicted road density level for the year 2000 is produced. The predicted road density for the year 2000 is shown in Figure 6. The accuracy of the prediction is measured using an error statistic called Root Mean Squared Error (RMSE). The RMSE for the prediction of road density for Johor Bahru is 10.2249.
Figure 6: Predicted road density of Johor Bahru (year 2000)
Moving further, UrbanSim is used to provide the simulated land use changes for the year 2010. The simulated data again are fed into Eq. (4) to provide the predicted road density for the year 2010. Graphically, the road density for Johor Bahru in the year 2010 is shown in Figure 7.
Figure 7: Predicted road density of Johor Bahru (year 2010)
4.2 Kuantan
The actual road density for Kuantan in the year 2000 is shown in Figure 8. The same process carried out for Johor Bahru is repeated for Kuantan to obtain the regression model that relates road density and land uses. However, the regression model for Kuantan is slightly different and is given as:
Road Density = 1.721E5*P5 + 8.555E7*P8 + 3.788E3*H3 (5)
where:
P5 = Commercial Square Feet
P8 = Industrial Square Feet
H3 = Average household income
Figure 8 and Figure 9 respectively show the actual and predicted road density for Kuantan in the year 2000. Comparing the predicted and the actual road density in the year 2000 gives an RMSE of 5.0588.
Figure 8: Actual road density of Kuantan (year 2000)
Figure 9: Predicted road density of Kuantan (year 2000)
Figure 10: Predicted road density of Kuantan (year 2010)
Finally, Figure 10 shows the predicted road density for Kuantan in the year 2010 using the simulated land use changes given by UrbanSim for the year 2010.
5.0 Conclusion
Based on this study, it is clear that GIS can play an important role in supporting the modeling process. This is possible either as an integral part of the software package used or, as applied in this study, coupled with an external modeling package. The geospatial data required as input to develop and establish a model, such as the land use – road density model, can be obtained by selecting appropriate GIS functions once the geospatial database is available. This proves to be far more convenient and efficient, especially when the required data have to be extracted from various sources and via complex processes such as map overlay and neighborhood analysis. Furthermore, analytical tools such as distance and area measurement, coupled with the ability to quickly compute quantities based purely on the data stored in the database, make GIS highly advantageous for a variety of tasks, including spatial modeling.
References
• Blunden, W. R. and Black, J. A. (1984). Land-Use/Transport System. 2nd ed. Sydney: Pergamon Press. 77 – 80.
• Miller, E. J. (2003). Land Use: Transportation Modeling. In: Goulias (Ed.). Transportation Systems Planning: Methods and Applications. Boca Raton, FL: CRC Press.
• Noth, P., Borning, A. and Waddell, P. (2000). An Extensible, Modular Architecture for Simulating Urban Development, Transportation, and Environmental Impacts. Manuscript submitted for publication.
• Rosenbaum, A. S. and Koenig, B. E. (1997). Evaluation of Modeling Tools for Assessing Land Use Policies and Strategies. Report EPA420-R-97-007. Ann Arbor, MI: Office of Mobile Sources, U.S. Environmental Protection Agency.
• Waddell, P. (2002). UrbanSim: Modelling Urban Development for Land Use, Transportation and Environmental Planning. Journal of the American Planning Association. 68 (3). 297 – 314.