Analysing the factors of deforestation using GIS

Socio-demographic factors play a dominant role in any country's development and, directly or indirectly, affect its natural resources. Population growth drives development in many directions, such as urbanisation, road networks and other infrastructure. The growing demand for minerals and other natural resources found in forest areas gives rise to mining industries, which in turn leads to the construction of roads to the nearest accessible towns. The forest areas most accessible to humans are deteriorating because of land use patterns, urban policy, road development and other development-related factors. In this paper, road construction, population, urban development and mining areas are considered as drivers of deforestation in the study area, which is located in the Kadapa (Cuddapah) and Chittoor districts of Andhra Pradesh, India.

The socio-demographic factors and land use patterns are identified and derived from remote sensing images using GIS. The main objective of this paper is to analyse the role of the different driving factors of deforestation, and the relationships among them, in the study area. To that end, an association analysis of the deforestation factors is carried out. Spatial databases and spatial data mining techniques, now in widespread use, help to reveal the inter-relational nature of spatial data. The spatial association extraction algorithm discussed by Koperski and Han [15] is used in this study. The algorithm searches for associations between spatial objects, or between spatial objects and attributes. In this study, rules are expressed by spatial and non-spatial predicates. The analysis reveals a positive association between each of the above factors and deforestation. The outcome of this paper can serve as input for decision makers on policy issues and technological developments.

Environmental management, and land-use planning specifically, takes place at different spatial and organisational levels, often corresponding to either eco-regional or administrative units, such as the national or provincial level. The information needed and the management decisions made differ from one location to another [25]. At the national level, it is often sufficient to identify regions that qualify as “hot-spots” of land use change, i.e., areas that are likely to face rapid land use conversion [25]. Land use changes and their impact on forest resources can be analysed using conventional methods such as change detection studies for deforestation. Once these hot-spots are identified, a more detailed analysis of land use change and its impact is often needed. Conventional statistical analysis can yield good solutions, but it is a tedious and time-consuming process. Data mining techniques can be used to handle complex spatial data and to derive strategic decisions from the knowledge obtained. The effect of land-use changes on natural resources can be determined by finding the interrelationships among various factors using association rule mining.

The main objective of this paper is to analyse the role of different factors (demographic, topography, road infrastructure, and mining units) in land cover change and deforestation processes, using association rule mining, in a study area covering 26 mandals of the Chittoor and Kadapa districts. Strong association rules suggest that population growth, urban development, road infrastructure, changes in land use patterns and industrial development are the main causes of deforestation in most developing countries. To control and reduce forest degradation, the government should know where, when, why and how such deforestation occurs and what measures can be taken to address the problem [20]. Technological advances in remote sensing, especially earth-observing satellites, appear to have made it easier for the scientific community to analyse human impact on the environment as well as naturally occurring changes. GIS, remote sensing and data mining together are well suited to answering the above questions.

Remote sensing can be the basis of fast but expensive data collection, and the analytical capabilities of a GIS can be used to analyse the types, locations and rates of change. The changes were identified by classifying the 1991, 2001 and 2011 satellite images into forest and non-forest areas and overlaying them. Information from these maps and other ancillary data is entered in the database [18][23]. Because the database contains both spatial and non-spatial data, deriving spatial association rules requires additional care. This paper presents an approach that incorporates spatial predicates describing the spatial relationships between land use patterns and the surrounding objects that may cause deforestation. A spatial association algorithm is implemented to carry out knowledge discovery for deforestation analysis. Traditional linear programming models fail to give reasonable weights to change events occurring at different locations. This paper demonstrates the application of association rule mining to spatial data. A spatial association rule describes the implication of a feature or a set of features by another set of features in spatial databases [1]. In large spatial databases many association relationships may exist, but most researchers use rules that reflect spatial objects and spatial/spatial or spatial/non-spatial relationships containing spatial predicates, e.g. adjacent_to/near_by, inside/with_in, intersecting, etc. Spatial association rules can represent object/predicate relationships containing spatial predicates [16].
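The overlay of classified forest/non-forest maps described above can be illustrated with a minimal sketch. It assumes each classified image has already been reduced to a small grid of 1 (forest) and 0 (non-forest) cells on a common extent; the grids, the label names and the helper functions are hypothetical and are not taken from the study's data.

```python
from collections import Counter

FOREST, NON_FOREST = 1, 0

# Transition labels for every (earlier class, later class) pair.
TRANSITIONS = {
    (FOREST, FOREST): "stable_forest",
    (FOREST, NON_FOREST): "deforested",
    (NON_FOREST, FOREST): "regrowth",
    (NON_FOREST, NON_FOREST): "stable_non_forest",
}


def change_map(earlier, later):
    """Overlay two classified grids cell by cell and label each transition."""
    return [
        [TRANSITIONS[(a, b)] for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(earlier, later)
    ]


def change_summary(earlier, later):
    """Count how many cells fall into each transition class."""
    return Counter(label for row in change_map(earlier, later) for label in row)


# Hypothetical 3 x 3 grids standing in for two classified images.
grid_1991 = [[1, 1, 1], [1, 1, 0], [1, 0, 0]]
grid_2011 = [[1, 1, 0], [1, 0, 0], [0, 0, 0]]

summary = change_summary(grid_1991, grid_2011)
print(summary["deforested"], "of", 9, "cells changed from forest to non-forest")
```

With three dates, the same function can be applied to each consecutive pair (1991 to 2001, 2001 to 2011) to separate the two periods of change.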

The objective of this work is to analyse the relationships between the land cover change process and socio-demographic factors, including road infrastructure and urban development in relation to population growth, which are considered as factors of deforestation in the study area. The purpose of this work is to demonstrate the application of association analysis to a spatial data set. The work is limited to mining association rules among the factors of deforestation at a generic level.

Spatial Association Rule

1. Data Mining
Recently, there has been a lot of research activity on knowledge discovery in large databases (data mining). These studies have led to the development of a set of interesting techniques, including mining strong association and dependency rules and attribute-oriented induction (AOI) for mining characteristic and discriminant rules. Such studies lay a foundation and provide methods for the exploration of promising spatial data mining techniques.

2. Spatial Data Mining
Spatial data mining can be categorised by the kinds of rules to be discovered in spatial databases. A spatial characteristic rule is a general description of a set of spatially related data. For example, the description of the general weather patterns in a set of geographic regions is a spatial characteristic rule. A spatial discriminant rule is a general description of the features that contrast or discriminate one class of spatially related data from other classes. This study focuses on the mining of association rules in spatial databases [24].

3. Spatial Association Rule Mining
A spatial association rule describes the implication of one feature or a set of features by another set of features in spatial databases [24]. It is a rule of the form “A => B”, where A and B are sets of predicates, some of which are spatial. Spatial data mining, i.e., mining knowledge from large amounts of spatial data, is a demanding field, since huge amounts of spatial data have been collected in applications ranging from remote sensing to geographical information systems (GIS), computer cartography, environmental assessment and planning. The collected data far exceeds people's ability to analyse it. Thus, new and efficient methods are needed to discover knowledge from large spatial databases [3][21][23]. Conventional attribute data mining methods have been extended to spatial data mining. According to the first law of geography, “everything is related to everything else, but near things are more related than distant things”; that is, values from samples taken near each other tend to be more similar than those taken farther apart. It follows that most spatial data in a GIS are not independent [10]. Most spatial association rule mining algorithms are derived from attribute association rule mining algorithms, which assume that the data are independent. Where that assumption fails, the rules or knowledge derived from spatial mining may be wrong. It is therefore important that the mining of spatial association rules takes spatial dependencies into consideration.

In data mining, one of the classic algorithms for association rules is Apriori [1], which has been extended into many algorithms such as Agrawal's CD (count distribution), CaD (candidate distribution) and DD (data distribution), Park's PDM algorithm, and Cheung's DMA and FDM algorithms. With the advancement of spatial data mining, many methods have been used for spatial association mining, such as spatial statistical analysis, geostatistics and spatial clustering. However, most of these approaches focus on discovering the spatial relationships among neighbouring data sets. In this work, the Apriori algorithm is applied to a spatial data structure for efficient determination of spatial association rules over spatial predicates.

4. Problem Definition
Mining spatial association rules can be decomposed into three main steps, the first of which is usually performed as data pre-processing:

Extract spatial predicates: A spatial predicate is a spatial relationship (e.g., distance, order, topological) between the reference feature type and a set of relevant feature types.

Find all frequent patterns/predicates: A set of predicates is a frequent pattern if its support is at least equal to a certain threshold, called minsup.

Generate strong rules: A rule is strong if it reaches minimum support and the confidence is at least equal to a certain threshold, called minconf.

5. Related Definition
Definition 1: A spatial association rule is a rule of the form $P_1 \wedge \dots \wedge P_m \Rightarrow Q_1 \wedge \dots \wedge Q_n$ $(s\%; c\%)$, where at least one of the predicates $P_1, \dots, P_m, Q_1, \dots, Q_n$ is a spatial predicate, $s\%$ is the support of the rule, and $c\%$ is the confidence of the rule.

Definition 2: The support of a conjunction of predicates $P = P_1 \wedge \dots \wedge P_m$ in a set $S$, denoted $\sigma(P/S)$, is the number of objects in $S$ that satisfy $P$ versus the total number of objects in $S$.

Definition 3: The confidence of a rule $P \Rightarrow Q$ in $S$, denoted $\varphi(P \Rightarrow Q/S)$, is the possibility that $Q$ is satisfied by a member of $S$ when $P$ is satisfied by the same member of $S$.

A single predicate is called a 1-predicate. A conjunction of k single predicates is called a k-predicate. In this paper, a large itemset containing k predicates is called a large k-itemset, and the set of all large k-itemsets is denoted Lk [7][8][15].
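The definitions of support, confidence and the large k-itemsets Lk can be made concrete with a minimal Apriori-style sketch. It assumes each spatial object has already been reduced to the set of predicates it satisfies; the five toy objects, the thresholds and all function names are hypothetical, and the code is not the implementation used in the study.

```python
from itertools import combinations


def support(itemset, objects):
    """Fraction of objects whose predicate set contains every predicate in itemset."""
    return sum(1 for preds in objects if itemset <= preds) / len(objects)


def large_itemsets(objects, minsup):
    """Level-wise search returning {k: set of large k-itemsets} (L1, L2, ...)."""
    singles = {frozenset([p]) for preds in objects for p in preds}
    current = {s for s in singles if support(s, objects) >= minsup}
    levels, k = {}, 1
    while current:
        levels[k] = current
        k += 1
        # Join: unions of two large (k-1)-itemsets that contain exactly k predicates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be large.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c, objects) >= minsup}
    return levels


def strong_rules(levels, objects, minconf):
    """Split each large itemset into antecedent => consequent and keep strong rules."""
    rules = []
    for k, itemsets in levels.items():
        for itemset in itemsets:
            for r in range(1, k):
                for antecedent in combinations(sorted(itemset), r):
                    p = frozenset(antecedent)
                    sup = support(itemset, objects)
                    conf = sup / support(p, objects)
                    if conf >= minconf:
                        rules.append((p, itemset - p, sup, conf))
    return rules


# Hypothetical objects, each described by the predicates it satisfies.
objects = [frozenset(s) for s in (
    {"is_a(Mine)", "within(Forest)", "adjacent_to(Road)", "adjacent_to(Degraded)"},
    {"is_a(Mine)", "within(Forest)", "adjacent_to(Degraded)"},
    {"is_a(Mine)", "adjacent_to(Road)"},
    {"is_a(Builtup)", "adjacent_to(Road)"},
    {"is_a(Builtup)", "adjacent_to(Road)", "adjacent_to(Degraded)"},
)]

levels = large_itemsets(objects, minsup=0.4)
rules = strong_rules(levels, objects, minconf=0.9)
for p, q, sup, conf in rules:
    print(sorted(p), "=>", sorted(q), f"(support {sup:.0%}, confidence {conf:.0%})")
```

The prune step is what makes the search level-wise: a k-itemset can only be large if all of its (k-1)-subsets are large, so most candidates are discarded before their support is counted.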

In a large database many association relationships may exist, but some occur rarely or do not hold in most cases. This study uses the concepts of minimum support and minimum confidence to find patterns that are relatively strong, i.e., that occur frequently and hold in most cases. A user or an expert may specify thresholds to restrict the discovered rules to strong ones. Although such rules are usually not 100% true, they carry nontrivial knowledge about spatial associations, and it is therefore interesting to “mine” (i.e., “discover”) them from large spatial databases. The discovered rules are useful in geography, environmental studies, biology, engineering and other fields. In this paper, spatial association rules are mined with a top-down, progressive deepening search technique. The technique first searches for large (i.e., frequently occurring) patterns, and for strong implication relationships among them, at a coarse resolution. The search then deepens and continues until no large patterns can be found. Only the candidate spatial predicates that are worth detailed examination are computed by refined spatial techniques (giving detailed predicates such as intersect, contain, etc.). Methods that do not handle spatial predicates, however, cannot discover rules reflecting the structure of spatial objects or spatial/spatial and spatial/non-spatial relationships such as adjacent_to, near_by, inside, close_to and intersecting. Spatial association rules complement them by representing object/predicate relationships that contain spatial predicates. For example, the following are spatial association rules [10][15].

Non-spatial consequent with spatial antecedent(s).

is_a(x, house) ^ close_to(x, beach) => is_expensive(x) (90%)

Spatial consequent with non-spatial/spatial antecedent(s).

is_a(x, gas_station) => close_to(x, highway) (75%)

Various kinds of spatial predicates can be involved in spatial association rules. They may represent topological relationships between spatial objects, such as disjoint, intersects, inside/outside, adjacent to, covers/covered by, equal, etc. They may also represent spatial orientation or ordering, such as left, right, north, east, etc., or contain distance information, such as close to, far away, etc. [16]. A large number of spatial association rules can be derived from a large spatial database. However, most people are interested only in patterns that occur relatively frequently (i.e., with large support) and rules that have strong implications (i.e., with high confidence). Rules with large support and high confidence are considered strong rules. Based on the above definitions, the process of mining strong spatial association rules in large databases is presented here [7][24].
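To make the topological and distance predicates concrete, here is a minimal sketch that evaluates them on axis-aligned rectangles. Real GIS layers hold arbitrary polygons and lines, so the `Box` type, the coordinates and the rule that any boundary contact (including a shared corner) counts as adjacent are simplifying assumptions for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle used as a stand-in for a spatial object."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def within(a, b):
    """True if box a lies entirely inside box b."""
    return b.xmin <= a.xmin and b.ymin <= a.ymin and a.xmax <= b.xmax and a.ymax <= b.ymax


def intersects(a, b):
    """True if the boxes share at least one point (touching counts)."""
    return a.xmin <= b.xmax and b.xmin <= a.xmax and a.ymin <= b.ymax and b.ymin <= a.ymax


def adjacent_to(a, b):
    """True if the boxes touch but their interiors do not overlap."""
    interiors_overlap = (
        a.xmin < b.xmax and b.xmin < a.xmax and a.ymin < b.ymax and b.ymin < a.ymax
    )
    return intersects(a, b) and not interiors_overlap


def gap(a, b):
    """Shortest distance between two boxes (0 if they touch or overlap)."""
    dx = max(b.xmin - a.xmax, a.xmin - b.xmax, 0.0)
    dy = max(b.ymin - a.ymax, a.ymin - b.ymax, 0.0)
    return math.hypot(dx, dy)


def close_to(a, b, threshold):
    """Distance predicate: true if the gap between the boxes is within threshold."""
    return gap(a, b) <= threshold


# Hypothetical layout in arbitrary map units.
forest = Box(0, 0, 10, 10)
mine = Box(2, 2, 4, 4)
road = Box(4, 0, 5, 10)
degraded = Box(7, 7, 9, 9)

print(within(mine, forest), adjacent_to(mine, road), close_to(mine, degraded, 5.0))
```

Each call returns a boolean that can be stored as one predicate of the object, for example `within(X, "Forest")` or `adjacent_to(X, "Road")`, before rule mining starts.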

The objective of this study is to identify patterns within a database containing socio-demographic and land cover change data, and thus to support hypothesis generation regarding the relationship between socio-demographic change and deforestation. The census data for 1991, 2001 and 2011 are acquired at the mandal level from the Census of India and the Chief Planning Office at the district headquarters. Data on limited and unlimited access highways are acquired from the Environmental Systems Research Institute (ESRI) Streets Database. Land cover data for 1991 and 2001 are acquired from the U.S. Geological Survey (USGS) and for 2011 from the National Remote Sensing Centre (NRSC); topographic maps from the 1970s, at 1,25,000 scale, are obtained from the Survey of India. From these, vector polygon data are generated from historic aerial photography [11][12][13].

Each polygon is classified into land use/land cover classes using the ISO Cluster method [3][13]. Spatial association rule mining involves a trade-off between pre-processing spatial relationships and calculating those relationships on the fly. This work uses a GIS to pre-process the relationships and encode them in a single table, so that they may be mined using association rule mining software developed for conventional (i.e. non-spatial) data and extended for geographic data. Land cover data are integrated with census data prior to association rule mining, so that spatial relationships among the different tract and land cover polygons are encoded as attributes of common spatial units within a single table. This strategy allows fast association rule mining without significant customisation of either the GIS or the conventional association rule mining software. The following variables are then calculated for each polygon (Figure 1) [9][11][13]:

  • Land Cover Change – Change in land cover 1991–2011
  • Urban Development – Conversion and size of the location from 1991 – 2011
  • Distance to Highway – Minimum distance and access to forest
  • Population Density – Difference in population, growth rate and density
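As an illustration of how a variable such as Distance to Highway might be computed, the sketch below finds the minimum distance from a point (for example a polygon centroid) to a highway stored as a polyline. The planar coordinates, the use of a single representative point per polygon and the function names are assumptions made for the example; a GIS would normally compute this on the full geometries.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b (planar coordinates)."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        # Degenerate segment: both end points coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_highway(point, polyline):
    """Minimum distance from a point to any segment of the polyline."""
    return min(
        point_segment_distance(point, a, b)
        for a, b in zip(polyline, polyline[1:])
    )


# Hypothetical highway with one bend, in arbitrary map units.
highway = [(0, 0), (10, 0), (10, 10)]
print(distance_to_highway((5, 3), highway))
```

The resulting numeric distance still has to be turned into a category such as near or far before conventional association rule mining can use it.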

As mentioned in the introduction, the general methodology for analysing the factors (drivers) of deforestation in the study area is to integrate socio-demographic data (from the Indian demographic census, 1991 – 2011) with land use/land cover change data (from satellite images: Landsat TM for 1991 and 2001 and IRS LISS III for 2011), at the level of census tracts, in a geographic information system (GIS). The main methods used to integrate census and remote sensing data are [3][5][16][11][13]:

  1. Classification of three images (scene 101/63);
  2. Construction and organization of a socio-demographic database, from census of 1991 – 2011, at the level of mandal and census tracts;
  3. Creation and organization of digital layers of road network, urban centers, rivers, census tracts and mandal boundaries;
  4. Development of a GIS structure, integrating the three main sources of data mentioned above (satellite images, census data and digital layers);
  5. Creation of variables for land cover, urban, mine and other topography, conservation units and road infrastructure within a GIS.

The spatial database studied adopts an extended relational data model and consists of a set of spatial objects and a relational database describing the non-spatial properties of these objects [8][11]. The study of spatial association relationships is confined to the study area, which includes some regions of the Kadapa and Chittoor districts of Andhra Pradesh and whose map is presented in Fig. 1, with the following database relations for organising and representing spatial objects [8][16][11][13].

  1. Mandal(Name, Area, Type, District, Geo, …)
  2. Road(Name, Length, Type, Geo, ….)
  3. Mine(Name, Area, Type, Geo, …)
  4. Urban/Builtup(Name, Area, Type, Geo, …)
  5. Population(MandalName, Population Size, density…)
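A minimal sketch of how two of these relations could be held in memory and joined into a single flat table is shown below. Only the attributes named in the schemata are used; the field types, the opaque `geo` placeholder, the sample values and the `join_population` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Mandal:
    name: str
    area: float
    type: str
    district: str
    geo: Any  # placeholder for the spatial pointer into the map layer


@dataclass
class Population:
    mandal_name: str
    population_size: int
    density: float


def join_population(mandals, populations):
    """Attach population attributes to each mandal by name (left join)."""
    by_name = {p.mandal_name: p for p in populations}
    rows = []
    for m in mandals:
        p = by_name.get(m.name)
        rows.append({
            "name": m.name,
            "district": m.district,
            "type": m.type,
            "area": m.area,
            "population_size": p.population_size if p else None,
            "density": p.density if p else None,
        })
    return rows


# Made-up records, not taken from the census.
mandals = [
    Mandal("Mandal A", 200.0, "Rural", "District 1", geo=None),
    Mandal("Mandal B", 80.0, "Urban", "District 2", geo=None),
]
populations = [Population("Mandal A", 30000, 150.0)]

rows = join_population(mandals, populations)
for row in rows:
    print(row)
```

A flat table of this kind, with one row per spatial unit, is the form that conventional association rule mining software expects.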

Notice that in the above relational schemata, the attribute “geo” represents a spatial object (a point, line, area, etc.) whose spatial pointer is stored in a tuple of the relation and points to a geographic map. The attribute “type” of a relation is used to categorize the types of spatial objects in the relation [8][11][13]. A number of authors have noted that there are problematic issues in applying association rule mining to spatial data.

One issue in spatial association rule mining is that, whereas non-spatial association rule mining seeks associations among transactions that are encoded explicitly in a database, spatial association rule mining seeks patterns in spatial relationships that are typically not encoded in a database but are embedded within the spatial framework of the geo-referenced data. These spatial relationships must be extracted from the data prior to the actual association rule mining. There is therefore a trade-off between pre-processing spatial relationships among geographic objects and computing those relationships on the fly [9]. Pre-processing improves performance, but the massive data volumes involved in encoding spatial relationships for all combinations of geographic objects prohibit the storage of all spatial relationships.

Another issue with spatial association rule mining is that conventional association rule mining is designed to work with categorical data, not numeric data such as metric distance. One approach to this problem is to discretize numeric data into ordinal categories and then mine those ordinal data for association rules [1][15]. For example, data such as metric distance may be parsed into categories of ‘near’ and ‘far.’ The choice of interval breaks in the conversion from numeric to categorical data type impacts the results of the rule mining, however, and can be particularly problematic if the discretization is too coarse to capture an interesting rule [16]. As an approach to this problem, some researchers have proposed methods for optimizing the discretization of numeric data for association rule mining, for instance using cluster analysis, computational geometry, or a heuristic method [25], although such approaches have generally not been extended to spatial association rule mining.
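The discretization step can be sketched as follows. The break values, the labels and the `discretize` helper are hypothetical; the example only shows how the same distances fall into different categories when the interval breaks change.

```python
from bisect import bisect_right


def discretize(value, breaks, labels):
    """Map a numeric value to an ordinal label.

    breaks must be ascending; a value equal to a break goes to the higher class.
    """
    if len(labels) != len(breaks) + 1:
        raise ValueError("need exactly one more label than breaks")
    return labels[bisect_right(breaks, value)]


# Hypothetical distances to the nearest road, in metres.
distances_m = [200, 800, 1200, 3000]

two_class = [discretize(d, [1000], ["near", "far"]) for d in distances_m]
three_class = [discretize(d, [500, 2000], ["near", "medium", "far"]) for d in distances_m]

print(two_class)
print(three_class)
```

With a single break at 1,000 m two of the four objects are near, while with breaks at 500 m and 2,000 m only one is, so the support of any rule involving the near category changes with the chosen breaks.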

Next, mining spatial association rules involves a data mining query that extracts the relevant data set from the database covering mandals, roads, built-up land, degraded areas and forest areas. Then the “adjacent_to” relationship between built-up land and the other related classes, the “adjacent_to” relationship between mines and the other feature classes, and the “with_in” relationship with other classes are computed, along with the support count of each entry, and presented in Tables 2 and 3.

Table 1. Relational table with population information


Table 2. Classified data with feature class


Table 3. Spatial data with spatial predicate names and counts


Table 4. Spatial data with spatial predicates and counts

The detailed computation process is not presented here, since it is similar to the mining of association rules from exact spatial relationships presented below. Spatial association rules can be extracted directly from Table 3. A few of the interesting rules are explored here [1][15][16]:

Figure 2 Map showing the change in builtup(Urban) land


Figure 3 Map showing the buffering of road


Figure 4 Map showing extent of Mining area

Is_a(X, “Mine”) => Adjacent_to(X, “Road”) : (73%) (1)
Is_a(X, “Builtup”) => Adjacent_to(X, “Road”) : (80%) (2)
Is_a(X, “Mine”) => Adjacent_to(X, “Degraded”) : (52%) (3)
Is_a(X, “Urban”) => Population(X, “High”) : (80%) (4)
Is_a(X, “Rural”) => Population(X, “Low”) : (72%) (5)
Is_a(X, “Mine”) ^ Within(X, “Forest”) => Adjacent_to(X, “Degraded”) : (80%) (6)
Is_a(X, “Builtup”) ^ Adjacent_to(X, “Degraded”) => Adjacent_to(X, “Forest”) : (78%) (7)
Is_a(X, “Urban”) ^ Adjacent_to(X, “Road”) => Population(X, “High”) : (80%) (8)
Is_a(X, “Builtup”) ^ Adjacent_to(X, “Degraded”) => Adjacent_to(X, “Forest”) : (80%) (9)
Is_a(X, “Mine”) ^ Within(X, “Forest”) ^ Adjacent_to(X, “Road”) => Adjacent_to(X, “Degraded”) : (87%) (10)
Is_a(X, “Urban”) ^ Adjacent_to(X, “Road”) ^ Adjacent_to(X, “Degraded”) => Population(X, “High”) : (80%) (11)

Algorithm for Mining Spatial Association Rules
The above rule mining process can be summarized in the following algorithm.

Algorithm: Mining the spatial association rules defined by Definition 1 in a large spatial database.

Input: A spatial database, a mining query, and a set of thresholds: minimum support (minsup[l]) and minimum confidence (minconf[l]).

Output: Strong spatial association rules for the relevant sets of objects and relations.

Method: Mining spatial association rules proceeds as follows.

Step 1: Task_relevant_DB := extract_task_relevant_objects(spatial and non-spatial)
Step 2: Predicate_DB := spatial_computation(Task_relevant_DB)
Step 3: Large_predicate_DB := filtering_with_minimum_support(Predicate_DB)
Step 4: Fine_predicate_DB := refined_spatial_computation(Large_predicate_DB)
Step 5: Find_large_predicates_and_mine_rules(Fine_predicate_DB)

Explanation of the detailed steps of the algorithm

Step 1 is accomplished by executing a spatial query. All the task-relevant objects are collected into one database, Task_relevant_DB [15].

Step 2 is accomplished by executing efficient spatial algorithms. Predicates describing spatial relations between objects are stored in an extended relational database, called Predicate_DB, which allows an attribute value to be either a single value or a set of values [15].

Step 3 computes the support of each predicate in Predicate_DB (storing the counts in a predicate-support table, for example for adjacent_to) and filters out the entries whose support is below the minimum support. The result, which contains all large 1-predicates, is called Large_predicate_DB.

Step 4 is accomplished by executing efficient spatial computation algorithms [8][15] at a fine resolution level on the Large_predicate_DB obtained in Step 3.

Step 5 computes the large k-predicates for all k and generates the strong association rules [15].
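The five steps can be sketched end to end on toy data. In this sketch every object is a point, the coarse predicate is a cheap bounding-square test and the refined predicate is the exact Euclidean distance; these choices, the object records, the thresholds and the function names are assumptions for illustration, and Step 5 is shown only for single predicates.

```python
import math
from collections import Counter


def coarse_near(a, b, t):
    """Cheap test: b lies inside the square of half-width t centred on a."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1])) <= t


def fine_near(a, b, t):
    """Exact test: Euclidean distance between a and b is at most t."""
    return math.hypot(a[0] - b[0], a[1] - b[1]) <= t


def mine_predicates(objects, reference_type, threshold, minsup):
    # Step 1: collect the task-relevant objects.
    refs = [o for o in objects if o["type"] == reference_type]
    others = [o for o in objects if o["type"] != reference_type]
    # Step 2: coarse spatial computation for every reference object.
    coarse = {
        r["name"]: {o["type"] for o in others
                    if coarse_near(r["xy"], o["xy"], threshold)}
        for r in refs
    }
    # Step 3: keep only the coarse predicates that reach minimum support.
    counts = Counter(t for types in coarse.values() for t in types)
    large = {t for t, c in counts.items() if c / len(refs) >= minsup}
    # Step 4: refined computation, restricted to the surviving predicates.
    fine = {
        r["name"]: {o["type"] for o in others
                    if o["type"] in large and fine_near(r["xy"], o["xy"], threshold)}
        for r in refs
    }
    # Step 5: support of the refined predicates (single predicates only).
    fine_counts = Counter(t for types in fine.values() for t in types)
    supports = {t: c / len(refs) for t, c in fine_counts.items()
                if c / len(refs) >= minsup}
    return large, supports


# Hypothetical point objects in arbitrary map units.
objects = [
    {"name": "M1", "type": "Mine", "xy": (0, 0)},
    {"name": "M2", "type": "Mine", "xy": (10, 0)},
    {"name": "M3", "type": "Mine", "xy": (20, 0)},
    {"name": "R1", "type": "Road", "xy": (0, 3)},
    {"name": "R2", "type": "Road", "xy": (10, 4)},
    {"name": "R3", "type": "Road", "xy": (24, 3)},
    {"name": "D1", "type": "Degraded", "xy": (4, 4)},
    {"name": "D2", "type": "Degraded", "xy": (14, 4)},
    {"name": "U1", "type": "Urban", "xy": (20, 2)},
]

large, supports = mine_predicates(objects, "Mine", threshold=5, minsup=0.5)
print("large coarse predicates:", sorted(large))
print("refined supports:", supports)
```

Because the bounding square contains the circle of the same radius, the coarse test never rejects a pair that the exact test would accept, so filtering on it first is safe and only the expensive refined computation is reduced.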

Result analysis and discussion
Demographic factors (population size, density and growth)
Much of the land use/cover change literature accepts that population change and distribution are a significant driver of global deforestation. “For example, Mather et al. [17] estimate that population explains approximately half of the variation in deforestation worldwide, while Allen and Barnes [2] consider it the primary cause of the planet's deforestation”. In this sense, the majority of global regression models of deforestation show that demographic factors (mainly population size, density and growth) are the most important drivers of tropical deforestation, as stated by Mather and Needle [17] and Allen and Barnes [2]. In fact, according to Carr, Suter and Barbieri, “much of the work attempting to quantify causal linkages between population and deforestation has been limited by an early focus on global level logistic regression analyses” [4]. Moreover, these regression models tend to suffer from data limitations, particularly in terms of which variables exist and how well they measure what they purport to measure [5][8]. The rules derived in this study show a decline in population growth alongside high population density in large or urban areas, which results in land scarcity and leads to deforestation where those areas are adjacent to forest. For example,

Is_a(X, “Urban”) => Population(X, “High”): (80%)

Is_a(X, “Rural”) => Population(X, “Low”): (72%)

Population Density and Urban growth
Finally, the population growth rate of the census tract has the lowest positive association with deforestation among all the factors (independent variables). Unlike population density, the rate of population growth does not seem to have an important effect on recent deforestation. The presence of conservation units also has an important negative effect on deforestation rates. In other words, higher rates of deforestation occur in census tracts outside conservation units. Based on the findings on the relationships between deforestation rates and the selected independent variables (factors), it can be said that the rural census tracts associated with higher rates of deforestation have larger population size and density, are located close to urban centres (within a 10 km radius), and have a denser road network, better socio-demographic conditions and higher rates of population growth. Moreover, the census tracts with higher deforestation rates are located in areas with smoother topography and outside conservation units. The rules derived in this study show that high population density and other developments convert and merge villages into towns and towns into urban areas, which results in deforestation where those areas are adjacent to forest. For example,

Is_a(X, “Urban”) => Population(X, “High”): (80%)

Is_a(X, “Rural”) => Population(X, “Low”): (72%)

Is_a(X, “Urban”) ^ Adjacent_to(X, “Road”) ^ Adjacent_to(X, “Degraded”) => Population(X, “High”): (80%)

Road infrastructure and access to urban towns
Deforestation models reviewed by Kaimowitz and Angelsen find that greater access to forests and to markets/towns accelerates deforestation [14]. Several models of this type, by Chomitz and Gray for Belize [6], by Mertens and Lambin for Cameroon [19] and by Rosero-Bixby and Palloni for Costa Rica [22], show a strong relation between roads and deforestation [3]. “Most studies show that forest clearing declines rapidly beyond distances of 2 or 3 kilometers from a road”, as specified by Angelsen and Kaimowitz [3][14]. Concerning access to urban markets, Chomitz and Gray show in a case study of Belize that areas closer to urban markets have less forest cover [6]. Mertens and Lambin, in a case study of Cameroon, state that deforestation rates fall markedly beyond a distance of 10 km from an urban centre [19][3][18]. In a review of 79 case studies, the presence of roads, especially road construction, is an important proximate cause of tropical deforestation, appearing in 61% of all the case studies reviewed. Proximity to urban centres is the factor (independent variable) with the second highest positive correlation with deforestation and is also highly correlated with population density. Road network density also shows a positive correlation with deforestation, but the association is not as strong as with the other independent variables [18][21][23]. The rules derived in this study show that population growth and high population density in large or urban areas result in infrastructural development and road construction, which also lead to deforestation where those areas are adjacent to forest. For example,

Is_a(X, “Urban”) ^ Adjacent_to(X, “Road”) ^ Adjacent_to(X, “Degraded”) => Population(X, “High”): (80%)

Mining and Road infrastructure and access to towns
According to Barbier and others, logging's indirect impact on deforestation may be greater than the direct impacts of timber removal and collateral damage to standing stock. Forested regions suffer additional damage if logging roads and operations facilitate access by follow-on settlers, who convert the logged-over forest to pasture, permanent crops or shifting cultivation. Predictive models of follow-on settlement could be employed in environmental impact assessments of proposed logging and mining concessions that entail road building. The rules derived in this study show a decline in population growth alongside high population density in large or urban areas, which results in land scarcity and leads to deforestation where the area lies within forest [3]. For example,

Is_a(X, “Mine”) ^ Within(X, “Forest”) ^ Adjacent_to(X, “Road”) => Adjacent_to(X, “Degraded”) : (87%)

The land cover change variables are created by extracting land cover classes, which are then aggregated to the spatial units of analysis. This procedure makes it possible to estimate the total area (and the percentage) of each land cover class within every mandal, thereby integrating census and remote sensing data. The land cover variables are created by overlaying the administrative division (mandal/census tract) layers with the land cover change maps. The urban and road infrastructure variables are created by overlaying the buffers of urban centres and roads with the mandal/census tract layers. First, buffers around the main roads are created and tested at 100, 200, 500, 800 and 1,000 metres. These buffers are then overlaid with the mandal/census tract layers to calculate the area (and the percentage of the area) of every mandal that lies inside the road buffers. These variables are therefore treated as a proxy for access to urban markets and road infrastructure. After all the spatial variables mentioned above have been created, the database is complete, so that the following groups of variables are available for every mandal and census tract: 1) census variables; 2) land cover variables; 3) other topographic variables; 4) urban and road infrastructure variables [3][9][20].
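The aggregation of land cover classes to spatial units can be sketched as a cell-by-cell overlay of two grids, one holding the unit identifier and one holding the land cover class. The grids, the unit names "A" and "B" and the `class_share_by_zone` helper are hypothetical; equal-area cells are assumed so that cell counts stand in for area.

```python
from collections import Counter, defaultdict


def class_share_by_zone(zone_grid, class_grid):
    """Percentage of each zone's cells that fall in each land cover class."""
    counts = defaultdict(Counter)
    for zone_row, class_row in zip(zone_grid, class_grid):
        for zone, cover in zip(zone_row, class_row):
            counts[zone][cover] += 1
    return {
        zone: {cover: 100.0 * n / sum(covers.values()) for cover, n in covers.items()}
        for zone, covers in counts.items()
    }


# Hypothetical 2 x 3 grids: unit identifiers and land cover classes.
zones = [
    ["A", "A", "B"],
    ["A", "A", "B"],
]
cover = [
    ["forest", "forest", "builtup"],
    ["forest", "degraded", "forest"],
]

shares = class_share_by_zone(zones, cover)
print(shares)
```

The same overlay applies to the buffer variables: replacing the land cover grid with a grid that marks cells inside or outside a road buffer gives the percentage of each unit lying within the buffer.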

Most of the deforestation models in the literature are empirical, and one of the most widely used methodologies is a qualitative model that represents the network of relationships between the independent variables and the deforestation rates. In this qualitative model, the factors positively associated with recent deforestation in the study area are population size, density and growth, socio-demographic conditions, and access to urban markets and road infrastructure. The major drivers of deforestation are population density and proximity to urban centres. These two factors, together with road density, are also highly correlated with each other, implying that population density might be a proxy for access to urban markets and road infrastructure. The conservation units, however, are distinguished by very low population densities and low living conditions. In sum, the factors with the strongest positive association with deforestation are population density and proximity to urban centres, which are also positively correlated with each other. The fact that many of the factors (independent variables) shown in Table 2 are highly related to each other indicates that no variable can be considered a single or isolated factor associated with deforestation; each should instead be seen as part of a “network of relationships”, with direct and indirect effects on deforestation processes. Based on this “network of relationships” among the independent variables and the deforestation rates, the qualitative model summarises the relationships, and possible causality, between socio-demographic factors, topographic and infrastructure attributes, the presence of conservation units and recent deforestation.

A web of associations can be observed between a set of different factors (demography, topography, road and urban infrastructure, and conservation units) and the deforestation processes in the study area. In some cases, deforestation could have generated income, thereby improving the socioeconomic conditions of the population in the census tract. The major advantage of adopting a qualitative model is the possibility of mapping and graphically representing the diversity of factors associated with deforestation [3][5][8][18][20]. Finally, it is important to mention that this analysis of the relationships between deforestation rates and the independent variables cannot incorporate every factor, owing to the complexity of the factors involved in the processes of land cover change and deforestation. As discussed in the literature, deforestation processes do not proceed linearly. In other words, they do not depend on one exclusive factor (e.g. population growth or road construction), nor are they ahistorical. Instead, they result from a combination of many factors (social, economic, demographic, political, institutional, etc.) operating at different spatial and temporal scales. In this sense, the scientific community has made a great effort to find new theories and methods of analysis that could strike a better balance between geographic coverage, analytical precision and realism in the analysis and modelling of deforestation.
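The strength of an individual association in such a web is commonly summarised by the support and confidence of a rule of the form "factors → deforestation". The sketch below computes only these two measures over a handful of spatial units; it is not the Koperski and Han algorithm, and the predicate names and records are hypothetical.

```python
def rule_metrics(records, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent.

    records:    list of sets, one set of predicates per spatial unit.
    antecedent: set of predicates on the left-hand side of the rule.
    consequent: set of predicates on the right-hand side of the rule.
    """
    total = len(records)
    matching_lhs = [r for r in records if antecedent <= r]
    matching_both = [r for r in matching_lhs if consequent <= r]
    support = len(matching_both) / total
    confidence = len(matching_both) / len(matching_lhs) if matching_lhs else 0.0
    return support, confidence

# Hypothetical predicates for eight spatial units (illustrative only).
units = [
    {"close_to_road", "high_pop_density", "deforested"},
    {"close_to_road", "close_to_urban", "deforested"},
    {"close_to_road", "high_pop_density", "close_to_urban", "deforested"},
    {"close_to_road"},
    {"high_pop_density", "close_to_urban", "deforested"},
    {"mining_area", "close_to_road", "deforested"},
    {"conservation_unit"},
    {"conservation_unit", "close_to_road"},
]

rules = [
    ({"close_to_road"}, {"deforested"}),
    ({"high_pop_density", "close_to_urban"}, {"deforested"}),
]
for lhs, rhs in rules:
    support, confidence = rule_metrics(units, lhs, rhs)
    print(f"{sorted(lhs)} -> {sorted(rhs)}: "
          f"support = {support:.2f}, confidence = {confidence:.2f}")
```

Support measures how often the whole pattern occurs among all units, while confidence measures how often the consequent holds when the antecedent does; neither establishes causality on its own.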

Against this background, our analysis of the factors associated with deforestation incorporates three important aspects: 1) it has wide geographic coverage; 2) it uses a highly disaggregated spatial unit of analysis (census tracts); and 3) it integrates a large and diverse set of variables (census variables, remote sensing/land cover variables, and other spatial variables such as topography and road infrastructure). In this sense, perhaps the most significant contribution of this study is the application of a methodology that integrates census and remote sensing variables, all aggregated at the level of census tracts, to analyse the associations between socio-demographic factors and deforestation.


  1. Agrawal, R., Srikant, R.: Fast Algorithms for Mining Association Rules. In: Proc. of the Twentieth VLDB Conference, Santiago, Chile (1994).
  2. ALLEN J. C. & BARNES D. F. (1985). The causes of deforestation in developing countries. Annals of the Association of American Geographers, 75, 163–184.
  3. “Analysis of demographic and socio-economic factors as drivers of deforestation in the Ribeira de Iguape River Basin, Brazilian Atlantic forest: a GIS integration of census and remote sensing data”.
  4. CARR, D. L.; SUTER, L. & BARBIERI, A. (2005). Population dynamics and tropical deforestation: state of the debate and conceptual challenges. Population and Environment, 27(1), 89-113.
  5. Chin-Jui Chang, Shiahn-Wern Shyue, “Association Rules Mining with GIS: An Application to Taiwan Census 2000”, Sixth International Conference on Fuzzy Systems and Knowledge Discovery, 2009.
  6. CHOMITZ, K. M. & GRAY, D. A. (1996). Roads, land use, and deforestation: a spatial model applied to Belize. World Bank Economic Review 10, 487-512.
  7. Donato Malerba and Francesca A. Lisi, “An ILP Method for Spatial Association Rule Mining”, Dipartimento di Informatica, Università degli Studi di Bari, via Orabona, 4 - 70126 Bari - Italy, 2008.
  8. Donato Malerba, Francesca A. Lisi, Annalisa Appice, Francesco Sblendorio, “Mining Spatial Association Rules in Census Data: A Relational Approach”, A research article, IST European SPIN project and MURST COFIN 2001.
  9. Jeremy Mennis, Jun Wei Liu, “Mining Association Rules in Spatio-Temporal Data: An Analysis of Urban Socioeconomic and Land Cover Change”, A research article, Transactions in GIS, 2005, 9(1):5-17.
  10. Jiangping Chen, Wuhan Hubei, “An Algorithm about Association Rule Mining Based on Spatial Autocorrelation”, the International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. Vol. XXXVII. Part B 6b. Beijing 2008.
  11. K.R. Manjula, S. Jyothi and S. Anand Kumar Varma, “Digitizing the Forest Resource Map Using ArcGIS”, Vol. 7, Issue 6, pp. 300 – 306, November 2010, ISSN (Online): 1694-0814, www.ijcsi.org.
  12. K.R. Manjula, S. Jyothi and S. Anand Kumar Varma, “Construction of Spatial Database for Forest Map Using ArcGIS and MS-Access”, Conference Proceedings, National Conference on Computing & Communication Technologies (NC3T – 2010), pp. 1 – 8, October 2010, IKON BOOKS Publisher & Distributors.
  13. K.R. Manjula, S. Jyothi, S. Anand Kumar Varma and Dr. S. Vijya Kumar Varma, “Construction of Spatial Dataset from Remote Sensing using GIS for Deforestation Study”, International Journal of Computer Applications (ISSN: 0975 – 8887), Vol. 31, No. 10, October 2011, pp. 26-32, DOI 10.5120/3862-5389, www.ijcaonline.org/archives/volume31/number10/3862-5389.
  14. KAIMOWITZ, D. & ANGELSEN, A. (1998). Economic models of tropical deforestation: a review. Bogor, Indonesia: CIFOR.
  15. Krzysztof Koperski, Jiawei Han “Discovery of Spatial association Rules in Geographic Information databases”, Research article from the Natural Sciences and Engineering Research Council of Canada and NCE, 1995,1996 &1998.
  16. L.K. Sharma, O.P. Vyas, U.S. Tiwary, R. Vyas, “A Novel Approach of Multilevel Positive and Negative Association Rule Mining for Spatial Databases”, MLDM 2005, LNAI 3587, pp. 620 – 629, Springer-Verlag Berlin Heidelberg 2005.
  17. MATHER, A. S. & NEEDLE, C. L. (2000). The Relationships of Population and Forest Trends. The Geographical Journal, 166, 2-13.
  18. Menaka Panta, Kyehyun Kim, Cholyoung Lee, Taehoon Kim, “Liaison of Population Factor, agriculture Expansion and Food Access in Forest degradation process: View from ancillary data Sources”, GIS Application in Environment, GISDevelopment.net, 2010.
  19. MERTENS, B. & LAMBIN, E. (1997). Spatial modelling of deforestation in Southern Cameroon: spatial disaggregating of diverse deforestation processes. Applied Geography, 17(2), 143-162.
  20. Mesgari Saadi, Ranjbar Abolfazl, “Analysis and Estimation of deforestation using Satellite Imagery and GIS”, GIS Application in Environment, GISDevelopment.net, 2000.
  21. Rodman, L. C., Jackson, J., Huizar III, R., and Meentemeyer, R. K., “An Association Rule Discovery System for Geographic Data”, 2006 IEEE International Geoscience and Remote Sensing Symposium, Denver, CO, Jul. 31- Aug. 4, 2006.
  22. ROSERO-BIXBY, L. & PALLONI, A. (1998). Population and deforestation in Costa Rica. Population and Environment, 20(2), 149-185.
  23. Valentina Camaran, “How to Effectively Analyze Deforestation in the Amazon Basin Through the use of Binary and fieldwork Data”, GIS Application in Environment, GISDevelopment.net, 1991.
  24. Vania Bogorny, Paulo Martins Engel, Luis O. Al, “Enhancing Spatial Association Rule Mining in Geographic Databases”, Anais do XXVII Congresso SBC & CTD, 2007.
  25. Wei Ding, Christoph F. Eick, Jing Wang, Xiaojing Yuan, “A Framework for Regional Association Rule Mining in Spatial Datasets”, Geoinformatica (2011) 15:1 – 28, DOI 10.1007/s10707-010-0111-6, Springer Science+Business Media, 2010.