Geostatistical analysis for epidemiology

The term ‘geostatistics’ was coined for the statistical study of natural phenomena. In its formative years in the 1950s and 60s, it was used primarily to improve the evaluation of recoverable reserves in mineral deposits. Since its inception in the mining industry, the applications of the field have widened considerably, and geostatistics has emerged as a primary tool for analysing spatial data in fields ranging from agricultural sciences to remote sensing. Over the last few years, geostatistics has been adapted to spatial epidemiology, which is concerned with the study of spatial patterns of disease and mortality, as well as the identification of potential causes of disease such as environmental exposure, diet, unhealthy behaviours, and economic or socio-demographic factors.

In recent times, geostatistical techniques such as semivariograms, kriging and stochastic simulation have been adapted to study spatial patterns that help public health officials identify areas with an excess of cases and guide surveillance and control activities, including the allocation of health services and of resources for screening and diagnosis.
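
To make the first of these techniques concrete, here is a minimal sketch of an empirical semivariogram, which measures how dissimilar observations become as the distance between them grows. It uses the classical estimator (half the mean squared difference between pairs of observations in each distance bin) and only the Python standard library. The function name, the four sites and the rate values are hypothetical and chosen purely for illustration; a real study would use many more locations and a dedicated geostatistics package.

```python
import math

def empirical_semivariogram(coords, values, bin_edges):
    """Classical estimator: for each distance bin, half the mean squared
    difference between all pairs of observations separated by that distance."""
    n_bins = len(bin_edges) - 1
    sums = [0.0] * n_bins   # running sum of squared differences per bin
    counts = [0] * n_bins   # number of pairs that fell into each bin
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            h = math.dist(coords[i], coords[j])  # separation distance
            for k in range(n_bins):
                # Bins are (lower, upper]; co-located pairs (h == 0) are skipped
                # when the first edge is 0.
                if bin_edges[k] < h <= bin_edges[k + 1]:
                    sums[k] += (values[i] - values[j]) ** 2
                    counts[k] += 1
                    break
    # None marks bins that contain no pairs
    gamma = [s / (2 * c) if c > 0 else None for s, c in zip(sums, counts)]
    return gamma, counts

# Hypothetical disease rates observed at four sites along a transect
coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
rates = [1.0, 2.0, 4.0, 7.0]
gamma, counts = empirical_semivariogram(coords, rates, [0, 1.5, 2.5, 3.5])
print(gamma, counts)
```

In this toy example the semivariance rises with distance, which is the signature of spatial correlation: nearby sites have more similar rates than distant ones. Kriging uses a model fitted to such a curve to weight neighbouring observations when predicting values at unsampled locations.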

The data available for human health studies generally fall into two main categories: individual-level data, based on patients and controls, and aggregate data, reported for units such as states, districts or pin codes. Neither type falls within the category of geostatistical data as traditionally defined in the classical spatial statistics literature, so geostatistics offers an alternative to the common methods used for analysing spatial point processes and lattice data.

New diseases and epidemics spread through the world’s population every year. Geostatistics gives epidemiologists a strong framework for monitoring these diseases and identifying their causes, bringing together research methods and analytic techniques from both medical geography and spatial epidemiology. GIS is used to assess proximity, aggregation and clustering, and to perform spatial smoothing, interpolation and spatial regression. The most common application of geostatistics is the identification of clusters, that is, non-random spatial distributions of disease cases, incidence or prevalence. The main forms of clustering are:

  • Global clustering, in which no cluster areas are pre-specified and the presence of clusters is derived empirically
  • Local clustering, in which the existence of small-scale clusters is evaluated statistically
  • Focal clustering, which is the assessment of clustering around a predetermined location, for example an environmental hazard
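
As an illustration of the first item, global clustering in aggregate data is often summarised with a spatial autocorrelation statistic. The sketch below computes global Moran's I, one common choice and not necessarily the method any particular study would use, in plain Python. The four regions, their adjacency matrix and the rate values are hypothetical. Values well above the no-autocorrelation expectation of -1/(n-1) suggest that similar rates sit next to each other, while strongly negative values suggest a checkerboard pattern. In practice, significance would be assessed, for example, with a permutation test, which is omitted here.

```python
def morans_i(values, weights):
    """Global Moran's I for values observed in n regions.
    weights[i][j] > 0 when regions i and j are neighbours (diagonal is 0)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]              # deviations from the mean
    s0 = sum(sum(row) for row in weights)         # sum of all weights
    # Cross-products of deviations, counted only between neighbours
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * num / den

# Hypothetical example: four regions in a row, each adjacent to the next
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(round(morans_i([1, 2, 3, 4], adjacency), 3))  # smooth gradient: 0.333
print(round(morans_i([1, 4, 1, 4], adjacency), 3))  # alternating rates: -1.0
```

Local and focal clustering need different tools, such as local indicators computed per region or tests centred on the suspected source, but they rest on the same idea of comparing each area with its neighbours.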

Consider the current COVID-19 pandemic sweeping across the globe, in which GIS is being used extensively for epidemiology and disease surveillance. By mapping rapidly evolving disease cases in geographic space, local and national governments are identifying the distribution and spread of the disease across nations and evaluating methods of intervention. When GIS is combined with other technologies and research methods such as medical geography and geostatistical analysis, governments are able to assess potential risk factors and plan accordingly.

The analysis of health data and putative covariates is fast becoming a promising application of geostatistical analysis. Geostatistics can incorporate multiple layers of information and account for spatial correlation when mapping contaminant concentrations, which improves the ability to analyse and predict patterns and to model the spread of infection in relation to external factors captured in spatial data sets. As overlapping disciplines build on each other, a multidisciplinary approach is now required, one that facilitates the formulation of hypotheses and the identification of relationships in the spatial patterns of diseases.

Geostatistical analysis for epidemiology plays a vital role in this ongoing endeavour because it can account for both the randomness and the spatial structure that characterise regionalised variables. Now that the merits of geostatistics over existing empirical mapping methods are being recognised, it will be interesting to see how newer technologies, analytical techniques and data sources transform the future of this discipline.