Automated hot-spot identification for spatial investigation of disease indicators

Mario A. Pitalua Rodriguez, Susan Mengel, Lisaann S. Gittner, Hafiz M.K. Khan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This paper presents a new procedure that uses spatial statistics to identify clusters of counties with either a high or a low incidence of a disease (the dependent variable). These clusters form a spatial snapshot that describes the disease in the study area. Using this snapshot as a reference, the procedure evaluates potential factors (independent variables) and ranks them by how closely their own spatial snapshots resemble that of the disease. The greater the similarity, the greater the likelihood of a causal relationship. Similarity can also guide the selection of variables to consider, rather than relying only on the researcher's expertise. The procedure is used to analyze Cardiovascular Disease at the county level for the contiguous 48 states using the Public Health Exposome, a data repository of environmental factors to which a given group of people may be exposed over the course of their lifetime and that may impact their health. The proposed procedure enables the analysis of a study area with a large number of regions, such as entire countries, while still resolving detail at the level of a smaller area, such as a county. In contrast, researchers may limit their work to a small number of regions because of computational and analytical limitations. In addition, the procedure yields a ranking of independent variables according to their effect on the dependent variable. In the past, Public Health researchers reported that analytical approaches required days of extremely complex statistics and computational time, which restricted their analysis to 60 variables. The proposed procedure, run at the Texas Tech High Performance Computing Center, takes 12 minutes for 168 variables and a study area with 3,028 regions.
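
The abstract does not spell out the computation, so the following is only a minimal sketch of one way such a procedure could look, not the authors' implementation. It assumes the Getis-Ord Gi* statistic (suggested by the "Gi" keyword), a weights matrix that includes each region as its own neighbor, a z-score cutoff of 1.96 for hot and cold spots, and similarity measured as the fraction of regions whose labels match. All function names and the toy data are hypothetical.

```python
import numpy as np

def gi_star(x, w):
    """Getis-Ord Gi* z-scores for values x and an (n, n) weights matrix w.

    Assumption: w includes self-weights on its diagonal (the "star" variant).
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    n = x.size
    xbar = x.mean()
    s = np.sqrt((x ** 2).mean() - xbar ** 2)   # population standard deviation
    wsum = w.sum(axis=1)
    w2sum = (w ** 2).sum(axis=1)
    num = w @ x - xbar * wsum
    den = s * np.sqrt((n * w2sum - wsum ** 2) / (n - 1))
    return num / den

def snapshot(x, w, z_crit=1.96):
    """Label each region: 1 = hot spot, -1 = cold spot, 0 = neither."""
    z = gi_star(x, w)
    return np.where(z >= z_crit, 1, np.where(z <= -z_crit, -1, 0))

def rank_by_similarity(disease, factors, w):
    """Rank candidate factors by how often their labels match the disease's."""
    ref = snapshot(disease, w)
    scores = {name: float(np.mean(snapshot(v, w) == ref))
              for name, v in factors.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy example: 10 regions on a line, each neighboring itself and adjacent regions.
n = 10
idx = np.arange(n)
w = (np.abs(idx[:, None] - idx[None, :]) <= 1).astype(float)

disease = np.array([9, 9, 9, 1, 1, 1, 1, 1, 1, 1], dtype=float)
factors = {
    "same": disease.copy(),       # identical spatial pattern
    "reversed": disease[::-1],    # high values at the opposite end
}
ranking = rank_by_similarity(disease, factors, w)
```

In this toy setup the factor with the same spatial pattern as the disease ranks above the mirrored one. A real analysis would need county contiguity weights and some correction for multiple testing, neither of which is shown here.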

Original language: English
Title of host publication: Proceedings - 5th IEEE International Conference on Big Data Service and Applications, BigDataService 2019, Workshop on Big Data in Water Resources, Environment, and Hydraulic Engineering and Workshop on Medical, Healthcare, Using Big Data Technologies
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 9-18
Number of pages: 10
ISBN (Electronic): 9781728100593
State: Published - Apr 2019
Event: 5th IEEE International Conference on Big Data Service and Applications, BigDataService 2019 - Newark, United States
Duration: Apr 4 2019 → Apr 9 2019

Publication series

Name: Proceedings - 5th IEEE International Conference on Big Data Service and Applications, BigDataService 2019, Workshop on Big Data in Water Resources, Environment, and Hydraulic Engineering and Workshop on Medical, Healthcare, Using Big Data Technologies

Conference

Conference: 5th IEEE International Conference on Big Data Service and Applications, BigDataService 2019
Country: United States
City: Newark
Period: 04/4/19 → 04/9/19

Keywords

  • Exposome
  • Gi
  • Hot-spot Identification
  • Spatial Auto-Correlation
  • Variable Diffusion
  • Variable Selection
