Future-oriented tweets predict lower county-level HIV prevalence in the United States

Molly E. Ireland, H. Andrew Schwartz, Qijia Chen, Lyle H. Ungar, Dolores Albarracín

Research output: Contribution to journal, Article, peer-reviewed

37 Scopus citations


Objective: Future orientation promotes health and well-being at the individual level. Computerized text analysis of a dataset encompassing billions of words used across the United States on Twitter tested whether community-level rates of future-oriented messages correlated with lower human immunodeficiency virus (HIV) rates and moderated the association between behavioral risk indicators and HIV.

Method: Over 150 million tweets mapped to U.S. counties were analyzed using 2 methods of text analysis. First, county-level HIV rates (cases per 100,000) were regressed on aggregate usage of future-oriented language (e.g., will, gonna). A second, data-driven method regressed HIV rates on individual words and phrases.

Results: Counties with higher rates of future tense on Twitter had fewer HIV cases, independent of strong structural predictors of HIV such as population density. Future-oriented messages also appeared to buffer health risk: Sexually transmitted infection rates and references to risky behavior on Twitter were associated with higher HIV prevalence in all counties except those with high rates of future orientation. Data-driven analyses likewise showed that words and phrases referencing the future (e.g., tomorrow, would be) correlated with lower HIV prevalence.

Conclusion: Integrating big data approaches to text analysis and epidemiology with psychological theory may provide an inexpensive, real-time method of anticipating outbreaks of HIV and etiologically similar diseases.
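The first method named in the abstract, regressing county-level HIV rates on aggregate use of future-oriented language, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the two-word lexicon uses only the examples the abstract gives (will, gonna), the tokenizer is a simple regular expression, the toy county figures are invented and are not data from the study, and the regression is a single-predictor least-squares fit that omits the structural controls (such as population density) and the moderation analysis the abstract reports.

```python
import re

# Hypothetical mini-lexicon: the abstract lists only "will" and "gonna" as examples.
FUTURE_WORDS = {"will", "gonna"}


def future_rate(tweets, lexicon=FUTURE_WORDS):
    """Return the share of tokens in a county's tweets that are in the lexicon."""
    tokens = [
        token
        for tweet in tweets
        for token in re.findall(r"[a-z']+", tweet.lower())
    ]
    if not tokens:
        return 0.0
    return sum(token in lexicon for token in tokens) / len(tokens)


def ols_slope_intercept(x, y):
    """Fit y = intercept + slope * x by ordinary least squares (one predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor has no variance")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented toy data (NOT from the study): (tweets, HIV cases per 100,000).
TOY_COUNTIES = [
    (["i will call you tomorrow", "gonna study tonight"], 120.0),
    (["what a day", "we will see"], 260.0),
    (["stuck in traffic again", "so tired today"], 410.0),
]

rates = [future_rate(tweets) for tweets, _ in TOY_COUNTIES]
hiv = [cases for _, cases in TOY_COUNTIES]
slope, intercept = ols_slope_intercept(rates, hiv)
```

A negative `slope` on real data would correspond to the direction of association the abstract describes; the toy values above only demonstrate the mechanics and carry no evidential weight.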

Original language: English
Pages (from-to): 1252-1260
Number of pages: 9
Journal: Health Psychology
State: Published - 2015


  • Epidemiology
  • Future orientation
  • HIV
  • Language
  • Risk
  • Twitter

