TY - GEN
T1 - From where do tweets originate? - A GIS approach for user location inference
AU - Huang, Qunying
AU - Cao, Guofeng
AU - Wang, Caixia
N1 - Publisher Copyright: © 2014 ACM.
PY - 2014/11/4
Y1 - 2014/11/4
AB - A number of natural language processing and text-mining algorithms have been developed to extract geospatial cues (e.g., place names) and infer the locations of content creators from publicly available information, such as text content, online social profiles, and the behaviors or interactions of users on social networks. These studies, however, can infer user locations with reasonable accuracy only at the city level, while much higher resolution is required for meaningful spatiotemporal analysis in geospatial fields. Additionally, the geographical cues exploited by current text-based approaches are hidden in unreliable, unstructured, informal, ungrammatical, and multilingual data, and are therefore hard to extract and interpret correctly. Instead of using such hidden geographic cues, this paper develops a GIS approach that can infer the true origin of tweets down to the zip code level by mining the spatial (geo-tags) and temporal (timestamps of when a message was posted) information recorded in users' digital footprints. Further, individuals' major daily activity zones and mobility can be successfully inferred and predicted. By integrating GIS data and spatiotemporal clustering methods, the proposed approach can infer individual daily physical activity zones at spatial resolutions as fine as 20 m by 20 m, or even finer, depending on the number of digital footprints collected for social media users. Research results at this level of spatial detail are necessary and useful for various applications such as human mobility pattern analysis, business site selection, disease control, and transportation systems improvement.
KW - Big data
KW - Geography
KW - Human mobility
KW - Spatial clustering
KW - Spatiotemporal clustering
UR - http://www.scopus.com/inward/record.url?scp=84964068355&partnerID=8YFLogxK
U2 - 10.1145/2755492.2755494
DO - 10.1145/2755492.2755494
M3 - Conference contribution
AN - SCOPUS:84964068355
T3 - Proceedings of the 7th ACM SIGSPATIAL International Workshop on Location-Based Social Networks, LBSN 2014 - Held in conjunction with the 22nd ACM SIGSPATIAL GIS 2014
SP - 1
EP - 8
BT - Proceedings of the 7th ACM SIGSPATIAL International Workshop on Location-Based Social Networks, LBSN 2014 - Held in conjunction with the 22nd ACM SIGSPATIAL GIS 2014
A2 - Pozdnoukhov, Alexei
PB - Association for Computing Machinery, Inc
T2 - 7th ACM SIGSPATIAL International Workshop on Location-Based Social Networks, LBSN 2014
Y2 - 4 November 2014
ER -