Dataflow model for cloud computing frameworks in big data

Dong Dai, Yong Chen, Gangyong Jia

Research output: Chapter in Book/Report/Conference proceeding > Chapter > peer-review


In recent years, the Big Data challenge has attracted increasing attention [1-4]. Compared with traditional data-intensive applications [5-10], these “Big Data” applications tend to be more diverse: they need not only to process potentially large data sets but also to react to real-time updates of those data sets and provide low-latency interactive access to the latest analytic results. A recent study [11] exemplifies a typical form of these applications: computation/processing is performed on both newly arrived data and historical data simultaneously, and queries on recent results are supported. Such applications are becoming more and more common; for example, tweets published on Twitter [12] need to be analyzed in real time to find users’ community structure [13], which is needed for recommendation services and targeted promotions/advertisements. The transactions, ratings, and click streams collected in real time from users of online retailers like Amazon [14] or eBay [15] also need to be analyzed in a timely manner to continually improve the back-end recommendation system for better predictive analysis.

Original language: English
Title of host publication: High Performance Computing for Big Data
Subtitle of host publication: Methodologies and Applications
Publisher: CRC Press
Number of pages: 16
ISBN (Electronic): 9781498784009
ISBN (Print): 9781498783996
State: Published - Jan 1 2017


