Data Centers Job Scheduling with Deep Reinforcement Learning

Sisheng Liang, Zhou Yang, Fang Jin, Yong Chen

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

14 Scopus citations


Efficient job scheduling in data centers under heterogeneous complexity is crucial but challenging, since it involves allocating multi-dimensional resources over time and space. To adapt to the complex computing environment in data centers, we propose an innovative Advantage Actor-Critic (A2C) deep reinforcement learning approach, called A2cScheduler, for job scheduling. A2cScheduler consists of two agents: the actor, which learns the scheduling policy automatically, and the critic, which reduces the estimation error. Unlike previous policy gradient approaches, A2cScheduler is designed to reduce the gradient estimation variance and to update parameters efficiently. We show that A2cScheduler achieves competitive scheduling performance on both simulated workloads and real data collected from an academic data center.
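The actor/critic split described in the abstract can be illustrated with a minimal tabular sketch. This is not the authors' A2cScheduler, which uses deep networks and multi-dimensional resource states; it assumes a hypothetical one-step problem in which the scheduler picks one of several waiting jobs and receives the negative job duration as reward. The names `train`, `durations`, `lr_actor`, and `lr_critic` are assumptions made for illustration only. The critic keeps a single value estimate that serves as a baseline, and the actor is updated with the advantage (reward minus baseline), which is the variance-reduction idea the abstract refers to.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train(durations, steps=5000, lr_actor=0.1, lr_critic=0.1, seed=0):
    """One-step advantage actor-critic on a toy job-selection task.

    durations: hypothetical run times of the waiting jobs.
    Returns the learned action probabilities and the critic's value estimate.
    """
    rng = random.Random(seed)
    logits = [0.0] * len(durations)  # actor parameters (one per job)
    value = 0.0                      # critic: value of the single state

    for _ in range(steps):
        probs = softmax(logits)
        action = rng.choices(range(len(durations)), weights=probs)[0]
        reward = -durations[action]  # shorter jobs give higher reward

        # Advantage: how much better the reward was than the critic expected.
        advantage = reward - value

        # Actor update: gradient of log softmax is (one_hot - probs).
        for a in range(len(logits)):
            grad_log_prob = (1.0 if a == action else 0.0) - probs[a]
            logits[a] += lr_actor * advantage * grad_log_prob

        # Critic update: move the value estimate toward the observed reward.
        value += lr_critic * advantage

    return softmax(logits), value


if __name__ == "__main__":
    probs, value = train([5.0, 1.0, 3.0])
    print("action probabilities:", probs)
    print("critic value estimate:", value)
```

Because the critic's estimate is subtracted from every reward, actions are reinforced only when they beat the current expectation, so the updates shrink as the policy settles. A full scheduler would replace the single value and the logits table with state-dependent networks.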

Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining - 24th Pacific-Asia Conference, PAKDD 2020, Proceedings
Editors: Hady W. Lauw, Ee-Peng Lim, Raymond Chi-Wing Wong, Alexandros Ntoulas, See-Kiong Ng, Sinno Jialin Pan
Number of pages: 12
ISBN (Print): 9783030474355
State: Published - 2020
Event: 24th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2020 - Singapore, Singapore
Duration: May 11 2020 to May 14 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12085 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 24th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2020


  • Actor critic
  • Cluster scheduling
  • Deep reinforcement learning
  • Job scheduling

