Reinforcement learning for generating secure configurations

Shuvalaxmi Dass, Akbar Siami Namin

Research output: Contribution to journal › Article › peer-review

1 Scopus citations


Many security problems in software systems stem from vulnerabilities caused by improper configuration. A poorly configured software system exposes a multitude of vulnerabilities that adversaries can exploit. The problem becomes even more serious when the architecture of the underlying system is static and the misconfiguration persists for a long period of time, enabling adversaries to thoroughly inspect the system during the reconnaissance stage of an attack. Employing diversification techniques such as Moving Target Defense (MTD) can reduce the risk of exposing vulnerabilities. MTD is an evolving defense technique in which the attack surface of the underlying system changes continuously. However, the effectiveness of such a dynamically changing platform depends not only on how well the next configuration setting minimizes the attack surface but also on the diversity of the set of configurations generated. To address the problem of generating a large and diverse set of secure software and system configurations, this paper introduces an approach based on Reinforcement Learning (RL) in which an agent is trained to generate the desired set of configurations. The paper reports the performance of the RL-based approach to generating secure and diverse configurations through case studies.
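The idea in the abstract of rewarding an agent for configurations that are both secure and diverse can be illustrated with a small sketch. It is not the paper's algorithm: it uses an epsilon-greedy, bandit-style learner as a deliberately simplified stand-in for a full RL agent. The setting names, risk scores, reward shape, and the `generate_configs` function are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical settings with made-up risk scores (lower is safer).
# None of these names or numbers come from the paper.
RISK = {
    "password_min_length": {8: 0.6, 12: 0.3, 16: 0.1},
    "remote_root_login": {"yes": 0.9, "no": 0.1},
    "tls_version": {"1.0": 0.9, "1.2": 0.2, "1.3": 0.1},
}


def risk_of(config):
    """Total risk of a configuration given as a tuple of (setting, value) pairs."""
    return sum(RISK[name][value] for name, value in config)


def generate_configs(episodes=500, epsilon=0.3, alpha=0.2,
                     novelty_bonus=0.5, max_risk=0.7, seed=0):
    """Return the set of distinct configurations whose risk is at most max_risk."""
    rng = random.Random(seed)
    # One value estimate per (setting, value) pair, all starting at zero.
    q = {name: {v: 0.0 for v in options} for name, options in RISK.items()}
    seen, secure = set(), set()

    for _ in range(episodes):
        chosen = []
        for name, options in RISK.items():
            values = list(options)
            if rng.random() < epsilon:
                choice = rng.choice(values)  # explore a random value
            else:
                choice = max(values, key=lambda v: q[name][v])  # exploit best estimate
            chosen.append((name, choice))
        config = tuple(chosen)

        # Reward: low risk is good; a not-yet-seen configuration earns a
        # diversity bonus so the learner does not settle on a single answer.
        risk = risk_of(config)
        reward = -risk + (novelty_bonus if config not in seen else 0.0)
        seen.add(config)
        if risk <= max_risk:
            secure.add(config)

        # Move each chosen value's estimate toward the observed reward.
        for name, choice in config:
            q[name][choice] += alpha * (reward - q[name][choice])

    return secure
```

Calling `generate_configs()` returns a set of tuples such as `(("password_min_length", 16), ("remote_root_login", "no"), ("tls_version", "1.3"))`. In an MTD setting, such a pool could be rotated through over time. The novelty bonus is one simple way to trade off security against diversity, and other reward designs are equally plausible.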

Original language: English
Article number: 2392
Journal: Electronics (Switzerland)
Issue number: 19
State: Published - Oct 1 2021


Keywords

  • Moving target defense
  • Reinforcement learning
  • Secure software configuration

