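1. The Third Institute, Information Engineering University, Zhengzhou 450001, Henan, China
2. Henan Key Laboratory of Information Security, Zhengzhou 450001, Henan, China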
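ZHANG Hongqi (1962-), male, born in Zunhua, Hebei, Ph.D., professor and doctoral supervisor at Information Engineering University. His research interests include network security, risk assessment, classified protection, and information security management.
YANG Junnan (1993-), male, born in Gaocheng, Hebei, master's student at Information Engineering University. His research interests include network information security, game theory, and reinforcement learning.
ZHANG Chuanfu (1973-), male, born in Laiwu, Shandong, postdoctoral researcher, associate professor at Information Engineering University. His research interests include computer modeling and simulation technology.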
Online publication date: 2018-08
Print publication date: 2018-08-25
Hongqi ZHANG, Junnan YANG, Chuanfu ZHANG. Defense decision-making method based on incomplete information stochastic game and Q-learning[J]. Journal on Communications, 2018, 39(8): 56-68. DOI: 10.11959/j.issn.1000-436x.2018145.
Most existing stochastic game models assume complete information, which is inconsistent with real network attack and defense. To address this problem, the defender's uncertainty about the attacker's payoff is transformed into uncertainty about the attacker's type, and a stochastic game model with incomplete information is constructed. Because the network state transition probabilities are difficult to determine, the parameters needed to solve for the equilibrium cannot be obtained directly. To address this problem, Q-learning is introduced into the stochastic game, so that the defender learns the relevant parameters during attack-defense confrontation and uses them to solve for the Bayesian Nash equilibrium. On this basis, a defense decision-making algorithm capable of online learning is designed. Simulation experiments verify the effectiveness of the proposed method.
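The role Q-learning plays here can be illustrated with a minimal sketch. It shows only the standard tabular Q-learning update, in which a defender estimates action values from observed rewards without knowing the state transition probabilities. It is not the paper's algorithm: attacker types and the Bayesian Nash equilibrium computation are omitted, and the state names, action names, rewards, and probabilities in `toy_step` are all hypothetical.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-learning update in place; return the new value."""
    best_next = max(q[next_state].values())
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


def toy_step(state, action, rng):
    """Hypothetical environment, hidden from the learner (all numbers are assumptions)."""
    p_safe = 0.9 if action == "patch" else 0.3
    next_state = "safe" if rng.random() < p_safe else "compromised"
    # Reward: +1 if the network ends up safe, -1 otherwise, minus a patching cost.
    reward = (1.0 if next_state == "safe" else -1.0) - (0.2 if action == "patch" else 0.0)
    return reward, next_state


def train(step, states, actions, steps=2000, epsilon=0.1, seed=0):
    """Learn Q-values online with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in actions} for s in states}
    state = states[0]
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(actions)               # explore
        else:
            action = max(q[state], key=q[state].get)   # exploit current estimate
        reward, next_state = step(state, action, rng)
        q_update(q, state, action, reward, next_state)
        state = next_state
    return q


if __name__ == "__main__":
    table = train(toy_step, ["safe", "compromised"], ["patch", "wait"])
    for s, row in table.items():
        print(s, row)
```

The learner only ever calls `step` and observes the resulting reward and next state, which mirrors the motivation in the abstract: the transition probabilities never need to be specified in advance.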