ZHOU Quan, NIU Yingtao. Fast deep reinforcement learning anti-jamming algorithm based on similar sample generation[J]. Journal on Communications, 2024, 45(07): 117-126. DOI: 10.11959/j.issn.1000-436x.2024131.
Fast deep reinforcement learning anti-jamming algorithm based on similar sample generation
To improve the learning efficiency of anti-jamming algorithms based on deep reinforcement learning and enable them to adapt more quickly to unknown jamming environments, a fast deep reinforcement learning anti-jamming algorithm based on similar sample generation was proposed. By combining a similarity measure for state-action pairs, derived from bisimulation, with an anti-jamming algorithm built on the deep Q-network, the algorithm was able to quickly learn effective multi-domain anti-jamming strategies in unknown, dynamic jamming environments. Specifically, once a transmission action was completed, the proposed algorithm first interacted with the environment using the deep Q-network to acquire actual state-action pairs. It then generated a set of similar state-action pairs based on bisimulation and used these similar pairs to produce simulated training samples. Through these operations, the algorithm was able to acquire a large number of training samples at each iteration step, thereby significantly accelerating training and convergence. Simulation results show that under comb sweep jamming and intelligent blocking jamming, the proposed algorithm converges rapidly, and its normalized throughput after convergence is significantly superior to that of the conventional deep Q-network algorithm, the Q-learning algorithm, and the improved Q-learning algorithm based on knowledge reuse.
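
The sample-generation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the bisimulation-derived similarity measure, so `toy_distance` is a hypothetical stand-in, and the names `generate_similar_samples`, `NUM_CHANNELS`, and `replay_buffer` are assumptions introduced for illustration. The toy setting also assumes that the state is the index of the jammed channel, the action is the channel chosen for transmission, and a similar pair can reuse the observed reward and next state.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple

Pair = Tuple[int, int]                    # (state, action)
Transition = Tuple[int, int, float, int]  # (state, action, reward, next_state)


def generate_similar_samples(
    real: Transition,
    candidates: List[Pair],
    distance: Callable[[Pair, Pair], float],
    threshold: float,
) -> List[Transition]:
    """Build simulated transitions for candidate pairs close to the real pair."""
    state, action, reward, next_state = real
    samples: List[Transition] = []
    for cand in candidates:
        if cand == (state, action):
            continue  # the real pair is stored separately
        if distance((state, action), cand) <= threshold:
            # Assumption: a similar pair shares the observed reward and next state.
            samples.append((cand[0], cand[1], reward, next_state))
    return samples


def toy_distance(p: Pair, q: Pair) -> float:
    """Hypothetical similarity measure (not the paper's metric).

    Returns 0.0 when both pairs share the state and the same outcome
    (transmission jammed or not), and 1.0 otherwise.
    """
    (s1, a1), (s2, a2) = p, q
    same_state = s1 == s2
    same_outcome = (a1 == s1) == (a2 == s2)
    return 0.0 if same_state and same_outcome else 1.0


NUM_CHANNELS = 4  # assumed toy value
replay_buffer: Deque[Transition] = deque(maxlen=1000)

# One real interaction: channel 2 is jammed, the agent transmits on channel 0,
# the transmission succeeds (reward 1.0), and the jammer moves to channel 3.
real = (2, 0, 1.0, 3)
candidates = [(2, a) for a in range(NUM_CHANNELS)]
simulated = generate_similar_samples(real, candidates, toy_distance, threshold=0.0)

# The real sample and the simulated samples all feed the same replay buffer,
# from which a deep Q-network would draw its training batches.
replay_buffer.append(real)
replay_buffer.extend(simulated)
print(len(replay_buffer))
```

In this toy case a single real transmission yields three stored samples instead of one, which is the kind of per-step sample gain the abstract credits for faster convergence.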