Intelligent anti-jamming decision algorithm based on proximal policy optimization

MA Song; LI Li; LI Wei; HUANG Wei; WANG Jun

doi:10.11959/j.issn.1000-436x.2024137

Correspondences | Updated: 2024-09-10

- Journal on Communications Vol. 45, Issue 8, Pages: 249-257(2024)
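- Author affiliations:
  
  1. Southwest China Institute of Electronic Technology, Chengdu, Sichuan 610036
  2. National Key Laboratory of Wireless Communications, University of Electronic Science and Technology of China, Chengdu, Sichuan 611731
  3. Southwest China Research Institute of Electronic Equipment, Chengdu, Sichuan 610036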
- Funding:
  
  The National Natural Science Foundation of China (62131005, 62071096)
- DOI: 10.11959/j.issn.1000-436x.2024137
  CLC: TN92
- Received: 26 December 2023,
  
  Revised: 10 April 2024,
  
  Published: 25 August 2024
MA Song, LI Li, LI Wei, et al. Intelligent anti-jamming decision algorithm based on proximal policy optimization[J]. Journal on Communications, 2024, 45(08): 249-257. DOI: 10.11959/j.issn.1000-436x.2024137.

Abstract

Existing intelligent anti-jamming methods based on deep reinforcement learning are difficult to apply to space-ground TT&C and communication links. The deep neural networks used for decision-making are structurally complex, while satellites and other flight vehicles have limited resources, so a vehicle cannot independently complete timely training of such a network under a tight complexity budget, and the anti-jamming decision fails to converge. To address these problems, an intelligent anti-jamming decision algorithm based on proximal policy optimization was proposed. A decision-making neural network is deployed on the vehicle and a training neural network at the ground station. The ground station performs optimized offline training on the experience information fed back by the vehicle and assists the decision-making neural network in updating its parameters, so that effective anti-jamming strategies are selected while the vehicle's resource constraints are satisfied. Simulation results demonstrate that, compared with decision algorithms based on policy gradient and deep Q-learning, the proposed algorithm improves convergence speed by 37% and system capacity after convergence by 25%.
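The division of labor described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the network architecture, state and action spaces, reward, and hyperparameters are not given on this page, so everything below is an assumption chosen for illustration. The policy is a small table of logits rather than a deep network, the reward is a hypothetical "1 if the unjammed channel is selected, else 0", and the names `VehiclePolicy`, `GroundTrainer`, and `run_demo` are invented here. The only standard ingredient is the clipped surrogate objective of proximal policy optimization, min(r·A, clip(r, 1−ε, 1+ε)·A), where r is the ratio of new to old action probability and A is the advantage.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def clipped_surrogate(ratio, advantage, eps=0.2):
    """Standard PPO-clip objective for one sample: min(r*A, clip(r)*A)."""
    clipped_ratio = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)


class VehiclePolicy:
    """Onboard decision policy (hypothetical): inference only, no training."""

    def __init__(self, n_actions):
        self.logits = [0.0] * n_actions  # assumed tabular stand-in for a network

    def act(self, rng):
        probs = softmax(self.logits)
        action = rng.choices(range(len(probs)), weights=probs)[0]
        return action, probs[action]

    def load(self, logits):
        self.logits = list(logits)


class GroundTrainer:
    """Ground-station trainer (hypothetical): PPO-clip epochs per feedback batch."""

    def __init__(self, n_actions, lr=0.5, eps=0.2, epochs=4):
        self.logits = [0.0] * n_actions
        self.lr, self.eps, self.epochs = lr, eps, epochs

    def update(self, batch):
        # batch holds (action, old_prob, reward) tuples fed back by the vehicle.
        baseline = sum(r for _, _, r in batch) / len(batch)
        for _ in range(self.epochs):
            probs = softmax(self.logits)
            grad = [0.0] * len(self.logits)
            for action, old_prob, reward in batch:
                adv = reward - baseline
                ratio = probs[action] / old_prob
                if clipped_surrogate(ratio, adv, self.eps) < ratio * adv:
                    continue  # clipped branch is active, so the gradient is zero
                for k in range(len(grad)):
                    indicator = 1.0 if k == action else 0.0
                    # d(ratio * adv) / d(logit_k) for a softmax policy
                    grad[k] += adv * ratio * (indicator - probs[k])
            for k in range(len(grad)):
                self.logits[k] += self.lr * grad[k] / len(batch)
        return list(self.logits)


def run_demo(seed=0, n_actions=4, clear_channel=2, rounds=30, batch_size=64):
    """Toy loop: reward 1 if the assumed unjammed channel is chosen, else 0."""
    rng = random.Random(seed)
    vehicle, ground = VehiclePolicy(n_actions), GroundTrainer(n_actions)
    for _ in range(rounds):
        batch = []
        for _ in range(batch_size):
            action, prob = vehicle.act(rng)
            reward = 1.0 if action == clear_channel else 0.0
            batch.append((action, prob, reward))
        vehicle.load(ground.update(batch))  # ground uplinks refreshed parameters
    return softmax(vehicle.logits)


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the vehicle only samples actions and records the probability it assigned to each one, while all gradient computation happens in `GroundTrainer.update`. That mirrors the resource argument in the abstract: the onboard side needs a forward pass and a parameter load, nothing more. The clipping keeps each ground-side update close to the policy that actually collected the data, which matters when the feedback arrives in batches over a delayed link.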
