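Research on deep reinforcement learning in Internet of vehicles edge computing based on Quasi-Newton method (基于拟牛顿法的深度强化学习在车联网边缘计算中的研究)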

ZHANG Jianwu (章坚武); LU Zetao (芦泽韬); ZHANG Qianhua (章谦骅); ZHAN Ming (詹明)

doi:10.11959/j.issn.1000-436x.2024101

Academic paper | Updated: 2024-06-24

- Journal on Communications (通信学报), 2024, Vol. 45, No. 5, pp. 90-100
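- Author affiliations:
  
  1. School of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, China
  2. Space-Based Computing Research Center, Zhejiang Lab, Hangzhou 311121, Zhejiang, China
  3. College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027, Zhejiang, China
  4. School of Electronics and Information Engineering, Taizhou University, Taizhou 318000, Zhejiang, China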
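- About the authors:
  
  ZHANG Jianwu (章坚武, 1961- ), male, born in Hangzhou, Zhejiang, Ph.D., is a professor and doctoral supervisor at Hangzhou Dianzi University. His main research interests include mobile communications, multimedia signal processing and artificial intelligence, and communication networks and information security.
  LU Zetao (芦泽韬, 2000- ), male, born in Jiujiang, Jiangxi, is a master's student at Hangzhou Dianzi University. His main research interests include edge computing and reinforcement learning.
  ZHANG Qianhua (章谦骅, 1990- ), male, born in Hangzhou, Zhejiang, is an engineer at the Space-Based Computing Research Center, Zhejiang Lab, and a Ph.D. candidate at Zhejiang University. His main research interests include space-based computing, laser communication, and computation offloading. Email: zhangqh@zhejianglab.com
  ZHAN Ming (詹明, 1975- ), male, born in Xinxian, Henan, Ph.D., is a professor and doctoral supervisor at Taizhou University. His main research interests include channel coding theory and technology, industrial wireless sensor networks, high-reliability low-latency communication, and secure communication technology.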
- Funding:
  
  Key Program of the Zhejiang Provincial Natural Science Foundation (LZ23F010001)
- DOI: 10.11959/j.issn.1000-436x.2024101
  CLC number: TN92
- Received: 2024-02-04; revised: 2024-05-07; print publication: 2024-05-30
ZHANG Jianwu, LU Zetao, ZHANG Qianhua, et al. Research on deep reinforcement learning in Internet of vehicles edge computing based on Quasi-Newton method[J]. Journal on Communications, 2024, 45(05): 90-100. DOI: 10.11959/j.issn.1000-436x.2024101.

Abstract

To address suboptimal task offloading decisions caused by multitasking and resource constraints in the Internet of vehicles, the Quasi-Newton method deep reinforcement learning dual-phase online offloading (QNRLO) algorithm is proposed. The algorithm first introduces batch normalization to optimize the training of the deep neural network, then applies the Quasi-Newton method to approximate the optimal solution. This two-stage optimization significantly improves performance under multitasking and dynamic wireless channel conditions and raises computational efficiency. By introducing Lagrange multipliers and a reconstructed dual function, the non-convex optimization problem is transformed into a convex optimization problem over the dual function, which ensures the global optimality of the algorithm. The algorithm also accounts for system transmission time allocation in the Internet of vehicles model, which makes the model more practical. Compared with existing algorithms, the proposed algorithm significantly improves the convergence and stability of task offloading and handles task offloading in the Internet of vehicles effectively, offering high practicality and reliability.
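
The abstract states that a Quasi-Newton method is applied to a convex problem to approximate its optimum, but it does not say which variant is used or give the objective. As a rough illustration only, the sketch below runs BFGS (one common Quasi-Newton variant, chosen here as an assumption) with a backtracking line search on a small convex toy function. The names `bfgs_minimize`, `toy_objective`, and `toy_gradient` are hypothetical, and the toy objective stands in for, and is not, the paper's dual function.

```python
import math


def bfgs_minimize(f, grad, x0, tol=1e-8, max_iter=100):
    """Minimize f using BFGS inverse-Hessian updates and Armijo backtracking."""
    n = len(x0)
    x = list(x0)
    # H approximates the inverse Hessian; start from the identity matrix.
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    g = grad(x)
    for _ in range(max_iter):
        if math.sqrt(sum(gi * gi for gi in g)) < tol:
            break
        # Quasi-Newton search direction p = -H g.
        p = [-sum(H[i][j] * g[j] for j in range(n)) for i in range(n)]
        slope = sum(gi * pi for gi, pi in zip(g, p))
        # Backtracking line search: halve t until the Armijo condition holds.
        t, fx = 1.0, f(x)
        while t > 1e-12 and f([xi + t * pi for xi, pi in zip(x, p)]) > fx + 1e-4 * t * slope:
            t *= 0.5
        x_new = [xi + t * pi for xi, pi in zip(x, p)]
        g_new = grad(x_new)
        s = [a - b for a, b in zip(x_new, x)]  # step taken
        y = [a - b for a, b in zip(g_new, g)]  # change in gradient
        sy = sum(si * yi for si, yi in zip(s, y))
        if sy > 1e-12:  # curvature condition keeps H positive definite
            Hy = [sum(H[i][j] * y[j] for j in range(n)) for i in range(n)]
            yHy = sum(yi * hi for yi, hi in zip(y, Hy))
            for i in range(n):
                for j in range(n):
                    H[i][j] += ((sy + yHy) * s[i] * s[j] / sy ** 2
                                - (Hy[i] * s[j] + s[i] * Hy[j]) / sy)
        x, g = x_new, g_new
    return x


def toy_objective(x):
    # Hypothetical convex quadratic (Hessian [[2, 1], [1, 4]] is positive definite).
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2 + x[0] * x[1]


def toy_gradient(x):
    return [2.0 * (x[0] - 1.0) + x[1], 4.0 * (x[1] + 0.5) + x[0]]


if __name__ == "__main__":
    print(bfgs_minimize(toy_objective, toy_gradient, [0.0, 0.0]))
```

Setting the gradient of the toy objective to zero gives the minimizer (10/7, -6/7) in closed form, so the result can be checked by hand. The point of the sketch is that a Quasi-Newton method builds its curvature estimate `H` from successive gradient differences, so no explicit Hessian is needed.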


