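Key Laboratory of Mobile Communication Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China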
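Jiang ZHU (1977-), male, born in Jingzhou, Hubei, Ph.D., is a professor at Chongqing University of Posts and Telecommunications. His main research interests include cognitive radio, mobile communication, and network security situation awareness.
Tingting WANG (1993-), female, born in Anqing, Anhui, is a master's student at Chongqing University of Posts and Telecommunications. Her main research interest is network security situation awareness.
Yonghui SONG (1991-), male, born in Handan, Hebei, is a master's student at Chongqing University of Posts and Telecommunications. His main research interest is cognitive radio.
Yali LIU (1990-), male, born in Shangqiu, Henan, is a master's student at Chongqing University of Posts and Telecommunications. His main research interest is cognitive radio.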
Published online: 2018-04
Published in print: 2018-04-25
Jiang ZHU, Tingting WANG, Yonghui SONG, et al. Transmission scheduling scheme based on deep Q learning in wireless network[J]. Journal on Communications, 2018, 39(4): 35-44. DOI: 10.11959/j.issn.1000-436x.2018058.
To address the problem of data transmission in wireless networks, a transmission scheduling scheme based on deep Q learning (QL) was proposed. A Markov decision process (MDP) system model was formulated to describe the state transitions of the system. The Q learning algorithm was adopted to learn and explore the state transition information when the state transition probabilities were unknown, so as to obtain an approximately optimal policy for the scheduling node. In addition, when the system state space was large, deep learning (DL) was employed to establish the mapping between states and actions, avoiding the heavy computation and storage that solving for the policy would otherwise require. Simulation results show that the proposed scheme approaches the optimal policy obtained by policy iteration in terms of power consumption, throughput, and packet loss rate, while having lower complexity, which addresses the curse of dimensionality.
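The Q learning step summarized in the abstract can be illustrated with a minimal tabular sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy state (buffer occupancy plus a two-state channel), the action set, the cost weights, the arrival and channel probabilities, and the function names `step`, `q_learning`, and `greedy_policy`. The learner only observes sampled transitions and rewards, never the probabilities used inside the simulator, which mirrors the setting of unknown state transition probabilities.

```python
import random

# Hypothetical toy model; none of these values come from the paper.
Q_MAX = 4             # buffer capacity in packets
ACTIONS = (0, 1, 2)   # packets the scheduler tries to send per slot
POWER = (3.0, 1.0)    # per-packet power cost in a bad (0) / good (1) channel
P_ARRIVAL = 0.5       # probability that one packet arrives in a slot
P_STAY = 0.8          # probability that the channel keeps its state
HOLD_COST = 0.5       # cost per buffered packet per slot
DROP_COST = 10.0      # cost of dropping an arriving packet


def step(state, action, rng):
    """Simulate one slot; the learner never reads the probabilities above."""
    queue, channel = state
    sent = min(action, queue)                 # cannot send more than is queued
    power_cost = sent * POWER[channel]
    queue -= sent
    arrived = 1 if rng.random() < P_ARRIVAL else 0
    dropped = 1 if (arrived and queue == Q_MAX) else 0
    queue = min(queue + arrived, Q_MAX)
    if rng.random() >= P_STAY:                # two-state Markov channel
        channel = 1 - channel
    reward = -power_cost - HOLD_COST * queue - DROP_COST * dropped
    return (queue, channel), reward


def q_learning(steps=50000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q_table = {((q, c), a): 0.0
               for q in range(Q_MAX + 1) for c in (0, 1) for a in ACTIONS}
    state = (0, 1)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)      # explore
        else:                                 # exploit the current estimate
            action = max(ACTIONS, key=lambda a: q_table[(state, a)])
        next_state, reward = step(state, action, rng)
        best_next = max(q_table[(next_state, a)] for a in ACTIONS)
        td_target = reward + gamma * best_next
        q_table[(state, action)] += alpha * (td_target - q_table[(state, action)])
        state = next_state
    return q_table


def greedy_policy(q_table):
    """Read off the greedy action for every state from the learned table."""
    return {(q, c): max(ACTIONS, key=lambda a: q_table[((q, c), a)])
            for q in range(Q_MAX + 1) for c in (0, 1)}


if __name__ == "__main__":
    policy = greedy_policy(q_learning())
    for state in sorted(policy):
        print(state, "->", policy[state])
```

The table here has only 30 entries, so storing it is trivial. With a larger state space the table grows with the number of state-action pairs, which is the situation in which the abstract describes replacing it with a learned mapping from states to actions; that function-approximation step is not shown in this sketch.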