Mergeable adaptive tile coding method

Meng-yu SHI; Quan LIU; Qi-ming FU

doi:10.11959/j.issn.1000-436x.2015047

Research article | Updated: 2024-06-05

- Mergeable adaptive tile coding method
- Journal on Communications, 2015, Vol. 36, No. 2, pp. 186-192
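- Author affiliations:
  
  1. School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006
  2. Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012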
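- About the authors:
  
  Meng-yu SHI (born 1989), male, from Huai'an, Jiangsu; master's student at Soochow University. His main research interest is reinforcement learning.
  Quan LIU (born 1969), male, from Yakeshi, Inner Mongolia; professor and doctoral supervisor at Soochow University. His main research interests are reinforcement learning, intelligent information processing, and automated reasoning.
  Qi-ming FU (born 1985), male, from Huai'an, Jiangsu; doctoral student at Soochow University. His main research interests are reinforcement learning, Bayesian inference, and genetic algorithms.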
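- Funding:
  
  National Natural Science Foundation of China (61272005); National Natural Science Foundation of China (61472262); Natural Science Foundation of Jiangsu Province (BK2012616)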
- DOI: 10.11959/j.issn.1000-436x.2015047
  CLC number: TP181
- Published online: 2015-02
  
  Published in print: 2015-02-25

Meng-yu SHI, Quan LIU, Qi-ming FU. Mergeable adaptive tile coding method[J]. Journal on Communications, 2015, 36(2): 186-192. DOI: 10.11959/j.issn.1000-436x.2015047.

Abstract

Adaptive tile coding algorithms tend to produce redundant partitions. To address this problem, a mergeable adaptive tile coding algorithm, MATC, is proposed. The algorithm eliminates the redundant partitions produced by conventional adaptive tile coding and thereby further addresses the discretization of continuous state spaces. MATC is applied to the Mountain Car problem, which has discrete actions and a continuous state space. Experimental results show that, during learning, the algorithm eliminates the adverse effects of the erroneous partitions made by conventional tile coding, adjusts the partition resolution automatically and more accurately, and converges to the optimal policy faster.
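The abstract does not state how MATC decides when to split or merge tiles, so the following is only a minimal sketch of the general idea of a partition that can both refine and coarsen. The class name `AdaptiveTiling1D`, the method names, and in particular the merge criterion (fuse neighbouring tiles whose learned values differ by less than a tolerance `tol`) are assumptions made for illustration and are not taken from the paper. The interval [-1.2, 0.6] is the position range commonly used for Mountain Car.

```python
from bisect import bisect_right


class AdaptiveTiling1D:
    """Hypothetical 1-D tiling whose tiles can be split and merged."""

    def __init__(self, low, high):
        # Sorted tile edges; tile i spans bounds[i] .. bounds[i + 1].
        self.bounds = [low, high]
        # One learned value per tile.
        self.values = [0.0]

    def tile_index(self, x):
        # Locate the tile containing x; clamp so out-of-range inputs
        # (and x == high) map to the first or last tile.
        i = bisect_right(self.bounds, x) - 1
        return min(max(i, 0), len(self.values) - 1)

    def split(self, i):
        # Halve tile i; both halves inherit the parent's value.
        mid = (self.bounds[i] + self.bounds[i + 1]) / 2.0
        self.bounds.insert(i + 1, mid)
        self.values.insert(i + 1, self.values[i])

    def merge_redundant(self, tol):
        # Assumed criterion (not from the paper): neighbouring tiles whose
        # values differ by less than tol are treated as a redundant split.
        i = 0
        while i < len(self.values) - 1:
            if abs(self.values[i] - self.values[i + 1]) < tol:
                # Averaging the two values is an illustrative choice.
                self.values[i] = (self.values[i] + self.values[i + 1]) / 2.0
                del self.values[i + 1]
                del self.bounds[i + 1]
            else:
                i += 1


# Usage: split the position range twice, then undo the split that
# turned out not to separate different values.
tiling = AdaptiveTiling1D(-1.2, 0.6)
tiling.split(0)
tiling.split(0)
tiling.values = [1.0, 1.0, 5.0]
tiling.merge_redundant(tol=0.1)
```

After the merge step, the two left-hand tiles that carried the same value are fused into one, while the boundary separating them from the differently valued right-hand tile is kept. A full method would also need a rule for when to split and would typically operate on multi-dimensional states.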
