1. 哈尔滨工程大学 计算机科学与技术学院,黑龙江 哈尔滨 150001
2. 中国新兴建设开发总公司,北京100143
[ "何忠政(1986-),男,山东潍坊人,哈尔滨工程大学博士生,主要研究方向为容错计算、移动计算、容错调度。" ]
[ "门朝光[通信作者](1963-),男,黑龙江哈尔滨人,哈尔滨工程大学教授,主要研究方向为容错计算、移动计算、容错调度。E-mail:menchaoguang@hrbeu.edu.cn。" ]
[ "陈拥军(1971-),男,北京人,中国新兴建设开发总公司教授级高工,主要研究方向为容错计算、移动计算、容错调度、土木工程。" ]
[ "李香(1975-),女,黑龙江哈尔滨人,哈尔滨工程大学副教授,主要研究方向为容错计算、移动计算、容错调度。" ]
网络首发:2015-07,
纸质出版:2015-07-25
移动端阅览
何忠政, 门朝光, 陈拥军, 等. 基于遗传算法的异构系统多副本容错调度算法[J]. 通信学报, 2015,36(7):153-165.
Zhong-zheng HE, Chao-guang MEN, Yong-jun CHEN, et al. Multi-duplication fault tolerant scheduling algorithm based on genetic algorithm in heterogeneous systems[J]. Journal on Communications, 2015, 36(7): 153-165.
何忠政, 门朝光, 陈拥军, 等. 基于遗传算法的异构系统多副本容错调度算法[J]. 通信学报, 2015,36(7):153-165. DOI: 10.11959/j.issn.1000-436x.2015203.
Zhong-zheng HE, Chao-guang MEN, Yong-jun CHEN, et al. Multi-duplication fault tolerant scheduling algorithm based on genetic algorithm in heterogeneous systems[J]. Journal on Communications, 2015, 36(7): 153-165. DOI: 10.11959/j.issn.1000-436x.2015203.
针对异构系统中基于多副本机制的容错调度方法忽略调度makespan、任务间依赖与系统链路失效及严格调度方式调度 makespan 较长问题,首先提出通用调度方式下同时考虑节点和链路失效的可靠性计算方法;然后给出该通用调度问题的 0-1 整数规划模型;接着提出可靠性意识多副本任务通用调度(RAMD_TGS
reliability-aware multi-duplication task general scheduling)算法,通过遗传算法种群进化来搜索副本映射节点和开始执行时间。实验表明该算法不仅满足可靠性要求,而且与严格调度方式相比能进一步减小调度makespan,该算法资源占用开销也是可接受的。
The fault-tolerant task scheduling mechanisms based on multi-duplication didn’t consider the scheduling makespan
the dependencies between tasks
the failures of the links and the longer scheduling makespan caused by the strict scheduling method in the heterogeneous distributed system.So the reliability calculation method that can involve the processor failures and the link failures was proposed firstly.Then the 0-1 integer linear program was proposed for the general scheduling problem.At last
the RAMD_TGS(reliability-aware multi-duplication task general scheduling) algorithm was proposed to solve the 0-1 integer linear program.The algorithm searched the mapped processor and the start execution time on the mapped processor for the task duplication by the evolution of the genetic algorithm.The experiments show that the RAMD_TGS algorithm can meet the reliability requirements and outperforms the existing scheduling algorithms based on the strict scheduling method in terms of scheduling makespan.The resource usages of the algorithm are also acceptable.
XIAO Y T , KENLI L , RENFA L , et al . Reliability-aware scheduling strategy for heterogeneous distributed computing systems [J ] . Journal of Parallel and Distributed Computing , 2010 , 70 ( 9 ): 941 - 952 .
ZHAO L , REN Y , XIANG Y , et al . Fault tolerant scheduling with dynamic number of replicas in heterogeneous system [A ] . IEEE International Conference on High Performance Computing and Communications [C ] . Melbourne , 2010 . 434 - 441 .
QIN X , JIANG H . A novel fault-tolerant scheduling algorithm for precedence constrained tasks in real-time heterogeneous systems [J ] . Parallel Computing , 2006 , 32 ( 5 ): 331 - 356 .
QIN Z , BHARADWAJ V . On the design of communication-aware fault-tolerant scheduling algorithms for precedence constrained tasks in grid computing systems with dedicated communication devices [J ] . Journal of Parallel and Distributed Computing , 2009 , 69 ( 3 ): 282 - 294 .
QIN Z , BHARADWAJ V , CHEN K . On the design of fault-tolerant scheduling strategies using primary-backup approach for computational grids with low replication costs [J ] . IEEE Transactions on Computers , 2009 , 58 ( 3 ): 380 - 393 .
王吉 , 包卫东 , 朱晓敏 . 虚拟化云平台中实时任务容错调度算法研究 [J ] . 通信学报 , 2014 , 35 ( 10 ): 171 - 180 .
WANG J , BAOW D , ZHU X M . Fault-tolerant scheduling algorithm forreal-time tasks in virtualized cloud [J ] . Journal on Communications , 2014 , 35 ( 10 ): 171 - 180 .
GIRAULT A , KALLA H . A novel bicriteria scheduling heuristics providing a guaranteed global system failure rate [J ] . IEEE Transactions on Dependable and Secure Computing , 2009 , 6 ( 4 ): 241 - 254 .
GIRAULT A , KALLA H , SIGHIREANU M , et al . An algorithm for automatically obtaining distributed and fault-tolerant static schedules [A ] . International Conference on Dependable Systems and Networks [C ] . 2003 . 159 - 168 .
ANNE B , MOURAD H , YVES R . Fault tolerant scheduling of precedence task graphs on heterogeneous platforms [A ] . Proceedings of the 22nd International Conference on Parallel and Distributed Processing [C ] . 2008 . 1 - 8 .
PAUL P , VIACHESLAV I , PETRU E , et al . Design optimization of time and cost-constrained fault-tolerant embedded systems with check pointing and replication [J ] . IEEE Transactions on Very Large Scale Integration Systems , 2009 , 17 ( 3 ): 389 - 402 .
ANNE B , MOURAD H , YVES R . Contention awareness and fault-tolerant scheduling for precedence constrained tasks in heterogeneous systems [J ] . Parallel Computing , 2009 , 35 ( 2 ): 83 - 108 .
ZHAO L , REN Y , SAKURAI K . Reliable workflow scheduling with less resource redundancy [J ] . Parallel Computing , 2013 , 39 ( 10 ): 567 - 585 .
ALAIN G,ÉRIK S , DENIS T . Reliability versus performance for critical applications [J ] . Journal of Parallel and Distributed Computing , 2009 , 69 ( 3 ): 326 - 336 .
ZHAO L , REN Y Z , SAKURAI K . A resource minimizing scheduling algorithm with ensuring the deadline and reliability in heterogeneous systems [A ] . International Conference on Advanced Information Networking and Applications [C ] . 2011 . 275 - 282 .
谢国琪 , 李仁发 , 刘琳 等 . 异构分布式系统 DAG 可靠性模型与容错算法 [J ] . 计算机学报 , 2013 , 36 ( 10 ): 2019 - 2032 .
XIEG Q , LIR F , LIUL , et al . DAG reliability model and fault-tolerant algorithm for heterogeneous distributed systems [J ] . ChineseJournal of Computers , 2013 , 36 ( 10 ): 2019 - 2032 .
CHIN S H , SUH T , YU H C . Genetic algorithm based scheduling method for efficiency and reliability in mobile grid [A ] . International Conference on Ubiquitous Information Technologies & Applications [C ] . 2009 . 1 - 6 .
ATAKAN D,FÜSUN Ö . Biobjective scheduling algorithms for execution time-reliability trade-off in heterogeneous computing systems [J ] . The Computer Journal , 2005 , 48 ( 3 ): 300 - 314 .
WANG X F , YEO C S , BUYYA R , et al . Optimizing the makespan and reliability for workflow applications with reputation and a look-ahead genetic algorithm [J ] . Future Generation Computer Systems , 2011 , 27 ( 8 ): 1124 - 1134 .
BENOIT A , DUFOSSÉ F , GIRAULT A , et al . Reliability and performance optimization of pipelined real-time systems [J ] . Journal of Parallel and Distributed Computing , 2013 , 73 ( 6 ): 851 - 865 .
ANNE BENOIT , CANON L C , JEANNOT E , et al . Reliability of task graph schedules with transient and fail-stop failures:complexity and algorithms [J ] . Journal of Scheduling , 2012 , 15 ( 5 ): 615 - 627 .
TOPCUOGLU H , HARIRI S , WU M . Performance-effective and lowcomplexity task scheduling for heterogeneous computing [J ] . IEEE Transactions on Parallel and Distributed Systems , 2002 , 13 ( 3 ): 260 - 274 .
0
浏览量
1608
下载量
0
CSCD
关联资源
相关文章
相关作者
相关机构
京公网安备11010802024621