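ZHANG Jian-xun (1978-), male, born in Baoding, Hebei, is a Ph.D. candidate at Beijing Institute of Technology. His main research interests are high-performance computing and multi-core cache optimization.
GU Zhi-min (1964-), male, born in Yuncheng, Shanxi, Ph.D., is a professor and doctoral supervisor at Beijing Institute of Technology. His main research interests include parallel and distributed computing and multi-core cache optimization.
HU Xiao-han (1988-), female, born in Lingchuan, Shanxi, is a master's student at Beijing Institute of Technology. Her main research interests include multi-core computing and cache optimization.
CAI Min (1982-), male, born in Changsha, Hunan, is a Ph.D. candidate at Beijing Institute of Technology. His main research interests are multi-core architecture, memory subsystem optimization, and architecture simulation.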
Published online: 2014-08
Published in print: 2014-08-25
Jian-xun ZHANG, Zhi-min GU, Xiao-han HU, et al. Multi-core helper thread prefetching for irregular data intensive applications[J]. Journal on Communications, 2014, 35(8): 137-146. DOI: 10.3969/j.issn.1000-436x.2014.08.017.
Big data analysis applications often rely on traversal algorithms over large sparse graphs, whose defining characteristic is irregular, data-intensive memory access. To improve memory access performance in sparse graph traversal, helper thread prefetching can exploit the shared last-level cache of chip multiprocessor (CMP) platforms to convert discontinuous locality into continuous, instantaneous spatial-temporal locality. Using the frequently used betweenness centrality algorithm as a case study, a helper-thread-based multi-parameter prefetch control model and a parameter optimization method are presented and evaluated on the commercial multi-core platforms Q6600 and I7. Across different input scales, the average speedup of the betweenness centrality algorithm is 1.20 and 1.11, respectively. The experimental results show that helper thread prefetching can effectively improve the performance of this class of irregular applications.
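To make the idea of helper thread prefetching concrete, the following is a minimal sketch in C, not the authors' implementation. The abstract does not name the parameters of the control model, so `PrefetchCtl`, `lookahead`, and the neighbor-sum workload (a stand-in for a betweenness centrality traversal) are all hypothetical. The sketch assumes a CSR graph layout, POSIX threads, and the GCC/Clang `__builtin_prefetch` builtin; on a real CMP the two threads would also be pinned to cores that share a last-level cache.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* CSR sparse graph: neighbors of v are adj[row[v]] .. adj[row[v + 1] - 1]. */
typedef struct {
    size_t n;
    const size_t *row;
    const int *adj;
    const double *val;      /* per-vertex data, reached irregularly via adj */
} Graph;

/* Hypothetical control block shared by the main and helper threads. */
typedef struct {
    const Graph *g;
    const int *order;       /* vertex visit order, e.g. a traversal frontier */
    size_t count;
    size_t lookahead;       /* assumed parameter: how far the helper runs ahead */
    atomic_size_t progress; /* index the main thread is currently processing */
    atomic_int done;
} PrefetchCtl;

/* Helper thread: stays at most `lookahead` vertices ahead of the main thread
 * and touches the data the main thread will need, so that it is already in
 * the shared cache. It spins while waiting, which is acceptable for a sketch. */
static void *helper_thread(void *arg) {
    PrefetchCtl *c = arg;
    size_t next = 0;
    while (!atomic_load(&c->done) && next < c->count) {
        size_t main_pos = atomic_load(&c->progress);
        if (next < main_pos) next = main_pos;   /* fell behind: skip ahead */
        size_t limit = main_pos + c->lookahead;
        if (limit > c->count) limit = c->count;
        for (; next < limit; next++) {
            int v = c->order[next];
            for (size_t e = c->g->row[v]; e < c->g->row[v + 1]; e++)
                __builtin_prefetch(&c->g->val[c->g->adj[e]], 0, 1);
        }
    }
    return NULL;
}

/* Main computation: sums val[] over the neighbors of each visited vertex
 * while the helper prefetches ahead. The result does not depend on the
 * helper; only the timing does. */
double sum_neighbor_values(const Graph *g, const int *order,
                           size_t count, size_t lookahead) {
    PrefetchCtl c = { .g = g, .order = order,
                      .count = count, .lookahead = lookahead };
    atomic_init(&c.progress, 0);
    atomic_init(&c.done, 0);

    pthread_t t;
    int have_helper = pthread_create(&t, NULL, helper_thread, &c) == 0;

    double sum = 0.0;
    for (size_t i = 0; i < count; i++) {
        atomic_store(&c.progress, i);           /* publish current position */
        int v = order[i];
        for (size_t e = g->row[v]; e < g->row[v + 1]; e++)
            sum += g->val[g->adj[e]];
    }

    atomic_store(&c.done, 1);
    if (have_helper) pthread_join(t, NULL);
    return sum;
}
```

In this sketch a single `lookahead` value stands in for the tunable parameters: too small and the data arrives late, too large and prefetched lines may be evicted before the main thread uses them. A multi-parameter model such as the one the abstract describes would presumably tune several such knobs per platform and input size.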