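1. College of Computer, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210023
2. Jiangsu High Technology Research Key Laboratory for Wireless Sensor Networks, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210003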
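WANG Hai-yan (1974-), female, born in Dongtai, Jiangsu, is a professor at Nanjing University of Posts and Telecommunications. Her research interests include service computing, trusted computing, big data applications and cloud computing technology, and privacy protection technology.
FU Cai-hang (1990-), male, born in Lianyungang, Jiangsu, is a master's student at Nanjing University of Posts and Telecommunications. His research interests include big data applications and cloud computing technology.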
Published online: 2016-04
Published in print: 2016-04-25
WANG Hai-yan, FU Cai-hang. Compression strategies selection method based on classification of HBase data[J]. Journal on Communications, 2016, 37(4): 12-22. DOI: 10.11959/j.issn.1000-436x.2016068.
Most existing methods for selecting a compression strategy for HBase data do not consider whether the data is hot or cold, and the selection process itself tends to be one-sided and unreliable. To address these problems, a compression strategy selection method based on the classification of HBase data is proposed. HBase data is classified into hot and cold data according to the access frequency of each data file, and each file is assigned a specific access level. On this basis, an evaluation layer is added, and a selection method based on data access level is proposed that integrates the neighbor-sector-based and statistic-column-based selection methods. Simulation experiments show that the proposed method not only saves storage space but also greatly improves the query performance of HBase data.
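The classification step described in the abstract can be illustrated with a minimal sketch. It assumes two access-count thresholds and a fixed mapping from access level to codec. Both are hypothetical choices made for illustration, not values taken from the paper, and the evaluation layer that combines the neighbor-sector-based and statistic-column-based methods is not modeled here. The codec names are compression algorithms that HBase can be configured with; the function and variable names are invented for this example.

```python
from bisect import bisect_right

# Hypothetical thresholds (accesses per observation window); not from the paper.
THRESHOLDS = (10, 100)

# Hypothetical level-to-codec mapping: colder data gets a higher-ratio,
# slower codec; hotter data gets a faster codec.
CODEC_BY_LEVEL = {0: "GZ", 1: "LZ4", 2: "SNAPPY"}


def access_level(access_count, thresholds=THRESHOLDS):
    """Return the access level: the number of thresholds the count has reached.

    Level 0 is the coldest; higher levels are hotter.
    """
    return bisect_right(thresholds, access_count)


def plan_compression(access_counts):
    """Map each data file to an (access level, codec) pair."""
    plan = {}
    for name, count in access_counts.items():
        level = access_level(count)
        plan[name] = (level, CODEC_BY_LEVEL[level])
    return plan


if __name__ == "__main__":
    # Illustrative access counts for three made-up data files.
    counts = {"file_a": 3, "file_b": 42, "file_c": 500}
    for name, (level, codec) in plan_compression(counts).items():
        print(name, level, codec)
```

In this sketch, rarely accessed files trade decompression speed for space, while frequently accessed files favor query latency, which is the intuition behind separating hot and cold data before choosing a compression strategy.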