浏览全部资源
扫码关注微信
1. 中国科学院信息工程研究所 北京100093
2. 国家计算机网络应急技术处理协调中心 北京100029
[ "王勇(1985-),男,黑龙江双鸭山人,中国科学院信息工程研究所博士生,主要研究方向为海量数据存储、网络安全。" ]
[ "云晓春(1971-),男,黑龙江哈尔滨人,博士,中国科学院信息工程研究所研究员、博士生导师,主要研究方向为网络信息安全。" ]
[ "王树鹏(1980-),男,山东济南人,博士,中国科学院信息工程研究所副研究员,主要研究方向为海量数据存储、网络安全。" ]
[ "王曦(1985-),女,黑龙江哈尔滨人,中国科学院信息工程研究所助理研究员,主要研究方向为网络安全。" ]
网络出版日期:2016-03,
纸质出版日期:2016-03-25
移动端阅览
王勇, 云晓春, 王树鹏, 等. CBFM:支持属性删减的布鲁姆过滤器矩阵多维元素查询算法[J]. 通信学报, 2016,37(3):139-147.
Yong WANG, Xiao-chun YUN, ANGShu-peng WANG, et al. CBFM:cutted Bloom filter matrix for multi-dimensional membership query[J]. Journal on communications, 2016, 37(3): 139-147.
王勇, 云晓春, 王树鹏, 等. CBFM:支持属性删减的布鲁姆过滤器矩阵多维元素查询算法[J]. 通信学报, 2016,37(3):139-147. DOI: 10.11959/j.issn.1000-436x.2016061.
Yong WANG, Xiao-chun YUN, ANGShu-peng WANG, et al. CBFM:cutted Bloom filter matrix for multi-dimensional membership query[J]. Journal on communications, 2016, 37(3): 139-147. DOI: 10.11959/j.issn.1000-436x.2016061.
为了提升多维元素成员查询的灵活性和准确率,提出了一种新型索引结构CBFM(cutted Bloom filter matrix)。该索引方法通过独立属性布鲁姆过滤器笛卡尔乘积构建位矩阵,支持任意属性组合的多维元素成员查询,同时支持属性组合按需删减和属性加权,极大地提升内存空间利用率,降低查询误判率。理论分析证明相比于BFM(Bloom filter matrix)索引方法,CBFM具有更高的内存利用率。仿真实验表明,在分配内存相同的情况下,CBFM方法相比于其他方法,具有最低的查询误判率,特别在内存受限场景下,CBFM相比于BFM方法,查询误判率最大降低3个数量级,极大地提升了多维元素成员查询的准确率。
In order to improve the flexibility and accuracy of mu i-dimensional membership query
a new indexing structure called CBFM(cutted Bloom filter matrix)was proposed.CBFM built the bit matrix by the Cartesian product of different bloom filters
each representing one attribute of primary data.In this way
the proposed matrix supported by-attribute membership query.Besides
the attribute combinations in the bit matrix could be reduced and weighted on demand to further enhance memory utilization rate.Theoretical analysis proves that CBFM utilizes memory more effi-ciently than BFM
the current state of art.Experiments also show that
on the scenario of memory size fixed
the false positive rate of CBFM is lower than that of all other indexin ethods.Especially on the scenario of memory constrained
the false positive rate of CBFM can be 3 orders of magnitude lower than that of BFM(Bloom filter matrix) indexing me-thod.CBFM is an accurate data structure for multi-dimensional membership query.
LYNCH C . Big data:how do your data grow [J ] . Nature , 2008 , 455 ( 7209 ): 28 - 29 .
CHANG F , DEAN J , GHEMAWAT S , et al . Bigtable:a distributed storage system for structured data [J ] . ACM Transactions on Computer Systems (TOCS), 2008 , 26 ( 2 ): 4 .
VORA M N . Hadoop-HBase for large-scale data [J ] . IEEE Computer Science and Network Technology , 2011 ,( 1 ): 601 - 605 .
DEAN J , GHEMAWAT S . MapReduce:simplified data processing on large clusters [J ] . Communications of the ACM , 2008 , 51 ( 1 ): 107 - 113 .
THUSOO A , SARMA J S , JAIN N , et al . Hive:a warehousin solu-tion over a map-reduce framework [J ] . Proceedings of the VLDB Endowment , 2009 , 2 ( 2 ): 1626 - 1629 .
KORNACKER M , BEHM A , BITTORF V , et al . Impala:a modern,open-source SQL engine for Hadoop [C ] . Conference on Innovative Data Systems Research (CIDR'15). c 2015 .
TARKOMA S , ROTHENBERG C E , LAGERSPETZ E . Theory and practice of bloom filters for distributed systems [J ] . Communications Surveys & Tutorials,IEEE , 2012 , 14 ( 1 ): 131 - 144 .
RODEH O , TEPERMAN A , zFS-a scalable distributed file system using object disks [C ] // Mass Storage Systems and Technologies (MSST), c 2003 : 207 - 218 .
DEBNATH B , SENGUPTA S , LI J . FlashStore:high throughput per-sistent key-value store [J ] . Proceedings of the VLDB Endowment , 2010 , 3 ( 1 - 2 ): 1414 - 1425 .
ZHOU X , ZHANG X , WANG Y , et al . Efficient distributed mul-ti-dimensional index for big data management [M ] . Springer Berlin Heidelberg , 2013 . 130 - 141 .
NISHIMURA S , DAS S , AGRAWAL D , et al . MD-HBase:a scalable multi-dimensional data infrastructure for location aware services [J ] . IEEE Mobile Data Management (MDM), 2011 , 1 : 7 - 16 .
BLOOM B H . Space/time trade-offs in hash coding with allowable errors [J ] . Communications of the ACM , 1970 , 13 ( 7 ): 422 - 426 .
GUO D , WU J , CHEN H , et al . Theory and network applications of dynamic bloom filters [C ] // INFOCOM . c 2006 : 1 - 12 .
谢鲲 , 秦拯 , 文吉刚 , 等 . 联合多维布鲁姆过滤器查询算法 [J ] . 通信学报 , 2008 , 29 ( 1 ): 56 - 64 .
XIE K , QIN Z , WEN J G , et al . Combine multi-dimension Bloom filter for membership queries [J ] . Journal on Communications , 2008 , 29 ( 1 ): 56 - 64 .
WANG Z , LUO T , XU G , et al . A new indexing technique for support-ing by-attribute membership query of multidimensional data [M ] . Springer Berlin Heidelberg . 2013 : 266 - 277 .
WANG Z , LUO T , XU G , et al . The application of cartesian-join of bloom filters to supporting membership query of multid mensional data [C ] // IEEE Big Data . c 2014 : 288 - 295 .
BRODER A , MITZENMACHER M . Network applications of bloom filters:a survey [J ] . Internet Mathematics (MDM), 2004 , 1 ( 4 ): 485 - 509 .
CHENG X , LI H , WANG Y , et al . BF-matrix:a secondary index for the cloud storage [M ] . Springer International Publishing . 2014 : 384 - 396 .
CRAINICEANU A , LEMIRE D . Bloofi:multidimensional Bloom filters [J ] . Information Systems , 2015 , 54 : 311 - 324 .
0
浏览量
505
下载量
0
CSCD
关联资源
相关文章
相关作者
相关机构