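1. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080
2. Graduate University of Chinese Academy of Sciences, Beijing 100039
3. Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093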
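QIAO Yan-chen (1988-), male, born in Liaocheng, Shandong, is a Ph.D. candidate at the Chinese Academy of Sciences. His main research interests include network information security and malicious code.
YUN Xiao-chun (1971-), male, born in Harbin, Heilongjiang, Ph.D., is a research fellow and doctoral supervisor at the Chinese Academy of Sciences. His main research interests include information security and computer networks.
TUO Yu-peng (1984-), male, born in Langfang, Hebei, is an assistant research fellow at the Institute of Information Engineering, Chinese Academy of Sciences. His main research interests are network anomaly detection and big data mining for the mobile Internet.
ZHANG Yong-zheng (1978-), male, born in Harbin, Heilongjiang, Ph.D., is a research fellow and doctoral supervisor at the Chinese Academy of Sciences. His main research interest is network security situational awareness.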
Online publication date: 2016-11
Print publication date: 2016-11-25
Yan-chen QIAO, Xiao-chun YUN, Yu-peng TUO, et al. Fast reused code tracing method based on simhash and inverted index[J]. Journal on Communications, 2016, 37(11): 104-113. DOI: 10.11959/j.issn.1000-436x.2016225.
A novel method for fast and accurate tracing of reused code is proposed. Working at function granularity and based on simhash and inverted indexing, the method can quickly trace similar functions in massive code collections. First, a code database with a three-level inverted index structure is built from a large number of samples using simhash. For a function to be traced, similar code blocks are found quickly from the simhash values of the code blocks in the function. Potentially similar functions are then retrieved through the inverted index, and truly similar functions are identified by comparing the jump relationships of the similar code blocks. Finally, the samples containing those functions are traced. Experimental results show that, while maintaining high accuracy and recall, the method can quickly identify compiler-inserted functions and reused functions in samples based on the code database.
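The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the three index levels, so the layout below (simhash band to code block, code block to function, function to sample) is an assumption, as are the names `CodeIndex`, `add_function`, `similar_blocks`, and `trace`, the token-based simhash, and the thresholds `max_dist` and `min_ratio`. The final verification step based on jump relationships between code blocks is omitted.

```python
import hashlib
from collections import defaultdict


def simhash(tokens, bits=64):
    """Charikar-style simhash over a token list (e.g. instruction mnemonics)."""
    weights = [0] * bits
    for tok in tokens:
        # Use the first 8 bytes of MD5 as a stable 64-bit token hash.
        h = int.from_bytes(hashlib.md5(tok.encode()).digest()[:8], "big")
        for i in range(bits):
            weights[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if weights[i] > 0)


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


class CodeIndex:
    """Hypothetical three-level inverted index (layout is an assumption)."""

    def __init__(self, bands=4, bits=64):
        self.bands = bands
        self.width = bits // bands
        self.band_to_blocks = defaultdict(set)   # level 1: band key -> block hashes
        self.block_to_funcs = defaultdict(set)   # level 2: block hash -> function ids
        self.func_to_samples = defaultdict(set)  # level 3: function id -> sample ids

    def _band_keys(self, h):
        mask = (1 << self.width) - 1
        return [(i, (h >> (i * self.width)) & mask) for i in range(self.bands)]

    def add_function(self, sample_id, func_id, blocks):
        """Index one function, given as a list of token lists (one per block)."""
        self.func_to_samples[func_id].add(sample_id)
        for tokens in blocks:
            h = simhash(tokens)
            self.block_to_funcs[h].add(func_id)
            for key in self._band_keys(h):
                self.band_to_blocks[key].add(h)

    def similar_blocks(self, h, max_dist=3):
        """Indexed block hashes within max_dist bits of h.

        With 4 bands, two hashes differing in at most 3 bits must agree on
        at least one band, so the band lookup misses no such candidate.
        """
        candidates = set()
        for key in self._band_keys(h):
            candidates |= self.band_to_blocks.get(key, set())
        return {c for c in candidates if hamming(c, h) <= max_dist}

    def trace(self, blocks, min_ratio=0.8):
        """Map candidate function ids to the samples that contain them."""
        hits = defaultdict(int)
        for tokens in blocks:
            funcs = set()
            for b in self.similar_blocks(simhash(tokens)):
                funcs |= self.block_to_funcs[b]
            for f in funcs:
                hits[f] += 1
        return {
            f: sorted(self.func_to_samples[f])
            for f, n in hits.items()
            if n / len(blocks) >= min_ratio
        }
```

In this sketch, a query function is split into blocks, each block is fingerprinted, and candidates are kept only when a sufficient share of the query blocks has a near-duplicate in the indexed function. A fuller version would then compare the jump relationships between the matched blocks before reporting a function as similar.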