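1. College of Informatics, Huazhong Agricultural University, Wuhan 430070, Hubei, China
2. Hubei Key Laboratory for High-efficiency Utilization of Solar Energy and Operation Control of Energy Storage System, Hubei University of Technology, Wuhan 430068, Hubei, China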
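GUO Xi (1983- ), male, born in Ezhou, Hubei, Ph.D., associate professor at Huazhong Agricultural University. His main research interests include software analysis and testing, information security, and large model development.
WANG Pan (1987- ), female, born in Jiyuan, Henan, Ph.D., associate professor at Hubei University of Technology. Her main research interests include power converters, new energy power generation technology, power quality control, and new distribution network technology.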
Received: 2024-05-27
Revised: 2024-11-29
Published in print: 2024-12-25
GUO Xi, WANG Pan. Program semantic analysis model for code reuse detection[J]. Journal on Communications, 2024, 45(12): 179-196. DOI: 10.11959/j.issn.1000-436x.2024269.
Program similarity analysis is widely used in areas such as code defect detection and intellectual property protection, but existing approaches generally suffer from excessive computational overhead. To address this, a code similarity analysis method based on fuzzy matching and statistical inference is proposed. For a binary program, disassembly is performed first, followed by function boundary identification to extract the execution boundary information of each function. On this basis, dynamic programming is applied at basic block granularity to compute the similarity between basic blocks, and a neighborhood search over the control flow graph extends the similarity analysis from the basic block level to the function level. Finally, the semantic similarity of binary files is obtained through statistical analysis of the similarity function. During this process, the pre-trained model is optimized and its parameters are tuned so that similarity analysis can be performed on cross-platform code. Experimental results show that the proposed method improves analysis accuracy by 7.1% on average compared with current mainstream analysis tools.
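
The basic block step can be illustrated with a minimal sketch. The abstract does not state which scoring function the dynamic programming uses, so this example assumes a longest common subsequence (LCS) over normalized instruction mnemonics, scaled to the range 0 to 1. The names `lcs_length` and `block_similarity` and the sample blocks are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of two instruction lists."""
    # prev[j] is the LCS length of the already processed prefix of a and b[:j].
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Matching instructions extend the diagonal subproblem.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise skip one instruction from either block.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def block_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Assumed score: 2 * LCS / (len(a) + len(b)), so identical blocks give 1.0."""
    if not a and not b:
        return 1.0
    return 2.0 * lcs_length(a, b) / (len(a) + len(b))


# Hypothetical basic blocks, reduced to mnemonics only (operands dropped).
block_a = ["push", "mov", "sub", "mov", "call", "add", "ret"]
block_b = ["push", "mov", "mov", "call", "pop", "ret"]
print(block_similarity(block_a, block_b))
```

Matching on mnemonics alone is a simplification chosen to keep the sketch short. A function-level score would still need the neighborhood search over the control flow graph described in the abstract, which this sketch does not attempt.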