
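1. School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi 710126
2. Shaanxi Key Laboratory of Network and System Security, Xidian University, Xi'an, Shaanxi 710071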
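CHENG Ke (1993- ), male, born in Qianshan, Anhui, Ph.D., associate professor at Xidian University. His main research interests include privacy protection, machine learning, and applied cryptography.
XIA Yuheng (2003- ), male, born in Zhenjiang, Jiangsu, master's student at Xidian University. His main research interests include privacy protection and large models.
DAI Chuanyun (2003- ), male, born in Dazhou, Sichuan, master's student at Xidian University. His main research interests include privacy protection and machine learning.
FU Jiaxuan (1997- ), male, born in Xi'an, Shaanxi, Ph.D. candidate at Xidian University. His main research interests include Internet of Things security, privacy protection, and machine learning.
ZHU Xinghui (1990- ), male, born in Xingtai, Hebei, Ph.D., lecturer at Xidian University. His main research interests include cloud computing and data security, and Internet of Things security.
SHEN Yulong (1978- ), male, born in Sihong, Jiangsu, Ph.D., professor at Xidian University. His main research interests include cloud computing and data security, and wireless network security.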
Received: 2025-02-21
Revised: 2025-06-13
Print publication date: 2025-06-25
CHENG Ke, XIA Yuheng, DAI Chuanyun, et al. Cryptographic inference for large language model via secret sharing[J]. Journal on Communications, 2025, 46(06): 168-184. DOI: 10.11959/j.issn.1000-436x.2025115.
Inference services built on large language models may leak user prompts to the server or proprietary model weights to the user. Cryptographic techniques such as secure multi-party computation and homomorphic encryption offer feasible solutions to these problems, but their computational and communication overhead is too high for practical use in large language model inference. To address this, a cryptographic inference scheme for large language models based on lightweight secret sharing is proposed, which performs inference efficiently and accurately without revealing user inputs or model weights. Experimental results show that, compared with existing state-of-the-art works, the proposed scheme improves cryptographic inference efficiency by 1.2 to 10 times and reduces communication overhead by 20% to 90%.
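The abstract does not describe the construction itself, so the following is only a minimal sketch of generic two-party additive secret sharing, the basic primitive that secret-sharing-based inference builds on. It is not the authors' protocol. The ring size (2^64), the 16 fractional bits of the fixed-point encoding, and all function names are assumptions chosen for illustration.

```python
import secrets
from typing import Tuple

MOD = 2 ** 64      # assumed ring Z_{2^64}; not taken from the paper
FRAC_BITS = 16     # assumed fixed-point precision; not taken from the paper


def encode(x: float) -> int:
    """Map a real number to a fixed-point ring element."""
    return round(x * (1 << FRAC_BITS)) % MOD


def decode(v: int) -> float:
    """Map a ring element back to a real number (upper half is negative)."""
    if v >= MOD // 2:
        v -= MOD
    return v / (1 << FRAC_BITS)


def share(v: int) -> Tuple[int, int]:
    """Split v into two additive shares; each share alone is uniformly random."""
    s0 = secrets.randbelow(MOD)
    s1 = (v - s0) % MOD
    return s0, s1


def reconstruct(s0: int, s1: int) -> int:
    """Recombine both shares to recover the ring element."""
    return (s0 + s1) % MOD


def add_shares(a: int, b: int) -> int:
    """Each party adds its own shares locally; no communication is needed."""
    return (a + b) % MOD


def scale_share(a: int, k: int) -> int:
    """Each party multiplies its share by a public integer k locally."""
    return (a * k) % MOD
```

For example, a user could share `encode(1.5)` and a server `encode(-2.25)`; each party calls `add_shares` on the shares it holds, and only the reconstructed sum decodes to -0.75. Multiplying two secret values and evaluating nonlinear functions require interactive protocols that this sketch does not cover.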