Threat analysis and defense methods of deep-learning-based data theft in data sandbox mode

Hezhong PAN; Peiyi HAN; Xiayu XIANG; Shaoming DUAN; Rongfei ZHUANG; Chuanyi LIU

doi:10.11959/j.issn.1000-436x.2021215

Research article | Updated: 2024-06-05

- Threat analysis and defense methods of deep-learning-based data theft in data sandbox mode
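- Journal on Communications, 2021, Vol. 42, No. 11, pp. 133-144
- Author affiliations:
  
  1. School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing 100876
  2. School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong 518055
  3. Cyberspace Security Research Center, Peng Cheng Laboratory, Shenzhen, Guangdong 518066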
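- Author biographies:
  
  - Hezhong PAN (潘鹤中, born 1991), male, from Benxi, Liaoning; Ph.D. candidate at Beijing University of Posts and Telecommunications. Research interests: cloud security, data security, and cryptography.
  - Peiyi HAN (韩培义, born 1992), male, from Lüliang, Shanxi; Ph.D., assistant researcher at Harbin Institute of Technology (Shenzhen). Research interests: data security and privacy protection.
  - Xiayu XIANG (向夏雨, born 1991), male, from Huayuan, Hunan; Ph.D. candidate at Beijing University of Posts and Telecommunications. Research interests: privacy protection and medical big data analysis.
  - Shaoming DUAN (段少明, born 1994), male, from Shaoyang, Hunan; Ph.D. candidate at Harbin Institute of Technology (Shenzhen). Research interests: data security and machine learning.
  - Rongfei ZHUANG (庄荣飞, born 1992), male, from Quanzhou, Fujian; Ph.D. candidate at Harbin Institute of Technology (Shenzhen). Research interests: data security, machine learning security, and privacy protection.
  - Chuanyi LIU (刘川意, born 1982), male, from Leshan, Sichuan; Ph.D., professor at Harbin Institute of Technology (Shenzhen). Research interests: cloud computing and cloud security, large-scale storage systems, and data protection and data security.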
- Funding:
  
  National Natural Science Foundation of China (61872110)
- DOI: 10.11959/j.issn.1000-436x.2021215
  CLC number: TP309.2
- Published online: 2021-11
  
  Published in print: 2021-11-25
Hezhong PAN, Peiyi HAN, Xiayu XIANG, et al. Threat analysis and defense methods of deep-learning-based data theft in data sandbox mode[J]. Journal on Communications, 2021, 42(11): 133-144. DOI: 10.11959/j.issn.1000-436x.2021215.

Abstract

The threat model of deep-learning-based data theft in data sandbox mode is analyzed in detail, and the severity and distinguishing characteristics of attacks in the data processing stage and in the model training stage are quantitatively evaluated. For attacks in the data processing stage, a data leakage defense method based on model pruning is proposed, which reduces the amount of leaked data while preserving the usability of the original model. For attacks in the model training stage, an attack detection method based on model parameter analysis is proposed, which intercepts malicious models to prevent data leakage. Neither method requires modifying or encrypting the data or manually analyzing the deep learning training code, so both are well suited to defending against data theft in data sandbox mode. Experimental evaluation shows that the pruning-based defense reduces data leakage by up to 73%, and that the parameter-analysis-based detection identifies more than 95% of attacks.
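
To make the pruning idea in the abstract concrete, here is a minimal sketch, assuming global magnitude-based pruning over a flat list of weights. The abstract does not state which pruning criterion the authors use, so the criterion, the function name `magnitude_prune`, and the `sparsity` parameter are illustrative assumptions and not the paper's method.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude fraction set to 0.0.

    Assumption: magnitude is used as the pruning criterion purely for
    illustration; the paper's actual criterion is not given in the abstract.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first (ties keep input order).
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    to_zero = set(order[:n_prune])
    # Zeroed parameters can no longer carry any information, benign or hidden.
    return [0.0 if i in to_zero else w for i, w in enumerate(weights)]


# Hypothetical toy "layer" of six parameters.
layer = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
pruned = magnitude_prune(layer, sparsity=0.5)
print(pruned)
```

With `sparsity=0.5`, the three smallest-magnitude parameters are zeroed and the three largest are left unchanged. In practice the sparsity level would presumably be tuned so that the model's accuracy on its legitimate task stays acceptable, which is the trade-off the abstract describes.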

Keywords
