Generalized Grad-CAM attacking method based on adversarial patch

Nianwen SI; Wenlin ZHANG; Dan QU; Heyu CHANG; Shengxiang LI; Tong NIU

doi:10.11959/j.issn.1000-436x.2021025

Research article | Updated: 2024-06-05

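- Generalized Grad-CAM attacking method based on adversarial patch
- Journal on Communications, 2021, 42(3): 23-35
- Author affiliations:
  1. School of Information Systems Engineering, Information Engineering University, Zhengzhou 450001, Henan, China
  2. School of Cryptography Engineering, Information Engineering University, Zhengzhou 450001, Henan, China
- About the authors:
  - SI Nianwen (1992- ), male, born in Xiangyang, Hubei. Ph.D. candidate at Information Engineering University. His research interests include the security and interpretability of deep learning.
  - ZHANG Wenlin (1982- ), male, born in Huanggang, Hubei. Ph.D., associate professor and master's supervisor at Information Engineering University. His research interests include speech signal processing, speech recognition, and machine learning.
  - QU Dan (1974- ), female, born in Jiutai, Jilin. Ph.D., professor and doctoral supervisor at Information Engineering University. Her research interests include speech recognition, intelligent information processing, and machine learning.
  - CHANG Heyu (1993- ), female, born in Zhengzhou, Henan. Ph.D. candidate at Information Engineering University. Her research interests include deep learning, pedestrian detection, and person re-identification.
  - LI Shengxiang (1991- ), male, born in Shaoyang, Hunan. Ph.D. candidate at Information Engineering University. His research interests include multi-agent reinforcement learning.
  - NIU Tong (1984- ), male, born in Zhengzhou, Henan. Associate professor at Information Engineering University. His research interests include speech enhancement, speech recognition, and deep learning.
- Funding:
  National Natural Science Foundation of China (61673395)
- DOI: 10.11959/j.issn.1000-436x.2021025
  CLC number: TP391
- Published online: 2021-03
  Published in print: 2021-03-25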
Nianwen SI, Wenlin ZHANG, Dan QU, et al. Generalized Grad-CAM attacking method based on adversarial patch[J]. Journal on Communications, 2021, 42(3): 23-35. DOI: 10.11959/j.issn.1000-436x.2021025.

Abstract

To verify the fragility of the Grad-CAM interpretation method, a Grad-CAM attack method based on an adversarial patch was proposed. By adding a constraint term on the Grad-CAM class activation map to the CNN classification loss function, an adversarial patch could be optimized in a targeted way and an adversarial image synthesized from it. The adversarial image steered the Grad-CAM interpretation result toward the patch region while the classification result remained unchanged, thereby attacking the interpretation. Meanwhile, batch training on the dataset and an added perturbation norm constraint improved the generalization and the multi-scene usability of the adversarial patch. Experimental results on the ILSVRC2012 dataset show that, compared with existing methods, the proposed method attacks the Grad-CAM interpretation results more simply and effectively while maintaining the classification accuracy of the model.
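
The objective described in the abstract can be sketched in a few lines. The following is a minimal pure-Python illustration and not the authors' implementation: the function names, the toy classifier head (global average pooling followed by a linear layer), the outside-patch ratio used as the Grad-CAM constraint, and the weight `lam` are all assumptions, since the abstract does not give the exact form of the constraint term.

```python
import math


def apply_patch(image, patch, top, left):
    """Return a copy of `image` (2D list) with `patch` pasted at (top, left)."""
    out = [row[:] for row in image]
    for i, patch_row in enumerate(patch):
        for j, value in enumerate(patch_row):
            out[top + i][left + j] = value
    return out


def grad_cam(features, class_weights):
    """Grad-CAM map for an assumed toy head: logit_c = sum_k w_k * mean(A_k).

    For this head the gradient of logit_c w.r.t. A_k[i][j] is w_k / (H * W),
    so the Grad-CAM channel weight alpha_k (spatial mean of that gradient)
    is also w_k / (H * W). The map is ReLU(sum_k alpha_k * A_k).
    """
    h, w = len(features[0]), len(features[0][0])
    alphas = [wk / (h * w) for wk in class_weights]
    return [
        [max(0.0, sum(a * f[i][j] for a, f in zip(alphas, features)))
         for j in range(w)]
        for i in range(h)
    ]


def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for one sample."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def cam_outside_ratio(cam, mask):
    """Fraction of Grad-CAM mass outside the patch (mask is 1 on the patch)."""
    total = sum(sum(row) for row in cam)
    if total == 0:
        return 0.0
    outside = sum(c * (1 - m)
                  for cam_row, mask_row in zip(cam, mask)
                  for c, m in zip(cam_row, mask_row))
    return outside / total


def attack_loss(logits, label, cam, mask, lam=1.0):
    """Classification loss plus a Grad-CAM constraint (assumed form).

    Minimizing the first term keeps the prediction on the true label;
    minimizing the second pulls the Grad-CAM map toward the patch region.
    """
    return cross_entropy(logits, label) + lam * cam_outside_ratio(cam, mask)
```

In practice the patch pixels would be updated by gradient descent on this loss over batches of images with an automatic-differentiation framework; the perturbation norm constraint mentioned in the abstract is omitted here for brevity.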


