AI Security Adversarial Competition: A Second-Place Solution

2020/04/08

This article presents the second-place solution of the AI Security Adversarial Competition; it can be reproduced in full. Team name: 我不和你们玩了 ("I'm not playing with you anymore"), a one-person team: Zhang Xin (张鑫), a second-year master's student at Xidian University. Preliminary round: 6th place with 58 submissions; final round: 2nd place with 84 submissions.

Download and installation commands

## CPU install command
pip install -f https://paddlepaddle.org.cn/pip/oschina/cpu paddlepaddle

## GPU install command
pip install -f https://paddlepaddle.org.cn/pip/oschina/gpu paddlepaddle-gpu

Competition Background

AI and machine learning techniques are now widely deployed in human-computer interaction, recommender systems, security protection, and other fields, so how easily they can be attacked, and how strongly they resist attacks, has drawn broad industry attention. Concrete scenarios include speech and image recognition, credit scoring, fraud prevention, spam filtering, and defense against malicious code and network attacks. Attackers, in turn, try to bypass AI models or attack them directly to achieve their adversarial goals. In human-computer interaction, with the spread of mobile devices, speech and images have become popular input channels thanks to their convenience and practicality, so the accuracy of image recognition is critical to the AI industry. This link is also the easiest for attackers to exploit: by subtly modifying the input data, imperceptibly to the user, they can make the machine take wrong actions. Such attacks can lead to compromised AI systems and the execution of wrong commands, whose downstream chain reactions can have serious consequences.

The problems and data for this competition were provided by Baidu Security and the Baidu Big Data Research Institute, and the competition platform AI Studio by Baidu's AI Technology Ecosystem department. We hope participants take this opportunity to learn the theory of adversarial examples and improve their deep learning engineering skills. Developers worldwide are welcome to participate, and university teachers are encouraged to provide guidance.

Task Description

  • Preliminary round: contestants add perturbations to the specified images so that the target models misclassify them. There are three target models: a ResNeXt50 model with public structure and weights (white box), a MobileNetV2 model with public structure and weights (white box), and one model whose structure and weights are not disclosed (black box). For an image of class A, the attack counts as successful as long as the target model predicts any class other than A for the perturbed sample; smaller perturbations score better.
  • Final round: the goal is the same as in the preliminary round: turn the 120 specified images into attack samples. The organizers evaluate the submitted samples in the back end against five target models: the same ResNeXt50 white-box model as in the preliminary round, a manually hardened model (gray box), and three black-box models, one of which was trained with AutoDL. An attack counts as successful whenever a target model's prediction disagrees with the label; more successful samples and smaller perturbations yield a higher score.
 

1 Getting Familiar with the Baseline

My solution adds and modifies functions on top of the baseline, so let's first walk through the baseline.

 

1.1 The baseline directory structure

In[ ]
# Unzip the code archive
import zipfile
tar = zipfile.ZipFile('/home/aistudio/data/data19725/attack_by_xin.zip','r')
tar.extractall()
In[ ]
cd baidu_attack_by_xin/
/home/aistudio/baidu_attack_by_xin
 

The baseline contains the following directories and files:

  • attack — the core attack algorithms
  • models — model structure definitions
  • models_parameters — model weights
  • input_image — the 120 input images plus the label file
  • output_image — the output images
  • utils.py — common utilities such as image loading/preprocessing and argument printing
  • attack_FGSM.py — the main script, containing the model definition and the attack invocation; it is shown directly in the notebook below

note: the bolded entries are the parts my solution modifies

 

1.2 A tour of the baseline

The code breaks down roughly as follows:

  1. Define the model and load its weights.

    The model is ResNeXt50_32x4d with a cross-entropy loss, exactly as in ordinary classifier training. The difference lies in how the "parameters" are updated, which the FGSM section below explains.

  2. Read the images one by one, call the FGSM attack, and generate the adversarial examples.

In[ ]
#coding=utf-8

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse
import functools
import numpy as np
import paddle.fluid as fluid

# Load project-local modules
import models
##################################################
##################################################
#Import the FGSM and PGD attack algorithms here
#My own attack functions are also defined in attack/attack_pp.py
from attack.attack_pp import FGSM, PGD
##################################################
##################################################
from utils import init_prog, save_adv_image, process_img, tensor2img, calc_mse, add_arguments, print_arguments

path = "/home/aistudio/baidu_attack_by_xin/"
######Init args
image_shape = [3,224,224]
class_dim=121
input_dir = path + "input_image/"
output_dir = path +  "output_image/"
model_name="ResNeXt50_32x4d"
pretrained_model= path + "models_parameters/86.45+88.81ResNeXt50_32x4d"

val_list = 'val_list.txt'
use_gpu=True

######Attack graph
adv_program=fluid.Program()
#Initialization
with fluid.program_guard(adv_program):
    input_layer = fluid.layers.data(name='image', shape=image_shape, dtype='float32')
    #Allow gradients to be computed w.r.t. this input
    input_layer.stop_gradient=False

    # model definition
    model = models.__dict__[model_name]()
    out_logits = model.net(input=input_layer, class_dim=class_dim)
    out = fluid.layers.softmax(out_logits)

    place = fluid.CUDAPlace(0) if use_gpu else fluid.CPUPlace()
    exe = fluid.Executor(place)
    exe.run(fluid.default_startup_program())

    #Load the model weights
    fluid.io.load_persistables(exe, pretrained_model)

#Set the state of adv_program's BN layers
init_prog(adv_program)

#Clone an evaluation program for inference
eval_program = adv_program.clone(for_test=True)

#Define the gradient of the loss w.r.t. the input
with fluid.program_guard(adv_program):
    label = fluid.layers.data(name="label", shape=[1] ,dtype='int64')
    loss = fluid.layers.cross_entropy(input=out, label=label)
    gradients = fluid.backward.gradients(targets=loss, inputs=[input_layer])[0]

######Inference
def inference(img):
    fetch_list = [out.name]

    result = exe.run(eval_program,
                     fetch_list=fetch_list,
                     feed={ 'image':img })
    result = result[0][0]
    pred_label = np.argmax(result)
    pred_score = result[pred_label].copy()
    return pred_label, pred_score

######FGSM attack
#untargeted attack
def attack_nontarget_by_FGSM(img, src_label):
    pred_label = src_label

    step = 8.0/256.0
    eps = 32.0/256.0
    while pred_label == src_label:
        #Generate the adversarial example
        adv=FGSM(adv_program=adv_program,eval_program=eval_program,gradients=gradients,o=img,
                 input_layer=input_layer,output_layer=out,step_size=step,epsilon=eps,
                 isTarget=False,target_label=0,use_gpu=use_gpu)

        pred_label, pred_score = inference(adv)
        step *= 2
        if step > eps:
            break

    print("Test-score: {0}, class {1}".format(pred_score, pred_label))

    adv_img=tensor2img(adv)
    return adv_img

####### Main #######
def get_original_file(filepath):
    with open(filepath, 'r') as cfile:
        full_lines = [line.strip() for line in cfile]
    cfile.close()
    original_files = []
    for line in full_lines:
        label, file_name = line.split()
        original_files.append([file_name, int(label)])
    return original_files

def gen_adv():
    ######## If you're lost, start reading from this part ########
    mse = 0
    original_files = get_original_file(input_dir + val_list)

    for filename, label in original_files:
        img_path = input_dir + filename
        print("Image: {0} ".format(img_path))
        ## Read the image, transpose its dimensions, normalize ##
        img=process_img(img_path)
        #### Feed the image to attack_nontarget_by_FGSM and get the attacked image ####
        adv_img = attack_nontarget_by_FGSM(img, label)
        image_name, image_ext = filename.split('.')
        ## Save the adversarial image (saved as .jpg here)
        save_adv_image(adv_img, output_dir+image_name+'.jpg')

        org_img = tensor2img(img)
        ## Compare the attacked image with the original and compute the MSE
        score = calc_mse(org_img, adv_img)
        mse += score
    print("ADV {} files, AVG MSE: {} ".format(len(original_files), mse/len(original_files)))


gen_adv()
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02085620_10074.jpg 
Non-Targeted attack target_label=o_label=1
Non-Targeted attack target_label=o_label=1
Non-Targeted attack target_label=o_label=1
Test-score: 0.1829851120710373, class 1
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02085782_1039.jpg 
Non-Targeted attack target_label=o_label=2
Non-Targeted attack target_label=o_label=2
Non-Targeted attack target_label=o_label=2
Test-score: 0.7980572581291199, class 2
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02085936_10130.jpg 
Non-Targeted attack target_label=o_label=3
Test-score: 0.558245837688446, class 54
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02086079_10600.jpg 
Non-Targeted attack target_label=o_label=4
Test-score: 0.4213048815727234, class 5
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02086240_1059.jpg 
Non-Targeted attack target_label=o_label=5
Non-Targeted attack target_label=o_label=5
Non-Targeted attack target_label=o_label=5
Test-score: 0.6555399894714355, class 5
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02086646_1002.jpg 
Non-Targeted attack target_label=o_label=6
Non-Targeted attack target_label=o_label=6
Test-score: 0.13172577321529388, class 68
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02086910_1048.jpg 
Non-Targeted attack target_label=o_label=7
Test-score: 0.24971207976341248, class 1
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02087046_1206.jpg 
Non-Targeted attack target_label=o_label=8
Test-score: 0.7929812669754028, class 108
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02087394_11337.jpg 
Non-Targeted attack target_label=o_label=9
Test-score: 0.33581840991973877, class 93
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02088094_1003.jpg 
Non-Targeted attack target_label=o_label=10
Non-Targeted attack target_label=o_label=10
Non-Targeted attack target_label=o_label=10
Test-score: 0.24336178600788116, class 10
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02088238_10013.jpg 
Non-Targeted attack target_label=o_label=11
Non-Targeted attack target_label=o_label=11
Test-score: 0.18260332942008972, class 1
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02088364_10108.jpg 
Non-Targeted attack target_label=o_label=12
Test-score: 0.4022131860256195, class 6
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02088466_10083.jpg 
Non-Targeted attack target_label=o_label=13
Non-Targeted attack target_label=o_label=13
Test-score: 0.17899833619594574, class 96
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02088632_101.jpg 
Non-Targeted attack target_label=o_label=14
Non-Targeted attack target_label=o_label=14
Non-Targeted attack target_label=o_label=14
Test-score: 0.748593807220459, class 14
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02089078_1064.jpg 
Non-Targeted attack target_label=o_label=15
Test-score: 0.8527135848999023, class 13
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02089867_1029.jpg 
Non-Targeted attack target_label=o_label=16
Test-score: 0.9962345957756042, class 17
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02089973_1066.jpg 
Non-Targeted attack target_label=o_label=17
Test-score: 0.9775213003158569, class 16
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02090379_1272.jpg 
Non-Targeted attack target_label=o_label=18
Test-score: 0.9822365045547485, class 13
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02090622_10343.jpg 
Non-Targeted attack target_label=o_label=19
Test-score: 0.8759220242500305, class 22
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02090721_1292.jpg 
Non-Targeted attack target_label=o_label=20
Test-score: 0.9765266180038452, class 27
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02091032_10079.jpg 
Non-Targeted attack target_label=o_label=21
Test-score: 0.8322737812995911, class 22
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02091134_10107.jpg 
Non-Targeted attack target_label=o_label=22
Test-score: 0.0860852524638176, class 102
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02091244_1000.jpg 
Non-Targeted attack target_label=o_label=23
Non-Targeted attack target_label=o_label=23
Non-Targeted attack target_label=o_label=23
Test-score: 0.8737558126449585, class 23
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02091467_1110.jpg 
Non-Targeted attack target_label=o_label=24
Non-Targeted attack target_label=o_label=24
Test-score: 0.12693266570568085, class 99
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02091635_1319.jpg 
Non-Targeted attack target_label=o_label=25
Test-score: 0.16966181993484497, class 32
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02091831_10576.jpg 
Non-Targeted attack target_label=o_label=26
Test-score: 0.26041099429130554, class 8
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02092002_10699.jpg 
Non-Targeted attack target_label=o_label=27
Test-score: 0.9970411658287048, class 20
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02092339_1100.jpg 
Non-Targeted attack target_label=o_label=28
Test-score: 0.22998812794685364, class 59
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02093256_11023.jpg 
Non-Targeted attack target_label=o_label=29
Test-score: 0.3322082757949829, class 9
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02093428_10947.jpg 
Non-Targeted attack target_label=o_label=30
Non-Targeted attack target_label=o_label=30
Test-score: 0.4645201563835144, class 117
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02093647_1037.jpg 
Non-Targeted attack target_label=o_label=31
Non-Targeted attack target_label=o_label=31
Test-score: 0.09869368374347687, class 33
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02093754_1062.jpg 
Non-Targeted attack target_label=o_label=32
Test-score: 0.35046255588531494, class 111
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02093859_1003.jpg 
Non-Targeted attack target_label=o_label=33
Non-Targeted attack target_label=o_label=33
Test-score: 0.23188208043575287, class 83
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02093991_1026.jpg 
Non-Targeted attack target_label=o_label=34
Test-score: 0.6639878749847412, class 35
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02094114_1173.jpg 
Non-Targeted attack target_label=o_label=35
Test-score: 0.9812427759170532, class 20
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02094258_1004.jpg 
Non-Targeted attack target_label=o_label=36
Test-score: 0.990171492099762, class 35
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02094433_10126.jpg 
Non-Targeted attack target_label=o_label=37
Test-score: 0.7961386442184448, class 43
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02095314_1033.jpg 
Non-Targeted attack target_label=o_label=38
Test-score: 0.22626255452632904, class 17
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02095570_1031.jpg 
Non-Targeted attack target_label=o_label=39
Test-score: 0.3102056682109833, class 46
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02095889_1003.jpg 
Non-Targeted attack target_label=o_label=40
Non-Targeted attack target_label=o_label=40
Non-Targeted attack target_label=o_label=40
Test-score: 0.2282048910856247, class 38
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02096051_1110.jpg 
Non-Targeted attack target_label=o_label=41
Test-score: 0.6431247591972351, class 39
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02096177_10031.jpg 
Non-Targeted attack target_label=o_label=42
Test-score: 0.7510316967964172, class 116
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02096294_1111.jpg 
Non-Targeted attack target_label=o_label=43
Non-Targeted attack target_label=o_label=43
Non-Targeted attack target_label=o_label=43
Test-score: 0.1718287616968155, class 43
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02096437_1055.jpg 
Non-Targeted attack target_label=o_label=44
Test-score: 0.1869039088487625, class 25
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02096585_10604.jpg 
Non-Targeted attack target_label=o_label=45
Test-score: 0.517105758190155, class 8
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02097047_1412.jpg 
Non-Targeted attack target_label=o_label=46
Non-Targeted attack target_label=o_label=46
Test-score: 0.6849876642227173, class 48
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02097130_1193.jpg 
Non-Targeted attack target_label=o_label=47
Test-score: 0.41658449172973633, class 33
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02097209_1038.jpg 
Non-Targeted attack target_label=o_label=48
Test-score: 0.16146445274353027, class 47
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02097298_10676.jpg 
Non-Targeted attack target_label=o_label=49
Test-score: 0.609655499458313, class 33
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02097474_1070.jpg 
Non-Targeted attack target_label=o_label=50
Test-score: 0.9852504134178162, class 54
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02097658_1018.jpg 
Non-Targeted attack target_label=o_label=51
Test-score: 0.997583270072937, class 43
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02098105_1078.jpg 
Non-Targeted attack target_label=o_label=52
Test-score: 0.2030351161956787, class 50
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02098286_1009.jpg 
Non-Targeted attack target_label=o_label=53
Test-score: 0.43629899621009827, class 31
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02098413_11385.jpg 
Non-Targeted attack target_label=o_label=54
Test-score: 0.2666652798652649, class 50
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02099267_1018.jpg 
Non-Targeted attack target_label=o_label=55
Test-score: 0.24120940268039703, class 64
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02099429_1039.jpg 
Non-Targeted attack target_label=o_label=56
Test-score: 0.5451071858406067, class 105
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02099601_100.jpg 
Non-Targeted attack target_label=o_label=57
Non-Targeted attack target_label=o_label=57
Test-score: 0.2970142066478729, class 63
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02099712_1150.jpg 
Non-Targeted attack target_label=o_label=58
Test-score: 0.8002893924713135, class 29
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02099849_1068.jpg 
Non-Targeted attack target_label=o_label=59
Test-score: 0.4128360450267792, class 70
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02100236_1244.jpg 
Non-Targeted attack target_label=o_label=60
Test-score: 0.5225626826286316, class 70
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02100583_10249.jpg 
Non-Targeted attack target_label=o_label=61
Test-score: 0.7810189127922058, class 59
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02100735_10064.jpg 
Non-Targeted attack target_label=o_label=62
Test-score: 0.20395173132419586, class 12
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02100877_1062.jpg 
Non-Targeted attack target_label=o_label=63
Test-score: 0.1305573433637619, class 62
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02101006_135.jpg 
Non-Targeted attack target_label=o_label=64
Test-score: 0.2646207809448242, class 96
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02101388_10017.jpg 
Non-Targeted attack target_label=o_label=65
Non-Targeted attack target_label=o_label=65
Non-Targeted attack target_label=o_label=65
Test-score: 0.8180195689201355, class 65
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02101556_1116.jpg 
Non-Targeted attack target_label=o_label=66
Test-score: 0.966513991355896, class 69
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02102040_1055.jpg 
Non-Targeted attack target_label=o_label=67
Test-score: 0.9941025376319885, class 62
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02102177_1160.jpg 
Non-Targeted attack target_label=o_label=68
Non-Targeted attack target_label=o_label=68
Non-Targeted attack target_label=o_label=68
Test-score: 0.24618981778621674, class 68
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02102318_10000.jpg 
Non-Targeted attack target_label=o_label=69
Test-score: 0.4054173231124878, class 68
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02102480_101.jpg 
Non-Targeted attack target_label=o_label=70
Test-score: 0.3680818974971771, class 69
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02102973_1037.jpg 
Non-Targeted attack target_label=o_label=71
Test-score: 0.7708333730697632, class 116
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02104029_1075.jpg 
Non-Targeted attack target_label=o_label=72
Test-score: 0.10377514362335205, class 58
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02104365_10071.jpg 
Non-Targeted attack target_label=o_label=73
Test-score: 0.2432803213596344, class 64
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02105056_1165.jpg 
Non-Targeted attack target_label=o_label=74
Test-score: 0.40509727597236633, class 55
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02105162_10076.jpg 
Non-Targeted attack target_label=o_label=75
Test-score: 0.15848565101623535, class 85
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02105251_1588.jpg 
Non-Targeted attack target_label=o_label=76
Test-score: 0.10340812057256699, class 52
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02105412_1159.jpg 
Non-Targeted attack target_label=o_label=77
Test-score: 0.34471365809440613, class 87
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02105505_1018.jpg 
Non-Targeted attack target_label=o_label=78
Test-score: 0.5453677177429199, class 106
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02105641_10051.jpg 
Non-Targeted attack target_label=o_label=79
Test-score: 0.29593008756637573, class 40
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02105855_10095.jpg 
Non-Targeted attack target_label=o_label=80
Test-score: 0.4963936507701874, class 81
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02106030_11148.jpg 
Non-Targeted attack target_label=o_label=81
Test-score: 0.9998124241828918, class 80
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02106166_1205.jpg 
Non-Targeted attack target_label=o_label=81
Non-Targeted attack target_label=o_label=81
Non-Targeted attack target_label=o_label=81
Test-score: 0.49886953830718994, class 82
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02106382_1005.jpg 
Non-Targeted attack target_label=o_label=83
Test-score: 0.9096139669418335, class 101
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02106550_10048.jpg 
Non-Targeted attack target_label=o_label=84
Test-score: 0.9285635948181152, class 64
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02106662_10122.jpg 
Non-Targeted attack target_label=o_label=85
Non-Targeted attack target_label=o_label=85
Non-Targeted attack target_label=o_label=85
Test-score: 0.26929906010627747, class 85
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02107142_10952.jpg 
Non-Targeted attack target_label=o_label=86
Test-score: 0.09236325323581696, class 87
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02107312_105.jpg 
Non-Targeted attack target_label=o_label=87
Test-score: 0.9715318083763123, class 8
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02107574_1026.jpg 
Non-Targeted attack target_label=o_label=88
Test-score: 0.8894961476325989, class 91
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02107683_1003.jpg 
Non-Targeted attack target_label=o_label=89
Test-score: 0.2595520317554474, class 97
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02107908_1030.jpg 
Non-Targeted attack target_label=o_label=90
Test-score: 0.6221421957015991, class 91
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02108000_1087.jpg 
Non-Targeted attack target_label=o_label=91
Test-score: 0.9019123911857605, class 90
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02108089_1104.jpg 
Non-Targeted attack target_label=o_label=92
Test-score: 0.36616942286491394, class 45
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02108422_1096.jpg 
Non-Targeted attack target_label=o_label=93
Non-Targeted attack target_label=o_label=93
Test-score: 0.36447763442993164, class 103
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02108551_1025.jpg 
Non-Targeted attack target_label=o_label=94
Test-score: 0.715351939201355, class 64
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02108915_10564.jpg 
Non-Targeted attack target_label=o_label=95
Test-score: 0.5345490574836731, class 8
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02109047_10160.jpg 
Non-Targeted attack target_label=o_label=96
Test-score: 0.5757254362106323, class 13
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02109525_10032.jpg 
Non-Targeted attack target_label=o_label=97
Non-Targeted attack target_label=o_label=97
Non-Targeted attack target_label=o_label=97
Test-score: 0.19706158339977264, class 97
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02109961_11224.jpg 
Non-Targeted attack target_label=o_label=98
Test-score: 0.6545760631561279, class 99
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02110063_11105.jpg 
Non-Targeted attack target_label=o_label=99
Test-score: 0.6142775416374207, class 112
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02110185_10116.jpg 
Non-Targeted attack target_label=o_label=100
Test-score: 0.8787350058555603, class 98
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02110627_10147.jpg 
Non-Targeted attack target_label=o_label=101
Test-score: 0.3279683589935303, class 48
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02110806_1214.jpg 
Non-Targeted attack target_label=o_label=102
Test-score: 0.3767395615577698, class 91
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02110958_10378.jpg 
Non-Targeted attack target_label=o_label=103
Non-Targeted attack target_label=o_label=103
Non-Targeted attack target_label=o_label=103
Test-score: 0.5809877514839172, class 103
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02111129_1111.jpg 
Non-Targeted attack target_label=o_label=104
Test-score: 0.5515199899673462, class 94
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02111277_10237.jpg 
Non-Targeted attack target_label=o_label=105
Test-score: 0.5786005258560181, class 71
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02111500_1048.jpg 
Non-Targeted attack target_label=o_label=106
Non-Targeted attack target_label=o_label=106
Test-score: 0.07512383162975311, class 72
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02111889_10059.jpg 
Non-Targeted attack target_label=o_label=107
Test-score: 0.3027220666408539, class 72
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02112018_10158.jpg 
Non-Targeted attack target_label=o_label=108
Non-Targeted attack target_label=o_label=108
Non-Targeted attack target_label=o_label=108
Test-score: 0.8465248346328735, class 108
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02112137_1005.jpg 
Non-Targeted attack target_label=o_label=109
Non-Targeted attack target_label=o_label=109
Non-Targeted attack target_label=o_label=109
Test-score: 0.4224722981452942, class 80
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02112350_10079.jpg 
Non-Targeted attack target_label=o_label=110
Non-Targeted attack target_label=o_label=110
Non-Targeted attack target_label=o_label=110
Test-score: 0.5601344108581543, class 110
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02112706_105.jpg 
Non-Targeted attack target_label=o_label=111
Test-score: 0.3139760494232178, class 119
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02113023_1136.jpg 
Non-Targeted attack target_label=o_label=112
Test-score: 0.9765301942825317, class 113
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02113186_1030.jpg 
Non-Targeted attack target_label=o_label=113
Test-score: 0.9946261048316956, class 112
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02113624_1461.jpg 
Non-Targeted attack target_label=o_label=114
Test-score: 0.8437007069587708, class 115
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02113712_10525.jpg 
Non-Targeted attack target_label=o_label=115
Test-score: 0.9968068599700928, class 3
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02113799_1155.jpg 
Non-Targeted attack target_label=o_label=116
Test-score: 0.7794128656387329, class 71
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02113978_1034.jpg 
Non-Targeted attack target_label=o_label=117
Test-score: 0.05417395010590553, class 120
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02115641_10261.jpg 
Non-Targeted attack target_label=o_label=118
Test-score: 0.21042458713054657, class 117
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02115913_1010.jpg 
Non-Targeted attack target_label=o_label=119
Test-score: 0.5643325448036194, class 104
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02116738_10024.jpg 
Non-Targeted attack target_label=o_label=120
Non-Targeted attack target_label=o_label=120
Non-Targeted attack target_label=o_label=120
Test-score: 0.19919021427631378, class 120
ADV 120 files, AVG MSE: 4.72762380070619
 

Before introducing FGSM, let's recall the gradient descent update rule:

$\theta := \theta - \alpha \cdot \nabla J(\theta)$

where $\theta$ denotes the model parameters, $\alpha$ the step size, and $J(\theta)$ the objective function.

Iterating this update drives $J(\theta)$ down.

In the code cell above, the objective is the cross-entropy between the model's output probabilities and the label, so iterating the rule above would make the model's predictions more accurate.

Our goal, however, is to confuse the model so that it no longer predicts the correct label. We therefore just flip the sign (with the input image now playing the role of $\theta$, since that is what the attack updates):

$\theta := \theta + \nabla J(\theta)$

That is the gist of FGSM. In addition:

  • FGSM applies a sign function to the gradient, which means that if the gradient along some dimension is -0.000000000001, it becomes -1 after the sign function.
In[ ]
"""
Explaining and Harnessing Adversarial Examples, I. Goodfellow et al., ICLR 2015
实现了FGSM 支持定向和非定向攻击的单步FGSM


input_layer:输入层
output_layer:输出层
step_size:攻击步长
adv_program:生成对抗样本的prog 
eval_program:预测用的prog
isTarget:是否定向攻击
target_label:定向攻击标签
epsilon:约束linf大小
o:原始数据
use_gpu:是否使用GPU

返回:
生成的对抗样本
"""
def FGSM(adv_program,eval_program,gradients,o,input_layer,output_layer,step_size=16.0/256,epsilon=16.0/256,isTarget=False,target_label=0,use_gpu=False):
    
    place = fluid.CUDAPlace(0) if use_gpu else fluid.CPUPlace()
    exe = fluid.Executor(place)
   
    result = exe.run(eval_program,
                     fetch_list=[output_layer],
                     feed={ input_layer.name:o })
    result = result[0][0]
   
    o_label = np.argsort(result)[::-1][:1][0]
    
    if not isTarget:
        #Untargeted attack: target_label is automatically set to the original label
        print("Non-Targeted attack target_label=o_label={}".format(o_label))
        target_label=o_label
    else:
        print("Targeted attack target_label={} o_label={}".format(target_label,o_label))
        
        
    target_label=np.array([target_label]).astype('int64')
    target_label=np.expand_dims(target_label, axis=0)
    
    #Compute the gradient
    g = exe.run(adv_program,
                     fetch_list=[gradients],
                     feed={ input_layer.name:o,'label': target_label  }
               )
    g = g[0][0]
    
    
    if isTarget:
        adv=o-np.sign(g)*step_size
    else:
        #################################
        #Note the sign here
        adv=o+np.sign(g)*step_size
    
    #Enforce the L-infinity constraint
    adv=linf_img_tenosr(o,adv,epsilon)
    
    return adv
 

1.3 How do the generated adversarial examples differ from the originals?

In[ ]
#Define a helper to visualize the difference between two images
def show_images_diff(original_img,adversarial_img):
    #original_img = np.array(Image.open(original_img))
    #adversarial_img = np.array(Image.open(adversarial_img))
    original_img=cv2.resize(original_img.copy(),(224,224))
    adversarial_img=cv2.resize(adversarial_img.copy(),(224,224))

    plt.figure(figsize=(10,10))

    #original_img=original_img/255.0
    #adversarial_img=adversarial_img/255.0

    plt.subplot(1, 3, 1)
    plt.title('Original Image')
    plt.imshow(original_img)
    plt.axis('off')

    plt.subplot(1, 3, 2)
    plt.title('Adversarial Image')
    plt.imshow(adversarial_img)
    plt.axis('off')

    plt.subplot(1, 3, 3)
    plt.title('Difference')
    difference = 0.0+adversarial_img - original_img
        
    l0 = np.where(difference != 0)[0].shape[0]*100/(224*224*3)
    l2 = np.linalg.norm(difference)/(256*3)
    linf=np.linalg.norm(difference.copy().ravel(),ord=np.inf)
    # print(difference)
    print("l0={}% l2={} linf={}".format(l0, l2,linf))
    
    #(-1,1)  -> (0,1)
    #Gray background makes the differences easier to see
    difference=difference/255.0
        
    difference=difference/2.0+0.5
   
    plt.imshow(difference)
    plt.axis('off')

    plt.show()
    

    #plt.savefig('fig_cat.png')
In[ ]
from PIL import Image, ImageOps
import cv2
import matplotlib.pyplot as plt
original_img=np.array(Image.open("/home/aistudio/baidu_attack_by_xin/input_image/n02085782_1039.jpg"))
adversarial_img=np.array(Image.open("/home/aistudio/baidu_attack_by_xin/output_image/n02085782_1039.jpg"))
show_images_diff(original_img,adversarial_img)
l0=92.0014880952381% l2=3.203206511496744 linf=31.0
 

To the naked eye the adversarial example looks unchanged, yet the model can no longer recognize this dog.

 

2 Model Improvements

The baseline uses FGSM and attacks only a single model, so there are two avenues of improvement:

  • First, train more models and attack them jointly, to gain transferability.
  • Second, improve the algorithm: try more sophisticated, better-performing methods.
 

2.1 More models

The ensemble was chosen for diversity: only with as many different models as possible can we hope to approximate the black-box models behind the competition.

  • The models in the orange and blue boxes (of the original figure, not reproduced here) were fine-tuned with PyTorch; the training and test sets follow the original Stanford Dogs split, each trained for 25 epochs at learning rate 0.001. Each PyTorch model was then exported to ONNX and converted to a Paddle model with x2paddle. I will cover the conversion details in a follow-up series of articles.

  • The models in the red box were trained directly with PaddlePaddle, for 20 epochs, with the other settings as above.

  • A manually hardened ResNeXt50_32x4d model

The gray-box model in the final is a manually hardened ResNeXt50. To attack the black-box models, I trained a hardened model locally as an approximation of the gray box. This involves choosing a training set and a training method. The training set has two parts. For the first part, I attacked the preliminary round's white-box model (which has the same network structure as the gray-box model) with several different methods and pooled the resulting n sample sets into the training set, as sketched in Figure 3.2 (not reproduced here). The underlying assumption is that adversarial examples produced by different attack methods behave differently on the real gray-box model: some of them are still classified correctly. Pooling the n sample sets therefore yields an image set that defeats the gray box completely.

The second part consists of 8000 images randomly drawn from the Stanford Dogs dataset plus the original 120 images, included to preserve the model's generalization ability.

note: this trick is very effective. From a competition standpoint, if the model you train has the same structure as the hidden black box, its transferability is much better than that of models with other structures.
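A hypothetical sketch of assembling the hardened model's training list as described above (adv_sample_dirs, list_images_with_labels, clean_dog_images, and original_120_images are all assumed names, not from the original code):

import random

train_list = []
# Part 1: adversarial images produced by several different attacks against
# the public white-box model (assumption: one directory per attack method)
for attack_dir in adv_sample_dirs:
    train_list += list_images_with_labels(attack_dir)
# Part 2: clean images, to preserve generalization
train_list += random.sample(clean_dog_images, 8000)
train_list += original_120_images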

 

Ensemble attack in code

Fortunately, the different models use different parameter-naming schemes, so loading them all in one go causes no conflicts. The code follows.

In[ ]
#coding=utf-8
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import sys
import os
import numpy as np
import paddle.fluid as fluid
import pandas as pd
import models
from attack.attack_pp import FGSM, PGD,linf_img_tenosr,ensem_mom_attack_threshold_9model,\
ensem_mom_attack_threshold_9model2,ensem_mom_attack_threshold_9model_tarversion
from utils import init_prog, save_adv_image, process_img, tensor2img, calc_mse, add_arguments, print_arguments

image_shape = [3, 224, 224]
class_dim=121
input_dir = "./input_image/"
output_dir = "./output_image_attack/"
os.makedirs("./output_image_attack") 
#######################################################################
#These are all the models used
model_name1="ResNeXt50_32x4d"
pretrained_model1="./models_parameters/86.45+88.81ResNeXt50_32x4d"

model_name2="MobileNetV2"
pretrained_model2="./models_parameters/MobileNetV2"

model_name4="VGG16"
pretrained_model4="./models_parameters/VGG16"

model_name3="Densenet121"
pretrained_model3="./models_parameters/Densenet121"

model_name5="mnasnet1_0"
pretrained_model5="./models_parameters/mnasnet1_0"

model_name6="wide_resnet"
pretrained_model6="./models_parameters/wide_resnet"

model_name7="googlenet"
pretrained_model7="./models_parameters/googlenet"

model_name8="nas_mobile_net"
pretrained_model8="./models_parameters/nas_mobile_net"

model_name9="alexnet"
pretrained_model9="./models_parameters/alexnet"
########################################################################
val_list = 'val_list.txt'
use_gpu=True

mydict = {0: 1, 1: 10, 2: 100, 3: 101, 4: 102, 5: 103, 6: 104, 7: 105, 8: 106, 9: 107, 10: 108, 11: 109, 12: 11, 13: 110, 14: 111, 15: 112, 16: 113, 17: 114, 18: 115, 19: 116, 20: 117, 21: 118, 22: 119, 23: 12, 24: 120, 25: 13, 26: 14, 27: 15, 28: 16, 29: 17, 30: 18, 31: 19, 32: 2, 33: 20, 34: 21, 35: 22, 36: 23, 37: 24, 38: 25, 39: 26, 40: 27, 41: 28, 42: 29, 43: 3, 44: 30, 45: 31, 46: 32, 47: 33, 48: 34, 49: 35, 50: 36, 51: 37, 52: 38, 53: 39, 54: 4, 55: 40, 56: 41, 57: 42, 58: 43, 59: 44, 60: 45, 61: 46, 62: 47, 63: 48, 64: 49, 65: 5, 66: 50, 67: 51, 68: 52, 69: 53, 70: 54, 71: 55, 72: 56, 73: 57, 74: 58, 75: 59, 76: 6, 77: 60, 78: 61, 79: 62, 80: 63, 81: 64, 82: 65, 83: 66, 84: 67, 85: 68, 86: 69, 87: 7, 88: 70, 89: 71, 90: 72, 91: 73, 92: 74, 93: 75, 94: 76, 95: 77, 96: 78, 97: 79, 98: 8, 99: 80, 100: 81, 101: 82, 102: 83, 103: 84, 104: 85, 105: 86, 106: 87, 107: 88, 108: 89, 109: 9, 110: 90, 111: 91, 112: 92, 113: 93, 114: 94, 115: 95, 116: 96, 117: 97, 118: 98, 119: 99}
origdict = {1: 0, 2: 32, 3: 43, 4: 54, 5: 65, 6: 76, 7: 87, 8: 98, 9: 109, 10: 1, 11: 12, 12: 23, 13: 25, 14: 26, 15: 27, 16: 28, 17: 29, 18: 30, 19: 31, 20: 33, 21: 34, 22: 35, 23: 36, 24: 37, 25: 38, 26: 39, 27: 40, 28: 41, 29: 42, 30: 44, 31: 45, 32: 46, 33: 47, 34: 48, 35: 49, 36: 50, 37: 51, 38: 52, 39: 53, 40: 55, 41: 56, 42: 57, 43: 58, 44: 59, 45: 60, 46: 61, 47: 62, 48: 63, 49: 64, 50: 66, 51: 67, 52: 68, 53: 69, 54: 70, 55: 71, 56: 72, 57: 73, 58: 74, 59: 75, 60: 77, 61: 78, 62: 79, 63: 80, 64: 81, 65: 82, 66: 83, 67: 84, 68: 85, 69: 86, 70: 88, 71: 89, 72: 90, 73: 91, 74: 92, 75: 93, 76: 94, 77: 95, 78: 96, 79: 97, 80: 99, 81: 100, 82: 101, 83: 102, 84: 103, 85: 104, 86: 105, 87: 106, 88: 107, 89: 108, 90: 110, 91: 111, 92: 112, 93: 113, 94: 114, 95: 115, 96: 116, 97: 117, 98: 118, 99: 119, 100: 2, 101: 3, 102: 4, 103: 5, 104: 6, 105: 7, 106: 8, 107: 9, 108: 10, 109: 11, 110: 13, 111: 14, 112: 15, 113: 16, 114: 17, 115: 18, 116: 19, 117: 20, 118: 21, 119: 22, 120: 24}
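# Note on the two dictionaries above: mydict maps the x2paddle-converted
# models' output indices back to the original 1..120 labels, and origdict
# is its inverse. A quick sanity check of that assumption:
assert all(origdict[mydict[k]] == k for k in mydict)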

adv_program=fluid.Program()
startup_program = fluid.Program()

new_scope = fluid.Scope()
#Initialization
with fluid.program_guard(adv_program):
    place = fluid.CUDAPlace(0) if use_gpu else fluid.CPUPlace()
    exe = fluid.Executor(place)
    label = fluid.layers.data(name="label", shape=[1] ,dtype='int64')
    label2 = fluid.layers.data(name="label2", shape=[1] ,dtype='int64')
    adv_image = fluid.layers.create_parameter(name="adv_image",shape=(1,3,224,224),dtype='float32')
    
    model1 = models.__dict__[model_name1]()
    out_logits1 = model1.net(input=adv_image, class_dim=class_dim)
    out1 = fluid.layers.softmax(out_logits1)

    model2 = models.__dict__[model_name2](scale=2.0)
    out_logits2 = model2.net(input=adv_image, class_dim=class_dim)
    out2 = fluid.layers.softmax(out_logits2)

    _input1 = fluid.layers.create_parameter(name="_input_1", shape=(1,3,224,224),dtype='float32')
    
    model3 = models.__dict__[model_name3]()
    input_layer3,out_logits3 = model3.x2paddle_net(input =adv_image )
    out3 = fluid.layers.softmax(out_logits3[0])
    
    model4 = models.__dict__[model_name4]()
    input_layer4,out_logits4 = model4.x2paddle_net(input =adv_image )
    out4 = fluid.layers.softmax(out_logits4[0])


    model5 = models.__dict__[model_name5]()
    input_layer5,out_logits5 = model5.x2paddle_net(input =adv_image )
    out5 = fluid.layers.softmax(out_logits5[0])

    model6 = models.__dict__[model_name6]()
    input_layer6,out_logits6 = model6.x2paddle_net(input =adv_image)
    out6 = fluid.layers.softmax(out_logits6[0])

    model7 = models.__dict__[model_name7]()
    input_layer7,out_logits7 = model7.x2paddle_net(input =adv_image)
    out7 = fluid.layers.softmax(out_logits7[0])

    model8 = models.__dict__[model_name8]()
    input_layer8,out_logits8 = model8.x2paddle_net(input =adv_image)
    out8 = fluid.layers.softmax(out_logits8[0])
    
    
    model9 = models.__dict__[model_name9]()
    input_layer9,out_logits9 = model9.x2paddle_net(input =adv_image)
    out9 = fluid.layers.softmax(out_logits9[0])

    place = fluid.CUDAPlace(0) if use_gpu else fluid.CPUPlace()
    exe = fluid.Executor(place)
    exe.run(fluid.default_startup_program())

    one_hot_label = fluid.one_hot(input=label, depth=121)
    one_hot_label2 = fluid.one_hot(input=label2, depth=121)
    smooth_label = fluid.layers.label_smooth(label=one_hot_label, epsilon=0.1, dtype="float32")[0]
    smooth_label2 = fluid.layers.label_smooth(label=one_hot_label2, epsilon=0.1, dtype="float32")[0]



    # ze = -1: multiplying by it negates each cross-entropy term, so
    # minimizing this loss maximizes the true-class cross-entropy (the
    # attack direction); the 1.2 and 0.2 factors weight individual models
    ze = fluid.layers.fill_constant(shape=[1], value=-1, dtype='float32')
    loss = 1.2*fluid.layers.matmul(ze, fluid.layers.cross_entropy(input=out1, label=label[0]))\
    + 0.2*fluid.layers.matmul(ze, fluid.layers.cross_entropy(input=out2, label=label[0]))\
    + fluid.layers.matmul(ze, fluid.layers.cross_entropy(input=out3, label=label2[0]))\
    + fluid.layers.matmul(ze, fluid.layers.cross_entropy(input=out4, label=label2[0]))\
    + fluid.layers.matmul(ze, fluid.layers.cross_entropy(input=out5, label=label2[0]))\
    + fluid.layers.matmul(ze, fluid.layers.cross_entropy(input=out6, label=label2[0]))\
    + fluid.layers.matmul(ze, fluid.layers.cross_entropy(input=out7, label=label2[0]))\
    + fluid.layers.matmul(ze, fluid.layers.cross_entropy(input=out8, label=label2[0]))\
    + fluid.layers.matmul(ze, fluid.layers.cross_entropy(input=out9, label=label2[0]))
    
    avg_loss=fluid.layers.reshape(loss ,[1])  # reshape the loss into shape [1]

init_prog(adv_program)
eval_program = adv_program.clone(for_test=True)

with fluid.program_guard(adv_program): 
    #Variable-name collisions across models are not handled here;
    #this section loads each model's parameters
    def if_exist(var):
        b = os.path.exists(os.path.join(pretrained_model1, var.name))
        return b
    def if_exist2(var):
        b = os.path.exists(os.path.join(pretrained_model2, var.name))
        return b
    def if_exist3(var):
        b = os.path.exists(os.path.join(pretrained_model3, var.name))
        return b
    def if_exist4(var):
        b = os.path.exists(os.path.join(pretrained_model4, var.name))
        return b
    def if_exist5(var):
        b = os.path.exists(os.path.join(pretrained_model5, var.name))
        return b
    def if_exist6(var):
        b = os.path.exists(os.path.join(pretrained_model6, var.name))
        return b
    def if_exist7(var):
        b = os.path.exists(os.path.join(pretrained_model7, var.name))
        return b
    def if_exist8(var):
        b = os.path.exists(os.path.join(pretrained_model8, var.name))
        return b
    def if_exist9(var):
        b = os.path.exists(os.path.join(pretrained_model9, var.name))
        return b
    fluid.io.load_vars(exe,
                       pretrained_model1,
                       fluid.default_main_program(),
                       predicate=if_exist)
    fluid.io.load_vars(exe,
                       pretrained_model2,
                       fluid.default_main_program(),
                       predicate=if_exist2)
    fluid.io.load_vars(exe,
                       pretrained_model3,
                       fluid.default_main_program(),
                       predicate=if_exist3)
    fluid.io.load_vars(exe,
                       pretrained_model4,
                       fluid.default_main_program(),
                       predicate=if_exist4)
    fluid.io.load_vars(exe,
                       pretrained_model5,
                       fluid.default_main_program(),
                       predicate=if_exist5)
    fluid.io.load_vars(exe,
                       pretrained_model6,
                       fluid.default_main_program(),
                       predicate=if_exist6)
    fluid.io.load_vars(exe,
                       pretrained_model7,
                       fluid.default_main_program(),
                       predicate=if_exist7)
    fluid.io.load_vars(exe,
                       pretrained_model8,
                       fluid.default_main_program(),
                       predicate=if_exist8)

    fluid.io.load_vars(exe,
                       pretrained_model9,
                       fluid.default_main_program(),
                       predicate=if_exist9)
    gradients = fluid.backward.gradients(targets=avg_loss, inputs=[adv_image])[0]
    #gradients = fluid.backward.gradients(targets=avg_loss, inputs=[adv_image])
    #print(gradients.shape)
    
def attack_nontarget_by_ensemble(img, src_label,src_label2,label,momentum): # src_label2 is the label converted to the x2paddle models' index space
    adv,m=ensem_mom_attack_threshold_9model_tarversion(adv_program=adv_program,eval_program=eval_program,gradients=gradients,o=img,
                src_label = src_label,
                src_label2 = src_label2,
                label = label,
                out1 = out1,out2 = out2 ,out3 = out3 ,out4 = out4,out5 = out5,out6 = out6,out7 = out7 ,out8 = out8,out9 = out9,mm = momentum)  # mm: the carried-over momentum

    adv_img=tensor2img(adv)
    return adv_img,m

def get_original_file(filepath):
    with open(filepath, 'r') as cfile:
        full_lines = [line.strip() for line in cfile]
    cfile.close()
    original_files = []
    for line in full_lines:
        label, file_name = line.split()
        original_files.append([file_name, int(label)])
    return original_files
    
def gen_adv():
    mse = 0
    original_files = get_original_file(input_dir + val_list)
    #The next image's initial momentum direction is the previous image's final value
    global momentum
    momentum=0
    
    for filename, label in original_files:
        img_path = input_dir + filename
        print("Image: {0} ".format(img_path))
        img=process_img(img_path)
        #adv_img = attack_nontarget_by_ensemble(img, label,origdict[label],label)
        adv_img,m = attack_nontarget_by_ensemble(img, label,origdict[label],label,momentum)
        #m is the previous sample's final momentum value
        momentum = m
        #adv_img has already been converted; its range is 0-255

        image_name, image_ext = filename.split('.')
        ##Save adversarial image(.png)
        save_adv_image(adv_img, output_dir+image_name+'.png')
        org_img = tensor2img(img)
        score = calc_mse(org_img, adv_img)
        print("Image:{0}, mase = {1} ".format(img_path,score))
        mse += score
    print("ADV {} files, AVG MSE: {} ".format(len(original_files), mse/len(original_files)))
 

note: the attack function here has been replaced with my own implementation, which the next section describes.

 

2.2 Algorithm improvements

Adding momentum iteration

A momentum term is a standard remedy for getting stuck in local optima. Since we keep viewing adversarial example generation as an optimization problem, using momentum is only natural.

The update is sketched below (the full version appears in the core function later):
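A minimal NumPy sketch of the accumulation, matching the normalizations used inside ensem_mom_attack_threshold_9model_tarversion below:

import numpy as np

def momentum_step(g, momentum, decay_factor=0.90):
    # Normalize the raw gradient by its L1 norm before accumulating
    velocity = g / (np.linalg.norm(g.flatten(), ord=1) + 1e-10)
    momentum = decay_factor * momentum + velocity
    # The step direction is the L2-normalized accumulated momentum
    norm_m = momentum / (np.linalg.norm(momentum.flatten(), ord=2) + 1e-10)
    return norm_m, momentum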

 

Random gradient reversal

  • Coarse-grained. Inspired by [2], which runs a two-branch search: one branch performs conventional gradient ascent (the green path in the paper's figure); the other first descends into the local optimum of the current class and only then ascends, hoping to find a faster escape route (the blue path in Figure 3.3, not reproduced here). My implementation simplifies this: only the very first iteration takes the descent step.

A sketch of this first-step reversal follows:
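This sketch uses the loss convention of the core function below (the ensemble loss is the negated cross-entropy, so subtracting its gradient is the attack direction); compute_grad is a hypothetical stand-in for the exe.run gradient call, and momentum_step is the sketch above:

adv = img.copy()
momentum = 0
for i in range(steps):
    g = compute_grad(adv)  # hypothetical helper: gradient of the negated ensemble loss
    norm_m_m, momentum = momentum_step(g, momentum)
    if i == 0:
        adv = adv + epsilon * norm_m_m  # first step: descend toward the source class
    else:
        adv = adv - epsilon * norm_m_m  # remaining steps: ascend away from it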

  • Fine-grained. Randomly flip the sign of 5% of the gradient entries: a pixel-granularity gradient reversal whose flip ratio is a hyperparameter.

Concretely, generate random numbers with the same shape as the gradient, and multiply the fraction selected by a threshold by -1, as sketched below:
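A sketch of the pixel-level flipping (flip_ratio is the hyperparameter; norm_m_m is the normalized momentum direction from the previous sketch):

flip_ratio = 0.05                                    # flip ~5% of the entries
dir_mask = np.random.rand(3, 224, 224) > flip_ratio  # False with probability flip_ratio
dir_mask = np.where(dir_mask, 1.0, -1.0)             # flipped entries become -1
norm_m_m = norm_m_m * dir_mask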

 

Adding Gaussian noise

This method is inspired by [6]. Its authors argue that the attacked model's gradient carries noise that hurts transferability, so they replace the gradient with an average of gradients over several noisy copies of the input, with good results. My reading differs: I add noise precisely to make the gradient noisier, so the search can jump over local optima; besides, computing the gradient several times per step is expensive. I therefore add noise only once per step, with mean 0 and the variance as a hyperparameter.

A sketch:
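Here sigma is the variance hyperparameter and adv the current adversarial image; gradients are then computed at adv_noise rather than adv:

sigma = 0.5  # noise scale, a hyperparameter
adv_noise = (adv + np.random.normal(loc=0.0, scale=sigma,
                                    size=(3, 224, 224))).astype('float32')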

 

Targeted attack after a successful attack

This is inspired by [2], whose authors prune useless noise after crossing the decision boundary, arguing that this strengthens transferability. My approach differs: I believe one should not merely cross the boundary but keep walking down into the valley of the wrong class. The underlying assumption is that, although different models' decision boundaries differ, the features the models learn should be similar. The idea is shown by the red arrows in Figure 3.4 (not reproduced here; the circled numbers mark iteration steps). So after a successful attack I append two targeted steps aimed at the class into which the sample was just misclassified; for the ensemble attack, the target is the mode of the misclassified labels.

Targeted attack: the target label is the mode of the nine models' predictions, sketched below:
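A sketch (model_outputs is a hypothetical list holding the nine models' softmax outputs, already mapped to a common label space):

import pandas as pd

preds = [out.argmax() for out in model_outputs]  # each model's predicted class
t_label = pd.Series(preds).mode()[0]             # most frequent wrong class becomes the target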

 

Putting these strategies together, the full core function is shown in the next code cell. I named it ensem_mom_attack_threshold_9model_tarversion.

In[ ]
def ensem_mom_attack_threshold_9model_tarversion(adv_program,eval_program,gradients,o,src_label2,src_label,out1,out2,out3,out4,out5,out6,out7,out8,out9,label,mm,iteration=20,use_gpu = True):
    origdict = {1: 0, 2: 32, 3: 43, 4: 54, 5: 65, 6: 76, 7: 87, 8: 98, 9: 109, 10: 1, 11: 12, 12: 23, 13: 25, 14: 26, 15: 27, 16: 28, 17: 29, 18: 30, 19: 31, 20: 33, 21: 34, 22: 35, 23: 36, 24: 37, 25: 38, 26: 39, 27: 40, 28: 41, 29: 42, 30: 44, 31: 45, 32: 46, 33: 47, 34: 48, 35: 49, 36: 50, 37: 51, 38: 52, 39: 53, 40: 55, 41: 56, 42: 57, 43: 58, 44: 59, 45: 60, 46: 61, 47: 62, 48: 63, 49: 64, 50: 66, 51: 67, 52: 68, 53: 69, 54: 70, 55: 71, 56: 72, 57: 73, 58: 74, 59: 75, 60: 77, 61: 78, 62: 79, 63: 80, 64: 81, 65: 82, 66: 83, 67: 84, 68: 85, 69: 86, 70: 88, 71: 89, 72: 90, 73: 91, 74: 92, 75: 93, 76: 94, 77: 95, 78: 96, 79: 97, 80: 99, 81: 100, 82: 101, 83: 102, 84: 103, 85: 104, 86: 105, 87: 106, 88: 107, 89: 108, 90: 110, 91: 111, 92: 112, 93: 113, 94: 114, 95: 115, 96: 116, 97: 117, 98: 118, 99: 119, 100: 2, 101: 3, 102: 4, 103: 5, 104: 6, 105: 7, 106: 8, 107: 9, 108: 10, 109: 11, 110: 13, 111: 14, 112: 15, 113: 16, 114: 17, 115: 18, 116: 19, 117: 20, 118: 21, 119: 22, 120: 24}
    mydict = {0: 1, 1: 10, 2: 100, 3: 101, 4: 102, 5: 103, 6: 104, 7: 105, 8: 106, 9: 107, 10: 108, 11: 109, 12: 11, 13: 110, 14: 111, 15: 112, 16: 113, 17: 114, 18: 115, 19: 116, 20: 117, 21: 118, 22: 119, 23: 12, 24: 120, 25: 13, 26: 14, 27: 15, 28: 16, 29: 17, 30: 18, 31: 19, 32: 2, 33: 20, 34: 21, 35: 22, 36: 23, 37: 24, 38: 25, 39: 26, 40: 27, 41: 28, 42: 29, 43: 3, 44: 30, 45: 31, 46: 32, 47: 33, 48: 34, 49: 35, 50: 36, 51: 37, 52: 38, 53: 39, 54: 4, 55: 40, 56: 41, 57: 42, 58: 43, 59: 44, 60: 45, 61: 46, 62: 47, 63: 48, 64: 49, 65: 5, 66: 50, 67: 51, 68: 52, 69: 53, 70: 54, 71: 55, 72: 56, 73: 57, 74: 58, 75: 59, 76: 6, 77: 60, 78: 61, 79: 62, 80: 63, 81: 64, 82: 65, 83: 66, 84: 67, 85: 68, 86: 69, 87: 7, 88: 70, 89: 71, 90: 72, 91: 73, 92: 74, 93: 75, 94: 76, 95: 77, 96: 78, 97: 79, 98: 8, 99: 80, 100: 81, 101: 82, 102: 83, 103: 84, 104: 85, 105: 86, 106: 87, 107: 88, 108: 89, 109: 9, 110: 90, 111: 91, 112: 92, 113: 93, 114: 94, 115: 95, 116: 96, 117: 97, 118: 98, 119: 99}
    
    place = fluid.CUDAPlace(0) if use_gpu else fluid.CPUPlace()
    exe = fluid.Executor(place)
    
    target_label=np.array([src_label]).astype('int64')
    target_label=np.expand_dims(target_label, axis=0)
    target_label2=np.array([src_label2]).astype('int64')
    target_label2=np.expand_dims(target_label2, axis=0)
    
    img = o.copy()
    decay_factor = 0.90
    steps=90
    epsilons = np.linspace(5, 388, num=75)
    flag_traget = 0  # 0 means untargeted attack
    flag2=0  # flag signalling we can exit
    for epsilon in epsilons[:]:
        #print("now momentum is {}".format(momentum))
        if flag_traget==0:
            #momentum = mm
            momentum = 0
            adv=img.copy()
            for i in range(steps):
                
                if i<50:
                    adv_noise = (adv+np.random.normal(loc=0.0, scale=0.5+epsilon/90,size = (3,224,224))).astype('float32')
                else:
                    adv_noise = (adv+np.random.normal(loc=0.0, scale=0.1,size = (3,224,224))).astype('float32')
                g,resul1,resul2,resul3,resul4,resul5,resul6,resul7,resul8,resul9 = exe.run(adv_program,
                             fetch_list=[gradients,out1,out2,out3,out4,out5,out6,out7,out8,out9],
                             feed={'label2':target_label2,'adv_image':adv_noise,'label': target_label })
               
                #print(g[0][0].shape,g[0][1].shape,g[0][2].shape)
                g = (g[0][0]+g[0][1]+g[0][2])/3  # average the gradient over the three channels
                #print(g.shape)
                velocity = g / (np.linalg.norm(g.flatten(),ord=1) + 1e-10)
                momentum = decay_factor * momentum + velocity
                #print(momentum.shape)
                norm_m = momentum / (np.linalg.norm(momentum.flatten(),ord=2) + 1e-10)
                #print(norm_m.shape)
                _max = np.max(abs(norm_m))
                tmp = np.percentile(abs(norm_m), [25, 99.45, 99.5])  # limit the changed pixels to the top 0.5%
                thres = tmp[2]
                mask = abs(norm_m)>thres
                norm_m_m = np.multiply(norm_m,mask)
                if i<50:  # first 50 steps: ~2% gradient reversal, decaying with i (try 5%?)
                    dir_mask = np.random.rand(3,224,224) > (0.15-i/900)
                    # Cast to float before assigning -1: on a bool array the
                    # assignment below would silently store True and the flip
                    # would become a no-op
                    dir_mask = dir_mask.astype('float32')
                    dir_mask[dir_mask==0] = -1
                    norm_m_m = np.multiply(norm_m_m,dir_mask)
                #The step size could also decay with the step count (see the commented-out line below)
                if i==0:
                    adv=adv+epsilon*norm_m_m 
                else:
                    adv=adv-epsilon*norm_m_m 
                    #adv=adv-(epsilon-i/30)*norm_m_m 
                #Enforce the L-infinity constraint
                adv=linf_img_tenosr(img,adv,epsilon)
        else:
            for i in range(2):
                adv_noise = (adv+np.random.normal(loc=0.0, scale=0.1,size = (3,224,224))).astype('float32')
                target_label=np.array([t_label]).astype('int64')
                target_label=np.expand_dims(target_label, axis=0)
                target_label2=np.array([origdict[t_label]]).astype('int64')
                target_label2=np.expand_dims(target_label2, axis=0)
                g,resul1,resul2,resul3,resul4,resul5,resul6,resul7,resul8,resul9 = exe.run(adv_program,
                         fetch_list=[gradients,out1,out2,out3,out4,out5,out6,out7,out8,out9],
                         feed={'label2':target_label2,'adv_image':adv_noise,'label': target_label }
                          )
                g = (g[0][0]+g[0][1]+g[0][2])/3  # average the gradient over the three channels
                velocity = g / (np.linalg.norm(g.flatten(),ord=1) + 1e-10)
                momentum = decay_factor * momentum + velocity
                #print(momentum.shape)
                norm_m = momentum / (np.linalg.norm(momentum.flatten(),ord=2) + 1e-10)
                #print(norm_m.shape)
                _max = np.max(abs(norm_m))
                tmp = np.percentile(abs(norm_m), [25, 99.45, 99.5])  # limit the changed pixels to the top 0.5%
                thres = tmp[2]
                mask = abs(norm_m)>thres
                norm_m_m = np.multiply(norm_m,mask)
                adv=adv+epsilon*norm_m_m
                #Enforce the L-infinity constraint
                adv=linf_img_tenosr(img,adv,epsilon)
            flag2=1
            
        print("epsilon is {}".format(epsilon))
        print("label is:{}; model1:{}; model2:{}; model3:{}; model4:{}; model5:{}; model6:{}; model7:{}; model8:{} ; model9:{} ".format(label,resul1.argmax(),resul2.argmax(),mydict[resul3.argmax()],mydict[resul4.argmax()],\
        mydict[resul5.argmax()],mydict[resul6.argmax()],mydict[resul7.argmax()],mydict[resul8.argmax()],mydict[resul9.argmax()]))#模型3标签到真正标签
        

        if((label!=resul1.argmax()) and(label!=resul2.argmax())and(origdict[label]!=resul3.argmax())and(origdict[label]!=resul4.argmax())and(origdict[label]!=resul5.argmax())\
        and(origdict[label]!=resul6.argmax())and(origdict[label]!=resul7.argmax())and(origdict[label]!=resul8.argmax())and(origdict[label]!=resul9.argmax())):
            res_list = [resul1.argmax(),resul2.argmax(),mydict[resul3.argmax()],mydict[resul4.argmax()],mydict[resul5.argmax()],mydict[resul6.argmax()],mydict[resul7.argmax()],mydict[resul8.argmax()],mydict[resul9.argmax()]]
            ser = pd.Series(res_list)
            t_label = ser.mode()[0]  # take the mode as target_label
            flag_traget=1
            if(flag2 == 1):
                break
    return adv,momentum
 

3 Reproducing the Solution

With the algorithm covered, let's run the complete pipeline and generate the adversarial examples.

note: with this many models in the ensemble, the code runs for about an hour. You can interrupt it early (open the Run menu at the notebook's top right and choose Interrupt) and inspect the adversarial examples generated so far.

In[ ]
gen_adv()
 

Next, let's see how the adversarial examples differ from the originals.

In[ ]
#Define a helper to visualize the difference between two images
def show_images_diff(original_img,adversarial_img):
    #original_img = np.array(Image.open(original_img))
    #adversarial_img = np.array(Image.open(adversarial_img))
    original_img=cv2.resize(original_img.copy(),(224,224))
    adversarial_img=cv2.resize(adversarial_img.copy(),(224,224))

    plt.figure(figsize=(10,10))

    #original_img=original_img/255.0
    #adversarial_img=adversarial_img/255.0

    plt.subplot(1, 3, 1)
    plt.title('Original Image')
    plt.imshow(original_img)
    plt.axis('off')

    plt.subplot(1, 3, 2)
    plt.title('Adversarial Image')
    plt.imshow(adversarial_img)
    plt.axis('off')

    plt.subplot(1, 3, 3)
    plt.title('Difference')
    difference = 0.0+adversarial_img - original_img
        
    l0 = np.where(difference != 0)[0].shape[0]*100/(224*224*3)
    l2 = np.linalg.norm(difference)/(256*3)
    linf=np.linalg.norm(difference.copy().ravel(),ord=np.inf)
    # print(difference)
    print("l0={}% l2={} linf={}".format(l0, l2,linf))
    
    #(-1,1)  -> (0,1)
    #Gray background makes the differences easier to see
    difference=difference/255.0
        
    difference=difference/2.0+0.5
   
    plt.imshow(difference)
    plt.axis('off')

    plt.show()
    

    #plt.savefig('fig_cat.png')
In[ ]
from PIL import Image, ImageOps
import cv2
import matplotlib.pyplot as plt
#########################################
##Replace pname with the image you want to inspect
pname = "n02085620_10074.jpg"
#########################################
image_name, image_ext = pname.split('.')
pname_attack = image_name + ".png"
original_img=np.array(Image.open("/home/aistudio/baidu_attack_by_xin/input_image/" + pname))
adversarial_img=np.array(Image.open("/home/aistudio/baidu_attack_by_xin/output_image_attack/" + pname_attack))
show_images_diff(original_img,adversarial_img)
l0=22.507440476190474% l2=5.516279998781479 linf=221.0
 

The adversarial examples generated by this solution transfer well; they were good enough for second place in the AI Security Adversarial Competition. The scoring metric is as follows.

Here M denotes a defense model and y the true label of sample I. If the defense model classifies a sample correctly, the attack failed and the perturbation is set to the upper bound 128. If the attack succeeds, the perturbation between the adversarial and the original sample is measured as the mean L2 distance. Each adversarial example is scored against m defense models, n is the number of samples, and all perturbation values are averaged into the overall distance score; lower is better.
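Reconstructed in LaTeX from the description above (the notation is mine, not the organizers'):

$$\mathrm{Score} = \frac{1}{m\,n}\sum_{j=1}^{m}\sum_{i=1}^{n} d_j(I_i),
\qquad
d_j(I_i) =
\begin{cases}
128, & M_j(I_i^{\mathrm{adv}}) = y_i \quad \text{(attack failed)}\\
\lVert I_i^{\mathrm{adv}} - I_i \rVert_2, & \text{otherwise,}
\end{cases}$$

where $\lVert\cdot\rVert_2$ is the mean $L_2$ distance mentioned in the text.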

Under this metric, the images generated by my solution reach 3.78089.

 

4 The Final Push

Still, that was not enough. In a competition where everyone is eyeing the prize money, you have to keep improving, so one final kick was needed: a post-processing step.

Truncating small perturbations

With the methods above, my score fluctuated between 95 and 96. To push further, I post-processed the images from my best-scoring submission (96.53 points): compare each attacked image with its original, and truncate every perturbation below a threshold. Probing the threshold upward, I found 17 worked best (pixel values range over 0-255). This step gained about 0.3 points.

The code follows.

Heads-up: running the code below requires all adversarial examples to have been generated first.

In[ ]
#coding=utf-8

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse
import functools
import os
import numpy as np
import paddle.fluid as fluid

#Load project-local modules
import models
from attack.attack_pp import FGSM, PGD
from utils import *

######Init args
image_shape = [3,224,224]
class_dim=121
input_dir = "./input_image/"
attacked_dir = "./output_image_attack/"
output_dir = "./posopt_output_image/"
drop_thres = 10
os.makedirs("./posopt_output_image") 
val_list = 'val_list.txt'
use_gpu=True


####### Main #######
def get_original_file(filepath):
    with open(filepath, 'r') as cfile:
        full_lines = [line.strip() for line in cfile]
    cfile.close()
    original_files = []
    for line in full_lines:
        label, file_name = line.split()
        original_files.append([file_name, int(label)])
    return original_files

def gen_diff():
    original_files = get_original_file(input_dir + val_list)

    for filename, label in original_files:
        image_name, image_ext = filename.split('.')
        img_path = input_dir + filename
        print("Image: {0} ".format(img_path))
        img=process_img(img_path)
        adv_img_path = attacked_dir + image_name+'.png'
        adv=process_img(adv_img_path)
        
        org_img = tensor2img(img)
        adv_img = tensor2img(adv)
        # Truncate all perturbations below drop_thres (10 here; the write-up
        # above reports that 17 scored best).
        # Cast to int first: with uint8 images the subtraction would wrap around.
        delta = abs(org_img.astype(np.int16) - adv_img.astype(np.int16))
        diff = delta < drop_thres        # 1 where |perturbation| < drop_thres
        diff_max = delta >= drop_thres   # 1 where |perturbation| >= drop_thres
        # Keep org_img where the perturbation is small
        tmp1 = np.multiply(org_img,diff)
        # Keep adv_img where the perturbation is large
        tmp2 = np.multiply(adv_img,diff_max)
        final_img = tmp1+tmp2
        
        save_adv_image(final_img, output_dir+image_name+'.png')


gen_diff()
 

Post-Competition Notes

  1. That is my complete second-place solution for the AI Security Adversarial Competition. If you prefer to run it from a terminal, usage instructions follow below. Thanks for reading.
  2. During the final I sat at the top of the leaderboard for more than half a month, racking my brain over attack methods, reading papers, and testing ideas. I enjoyed the process, picked up Paddle quickly by analogy with what I knew, and grew a lot.
  3. Yet I was overtaken in the last half hour. Talking with the winners afterwards, I learned they sequentially attacked more than ten models, whereas I had stopped at an ensemble of nine. Perhaps leading the board made me a little complacent.
  4. To the reader: this is an excellent entry point into adversarial examples; working carefully through the baseline and my solution will introduce you to four or five of the field's algorithms.
  5. Thanks to AI Studio for the V100; the 1050 in my own desktop can't even run a years-old VGG.
 

References

[1] Liu Y., Chen X., Liu C., et al. Delving into Transferable Adversarial Examples and Black-box Attacks. 2016.

[2] Shi Y., Wang S., Han Y. Curls & Whey: Boosting Black-Box Adversarial Attacks. 2019.

[3] Narodytska N., Kasiviswanathan S. P. Simple Black-Box Adversarial Perturbations for Deep Networks. 2016.

[4] Huang Q., Katsman I., He H., et al. Enhancing Adversarial Example Transferability with an Intermediate Level Attack. 2019.

[5] Sharif M., Bhagavatula S., Bauer L., Reiter M. K. Accessorize to a Crime: Real and Stealthy Attacks on State-of-the-Art Face Recognition. CCS 2016. https://www.cs.cmu.edu/~sbhagava/papers/face-rec-ccs16.pdf

[6] Wu L., Zhu Z., Tai C., E W. Understanding and Enhancing the Transferability of Adversarial Examples. 2018.

 

Code Usage Instructions

Dependencies:

Steps:

  • In a terminal, run:
  • cd baidu_attack_by_xin/
  • python 9model_ensemble_attack.py
  • python pert_drop.py
  • The results are saved in posopt_output_image

Try the project hands-on on AI Studio: https://aistudio.baidu.com/aistudio/projectdetail/296291
