TensorFlow 2.0 Beta : 上級 Tutorials : 画像生成 :- FGSM を使用した敵対的サンプル (翻訳/解説)

翻訳 : (株)クラスキャットセールスインフォメーション
作成日時 : 07/09/2019

* 本ページは、TensorFlow の本家サイトの TF 2.0 Beta – Advanced Tutorials – Image generation の以下のページを翻訳した上で適宜、補足説明したものです：

Adversarial example using FGSM

* サンプルコードの動作確認はしておりますが、必要な場合には適宜、追加改変しています。
* ご自由にリンクを張って頂いてかまいませんが、sales-info@classcat.com までご一報いただけると嬉しいです。

画像生成 :- FGSM を使用した敵対的サンプル

このチュートリアルは Goodfellow et al による Explaining and Harnessing Adversarial Examples で記述されているように Fast Gradient Signed メソッド (FGSM) 攻撃を使用して敵対的サンプルを作成します。これはニューラルネットワークを騙す最初の最もポピュラーな攻撃の一つでした。

敵対的サンプルとは何でしょう？

敵対的サンプルはニューラルネットワークを混乱させ、与えられた入力の誤分類という結果になる目的で作成された特殊化された (= specialised) 入力です。これらの悪名高い入力は人間の目には見分けがつきませんが、ネットワークに画像の内容を識別することを失敗させます。そのような攻撃の幾つかのタイプがありますが、ここでは焦点は fast gradient sign メソッド攻撃にあります、これはホワイトボックス攻撃でその目標は誤分類を確実にすることです。ホワイトボックス攻撃では攻撃者は攻撃されるモデルへの完全なアクセスを持ちます。下で示される敵対的画像の最も有名な例の一つは前述のペーパーから取りました。

ここで、パンダの画像から始め、攻撃者は元の画像に小さな摂動 (= perturbations) (歪み (= distortion)) を追加します、これはモデルがこの画像をテナガザルとして、高い信頼度でラベル付けする結果になります。これらの摂動を追加する過程は下で説明されます。

Fast gradient sign メソッド

fast gradient sign メソッドは敵対的サンプルを作成するニューラルネットワークの勾配を利用することにより動作します。入力画像に対して、メソッドは損失を最大化する新しい画像を作成する入力画像に関する損失の勾配を使用します。この新しい画像は敵対的画像と呼ばれます。これは次の式を使用して要約できます :

$adv\_x = x + \epsilon*\text{sign}(\nabla_xJ(\theta, x, y))$

ここで

adv_x : 敵対的画像。
x : 元の入力画像。
y : 元の入力ラベル。
$\epsilon$ : 摂動が小さいことを確実にする乗数。
$\theta$ : モデルパラメータ。
$J$ : 損失。

ここで面白い特質は、勾配は入力画像に関して取られることです。これが成されるのは目的が損失を最大化する画像を作成することであるからです。これを達成する方法は画像の各ピクセルが損失値にどのくらい寄与するかを見出し、それに従って摂動を追加することです。

これはかなり高速に動作します、何故ならば連鎖率を使用してそして必要な勾配を見つけることにより、各ピクセルが損失にどのくらい寄与するかを見い出すことは容易だからです。こうして、画像に関する勾配が使用されます。更に、モデルはもはや訓練されませんので (そのため勾配は訓練可能な変数, i.e. モデルパラメータに関して取られません)、モデルパラメータは定数で在り続けます。唯一のゴールは既に訓練されたモデルを騙すことです。

そこで事前訓練されたモデルを試して騙してみましょう。このチュートリアルでは、モデルは MobileNetV2 で、ImageNet 上で事前訓練されました。

from __future__ import absolute_import, division, print_function, unicode_literals

!pip install -q tensorflow==2.0.0-beta1
import tensorflow as tf
import matplotlib as mpl
import matplotlib.pyplot as plt

mpl.rcParams['figure.figsize'] = (8, 8)
mpl.rcParams['axes.grid'] = False

事前訓練された MobileNetV2 モデルと ImageNet クラス名をロードしましょう。

pretrained_model = tf.keras.applications.MobileNetV2(include_top=True,
                                                     weights='imagenet')
pretrained_model.trainable = False

# ImageNet labels
decode_predictions = tf.keras.applications.mobilenet_v2.decode_predictions

Downloading data from https://github.com/JonathanCMitchell/mobilenet_v2_keras/releases/download/v1.1/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.0_224.h5
14540800/14536120 [==============================] - 1s 0us/step

# Helper function to preprocess the image so that it can be inputted in MobileNetV2
def preprocess(image):
  image = tf.cast(image, tf.float32)
  image = image/255
  image = tf.image.resize(image, (224, 224))
  image = image[None, ...]
  return image

# Helper function to extract labels from probability vector
def get_imagenet_label(probs):
  return decode_predictions(probs, top=1)[0][0]

元の画像

Wikimedia Common からラブラドール・レトリバー by Mirko CC-BY-SA 3.0 のサンプル画像を使用してそれから敵対的サンプルを作成します。最初のステップはそれを MobileNetV2 モデルに入力として供給できるように前処理することです。

image_path = tf.keras.utils.get_file('YellowLabradorLooking_new.jpg', 'https://storage.googleapis.com/download.tensorflow.org/example_images/YellowLabradorLooking_new.jpg')
image_raw = tf.io.read_file(image_path)
image = tf.image.decode_image(image_raw)

image = preprocess(image)
image_probs = pretrained_model.predict(image)

Downloading data from https://storage.googleapis.com/download.tensorflow.org/example_images/YellowLabradorLooking_new.jpg
90112/83281 [================================] - 0s 0us/step

画像を見てみましょう。

plt.figure()
plt.imshow(image[0])
_, image_class, class_confidence = get_imagenet_label(image_probs)
plt.title('{} : {:.2f}% Confidence'.format(image_class, class_confidence*100))
plt.show()

Downloading data from https://storage.googleapis.com/download.tensorflow.org/data/imagenet_class_index.json
40960/35363 [==================================] - 0s 0us/step

敵対的画像を作成する

fast gradient sign メソッドを実装する

最初のステップは摂動を作成することです、これは元の画像を歪めるために使用されて敵対的画像という結果になります。言及したように、このタスクのためには、勾配は画像に関して取られます。

loss_object = tf.keras.losses.CategoricalCrossentropy()

def create_adversarial_pattern(input_image, input_label):
  with tf.GradientTape() as tape:
    tape.watch(input_image)
    prediction = pretrained_model(input_image)
    loss = loss_object(input_label, prediction)

  # Get the gradients of the loss w.r.t to the input image.
  gradient = tape.gradient(loss, input_image)
  # Get the sign of the gradients to create the perturbation
  signed_grad = tf.sign(gradient)
  return signed_grad

結果としての摂動もまた可視化できます。

perturbations = create_adversarial_pattern(image, image_probs)
plt.imshow(perturbations[0])

WARNING: Logging before flag parsing goes to stderr.
W0628 01:18:35.199363 140125669582592 deprecation.py:323] From /tmpfs/src/tf_docs_env/lib/python3.5/site-packages/tensorflow/python/ops/math_grad.py:1220: add_dispatch_support..wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
W0628 01:18:35.360933 140125669582592 image.py:648] Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).

<matplotlib.image.AxesImage at 0x7f70b87057b8>

epsilon の異なる値のためにこれを試して結果としての画像を観測してみましょう。epsilon の値が増加するにつれてネットワークを騙すことが容易になることに気がつくでしょう、けれども、これはより同一視できる摂動という結果とのトレードオフとなります。

def display_images(image, description):
  _, label, confidence = get_imagenet_label(pretrained_model.predict(image))
  plt.figure()
  plt.imshow(image[0])
  plt.title('{} \n {} : {:.2f}% Confidence'.format(description,
                                                   label, confidence*100))
  plt.show()

epsilons = [0, 0.01, 0.1, 0.15]
descriptions = [('Epsilon = {:0.3f}'.format(eps) if eps else 'Input')
                for eps in epsilons]

for i, eps in enumerate(epsilons):
  adv_x = image + eps*perturbations
  adv_x = tf.clip_by_value(adv_x, 0, 1)
  display_images(adv_x, descriptions[i])

Next steps

敵対的攻撃について知った今、先に進んでこれを異なるデータセットと異なるアーキテクチャで試すことができます。貴方自身のモデルを作成して訓練し、そしてそれから同じメソッドを使用してそれを騙すことを試みることもできるかもしれません。epsilon を変更するにつれて予測の信頼度がどのように変化するかを試して見ることもできます。

パワフルですが、このチュートリアルで示される攻撃は敵対的攻撃への研究の単なるスタートで、それからよりパワフルな攻撃を作成する複数のペーパーがあります。敵対的攻撃に加えて、研究は防御の作成にも繋がりました、これは強固な機械学習モデルを作成することを目的とします。敵対的攻撃と防御の包括的なリストのためにこの survey ペーパーをレビューしても良いです。

敵対的攻撃と防御のより多くの実装については、敵対的サンプル・ライブラリ CleverHans を見ることを望むかもしれません。

以上

月	火	水	木	金	土	日
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30	31