TensorFlow 2.0 Alpha : ガイド : Keras : TensorFlow の Keras Functional API (翻訳/解説)

翻訳 : (株)クラスキャットセールスインフォメーション
作成日時 : 03/15/2019

* 本ページは、TensorFlow の本家サイトの TF 2.0 Alpha の以下のページを翻訳した上で適宜、補足説明したものです：

Keras: The Keras Functional API in TensorFlow

* サンプルコードの動作確認はしておりますが、必要な場合には適宜、追加改変しています。
* ご自由にリンクを張って頂いてかまいませんが、sales-info@classcat.com までご一報いただけると嬉しいです。

ガイド : Keras : TensorFlow の Keras Functional API

セットアップ

!pip install -q pydot
!apt-get install graphviz

Requirement already satisfied: pydot in /usr/local/lib/python3.6/dist-packages (1.3.0)
Requirement already satisfied: pyparsing>=2.1.4 in /usr/local/lib/python3.6/dist-packages (from pydot) (2.3.1)
Reading package lists... Done
Building dependency tree       
Reading state information... Done
graphviz is already the newest version (2.40.1-2).
0 upgraded, 0 newly installed, 0 to remove and 10 not upgraded.

from __future__ import absolute_import, division, print_function

!pip install -q tensorflow-gpu==2.0.0-alpha0
import tensorflow as tf

tf.keras.backend.clear_session()  # For easy reset of notebook state.

Introduction

貴方はモデルを作成するために既に keras.Sequential() の使用に慣れているでしょう。Functional API は Sequential よりもより柔軟なモデルを作成する方法です : それは非線形トポロジーのモデル、共有層を持つモデルそしてマルチ入力と出力を持つモデルを扱うことができます。

それは深層学習モデルが通常は層の有向非巡回グラフ (DAG, directed acyclic graph) であるという考えに基づきます。Functional API は層のグラフを構築するためのツールのセットです。

次のモデルを考えてください :

(input: 784-dimensional vectors)
       ↧
[Dense (64 units, relu activation)]
       ↧
[Dense (64 units, relu activation)]
       ↧
[Dense (10 units, softmax activation)]
       ↧
(output: probability distribution over 10 classes)

それは単純な 3 層のグラフです。

このモデルを functional API で構築するには、入力ノードを作成することから始めるでしょう :

from tensorflow import keras

inputs = keras.Input(shape=(784,))

ここで私達のデータの shape だけを指定します : 784-次元ベクトルです。バッチサイズは常に省略されることに注意してください、各サンプルの shape を指定するだけです。shape (32, 32, 3) の画像のための入力のためであれば、次を使用したでしょう :

img_inputs = keras.Input(shape=(32, 32, 3))

返された、inputs は、貴方のモデルに供給することを想定する入力データの shape と dtype についての情報を含みます :

inputs.shape

TensorShape([None, 784])

inputs.dtype

tf.float32

この入力オブジェクト上で層を呼び出すことにより層のグラフで新しいノードを作成します :

from tensorflow.keras import layers

dense = layers.Dense(64, activation='relu')
x = dense(inputs)

“layer call” アクションは “inputs” からこの作成した層への矢印を描くようなものです。入力を dense 層に “passing” して、そして x を得ます。

層のグラフに 2, 3 のより多くの層を追加しましょう :

x = layers.Dense(64, activation='relu')(x)
outputs = layers.Dense(10, activation='softmax')(x)

この時点で、層のグラフの入力と出力を指定することによりモデルを作成できます :

model = keras.Model(inputs=inputs, outputs=outputs)

おさらいとして、ここに完全なモデル定義過程があります :

inputs = keras.Input(shape=(784,), name='img')
x = layers.Dense(64, activation='relu')(inputs)
x = layers.Dense(64, activation='relu')(x)
outputs = layers.Dense(10, activation='softmax')(x)

model = keras.Model(inputs=inputs, outputs=outputs, name='mnist_model')

モデル要約がどのようなものかを確認しましょう :

model.summary()

Model: "mnist_model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
img (InputLayer)             [(None, 784)]             0         
_________________________________________________________________
dense_3 (Dense)              (None, 64)                50240     
_________________________________________________________________
dense_4 (Dense)              (None, 64)                4160      
_________________________________________________________________
dense_5 (Dense)              (None, 10)                650       
=================================================================
Total params: 55,050
Trainable params: 55,050
Non-trainable params: 0
_________________________________________________________________

モデルをグラフとしてプロットすることもできます :

keras.utils.plot_model(model, 'my_first_model.png')

そしてオプションでプロットされたグラフに各層の入力と出力 shape を表示します :

keras.utils.plot_model(model, 'my_first_model_with_shape_info.png', show_shapes=True)

この図と私達が書いたコードは事実上同じです。コードバージョンでは、接続矢印は call 演算で単純に置き換えられます。

「層のグラフ」は深層学習モデルのための非常に直感的なメンタルイメージで、functional API はこのメンタルイメージを密接に映すモデルを作成する方法です。

訓練、評価そして推論

Functional API を使用して構築されたモデルのための訓練、評価と推論は Sequential モデルのためと正確に同じ方法で動作します。

ここに簡単な例があります。

ここで私達は MNIST 画像データをロードし、それをベクトルに reshape し、(検証分割上でパフォーマンスを監視しながら) データ上でモデルを fit させて最後にテストデータ上でモデルを評価します :

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = x_train.reshape(60000, 784).astype('float32') / 255
x_test = x_test.reshape(10000, 784).astype('float32') / 255

model.compile(loss='sparse_categorical_crossentropy',
              optimizer=keras.optimizers.RMSprop(),
              metrics=['accuracy'])
history = model.fit(x_train, y_train,
                    batch_size=64,
                    epochs=5,
                    validation_split=0.2)
test_scores = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', test_scores[0])
print('Test accuracy:', test_scores[1])

Train on 48000 samples, validate on 12000 samples
Epoch 1/5
48000/48000 [==============================] - 3s 64us/sample - loss: 0.3414 - accuracy: 0.9016 - val_loss: 0.1719 - val_accuracy: 0.9501
Epoch 2/5
48000/48000 [==============================] - 3s 57us/sample - loss: 0.1568 - accuracy: 0.9526 - val_loss: 0.1365 - val_accuracy: 0.9605
Epoch 3/5
48000/48000 [==============================] - 3s 58us/sample - loss: 0.1144 - accuracy: 0.9660 - val_loss: 0.1262 - val_accuracy: 0.9625
Epoch 4/5
48000/48000 [==============================] - 3s 54us/sample - loss: 0.0929 - accuracy: 0.9716 - val_loss: 0.1100 - val_accuracy: 0.9701
Epoch 5/5
48000/48000 [==============================] - 3s 55us/sample - loss: 0.0759 - accuracy: 0.9770 - val_loss: 0.1139 - val_accuracy: 0.9670
Test loss: 0.100577776569454
Test accuracy: 0.9696

モデル訓練と評価についての完全なガイドは、Guide to Training & Evaluation を見てください。

セービングとシリアライゼーション

Functional API を使用して構築されたモデルのためのセービングとシリアライゼーションは Sequential モデルのためと正確に同じ方法で動作します。

Functional モデルをセーブするための標準的な方法はモデル全体を単一のファイルにセーブするために model.save() を呼び出すことです。貴方は後でこのファイルから同じモデルを再作成できます、モデルを作成したコードへのアクセスをもはや持たない場合でさえも。

このファイルは以下を含みます :- モデルのアーキテクチャ – モデルの重み値 (これは訓練の間に学習されました) – もしあれば、(compile に渡した) モデルの訓練 config – もしあれば、optimizer とその状態 (これは貴方がやめたところから訓練を再開することを可能にします)

model.save('path_to_my_model.h5')
del model
# Recreate the exact same model purely from the file:
model = keras.models.load_model('path_to_my_model.h5')

モデル・セービングについての完全なガイドは、Guide to Saving and Serializing Models を見てください。

マルチモデルを定義するために層群の同じグラフを使用する

functional API では、モデルは層群のグラフ内のそれらの入力と出力を指定することにより作成されます。それは層群の単一のグラフが複数のモデルを生成するために使用できることを意味しています。

下の例では、2 つのモデルをインスタンス化するために層の同じスタックを使用しています : 画像入力を 16-次元ベクトルに変換するエンコーダモデルと、訓練のための end-to-end autoencoder モデルです。

encoder_input = keras.Input(shape=(28, 28, 1), name='img')
x = layers.Conv2D(16, 3, activation='relu')(encoder_input)
x = layers.Conv2D(32, 3, activation='relu')(x)
x = layers.MaxPooling2D(3)(x)
x = layers.Conv2D(32, 3, activation='relu')(x)
x = layers.Conv2D(16, 3, activation='relu')(x)
encoder_output = layers.GlobalMaxPooling2D()(x)

encoder = keras.Model(encoder_input, encoder_output, name='encoder')
encoder.summary()

x = layers.Reshape((4, 4, 1))(encoder_output)
x = layers.Conv2DTranspose(16, 3, activation='relu')(x)
x = layers.Conv2DTranspose(32, 3, activation='relu')(x)
x = layers.UpSampling2D(3)(x)
x = layers.Conv2DTranspose(16, 3, activation='relu')(x)
decoder_output = layers.Conv2DTranspose(1, 3, activation='relu')(x)

autoencoder = keras.Model(encoder_input, decoder_output, name='autoencoder')
autoencoder.summary()

Model: "encoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
img (InputLayer)             [(None, 28, 28, 1)]       0         
_________________________________________________________________
conv2d (Conv2D)              (None, 26, 26, 16)        160       
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 24, 24, 32)        4640      
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 8, 8, 32)          0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 6, 6, 32)          9248      
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 4, 4, 16)          4624      
_________________________________________________________________
global_max_pooling2d (Global (None, 16)                0         
=================================================================
Total params: 18,672
Trainable params: 18,672
Non-trainable params: 0
_________________________________________________________________
Model: "autoencoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
img (InputLayer)             [(None, 28, 28, 1)]       0         
_________________________________________________________________
conv2d (Conv2D)              (None, 26, 26, 16)        160       
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 24, 24, 32)        4640      
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 8, 8, 32)          0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 6, 6, 32)          9248      
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 4, 4, 16)          4624      
_________________________________________________________________
global_max_pooling2d (Global (None, 16)                0         
_________________________________________________________________
reshape (Reshape)            (None, 4, 4, 1)           0         
_________________________________________________________________
conv2d_transpose (Conv2DTran (None, 6, 6, 16)          160       
_________________________________________________________________
conv2d_transpose_1 (Conv2DTr (None, 8, 8, 32)          4640      
_________________________________________________________________
up_sampling2d (UpSampling2D) (None, 24, 24, 32)        0         
_________________________________________________________________
conv2d_transpose_2 (Conv2DTr (None, 26, 26, 16)        4624      
_________________________________________________________________
conv2d_transpose_3 (Conv2DTr (None, 28, 28, 1)         145       
=================================================================
Total params: 28,241
Trainable params: 28,241
Non-trainable params: 0
_________________________________________________________________

デコーディング・アーキテクチャをエンコーディング・アーキテクチャと厳密に対照的に作成しますので、入力 shape (28, 28, 1) と同じ出力 shape を得ることに注意してください。Conv2D 層の反対は Conv2DTranspose 層で、MaxPooling2D 層の反対は UpSampling2D 層です。

総てのモデルは、ちょうど層のように callable です

任意のモデルを、それを入力上あるいは他の層の出力上で呼び出すことによって、それが層であるかのように扱うことができます。モデルを呼び出すことによりモデルのアーキテクチャを単に再利用しているのではなく、その重みを再利用していることに注意してください。

これを実際に見てみましょう。ここに autoencoder サンプルの異なるテイクがあります、これはエンコーダ・モデル、デコーダ・モデルを作成し、autoencoder モデルを得るためにそれらを 2 つのコールにチェインします :

encoder_input = keras.Input(shape=(28, 28, 1), name='original_img')
x = layers.Conv2D(16, 3, activation='relu')(encoder_input)
x = layers.Conv2D(32, 3, activation='relu')(x)
x = layers.MaxPooling2D(3)(x)
x = layers.Conv2D(32, 3, activation='relu')(x)
x = layers.Conv2D(16, 3, activation='relu')(x)
encoder_output = layers.GlobalMaxPooling2D()(x)

encoder = keras.Model(encoder_input, encoder_output, name='encoder')
encoder.summary()

decoder_input = keras.Input(shape=(16,), name='encoded_img')
x = layers.Reshape((4, 4, 1))(decoder_input)
x = layers.Conv2DTranspose(16, 3, activation='relu')(x)
x = layers.Conv2DTranspose(32, 3, activation='relu')(x)
x = layers.UpSampling2D(3)(x)
x = layers.Conv2DTranspose(16, 3, activation='relu')(x)
decoder_output = layers.Conv2DTranspose(1, 3, activation='relu')(x)

decoder = keras.Model(decoder_input, decoder_output, name='decoder')
decoder.summary()

autoencoder_input = keras.Input(shape=(28, 28, 1), name='img')
encoded_img = encoder(autoencoder_input)
decoded_img = decoder(encoded_img)
autoencoder = keras.Model(autoencoder_input, decoded_img, name='autoencoder')
autoencoder.summary()

Model: "encoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
original_img (InputLayer)    [(None, 28, 28, 1)]       0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 26, 26, 16)        160       
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 24, 24, 32)        4640      
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 8, 8, 32)          0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 6, 6, 32)          9248      
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 4, 4, 16)          4624      
_________________________________________________________________
global_max_pooling2d_1 (Glob (None, 16)                0         
=================================================================
Total params: 18,672
Trainable params: 18,672
Non-trainable params: 0
_________________________________________________________________
Model: "decoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
encoded_img (InputLayer)     [(None, 16)]              0         
_________________________________________________________________
reshape_1 (Reshape)          (None, 4, 4, 1)           0         
_________________________________________________________________
conv2d_transpose_4 (Conv2DTr (None, 6, 6, 16)          160       
_________________________________________________________________
conv2d_transpose_5 (Conv2DTr (None, 8, 8, 32)          4640      
_________________________________________________________________
up_sampling2d_1 (UpSampling2 (None, 24, 24, 32)        0         
_________________________________________________________________
conv2d_transpose_6 (Conv2DTr (None, 26, 26, 16)        4624      
_________________________________________________________________
conv2d_transpose_7 (Conv2DTr (None, 28, 28, 1)         145       
=================================================================
Total params: 9,569
Trainable params: 9,569
Non-trainable params: 0
_________________________________________________________________
Model: "autoencoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
img (InputLayer)             [(None, 28, 28, 1)]       0         
_________________________________________________________________
encoder (Model)              (None, 16)                18672     
_________________________________________________________________
decoder (Model)              (None, 28, 28, 1)         9569      
=================================================================
Total params: 28,241
Trainable params: 28,241
Non-trainable params: 0
_________________________________________________________________

見て取れるように、モデルはネストできます : モデルはサブモデルを含むことができます (何故ならば モデルはちょうど層のようなものだからです)。

モデル・ネスティングのための一般的なユースケースはアンサンブルです。例として、ここにモデルのセットを (それらの予測を平均する) 単一のモデルにどのようにアンサンブルするかがあります :

def get_model():
  inputs = keras.Input(shape=(128,))
  outputs = layers.Dense(1, activation='sigmoid')(inputs)
  return keras.Model(inputs, outputs)

model1 = get_model()
model2 = get_model()
model3 = get_model()

inputs = keras.Input(shape=(128,))
y1 = model1(inputs)
y2 = model2(inputs)
y3 = model3(inputs)
outputs = layers.average([y1, y2, y3])
ensemble_model = keras.Model(inputs=inputs, outputs=outputs)

複雑なグラフ・トポロジーを操作する

マルチ入力と出力を持つモデル

functional API はマルチ入力と出力を操作することを容易にします。これは Sequential API では処理できません。

ここに単純なサンプルがあります。

貴方はプライオリティによりカスタム課題チケットをランク付けしてそれらを正しい部門に転送するためのシステムを構築しているとします。

貴方のモデルは 3 入力を持ちます :

チケットのタイトル (テキスト入力)
チケットのテキスト本体 (テキスト入力)
ユーザにより付加された任意のタグ (カテゴリカル入力)

それは 2 つの出力を持つでしょう :

0 と 1 の間のプライオリティ・スコア (スカラー sigmoid 出力)
チケットを処理すべき部門 (部門集合に渡る softmax 出力)

このモデルを Functional API で数行で構築しましょう。

num_tags = 12  # Number of unique issue tags
num_words = 10000  # Size of vocabulary obtained when preprocessing text data
num_departments = 4  # Number of departments for predictions

title_input = keras.Input(shape=(None,), name='title')  # Variable-length sequence of ints
body_input = keras.Input(shape=(None,), name='body')  # Variable-length sequence of ints
tags_input = keras.Input(shape=(num_tags,), name='tags')  # Binary vectors of size `num_tags`

# Embed each word in the title into a 64-dimensional vector
title_features = layers.Embedding(num_words, 64)(title_input)
# Embed each word in the text into a 64-dimensional vector
body_features = layers.Embedding(num_words, 64)(body_input)

# Reduce sequence of embedded words in the title into a single 128-dimensional vector
title_features = layers.LSTM(128)(title_features)
# Reduce sequence of embedded words in the body into a single 32-dimensional vector
body_features = layers.LSTM(32)(body_features)

# Merge all available features into a single large vector via concatenation
x = layers.concatenate([title_features, body_features, tags_input])

# Stick a logistic regression for priority prediction on top of the features
priority_pred = layers.Dense(1, activation='sigmoid', name='priority')(x)
# Stick a department classifier on top of the features
department_pred = layers.Dense(num_departments, activation='softmax', name='department')(x)

# Instantiate an end-to-end model predicting both priority and department
model = keras.Model(inputs=[title_input, body_input, tags_input],
                    outputs=[priority_pred, department_pred])

モデルをプロットしましょう :

keras.utils.plot_model(model, 'multi_input_and_output_model.png', show_shapes=True)

このモデルをコンパイルするとき、各出力に異なる損失を割り当てることができます。トータルの訓練損失へのそれらの寄与ををモジュール化するために、各損失に異なる重みを割り当てることさえできます。

model.compile(optimizer=keras.optimizers.RMSprop(1e-3),
              loss=['binary_crossentropy', 'categorical_crossentropy'],
              loss_weights=[1., 0.2])

出力層に名前を与えましたので、このように損失を指定することもできるでしょう :

model.compile(optimizer=keras.optimizers.RMSprop(1e-3),
              loss={'priority': 'binary_crossentropy',
                    'department': 'categorical_crossentropy'},
              loss_weights=[1., 0.2])

入力とターゲットの NumPy 配列のリストを渡すことでモデルを訓練できます :

import numpy as np

# Dummy input data
title_data = np.random.randint(num_words, size=(1280, 10))
body_data = np.random.randint(num_words, size=(1280, 100))
tags_data = np.random.randint(2, size=(1280, num_tags)).astype('float32')
# Dummy target data
priority_targets = np.random.random(size=(1280, 1))
dept_targets = np.random.randint(2, size=(1280, num_departments))

model.fit({'title': title_data, 'body': body_data, 'tags': tags_data},
          {'priority': priority_targets, 'department': dept_targets},
          epochs=2,
          batch_size=32)

Epoch 1/2
1280/1280 [==============================] - 11s 9ms/sample - loss: 1.2694 - priority_loss: 0.6984 - department_loss: 2.8547
Epoch 2/2
1280/1280 [==============================] - 11s 9ms/sample - loss: 1.2137 - priority_loss: 0.6489 - department_loss: 2.8242

Dataset オブジェクトで fit を呼び出すとき、それは ([title_data, body_data, tags_data], [priority_targets, dept_targets]) のようなリストのタプルか、({‘title’: title_data, ‘body’: body_data, ‘tags’: tags_data}, {‘priority’: priority_targets, ‘department’: dept_targets}) のような辞書のタプルを yield すべきです。

より詳細な説明については、完全なガイド Guide to Training & Evaluation を参照してください。

resnet モデル

マルチ入力と出力を持つモデルに加えて、Functional API は非線形接続トポロジー、つまり層がシークエンシャルに接続されないモデルを操作することも容易にします、これもまた (名前が示すように) Sequential API で扱えません。

これの一般的なユースケースは residual 接続です。

これを示すために CIFAR10 のための toy ResNet モデルを構築しましょう。

inputs = keras.Input(shape=(32, 32, 3), name='img')
x = layers.Conv2D(32, 3, activation='relu')(inputs)
x = layers.Conv2D(64, 3, activation='relu')(x)
block_1_output = layers.MaxPooling2D(3)(x)

x = layers.Conv2D(64, 3, activation='relu', padding='same')(block_1_output)
x = layers.Conv2D(64, 3, activation='relu', padding='same')(x)
block_2_output = layers.add([x, block_1_output])

x = layers.Conv2D(64, 3, activation='relu', padding='same')(block_2_output)
x = layers.Conv2D(64, 3, activation='relu', padding='same')(x)
block_3_output = layers.add([x, block_2_output])

x = layers.Conv2D(64, 3, activation='relu')(block_3_output)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(256, activation='relu')(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(10, activation='softmax')(x)

model = keras.Model(inputs, outputs, name='toy_resnet')
model.summary()

Model: "toy_resnet"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
img (InputLayer)                [(None, 32, 32, 3)]  0                                            
__________________________________________________________________________________________________
conv2d_8 (Conv2D)               (None, 30, 30, 32)   896         img[0][0]                        
__________________________________________________________________________________________________
conv2d_9 (Conv2D)               (None, 28, 28, 64)   18496       conv2d_8[0][0]                   
__________________________________________________________________________________________________
max_pooling2d_2 (MaxPooling2D)  (None, 9, 9, 64)     0           conv2d_9[0][0]                   
__________________________________________________________________________________________________
conv2d_10 (Conv2D)              (None, 9, 9, 64)     36928       max_pooling2d_2[0][0]            
__________________________________________________________________________________________________
conv2d_11 (Conv2D)              (None, 9, 9, 64)     36928       conv2d_10[0][0]                  
__________________________________________________________________________________________________
add (Add)                       (None, 9, 9, 64)     0           conv2d_11[0][0]                  
                                                                 max_pooling2d_2[0][0]            
__________________________________________________________________________________________________
conv2d_12 (Conv2D)              (None, 9, 9, 64)     36928       add[0][0]                        
__________________________________________________________________________________________________
conv2d_13 (Conv2D)              (None, 9, 9, 64)     36928       conv2d_12[0][0]                  
__________________________________________________________________________________________________
add_1 (Add)                     (None, 9, 9, 64)     0           conv2d_13[0][0]                  
                                                                 add[0][0]                        
__________________________________________________________________________________________________
conv2d_14 (Conv2D)              (None, 7, 7, 64)     36928       add_1[0][0]                      
__________________________________________________________________________________________________
global_average_pooling2d (Globa (None, 64)           0           conv2d_14[0][0]                  
__________________________________________________________________________________________________
dense_9 (Dense)                 (None, 256)          16640       global_average_pooling2d[0][0]   
__________________________________________________________________________________________________
dropout (Dropout)               (None, 256)          0           dense_9[0][0]                    
__________________________________________________________________________________________________
dense_10 (Dense)                (None, 10)           2570        dropout[0][0]                    
==================================================================================================
Total params: 223,242
Trainable params: 223,242
Non-trainable params: 0
__________________________________________________________________________________________________

モデルをプロットしましょう :

keras.utils.plot_model(model, 'mini_resnet.png', show_shapes=True)

それを訓練しましょう :

(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)

model.compile(optimizer=keras.optimizers.RMSprop(1e-3),
              loss='categorical_crossentropy',
              metrics=['acc'])
model.fit(x_train, y_train,
          batch_size=64,
          epochs=1,
          validation_split=0.2)

Train on 40000 samples, validate on 10000 samples
40000/40000 [==============================] - 318s 8ms/sample - loss: 1.9034 - acc: 0.2767 - val_loss: 1.6173 - val_acc: 0.3870

<tensorflow.python.keras.callbacks.History at 0x7f27a93392b0>

層を共有する

functional API のためのもう一つの良いユースケースは共有層を使用するモデルです。共有層は同じモデルで複数回再利用される層インスタンスです : それらは層グラフのマルチパスに対応する特徴を学習します。

共有層は類似空間 (例えば、類似語彙を特徴付けるテキストの 2 つの異なるピース) に由来する入力をエンコードするためにしばしば使用されます、何故ならばそれらは異なる入力に渡る情報の共有を可能にし、そしてそれらはより少ないデータ上でそのようなモデルを訓練することを可能にするからです。与えられた単語が入力の一つであれば、それは共有層を通り抜ける総ての入力の処理に役立つでしょう。

Functional API の層を共有するためには、単に同じ層インスタンスを複数回呼び出すだけです。例えば、ここに2 つの異なるテキスト入力に渡り共有された Embedding 層があります :

# Embedding for 1000 unique words mapped to 128-dimensional vectors
shared_embedding = layers.Embedding(1000, 128)

# Variable-length sequence of integers
text_input_a = keras.Input(shape=(None,), dtype='int32')

# Variable-length sequence of integers
text_input_b = keras.Input(shape=(None,), dtype='int32')

# We reuse the same layer to encode both inputs
encoded_input_a = shared_embedding(text_input_a)
encoded_input_b = shared_embedding(text_input_b)

層のグラフのノードを抽出して再利用する

Functional API で貴方が操作する層のグラフは静的データ構造ですので、それはアクセスして調査可能です。これは Functional モデルをどのようにプロットできるかです、例えば。

これはまた中間層 (グラフの「ノード」) の活性にアクセスしてそれらを他の場所で再利用できることも意味します。これは特徴抽出のために極めて有用です、例えば！

サンプルを見てみましょう。これは ImageNet 上で事前訓練された重みを持つ VGG19 モデルです :

from tensorflow.keras.applications import VGG19

vgg19 = VGG19()

Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg19_weights_tf_dim_ordering_tf_kernels.h5
574717952/574710816 [==============================] - 6s 0us/step

そしてこれらはグラフデータ構造に問い合わせて得られた、モデルの中間的な活性です。

features_list = [layer.output for layer in vgg19.layers]

私達は新しい特徴抽出モデルを作成するためにこれらの特徴を使用できます、それは中間層の活性の値を返します — そしてこの総てを 3 行で行なうことができます。

feat_extraction_model = keras.Model(inputs=vgg19.input, outputs=features_list)

img = np.random.random((1, 224, 224, 3)).astype('float32')
extracted_features = feat_extraction_model(img)

他のものの中では、ニューラルスタイルトランスファーを実装するときこれは役立ちます。

カスタム層を書いて API を拡張する

tf.keras は広範囲の組み込み層を持ちます。幾つかの例がここにあります :

畳み込み層: Conv1D, Conv2D, Conv3D, Conv2DTranspose, etc.
Pooling 層: MaxPooling1D, MaxPooling2D, MaxPooling3D, AveragePooling1D, etc.
RNN 層: GRU, LSTM, ConvLSTM2D, etc.
BatchNormalization, Dropout, Embedding, etc.

貴方が必要なものを見つけられないならば、貴方自身の層を作成して API を拡張することは容易です。

総ての層は Layer クラスをサブクラス化して次を実装します :- call メソッド、層により行われる計算を指定します。- build メソッド、層の重みを作成します (これは単にスタイル慣習であることに注意してください ; __init__ で重みを作成しても良いでしょう)。

スクラッチから層を作成することについて更に学習するためには、ガイド Guide to writing layers and models from scratch を調べてください。

ここに Dense 層の単純な実装があります :

class CustomDense(layers.Layer):

  def __init__(self, units=32):
    super(CustomDense, self).__init__()
    self.units = units

  def build(self, input_shape):
    self.w = self.add_weight(shape=(input_shape[-1], self.units),
                             initializer='random_normal',
                             trainable=True)
    self.b = self.add_weight(shape=(self.units,),
                             initializer='random_normal',
                             trainable=True)

  def call(self, inputs):
    return tf.matmul(inputs, self.w) + self.b
      
inputs = keras.Input((4,))
outputs = CustomDense(10)(inputs)

model = keras.Model(inputs, outputs)

シリアライゼーションをサポートするカスタム層を作成することを望む場合、get_config メソッドもまた定義するべきです、これは層インスタンスのコンストラクタ引数を返します :

class CustomDense(layers.Layer):

  def __init__(self, units=32):
    super(CustomDense, self).__init__()
    self.units = units

  def build(self, input_shape):
    self.w = self.add_weight(shape=(input_shape[-1], self.units),
                             initializer='random_normal',
                             trainable=True)
    self.b = self.add_weight(shape=(self.units,),
                             initializer='random_normal',
                             trainable=True)

  def call(self, inputs):
    return tf.matmul(inputs, self.w) + self.b

  def get_config(self):
    return {'units': self.units}
    
    
inputs = keras.Input((4,))
outputs = CustomDense(10)(inputs)

model = keras.Model(inputs, outputs)
config = model.get_config()

new_model = keras.Model.from_config(
    config, custom_objects={'CustomDense': CustomDense})

オプションで、クラスメソッド from_config(cls, config) を実装することもできるでしょう、これはその config 辞書が与えられたときに層インスタンスを再作成する責任を負います。from_config のデフォルト実装は :

def from_config(cls, config):
  return cls(**config)

いつ Functional API を使用するか

新しいモデルを作成するために Functional API を使用するか、単に Model クラスを直接的にサブクラス化するかをどのように決めるのでしょう？

一般に、Functional API は高位で、利用するにより簡単で安全で、そしてサブクラス化されたモデルがサポートしない多くの特徴を持ちます。

けれども、層の有向非巡回グラフのように容易には表現できないモデルを作成するとき、モデルのサブクラス化は貴方により大きな柔軟性を与えます (例えば、貴方は Tree-RNN を Functional API では実装できないでしょう、Model を直接的にサブクラス化しなければならないでしょう)。

Functional API の強みがここにあります :

下にリストされる特性は Sequential に対しても総て真ですが (それはまたデータ構造です)、それらはサブクラス化されたモデルについては真ではありません (それは Python バイトコードであり、データ構造ではありません)。

それはより冗長ではありません。(= It is less verbose.)

No super(MyClass, self).__init__(…), no def call(self, …):, etc.

以下を :

inputs = keras.Input(shape=(32,))
x = layers.Dense(64, activation='relu')(inputs)
outputs = layers.Dense(10)(x)
mlp = keras.Model(inputs, outputs)

サブクラス化されたバージョンと比較してください :

class MLP(keras.Model):
  
  def __init__(self, **kwargs):
    super(MLP, self).__init__(**kwargs)
    self.dense_1 = layers.Dense(64, activation='relu')
    self.dense_2 = layers.Dense(10)
    
  def call(self, inputs):
    x = self.dense_1(inputs)
    return self.dense_2(x)
 
# Instantiate the model.
mlp = MLP()
# Necessary to create the model's state.
# The model doesn't have a state until it's called at least once.
_ = mlp(tf.zeros((1, 32)))

それは定義する間に貴方のモデルを検証します。

Functional API では、入力仕様 (shape と dtype) が (Input を通して) 前もって作成され、そして層を呼び出すたびに、その層はそれに渡される仕様が仮定に適合するかをチェックして、そうでないならば役立つエラーメッセージをあげます。

これは Functional API で構築できたどのようなモデルも実行されることを保証します。(収束関連のデバッグ以外の) 総てのデバッグはモデル構築時に静的に発生し、実行時ではありません。これはコンパイラの型チェックに類似しています。

貴方の Functional モデルはプロット可能で調査可能

モデルをグラフとしてプロットできます、そしてこのグラフの中間ノードに容易にアクセスできます — 例えば、前の例で見たように中間層の活性を抽出して再利用するためにです。

features_list = [layer.output for layer in vgg19.layers]
feat_extraction_model = keras.Model(inputs=vgg19.input, outputs=features_list)

貴方の Functional モデルはシリアライズとクローン可能

Functional モデルはコードのピースというよりもデータ構造ですから、それは安全にシリアライズ可能で (どのような元のコードにもアクセスすることなく正確に同じモデルを再作成することを可能にする) 単一ファイルにセーブできます。より詳細については saving and serialization guide を見てください。

Functional API の欠点がここにあります :

それは動的アーキテクチャをサポートしません。

Functional API はモデルを層の DAG として扱います。これは殆どの深層学習アーキテクチャに対しては真ですが、総てではありません : 例えば、再帰 (= recursive) ネットワークや Tree RNNs はこの仮定に従いませんそして Functional API では実装できません。

時に、総てを単にスクラッチから書く必要があります。

進んだアーキテクチャを書く時、「層の DAG を定義する」という範囲の外にあることをすることを望むかもしれません : 例えば、貴方のモデル・インスタンス上の複数のカスタム訓練と推論メソッドを公開することを望むかもしれません。これはサブクラス化を必要とします。

Functional API とモデルのサブクラス化の間の違いへより深く潜るために、What are Symbolic and Imperative APIs in TensorFlow 2.0? を読むことができます。

異なる API スタイルを上手く組み合わせる

重要なことは、Functional API かモデルのサブクラス化の間を選択することはモデルの一つのカテゴリに貴方を制限する二者択一ではありません。tf.keras API の総てのモデルはそれらが Sequential モデルか、Functional モデルか、あるいはスクラッチから書かれたサブクラス化されたモデル/層であろうと、各々と相互作用できます。

サブクラス化されたモデル/層の一部として Functional モデルや Sequential モデルを常に使用できます :

units = 32
timesteps = 10
input_dim = 5

# Define a Functional model
inputs = keras.Input((None, units))
x = layers.GlobalAveragePooling1D()(inputs)
outputs = layers.Dense(1, activation='sigmoid')(x)
model = keras.Model(inputs, outputs)


class CustomRNN(layers.Layer):

  def __init__(self):
    super(CustomRNN, self).__init__()
    self.units = units
    self.projection_1 = layers.Dense(units=units, activation='tanh')
    self.projection_2 = layers.Dense(units=units, activation='tanh')
    # Our previously-defined Functional model
    self.classifier = model

  def call(self, inputs):
    outputs = []
    state = tf.zeros(shape=(inputs.shape[0], self.units))
    for t in range(inputs.shape[1]):
      x = inputs[:, t, :]
      h = self.projection_1(x)
      y = h + self.projection_2(state)
      state = y
      outputs.append(y)
    features = tf.stack(outputs, axis=1)
    print(features.shape)
    return self.classifier(features)

rnn_model = CustomRNN()
_ = rnn_model(tf.zeros((1, timesteps, input_dim)))

(1, 10, 32)

反対に、Functional API でサブクラス化された層やモデルをそれが (次のパターンの一つに従う) call メソッドを実装する限りは使用することもできます :

call(self, inputs, **kwargs)、ここで inputs は tensor か tensor のネスト構造 (e.g. tensor のリスト) で、**kwargs は非 tensor 引数 (非 inputs) です。
call(self, inputs, training=None, **kwargs)、ここで training は層が訓練モードか推論モードで動作するべきかを示すブーリアンです。
call(self, inputs, mask=None, **kwargs)、ここで mask はブーリアンのマスク tensor です (推論のために、RNN に有用です)。
call(self, inputs, training=None, mask=None, **kwargs) — もちろんマスキングと訓練固有の動作の両者を同時に持つことができます。

加えて、貴方のカスタム層やモデルで get_config メソッドを実装する場合、貴方がそれで作成した Functional モデルは依然としてシリアライズ可能でクローン可能です。

ここに Functional モデルでスクラッチから書かれたカスタム RNN を使用する簡単なサンプルがあります :

units = 32
timesteps = 10
input_dim = 5
batch_size = 16


class CustomRNN(layers.Layer):

  def __init__(self):
    super(CustomRNN, self).__init__()
    self.units = units
    self.projection_1 = layers.Dense(units=units, activation='tanh')
    self.projection_2 = layers.Dense(units=units, activation='tanh')
    self.classifier = layers.Dense(1, activation='sigmoid')

  def call(self, inputs):
    outputs = []
    state = tf.zeros(shape=(inputs.shape[0], self.units))
    for t in range(inputs.shape[1]):
      x = inputs[:, t, :]
      h = self.projection_1(x)
      y = h + self.projection_2(state)
      state = y
      outputs.append(y)
    features = tf.stack(outputs, axis=1)
    return self.classifier(features)

# Note that we specify a static batch size for the inputs with the `batch_shape`
# arg, because the inner computation of `CustomRNN` requires a static batch size
# (when we create the `state` zeros tensor).
inputs = keras.Input(batch_shape=(batch_size, timesteps, input_dim))
x = layers.Conv1D(32, 3)(inputs)
outputs = CustomRNN()(x)

model = keras.Model(inputs, outputs)

rnn_model = CustomRNN()
_ = rnn_model(tf.zeros((1, 10, 5)))

以上

2019年3月
月	火	水	木	金	土	日
				1	2	3
4	5	6	7	8	9	10
11	12	13	14	15	16	17
18	19	20	21	22	23	24
25	26	27	28	29	30	31