Keras 2 : examples : Image classification with Vision Transformer (translation/commentary)
Translation: ClassCat Co., Ltd., Sales Information
Created: 11/12/2021 (keras 2.6.0)
* This page is a translation of the following Keras documentation, with supplementary notes added where appropriate:
- Code examples : Computer Vision : Image classification with Vision Transformer (Author: Khalid Salama)
* The sample code has been checked to run, but has been modified where necessary.
Keras 2 : examples : Image classification with Vision Transformer
Description: Implementing the Vision Transformer (ViT) model for image classification.
Introduction
This example implements the Vision Transformer (ViT) model by Alexey Dosovitskiy et al. for image classification, and demonstrates it on the CIFAR-100 dataset. The ViT model applies the Transformer architecture with self-attention to sequences of image patches, without using convolution layers.
This example requires TensorFlow 2.4 or higher, as well as TensorFlow Addons, which can be installed using the following command:
pip install -U tensorflow-addons
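TensorFlow Addons provides the AdamW optimizer used later in this example. As a minimal check (not part of the original example) that the required packages are available, you can print the installed versions; this page was produced with keras 2.6.0:
import tensorflow as tf
import tensorflow_addons as tfa
# Both packages should import without errors.
print(tf.__version__, tfa.__version__)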
Setup
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import tensorflow_addons as tfa
Prepare the data
num_classes = 100
input_shape = (32, 32, 3)
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar100.load_data()
print(f"x_train shape: {x_train.shape} - y_train shape: {y_train.shape}")
print(f"x_test shape: {x_test.shape} - y_test shape: {y_test.shape}")
x_train shape: (50000, 32, 32, 3) - y_train shape: (50000, 1)
x_test shape: (10000, 32, 32, 3) - y_test shape: (10000, 1)
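The labels are sparse integer class indices of shape (num_samples, 1) with values in [0, 99], which is why SparseCategoricalCrossentropy and the sparse accuracy metrics are used later. A quick check (not part of the original example):
# CIFAR-100 fine labels: integers from 0 to 99, one per image.
print(y_train.min(), y_train.max())  # 0 99
print(y_train[:3].flatten())         # first few integer labels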
Configure the hyperparameters
learning_rate = 0.001
weight_decay = 0.0001
batch_size = 256
num_epochs = 100
image_size = 72 # We'll resize input images to this size
patch_size = 6  # Size of the patches to be extracted from the input images
num_patches = (image_size // patch_size) ** 2
projection_dim = 64
num_heads = 4
transformer_units = [
projection_dim * 2,
projection_dim,
] # Size of the transformer layers
transformer_layers = 8
mlp_head_units = [2048, 1024] # Size of the dense layers of the final classifier
Use data augmentation
data_augmentation = keras.Sequential(
[
layers.Normalization(),
layers.Resizing(image_size, image_size),
layers.RandomFlip("horizontal"),
layers.RandomRotation(factor=0.02),
layers.RandomZoom(
height_factor=0.2, width_factor=0.2
),
],
name="data_augmentation",
)
# Compute the mean and the variance of the training data for normalization.
data_augmentation.layers[0].adapt(x_train)
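As a quick sanity check (a minimal sketch, not part of the original example), applying the pipeline to a few training images should produce 72x72 RGB tensors; the random layers are only active during training:
# Resizing and Normalization always apply; RandomFlip/RandomRotation/RandomZoom are no-ops at inference time.
augmented_batch = data_augmentation(x_train[:8])
print(augmented_batch.shape)  # (8, 72, 72, 3)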
Implement multilayer perceptron (MLP)
def mlp(x, hidden_units, dropout_rate):
for units in hidden_units:
x = layers.Dense(units, activation=tf.nn.gelu)(x)
x = layers.Dropout(dropout_rate)(x)
return x
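As an illustrative shape check (assuming the hyperparameters defined above), passing a [batch, num_patches, projection_dim] tensor through mlp with transformer_units = [128, 64] keeps the first two axes and maps the feature axis back to projection_dim:
# Minimal sketch: the Dense layers act on the last axis only.
dummy = layers.Input(shape=(num_patches, projection_dim))
mlp_out = mlp(dummy, hidden_units=transformer_units, dropout_rate=0.1)
print(mlp_out.shape)  # (None, 144, 64)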
Implement patch creation as a layer
class Patches(layers.Layer):
def __init__(self, patch_size):
super(Patches, self).__init__()
self.patch_size = patch_size
def call(self, images):
batch_size = tf.shape(images)[0]
patches = tf.image.extract_patches(
images=images,
sizes=[1, self.patch_size, self.patch_size, 1],
strides=[1, self.patch_size, self.patch_size, 1],
rates=[1, 1, 1, 1],
padding="VALID",
)
patch_dims = patches.shape[-1]
patches = tf.reshape(patches, [batch_size, -1, patch_dims])
return patches
Let's display patches for a sample image
import matplotlib.pyplot as plt
plt.figure(figsize=(4, 4))
image = x_train[np.random.choice(range(x_train.shape[0]))]
plt.imshow(image.astype("uint8"))
plt.axis("off")
resized_image = tf.image.resize(
tf.convert_to_tensor([image]), size=(image_size, image_size)
)
patches = Patches(patch_size)(resized_image)
print(f"Image size: {image_size} X {image_size}")
print(f"Patch size: {patch_size} X {patch_size}")
print(f"Patches per image: {patches.shape[1]}")
print(f"Elements per patch: {patches.shape[-1]}")
n = int(np.sqrt(patches.shape[1]))
plt.figure(figsize=(4, 4))
for i, patch in enumerate(patches[0]):
ax = plt.subplot(n, n, i + 1)
patch_img = tf.reshape(patch, (patch_size, patch_size, 3))
plt.imshow(patch_img.numpy().astype("uint8"))
plt.axis("off")
Image size: 72 X 72
Patch size: 6 X 6
Patches per image: 144
Elements per patch: 108
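These numbers follow directly from the hyperparameters: 144 patches per image is (image_size / patch_size)^2 = (72 / 6)^2, and 108 elements per patch is patch_size * patch_size * 3 = 6 * 6 * 3 RGB values. A quick check (not part of the original example):
assert patches.shape[1] == (image_size // patch_size) ** 2  # 144 patches per image
assert patches.shape[-1] == patch_size * patch_size * 3     # 108 values per flattened patch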
Implement the patch encoding layer
The PatchEncoder layer linearly transforms a patch by projecting it into a vector of size projection_dim. In addition, it adds a learnable position embedding to the projected vector.
class PatchEncoder(layers.Layer):
def __init__(self, num_patches, projection_dim):
super(PatchEncoder, self).__init__()
self.num_patches = num_patches
self.projection = layers.Dense(units=projection_dim)
self.position_embedding = layers.Embedding(
input_dim=num_patches, output_dim=projection_dim
)
def call(self, patch):
positions = tf.range(start=0, limit=self.num_patches, delta=1)
encoded = self.projection(patch) + self.position_embedding(positions)
return encoded
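As an illustrative shape check (not part of the original example), encoding the patches of the sample image above projects each 108-dimensional patch to projection_dim while keeping the sequence length:
# patches has shape (1, 144, 108); after projection + position embedding: (1, 144, 64).
encoded = PatchEncoder(num_patches, projection_dim)(patches)
print(encoded.shape)  # (1, 144, 64)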
Build the ViT model
The ViT model consists of multiple Transformer blocks, which use the layers.MultiHeadAttention layer as a self-attention mechanism applied to the sequence of patches. The Transformer blocks produce a [batch_size, num_patches, projection_dim] tensor, which is processed via a classifier head with softmax to produce the final class probabilities.
Unlike the technique described in the paper, which prepends a learnable embedding to the sequence of encoded patches to serve as the image representation, all the outputs of the final Transformer block are reshaped with layers.Flatten() and used as the image representation input to the classifier head. Note that the layers.GlobalAveragePooling1D layer could be used instead to aggregate the outputs of the Transformer block, especially when the number of patches and the projection dimensions are large.
def create_vit_classifier():
inputs = layers.Input(shape=input_shape)
# Augment data.
augmented = data_augmentation(inputs)
# Create patches.
patches = Patches(patch_size)(augmented)
# Encode patches.
encoded_patches = PatchEncoder(num_patches, projection_dim)(patches)
# Create multiple layers of the Transformer block.
for _ in range(transformer_layers):
# Layer normalization 1.
x1 = layers.LayerNormalization(epsilon=1e-6)(encoded_patches)
# Create a multi-head attention layer.
attention_output = layers.MultiHeadAttention(
num_heads=num_heads, key_dim=projection_dim, dropout=0.1
)(x1, x1)
# Skip connection 1.
x2 = layers.Add()([attention_output, encoded_patches])
# Layer normalization 2.
x3 = layers.LayerNormalization(epsilon=1e-6)(x2)
# MLP.
x3 = mlp(x3, hidden_units=transformer_units, dropout_rate=0.1)
# Skip connection 2.
encoded_patches = layers.Add()([x3, x2])
# Create a [batch_size, projection_dim] tensor.
representation = layers.LayerNormalization(epsilon=1e-6)(encoded_patches)
representation = layers.Flatten()(representation)
representation = layers.Dropout(0.5)(representation)
# Add MLP.
features = mlp(representation, hidden_units=mlp_head_units, dropout_rate=0.5)
# Classify outputs.
logits = layers.Dense(num_classes)(features)
# Create the Keras model.
model = keras.Model(inputs=inputs, outputs=logits)
return model
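As a minimal sanity check (a sketch, not part of the original example; demo_model is a throwaway instance), the classifier should map a batch of 32x32x3 CIFAR-100 images to one logit per class:
demo_model = create_vit_classifier()
print(demo_model(x_train[:2]).shape)  # (2, 100) -- one logit per CIFAR-100 class
# demo_model.summary() would list the 8 Transformer blocks and the MLP head.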
Compile, train, and evaluate the model
def run_experiment(model):
optimizer = tfa.optimizers.AdamW(
learning_rate=learning_rate, weight_decay=weight_decay
)
model.compile(
optimizer=optimizer,
loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
metrics=[
keras.metrics.SparseCategoricalAccuracy(name="accuracy"),
keras.metrics.SparseTopKCategoricalAccuracy(5, name="top-5-accuracy"),
],
)
checkpoint_filepath = "/tmp/checkpoint"
checkpoint_callback = keras.callbacks.ModelCheckpoint(
checkpoint_filepath,
monitor="val_accuracy",
save_best_only=True,
save_weights_only=True,
)
history = model.fit(
x=x_train,
y=y_train,
batch_size=batch_size,
epochs=num_epochs,
validation_split=0.1,
callbacks=[checkpoint_callback],
)
model.load_weights(checkpoint_filepath)
_, accuracy, top_5_accuracy = model.evaluate(x_test, y_test)
print(f"Test accuracy: {round(accuracy * 100, 2)}%")
print(f"Test top 5 accuracy: {round(top_5_accuracy * 100, 2)}%")
return history
vit_classifier = create_vit_classifier()
history = run_experiment(vit_classifier)
Epoch 1/100 176/176 [==============================] - 33s 136ms/step - loss: 4.8863 - accuracy: 0.0294 - top-5-accuracy: 0.1117 - val_loss: 3.9661 - val_accuracy: 0.0992 - val_top-5-accuracy: 0.3056 Epoch 2/100 176/176 [==============================] - 22s 127ms/step - loss: 4.0162 - accuracy: 0.0865 - top-5-accuracy: 0.2683 - val_loss: 3.5691 - val_accuracy: 0.1630 - val_top-5-accuracy: 0.4226 Epoch 3/100 176/176 [==============================] - 22s 127ms/step - loss: 3.7313 - accuracy: 0.1254 - top-5-accuracy: 0.3535 - val_loss: 3.3455 - val_accuracy: 0.1976 - val_top-5-accuracy: 0.4756 Epoch 4/100 176/176 [==============================] - 23s 128ms/step - loss: 3.5411 - accuracy: 0.1541 - top-5-accuracy: 0.4121 - val_loss: 3.1925 - val_accuracy: 0.2274 - val_top-5-accuracy: 0.5126 Epoch 5/100 176/176 [==============================] - 22s 127ms/step - loss: 3.3749 - accuracy: 0.1847 - top-5-accuracy: 0.4572 - val_loss: 3.1043 - val_accuracy: 0.2388 - val_top-5-accuracy: 0.5320 Epoch 6/100 176/176 [==============================] - 22s 127ms/step - loss: 3.2589 - accuracy: 0.2057 - top-5-accuracy: 0.4906 - val_loss: 2.9319 - val_accuracy: 0.2782 - val_top-5-accuracy: 0.5756 Epoch 7/100 176/176 [==============================] - 22s 127ms/step - loss: 3.1165 - accuracy: 0.2331 - top-5-accuracy: 0.5273 - val_loss: 2.8072 - val_accuracy: 0.2972 - val_top-5-accuracy: 0.5946 Epoch 8/100 176/176 [==============================] - 22s 127ms/step - loss: 2.9902 - accuracy: 0.2563 - top-5-accuracy: 0.5556 - val_loss: 2.7207 - val_accuracy: 0.3188 - val_top-5-accuracy: 0.6258 Epoch 9/100 176/176 [==============================] - 22s 127ms/step - loss: 2.8828 - accuracy: 0.2800 - top-5-accuracy: 0.5827 - val_loss: 2.6396 - val_accuracy: 0.3244 - val_top-5-accuracy: 0.6402 Epoch 10/100 176/176 [==============================] - 23s 128ms/step - loss: 2.7824 - accuracy: 0.2997 - top-5-accuracy: 0.6110 - val_loss: 2.5580 - val_accuracy: 0.3494 - val_top-5-accuracy: 0.6568 Epoch 11/100 176/176 [==============================] - 23s 130ms/step - loss: 2.6743 - accuracy: 0.3209 - top-5-accuracy: 0.6333 - val_loss: 2.5000 - val_accuracy: 0.3594 - val_top-5-accuracy: 0.6726 Epoch 12/100 176/176 [==============================] - 23s 130ms/step - loss: 2.5800 - accuracy: 0.3431 - top-5-accuracy: 0.6522 - val_loss: 2.3900 - val_accuracy: 0.3798 - val_top-5-accuracy: 0.6878 Epoch 13/100 176/176 [==============================] - 23s 128ms/step - loss: 2.5019 - accuracy: 0.3559 - top-5-accuracy: 0.6671 - val_loss: 2.3464 - val_accuracy: 0.3960 - val_top-5-accuracy: 0.7002 Epoch 14/100 176/176 [==============================] - 22s 128ms/step - loss: 2.4207 - accuracy: 0.3728 - top-5-accuracy: 0.6905 - val_loss: 2.3130 - val_accuracy: 0.4032 - val_top-5-accuracy: 0.7040 Epoch 15/100 176/176 [==============================] - 23s 128ms/step - loss: 2.3371 - accuracy: 0.3932 - top-5-accuracy: 0.7093 - val_loss: 2.2447 - val_accuracy: 0.4136 - val_top-5-accuracy: 0.7202 Epoch 16/100 176/176 [==============================] - 23s 128ms/step - loss: 2.2650 - accuracy: 0.4077 - top-5-accuracy: 0.7201 - val_loss: 2.2101 - val_accuracy: 0.4222 - val_top-5-accuracy: 0.7246 Epoch 17/100 176/176 [==============================] - 22s 127ms/step - loss: 2.1822 - accuracy: 0.4204 - top-5-accuracy: 0.7376 - val_loss: 2.1446 - val_accuracy: 0.4344 - val_top-5-accuracy: 0.7416 Epoch 18/100 176/176 [==============================] - 22s 128ms/step - loss: 2.1485 - accuracy: 0.4284 - top-5-accuracy: 0.7476 - val_loss: 
2.1094 - val_accuracy: 0.4432 - val_top-5-accuracy: 0.7454 Epoch 19/100 176/176 [==============================] - 22s 128ms/step - loss: 2.0717 - accuracy: 0.4464 - top-5-accuracy: 0.7618 - val_loss: 2.0718 - val_accuracy: 0.4584 - val_top-5-accuracy: 0.7570 Epoch 20/100 176/176 [==============================] - 22s 127ms/step - loss: 2.0031 - accuracy: 0.4605 - top-5-accuracy: 0.7731 - val_loss: 2.0286 - val_accuracy: 0.4610 - val_top-5-accuracy: 0.7654 Epoch 21/100 176/176 [==============================] - 22s 127ms/step - loss: 1.9650 - accuracy: 0.4700 - top-5-accuracy: 0.7820 - val_loss: 2.0225 - val_accuracy: 0.4642 - val_top-5-accuracy: 0.7628 Epoch 22/100 176/176 [==============================] - 22s 127ms/step - loss: 1.9066 - accuracy: 0.4839 - top-5-accuracy: 0.7904 - val_loss: 1.9961 - val_accuracy: 0.4746 - val_top-5-accuracy: 0.7656 Epoch 23/100 176/176 [==============================] - 22s 127ms/step - loss: 1.8564 - accuracy: 0.4952 - top-5-accuracy: 0.8030 - val_loss: 1.9769 - val_accuracy: 0.4828 - val_top-5-accuracy: 0.7742 Epoch 24/100 176/176 [==============================] - 22s 128ms/step - loss: 1.8167 - accuracy: 0.5034 - top-5-accuracy: 0.8099 - val_loss: 1.9730 - val_accuracy: 0.4766 - val_top-5-accuracy: 0.7728 Epoch 25/100 176/176 [==============================] - 22s 128ms/step - loss: 1.7788 - accuracy: 0.5124 - top-5-accuracy: 0.8174 - val_loss: 1.9187 - val_accuracy: 0.4926 - val_top-5-accuracy: 0.7854 Epoch 26/100 176/176 [==============================] - 23s 128ms/step - loss: 1.7437 - accuracy: 0.5187 - top-5-accuracy: 0.8206 - val_loss: 1.9732 - val_accuracy: 0.4792 - val_top-5-accuracy: 0.7772 Epoch 27/100 176/176 [==============================] - 23s 128ms/step - loss: 1.6929 - accuracy: 0.5300 - top-5-accuracy: 0.8287 - val_loss: 1.9109 - val_accuracy: 0.4928 - val_top-5-accuracy: 0.7912 Epoch 28/100 176/176 [==============================] - 23s 129ms/step - loss: 1.6647 - accuracy: 0.5400 - top-5-accuracy: 0.8362 - val_loss: 1.9031 - val_accuracy: 0.4984 - val_top-5-accuracy: 0.7824 Epoch 29/100 176/176 [==============================] - 23s 129ms/step - loss: 1.6295 - accuracy: 0.5488 - top-5-accuracy: 0.8402 - val_loss: 1.8744 - val_accuracy: 0.4982 - val_top-5-accuracy: 0.7910 Epoch 30/100 176/176 [==============================] - 22s 128ms/step - loss: 1.5860 - accuracy: 0.5548 - top-5-accuracy: 0.8504 - val_loss: 1.8551 - val_accuracy: 0.5108 - val_top-5-accuracy: 0.7946 Epoch 31/100 176/176 [==============================] - 22s 127ms/step - loss: 1.5666 - accuracy: 0.5614 - top-5-accuracy: 0.8548 - val_loss: 1.8720 - val_accuracy: 0.5076 - val_top-5-accuracy: 0.7960 Epoch 32/100 176/176 [==============================] - 22s 127ms/step - loss: 1.5272 - accuracy: 0.5712 - top-5-accuracy: 0.8596 - val_loss: 1.8840 - val_accuracy: 0.5106 - val_top-5-accuracy: 0.7966 Epoch 33/100 176/176 [==============================] - 22s 128ms/step - loss: 1.4995 - accuracy: 0.5779 - top-5-accuracy: 0.8651 - val_loss: 1.8660 - val_accuracy: 0.5116 - val_top-5-accuracy: 0.7904 Epoch 34/100 176/176 [==============================] - 22s 128ms/step - loss: 1.4686 - accuracy: 0.5849 - top-5-accuracy: 0.8685 - val_loss: 1.8544 - val_accuracy: 0.5126 - val_top-5-accuracy: 0.7954 Epoch 35/100 176/176 [==============================] - 22s 127ms/step - loss: 1.4276 - accuracy: 0.5992 - top-5-accuracy: 0.8743 - val_loss: 1.8497 - val_accuracy: 0.5164 - val_top-5-accuracy: 0.7990 Epoch 36/100 176/176 [==============================] - 22s 127ms/step - loss: 
1.4102 - accuracy: 0.5970 - top-5-accuracy: 0.8768 - val_loss: 1.8496 - val_accuracy: 0.5198 - val_top-5-accuracy: 0.7948 Epoch 37/100 176/176 [==============================] - 22s 126ms/step - loss: 1.3800 - accuracy: 0.6112 - top-5-accuracy: 0.8814 - val_loss: 1.8033 - val_accuracy: 0.5284 - val_top-5-accuracy: 0.8068 Epoch 38/100 176/176 [==============================] - 22s 126ms/step - loss: 1.3500 - accuracy: 0.6103 - top-5-accuracy: 0.8862 - val_loss: 1.8092 - val_accuracy: 0.5214 - val_top-5-accuracy: 0.8128 Epoch 39/100 176/176 [==============================] - 22s 127ms/step - loss: 1.3575 - accuracy: 0.6127 - top-5-accuracy: 0.8857 - val_loss: 1.8175 - val_accuracy: 0.5198 - val_top-5-accuracy: 0.8086 Epoch 40/100 176/176 [==============================] - 22s 126ms/step - loss: 1.3030 - accuracy: 0.6283 - top-5-accuracy: 0.8927 - val_loss: 1.8361 - val_accuracy: 0.5170 - val_top-5-accuracy: 0.8056 Epoch 41/100 176/176 [==============================] - 22s 125ms/step - loss: 1.3160 - accuracy: 0.6247 - top-5-accuracy: 0.8923 - val_loss: 1.8074 - val_accuracy: 0.5260 - val_top-5-accuracy: 0.8082 Epoch 42/100 176/176 [==============================] - 22s 126ms/step - loss: 1.2679 - accuracy: 0.6329 - top-5-accuracy: 0.9002 - val_loss: 1.8430 - val_accuracy: 0.5244 - val_top-5-accuracy: 0.8100 Epoch 43/100 176/176 [==============================] - 22s 126ms/step - loss: 1.2514 - accuracy: 0.6375 - top-5-accuracy: 0.9034 - val_loss: 1.8318 - val_accuracy: 0.5196 - val_top-5-accuracy: 0.8034 Epoch 44/100 176/176 [==============================] - 22s 126ms/step - loss: 1.2311 - accuracy: 0.6431 - top-5-accuracy: 0.9067 - val_loss: 1.8283 - val_accuracy: 0.5218 - val_top-5-accuracy: 0.8050 Epoch 45/100 176/176 [==============================] - 22s 125ms/step - loss: 1.2073 - accuracy: 0.6484 - top-5-accuracy: 0.9098 - val_loss: 1.8384 - val_accuracy: 0.5302 - val_top-5-accuracy: 0.8056 Epoch 46/100 176/176 [==============================] - 22s 125ms/step - loss: 1.1775 - accuracy: 0.6558 - top-5-accuracy: 0.9117 - val_loss: 1.8409 - val_accuracy: 0.5294 - val_top-5-accuracy: 0.8078 Epoch 47/100 176/176 [==============================] - 22s 126ms/step - loss: 1.1891 - accuracy: 0.6563 - top-5-accuracy: 0.9103 - val_loss: 1.8167 - val_accuracy: 0.5346 - val_top-5-accuracy: 0.8142 Epoch 48/100 176/176 [==============================] - 22s 127ms/step - loss: 1.1586 - accuracy: 0.6621 - top-5-accuracy: 0.9161 - val_loss: 1.8285 - val_accuracy: 0.5314 - val_top-5-accuracy: 0.8086 Epoch 49/100 176/176 [==============================] - 22s 126ms/step - loss: 1.1586 - accuracy: 0.6634 - top-5-accuracy: 0.9154 - val_loss: 1.8189 - val_accuracy: 0.5366 - val_top-5-accuracy: 0.8134 Epoch 50/100 176/176 [==============================] - 22s 126ms/step - loss: 1.1306 - accuracy: 0.6682 - top-5-accuracy: 0.9199 - val_loss: 1.8442 - val_accuracy: 0.5254 - val_top-5-accuracy: 0.8096 Epoch 51/100 176/176 [==============================] - 22s 126ms/step - loss: 1.1175 - accuracy: 0.6708 - top-5-accuracy: 0.9227 - val_loss: 1.8513 - val_accuracy: 0.5230 - val_top-5-accuracy: 0.8104 Epoch 52/100 176/176 [==============================] - 22s 126ms/step - loss: 1.1104 - accuracy: 0.6743 - top-5-accuracy: 0.9226 - val_loss: 1.8041 - val_accuracy: 0.5332 - val_top-5-accuracy: 0.8142 Epoch 53/100 176/176 [==============================] - 22s 127ms/step - loss: 1.0914 - accuracy: 0.6809 - top-5-accuracy: 0.9236 - val_loss: 1.8213 - val_accuracy: 0.5342 - val_top-5-accuracy: 0.8094 Epoch 54/100 
176/176 [==============================] - 22s 126ms/step - loss: 1.0681 - accuracy: 0.6856 - top-5-accuracy: 0.9270 - val_loss: 1.8429 - val_accuracy: 0.5328 - val_top-5-accuracy: 0.8086 Epoch 55/100 176/176 [==============================] - 22s 126ms/step - loss: 1.0625 - accuracy: 0.6862 - top-5-accuracy: 0.9301 - val_loss: 1.8316 - val_accuracy: 0.5364 - val_top-5-accuracy: 0.8090 Epoch 56/100 176/176 [==============================] - 22s 127ms/step - loss: 1.0474 - accuracy: 0.6920 - top-5-accuracy: 0.9308 - val_loss: 1.8310 - val_accuracy: 0.5440 - val_top-5-accuracy: 0.8132 Epoch 57/100 176/176 [==============================] - 22s 127ms/step - loss: 1.0381 - accuracy: 0.6974 - top-5-accuracy: 0.9297 - val_loss: 1.8447 - val_accuracy: 0.5368 - val_top-5-accuracy: 0.8126 Epoch 58/100 176/176 [==============================] - 22s 126ms/step - loss: 1.0230 - accuracy: 0.7011 - top-5-accuracy: 0.9341 - val_loss: 1.8241 - val_accuracy: 0.5418 - val_top-5-accuracy: 0.8094 Epoch 59/100 176/176 [==============================] - 22s 127ms/step - loss: 1.0113 - accuracy: 0.7023 - top-5-accuracy: 0.9361 - val_loss: 1.8216 - val_accuracy: 0.5380 - val_top-5-accuracy: 0.8134 Epoch 60/100 176/176 [==============================] - 22s 126ms/step - loss: 0.9953 - accuracy: 0.7031 - top-5-accuracy: 0.9386 - val_loss: 1.8356 - val_accuracy: 0.5422 - val_top-5-accuracy: 0.8122 Epoch 61/100 176/176 [==============================] - 22s 126ms/step - loss: 0.9928 - accuracy: 0.7084 - top-5-accuracy: 0.9375 - val_loss: 1.8514 - val_accuracy: 0.5342 - val_top-5-accuracy: 0.8182 Epoch 62/100 176/176 [==============================] - 22s 126ms/step - loss: 0.9740 - accuracy: 0.7121 - top-5-accuracy: 0.9387 - val_loss: 1.8674 - val_accuracy: 0.5366 - val_top-5-accuracy: 0.8092 Epoch 63/100 176/176 [==============================] - 22s 126ms/step - loss: 0.9742 - accuracy: 0.7112 - top-5-accuracy: 0.9413 - val_loss: 1.8274 - val_accuracy: 0.5414 - val_top-5-accuracy: 0.8144 Epoch 64/100 176/176 [==============================] - 22s 126ms/step - loss: 0.9633 - accuracy: 0.7147 - top-5-accuracy: 0.9393 - val_loss: 1.8250 - val_accuracy: 0.5434 - val_top-5-accuracy: 0.8180 Epoch 65/100 176/176 [==============================] - 22s 126ms/step - loss: 0.9407 - accuracy: 0.7221 - top-5-accuracy: 0.9444 - val_loss: 1.8456 - val_accuracy: 0.5424 - val_top-5-accuracy: 0.8120 Epoch 66/100 176/176 [==============================] - 22s 126ms/step - loss: 0.9410 - accuracy: 0.7194 - top-5-accuracy: 0.9447 - val_loss: 1.8559 - val_accuracy: 0.5460 - val_top-5-accuracy: 0.8144 Epoch 67/100 176/176 [==============================] - 22s 126ms/step - loss: 0.9359 - accuracy: 0.7252 - top-5-accuracy: 0.9421 - val_loss: 1.8352 - val_accuracy: 0.5458 - val_top-5-accuracy: 0.8110 Epoch 68/100 176/176 [==============================] - 22s 126ms/step - loss: 0.9232 - accuracy: 0.7254 - top-5-accuracy: 0.9460 - val_loss: 1.8479 - val_accuracy: 0.5444 - val_top-5-accuracy: 0.8132 Epoch 69/100 176/176 [==============================] - 22s 126ms/step - loss: 0.9138 - accuracy: 0.7283 - top-5-accuracy: 0.9456 - val_loss: 1.8697 - val_accuracy: 0.5312 - val_top-5-accuracy: 0.8052 Epoch 70/100 176/176 [==============================] - 22s 126ms/step - loss: 0.9095 - accuracy: 0.7295 - top-5-accuracy: 0.9478 - val_loss: 1.8550 - val_accuracy: 0.5376 - val_top-5-accuracy: 0.8170 Epoch 71/100 176/176 [==============================] - 22s 126ms/step - loss: 0.8945 - accuracy: 0.7332 - top-5-accuracy: 0.9504 - val_loss: 1.8286 - 
val_accuracy: 0.5436 - val_top-5-accuracy: 0.8198 Epoch 72/100 176/176 [==============================] - 22s 125ms/step - loss: 0.8936 - accuracy: 0.7344 - top-5-accuracy: 0.9479 - val_loss: 1.8727 - val_accuracy: 0.5438 - val_top-5-accuracy: 0.8182 Epoch 73/100 176/176 [==============================] - 22s 126ms/step - loss: 0.8775 - accuracy: 0.7355 - top-5-accuracy: 0.9510 - val_loss: 1.8522 - val_accuracy: 0.5404 - val_top-5-accuracy: 0.8170 Epoch 74/100 176/176 [==============================] - 22s 126ms/step - loss: 0.8660 - accuracy: 0.7390 - top-5-accuracy: 0.9513 - val_loss: 1.8432 - val_accuracy: 0.5448 - val_top-5-accuracy: 0.8156 Epoch 75/100 176/176 [==============================] - 22s 126ms/step - loss: 0.8583 - accuracy: 0.7441 - top-5-accuracy: 0.9532 - val_loss: 1.8419 - val_accuracy: 0.5462 - val_top-5-accuracy: 0.8226 Epoch 76/100 176/176 [==============================] - 22s 126ms/step - loss: 0.8549 - accuracy: 0.7443 - top-5-accuracy: 0.9529 - val_loss: 1.8757 - val_accuracy: 0.5454 - val_top-5-accuracy: 0.8086 Epoch 77/100 176/176 [==============================] - 22s 125ms/step - loss: 0.8578 - accuracy: 0.7384 - top-5-accuracy: 0.9531 - val_loss: 1.9051 - val_accuracy: 0.5462 - val_top-5-accuracy: 0.8136 Epoch 78/100 176/176 [==============================] - 22s 125ms/step - loss: 0.8530 - accuracy: 0.7442 - top-5-accuracy: 0.9526 - val_loss: 1.8496 - val_accuracy: 0.5384 - val_top-5-accuracy: 0.8124 Epoch 79/100 176/176 [==============================] - 22s 125ms/step - loss: 0.8403 - accuracy: 0.7485 - top-5-accuracy: 0.9542 - val_loss: 1.8701 - val_accuracy: 0.5550 - val_top-5-accuracy: 0.8228 Epoch 80/100 176/176 [==============================] - 22s 126ms/step - loss: 0.8410 - accuracy: 0.7491 - top-5-accuracy: 0.9538 - val_loss: 1.8737 - val_accuracy: 0.5502 - val_top-5-accuracy: 0.8150 Epoch 81/100 176/176 [==============================] - 22s 126ms/step - loss: 0.8275 - accuracy: 0.7547 - top-5-accuracy: 0.9532 - val_loss: 1.8391 - val_accuracy: 0.5534 - val_top-5-accuracy: 0.8156 Epoch 82/100 176/176 [==============================] - 22s 125ms/step - loss: 0.8221 - accuracy: 0.7528 - top-5-accuracy: 0.9562 - val_loss: 1.8775 - val_accuracy: 0.5428 - val_top-5-accuracy: 0.8120 Epoch 83/100 176/176 [==============================] - 22s 125ms/step - loss: 0.8270 - accuracy: 0.7526 - top-5-accuracy: 0.9550 - val_loss: 1.8464 - val_accuracy: 0.5468 - val_top-5-accuracy: 0.8148 Epoch 84/100 176/176 [==============================] - 22s 126ms/step - loss: 0.8080 - accuracy: 0.7551 - top-5-accuracy: 0.9576 - val_loss: 1.8789 - val_accuracy: 0.5486 - val_top-5-accuracy: 0.8204 Epoch 85/100 176/176 [==============================] - 22s 125ms/step - loss: 0.8058 - accuracy: 0.7593 - top-5-accuracy: 0.9573 - val_loss: 1.8691 - val_accuracy: 0.5446 - val_top-5-accuracy: 0.8156 Epoch 86/100 176/176 [==============================] - 22s 126ms/step - loss: 0.8092 - accuracy: 0.7564 - top-5-accuracy: 0.9560 - val_loss: 1.8588 - val_accuracy: 0.5524 - val_top-5-accuracy: 0.8172 Epoch 87/100 176/176 [==============================] - 22s 125ms/step - loss: 0.7897 - accuracy: 0.7613 - top-5-accuracy: 0.9604 - val_loss: 1.8649 - val_accuracy: 0.5490 - val_top-5-accuracy: 0.8166 Epoch 88/100 176/176 [==============================] - 22s 126ms/step - loss: 0.7890 - accuracy: 0.7635 - top-5-accuracy: 0.9598 - val_loss: 1.9060 - val_accuracy: 0.5446 - val_top-5-accuracy: 0.8112 Epoch 89/100 176/176 [==============================] - 22s 126ms/step - loss: 0.7682 - 
accuracy: 0.7687 - top-5-accuracy: 0.9620 - val_loss: 1.8645 - val_accuracy: 0.5474 - val_top-5-accuracy: 0.8150 Epoch 90/100 176/176 [==============================] - 22s 125ms/step - loss: 0.7958 - accuracy: 0.7617 - top-5-accuracy: 0.9600 - val_loss: 1.8549 - val_accuracy: 0.5496 - val_top-5-accuracy: 0.8140 Epoch 91/100 176/176 [==============================] - 22s 125ms/step - loss: 0.7978 - accuracy: 0.7603 - top-5-accuracy: 0.9590 - val_loss: 1.9169 - val_accuracy: 0.5440 - val_top-5-accuracy: 0.8140 Epoch 92/100 176/176 [==============================] - 22s 125ms/step - loss: 0.7898 - accuracy: 0.7630 - top-5-accuracy: 0.9594 - val_loss: 1.9015 - val_accuracy: 0.5540 - val_top-5-accuracy: 0.8174 Epoch 93/100 176/176 [==============================] - 22s 125ms/step - loss: 0.7550 - accuracy: 0.7722 - top-5-accuracy: 0.9622 - val_loss: 1.9219 - val_accuracy: 0.5410 - val_top-5-accuracy: 0.8098 Epoch 94/100 176/176 [==============================] - 22s 125ms/step - loss: 0.7692 - accuracy: 0.7689 - top-5-accuracy: 0.9599 - val_loss: 1.8928 - val_accuracy: 0.5506 - val_top-5-accuracy: 0.8184 Epoch 95/100 176/176 [==============================] - 22s 126ms/step - loss: 0.7783 - accuracy: 0.7661 - top-5-accuracy: 0.9597 - val_loss: 1.8646 - val_accuracy: 0.5490 - val_top-5-accuracy: 0.8166 Epoch 96/100 176/176 [==============================] - 22s 125ms/step - loss: 0.7547 - accuracy: 0.7711 - top-5-accuracy: 0.9638 - val_loss: 1.9347 - val_accuracy: 0.5484 - val_top-5-accuracy: 0.8150 Epoch 97/100 176/176 [==============================] - 22s 125ms/step - loss: 0.7603 - accuracy: 0.7692 - top-5-accuracy: 0.9616 - val_loss: 1.8966 - val_accuracy: 0.5522 - val_top-5-accuracy: 0.8144 Epoch 98/100 176/176 [==============================] - 22s 125ms/step - loss: 0.7595 - accuracy: 0.7730 - top-5-accuracy: 0.9610 - val_loss: 1.8728 - val_accuracy: 0.5470 - val_top-5-accuracy: 0.8170 Epoch 99/100 176/176 [==============================] - 22s 125ms/step - loss: 0.7542 - accuracy: 0.7736 - top-5-accuracy: 0.9622 - val_loss: 1.9132 - val_accuracy: 0.5504 - val_top-5-accuracy: 0.8156 Epoch 100/100 176/176 [==============================] - 22s 125ms/step - loss: 0.7410 - accuracy: 0.7787 - top-5-accuracy: 0.9635 - val_loss: 1.9233 - val_accuracy: 0.5428 - val_top-5-accuracy: 0.8120 313/313 [==============================] - 4s 12ms/step - loss: 1.8487 - accuracy: 0.5514 - top-5-accuracy: 0.8186 Test accuracy: 55.14% Test top 5 accuracy: 81.86%
(Translator's note: results from our own run)
Epoch 1/100 176/176 [==============================] - 56s 259ms/step - loss: 4.4883 - accuracy: 0.0459 - top-5-accuracy: 0.1590 - val_loss: 3.9390 - val_accuracy: 0.1102 - val_top-5-accuracy: 0.3046 Epoch 2/100 176/176 [==============================] - 44s 253ms/step - loss: 3.9518 - accuracy: 0.0947 - top-5-accuracy: 0.2878 - val_loss: 3.5528 - val_accuracy: 0.1566 - val_top-5-accuracy: 0.4120 Epoch 3/100 176/176 [==============================] - 44s 253ms/step - loss: 3.7089 - accuracy: 0.1299 - top-5-accuracy: 0.3606 - val_loss: 3.3396 - val_accuracy: 0.1898 - val_top-5-accuracy: 0.4724 Epoch 4/100 176/176 [==============================] - 44s 253ms/step - loss: 3.5268 - accuracy: 0.1584 - top-5-accuracy: 0.4148 - val_loss: 3.2081 - val_accuracy: 0.2162 - val_top-5-accuracy: 0.5032 Epoch 5/100 176/176 [==============================] - 44s 252ms/step - loss: 3.3893 - accuracy: 0.1828 - top-5-accuracy: 0.4531 - val_loss: 3.1479 - val_accuracy: 0.2404 - val_top-5-accuracy: 0.5418 Epoch 6/100 176/176 [==============================] - 44s 252ms/step - loss: 3.2760 - accuracy: 0.2044 - top-5-accuracy: 0.4847 - val_loss: 2.9722 - val_accuracy: 0.2652 - val_top-5-accuracy: 0.5644 Epoch 7/100 176/176 [==============================] - 44s 253ms/step - loss: 3.1756 - accuracy: 0.2248 - top-5-accuracy: 0.5142 - val_loss: 2.9041 - val_accuracy: 0.2778 - val_top-5-accuracy: 0.5846 Epoch 8/100 176/176 [==============================] - 44s 252ms/step - loss: 3.0697 - accuracy: 0.2447 - top-5-accuracy: 0.5399 - val_loss: 2.7892 - val_accuracy: 0.3088 - val_top-5-accuracy: 0.6068 Epoch 9/100 176/176 [==============================] - 44s 252ms/step - loss: 2.9803 - accuracy: 0.2633 - top-5-accuracy: 0.5619 - val_loss: 2.6955 - val_accuracy: 0.3140 - val_top-5-accuracy: 0.6226 Epoch 10/100 176/176 [==============================] - 44s 252ms/step - loss: 2.8729 - accuracy: 0.2798 - top-5-accuracy: 0.5854 - val_loss: 2.6300 - val_accuracy: 0.3378 - val_top-5-accuracy: 0.6432 Epoch 11/100 176/176 [==============================] - 44s 252ms/step - loss: 2.7786 - accuracy: 0.3007 - top-5-accuracy: 0.6072 - val_loss: 2.5503 - val_accuracy: 0.3538 - val_top-5-accuracy: 0.6590 Epoch 12/100 176/176 [==============================] - 44s 252ms/step - loss: 2.6838 - accuracy: 0.3190 - top-5-accuracy: 0.6293 - val_loss: 2.4795 - val_accuracy: 0.3614 - val_top-5-accuracy: 0.6716 Epoch 13/100 176/176 [==============================] - 44s 252ms/step - loss: 2.6191 - accuracy: 0.3315 - top-5-accuracy: 0.6482 - val_loss: 2.3928 - val_accuracy: 0.3854 - val_top-5-accuracy: 0.6844 Epoch 14/100 176/176 [==============================] - 44s 248ms/step - loss: 2.5404 - accuracy: 0.3484 - top-5-accuracy: 0.6626 - val_loss: 2.3646 - val_accuracy: 0.3784 - val_top-5-accuracy: 0.6904 Epoch 15/100 176/176 [==============================] - 45s 254ms/step - loss: 2.4683 - accuracy: 0.3638 - top-5-accuracy: 0.6776 - val_loss: 2.3108 - val_accuracy: 0.4054 - val_top-5-accuracy: 0.7050 Epoch 16/100 176/176 [==============================] - 44s 248ms/step - loss: 2.4039 - accuracy: 0.3759 - top-5-accuracy: 0.6936 - val_loss: 2.2930 - val_accuracy: 0.4042 - val_top-5-accuracy: 0.7104 Epoch 17/100 176/176 [==============================] - 44s 252ms/step - loss: 2.3235 - accuracy: 0.3947 - top-5-accuracy: 0.7077 - val_loss: 2.2113 - val_accuracy: 0.4270 - val_top-5-accuracy: 0.7240 Epoch 18/100 176/176 [==============================] - 44s 248ms/step - loss: 2.2864 - accuracy: 0.4007 - top-5-accuracy: 0.7165 - val_loss: 
2.2300 - val_accuracy: 0.4184 - val_top-5-accuracy: 0.7246 Epoch 19/100 176/176 [==============================] - 44s 252ms/step - loss: 2.2215 - accuracy: 0.4154 - top-5-accuracy: 0.7284 - val_loss: 2.1324 - val_accuracy: 0.4428 - val_top-5-accuracy: 0.7348 Epoch 20/100 176/176 [==============================] - 44s 252ms/step - loss: 2.1664 - accuracy: 0.4289 - top-5-accuracy: 0.7384 - val_loss: 2.1190 - val_accuracy: 0.4438 - val_top-5-accuracy: 0.7404 Epoch 21/100 176/176 [==============================] - 44s 248ms/step - loss: 2.1069 - accuracy: 0.4400 - top-5-accuracy: 0.7526 - val_loss: 2.1561 - val_accuracy: 0.4386 - val_top-5-accuracy: 0.7302 Epoch 22/100 176/176 [==============================] - 44s 253ms/step - loss: 2.0619 - accuracy: 0.4491 - top-5-accuracy: 0.7619 - val_loss: 2.0877 - val_accuracy: 0.4526 - val_top-5-accuracy: 0.7458 Epoch 23/100 176/176 [==============================] - 45s 254ms/step - loss: 2.0092 - accuracy: 0.4631 - top-5-accuracy: 0.7726 - val_loss: 2.0565 - val_accuracy: 0.4662 - val_top-5-accuracy: 0.7546 Epoch 24/100 176/176 [==============================] - 44s 249ms/step - loss: 1.9632 - accuracy: 0.4721 - top-5-accuracy: 0.7826 - val_loss: 2.0649 - val_accuracy: 0.4556 - val_top-5-accuracy: 0.7558 Epoch 25/100 176/176 [==============================] - 44s 253ms/step - loss: 1.9191 - accuracy: 0.4791 - top-5-accuracy: 0.7894 - val_loss: 2.0108 - val_accuracy: 0.4672 - val_top-5-accuracy: 0.7702 Epoch 26/100 176/176 [==============================] - 44s 252ms/step - loss: 1.8696 - accuracy: 0.4913 - top-5-accuracy: 0.7990 - val_loss: 1.9582 - val_accuracy: 0.4800 - val_top-5-accuracy: 0.7740 Epoch 27/100 176/176 [==============================] - 44s 252ms/step - loss: 1.8345 - accuracy: 0.5016 - top-5-accuracy: 0.8070 - val_loss: 1.9745 - val_accuracy: 0.4802 - val_top-5-accuracy: 0.7682 Epoch 28/100 176/176 [==============================] - 44s 248ms/step - loss: 1.7866 - accuracy: 0.5112 - top-5-accuracy: 0.8133 - val_loss: 2.0060 - val_accuracy: 0.4756 - val_top-5-accuracy: 0.7684 Epoch 29/100 176/176 [==============================] - 44s 248ms/step - loss: 1.7457 - accuracy: 0.5214 - top-5-accuracy: 0.8187 - val_loss: 1.9975 - val_accuracy: 0.4762 - val_top-5-accuracy: 0.7706 Epoch 30/100 176/176 [==============================] - 44s 252ms/step - loss: 1.7155 - accuracy: 0.5288 - top-5-accuracy: 0.8271 - val_loss: 1.9511 - val_accuracy: 0.4898 - val_top-5-accuracy: 0.7808 Epoch 31/100 176/176 [==============================] - 44s 252ms/step - loss: 1.6749 - accuracy: 0.5394 - top-5-accuracy: 0.8336 - val_loss: 1.8900 - val_accuracy: 0.5012 - val_top-5-accuracy: 0.7868 Epoch 32/100 176/176 [==============================] - 44s 248ms/step - loss: 1.6348 - accuracy: 0.5453 - top-5-accuracy: 0.8409 - val_loss: 1.9359 - val_accuracy: 0.4884 - val_top-5-accuracy: 0.7818 Epoch 33/100 176/176 [==============================] - 44s 248ms/step - loss: 1.5989 - accuracy: 0.5556 - top-5-accuracy: 0.8467 - val_loss: 1.8946 - val_accuracy: 0.4998 - val_top-5-accuracy: 0.7882 Epoch 34/100 176/176 [==============================] - 44s 253ms/step - loss: 1.5687 - accuracy: 0.5647 - top-5-accuracy: 0.8512 - val_loss: 1.9065 - val_accuracy: 0.5038 - val_top-5-accuracy: 0.7882 Epoch 35/100 176/176 [==============================] - 44s 252ms/step - loss: 1.5415 - accuracy: 0.5661 - top-5-accuracy: 0.8566 - val_loss: 1.8890 - val_accuracy: 0.5048 - val_top-5-accuracy: 0.7888 Epoch 36/100 176/176 [==============================] - 44s 252ms/step - loss: 
1.5104 - accuracy: 0.5768 - top-5-accuracy: 0.8601 - val_loss: 1.8976 - val_accuracy: 0.5064 - val_top-5-accuracy: 0.7872 Epoch 37/100 176/176 [==============================] - 44s 252ms/step - loss: 1.4934 - accuracy: 0.5792 - top-5-accuracy: 0.8641 - val_loss: 1.9018 - val_accuracy: 0.5076 - val_top-5-accuracy: 0.7952 Epoch 38/100 176/176 [==============================] - 44s 252ms/step - loss: 1.4630 - accuracy: 0.5853 - top-5-accuracy: 0.8702 - val_loss: 1.8687 - val_accuracy: 0.5128 - val_top-5-accuracy: 0.7964 Epoch 39/100 176/176 [==============================] - 44s 248ms/step - loss: 1.4324 - accuracy: 0.5923 - top-5-accuracy: 0.8754 - val_loss: 1.8712 - val_accuracy: 0.5116 - val_top-5-accuracy: 0.7986 Epoch 40/100 176/176 [==============================] - 44s 248ms/step - loss: 1.4009 - accuracy: 0.6034 - top-5-accuracy: 0.8784 - val_loss: 1.8606 - val_accuracy: 0.5124 - val_top-5-accuracy: 0.7976 Epoch 41/100 176/176 [==============================] - 44s 252ms/step - loss: 1.3814 - accuracy: 0.6062 - top-5-accuracy: 0.8819 - val_loss: 1.8512 - val_accuracy: 0.5202 - val_top-5-accuracy: 0.7986 Epoch 42/100 176/176 [==============================] - 44s 248ms/step - loss: 1.3507 - accuracy: 0.6139 - top-5-accuracy: 0.8872 - val_loss: 1.8880 - val_accuracy: 0.5138 - val_top-5-accuracy: 0.7948 Epoch 43/100 176/176 [==============================] - 44s 252ms/step - loss: 1.3326 - accuracy: 0.6181 - top-5-accuracy: 0.8910 - val_loss: 1.8408 - val_accuracy: 0.5204 - val_top-5-accuracy: 0.7978 Epoch 44/100 176/176 [==============================] - 44s 252ms/step - loss: 1.3187 - accuracy: 0.6238 - top-5-accuracy: 0.8913 - val_loss: 1.8610 - val_accuracy: 0.5214 - val_top-5-accuracy: 0.8012 Epoch 45/100 176/176 [==============================] - 44s 252ms/step - loss: 1.2864 - accuracy: 0.6286 - top-5-accuracy: 0.8953 - val_loss: 1.8809 - val_accuracy: 0.5236 - val_top-5-accuracy: 0.7988 Epoch 46/100 176/176 [==============================] - 44s 248ms/step - loss: 1.2739 - accuracy: 0.6316 - top-5-accuracy: 0.8995 - val_loss: 1.8727 - val_accuracy: 0.5182 - val_top-5-accuracy: 0.7964 Epoch 47/100 176/176 [==============================] - 44s 248ms/step - loss: 1.2364 - accuracy: 0.6432 - top-5-accuracy: 0.9045 - val_loss: 1.8513 - val_accuracy: 0.5210 - val_top-5-accuracy: 0.8044 Epoch 48/100 176/176 [==============================] - 44s 252ms/step - loss: 1.2349 - accuracy: 0.6427 - top-5-accuracy: 0.9053 - val_loss: 1.8271 - val_accuracy: 0.5304 - val_top-5-accuracy: 0.8054 Epoch 49/100 176/176 [==============================] - 44s 248ms/step - loss: 1.2043 - accuracy: 0.6484 - top-5-accuracy: 0.9090 - val_loss: 1.8211 - val_accuracy: 0.5280 - val_top-5-accuracy: 0.8064 Epoch 50/100 176/176 [==============================] - 44s 248ms/step - loss: 1.1972 - accuracy: 0.6533 - top-5-accuracy: 0.9091 - val_loss: 1.8617 - val_accuracy: 0.5274 - val_top-5-accuracy: 0.8082 Epoch 51/100 176/176 [==============================] - 44s 248ms/step - loss: 1.1763 - accuracy: 0.6553 - top-5-accuracy: 0.9134 - val_loss: 1.8478 - val_accuracy: 0.5280 - val_top-5-accuracy: 0.8106 Epoch 52/100 176/176 [==============================] - 44s 248ms/step - loss: 1.1683 - accuracy: 0.6586 - top-5-accuracy: 0.9154 - val_loss: 1.8413 - val_accuracy: 0.5294 - val_top-5-accuracy: 0.8098 Epoch 53/100 176/176 [==============================] - 44s 248ms/step - loss: 1.1481 - accuracy: 0.6638 - top-5-accuracy: 0.9170 - val_loss: 1.8692 - val_accuracy: 0.5228 - val_top-5-accuracy: 0.8042 Epoch 54/100 
176/176 [==============================] - 44s 248ms/step - loss: 1.1358 - accuracy: 0.6665 - top-5-accuracy: 0.9176 - val_loss: 1.8547 - val_accuracy: 0.5300 - val_top-5-accuracy: 0.8104 Epoch 55/100 176/176 [==============================] - 44s 252ms/step - loss: 1.1196 - accuracy: 0.6707 - top-5-accuracy: 0.9205 - val_loss: 1.8496 - val_accuracy: 0.5328 - val_top-5-accuracy: 0.8050 Epoch 56/100 176/176 [==============================] - 44s 248ms/step - loss: 1.0949 - accuracy: 0.6794 - top-5-accuracy: 0.9234 - val_loss: 1.8431 - val_accuracy: 0.5328 - val_top-5-accuracy: 0.8132 Epoch 57/100 176/176 [==============================] - 44s 252ms/step - loss: 1.0947 - accuracy: 0.6776 - top-5-accuracy: 0.9248 - val_loss: 1.8211 - val_accuracy: 0.5394 - val_top-5-accuracy: 0.8160 Epoch 58/100 176/176 [==============================] - 44s 248ms/step - loss: 1.0776 - accuracy: 0.6844 - top-5-accuracy: 0.9264 - val_loss: 1.8287 - val_accuracy: 0.5392 - val_top-5-accuracy: 0.8134 Epoch 59/100 176/176 [==============================] - 44s 252ms/step - loss: 1.0663 - accuracy: 0.6873 - top-5-accuracy: 0.9285 - val_loss: 1.8335 - val_accuracy: 0.5396 - val_top-5-accuracy: 0.8118 Epoch 60/100 176/176 [==============================] - 44s 249ms/step - loss: 1.0459 - accuracy: 0.6932 - top-5-accuracy: 0.9306 - val_loss: 1.8631 - val_accuracy: 0.5370 - val_top-5-accuracy: 0.8100 Epoch 61/100 176/176 [==============================] - 44s 248ms/step - loss: 1.0331 - accuracy: 0.6961 - top-5-accuracy: 0.9315 - val_loss: 1.8498 - val_accuracy: 0.5342 - val_top-5-accuracy: 0.8120 Epoch 62/100 176/176 [==============================] - 44s 249ms/step - loss: 1.0255 - accuracy: 0.6989 - top-5-accuracy: 0.9318 - val_loss: 1.8368 - val_accuracy: 0.5372 - val_top-5-accuracy: 0.8144 Epoch 63/100 176/176 [==============================] - 44s 249ms/step - loss: 1.0136 - accuracy: 0.7020 - top-5-accuracy: 0.9344 - val_loss: 1.8613 - val_accuracy: 0.5316 - val_top-5-accuracy: 0.8084 Epoch 64/100 176/176 [==============================] - 44s 249ms/step - loss: 1.0133 - accuracy: 0.7012 - top-5-accuracy: 0.9342 - val_loss: 1.8366 - val_accuracy: 0.5370 - val_top-5-accuracy: 0.8164 Epoch 65/100 176/176 [==============================] - 44s 249ms/step - loss: 0.9962 - accuracy: 0.7040 - top-5-accuracy: 0.9360 - val_loss: 1.8553 - val_accuracy: 0.5384 - val_top-5-accuracy: 0.8120 Epoch 66/100 176/176 [==============================] - 44s 248ms/step - loss: 0.9789 - accuracy: 0.7090 - top-5-accuracy: 0.9382 - val_loss: 1.8977 - val_accuracy: 0.5340 - val_top-5-accuracy: 0.8118 Epoch 67/100 176/176 [==============================] - 44s 249ms/step - loss: 0.9709 - accuracy: 0.7134 - top-5-accuracy: 0.9398 - val_loss: 1.9148 - val_accuracy: 0.5334 - val_top-5-accuracy: 0.8102 Epoch 68/100 176/176 [==============================] - 44s 248ms/step - loss: 0.9692 - accuracy: 0.7134 - top-5-accuracy: 0.9380 - val_loss: 1.8374 - val_accuracy: 0.5378 - val_top-5-accuracy: 0.8110 Epoch 69/100 176/176 [==============================] - 44s 248ms/step - loss: 0.9531 - accuracy: 0.7149 - top-5-accuracy: 0.9416 - val_loss: 1.8445 - val_accuracy: 0.5344 - val_top-5-accuracy: 0.8096 Epoch 70/100 176/176 [==============================] - 44s 248ms/step - loss: 0.9451 - accuracy: 0.7206 - top-5-accuracy: 0.9424 - val_loss: 1.8832 - val_accuracy: 0.5350 - val_top-5-accuracy: 0.8052 Epoch 71/100 176/176 [==============================] - 44s 252ms/step - loss: 0.9344 - accuracy: 0.7246 - top-5-accuracy: 0.9430 - val_loss: 1.8757 - 
val_accuracy: 0.5400 - val_top-5-accuracy: 0.8102 Epoch 72/100 176/176 [==============================] - 44s 252ms/step - loss: 0.9279 - accuracy: 0.7255 - top-5-accuracy: 0.9451 - val_loss: 1.8589 - val_accuracy: 0.5416 - val_top-5-accuracy: 0.8186 Epoch 73/100 176/176 [==============================] - 44s 249ms/step - loss: 0.9075 - accuracy: 0.7281 - top-5-accuracy: 0.9483 - val_loss: 1.8734 - val_accuracy: 0.5344 - val_top-5-accuracy: 0.8152 Epoch 74/100 176/176 [==============================] - 44s 248ms/step - loss: 0.9139 - accuracy: 0.7274 - top-5-accuracy: 0.9480 - val_loss: 1.8772 - val_accuracy: 0.5390 - val_top-5-accuracy: 0.8112 Epoch 75/100 176/176 [==============================] - 44s 249ms/step - loss: 0.9048 - accuracy: 0.7290 - top-5-accuracy: 0.9479 - val_loss: 1.8879 - val_accuracy: 0.5414 - val_top-5-accuracy: 0.8172 Epoch 76/100 176/176 [==============================] - 44s 252ms/step - loss: 0.9134 - accuracy: 0.7273 - top-5-accuracy: 0.9458 - val_loss: 1.8363 - val_accuracy: 0.5436 - val_top-5-accuracy: 0.8170 Epoch 77/100 176/176 [==============================] - 44s 252ms/step - loss: 0.8957 - accuracy: 0.7314 - top-5-accuracy: 0.9471 - val_loss: 1.8529 - val_accuracy: 0.5448 - val_top-5-accuracy: 0.8098 Epoch 78/100 176/176 [==============================] - 44s 249ms/step - loss: 0.8839 - accuracy: 0.7377 - top-5-accuracy: 0.9502 - val_loss: 1.8492 - val_accuracy: 0.5414 - val_top-5-accuracy: 0.8132 Epoch 79/100 176/176 [==============================] - 44s 248ms/step - loss: 0.8842 - accuracy: 0.7353 - top-5-accuracy: 0.9503 - val_loss: 1.8921 - val_accuracy: 0.5276 - val_top-5-accuracy: 0.8084 Epoch 80/100 176/176 [==============================] - 44s 248ms/step - loss: 0.8723 - accuracy: 0.7390 - top-5-accuracy: 0.9507 - val_loss: 1.8795 - val_accuracy: 0.5408 - val_top-5-accuracy: 0.8164 Epoch 81/100 176/176 [==============================] - 44s 248ms/step - loss: 0.8654 - accuracy: 0.7420 - top-5-accuracy: 0.9520 - val_loss: 1.9055 - val_accuracy: 0.5364 - val_top-5-accuracy: 0.8112 Epoch 82/100 176/176 [==============================] - 44s 248ms/step - loss: 0.8624 - accuracy: 0.7430 - top-5-accuracy: 0.9525 - val_loss: 1.8813 - val_accuracy: 0.5426 - val_top-5-accuracy: 0.8088 Epoch 83/100 176/176 [==============================] - 44s 248ms/step - loss: 0.8534 - accuracy: 0.7454 - top-5-accuracy: 0.9518 - val_loss: 1.8672 - val_accuracy: 0.5426 - val_top-5-accuracy: 0.8082 Epoch 84/100 176/176 [==============================] - 44s 248ms/step - loss: 0.8570 - accuracy: 0.7432 - top-5-accuracy: 0.9522 - val_loss: 1.9157 - val_accuracy: 0.5424 - val_top-5-accuracy: 0.8106 Epoch 85/100 176/176 [==============================] - 44s 252ms/step - loss: 0.8441 - accuracy: 0.7464 - top-5-accuracy: 0.9538 - val_loss: 1.9172 - val_accuracy: 0.5486 - val_top-5-accuracy: 0.8114 Epoch 86/100 176/176 [==============================] - 44s 249ms/step - loss: 0.8283 - accuracy: 0.7526 - top-5-accuracy: 0.9559 - val_loss: 1.9155 - val_accuracy: 0.5404 - val_top-5-accuracy: 0.8068 Epoch 87/100 176/176 [==============================] - 44s 249ms/step - loss: 0.8414 - accuracy: 0.7474 - top-5-accuracy: 0.9551 - val_loss: 1.9040 - val_accuracy: 0.5434 - val_top-5-accuracy: 0.8088 Epoch 88/100 176/176 [==============================] - 44s 248ms/step - loss: 0.8341 - accuracy: 0.7494 - top-5-accuracy: 0.9546 - val_loss: 1.9008 - val_accuracy: 0.5434 - val_top-5-accuracy: 0.8100 Epoch 89/100 176/176 [==============================] - 44s 249ms/step - loss: 0.8339 - 
accuracy: 0.7510 - top-5-accuracy: 0.9549 - val_loss: 1.9055 - val_accuracy: 0.5456 - val_top-5-accuracy: 0.8046 Epoch 90/100 176/176 [==============================] - 44s 248ms/step - loss: 0.8184 - accuracy: 0.7542 - top-5-accuracy: 0.9562 - val_loss: 1.8846 - val_accuracy: 0.5446 - val_top-5-accuracy: 0.8126 Epoch 91/100 176/176 [==============================] - 44s 253ms/step - loss: 0.8016 - accuracy: 0.7577 - top-5-accuracy: 0.9584 - val_loss: 1.8735 - val_accuracy: 0.5498 - val_top-5-accuracy: 0.8122 Epoch 92/100 176/176 [==============================] - 44s 249ms/step - loss: 0.8034 - accuracy: 0.7599 - top-5-accuracy: 0.9568 - val_loss: 1.9315 - val_accuracy: 0.5464 - val_top-5-accuracy: 0.8078 Epoch 93/100 176/176 [==============================] - 44s 248ms/step - loss: 0.7979 - accuracy: 0.7596 - top-5-accuracy: 0.9593 - val_loss: 1.8893 - val_accuracy: 0.5488 - val_top-5-accuracy: 0.8124 Epoch 94/100 176/176 [==============================] - 44s 249ms/step - loss: 0.7981 - accuracy: 0.7606 - top-5-accuracy: 0.9588 - val_loss: 1.9554 - val_accuracy: 0.5432 - val_top-5-accuracy: 0.8074 Epoch 95/100 176/176 [==============================] - 44s 249ms/step - loss: 0.7971 - accuracy: 0.7601 - top-5-accuracy: 0.9582 - val_loss: 1.9349 - val_accuracy: 0.5366 - val_top-5-accuracy: 0.8108 Epoch 96/100 176/176 [==============================] - 44s 249ms/step - loss: 0.7916 - accuracy: 0.7612 - top-5-accuracy: 0.9596 - val_loss: 1.8965 - val_accuracy: 0.5388 - val_top-5-accuracy: 0.8096 Epoch 97/100 176/176 [==============================] - 44s 249ms/step - loss: 0.7899 - accuracy: 0.7630 - top-5-accuracy: 0.9607 - val_loss: 1.9262 - val_accuracy: 0.5480 - val_top-5-accuracy: 0.8118 Epoch 98/100 176/176 [==============================] - 44s 248ms/step - loss: 0.7810 - accuracy: 0.7658 - top-5-accuracy: 0.9610 - val_loss: 1.9396 - val_accuracy: 0.5426 - val_top-5-accuracy: 0.8106 Epoch 99/100 176/176 [==============================] - 44s 248ms/step - loss: 0.7942 - accuracy: 0.7617 - top-5-accuracy: 0.9581 - val_loss: 1.9000 - val_accuracy: 0.5358 - val_top-5-accuracy: 0.8076 Epoch 100/100 176/176 [==============================] - 44s 249ms/step - loss: 0.7715 - accuracy: 0.7715 - top-5-accuracy: 0.9599 - val_loss: 1.9278 - val_accuracy: 0.5454 - val_top-5-accuracy: 0.8166 313/313 [==============================] - 6s 20ms/step - loss: 1.8421 - accuracy: 0.5497 - top-5-accuracy: 0.8191 Test accuracy: 54.97% Test top 5 accuracy: 81.91% CPU times: user 49min 51s, sys: 3min 41s, total: 53min 33s Wall time: 1h 13min 45s
After 100 epochs, the ViT model achieves around 55% accuracy and 82% top-5 accuracy on the test data. These are not competitive results on the CIFAR-100 dataset, since a ResNet50V2 trained from scratch on the same data can achieve 67% accuracy.
Note that the state-of-the-art results reported in the paper were achieved by pre-training the ViT model on the JFT-300M dataset and then fine-tuning it on the target dataset. To improve the model quality without pre-training, you can try training the model for more epochs, using a larger number of Transformer layers, resizing the input images, changing the patch size, or increasing the projection dimensions. As mentioned in the paper, the quality of the model is affected not only by architecture choices but also by parameters such as the learning rate schedule, optimizer, weight decay, and so on. In practice, it is recommended to fine-tune a ViT model that was pre-trained on a large, high-resolution dataset.
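Since the model outputs logits, class probabilities are obtained by applying a softmax at inference time. A minimal inference sketch (not part of the original example), using the trained vit_classifier and the test images loaded above:
# Predict on a few test images and compare against the true labels.
logits = vit_classifier.predict(x_test[:5])
probabilities = tf.nn.softmax(logits, axis=-1)
predicted_classes = np.argmax(probabilities, axis=-1)
print(predicted_classes, y_test[:5].flatten())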