Keras 2 : examples : コンピュータビジョン – COVID-19 関連所見を持つ胸部 CT スキャンからの 3D 画像分類 (翻訳/解説)

翻訳 : (株)クラスキャットセールスインフォメーション
作成日時 : 10/31/2021 (keras 2.6.0)

* 本ページは、Keras の以下のドキュメントを翻訳した上で適宜、補足説明したものです：

Code examples : Computer Vision : 3D image classification from CT scans (Author: Hasib Zunair)

* サンプルコードの動作確認はしておりますが、必要な場合には適宜、追加改変しています。
* ご自由にリンクを張って頂いてかまいませんが、sales-info@classcat.com までご一報いただけると嬉しいです。

クラスキャット人工知能研究開発支援サービス ★ 無料 Web セミナー開催中 ★

◆ クラスキャットは人工知能・テレワークに関する各種サービスを提供しております。お気軽にご相談ください :

人工知能研究開発支援
1. 人工知能研修サービス(経営者層向けオンサイト研修)
2. テクニカルコンサルティングサービス
3. 実証実験(プロトタイプ構築)
4. アプリケーションへの実装
人工知能研修サービス
PoC(概念実証)を失敗させないための支援
テレワーク & オンライン授業を支援

◆ 人工知能とビジネスをテーマに WEB セミナーを定期的に開催しています。スケジュール。

お住まいの地域に関係なく Web ブラウザからご参加頂けます。事前登録 が必要ですのでご注意ください。
ウェビナー運用には弊社製品「ClassCat® Webinar」を利用しています。

◆ お問合せ : 本件に関するお問い合わせ先は下記までお願いいたします。

株式会社クラスキャット セールス・マーケティング本部セールス・インフォメーション

E-Mail：sales-info@classcat.com ; WebSite: https://www.classcat.com/ ; Facebook

Keras 2 : examples : COVID-19 関連所見を持つ胸部 CT スキャンからの 3D 画像分類

イントロダクション

このサンプルは、コンピュータ・トモグラフィー (CT) のウイルス性肺炎 (= viral pneumonia) の存在を予測する 3D 畳込みニューラルネットワーク (CNN) を構築するために必要なステップを示します。2D CNN は一般には RGB 画像 (3 チャネル) を処理するために使用されます。3D CNN は単純に 3D の等値です ; それは入力として 3D ボリューム、あるいは 2D フレームのシークエンス (e.g. CT スキャンのスライス) を取り、3D CNN はボリュームデータの表現を学習する強力なモデルです。

リファレンス

セットアップ

import os
import zipfile
import numpy as np
import tensorflow as tf

from tensorflow import keras
from tensorflow.keras import layers

MosMedData : COVID-19 関連所見を持つ胸部 CT スキャンをダウンロードする

このサンプルでは、MosMedData : COVID-19 関連所見を持つ胸部 CT スキャンのサブセットを使用します。このデータセットは COVID-19 関連所見を持つ、そしてそのような所見のない胸部 CT スキャンから成ります。

ウイルス性肺炎の存在を予測する分類器を構築するために CT スキャンの関連する放射線学の所見をラベルとして使用していきます。従って、タスクは二値分類問題です。

# Download url of normal CT scans.
url = "https://github.com/hasibzunair/3D-image-classification-tutorial/releases/download/v0.2/CT-0.zip"
filename = os.path.join(os.getcwd(), "CT-0.zip")
keras.utils.get_file(filename, url)

# Download url of abnormal CT scans.
url = "https://github.com/hasibzunair/3D-image-classification-tutorial/releases/download/v0.2/CT-23.zip"
filename = os.path.join(os.getcwd(), "CT-23.zip")
keras.utils.get_file(filename, url)

# Make a directory to store the data.
os.makedirs("MosMedData")

# Unzip data in the newly created directory.
with zipfile.ZipFile("CT-0.zip", "r") as z_fp:
    z_fp.extractall("./MosMedData/")

with zipfile.ZipFile("CT-23.zip", "r") as z_fp:
    z_fp.extractall("./MosMedData/")

Downloading data from https://github.com/hasibzunair/3D-image-classification-tutorial/releases/download/v0.2/CT-0.zip
1065476096/1065471431 [==============================] - 236s 0us/step
Downloading data from https://github.com/hasibzunair/3D-image-classification-tutorial/releases/download/v0.2/CT-23.zip
 271171584/1045162547 [======>.......................] - ETA: 2:56

データのロードと前処理

ファイルは拡張子 .nii を持つ Nifti 形式で提供されます。スキャン (データ) を読むために、nibabel パッケージを使用します。”pip install nibabel” でパッケージをインストールできます。CT スキャンは Hounsfield 単位 (HU) で raw ボクセル強度をストアしています。このデータセットではそれらは -1024 から 2000 以上の範囲です。400 以上は異なる radiointensity を持つ骨ですから、これは上界 (= higher bound) として使用されます。CT スキャンを正規化するために -1000 と 400 の間の閾値は一般に使用されます。

データを処理するために、以下を行ないます :

最初にボリュームを 90 度回転させて、そして向きが固定されます。
HU 値を 0 と 1 の間にスケールします。
幅、高さと奥行き (= depth)をリサイズします。

ここではデータを処理するために幾つかのヘルパー関数を定義します。これらの関数は訓練と検証データセットを構築するときに使用されます。

import nibabel as nib

from scipy import ndimage


def read_nifti_file(filepath):
    """Read and load volume"""
    # Read file
    scan = nib.load(filepath)
    # Get raw data
    scan = scan.get_fdata()
    return scan


def normalize(volume):
    """Normalize the volume"""
    min = -1000
    max = 400
    volume[volume < min] = min
    volume[volume > max] = max
    volume = (volume - min) / (max - min)
    volume = volume.astype("float32")
    return volume


def resize_volume(img):
    """Resize across z-axis"""
    # Set the desired depth
    desired_depth = 64
    desired_width = 128
    desired_height = 128
    # Get current depth
    current_depth = img.shape[-1]
    current_width = img.shape[0]
    current_height = img.shape[1]
    # Compute depth factor
    depth = current_depth / desired_depth
    width = current_width / desired_width
    height = current_height / desired_height
    depth_factor = 1 / depth
    width_factor = 1 / width
    height_factor = 1 / height
    # Rotate
    img = ndimage.rotate(img, 90, reshape=False)
    # Resize across z-axis
    img = ndimage.zoom(img, (width_factor, height_factor, depth_factor), order=1)
    return img


def process_scan(path):
    """Read and resize volume"""
    # Read scan
    volume = read_nifti_file(path)
    # Normalize
    volume = normalize(volume)
    # Resize width, height and depth
    volume = resize_volume(volume)
    return volume

クラス・ディレクトリから CT スキャンのパスを読みましょう。

# Folder "CT-0" consist of CT scans having normal lung tissue,
# no CT-signs of viral pneumonia.
normal_scan_paths = [
    os.path.join(os.getcwd(), "MosMedData/CT-0", x)
    for x in os.listdir("MosMedData/CT-0")
]
# Folder "CT-23" consist of CT scans having several ground-glass opacifications,
# involvement of lung parenchyma.
abnormal_scan_paths = [
    os.path.join(os.getcwd(), "MosMedData/CT-23", x)
    for x in os.listdir("MosMedData/CT-23")
]

print("CT scans with normal lung tissue: " + str(len(normal_scan_paths)))
print("CT scans with abnormal lung tissue: " + str(len(abnormal_scan_paths)))

CT scans with normal lung tissue: 100
CT scans with abnormal lung tissue: 100

normal_scan_paths[:10]

['/content/MosMedData/CT-0/study_0090.nii.gz',
 '/content/MosMedData/CT-0/study_0014.nii.gz',
 '/content/MosMedData/CT-0/study_0063.nii.gz',
 '/content/MosMedData/CT-0/study_0097.nii.gz',
 '/content/MosMedData/CT-0/study_0094.nii.gz',
 '/content/MosMedData/CT-0/study_0033.nii.gz',
 '/content/MosMedData/CT-0/study_0093.nii.gz',
 '/content/MosMedData/CT-0/study_0048.nii.gz',
 '/content/MosMedData/CT-0/study_0023.nii.gz',
 '/content/MosMedData/CT-0/study_0070.nii.gz']

abnormal_scan_paths[:10]

['/content/MosMedData/CT-23/study_0958.nii.gz',
 '/content/MosMedData/CT-23/study_0955.nii.gz',
 '/content/MosMedData/CT-23/study_0945.nii.gz',
 '/content/MosMedData/CT-23/study_0989.nii.gz',
 '/content/MosMedData/CT-23/study_0987.nii.gz',
 '/content/MosMedData/CT-23/study_0950.nii.gz',
 '/content/MosMedData/CT-23/study_1107.nii.gz',
 '/content/MosMedData/CT-23/study_0947.nii.gz',
 '/content/MosMedData/CT-23/study_0968.nii.gz',
 '/content/MosMedData/CT-23/study_1072.nii.gz']

訓練と検証データセットを構築する

クラス・ディレクトリからスキャンデータを読み、ラベルを割当てます。スキャンデータを 128x128x64 の shape を持つようにダウンサンプリングします。raw HU 値を範囲 0 と 1 にリスケールします。最後に、データセットを訓練と検証サブセットに分割します。

# Read and process the scans.
# Each scan is resized across height, width, and depth and rescaled.
abnormal_scans = np.array([process_scan(path) for path in abnormal_scan_paths])
normal_scans = np.array([process_scan(path) for path in normal_scan_paths])

# For the CT scans having presence of viral pneumonia
# assign 1, for the normal ones assign 0.
abnormal_labels = np.array([1 for _ in range(len(abnormal_scans))])
normal_labels = np.array([0 for _ in range(len(normal_scans))])

# Split data in the ratio 70-30 for training and validation.
x_train = np.concatenate((abnormal_scans[:70], normal_scans[:70]), axis=0)
y_train = np.concatenate((abnormal_labels[:70], normal_labels[:70]), axis=0)
x_val = np.concatenate((abnormal_scans[70:], normal_scans[70:]), axis=0)
y_val = np.concatenate((abnormal_labels[70:], normal_labels[70:]), axis=0)
print(
    "Number of samples in train and validation are %d and %d."
    % (x_train.shape[0], x_val.shape[0])
)

Number of samples in train and validation are 140 and 60.

データ増強

CT スキャンデータはまた訓練の間、ランダムな角度で回転することにより増強もされます。データは shape (samples, height, width, depth) のランク 3 テンソルにストアされますので、データ上で 3D 畳込みを実行できるように軸 4 でサイズ 1 の次元を追加します。新しい shape は従って (samples, height, width, depth, 1) です。実際には様々な種類の前処理と増強テクニックがありますが、このサンプルは始めるために 2, 3 の単純なものを示しています。

import random

from scipy import ndimage


@tf.function
def rotate(volume):
    """Rotate the volume by a few degrees"""

    def scipy_rotate(volume):
        # define some rotation angles
        angles = [-20, -10, -5, 5, 10, 20]
        # pick angles at random
        angle = random.choice(angles)
        # rotate volume
        volume = ndimage.rotate(volume, angle, reshape=False)
        volume[volume < 0] = 0
        volume[volume > 1] = 1
        return volume

    augmented_volume = tf.numpy_function(scipy_rotate, [volume], tf.float32)
    return augmented_volume


def train_preprocessing(volume, label):
    """Process training data by rotating and adding a channel."""
    # Rotate volume
    volume = rotate(volume)
    volume = tf.expand_dims(volume, axis=3)
    return volume, label


def validation_preprocessing(volume, label):
    """Process validation data by only adding a channel."""
    volume = tf.expand_dims(volume, axis=3)
    return volume, label

訓練と検証データローダを定義する間に、訓練データは増強関数に渡されてこれは様々な角度でボリュームをランダムに回転させます。訓練と検証データは既に 0 と 1 の間の値を持つようにリスケールされていることに注意してください。

# Define data loaders.
train_loader = tf.data.Dataset.from_tensor_slices((x_train, y_train))
validation_loader = tf.data.Dataset.from_tensor_slices((x_val, y_val))

batch_size = 2
# Augment the on the fly during training.
train_dataset = (
    train_loader.shuffle(len(x_train))
    .map(train_preprocessing)
    .batch(batch_size)
    .prefetch(2)
)
# Only rescale.
validation_dataset = (
    validation_loader.shuffle(len(x_val))
    .map(validation_preprocessing)
    .batch(batch_size)
    .prefetch(2)
)

増強された CT スキャンデータを可視化します。

import matplotlib.pyplot as plt

data = train_dataset.take(1)
images, labels = list(data)[0]
images = images.numpy()
image = images[0]
print("Dimension of the CT scan is:", image.shape)
plt.imshow(np.squeeze(image[:, :, 30]), cmap="gray")

Dimension of the CT scan is: (128, 128, 64, 1)

<matplotlib.image.AxesImage at 0x7fea680354e0>

CT スキャンは多くのスライスを持ちますので、スライスのモンタージュを可視化しましょう。

def plot_slices(num_rows, num_columns, width, height, data):
    """Plot a montage of 20 CT slices"""
    data = np.rot90(np.array(data))
    data = np.transpose(data)
    data = np.reshape(data, (num_rows, num_columns, width, height))
    rows_data, columns_data = data.shape[0], data.shape[1]
    heights = [slc[0].shape[0] for slc in data]
    widths = [slc.shape[1] for slc in data[0]]
    fig_width = 12.0
    fig_height = fig_width * sum(heights) / sum(widths)
    f, axarr = plt.subplots(
        rows_data,
        columns_data,
        figsize=(fig_width, fig_height),
        gridspec_kw={"height_ratios": heights},
    )
    for i in range(rows_data):
        for j in range(columns_data):
            axarr[i, j].imshow(data[i][j], cmap="gray")
            axarr[i, j].axis("off")
    plt.subplots_adjust(wspace=0, hspace=0, left=0, right=1, bottom=0, top=1)
    plt.show()


# Visualize montage of slices.
# 4 rows and 10 columns for 100 slices of the CT scan.
plot_slices(4, 10, 128, 128, image[:, :, :40])

3D 畳込みニューラルネットワークを定義する

モデルを容易に理解できるように、それをブロックに構造化します。このサンプルで使用される 3D CNN のアーキテクチャはこの論文に基いています。

def get_model(width=128, height=128, depth=64):
    """Build a 3D convolutional neural network model."""

    inputs = keras.Input((width, height, depth, 1))

    x = layers.Conv3D(filters=64, kernel_size=3, activation="relu")(inputs)
    x = layers.MaxPool3D(pool_size=2)(x)
    x = layers.BatchNormalization()(x)

    x = layers.Conv3D(filters=64, kernel_size=3, activation="relu")(x)
    x = layers.MaxPool3D(pool_size=2)(x)
    x = layers.BatchNormalization()(x)

    x = layers.Conv3D(filters=128, kernel_size=3, activation="relu")(x)
    x = layers.MaxPool3D(pool_size=2)(x)
    x = layers.BatchNormalization()(x)

    x = layers.Conv3D(filters=256, kernel_size=3, activation="relu")(x)
    x = layers.MaxPool3D(pool_size=2)(x)
    x = layers.BatchNormalization()(x)

    x = layers.GlobalAveragePooling3D()(x)
    x = layers.Dense(units=512, activation="relu")(x)
    x = layers.Dropout(0.3)(x)

    outputs = layers.Dense(units=1, activation="sigmoid")(x)

    # Define the model.
    model = keras.Model(inputs, outputs, name="3dcnn")
    return model


# Build model.
model = get_model(width=128, height=128, depth=64)
model.summary()

Model: "3dcnn"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 128, 128, 64, 1)] 0         
_________________________________________________________________
conv3d (Conv3D)              (None, 126, 126, 62, 64)  1792      
_________________________________________________________________
max_pooling3d (MaxPooling3D) (None, 63, 63, 31, 64)    0         
_________________________________________________________________
batch_normalization (BatchNo (None, 63, 63, 31, 64)    256       
_________________________________________________________________
conv3d_1 (Conv3D)            (None, 61, 61, 29, 64)    110656    
_________________________________________________________________
max_pooling3d_1 (MaxPooling3 (None, 30, 30, 14, 64)    0         
_________________________________________________________________
batch_normalization_1 (Batch (None, 30, 30, 14, 64)    256       
_________________________________________________________________
conv3d_2 (Conv3D)            (None, 28, 28, 12, 128)   221312    
_________________________________________________________________
max_pooling3d_2 (MaxPooling3 (None, 14, 14, 6, 128)    0         
_________________________________________________________________
batch_normalization_2 (Batch (None, 14, 14, 6, 128)    512       
_________________________________________________________________
conv3d_3 (Conv3D)            (None, 12, 12, 4, 256)    884992    
_________________________________________________________________
max_pooling3d_3 (MaxPooling3 (None, 6, 6, 2, 256)      0         
_________________________________________________________________
batch_normalization_3 (Batch (None, 6, 6, 2, 256)      1024      
_________________________________________________________________
global_average_pooling3d (Gl (None, 256)               0         
_________________________________________________________________
dense (Dense)                (None, 512)               131584    
_________________________________________________________________
dropout (Dropout)            (None, 512)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 513       
=================================================================
Total params: 1,352,897
Trainable params: 1,351,873
Non-trainable params: 1,024
_________________________________________________________________

モデルの訓練

# Compile model.
initial_learning_rate = 0.0001
lr_schedule = keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate, decay_steps=100000, decay_rate=0.96, staircase=True
)
model.compile(
    loss="binary_crossentropy",
    optimizer=keras.optimizers.Adam(learning_rate=lr_schedule),
    metrics=["acc"],
)

# Define callbacks.
checkpoint_cb = keras.callbacks.ModelCheckpoint(
    "3d_image_classification.h5", save_best_only=True
)
early_stopping_cb = keras.callbacks.EarlyStopping(monitor="val_acc", patience=15)

# Train the model, doing validation at the end of each epoch
epochs = 100
model.fit(
    train_dataset,
    validation_data=validation_dataset,
    epochs=epochs,
    shuffle=True,
    verbose=2,
    callbacks=[checkpoint_cb, early_stopping_cb],
)

Epoch 1/100
70/70 - 12s - loss: 0.7031 - acc: 0.5286 - val_loss: 1.1421 - val_acc: 0.5000
Epoch 2/100
70/70 - 12s - loss: 0.6769 - acc: 0.5929 - val_loss: 1.3491 - val_acc: 0.5000
Epoch 3/100
70/70 - 12s - loss: 0.6543 - acc: 0.6286 - val_loss: 1.5108 - val_acc: 0.5000
Epoch 4/100
70/70 - 12s - loss: 0.6236 - acc: 0.6714 - val_loss: 2.5255 - val_acc: 0.5000
Epoch 5/100
70/70 - 12s - loss: 0.6628 - acc: 0.6000 - val_loss: 1.8446 - val_acc: 0.5000
Epoch 6/100
70/70 - 12s - loss: 0.6621 - acc: 0.6071 - val_loss: 1.9661 - val_acc: 0.5000
Epoch 7/100
70/70 - 12s - loss: 0.6346 - acc: 0.6571 - val_loss: 2.8997 - val_acc: 0.5000
Epoch 8/100
70/70 - 12s - loss: 0.6501 - acc: 0.6071 - val_loss: 1.6101 - val_acc: 0.5000
Epoch 9/100
70/70 - 12s - loss: 0.6065 - acc: 0.6571 - val_loss: 0.8688 - val_acc: 0.6167
Epoch 10/100
70/70 - 12s - loss: 0.5970 - acc: 0.6714 - val_loss: 0.8802 - val_acc: 0.5167
Epoch 11/100
70/70 - 12s - loss: 0.5910 - acc: 0.7143 - val_loss: 0.7282 - val_acc: 0.6333
Epoch 12/100
70/70 - 12s - loss: 0.6147 - acc: 0.6500 - val_loss: 0.5828 - val_acc: 0.7500
Epoch 13/100
70/70 - 12s - loss: 0.5641 - acc: 0.7214 - val_loss: 0.7080 - val_acc: 0.6667
Epoch 14/100
70/70 - 12s - loss: 0.5664 - acc: 0.6857 - val_loss: 0.5641 - val_acc: 0.7000
Epoch 15/100
70/70 - 12s - loss: 0.5924 - acc: 0.6929 - val_loss: 0.7595 - val_acc: 0.6000
Epoch 16/100
70/70 - 12s - loss: 0.5389 - acc: 0.7071 - val_loss: 0.5719 - val_acc: 0.7833
Epoch 17/100
70/70 - 12s - loss: 0.5493 - acc: 0.6714 - val_loss: 0.5234 - val_acc: 0.7500
Epoch 18/100
70/70 - 12s - loss: 0.5050 - acc: 0.7786 - val_loss: 0.7359 - val_acc: 0.6000
Epoch 19/100
70/70 - 12s - loss: 0.5152 - acc: 0.7286 - val_loss: 0.6469 - val_acc: 0.6500
Epoch 20/100
70/70 - 12s - loss: 0.5015 - acc: 0.7786 - val_loss: 0.5651 - val_acc: 0.7333
Epoch 21/100
70/70 - 12s - loss: 0.4975 - acc: 0.7786 - val_loss: 0.8707 - val_acc: 0.5500
Epoch 22/100
70/70 - 12s - loss: 0.4470 - acc: 0.7714 - val_loss: 0.5577 - val_acc: 0.7500
Epoch 23/100
70/70 - 12s - loss: 0.5489 - acc: 0.7071 - val_loss: 0.9929 - val_acc: 0.6500
Epoch 24/100
70/70 - 12s - loss: 0.5045 - acc: 0.7357 - val_loss: 0.5891 - val_acc: 0.7333
Epoch 25/100
70/70 - 12s - loss: 0.5598 - acc: 0.7500 - val_loss: 0.5703 - val_acc: 0.7667
Epoch 26/100
70/70 - 12s - loss: 0.4822 - acc: 0.7429 - val_loss: 0.5631 - val_acc: 0.7333
Epoch 27/100
70/70 - 12s - loss: 0.5572 - acc: 0.7000 - val_loss: 0.6255 - val_acc: 0.6500
Epoch 28/100
70/70 - 12s - loss: 0.4694 - acc: 0.7643 - val_loss: 0.7007 - val_acc: 0.6833
Epoch 29/100
70/70 - 12s - loss: 0.4870 - acc: 0.7571 - val_loss: 1.7148 - val_acc: 0.5667
Epoch 30/100
70/70 - 12s - loss: 0.4794 - acc: 0.7500 - val_loss: 0.5744 - val_acc: 0.7333
Epoch 31/100
70/70 - 12s - loss: 0.4632 - acc: 0.7857 - val_loss: 0.7787 - val_acc: 0.5833

<tensorflow.python.keras.callbacks.History at 0x7fea600ecef0>

サンプル数は非常に少なく (200 だけ) そしてランダムシードは指定していないことに注意することは重要です。そのため、結果のかなりの分散 (ばらつき) を期待できます。1000 以上の CT スキャンから成る完全なデータセットはここで見つかります。完全なデータセットの使用で、83% の精度が獲得されました。両者の場合で分類性能の 6-7 % のばらつきが観察されました。

モデル性能の可視化

ここでは訓練と検証セットのためのモデル精度と損失がプロットされます。検証セットはクラスのバランスが均衡しているので、精度はモデルの性能の公平な表現を提供します。

fig, ax = plt.subplots(1, 2, figsize=(20, 3))
ax = ax.ravel()

for i, metric in enumerate(["acc", "loss"]):
    ax[i].plot(model.history.history[metric])
    ax[i].plot(model.history.history["val_" + metric])
    ax[i].set_title("Model {}".format(metric))
    ax[i].set_xlabel("epochs")
    ax[i].set_ylabel(metric)
    ax[i].legend(["train", "val"])

シングル CT スキャン上で予測する

# Load best weights.
model.load_weights("3d_image_classification.h5")
prediction = model.predict(np.expand_dims(x_val[0], axis=0))[0]
scores = [1 - prediction[0], prediction[0]]

class_names = ["normal", "abnormal"]
for score, name in zip(scores, class_names):
    print(
        "This model is %.2f percent confident that CT scan is %s"
        % ((100 * score), name)
    )

This model is 26.60 percent confident that CT scan is normal
This model is 73.40 percent confident that CT scan is abnormal

以上

2021年10月
月	火	水	木	金	土	日
				1	2	3
4	5	6	7	8	9	10
11	12	13	14	15	16	17
18	19	20	21	22	23	24
25	26	27	28	29	30	31