TensorFlow: Eager Execution (Translation and Commentary)

Translation: ClassCat Co., Ltd. Sales Information
Created: 02/02/2018

* This page is a translation, with supplementary explanation where appropriate, of guide.md (TensorFlow Eager Execution) in tensorflow/contrib/eager on GitHub:

TensorFlow Eager Execution

* You are welcome to link to this page, but we would appreciate a short note to sales-info@classcat.com.

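Overview

Eager execution is a feature that makes TensorFlow execute operations immediately: concrete values are returned, instead of a computational graph to be executed later.

As a result, enabling eager execution provides:

A NumPy-like library for numerical computation with support for GPU acceleration and automatic differentiation.
A flexible platform for machine learning research and experimentation.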
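
The difference between deferred and immediate evaluation can be illustrated without TensorFlow at all. The following is a minimal plain-Python sketch; the Deferred class is a hypothetical stand-in for a graph node and is not part of any TensorFlow API.

```python
class Deferred(object):
    """Hypothetical stand-in for a graph node: records an operation, computes nothing."""

    def __init__(self, fn, *inputs):
        self.fn = fn
        self.inputs = inputs

    def run(self):
        # Evaluate the inputs first (recursively), then apply this node's function.
        values = [i.run() if isinstance(i, Deferred) else i for i in self.inputs]
        return self.fn(*values)


# Deferred ("graph") style: building the expression yields a handle, not a value.
product = Deferred(lambda a, b: a * b, 2, 3)
total = Deferred(lambda a, b: a + b, 1, product)
deferred_result = total.run()  # the computation happens only here

# Immediate ("eager") style: every operation returns a concrete value right away.
eager_result = 1 + 2 * 3

print(deferred_result, eager_result)
```

In the deferred style, `total` can be inspected only as a description of work to do; in the immediate style, intermediate values are ordinary numbers that can be printed or debugged at any point.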

[Note]
Eager execution is under active development. This guide walks through an alpha/preview release. In particular, not all TensorFlow APIs work with eager execution enabled, and some models may be slower to execute compared to models defined without using eager execution.

Getting Started

With TensorFlow installed, eager execution is enabled via a single call:

import tensorflow as tf

import tensorflow.contrib.eager as tfe

tfe.enable_eager_execution()

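Enabling eager execution changes how TensorFlow functions behave (in particular, Tensor objects reference concrete values instead of being symbolic handles to nodes in a computational graph).
As a result, eager execution should be enabled at the beginning of a program and cannot be disabled afterwards in the same program.

Code examples in the rest of this guide assume that eager execution has been enabled.

A library for numerical computation

A significant fraction of the TensorFlow API consists of numerical operations: arithmetic operations, matrix operations, linear algebra operations, etc.

With eager execution enabled, these operations consume and return multi-dimensional arrays as Tensor objects, similar to NumPy ndarrays. For example: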

# Multiply two 2x2 matrices
x = tf.matmul([[1, 2],
               [3, 4]],
              [[4, 5],
               [6, 7]])
# Add one to each element
# (tf.add supports broadcasting)
y = tf.add(x, 1)

# Create a random 5x3 matrix
z = tf.random_uniform([5, 3])

print(x)
print(y)
print(z)

Output:

tf.Tensor(
[[16 19]
 [36 43]], shape=(2, 2), dtype=int32)
tf.Tensor(
[[17 20]
 [37 44]], shape=(2, 2), dtype=int32)
tf.Tensor(
[[ 0.25058532  0.0929395   0.54113817]
 [ 0.3108716   0.93350542  0.84909797]
 [ 0.53081679  0.12788558  0.01767385]
 [ 0.29725885  0.33540785  0.83588314]
 [ 0.38877153  0.39720535  0.78914213]], shape=(5, 3), dtype=float32)

For convenience, these operations can also be triggered via operator overloading on Tensor objects. For example, the + operator is equivalent to tf.add, - to tf.subtract, * to tf.multiply, etc.:

x = (tf.ones([1], dtype=tf.float32) + 1) * 2 - 1
print(x)

Output:

tf.Tensor([ 3.], shape=(1,), dtype=float32)

Converting to and from NumPy

The operations above automatically convert Python objects (like lists of numbers) and NumPy arrays to Tensor objects. Tensor objects can also be used as NumPy arrays by NumPy operations.

import numpy as np

x = tf.add(1, 1)                     # tf.Tensor with a value of 2
y = tf.add(np.array(1), np.array(1)) # tf.Tensor with a value of 2
z = np.multiply(x, y)                # numpy.int64 with a value of 4

Alternatively, they can be converted explicitly using tf.constant, as shown in the next example.

Conversely, you can call the numpy() method of a Tensor object to obtain its NumPy ndarray value. For example:

import numpy as np

np_x = np.array(2., dtype=np.float32)
x = tf.constant(np_x)

py_y = 3.
y = tf.constant(py_y)

z = x + y + 1

print(z)
print(z.numpy())

Output:

tf.Tensor(6.0, shape=(), dtype=float32)
6.0

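GPU acceleration

Many TensorFlow operations support GPU acceleration. With eager execution enabled, computation is not automatically offloaded to GPUs. Instead, you must explicitly specify when GPUs should be used.

The simplest way to do this is to enclose your computation in a with tf.device('/gpu:0') block. Also of interest is tfe.num_gpus(), which returns the number of available GPUs.

For example, consider this snippet, which measures the time to multiply a 1000x1000 matrix by itself on the CPU and, if one is available, on a GPU: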

import time

def measure(x):
  # The very first time a GPU is used by TensorFlow, it is initialized.
  # So exclude the first run from timing.
  tf.matmul(x, x)

  start = time.time()
  for i in range(10):
    tf.matmul(x, x)
  end = time.time()

  return "Took %s seconds to multiply a %s matrix by itself 10 times" % (end - start, x.shape)

# Run on CPU:
with tf.device("/cpu:0"):
  print("CPU: %s" % measure(tf.random_normal([1000, 1000])))

# If a GPU is available, run on GPU:
if tfe.num_gpus() > 0:
  with tf.device("/gpu:0"):
    print("GPU: %s" % measure(tf.random_normal([1000, 1000])))

Output (exact numbers will depend on the characteristics of the hardware):

 
CPU: Took 0.145531892776 seconds to multiply a (1000, 1000) matrix by itself 10 times
GPU: Took 0.000458955764771 seconds to multiply a (1000, 1000) matrix by itself 10 times

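Alternatively, methods on the Tensor object can be used to explicitly copy the Tensor to a different device. Operations are typically executed on the device on which the inputs are placed. For example: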

x = tf.random_normal([10, 10])

x_gpu0 = x.gpu()
x_cpu = x.cpu()

_ = tf.matmul(x_cpu, x_cpu)  # Runs on CPU
_ = tf.matmul(x_gpu0, x_gpu0)  # Runs on GPU:0

if tfe.num_gpus() > 1:
  x_gpu1 = x.gpu(1)
  _ = tf.matmul(x_gpu1, x_gpu1)  # Runs on GPU:1

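Automatic differentiation

Automatic differentiation is very useful when implementing many machine learning algorithms (e.g., backpropagation for training neural networks). For this purpose, TensorFlow eager execution provides an autograd-style API for automatic differentiation. Specifically, the functions:

tfe.gradients_function(f): Returns a Python function that computes the derivatives of the Python function f with respect to its arguments. f must return a scalar value. When the returned function is invoked, it returns a list of Tensor objects (one element for each argument of f).
tfe.value_and_gradients_function(f): Similar to tfe.gradients_function, except that when the returned function is invoked, it returns the value of f in addition to the list of derivatives of f with respect to its arguments.

These functions naturally apply to higher order differentiation as well: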

def f(x):
  return tf.multiply(x, x)  # Or x * x
assert 9 == f(3.).numpy()

df = tfe.gradients_function(f)
assert 6 == df(3.)[0].numpy()

# Second order derivative.
d2f = tfe.gradients_function(lambda x: df(x)[0])
assert 2 == d2f(3.)[0].numpy()

# Third order derivative.
d3f = tfe.gradients_function(lambda x : d2f(x)[0])
assert 0 == d3f(3.)[0].numpy()
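
To make the calling convention concrete, here is a rough plain-Python analogue. It is only a sketch under stated assumptions: numeric_gradients_function is a hypothetical helper that approximates derivatives with central finite differences, whereas tfe.gradients_function computes them by automatic differentiation. What the two share is the shape of the interface: a function goes in, and a function returning one derivative per argument comes out, so the result can be nested for higher orders.

```python
def numeric_gradients_function(f, h=1e-3):
    """Hypothetical helper: returns a function that computes approximate
    derivatives of f with respect to each of its positional arguments."""
    def grad_fn(*args):
        grads = []
        for i in range(len(args)):
            up = list(args)
            down = list(args)
            up[i] += h
            down[i] -= h
            # Central difference approximation of the i-th partial derivative.
            grads.append((f(*up) - f(*down)) / (2 * h))
        return grads
    return grad_fn


def f(x):
    return x * x


df = numeric_gradients_function(f)
# Nesting works because df is itself an ordinary Python function.
d2f = numeric_gradients_function(lambda x: df(x)[0])

first = df(3.)[0]    # approximately 6
second = d2f(3.)[0]  # approximately 2

# One derivative per argument, in argument order.
partials = numeric_gradients_function(lambda a, b: a * b)(2., 5.)
```

Finite differences lose accuracy with every level of nesting, which is one reason automatic differentiation is preferred in practice.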

These functions can be used to train models. For example, consider the following simple linear regression model:

def prediction(input, weight, bias):
  return input * weight + bias

# A toy dataset of points around 3 * x + 2
NUM_EXAMPLES = 1000
training_inputs = tf.random_normal([NUM_EXAMPLES])
noise = tf.random_normal([NUM_EXAMPLES])
training_outputs = training_inputs * 3 + 2 + noise

# A loss function: Mean-squared error
def loss(weight, bias):
  error = prediction(training_inputs, weight, bias) - training_outputs
  return tf.reduce_mean(tf.square(error))

# Function that returns the derivative of loss with respect to
# weight and bias
grad = tfe.gradients_function(loss)

# Train for 200 steps (starting from some random choice for W and B, on the same
# batch of data).
W = 5.
B = 10.
learning_rate = 0.01
print("Initial loss: %f" % loss(W, B).numpy())
for i in range(200):
  (dW, dB) = grad(W, B)
  W -= dW * learning_rate
  B -= dB * learning_rate
  if i % 20 == 0:
    print("Loss at step %d: %f" % (i, loss(W, B).numpy()))
print("Final loss: %f" % loss(W, B).numpy())
print("W, B = %f, %f" % (W.numpy(), B.numpy()))

Output (the exact numbers may vary depending on the randomness in the noise):

Initial loss: 66.730003
Loss at step 0: 64.200096
Loss at step 20: 29.872814
Loss at step 40: 14.233772
Loss at step 60: 7.090570
Loss at step 80: 3.819887
Loss at step 100: 2.318821
Loss at step 120: 1.628385
Loss at step 140: 1.310142
Loss at step 160: 1.163167
Loss at step 180: 1.095162
Final loss: 1.064711
W, B = 3.094944, 2.161383

To utilize the GPU, place the code above within a with tf.device("/gpu:0"): block.
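
To see what grad(W, B) supplies in the training loop, the same regression can be written in plain Python with the derivatives of the mean-squared error worked out by hand. This is only an illustrative sketch, not TensorFlow code: the seed, the use of the random module, and the variable names are assumptions made for the example.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# A toy dataset of points around 3 * x + 2
NUM_EXAMPLES = 1000
inputs = [random.gauss(0., 1.) for _ in range(NUM_EXAMPLES)]
outputs = [3. * x + 2. + random.gauss(0., 1.) for x in inputs]


def loss(weight, bias):
    # Mean-squared error of the linear model on the toy dataset.
    return sum((x * weight + bias - y) ** 2
               for x, y in zip(inputs, outputs)) / NUM_EXAMPLES


def grad(weight, bias):
    # Hand-derived partial derivatives of the mean-squared error:
    #   d(loss)/d(weight) = mean(2 * error * x)
    #   d(loss)/d(bias)   = mean(2 * error)
    errors = [x * weight + bias - y for x, y in zip(inputs, outputs)]
    d_weight = sum(2. * e * x for e, x in zip(errors, inputs)) / NUM_EXAMPLES
    d_bias = sum(2. * e for e in errors) / NUM_EXAMPLES
    return d_weight, d_bias


W, B = 5., 10.
learning_rate = 0.01
initial_loss = loss(W, B)
for _ in range(200):
    dW, dB = grad(W, B)
    W -= dW * learning_rate
    B -= dB * learning_rate
final_loss = loss(W, B)

print("Initial loss: %f, final loss: %f" % (initial_loss, final_loss))
print("W, B = %f, %f" % (W, B))
```

Writing grad by hand is manageable for two parameters; automatic differentiation removes that step and keeps the derivative in sync with the loss when the model changes.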

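Customizing gradients

You may want to define custom gradients for an operation, or for a function. This may be useful for multiple reasons, including providing a more efficient or more numerically stable gradient for a sequence of operations.

For example, consider the function log(1 + e^x), which commonly occurs in the computation of cross entropy and log likelihoods.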

def log1pexp(x):
  return tf.log(1 + tf.exp(x))
grad_log1pexp = tfe.gradients_function(log1pexp)

# Works fine at x = 0.
assert 0.5 == float(grad_log1pexp(0.)[0])

# Returns a `nan` at x = 100 due to numerical instability.
import math
assert math.isnan(float(grad_log1pexp(100.)[0]))

We can define a custom gradient for the above function that analytically simplifies the gradient expression.

@tfe.custom_gradient
def log1pexp(x):
  e = tf.exp(x)
  def grad(dy):
    return dy * (1 - 1 / (1 + e))
  return tf.log(1 + e), grad
grad_log1pexp = tfe.gradients_function(log1pexp)

# Works as before at x = 0.
assert 0.5 == float(grad_log1pexp(0.)[0])

# But now works at x = 100 as well.
assert 1.0 == float(grad_log1pexp(100.)[0])

Also notice how the gradient function implementation reuses an expression (tf.exp(x)) computed during the forward pass, making the gradient computation more efficient by avoiding redundant computation.
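
The source of the nan can be reproduced with ordinary floating-point arithmetic. The sketch below is plain Python rather than TensorFlow and rests on two assumptions: exp_saturating is a hypothetical helper that overflows to inf the way tensor arithmetic does, and naive_grad mirrors roughly what a term-by-term chain rule produces for log(1 + e^x).

```python
import math


def exp_saturating(x):
    """Hypothetical helper: like math.exp, but overflows to inf instead of
    raising, mimicking IEEE floating-point behaviour in tensor libraries."""
    try:
        return math.exp(x)
    except OverflowError:
        return float("inf")


def naive_grad(x):
    # Chain rule applied term by term: d/dx log(u) = (1 / u) * du/dx,
    # with u = 1 + e^x and du/dx = e^x.
    e = exp_saturating(x)
    return (1 / (1 + e)) * e


def simplified_grad(x):
    # The same derivative simplified analytically to 1 - 1 / (1 + e^x).
    e = exp_saturating(x)
    return 1 - 1 / (1 + e)


assert naive_grad(0.) == 0.5 and simplified_grad(0.) == 0.5
# Python floats are 64-bit, so overflow needs a larger input than the
# float32 example above (x = 100).
assert math.isnan(naive_grad(1000.))   # 0 * inf
assert simplified_grad(1000.) == 1.0   # 1 - 1 / inf
```

Once e^x overflows, the naive form multiplies zero by infinity, while the simplified form only ever divides by infinity, which is well defined.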

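Building and training models

In practice, your computation may have many parameters that need to be optimized (by computing derivatives). Encapsulating them into reusable classes/objects makes the code easier to follow than writing a single top-level function with many arguments.

In fact, eager execution encourages use of the Keras-style "Layer" classes in the tf.layers module.

Furthermore, you may want to apply more sophisticated techniques to compute parameter updates, such as those in tf.train.Optimizer implementations.

The next sections walk through using, with eager execution enabled, the same Optimizer and Layer APIs that are used to build trainable TensorFlow graphs.

Variables and Optimizers

tfe.Variable objects store mutable Tensor values that can be accessed during training, making automatic differentiation easier. In particular, parameters of a model can be encapsulated in Python classes as variables.

tfe.gradients_function(f), introduced earlier, computes the derivatives of f with respect to its arguments. However, it requires all parameters of interest to be arguments of f, which becomes cumbersome when f depends on a large number of trainable parameters.

tfe.implicit_gradients is an alternative function with some useful properties:

It computes the derivatives of f with respect to all the tfe.Variables used by f.
When the returned function is invoked, it returns a list of (gradient value, Variable object) tuples.

Representing model parameters as Variable objects, along with the use of tfe.implicit_gradients, typically results in better encapsulation. For example, the linear regression model described above can be written as the following class: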

class Model(object):
  def __init__(self):
    self.W = tfe.Variable(5., name='weight')
    self.B = tfe.Variable(10., name='bias')

  def predict(self, inputs):
    return inputs * self.W + self.B


# The loss function to be optimized
def loss(model, inputs, targets):
  error = model.predict(inputs) - targets
  return tf.reduce_mean(tf.square(error))

# A toy dataset of points around 3 * x + 2
NUM_EXAMPLES = 1000
training_inputs = tf.random_normal([NUM_EXAMPLES])
noise = tf.random_normal([NUM_EXAMPLES])
training_outputs = training_inputs * 3 + 2 + noise

# Define:
# 1. A model
# 2. Derivatives of a loss function with respect to model parameters
# 3. A strategy for updating the variables based on the derivatives
model = Model()
grad = tfe.implicit_gradients(loss)
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)

# The training loop
print("Initial loss: %f" %
      loss(model, training_inputs, training_outputs).numpy())
for i in range(201):
  optimizer.apply_gradients(grad(model, training_inputs, training_outputs))
  if i % 20 == 0:
    print("Loss at step %d: %f" %
          (i, loss(model, training_inputs, training_outputs).numpy()))
print("Final loss: %f" % loss(model, training_inputs, training_outputs).numpy())
print("W, B = %s, %s" % (model.W.numpy(), model.B.numpy()))

Output:

Initial loss: 69.693184
Loss at step 0: 66.987854
Loss at step 20: 30.553387
Loss at step 40: 14.250237
Loss at step 60: 6.955020
Loss at step 80: 3.690550
Loss at step 100: 2.229739
Loss at step 120: 1.576032
Loss at step 140: 1.283496
Loss at step 160: 1.152584
Loss at step 180: 1.093999
Final loss: 1.067780
W, B = 3.0114281, 2.0865183

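Using implicit_gradients avoids the need to provide all the trainable parameters of the model as arguments to the loss function.

Using Keras and the Layers API

Keras is a popular API for defining model structures. The tf.keras.layers module provides a set of building blocks for models and is implemented using the tf.layers.Layer subclasses in the tf.layers module. We encourage the use of these same building blocks when using TensorFlow's eager execution feature. For example, the same linear regression model can be built using tf.layers.Dense: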

class Model(object):
  def __init__(self):
    self.layer = tf.layers.Dense(1)

  def predict(self, inputs):
    return self.layer(inputs)

The tf.layers API makes it more convenient to define more sophisticated models. For example, the following trains an MNIST model:

class MNISTModel(object):
  def __init__(self, data_format):
    # 'channels_first' is typically faster on GPUs
    # while 'channels_last' is typically faster on CPUs.
    # See: https://www.tensorflow.org/performance/performance_guide#data_formats
    if data_format == 'channels_first':
      self._input_shape = [-1, 1, 28, 28]
    else:
      self._input_shape = [-1, 28, 28, 1]
    self.conv1 = tf.layers.Conv2D(32, 5,
                                  padding='same',
                                  activation=tf.nn.relu,
                                  data_format=data_format)
    self.max_pool2d = tf.layers.MaxPooling2D(
        (2, 2), (2, 2), padding='same', data_format=data_format)
    self.conv2 = tf.layers.Conv2D(64, 5,
                                  padding='same',
                                  activation=tf.nn.relu,
                                  data_format=data_format)
    self.dense1 = tf.layers.Dense(1024, activation=tf.nn.relu)
    self.dropout = tf.layers.Dropout(0.5)
    self.dense2 = tf.layers.Dense(10)

  def predict(self, inputs):
    x = tf.reshape(inputs, self._input_shape)
    x = self.max_pool2d(self.conv1(x))
    x = self.max_pool2d(self.conv2(x))
    x = tf.layers.flatten(x)
    x = self.dropout(self.dense1(x))
    return self.dense2(x)

def loss(model, inputs, targets):
  return tf.reduce_mean(
      tf.nn.softmax_cross_entropy_with_logits(
          logits=model.predict(inputs), labels=targets))


# Load the training and validation data
from tensorflow.examples.tutorials.mnist import input_data
data = input_data.read_data_sets("./mnist_data", one_hot=True)

# Train
device = "gpu:0" if tfe.num_gpus() else "cpu:0"
model = MNISTModel('channels_first' if tfe.num_gpus() else 'channels_last')
optimizer = tf.train.AdamOptimizer(learning_rate=1e-4)
grad = tfe.implicit_gradients(loss)
for i in range(20001):
  with tf.device(device):
    (inputs, targets) = data.train.next_batch(50)
    optimizer.apply_gradients(grad(model, inputs, targets))
    if i % 100 == 0:
      print("Step %d: Loss on training set : %f" %
            (i, loss(model, inputs, targets).numpy()))
print("Loss on test set: %f" % loss(model, data.test.images, data.test.labels).numpy())

For a more complete example, see tensorflow/contrib/eager/python/examples/mnist.py.

Checkpointing trained variables

TensorFlow Variables (tfe.Variable) provide a way to represent the shared, persistent state of your model. The tfe.Saver class (which is a thin wrapper over the tf.train.Saver class) provides a means to save and restore variables to and from checkpoints.

For example:

# Create variables.
x = tfe.Variable(10., name='x')
y = tfe.Variable(5., name='y')

# Create a Saver.
saver = tfe.Saver([x, y])

# Assign new values to the variables and save.
x.assign(2.)
saver.save('/tmp/ckpt')

# Change the variable after saving.
x.assign(11.)
assert 16. == (x + y).numpy()  # 11 + 5

# Restore the values in the checkpoint.
saver.restore('/tmp/ckpt')

assert 7. == (x + y).numpy()  # 2 + 5

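tfe.Network

You may often want to organize your models using classes, like the MNISTModel class defined above. We recommend inheriting from the tfe.Network class, as it provides conveniences such as keeping track of all model variables and methods to save and restore from checkpoints.

Subclasses of tfe.Network may register Layers (like classes in tf.layers, or Keras layers) using a call to self.track_layer() and define the computation in an implementation of call().

Note that tf.layers.Layer objects create variables lazily, when the first input is encountered.

For example, consider the following two-layer neural network: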

class TwoLayerNet(tfe.Network):
  def __init__(self):
    super(TwoLayerNet, self).__init__()
    self.layer1 = self.track_layer(
      tf.layers.Dense(2, activation=tf.nn.relu, use_bias=False))
    self.layer2 = self.track_layer(tf.layers.Dense(3, use_bias=False))

  def call(self, x):
    return self.layer2(self.layer1(x))

net = TwoLayerNet()

# No variables created yet
assert 0 == len(net.variables)

# They are created on first input:
inp = tf.constant([[1.]])

# Since input is a 1x1 matrix, net.layer1 has 2 units and net.layer2 has 3 units,
# the output is the product of a 1x1 matrix with a 1x2 matrix with a 2x3
# matrix.
assert [1, 3] == net(inp).shape.as_list()  # Invoke net; get output shape.
assert 1 == len(net.layer1.variables)
assert 1 == len(net.layer2.variables)
assert 2 == len(net.variables)  # weights for each layer.
assert [1, 2] == net.variables[0].shape.as_list()  # weights of layer1.
assert [2, 3] == net.variables[1].shape.as_list()  # weights of layer2.

The tfe.Network class is itself a subclass of tf.layers.Layer. This allows instances of tfe.Network to be embedded in other networks. For example:

class ThreeLayerNet(tfe.Network):
  def __init__(self):
    super(ThreeLayerNet, self).__init__()
    self.a = self.track_layer(TwoLayerNet())
    self.b = self.track_layer(tf.layers.Dense(4, use_bias=False))

  def call(self, x):
    return self.b(self.a(x))

net = ThreeLayerNet()

assert [1, 4] == net(inp).shape.as_list()
assert 3 == len(net.variables)
assert [1, 2] == net.variables[0].shape.as_list()
assert [2, 3] == net.variables[1].shape.as_list()
assert [3, 4] == net.variables[2].shape.as_list()

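See more examples in tensorflow/contrib/eager/python/examples.

tfe.Saver in combination with tfe.restore_variables_on_create provides a convenient way to save and load checkpoints without changing the program once the checkpoint has been created. For example, we can set an objective for the output of our network, choose an optimizer, and choose a location for the checkpoint: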

import os

objective = tf.constant([[2., 3., 4., 5.]])
optimizer = tf.train.AdamOptimizer(0.01)
checkpoint_directory = '/tmp/tfe_example'
checkpoint_prefix = os.path.join(checkpoint_directory, 'ckpt')
net = ThreeLayerNet()

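Note that variables have not been created yet. We want them to be restored from a checkpoint, if one exists, so we create them inside a tfe.restore_variables_on_create context manager. Then our training loop is the same whether starting training or resuming from a previous checkpoint: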

with tfe.restore_variables_on_create(
    tf.train.latest_checkpoint(checkpoint_directory)):
  global_step = tf.train.get_or_create_global_step()
  for _ in range(100):
    loss_fn = lambda: tf.norm(net(inp) - objective)
    optimizer.minimize(loss_fn, global_step=global_step)
    if tf.equal(global_step % 20, 0):
      print("Step %d, output %s" % (global_step.numpy(),
                                    net(inp).numpy()))
      all_variables = (
          net.variables
          + optimizer.variables()
          + [global_step])
      # Save the checkpoint.
      tfe.Saver(all_variables).save(checkpoint_prefix, global_step=global_step)

The first time it runs, the Network variables are initialized randomly. Then the output is trained to match the objective we have set:

Step 20, output [[ 0.03575622  0.29863232  0.03474367  0.24735749]]
Step 40, output [[ 0.40646029  0.9856872   0.46851286  0.95358551]]
Step 60, output [[ 1.74541104  2.800704    1.79055595  2.74783421]]
Step 80, output [[ 2.14977384  3.44340849  3.96120024  5.16242075]]
Step 100, output [[ 1.99943113  3.02364397  3.93500996  4.9610076 ]]

On subsequent runs, variables are initialized with the values read from the latest checkpoint. Running the same code again, we continue from where we left off:

Step 120, output [[ 1.99234128  3.0271616   3.98732996  4.96401167]]
Step 140, output [[ 2.00133467  3.01270437  4.00616646  5.00406504]]
Step 160, output [[ 1.99647415  2.9956708   3.99064088  4.99632359]]
Step 180, output [[ 2.00699997  3.00904822  4.00706148  5.01193142]]
Step 200, output [[ 1.98334622  2.98249531  3.97375059  4.97123432]]

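Summaries, metrics and TensorBoard

TensorBoard is a popular tool for understanding, debugging and optimizing the model training process. To benefit from the visualizations offered by TensorBoard, summary events need to be written during the course of execution of your program. You may find many TensorFlow programs that include tf.summary operations during graph construction.

tf.summary operations are not compatible with eager execution, but an equivalent alternative exists in tf.contrib.summary that is compatible with both eager execution and graph construction.

During model construction, simply insert summary operations like tf.contrib.summary.scalar. These operations do nothing by default, unless a summary writer is currently active and a writing policy is set.

For example, to record summaries once every 100 global steps, use: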

tf.train.get_or_create_global_step()  # Ensuring the global step variable exists
writer = tf.contrib.summary.create_file_writer(logdir)

for _ in range(iterations):
  with writer.as_default():
    with tf.contrib.summary.record_summaries_every_n_global_steps(100):
      # your model code goes here
      tf.contrib.summary.scalar('loss', loss)
      # ...

For a full model using tf.contrib.summary, see the complete mnist example in tensorflow/contrib/eager/python/examples/mnist.

Similar to summaries, the metrics in tf.metrics are currently not compatible with eager execution. Instead, object-oriented metrics are provided in the tfe.metrics package, which are also compatible with graph construction.

Metrics in tfe.metrics, such as tfe.metrics.Mean and tfe.metrics.Accuracy, all implement an intuitive object-oriented interface. Here is an example of how to use the tfe.metrics.Mean metric:

# Metrics are objects, which can be created and destroyed.
my_mean = tfe.metrics.Mean(name='my_mean')
# While a metric is active, you can call it as a function to accumulate into its
# internal state.
my_mean(0.0)
my_mean(10.0)
# Once you've finished updating the metric, you can get its result. In this case
# a simple average over all the calls to it. If a summary writer is active the
# metric will write the appropriate summaries using the metric name.
assert 5.0 == my_mean.result().numpy()
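
The object-oriented pattern itself is small enough to sketch in plain Python. RunningMean below is a hypothetical class written for illustration; it is not how tfe.metrics.Mean is implemented and it writes no summaries, but it shows the same call-to-accumulate, result()-to-read interface.

```python
class RunningMean(object):
    """Hypothetical metric object: call it to accumulate, result() to read."""

    def __init__(self, name):
        self.name = name
        self.total = 0.0
        self.count = 0

    def __call__(self, value):
        # Accumulate into internal state, like calling my_mean(value) above.
        self.total += value
        self.count += 1

    def result(self):
        # Average over all calls so far (undefined before the first call).
        return self.total / self.count


mean = RunningMean(name='my_mean')
mean(0.0)
mean(10.0)
assert 5.0 == mean.result()
```

Keeping the accumulated state inside the object is what lets the same metric be updated from anywhere in a training or evaluation loop.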

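For a complete example of a model using metrics for evaluation, see the mnist example in tensorflow/contrib/eager/python/examples/mnist.

Input pipelines

The discussion above has been centered around the computation executed by your model. The tf.data module provides APIs to build complex input pipelines from simple, reusable pieces.

If you are familiar with constructing tf.data.Dataset objects when building TensorFlow graphs, the same API calls are used when eager execution is enabled. However, the process of iterating over elements of the dataset differs between eager execution and graph construction. When eager execution is enabled, the discussion on iterator creation using make_one_shot_iterator() and get_next() in the Programmer's Guide is not applicable. Instead, a more Pythonic Iterator class is available.

For example: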

# Create a source Dataset from in-memory numpy arrays.
# For reading from files on disk, you may want to use other Dataset classes
# like the TextLineDataset or the TFRecordDataset.
dataset = tf.data.Dataset.from_tensor_slices([1, 2, 3, 4, 5, 6])

# Apply transformations, shuffling, batching etc.
dataset = dataset.map(tf.square).shuffle(2).batch(2)

# Use tfe.Iterator to iterate over the dataset.
for x in tfe.Iterator(dataset):
  print(x)

Output:

tf.Tensor([4 9], shape=(2,), dtype=int32)
tf.Tensor([16 25], shape=(2,), dtype=int32)
tf.Tensor([36  1], shape=(2,), dtype=int32)
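
The order of the output follows from how a buffered shuffle works. The plain-Python generators below are a sketch of the map, shuffle and batch stages under the assumption that shuffle(2) keeps a two-element buffer and emits a random element from it; the function names are hypothetical and the code does not use tf.data.

```python
import random


def shuffle(elements, buffer_size, rng):
    """Buffered shuffle: fill a buffer, then repeatedly emit a random element
    from it and refill from the source."""
    buffer = []
    for element in elements:
        buffer.append(element)
        if len(buffer) == buffer_size:
            yield buffer.pop(rng.randrange(len(buffer)))
    # Source exhausted: drain what is left in random order.
    while buffer:
        yield buffer.pop(rng.randrange(len(buffer)))


def batch(elements, batch_size):
    """Group consecutive elements into lists of batch_size (last may be short)."""
    current = []
    for element in elements:
        current.append(element)
        if len(current) == batch_size:
            yield current
            current = []
    if current:
        yield current


rng = random.Random(0)
squares = (x * x for x in [1, 2, 3, 4, 5, 6])  # stands in for map(tf.square)
batches = list(batch(shuffle(squares, 2, rng), 2))
for b in batches:
    print(b)
```

With a buffer of two, an element can be delayed but never emitted before its predecessor's slot opens, which is consistent with the output above, where 1 stays in the buffer until the end.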

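Interoperating with graphs

Eager execution improves the process of model development in Python; however, because it is in its earliest stages, it does not yet support some features (available to TensorFlow graphs) that are desirable when deploying models in production. In particular, eager execution does not yet support distributed training, exporting models (to other programming languages, TensorFlow Serving, and mobile applications), and various memory and computation optimizations (that are applied to TensorFlow dataflow graphs).

That said, the APIs used to build models are exactly the same whether executing eagerly or constructing graphs. This means that you can iteratively develop your model with eager execution enabled and later, if needed, use the same code to reap the benefits of representing models as computational graphs.

For example, mnist.py defines a model that is eagerly executed. That same code is used to construct and execute a graph in mnist_graph_test.py.

Other models in the examples directory demonstrate this as well.

Some differences worth noting:

There is no notion of a tf.placeholder or a tf.Session when eager execution is enabled.
Many properties on the tf.Tensor object, like tf.Tensor.name, tf.Tensor.op and tf.Tensor.inputs, are not meaningful when eager execution is enabled, and their use should raise an AttributeError.
To use tfe.implicit_gradients in graph construction, variables must be created with use_resource=True provided to tf.get_variable() or tf.variable_scope().
Some API calls (such as the functional-style tf.layers.dense and tf.layers.conv2d) are not compatible with eager execution. Use of such methods should raise an error indicating the alternative (e.g., the tf.layers.Dense and tf.layers.Conv2D classes).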

End
