TensorFlow Graphics : Beginner Tutorials : カメラ・キャリブレーション (翻訳/解説)

翻訳 : (株)クラスキャットセールスインフォメーション
作成日時 : 05/28/2019

* 本ページは、TensorFlow Graphics の github レポジトリの次のページを翻訳した上で適宜、補足説明したものです：

tensorflow/graphics/blob/master/tensorflow_graphics/notebooks/6dof_alignment.ipynb

* サンプルコードの動作確認はしておりますが、必要な場合には適宜、追加改変しています。
* ご自由にリンクを張って頂いてかまいませんが、sales-info@classcat.com までご一報いただけると嬉しいです。

TensorFlow Graphics : Beginner Tutorials : カメラ・キャリブレーション

カメラは 3D オブジェクトの 2D 画像を捕捉できるハードウェアの複雑な部品群です。実際に、リアルなカメラを正確にモデル化するためには、レンズ・ディストーション (= distortion, 歪曲収差)、ISO、焦点距離そして露出時間を含む多くのパラメータを推定する必要があります。以下では、私達の興味を射影カメラモデルに制限します。

このモデルはしばしば内部パラメータとして参照される 2 つのパラメータから成ります :

主点、これは画像上の光学中心の投影です。理想的には、主点は画像の中心に近いです。
焦点距離、これは光学中心とイメージプレーンの間の距離です。このパラメータはズームのレベルを制御することを可能にします。

このノートブックは射影カメラの内部パラメータを推定するために Tensorflow Graphics をどのように利用するかを示します。これらのパラメータのリカバーは 3D 再構成を含む幾つかのタスクを遂行するために特に重要です。

この Colab では、目標は観察とその観察と現在のソリューションのレンダリングの間の対応が与えられたときにカメラの内部パラメータをリカバーすることです。3D シーンに矩形だけを挿入してそれを最適化の間の対応のソースとして使用することにより単純さが保持されます。最小化はレーベンバーグ・マーカート・アルゴリズムを使用して遂行されます。

セットアップ & Imports

このノートブックに含まれるデモを実行するために必要な総てを import しましょう。

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np

#################################
# Imports the necessary modules #
#################################

from tensorflow_graphics.math.optimizer import levenberg_marquardt
from tensorflow_graphics.rendering.camera import perspective  

tf.enable_eager_execution()

射影カメラモデルを理解する

このモデルがどのように動作するかを示すため、カメラの前のシーンには矩形以外は何もないと仮定します。最初にこの矩形をカメラにより観測されたものとしてレンダーする関数を定義しましょう。

def render_rectangle(rectangle_vertices, focal, principal_point, image_dimensions):
  """Renders a rectangle on the image plane.

  Args:
    rectangle_vertices: the four 3D corners of a rectangle.
    focal: the focal lengths of a projective camera.
    principal_point: the position of the principal point on the image plane.
    image_dimensions: the dimensions (in pixels) of the image.

  Returns:
    A 2d image of the 3D rectangle.
  """
  image = np.zeros((int(image_dimensions[0]), int(image_dimensions[1]), 3))
  vertices_2d = perspective.project(rectangle_vertices, focal, principal_point)
  vertices_2d_np = vertices_2d.numpy()
  top_left_corner = np.maximum(vertices_2d_np[0, :], (0, 0)).astype(int)
  bottom_right_corner = np.minimum(
      vertices_2d_np[1, :],
      (image_dimensions[1] - 1, image_dimensions[0] - 1)).astype(int)
  for x in range(top_left_corner[0], bottom_right_corner[0] + 1):
    for y in range(top_left_corner[1], bottom_right_corner[1] + 1):
      c1 = float(bottom_right_corner[0] + 1 -
                 x) / float(bottom_right_corner[0] + 1 - top_left_corner[0])
      c2 = float(bottom_right_corner[1] + 1 -
                 y) / float(bottom_right_corner[1] + 1 - top_left_corner[1])
      image[y, x] = (c1, c2, 1)
  return image

次のセルはデフォルト内部パラメータを定義してこれらのパラメータを使用して矩形をレンダーします。


# Sets up the vertices of the rectangle.
rectangle_depth = 1000.0
rectangle_vertices = np.array(
    ((-150.0, -75.0, rectangle_depth), (150.0, 75.0, rectangle_depth)))

# Sets up the size of the image plane.
image_width = 400
image_height = 300
image_dimensions = np.array((image_height, image_width), dtype=np.float64)

# Sets the horizontal and vertical focal length to be the same. The focal length
# picked yields a field of view around 50degrees.
focal_lengths = np.array((image_height, image_height), dtype=np.float64)
# Sets the principal point at the image center.
ideal_principal_point = np.array(
    (image_width, image_height), dtype=np.float64) / 2.0

# Let's see what our scene looks like using the intrinsic paramters defined above.
render = render_rectangle(rectangle_vertices, focal_lengths, ideal_principal_point,
                          image_dimensions)
_ = plt.imshow(render)

焦点距離と光学中心の位置は最終的なイメージ上で非常に異なる効果を持ちます。これらのパラメータの各々の効果を納得するために下のスライダーの異なる configuration を使用してください。

###############
# UI controls #
###############
#@title model parameters { vertical-output: false, run: "auto" }
focal_length_x = 300  #@param { type: "slider", min: 100.0, max: 500.0, step: 1.0 }
focal_length_y = 300  #@param { type: "slider", min: 100.0, max: 500.0, step: 1.0 }
optical_center_x = 200  #@param { type: "slider", min: 0.0, max: 400.0, step: 1.0 }
optical_center_y = 150.0  #@param { type: "slider", min: 0.0, max: 300.0, step: 1.0 }

render = render_rectangle(
    rectangle_vertices,
    np.array((focal_length_x, focal_length_y), dtype=np.float64),
    np.array((optical_center_x, optical_center_y), dtype=np.float64),
    image_dimensions)
_ = plt.imshow(render)

(訳注: 以下はパラメータを変更して得られたイメージです : )

焦点距離 100	焦点距離 500

光学中心 x == 0	光学中心 x == 400

光学中心 y == 0	光学中心 y == 300

内部パラメータを最適化する

総てのカメラ (e.g. スマートフォン・カメラ) はそれ自身の内部パラメータのセットを持ちます。他のアプリケーションの中では、3D シーン再構成、ロボティクス、そしてナビゲーションシステムにおいて正確な内部パラメータが使用されます。

内部パラメータを推定するための一般的な方法は既知の 3D オブジェクトを使用することです。内部パラメータの現在の推定を使用して、既知の 3D オブジェクトがどのように「見える」べきかを予測して、そしてそれを実際の観測と比較す比較することができます。

結果をプロットする助けとなる幾つかのヘルパー関数を定義することから始めましょう。

def plot_optimization_step(observation, prediction):
  plt.figure(figsize=(20, 10))
  ax = plt.subplot("131")
  ax.set_title("Observation")
  _ = ax.imshow(observation)
  ax = plt.subplot("132")
  ax.set_title("Prediction using estimated intrinsics")
  _ = ax.imshow(prediction)
  ax = plt.subplot("133")
  ax.set_title("Difference image")
  _ = ax.imshow(np.abs(observation - prediction))
  plt.show()


def print_errors(focal_error, center_error):
  print("Error focal length %f" % (focal_error,))
  print("Err principal point %f" % (center_error,))

さて私達が求めている内部パラメータの値、そしてこれらのパラメータの値の初期推測を定義しましょう。

def build_parameters():
  # Constructs the intrinsic parameters we wish to recover.
  real_focal_lengths = focal_lengths * np.random.uniform(0.8, 1.2, size=(2,))
  real_principal_point = ideal_principal_point + (np.random.random(2) -
                                                  0.5) * image_width / 5.0

  # Initializes the first estimate of the intrinsic parameters.
  estimate_focal_lengths = tf.Variable(real_focal_lengths +
                                       (np.random.random(2) - 0.5) *
                                       image_width)
  estimate_principal_point = tf.Variable(real_principal_point +
                                         (np.random.random(2) - 0.5) *
                                         image_width / 4)
  return real_focal_lengths, real_principal_point, estimate_focal_lengths, estimate_principal_point

前述したように、内部パラメータの現在の推定を使用して 3D オブジェクトがどのように見えるか比較することができます、それを実際の観測と比較することができます。次の関数は私達が最小化することを求めるこれら 2 つのイメージ間の距離を捕捉します。

def residuals(estimate_focal_lengths, estimate_principal_point):
  vertices_2d_gt = perspective.project(rectangle_vertices, real_focal_lengths,
                                       real_principal_point)
  vertices_2d_observed = perspective.project(rectangle_vertices,
                                             estimate_focal_lengths,
                                             estimate_principal_point)
  return vertices_2d_gt - vertices_2d_observed

今では問題を解くための総てのピースが適所にあります; let’s give it a go!

Note: 残差は Levenberg-Marquard を使用して最小化されます、これは特にこの問題のために示されます。ファーストクラスの optimizers (e.g. Adam や勾配降下) もまた使用できます。

# Samples intrinsic parameters to recover and an initial solution.
real_focal_lengths, real_principal_point, estimate_focal_lengths, estimate_principal_point = build_parameters(
)

# Contructs the observed image.
observation = render_rectangle(rectangle_vertices, real_focal_lengths,
                               real_principal_point, image_dimensions)

# Displays the initial solution.
print("Initial configuration:")
print_errors(
    np.linalg.norm(estimate_focal_lengths - real_focal_lengths),
    np.linalg.norm(estimate_principal_point - real_principal_point))
image = render_rectangle(rectangle_vertices, estimate_focal_lengths,
                         estimate_principal_point, image_dimensions)
plot_optimization_step(observation, image)

# Optimization.
_, (estimate_focal_lengths,
    estimate_principal_point) = levenberg_marquardt.minimize(
        residuals, (estimate_focal_lengths, estimate_principal_point), 1)

print("Predicted configuration:")
print_errors(
    np.linalg.norm(estimate_focal_lengths - real_focal_lengths),
    np.linalg.norm(estimate_principal_point - real_principal_point))
image = render_rectangle(rectangle_vertices, estimate_focal_lengths,
                         estimate_principal_point, image_dimensions)
plot_optimization_step(observation, image)

以上

月	火	水	木	金	土	日
		1	2	3	4	5
6	7	8	9	10	11	12
13	14	15	16	17	18	19
20	21	22	23	24	25	26
27	28	29	30	31