Keras Stable Diffusion: Basic Usage (Text-to-Image / Image-to-Image) (Translation / Commentary)

Translation: ClassCat Co., Ltd. Sales Information
Created: 12/27/2022

* This page is a translated and reorganized version of the Colab notebooks linked from the following document in the divamgupta/stable-diffusion-tensorflow repository on GitHub. Some parts have been modified:

divamgupta/stable-diffusion-tensorflow/README.md

* The sample code has been verified to run, with additions and changes made where necessary.
* Feel free to link to this page; a short note to sales-info@classcat.com would be appreciated.

Keras Stable Diffusion: GPU Starter Example (Text-to-Image)

Install the GPU requirements

!pip install git+https://github.com/fchollet/stable-diffusion-tensorflow --upgrade --quiet
!pip install tensorflow tensorflow_addons ftfy --upgrade --quiet
!apt install --allow-change-held-packages libcudnn8=8.1.0.77-1+cuda11.2

Instantiate the generator and create a first image

The first run carries some additional compilation overhead.

# ClassCat: StableDiffusion is used here in place of the notebook's original Text2Image
from stable_diffusion_tf.stable_diffusion import StableDiffusion
#from stable_diffusion_tf.stable_diffusion import Text2Image
from PIL import Image

generator = StableDiffusion(
#generator = Text2Image( 
    img_height=512,
    img_width=512,
    jit_compile=False,  # You can try True as well (different performance profile)
)
img = generator.generate(
    "DSLR photograph of an astronaut riding a horse",
    num_steps=50,
    unconditional_guidance_scale=7.5,
    temperature=1,
    batch_size=1,
)
pil_img = Image.fromarray(img[0])
display(pil_img)

WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/tensorflow/python/autograph/pyct/static_analysis/liveness.py:83: Analyzer.lamba_check (from tensorflow.python.autograph.pyct.static_analysis.liveness) is deprecated and will be removed after 2023-09-23.
Instructions for updating:
Lambda fuctions will be no more assumed to be used in the statement where they are used, or at least in the same block. https://github.com/tensorflow/tensorflow/issues/56089
Downloading data from https://huggingface.co/fchollet/stable-diffusion/resolve/main/text_encoder.h5
492456896/492456896 [==============================] - 2s 0us/step
Downloading data from https://huggingface.co/fchollet/stable-diffusion/resolve/main/diffusion_model.h5
3439035312/3439035312 [==============================] - 34s 0us/step
Downloading data from https://huggingface.co/fchollet/stable-diffusion/resolve/main/decoder.h5
198152112/198152112 [==============================] - 12s 0us/step
Downloading data from https://huggingface.co/divamgupta/stable-diffusion-tensorflow/resolve/main/encoder_newW.h5
136801296/136801296 [==============================] - 8s 0us/step
  0   1: 100%|██████████| 50/50 [01:15<00:00,  1.52s/it]
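
The call above passes `unconditional_guidance_scale=7.5` without explaining it. The sketch below shows the usual classifier-free guidance formula this kind of parameter refers to, on toy two-element lists. It is an assumption about what the parameter means, not code taken from the library; `apply_guidance` and the toy values are hypothetical.

```python
def apply_guidance(unconditional, conditional, scale):
    """Classifier-free guidance: start from the unconditional prediction and
    move toward the prompt-conditioned one, scaled by `scale`."""
    return [u + scale * (c - u) for u, c in zip(unconditional, conditional)]


# Toy stand-ins for the model's noise predictions (hypothetical values).
uncond = [0.0, 1.0]  # prediction with an empty prompt
cond = [1.0, 3.0]    # prediction with the text prompt

print(apply_guidance(uncond, cond, 1.0))  # [1.0, 3.0]: scale 1 is the plain conditional prediction
print(apply_guidance(uncond, cond, 7.5))  # [7.5, 16.0]: larger scale pushes further toward the prompt
```

Under this reading, a larger scale makes the image follow the prompt more closely, typically at the cost of variety.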

You can keep creating more images with the same generator

Compilation only has to happen once; every subsequent run is faster.

img = generator.generate(
    "An epic unicorn riding in the sunset, artstation concept art",
    num_steps=50,
    unconditional_guidance_scale=7.5,
    temperature=1,
)
pil_img = Image.fromarray(img[0])
display(pil_img)
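
One way to check that only the first call pays the one-time cost is to time successive calls. The sketch below is a minimal illustration: `timed` is a hypothetical helper, and `fake_generate` is a stand-in that merely sleeps on its first call, so the block runs without a GPU. In the notebook you would pass `generator.generate` and its arguments to `timed` instead.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Hypothetical stand-in for generator.generate: slow on the first call only.
_warmed_up = False


def fake_generate(prompt):
    global _warmed_up
    if not _warmed_up:
        time.sleep(0.05)  # pretend one-time compilation cost
        _warmed_up = True
    return f"image for {prompt!r}"


_, first = timed(fake_generate, "castle")
_, second = timed(fake_generate, "castle")
print(f"first call: {first:.3f}s, second call: {second:.3f}s")
```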

Let's try batched generation

img = generator.generate(
    "Ruins of a castle in Scotland",
    num_steps=50,
    unconditional_guidance_scale=7.5,
    temperature=1,
    batch_size=4,
)
# Display each of the four images in the batch
for batch_image in img:
    display(Image.fromarray(batch_image))
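
The batch is displayed one image at a time. To put the results on a single sheet instead, you first need the paste position of each tile. The sketch below computes those positions in plain Python, assuming 512x512 tiles laid out row by row; `tile_offsets` is a hypothetical helper, not part of the library.

```python
import math


def tile_offsets(count, columns, tile_width=512, tile_height=512):
    """Return (sheet_size, offsets) for `count` tiles laid out row by row."""
    rows = math.ceil(count / columns)
    offsets = [
        ((i % columns) * tile_width, (i // columns) * tile_height)
        for i in range(count)
    ]
    return (columns * tile_width, rows * tile_height), offsets


size, offsets = tile_offsets(4, columns=2)
print(size)     # (1024, 1024)
print(offsets)  # [(0, 0), (512, 0), (0, 512), (512, 512)]
```

Each offset can then be passed to PIL's `Image.paste` on a blank canvas of the computed size created with `Image.new("RGB", size)`.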

Keras Stable Diffusion: Image-to-Image Translation

!pip install git+https://github.com/divamgupta/stable-diffusion-tensorflow --upgrade --quiet
!pip install tensorflow tensorflow_addons ftfy --upgrade --quiet
!apt install --allow-change-held-packages libcudnn8=8.1.0.77-1+cuda11.2

Download the input image and display it:

! wget https://pyxis.nymag.com/v1/imgs/24c/d4a/6fdd64a7c835b8325065b72e6fbfe59fb9-09-family-drawing1.rsquare.w330.jpg -O inp.jpg

from PIL import Image

display(Image.open("inp.jpg"))
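
The `wget` shell command above works in Colab but may not be available elsewhere. As a rough equivalent, the sketch below downloads a URL to a local file using only the Python standard library; `download` is a hypothetical helper name, and no retry or error handling is included.

```python
import shutil
import urllib.request
from pathlib import Path


def download(url, dest):
    """Stream `url` into the file `dest` and return the number of bytes written."""
    dest = Path(dest)
    with urllib.request.urlopen(url) as response, dest.open("wb") as out:
        shutil.copyfileobj(response, out)
    return dest.stat().st_size
```

Calling `download(image_url, "inp.jpg")` with the image URL used above produces the same `inp.jpg` file that the later cells read.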

Instantiate StableDiffusion to create the generator:

from stable_diffusion_tf.stable_diffusion import StableDiffusion
from PIL import Image

generator = StableDiffusion(
    img_height=512,
    img_width=512,
    jit_compile=False,  # You can try True as well (different performance profile)
)

Run the generator to produce an image and compare it with the input image:

img = generator.generate(
    "a high quality sketch of people standing with sun and grass , watercolor , pencil color",
    num_steps=50,
    unconditional_guidance_scale=7.5,
    temperature=1,
    batch_size=1,
    input_image="inp.jpg",
    input_image_strength=0.8
)
pil_img = Image.fromarray(img[0])
display(pil_img)
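
The `input_image_strength=0.8` argument controls how far the result may depart from the input image. The sketch below illustrates one common img2img convention, in which the strength selects the fraction of the noise schedule that is re-run starting from the noised input. This is an assumption for illustration: the schedule spacing, the 1000-step range, and the helper name `img2img_timesteps` are hypothetical and not taken from this page.

```python
def img2img_timesteps(num_steps, strength, max_t=1000):
    """Return the timesteps that would be re-run for a given strength.

    Assumes an evenly spaced schedule of `num_steps` steps over [1, max_t)
    and that `strength` keeps the first `strength` fraction of it.
    """
    timesteps = list(range(1, max_t, max_t // num_steps))
    keep = int(len(timesteps) * strength)
    return timesteps[:keep]


for strength in (0.5, 0.8, 1.0):
    kept = img2img_timesteps(50, strength)
    print(strength, len(kept), kept[-1])
```

Under this assumption, a strength of 0.8 with 50 steps re-runs 40 of them, so more noise is added to the input and less of it survives than at 0.5.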

Input image:

Output image example:

End
