🦜️🔗LangChain : Modules : Model I/O – Language Models : LLMs : Caching / Serialization (Translation/Commentary)

Translation : ClassCat Sales Information Co., Ltd.
Created : 08/27/2023

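* This page is a translation of the following LangChain documentation, with supplementary explanation where appropriate:

* The sample code has been tested, and has been extended or modified where necessary.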

🦜️🔗 LangChain : Modules : Model I/O – Language Models : LLMs : Caching

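LangChain provides an optional caching layer for LLMs. This is useful for two reasons:

* It can save you money by reducing the number of API calls you make to the LLM provider, if you often request the same completion multiple times.
* It can speed up your application by reducing the number of API calls you make to the LLM provider.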

import langchain
from langchain.llms import OpenAI

# To make the caching really obvious, let's use a slower model.
llm = OpenAI(model_name="text-davinci-002", n=2, best_of=2)

In Memory Cache

from langchain.cache import InMemoryCache
langchain.llm_cache = InMemoryCache()

# The first time, it is not yet in cache, so it should take longer
llm.predict("Tell me a joke")

    CPU times: user 35.9 ms, sys: 28.6 ms, total: 64.6 ms
    Wall time: 4.83 s
    

    "\n\nWhy couldn't the bicycle stand up by itself? It was...two tired!"

# The second time it is, so it goes faster
llm.predict("Tell me a joke")

    CPU times: user 238 µs, sys: 143 µs, total: 381 µs
    Wall time: 1.76 ms


    '\n\nWhy did the chicken cross the road?\n\nTo get to the other side.'
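To make the lookup behaviour concrete, here is a minimal sketch of an in-memory cache that does not depend on LangChain. The names `TinyInMemoryCache`, `cached_call` and `fake_provider` are hypothetical and exist only for illustration; the one assumption carried over is that entries are keyed on both the prompt and a string describing the model settings.

```python
from typing import Callable, Dict, List, Optional, Tuple


class TinyInMemoryCache:
    """Hypothetical stand-in for an in-memory LLM cache (not LangChain's class)."""

    def __init__(self) -> None:
        # Key on the prompt AND a string describing the model settings, so the
        # same prompt sent to a differently configured model is a separate entry.
        self._store: Dict[Tuple[str, str], str] = {}

    def lookup(self, prompt: str, llm_string: str) -> Optional[str]:
        return self._store.get((prompt, llm_string))

    def update(self, prompt: str, llm_string: str, completion: str) -> None:
        self._store[(prompt, llm_string)] = completion


def cached_call(cache: TinyInMemoryCache, prompt: str, llm_string: str,
                provider: Callable[[str], str]) -> str:
    """Return the cached completion if present, otherwise call the provider."""
    hit = cache.lookup(prompt, llm_string)
    if hit is not None:
        return hit
    completion = provider(prompt)
    cache.update(prompt, llm_string, completion)
    return completion


calls: List[str] = []  # records every prompt that actually reaches the "provider"


def fake_provider(prompt: str) -> str:
    """Hypothetical provider; a real one would make a network request."""
    calls.append(prompt)
    return f"completion #{len(calls)} for {prompt!r}"


cache = TinyInMemoryCache()
first = cached_call(cache, "Tell me a joke", "model=a", fake_provider)
second = cached_call(cache, "Tell me a joke", "model=a", fake_provider)  # cache hit
other = cached_call(cache, "Tell me a joke", "model=b", fake_provider)   # new key
```

In this sketch the second call never reaches the provider, which is where both the cost saving and the speed-up come from. Such a cache lives only as long as the process does.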

SQLite Cache

rm .langchain.db

# We can do the same thing with a SQLite cache
from langchain.cache import SQLiteCache
langchain.llm_cache = SQLiteCache(database_path=".langchain.db")

# The first time, it is not yet in cache, so it should take longer
llm.predict("Tell me a joke")

    CPU times: user 17 ms, sys: 9.76 ms, total: 26.7 ms
    Wall time: 825 ms


    '\n\nWhy did the chicken cross the road?\n\nTo get to the other side.'

# The second time it is, so it goes faster
llm.predict("Tell me a joke")

    CPU times: user 2.46 ms, sys: 1.23 ms, total: 3.7 ms
    Wall time: 2.67 ms


    '\n\nWhy did the chicken cross the road?\n\nTo get to the other side.'
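The practical difference from the in-memory cache is that a file-backed cache survives a restart. The following sketch uses only the standard `sqlite3` module to illustrate that idea. The class `TinySQLiteCache`, its table layout and the file name `cache.db` are assumptions made for this example and are not LangChain's actual schema.

```python
import os
import sqlite3
import tempfile
from typing import Optional


class TinySQLiteCache:
    """Hypothetical file-backed cache; not LangChain's SQLiteCache."""

    def __init__(self, database_path: str) -> None:
        self._conn = sqlite3.connect(database_path)
        # One row per (prompt, model settings) pair.
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS llm_cache ("
            "prompt TEXT, llm TEXT, response TEXT, "
            "PRIMARY KEY (prompt, llm))"
        )
        self._conn.commit()

    def lookup(self, prompt: str, llm_string: str) -> Optional[str]:
        row = self._conn.execute(
            "SELECT response FROM llm_cache WHERE prompt = ? AND llm = ?",
            (prompt, llm_string),
        ).fetchone()
        return row[0] if row else None

    def update(self, prompt: str, llm_string: str, response: str) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO llm_cache VALUES (?, ?, ?)",
            (prompt, llm_string, response),
        )
        self._conn.commit()

    def close(self) -> None:
        self._conn.close()


with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "cache.db")

    writer = TinySQLiteCache(db_path)
    writer.update("Tell me a joke", "model=a", "a cached joke")
    writer.close()

    # A fresh connection stands in for a restarted process.
    reader = TinySQLiteCache(db_path)
    persisted = reader.lookup("Tell me a joke", "model=a")
    missing = reader.lookup("Tell me a riddle", "model=a")
    reader.close()
```

Because the entry is read back through a new connection, the sketch shows why deleting the database file, as done with `rm .langchain.db` above, is what resets this kind of cache.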

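Optional Caching in Chains

You can also turn off caching for particular nodes in a chain. Note that, because of certain interfaces, it is often easier to construct the chain first and then edit the LLM afterwards.

As an example, we will load a summarizer map-reduce chain. We will cache results for the map step, but not for the combine step.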

# Default LLM: its calls go through the globally configured cache.
llm = OpenAI(model_name="text-davinci-002")
# Same model, with caching turned off for this instance only.
no_cache_llm = OpenAI(model_name="text-davinci-002", cache=False)

from langchain.text_splitter import CharacterTextSplitter
from langchain.chains.mapreduce import MapReduceChain

text_splitter = CharacterTextSplitter()

with open('../../../state_of_the_union.txt') as f:
    state_of_the_union = f.read()
texts = text_splitter.split_text(state_of_the_union)

# Summarize only the first three chunks to keep the example short.
from langchain.docstore.document import Document
docs = [Document(page_content=t) for t in texts[:3]]
from langchain.chains.summarize import load_summarize_chain

# The map step uses the cached llm; the reduce (combine) step uses no_cache_llm.
chain = load_summarize_chain(llm, chain_type="map_reduce", reduce_llm=no_cache_llm)

chain.run(docs)

    CPU times: user 452 ms, sys: 60.3 ms, total: 512 ms
    Wall time: 5.09 s


    '\n\nPresident Biden is discussing the American Rescue Plan and the Bipartisan Infrastructure Law, which will create jobs and help Americans. He also talks about his vision for America, which includes investing in education and infrastructure. In response to Russian aggression in Ukraine, the United States is joining with European allies to impose sanctions and isolate Russia. American forces are being mobilized to protect NATO countries in the event that Putin decides to keep moving west. The Ukrainians are bravely fighting back, but the next few weeks will be hard for them. Putin will pay a high price for his actions in the long run. Americans should not be alarmed, as the United States is taking action to protect its interests and allies.'

When we run it again, we see that it runs substantially faster but the final answer is different. This is because the map steps are cached, but the reduce step is not.

chain.run(docs)

    CPU times: user 11.5 ms, sys: 4.33 ms, total: 15.8 ms
    Wall time: 1.04 s


    '\n\nPresident Biden is discussing the American Rescue Plan and the Bipartisan Infrastructure Law, which will create jobs and help Americans. He also talks about his vision for America, which includes investing in education and infrastructure.'
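The behaviour seen in the two runs above can be reproduced with a small stand-alone sketch. Everything in it is hypothetical: `make_llm` builds fake models, and its `use_cache` flag only mimics the idea behind `cache=False`. It is meant to show why the second run makes fewer calls yet returns a different final answer.

```python
from typing import Callable, Dict, List

cache: Dict[str, str] = {}   # shared cache, keyed by prompt
call_log: List[str] = []     # which fake model was actually invoked, in order


def make_llm(name: str, use_cache: bool) -> Callable[[str], str]:
    """Build a hypothetical fake LLM; `use_cache` mimics the cache=False idea."""

    def llm(prompt: str) -> str:
        if use_cache and prompt in cache:
            return cache[prompt]
        call_log.append(name)
        # Include the call count so that uncached calls differ between runs,
        # much like a sampled completion would.
        result = f"{name} output {len(call_log)}"
        if use_cache:
            cache[prompt] = result
        return result

    return llm


map_llm = make_llm("map", use_cache=True)
reduce_llm = make_llm("reduce", use_cache=False)


def summarize(docs: List[str]) -> str:
    """Map each document to a partial summary, then combine the partials."""
    partial = [map_llm(f"Summarize: {d}") for d in docs]
    return reduce_llm("Combine: " + " | ".join(partial))


docs = ["part one", "part two", "part three"]
first_run = summarize(docs)
calls_after_first = list(call_log)
second_run = summarize(docs)
calls_in_second = call_log[len(calls_after_first):]
```

In the second run only the reduce step is invoked, so the run is cheaper, but the final text changes because that step is recomputed each time.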

rm .langchain.db sqlite.db

🦜️🔗 LangChain : Modules : Model I/O – Language Models : LLMs : Serialization

This notebook walks through how to write an LLM configuration to disk and read it back. This is useful if you want to save the configuration for a given LLM (e.g. the provider, the temperature, etc.).

from langchain.llms import OpenAI
from langchain.llms.loading import load_llm

Loading

First, let's go over loading an LLM from disk. LLMs can be saved on disk in two formats: json or yaml. Whatever the extension, they are loaded in the same way.

cat llm.json

    {
        "model_name": "text-davinci-003",
        "temperature": 0.7,
        "max_tokens": 256,
        "top_p": 1.0,
        "frequency_penalty": 0.0,
        "presence_penalty": 0.0,
        "n": 1,
        "best_of": 1,
        "request_timeout": null,
        "_type": "openai"
    }

llm = load_llm("llm.json")

cat llm.yaml

    _type: openai
    best_of: 1
    frequency_penalty: 0.0
    max_tokens: 256
    model_name: text-davinci-003
    n: 1
    presence_penalty: 0.0
    request_timeout: null
    temperature: 0.7
    top_p: 1.0

llm = load_llm("llm.yaml")
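Both files above carry a `_type` field next to the model parameters. As a rough sketch of how a loader might use such a field to decide which class to build, consider the following. `FakeOpenAI`, `TYPE_TO_CLS` and `load_llm_sketch` are hypothetical names, and only JSON is handled because YAML parsing needs a third-party package.

```python
import json
import os
import tempfile
from dataclasses import dataclass


@dataclass
class FakeOpenAI:
    """Hypothetical config holder standing in for a real LLM class."""
    model_name: str = "text-davinci-003"
    temperature: float = 0.7
    max_tokens: int = 256


# Hypothetical registry: maps the "_type" value to the class to instantiate.
TYPE_TO_CLS = {"openai": FakeOpenAI}


def load_llm_sketch(path: str):
    """Read a JSON config and build the class named by its `_type` field."""
    with open(path) as f:
        config = json.load(f)
    if "_type" not in config:
        raise ValueError("config must specify `_type`")
    llm_type = config.pop("_type")
    if llm_type not in TYPE_TO_CLS:
        raise ValueError(f"unsupported LLM type: {llm_type}")
    # The remaining keys are passed through as constructor arguments.
    return TYPE_TO_CLS[llm_type](**config)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "llm.json")
    with open(path, "w") as f:
        json.dump(
            {"_type": "openai", "model_name": "text-davinci-003",
             "temperature": 0.7, "max_tokens": 256},
            f,
        )
    loaded = load_llm_sketch(path)
```

Keeping the type tag inside the file is what lets a single loading function serve every supported provider.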

Saving

If you want to go from an LLM in memory to a serialized version of it, you can do so easily by calling the .save method. Again, this supports both json and yaml.

llm.save("llm.json")

llm.save("llm.yaml")
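The two calls differ only in the file extension, which suggests that the output format is chosen from the path. A minimal sketch of that design, assuming a hypothetical `FakeLLMConfig` class and covering JSON only (YAML would need a third-party package), might look like this:

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class FakeLLMConfig:
    """Hypothetical config object; not a LangChain class."""
    model_name: str = "text-davinci-003"
    temperature: float = 0.7

    def save(self, file_path: str) -> None:
        path = Path(file_path)
        # Store a type tag alongside the parameters so a loader can dispatch on it.
        data = {**asdict(self), "_type": "fake"}
        if path.suffix == ".json":
            path.write_text(json.dumps(data, indent=4))
        else:
            # A fuller version would also handle .yaml; this sketch rejects the rest.
            raise ValueError(f"{file_path} must end in .json for this sketch")


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "llm.json")
    FakeLLMConfig(temperature=0.2).save(target)
    with open(target) as f:
        saved = json.load(f)

    # An unsupported extension is rejected instead of being written silently.
    try:
        FakeLLMConfig().save(os.path.join(tmp, "llm.txt"))
        rejected = False
    except ValueError:
        rejected = True
```

Failing loudly on an unknown extension avoids writing a file that a loader would later be unable to read.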

That's all.
