LangChain-05 RAG Conversational: Enhanced Retrieval Conversation

Install Dependencies

DocArrayInMemorySearch requires the docarray package, so install it alongside the LangChain packages:

pip install --upgrade --quiet langchain-core langchain-community langchain-openai docarray

Write the Code

from operator import itemgetter

from langchain_community.vectorstores import DocArrayInMemorySearch
from langchain_core.messages import AIMessage, HumanMessage, get_buffer_string
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate, PromptTemplate, format_document
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings


# Build an in-memory vector store over three toy documents and expose it as a retriever
vectorstore = DocArrayInMemorySearch.from_texts(
    ["wuzikang worked at earth", "sam worked at home", "harrison worked at kensho"],
    embedding=OpenAIEmbeddings(),
)
retriever = vectorstore.as_retriever()


_template = """Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:
{chat_history}
Follow Up Input: {question}
Standalone question:"""
CONDENSE_QUESTION_PROMPT = PromptTemplate.from_template(_template)


template = """Answer the question based only on the following context:
{context}

Question: {question}
"""
ANSWER_PROMPT = ChatPromptTemplate.from_template(template)

DEFAULT_DOCUMENT_PROMPT = PromptTemplate.from_template(template="{page_content}")


def _combine_documents(
    docs, document_prompt=DEFAULT_DOCUMENT_PROMPT, document_separator="\n\n"
):
    # Format each retrieved Document and join them into a single context string
    doc_strings = [format_document(doc, document_prompt) for doc in docs]
    return document_separator.join(doc_strings)


# Stage 1: condense the chat history and follow-up question into a standalone question
_inputs = RunnableParallel(
    standalone_question=RunnablePassthrough.assign(
        chat_history=lambda x: get_buffer_string(x["chat_history"])
    )
    | CONDENSE_QUESTION_PROMPT
    | ChatOpenAI(temperature=0)
    | StrOutputParser(),
)
# Stage 2: retrieve context for the standalone question and pass both onward
_context = {
    "context": itemgetter("standalone_question") | retriever | _combine_documents,
    "question": lambda x: x["standalone_question"],
}
# Full chain: condense -> retrieve -> answer
conversational_qa_chain = _inputs | _context | ANSWER_PROMPT | ChatOpenAI()

message1 = conversational_qa_chain.invoke(
    {
        "question": "what is his name?",
        "chat_history": [],
    }
)
print(f"message1: {message1}")

message2 = conversational_qa_chain.invoke(
    {
        "question": "where did sam work?",
        "chat_history": [],
    }
)
print(f"message2: {message2}")

message3 = conversational_qa_chain.invoke(
    {
        "question": "where did he work?",
        "chat_history": [
            HumanMessage(content="Who wrote this notebook?"),
            AIMessage(content="Harrison"),
        ],
    }
)
print(f"message3: {message3}")

Code Explanation

As you can see, during initialization we define a few documents:

"wuzikang worked at earth", "sam worked at home", "harrison worked at kensho"

Later in the code, we build the prompt templates and run a conversation consisting of three questions:

  • "question": "what is his name?"
  • "question": "where did sam work?"
  • "question": "where did he work?"

Note that we never spell out who "his", "sam", or "he" refer to. The model resolves them from the documents we defined and the corresponding chat history: for message3, the condense step rewrites "where did he work?" against the history ("Who wrote this notebook?" / "Harrison") into a standalone question about Harrison, so the retriever can match "harrison worked at kensho". For message1, the history is empty and "his" has no antecedent, so the model can only guess from whatever context the retriever returns. The sketch below runs the condense step in isolation.
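
This is a minimal sketch, not part of the original script; the exact rewritten wording depends on the model:

# Run only the condense step from the script above
result = _inputs.invoke(
    {
        "question": "where did he work?",
        "chat_history": [
            HumanMessage(content="Who wrote this notebook?"),
            AIMessage(content="Harrison"),
        ],
    }
)
print(result["standalone_question"])  # e.g. "Where did Harrison work?"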

Run Results

➜ python3 test05.py
/Users/wuzikang/Desktop/py/langchain_test/own_learn/env/lib/python3.12/site-packages/pydantic/_migration.py:283: UserWarning: `pydantic.error_wrappers:ValidationError` has been moved to `pydantic:ValidationError`.
  warnings.warn(f'`{import_path}` has been moved to `{new_location}`.')
message1: content='The name of the person we were just talking about is Wuzikang.' response_metadata={'finish_reason': 'stop', 'logprobs': None}
message2: content='Sam worked at home.' response_metadata={'finish_reason': 'stop', 'logprobs': None}
message3: content='Harrison worked at Kensho.' response_metadata={'finish_reason': 'stop', 'logprobs': None}
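
The chain ends with ChatOpenAI, so each result is an AIMessage; that is why the output shows content='...' together with response metadata. If you want plain strings instead, one small variation (an assumption, not part of the original script) is to append a StrOutputParser to the chain:

# Variation: parse the final AIMessage into a plain string
qa_chain_str = conversational_qa_chain | StrOutputParser()
answer = qa_chain_str.invoke({"question": "where did sam work?", "chat_history": []})
print(answer)  # e.g. "Sam worked at home."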
