Learning RAG from Scratch: Query Decomposition
Original article: https://zhuanlan.zhihu.com/p/685746861
Type: technical share
This article is an original contribution by @lucas大叔, reposted here. If there is any infringement, please let us know and it will be removed.
Query decomposition is a strategy for improving question answering by breaking a question into sub-questions. There are two implementation paths: (1) sequential solving, where the answer to the previous sub-question is passed to the LLM together with the current sub-question to generate its answer, and that answer in turn is passed along with the next sub-question, until the last sub-question produces the final answer; (2) independent answering in parallel, where each sub-question is answered on its own and the multiple answers are then merged into a final answer.
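The two paths can be sketched as follows; `answer()` below is a hypothetical stand-in for a real retrieve-then-generate call, used only to show the control flow:

```python
def answer(question: str, background: str = "") -> str:
    """Hypothetical stand-in for retrieval + LLM generation."""
    return f"answer({question})"

sub_questions = ["What is task decomposition?",
                 "How does an agent use memory?",
                 "What tools can an agent call?"]

# Path 1: sequential solving -- each sub-answer is fed into the next step.
seq_qa_pairs = ""
for q in sub_questions:
    a = answer(q, background=seq_qa_pairs)
    seq_qa_pairs += f"Q: {q}\nA: {a}\n"
final_sequential = answer("original question", background=seq_qa_pairs)

# Path 2: independent answering -- solve in isolation, then merge the answers.
indep_answers = [answer(q) for q in sub_questions]
final_parallel = answer("original question", background="\n".join(indep_answers))
```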
The diagram below illustrates how the two implementation paths work.


The idea of sequential solving comes from two papers: least-to-most prompting and IRCoT.
In least-to-most prompting, the authors observe that CoT prompting performs impressively on natural-language reasoning tasks, but falls short when the problem to solve is harder than the provided exemplars. To address this easy-to-hard generalization problem, they propose least-to-most prompting, which solves a problem in two steps: first decompose the complex problem into a series of easier sub-problems, then solve those sub-problems in order, using the answers to previously solved sub-problems to help solve the current one. Both steps are implemented with few-shot prompting, with no training or fine-tuning required.
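As a rough illustration of the two-stage structure (the exemplar below is paraphrased from the paper's running example, not the exact prompt text):

```python
# Stage 1: decomposition -- few-shot exemplars teach the model to reduce a
# hard problem to an easier sub-problem (exemplar paraphrased, illustrative only).
decompose_prompt = (
    "Q: It takes Amy 4 minutes to climb the slide and 1 minute to slide down. "
    "The slide closes in 15 minutes. How many times can she slide?\n"
    "A: To solve this, we first need to answer: How long does one trip take?\n\n"
    "Q: {question}\n"
    "A: To solve this, we first need to answer:"
)

# Stage 2: sequential solving -- previously solved sub-problems are prepended
# so each new sub-problem can build on their answers.
solve_prompt = (
    "{original_problem}\n\n"
    "{solved_qa_pairs}\n"      # Q/A pairs accumulated so far
    "Q: {current_subquestion}\n"
    "A:"
)
```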
IRCoT proposes a similar idea for multi-step question answering: interleave retrieval with the CoT, using the retrieved results to improve the CoT.
For more technical details, please refer to the two papers above. Now let's move on to the hands-on coding part.
First, write a prompt template that decomposes the question into several sub-questions.
```python
from langchain.prompts import ChatPromptTemplate

# Decomposition
template = """You are a helpful assistant that generates multiple sub-questions related to an input question. \n
The goal is to break down the input into a set of sub-problems / sub-questions that can be answered in isolation. \n
Generate multiple search queries related to: {question} \n
Output (3 queries):"""
prompt_decomposition = ChatPromptTemplate.from_template(template)
```
Build the decomposition chain and split the question into several sub-questions.
```python
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser

# LLM
llm = ChatOpenAI(temperature=0)

# Chain
generate_queries_decomposition = (
    prompt_decomposition
    | llm
    | StrOutputParser()
    | (lambda x: x.split("\n"))
)

# Run
question = "What are the main components of an LLM-powered autonomous agent system?"
questions = generate_queries_decomposition.invoke({"question": question})
```
Answer recursively
As shown in the sequential-solving flow above, build the prompt template for sequential answering, which answers the question based on the retrieved context and the accumulated Q&A pairs.
```python
# Prompt
template = """Here is the question you need to answer:
\n --- \n {question} \n --- \n
Here is any available background question + answer pairs:
\n --- \n {q_a_pairs} \n --- \n
Here is additional context relevant to the question:
\n --- \n {context} \n --- \n
Use the above context and any background question + answer pairs to answer the question: \n {question}
"""
decomposition_prompt = ChatPromptTemplate.from_template(template)
```

Initialize q_a_pairs to an empty string: when answering the first sub-question there are no Q&A pairs yet. From the second sub-question onward, the prompt contains the current question, the Q&A pairs from all previous rounds, and the context retrieved for the current question, all passed together to the LLM to produce the answer.
```python
from operator import itemgetter
from langchain_core.output_parsers import StrOutputParser

def format_qa_pair(question, answer):
    """Format Q and A pair"""
    formatted_string = ""
    formatted_string += f"Question: {question}\nAnswer: {answer}\n\n"
    return formatted_string.strip()

# llm
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

q_a_pairs = ""
for q in questions:
    rag_chain = (
        {"context": itemgetter("question") | retriever,
         "question": itemgetter("question"),
         "q_a_pairs": itemgetter("q_a_pairs")}
        | decomposition_prompt
        | llm
        | StrOutputParser()
    )
    answer = rag_chain.invoke({"question": q, "q_a_pairs": q_a_pairs})
    q_a_pair = format_qa_pair(q, answer)
    q_a_pairs = q_a_pairs + "\n---\n" + q_a_pair
```
Answer individually
Compared with sequential answering, parallel independent answering is much simpler logically: each sub-question gets its own LLM call, and the resulting answers are then aggregated.
```python
# Answer each sub-question individually
from langchain import hub
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough, RunnableLambda
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

# RAG prompt
prompt_rag = hub.pull("rlm/rag-prompt")

def retrieve_and_rag(question, prompt_rag, sub_question_generator_chain):
    """RAG on each sub-question"""
    # Use our decomposition chain to generate sub-questions
    sub_questions = sub_question_generator_chain.invoke({"question": question})
    # Initialize a list to hold RAG chain results
    rag_results = []
    for sub_question in sub_questions:
        # Retrieve documents for each sub-question
        retrieved_docs = retriever.get_relevant_documents(sub_question)
        # Use retrieved documents and sub-question in RAG chain
        answer = (prompt_rag | llm | StrOutputParser()).invoke(
            {"context": retrieved_docs, "question": sub_question}
        )
        rag_results.append(answer)
    return rag_results, sub_questions

# Run retrieval and RAG over the decomposed sub-questions
answers, questions = retrieve_and_rag(question, prompt_rag, generate_queries_decomposition)
```
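Because the sub-questions are independent, the loop above could also run them concurrently. A minimal sketch with a thread pool, where `retrieve_and_answer` is a hypothetical stand-in for the retriever lookup plus the `(prompt_rag | llm)` call:

```python
from concurrent.futures import ThreadPoolExecutor

def retrieve_and_answer(sub_question: str) -> str:
    """Hypothetical stand-in for retrieval + the RAG chain on one sub-question."""
    return f"answer to: {sub_question}"

def retrieve_and_rag_concurrent(sub_questions, max_workers=4):
    # executor.map preserves input order, so answers stay aligned with questions
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(retrieve_and_answer, sub_questions))

demo_answers = retrieve_and_rag_concurrent(["q1", "q2", "q3"])
```

Threads suit this I/O-bound workload (each call waits on network round-trips to the retriever and the LLM), so the speedup is roughly proportional to `max_workers` up to the number of sub-questions.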
The instruction in the prompt template is equally blunt: tell the model "here is a set of Q&A pairs; use them to synthesize an answer to the original question."
```python
def format_qa_pairs(questions, answers):
    """Format Q and A pairs"""
    formatted_string = ""
    for i, (question, answer) in enumerate(zip(questions, answers), start=1):
        formatted_string += f"Question {i}: {question}\nAnswer {i}: {answer}\n\n"
    return formatted_string.strip()

context = format_qa_pairs(questions, answers)

# Prompt
template = """Here is a set of Q+A pairs:

{context}

Use these to synthesize an answer to the question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)

final_rag_chain = (
    prompt
    | llm
    | StrOutputParser()
)

final_rag_chain.invoke({"context": context, "question": question})
```
