Author: Song Zhixue (宋志學) Reposted from: Datawhale

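Preface

Hi everyone, I'm 不要蔥姜蒜. Since ChatGPT burst onto the scene and took the crown from BERT, large models have only grown more popular, and so many models have appeared in China that the period is known as the "Battle of a Hundred Models". The capabilities of large models are beyond doubt, but they can fall short on real-time questions or on questions from specialized domains. We therefore need tools that empower the model and give it a handle on the real world, so that it stays in step with what is actually happening. The result is a more useful large model.

Here I build a minimal Agent structure based on the ReAct approach (in practice it is mostly tool calling). Over the summer I plan to try changing the ReAct structure into an SOP structure.

Writing an Agent by hand, step by step, should give me a better understanding of how an Agent is put together and how it runs. The ReAct paper includes several small examples.

Reference paper: https://arxiv.org/abs/2210.03629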

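Implementation Details

Step 1: Build the large model

We need a large model, and here we use InternLM2. InternLM2 is a decoder-only chat model that can be loaded with the transformers library.

First, we create a BaseModel class. It serves as an abstract base in which we define a few basic methods, such as chat and load_model, so that other models can be plugged in later.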

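from typing import List


class BaseModel:
    def __init__(self, path: str = '') -> None:
        self.path = path

    def chat(self, prompt: str, history: List[dict]):
        # Subclasses return the model's reply to the prompt.
        pass

    def load_model(self):
        # Subclasses load the tokenizer and weights from self.path.
        pass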

Next, we create an InternLM2Chat class that inherits from BaseModel and implements the chat and load_model methods. It loads the InternLM2 model in the usual way and simply returns the model's response.

from typing import List, Tuple

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


class InternLM2Chat(BaseModel):
    def __init__(self, path: str = '') -> None:
        super().__init__(path)
        self.load_model()

    def load_model(self):
        print('================ Loading model ================')
        # trust_remote_code lets transformers run the model code shipped with the checkpoint.
        self.tokenizer = AutoTokenizer.from_pretrained(self.path, trust_remote_code=True)
        self.model = AutoModelForCausalLM.from_pretrained(self.path, torch_dtype=torch.float16, trust_remote_code=True).cuda().eval()
        print('================ Model loaded ================')

    def chat(self, prompt: str, history: List[dict], meta_instruction: str = '') -> Tuple[str, List]:
        # Returns the reply together with the updated history.
        response, history = self.model.chat(self.tokenizer, prompt, history, temperature=0.1, meta_instruction=meta_instruction)
        return response, history
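
Since the rest of the code only relies on the chat and load_model methods, another backend can be swapped in by subclassing BaseModel. Below is a minimal sketch of that idea. EchoChat is a hypothetical class that is not part of the original project, and the dictionary layout of its history entries is also an assumption. It needs no GPU and simply echoes the prompt, which can be handy for exercising the Agent logic offline.

```python
from typing import List, Tuple


class BaseModel:
    """Same interface as the BaseModel above, repeated so the sketch runs on its own."""

    def __init__(self, path: str = '') -> None:
        self.path = path

    def chat(self, prompt: str, history: List[dict]):
        pass

    def load_model(self):
        pass


class EchoChat(BaseModel):
    """Hypothetical stand-in backend: no weights, it just echoes the prompt."""

    def __init__(self, path: str = '') -> None:
        super().__init__(path)
        self.load_model()

    def load_model(self):
        # Nothing to load here; a real backend would create its tokenizer and model.
        self.loaded = True

    def chat(self, prompt: str, history: List[dict],
             meta_instruction: str = '') -> Tuple[str, List[dict]]:
        response = f'echo: {prompt}'
        # Build a new history list instead of mutating the caller's list.
        new_history = history + [{'prompt': prompt, 'response': response}]
        return response, new_history


model = EchoChat()
reply, history = model.chat('hello', [])
print(reply)
```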

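Step 2: Build the tools

In tools.py we build the tools, such as Google search. The file defines a Tools class that holds both the description of each tool and its concrete implementation, so adding a new tool takes two steps:

First, add the tool's description to tools.
Then, add the tool's implementation to tools.

To use the Google search tool, you need to request a token from the serper website: https://serper.dev/dashboard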

class Tools:
    def __init__(self) -> None:
        self.toolConfig = self._tools()
    
    def _tools(self):
        tools = [
            {
                'name_for_human': '谷歌搜索',
                'name_for_model': 'google_search',
                'description_for_model': '谷歌搜索是一個通用搜索引擎，可用于訪問互聯網、查詢百科知識、了解時事新聞等。',
                'parameters': [
                    {
                        'name': 'search_query',
                        'description': '搜索關鍵詞或短語',
                        'required': True,
                        'schema': {'type': 'string'},
                    }
                ],
            }
        ]
        return tools

    def google_search(self, search_query: str):
        pass
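
The google_search body is left empty above. Below is a minimal sketch of what it might look like as a standalone function. It assumes the Serper endpoint https://google.serper.dev/search, an X-API-KEY header, and a JSON response with an organic list whose items carry a snippet field; check these against the Serper documentation before relying on them. The transport parameter, the SERPER_API_KEY environment variable, and the fallback message are hypothetical additions that let the function be exercised without a token or network access.

```python
import json
import os
import urllib.request
from typing import Callable, Optional

SERPER_URL = 'https://google.serper.dev/search'  # assumed Serper search endpoint


def _http_post(url: str, headers: dict, body: bytes) -> dict:
    """POST the body and decode the JSON response (standard library only)."""
    request = urllib.request.Request(url, data=body, headers=headers, method='POST')
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode('utf-8'))


def google_search(search_query: str,
                  api_key: Optional[str] = None,
                  transport: Callable[[str, dict, bytes], dict] = _http_post) -> str:
    """Return the snippet of the first organic result, or a fallback message."""
    headers = {
        # SERPER_API_KEY is a hypothetical variable name for the token.
        'X-API-KEY': api_key or os.environ.get('SERPER_API_KEY', ''),
        'Content-Type': 'application/json',
    }
    body = json.dumps({'q': search_query}).encode('utf-8')
    result = transport(SERPER_URL, headers, body)
    organic = result.get('organic') or []
    if not organic:
        return 'No search results found.'
    return organic[0].get('snippet', '')


# Offline demonstration with a fake transport, so no token or network is needed.
def fake_transport(url: str, headers: dict, body: bytes) -> dict:
    query = json.loads(body)['q']
    return {'organic': [{'snippet': f'fake snippet for {query}'}]}


print(google_search('ReAct paper', transport=fake_transport))
```

Returning only the first snippet keeps the Observation short, which matters because the tool result is fed back into the model's prompt.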

Step 3: Build the Agent

The Agent class implements a ReAct-style Agent. Its chat method calls the InternLM2 model and then, following the ReAct logic, calls the tools defined in Tools.

First we need to construct system_prompt, the system prompt. This is where system-level instructions go, such as the ReAct-format prompt.

def build_system_input(self):
    tool_descs, tool_names = [], []
    for tool in self.tool.toolConfig:
        tool_descs.append(TOOL_DESC.format(**tool))
        tool_names.append(tool['name_for_model'])
    tool_descs = '\n\n'.join(tool_descs)
    tool_names = ','.join(tool_names)
    sys_prompt = REACT_PROMPT.format(tool_descs=tool_descs, tool_names=tool_names)
    return sys_prompt

If all goes well, the resulting prompt should look like this:

Answer the following questions as best you can. You have access to the following tools:

google_search: Call this tool to interact with the 谷歌搜索 API. What is the 谷歌搜索 API useful for? 谷歌搜索是一個通用搜索引擎，可用于訪問互聯網、查詢百科知識、了解時事新聞等。Parameters: [{'name': 'search_query', 'description': '搜索關鍵詞或短語', 'required': True, 'schema': {'type': 'string'}}] Format the arguments as a JSON object.

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [google_search]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can be repeated zero or more times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

This system_prompt tells the model which tools it can call, what format its output should follow, what each tool does, and what parameters each tool accepts.
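
The build_system_input method relies on two templates, TOOL_DESC and REACT_PROMPT, that are not shown in this article. The sketch below reconstructs them from the sample prompt above, so the exact wording and whitespace are assumptions and may differ from the project's own definitions. The tool entry uses an English name and description purely for illustration.

```python
# Templates reconstructed from the sample prompt; treat the wording as an assumption.
TOOL_DESC = (
    '{name_for_model}: Call this tool to interact with the {name_for_human} API. '
    'What is the {name_for_human} API useful for? {description_for_model} '
    'Parameters: {parameters} Format the arguments as a JSON object.'
)

REACT_PROMPT = """Answer the following questions as best you can. You have access to the following tools:

{tool_descs}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can be repeated zero or more times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!
"""


def build_system_input(tool_config):
    """Fill both templates from a list of tool description dictionaries."""
    tool_descs = '\n\n'.join(TOOL_DESC.format(**tool) for tool in tool_config)
    tool_names = ','.join(tool['name_for_model'] for tool in tool_config)
    return REACT_PROMPT.format(tool_descs=tool_descs, tool_names=tool_names)


# Illustrative tool entry with the same keys as the Tools class uses.
tool_config = [{
    'name_for_human': 'Google Search',
    'name_for_model': 'google_search',
    'description_for_model': 'A general-purpose web search engine.',
    'parameters': [{
        'name': 'search_query',
        'description': 'Search keywords or phrase',
        'required': True,
        'schema': {'type': 'string'},
    }],
}]

system_prompt = build_system_input(tool_config)
print(system_prompt)
```

Because str.format inserts the parameters list as a value, the braces inside it are not interpreted as further placeholders.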

So far only a simple Google search tool is implemented. More tools for geographic information system (GIS) analysis will be added later; yes, I am a GIS student.

The full structure of the Agent can be found in Agent.py. In short, the Agent follows the ReAct structure: it supplies a system_prompt so that the model knows which tools it can call and what format its output should take.

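Whenever a user question requires a tool, the model is called twice. The first call parses the question and selects the tool and its arguments. The second call combines the result returned by the tool with the user's question. This is how the ReAct structure is realized.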

Below is a simplified outline of the Agent code. The full implementation of each function can be found in Agent.py.

class Agent:
    def __init__(self, path: str = '') -> None:
        pass

    def build_system_input(self):
        # Build the system prompt described above
        pass
    
    def parse_latest_plugin_call(self, text):
        # Parse the tool and tool arguments chosen in the model's first reply
        pass
    
    def call_plugin(self, plugin_name, plugin_args):
        # Call the selected tool
        pass

    def text_completion(self, text, history=[]):
        # Combine the two model calls
        pass
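
To make the outline concrete, here is a minimal sketch of the parse, call, and two-step completion logic, under several assumptions. ScriptedModel and FakeTools are hypothetical stand-ins for InternLM2Chat and Tools, so the sketch runs without a GPU or a search token. Arguments are parsed with the standard json module, which requires double-quoted keys; model output with single-quoted keys, like the sample transcript in Step 4, would need a more lenient parser. The actual code in Agent.py may differ.

```python
import json
from typing import List, Tuple


def parse_latest_plugin_call(text: str) -> Tuple[str, str, str]:
    """Find the last Action / Action Input pair in the model output."""
    plugin_name, plugin_args = '', ''
    i = text.rfind('\nAction:')
    j = text.rfind('\nAction Input:')
    k = text.rfind('\nObservation:')
    if 0 <= i < j:  # an Action followed by an Action Input
        if k < j:  # the model stopped before writing an Observation
            text = text.rstrip() + '\nObservation:'
            k = text.rfind('\nObservation:')
        plugin_name = text[i + len('\nAction:'):j].strip()
        plugin_args = text[j + len('\nAction Input:'):k].strip()
        text = text[:k]  # drop any Observation the model invented itself
    return plugin_name, plugin_args, text


class ScriptedModel:
    """Hypothetical stand-in for InternLM2Chat that replays canned replies."""

    def __init__(self, replies: List[str]) -> None:
        self.replies = list(replies)
        self.prompts: List[str] = []

    def chat(self, prompt, history, meta_instruction=''):
        self.prompts.append(prompt)
        reply = self.replies.pop(0)
        return reply, history + [(prompt, reply)]


class FakeTools:
    """Hypothetical stand-in for Tools with a canned search result."""

    def google_search(self, search_query: str) -> str:
        return f'stub result for {search_query}'


class Agent:
    def __init__(self, model, tools, system_prompt: str = '') -> None:
        self.model, self.tool, self.system_prompt = model, tools, system_prompt

    def call_plugin(self, plugin_name: str, plugin_args: str) -> str:
        args = json.loads(plugin_args)  # assumes the model emitted valid JSON
        if plugin_name == 'google_search':
            return '\nObservation:' + self.tool.google_search(**args)
        return '\nObservation:unknown tool ' + plugin_name

    def text_completion(self, text: str, history=None):
        history = history or []
        # First call: the model decides whether to use a tool.
        response, new_history = self.model.chat('\nQuestion:' + text, history, self.system_prompt)
        plugin_name, plugin_args, response = parse_latest_plugin_call(response)
        if not plugin_name:
            return response, new_history
        # Second call: feed the tool result back so the model can answer.
        response += self.call_plugin(plugin_name, plugin_args)
        return self.model.chat(response, history, self.system_prompt)


model = ScriptedModel([
    'Thought: I should search.\nAction: google_search\nAction Input: {"search_query": "ReAct"}',
    'Thought: I now know the final answer\nFinal Answer: ReAct combines reasoning and acting.',
])
agent = Agent(model, FakeTools())
answer, _ = agent.text_completion('What is ReAct?')
print(answer)
```

Using rfind picks up the most recent Action in the text, and truncating at the Observation marker ensures that the tool's real output, not text the model made up, is what the second call sees.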

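Step 4: Run the Agent

This example uses the InternLM2-chat-7B model. If you want the Agent to behave more reliably, use the larger InternLM2-20b-chat version instead (the path in the code below points to the 20B checkpoint).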

from Agent import Agent


agent = Agent('/root/share/model_repos/internlm2-chat-20b')

response, _ = agent.text_completion(text='你好', history=[])
print(response)

# Thought: 你好，請問有什么我可以幫助你的嗎？
# Action: google_search
# Action Input: {'search_query': '你好'}
# Observation:Many translated example sentences containing "你好" – English-Chinese dictionary and search engine for English translations.
# Final Answer: 你好，請問有什么我可以幫助你的嗎？ 

response, _ = agent.text_completion(text='周杰倫是哪一年出生的？', history=_)
print(response)

# Final Answer: 周杰倫的出生年份是1979年。 

response, _ = agent.text_completion(text='周杰倫是誰？', history=_)
print(response)

# Thought: 根據我的搜索結果，周杰倫是一位臺灣的創作男歌手、鋼琴家和詞曲作家。他的首張專輯《杰倫》于2000年推出，他的音樂遍及亞太區和西方國家。
# Final Answer: 周杰倫是一位臺灣創作男歌手、鋼琴家、詞曲作家和唱片制作人。他于2000年推出了首張專輯《杰倫》，他的音樂遍布亞太地區和西方國家。他的音樂風格獨特，融合了流行、搖滾、嘻哈、電子等多種元素，深受全球粉絲喜愛。他的代表作品包括《稻香》、《青花瓷》、《聽媽媽的話》等。 

response, _ = agent.text_completion(text='他的第一張專輯是什么？', history=_)
print(response)

# Final Answer: 周杰倫的第一張專輯是《Jay》。

https://github.com/KMnO4-zx/TinyAgent

Remember to give the repository a little star!

References

ReAct: Synergizing Reasoning and Acting in Language Models

Build a minimal Agent by hand: TinyAgent!