The Complete MetaGPT Practice Guide: How to Define Single-Action and Multi-Action Agents

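I. Agents

1-1. Agent Overview

Agent: a system with a degree of autonomy and goal orientation that can execute tasks and make decisions without continuous human intervention. Some characteristics of agents are listed below:

(1) Autonomy and goal orientation

Autonomy: An agent can carry out tasks on its own, acting on its configured goals without step-by-step external instructions.
Goal orientation: An agent sets and pursues specific goals or tasks, and these goals guide its decision-making and behavior.

(2) Complex workflows

Task planning and execution: An agent can plan how to reach its goal, including task decomposition, prioritization, and the actual execution.
Self-dialogue and internal decision-making: While working on a problem, an agent can hold an internal dialogue to reason about and correct its course of action without external input.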
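As a rough illustration of task decomposition and prioritization, here is a minimal sketch in plain Python. The `decompose` function is hypothetical and hard-coded; a real agent would typically ask the LLM to produce the subtasks.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Task:
    priority: int                      # lower number = more urgent
    name: str = field(compare=False)   # not used when ordering tasks


def decompose(goal: str) -> list:
    # Hypothetical, hard-coded decomposition; a real agent would ask the LLM.
    return [
        Task(2, f"write code for: {goal}"),
        Task(1, f"clarify requirements for: {goal}"),
        Task(3, f"test code for: {goal}"),
    ]


def execute(goal: str) -> list:
    queue = decompose(goal)
    heapq.heapify(queue)               # order the subtasks by priority
    log = []
    while queue:
        task = heapq.heappop(queue)    # always take the most urgent subtask
        log.append(task.name)          # "executing" is just logging in this sketch
    return log


steps = execute("snake game")
for step in steps:
    print(step)
```

The subtasks come out in priority order regardless of the order in which they were generated, which is the part of planning this sketch is meant to show.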

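(3) Learning and adaptability

Reflection and refinement: An agent can learn from its own experience, evaluate past behavior, learn from mistakes, and improve its future strategy.
Environmental adaptability: When the environment changes or new challenges appear, an agent can adapt and adjust its behavior to maximize goal attainment.

(4) Memory mechanisms

Short-term memory: uses contextual information to make immediate decisions.
Long-term memory: retains key information for future decisions, usually via an external database or persistent storage (for example, a vector database).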
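The two kinds of memory can be sketched in a few lines of plain Python. This is only a toy: `AgentMemory` is a hypothetical class, and word overlap (Jaccard similarity) stands in for the embedding similarity a real vector database would use.

```python
from collections import deque


def similarity(a: str, b: str) -> float:
    """Jaccard word overlap; a stand-in for vector similarity."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


class AgentMemory:
    def __init__(self, short_term_size: int = 3):
        # Short-term memory: a bounded window of the most recent messages.
        self.short_term = deque(maxlen=short_term_size)
        # Long-term memory: everything, searchable later (stand-in for a vector DB).
        self.long_term = []

    def remember(self, text: str) -> None:
        self.short_term.append(text)
        self.long_term.append(text)

    def recall(self, query: str, k: int = 1) -> list:
        ranked = sorted(self.long_term, key=lambda t: similarity(query, t), reverse=True)
        return ranked[:k]


memory = AgentMemory(short_term_size=2)
for note in ["user likes python", "deadline is friday", "project uses metagpt"]:
    memory.remember(note)

print(list(memory.short_term))   # only the two most recent notes remain
print(memory.recall("python"))   # the oldest note is still retrievable
```

The oldest note has already dropped out of the short-term window, but it can still be retrieved from long-term storage when a related query arrives.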

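(5) Tool use and integration

API calls and external data access: An agent can use external resources (such as APIs and databases) to obtain information, fill gaps in its knowledge, or perform tasks the model cannot handle internally.
Technology integration: An agent can combine multiple technologies and services, such as code execution and access to specialized databases, to extend its capabilities and improve efficiency.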
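One common way to wire tools into an agent is a name-to-function registry that the agent dispatches into. The sketch below assumes that pattern; the `tool` decorator, `call_tool`, and both example tools are made up for illustration and do not come from any particular framework.

```python
from typing import Callable, Dict

TOOLS: Dict[str, Callable[[str], str]] = {}


def tool(name: str):
    """Register a function under a tool name the agent can refer to."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register


@tool("calculator")
def calculator(expression: str) -> str:
    # Deliberately tiny: only handles "a op b" with + - * /.
    left, op, right = expression.split()
    a, b = float(left), float(right)
    results = {"+": a + b, "-": a - b, "*": a * b, "/": a / b if b else float("nan")}
    return str(results[op])


@tool("lookup")
def lookup(key: str) -> str:
    # Stand-in for a database or API query.
    facts = {"metagpt": "a multi-agent framework"}
    return facts.get(key.lower(), "unknown")


def call_tool(name: str, argument: str) -> str:
    if name not in TOOLS:
        return f"no such tool: {name}"
    return TOOLS[name](argument)


print(call_tool("calculator", "6 * 7"))
print(call_tool("lookup", "MetaGPT"))
```

In a real agent the tool name and argument would come from the model's output rather than being written by hand.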

The figure below gives an overview of an LLM-powered autonomous agent system (tool use, memory, planning, and execution modules):

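1-2. Differences Between an Agent and ChatGPT

Agent vs. ChatGPT: the two differ in design, functionality, and goals. Both are built on AI technology, but they are applied differently and the nature of the interaction is quite different. The main differences are:

(1) Goals and autonomy

ChatGPT: primarily a responsive model, focused on producing a single relevant, coherent answer to a specific user input. Its main purpose is to answer questions, provide information, or simulate conversation.
AI Agent: places more emphasis on autonomy across an ongoing task. It can set and pursue long-term goals and complete tasks on its own through complex workflows, for example by correcting its own mistakes and continuously tracking task progress.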

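(2) Interaction style

ChatGPT: interaction is usually linear and short-lived: the user asks and ChatGPT answers. By default it keeps context only within a single conversation, so separate sessions are largely independent of each other.
AI Agent: can maintain state and memory across sessions, sustain long-running dialogue, and automatically carry out tasks and a series of related activities, such as calling APIs and tracking and updating state.

(3) Task execution and planning

ChatGPT: usually handles a single request or task and relies on user input to drive the conversation. On its own it does not plan or carry out a sequence of tasks.
AI Agent: has planning ability and can decide for itself which steps to take to complete a complex task. It can handle sequences of tasks and automate both decisions and execution.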

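(4) Technology integration and application

ChatGPT: primarily a text-generation tool. It can reach external information through plugins, but its core remains text processing and generation.
AI Agent: may integrate multiple technologies and tools, such as API calls, database access, and code execution, all in service of reaching its goal and executing tasks more efficiently.

(5) Learning and adaptation

ChatGPT: its training happens offline, on large amounts of data.
AI Agent: beyond offline learning, more sophisticated AI agents may be able to learn in real time, adapting and improving quickly from new experience. This usually requires some form of memory and self-reflection mechanism.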

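II. The Multi-Agent Framework MetaGPT

2-1. Installation & Configuration

Installation: Python 3.9 or later is required. Here we use conda to create a fresh environment for a trial install.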


conda create -n metagpt python=3.9 && conda activate metagpt

Development-mode install: recommended for developers who want to try out new ideas and build customized features.


git clone https://github.com/geekan/MetaGPT.git
cd ./MetaGPT
pip install -e .

Model configuration: edit the file ~/.metagpt/config2.yaml. For the detailed list of configuration options for each vendor's models, see: LLM API Configuration


llm:
  api_type: "openai"  # or azure / ollama / groq etc. Check LLMType for more options
  model: "gpt-4-turbo"  # or gpt-3.5-turbo
  base_url: "https://api.openai.com/v1"  # or forward url / other llm url
  api_key: "YOUR_API_KEY"
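Before running anything, it can help to check that the config file exists and that the placeholder key has been replaced. The helper below is my own hypothetical convenience, not part of MetaGPT. It is a naive line scan rather than a real YAML parser, and it only knows about the keys shown above.

```python
from pathlib import Path

CONFIG_PATH = Path.home() / ".metagpt" / "config2.yaml"


def find_config_problems(text: str) -> list:
    """Return human-readable problems found in the config text (naive line scan)."""
    values = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()          # drop trailing comments
        if ":" in line:
            key, value = line.split(":", 1)
            values[key.strip()] = value.strip().strip("\"'")
    problems = [f"missing {key}" for key in ("api_type", "model", "api_key") if not values.get(key)]
    if values.get("api_key") == "YOUR_API_KEY":
        problems.append("api_key is still the placeholder")
    return problems


if __name__ == "__main__":
    if CONFIG_PATH.exists():
        found = find_config_problems(CONFIG_PATH.read_text(encoding="utf-8"))
        print(found or "config looks plausible")
    else:
        print(f"{CONFIG_PATH} not found")
```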

2-2. Using an Existing Agent (ProductManager)

Overview: call the ProductManager agent. Note that the session context has to be created explicitly.


import asyncio

from metagpt.context import Context
from metagpt.roles.product_manager import ProductManager
from metagpt.logs import logger

async def main():
    msg = "Write a PRD for a snake game"
    context = Context()  # The session Context object is explicitly created, and the Role object implicitly shares it automatically with its own Action object
    role = ProductManager(context=context)
    while msg:
        msg = await role.run(msg)
        logger.info(str(msg))

if __name__ == '__main__':
    asyncio.run(main())

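Output:

2-3. An Agent with a Single Action (SimpleCoder)

Agent SimpleCoder: it can write code. Two steps are needed:

Define the write-code action
Define the role and give it the write-code ability

2-3-1. Defining the Write-Code Action

Inherit from the Action class.
self.PROMPT_TEMPLATE.format: reads the PROMPT_TEMPLATE attribute of the current class and calls format to fill in the template's placeholder, instruction. The run method receives instruction as its argument and builds the complete prompt from it.
self._aask: calls the LLM with the finished prompt.
The reply is then passed through the parsing function parse_code to obtain the generated code.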


import re
from metagpt.actions import Action

class SimpleWriteCode(Action):
    # Prompt sent to the LLM; {instruction} is filled in by run()
    PROMPT_TEMPLATE: str = """
    Write a python function that can {instruction} and provide two runnable test cases.
    Return ```python your_code_here ``` with NO other texts,
    your code:
    """

    name: str = "SimpleWriteCode"

    async def run(self, instruction: str):
        prompt = self.PROMPT_TEMPLATE.format(instruction=instruction)
        rsp = await self._aask(prompt)  # ask the LLM
        code_text = SimpleWriteCode.parse_code(rsp)
        return code_text

    @staticmethod
    def parse_code(rsp):
        # Extract the text inside the ```python ... ``` fence; fall back to the raw reply
        pattern = r"```python(.*)```"
        match = re.search(pattern, rsp, re.DOTALL)
        code_text = match.group(1) if match else rsp
        return code_text
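The formatting and parsing steps can be tried without MetaGPT or an API key. The sketch below copies the template and regex from the action above into standalone functions and feeds them a canned reply. `fake_rsp` is invented for illustration, and `FENCE` is built programmatically only so that the example itself contains no literal triple-backtick fence.

```python
import re

FENCE = "`" * 3  # three backticks, built at runtime

PROMPT_TEMPLATE = (
    "Write a python function that can {instruction} and provide two runnable test cases.\n"
    "Return " + FENCE + "python your_code_here " + FENCE + " with NO other texts,\n"
    "your code:\n"
)


def parse_code(rsp: str) -> str:
    # Same pattern as in the action above: capture everything inside the python fence.
    pattern = FENCE + r"python(.*)" + FENCE
    match = re.search(pattern, rsp, re.DOTALL)
    return match.group(1) if match else rsp


prompt = PROMPT_TEMPLATE.format(instruction="add two numbers")

# Canned reply standing in for the model's response (no LLM is called here).
fake_rsp = (
    "Sure!\n" + FENCE + "python\n"
    "def add(a, b):\n"
    "    return a + b\n" + FENCE + "\nHope this helps."
)

print(prompt)
print(parse_code(fake_rsp))
```

If the reply contains no fence at all, `parse_code` simply returns the reply unchanged, which is why the role still gets some text back even when the model ignores the formatting request.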

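2-3-2. Defining the Role

Inherit from the Role class, which is the logical abstraction of an agent.
A role can have multiple actions (Action) as well as memory, and can think and act using different strategies.
At initialization we equip it with the action SimpleWriteCode, i.e. writing code.
Override the _act function so that it retrieves the instruction from the most recent message.
Run the corresponding action with todo.run(msg.content); here todo is the Action to be performed.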


from metagpt.logs import logger
from metagpt.roles import Role
from metagpt.schema import Message

class SimpleCoder(Role):
    name: str = "Alice"
    profile: str = "SimpleCoder"

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.set_actions([SimpleWriteCode])

    async def _act(self) -> Message:
        logger.info(f"{self._setting}: to do {self.rc.todo}({self.rc.todo.name})")
        todo = self.rc.todo  # todo will be SimpleWriteCode()

        msg = self.get_memories(k=1)[0]  # find the most recent messages
        code_text = await todo.run(msg.content)
        msg = Message(content=code_text, role=self.profile, cause_by=type(todo))

        return msg

2-3-3. Initializing and Running the Role


import asyncio

from metagpt.context import Context

async def main():
    msg = "write a function that calculates the product of a list"
    context = Context()
    role = SimpleCoder(context=context)
    logger.info(msg)
    result = await role.run(msg)
    logger.info(result)

asyncio.run(main())

The result of the run is as follows:

The agent's run cycle is shown below:
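As I understand it, the cycle is roughly: observe the incoming message, think (choose the next action), act (run it), and repeat until there is nothing left to do. Below is a plain-Python sketch of that loop. `ToyAgent` and its method names are simplified stand-ins chosen for illustration and are not MetaGPT's Role implementation.

```python
import asyncio


class ToyAgent:
    """Simplified stand-in for a role; NOT MetaGPT's Role class."""

    def __init__(self, actions):
        self.actions = list(actions)  # async callables the agent may perform
        self.memory = []              # everything observed or produced so far
        self.todo = None              # action chosen by _think

    def _observe(self, message: str) -> None:
        self.memory.append(message)

    def _think(self) -> bool:
        # Pick the next pending action; report False when nothing is left.
        self.todo = self.actions.pop(0) if self.actions else None
        return self.todo is not None

    async def _act(self) -> str:
        result = await self.todo(self.memory[-1])  # feed the latest message to the action
        self.memory.append(result)
        return result

    async def run(self, message: str) -> str:
        self._observe(message)
        result = ""
        while self._think():
            result = await self._act()
        return result


async def shout(text: str) -> str:
    return text.upper()


result = asyncio.run(ToyAgent([shout]).run("hello agent"))
print(result)
```

With a single action the think step is trivial, which matches the SimpleCoder case: there is only one thing the role can do.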

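2-4. An Agent with Multiple Actions

RunnableCoder: it can not only generate code but also execute it.

2-4-1. Defining the Run-Code Action

Overview: executing code mainly means starting a subprocess and capturing its result. The write-code action is the same as above, except that the regex used for extraction needs a small change. Adjust it to match how your model formats its code; mine is: pattern = r"```python\n([\s\S]*?)\n```"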
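To see why the regex change matters, the sketch below compares the earlier greedy pattern with the non-greedy one on a reply that contains two code fences. The reply text is invented, and `FENCE` is built at runtime only to keep literal triple backticks out of the example.

```python
import re

FENCE = "`" * 3  # three backticks, built at runtime

GREEDY = FENCE + r"python(.*)" + FENCE             # pattern from section 2-3-1
LAZY = FENCE + r"python\n([\s\S]*?)\n" + FENCE     # pattern used in this section

# Hypothetical model reply containing two separate code fences.
reply = (
    FENCE + "python\nprint('first')\n" + FENCE + "\n"
    "Some explanation from the model.\n"
    + FENCE + "python\nprint('second')\n" + FENCE + "\n"
)

greedy = re.search(GREEDY, reply, re.DOTALL).group(1)
lazy = re.search(LAZY, reply).group(1)

print(repr(greedy))  # runs through to the last closing fence, prose included
print(repr(lazy))    # stops at the first closing fence
```

The greedy capture swallows the explanation and the second fence, which would not run as Python. The non-greedy capture returns only the first block, already stripped of the surrounding newlines.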


import subprocess

class SimpleRunCode(Action):
    name: str = "SimpleRunCode"

    async def run(self, code_text: str):
        result = subprocess.run(["python3", "-c", code_text], capture_output=True, text=True)
        code_result = result.stdout
        logger.info(f"{code_result=}")
        return code_result

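Note: the code-execution call depends on the operating system, since "python3" may not be on the PATH (for example on Windows), whereas sys.executable always points at the interpreter that is currently running. Mine is: subprocess.run([sys.executable, "-c", code_text], capture_output=True, text=True, encoding='utf-8')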
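A slightly more defensive variant of the same call is sketched below. The `timeout` and the stderr fallback are my additions rather than part of the original action, and `run_code` is a hypothetical helper name.

```python
import subprocess
import sys


def run_code(code_text: str, timeout: float = 10.0) -> str:
    """Run code_text in a fresh interpreter; return stdout, or stderr on failure."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", code_text],  # same interpreter as the caller
            capture_output=True,
            text=True,
            encoding="utf-8",
            timeout=timeout,                    # avoid hanging on infinite loops
        )
    except subprocess.TimeoutExpired:
        return f"timed out after {timeout} seconds"
    return result.stdout if result.returncode == 0 else result.stderr


print(run_code("print(1 + 1)"))
print(run_code("raise ValueError('boom')"))
```

Returning stderr on failure means the traceback ends up in the role's memory instead of an empty string, which makes failed generations easier to diagnose.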

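2-4-2. Defining the Role

Overview: define a role that has multiple actions.

Register all actions in set_actions.
_set_react_mode determines how the role chooses an action on each turn. Here we set it to by_order, meaning the actions run sequentially in the order given: write the code first, then execute it.
Override the _act function. The role retrieves information from the user input or from the output of the previous action and uses it as the argument to the current action (self.rc.todo).
Finally, it returns the message produced by the current action.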


class RunnableCoder(Role):
    name: str = "Alice"
    profile: str = "RunnableCoder"

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.set_actions([SimpleWriteCode, SimpleRunCode])
        self._set_react_mode(react_mode="by_order")

    async def _act(self) -> Message:
        logger.info(f"{self._setting}: to do {self.rc.todo}({self.rc.todo.name})")
        # By choosing the Action by order under the hood
        # todo will be first SimpleWriteCode() then SimpleRunCode()
        todo = self.rc.todo

        msg = self.get_memories(k=1)[0]  # fetch the most recent message (k=1)
        result = await todo.run(msg.content)

        msg = Message(content=result, role=self.profile, cause_by=type(todo))
        self.rc.memory.add(msg)
        return msg
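The chaining behind by_order can be imitated in plain Python: run the actions in their declared order and hand each one the most recent message. In the sketch below, `write_code` returns a canned snippet instead of calling an LLM, `run_code` uses exec only because that snippet is hard-coded and trusted, and `run_by_order` is a hypothetical helper, not a MetaGPT function.

```python
import asyncio
import contextlib
import io


async def write_code(instruction: str) -> str:
    # Canned output standing in for SimpleWriteCode; no LLM is called.
    return "print(sum([1, 2, 3]))"


async def run_code(code_text: str) -> str:
    # Stand-in for SimpleRunCode. exec() is acceptable here only because
    # the snippet above is hard-coded and trusted.
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code_text, {})
    return buffer.getvalue()


async def run_by_order(actions, user_message: str) -> list:
    memory = [user_message]
    for action in actions:                   # fixed order, as in "by_order"
        latest = memory[-1]                  # plays the role of get_memories(k=1)[0]
        memory.append(await action(latest))  # each output becomes the next input
    return memory


history = asyncio.run(run_by_order([write_code, run_code], "sum a list"))
print(history)
```

This also shows why the role above adds each result to memory: the second action can only find the generated code if the first action's output was stored.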

2-4-3. Starting the Role


import asyncio

from metagpt.context import Context

async def main():
    msg = "Write a Fourier function and run it"
    context = Context()
    role = RunnableCoder(context=context)
    logger.info(msg)
    result = await role.run(msg)
    logger.info(result)

asyncio.run(main())

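The output is as follows:

2-4-4. Full Code

The snippets above omit some required imports. The complete code is shown below and can be run as a single file.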


import re
from metagpt.actions import Action
from metagpt.schema import Message
from metagpt.logs import logger
import subprocess
import sys

class SimpleWriteCode(Action):
    PROMPT_TEMPLATE: str = """
        Write a python function that does the following: {instruction}, and provide one runnable test case.
        Return ```python your_code_here ``` with NO other text, your code:
        """

    name: str = "SimpleWriteCode"

    async def run(self, instruction: str):
        prompt = self.PROMPT_TEMPLATE.format(instruction=instruction)
        rsp = await self._aask(prompt)
        code_text = SimpleWriteCode.parse_code(rsp)
        return code_text

    @staticmethod
    def parse_code(rsp):
        # pattern = r"```python(.*)```"  # greedy version from section 2-3-1
        pattern = r"```python\n([\s\S]*?)\n```"  # non-greedy: stops at the first closing fence
        match = re.search(pattern, rsp, re.DOTALL)
        code_text = match.group(1) if match else rsp
        return code_text

class SimpleRunCode(Action):
    name: str = "SimpleRunCode"

    async def run(self, code_text: str):
        result = subprocess.run([sys.executable, "-c", code_text], capture_output=True, text=True, encoding='utf-8')
        code_result = result.stdout
        logger.info(f"{code_result=}")
        return code_result



from metagpt.roles import Role

class RunnableCoder(Role):
    name: str = "Alice"
    profile: str = "RunnableCoder"

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.set_actions([SimpleWriteCode, SimpleRunCode])
        # self._set_react_mode(react_mode="react", max_react_loop=3)
        self._set_react_mode(react_mode="by_order")

    async def _act(self) -> Message:
        logger.info(f"{self._setting}: to do {self.rc.todo}({self.rc.todo.name})")
        # By choosing the Action by order under the hood
        # todo will be first SimpleWriteCode() then SimpleRunCode()
        todo = self.rc.todo

        msg = self.get_memories(k=1)[0]  # fetch the most recent message (k=1)
        result = await todo.run(msg.content)

        msg = Message(content=result, role=self.profile, cause_by=type(todo))
        self.rc.memory.add(msg)
        return msg


import asyncio

from metagpt.context import Context

async def main():
    msg = "Write a Fourier function"
    context = Context()
    role = RunnableCoder(context=context)
    logger.info(msg)
    result = await role.run(msg)
    logger.info(result)

asyncio.run(main())

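2-4-5. Bonus: How Do You See the Model's Decision? (Deciding Which Action to Run Next)

Background: a friend of mine was puzzled by ReAct and wanted to know how the model decides which Action gets called next, hence this bonus section.

The main idea is to override the think method.
When defining the Role, add a new attribute to hold what is produced in the think method.
In act, print the model-decision prompt captured in think.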