On March 6, Manus, the world's first general AI Agent product, released by Chinese startup Monica, was all over domestic technology media and social networks. On launch day, invitation codes were nearly impossible to obtain across the entire network; a single code was listed on Xianyu for as much as 50,000 yuan. Many industry KOLs nevertheless received invitation codes in advance, and a flood of hands-on interpretation articles followed.
As a general AI Agent product, Manus can carry a task autonomously from planning through execution, such as writing reports and building spreadsheets. It does not merely generate ideas; it thinks, plans, and acts on its own, delivering complete results and demonstrating an unprecedented level of versatility and execution.
The popularity of Manus has not only attracted attention within the industry, but also provided valuable product ideas and design inspiration for the development of AI agents of all kinds. With the rapid development of AI technology, the AI Agent, as an important branch of artificial intelligence, is gradually moving from concept to reality and has shown great application potential across industries, including Web3.
Background
AI Agent, or Artificial Intelligence Agent, is a computer program that can autonomously make decisions and perform tasks based on its environment, inputs, and predefined goals. The core components of an AI Agent are:
Large language model (LLM): the agent's brain, which enables it to process information, learn from interactions, make decisions, and perform actions;
Observation and perception: mechanisms that allow the agent to sense its environment;
Reasoning: analyzing observations and memory content and weighing possible actions;
Action execution: the explicit response to reasoning and observation;
Memory and retrieval: storing past experiences so the agent can learn from them.
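To make the division of responsibilities more concrete, here is a minimal TypeScript sketch of these components as interfaces. The names are purely illustrative and do not come from any particular framework.

```typescript
// Illustrative only: a minimal decomposition of an AI Agent's core components.
// None of these interfaces correspond to a real framework's API.

interface LLM {
  complete(prompt: string): Promise<string>;  // the "brain": reasoning and decision-making
}

interface Perception {
  observe(): Promise<string>;                 // sense the environment (tool output, user input, etc.)
}

interface Memory {
  store(entry: string): void;                 // keep past experiences
  retrieve(query: string): string[];          // recall the ones relevant to the current step
}

interface ActionExecutor {
  execute(action: string): Promise<string>;   // carry out the chosen action and return its result
}

interface AIAgent {
  llm: LLM;
  perception: Perception;
  memory: Memory;
  executor: ActionExecutor;
}
```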
The design patterns of AI Agents start from ReAct and have developed along two paths: one focuses more on the planning ability of the agent, including REWOO, Plan-and-Execute, and LLM Compiler; the other focuses more on reflection ability, including Basic Reflection, Reflexion, Self-Discovery, and LATS.
Among them, ReAct is the earliest AI Agent design pattern and currently the most widely used, so here we mainly introduce the concept of ReAct. ReAct solves various language reasoning and decision-making tasks by combining Reasoning and Acting within the language model. Its typical process is shown in the figure below and can be described as a cycle: Thought → Action → Observation (the TAO loop for short).
Thought: When faced with a problem, we first think deeply: how to frame the problem, and which key information and reasoning steps are needed to solve it.
Action: With the direction of thinking determined, the next step is to act: based on that thinking, take the corresponding measures or perform specific tasks to move the problem towards a solution.
Observation: After acting, we carefully observe the results. This step checks whether the action was effective and whether we are getting closer to the answer.
Loop iteration: The Thought → Action → Observation cycle then repeats until the observation indicates that the task is complete.
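A rough sketch of how the TAO loop can be driven in code is shown below. This is not the ReAct paper's reference implementation; the callLLM stub, the tool registry, and the "Final:"/"Action:" reply convention are assumptions made purely for illustration.

```typescript
// A minimal ReAct (Thought → Action → Observation) loop.
// `callLLM` and the tool implementations are placeholders, not a real API.

type Tool = (input: string) => Promise<string>;

async function callLLM(prompt: string): Promise<string> {
  // In a real agent this would call an LLM; here it is only a stub.
  throw new Error("plug in your own LLM call");
}

async function reactLoop(task: string, tools: Record<string, Tool>, maxSteps = 10): Promise<string> {
  let transcript = `Task: ${task}\n`;

  for (let step = 0; step < maxSteps; step++) {
    // Thought + Action: ask the model what to do next, given everything observed so far.
    const reply = await callLLM(
      `${transcript}\nRespond with either "Final: <answer>" or "Action: <tool> <input>".`
    );

    if (reply.startsWith("Final:")) {
      return reply.slice("Final:".length).trim(); // the model considers the task done
    }

    const match = reply.match(/^Action:\s*(\S+)\s*(.*)$/s);
    if (!match) {
      transcript += `\nObservation: could not parse the model's reply.`;
      continue;
    }

    const [, toolName, toolInput] = match;
    const tool = tools[toolName];

    // Observation: run the chosen tool and feed the result back into the next round of thinking.
    const observation = tool ? await tool(toolInput) : `unknown tool "${toolName}"`;
    transcript += `\n${reply}\nObservation: ${observation}`;
  }

  return "Stopped: step limit reached without a final answer.";
}
```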
According to the number of agents involved, AI Agents can be divided into Single Agent and Multi Agent. A Single Agent centers on the cooperation between the LLM and its tools, and the agent may go through multiple rounds of interaction with the user while completing a task. A Multi Agent system assigns different roles to different agents and completes complex tasks through collaboration between them, with less user interaction during the process than a Single Agent. At present, most frameworks focus on the Single Agent scenario.
Model Context Protocol (MCP) is an open-source protocol released by Anthropic on November 25, 2024, aimed at solving the problem of connecting LLMs with external data sources and tools. If the LLM is compared to an operating system, MCP is the USB interface: external data and tools can be plugged in flexibly, and users can then read and use them.
MCP extends the LLM with three capabilities: Resources (knowledge expansion), Tools (executing functions and calling external systems), and Prompts (pre-written prompt templates). The protocol adopts a Client-Server architecture, with JSON-RPC as the underlying transport. Anyone can develop and host an MCP Server, and can take it offline and stop the service at any time.
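A minimal MCP Server might look like the sketch below, based on the official TypeScript SDK (@modelcontextprotocol/sdk). Exact APIs may differ between SDK versions, so treat it as a sketch rather than a reference.

```typescript
// A minimal MCP Server exposing one Tool over stdio.
// Assumes the official TypeScript SDK and a Node.js ESM environment (top-level await).
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({ name: "demo-server", version: "0.1.0" });

// Register a Tool: the MCP client (and, through it, the LLM) can discover and call it via JSON-RPC.
server.tool(
  "get_time",
  { timezone: z.string().describe("IANA timezone, e.g. Asia/Shanghai") },
  async ({ timezone }) => ({
    content: [
      {
        type: "text",
        text: new Date().toLocaleString("en-US", { timeZone: timezone }),
      },
    ],
  })
);

// Communicate with the MCP client over standard input/output.
const transport = new StdioServerTransport();
await server.connect(transport);
```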
The Current State of AI Agents in Web3
In the Web3 industry, the popularity of AI Agents has dropped significantly since its peak in January this year, with overall market capitalization shrinking by more than 90%. Currently, the most prominent and highest-valued Web3 explorations still center on AI Agent frameworks: the launch platform model represented by Virtuals Protocol, the DAO model represented by ElizaOS, and the commercial company model represented by Swarms.
A launch platform lets users create, deploy, and monetize AI Agents, much like pump.fun for memes, but for AI Agents. Virtuals Protocol is currently the largest launch platform, with more than 100,000 Agents issued on it; the popular crypto KOL AIXBT was created on Virtuals. Virtuals Protocol includes a modular Agent framework called GAME, whose core positioning is to give developers an efficient, open framework that makes developing and launching an AI Agent as simple as building a WordPress website.
DAO stands for Decentralized Autonomous Organization. ElizaOS (formerly ai16z) was founded by @shawmakesmagic on the daos.fun platform. The original idea was to use AI models to simulate the investment decisions of the well-known venture capital firm a16z and its co-founder Marc Andreessen, investing in combination with the advice of DAO members. It later developed into a DAO for AI Agent developers centered on the Eliza framework. Eliza is built with TypeScript and provides a flexible, extensible platform for developing AI Agents that can interact across multiple platforms while maintaining a consistent personality and knowledge.
Swarms was launched in 2022 by the 20-year-old @KyeGomezB. It is an enterprise-grade Multi Agent framework: through intelligent orchestration and efficient collaboration, Swarms lets multiple AI Agents work together like a team to meet complex business operation needs. Swarms started as a pure Web2 AI Agent project; according to its founder, it has more than 45 million agents running in production environments, serving some of the world's largest financial, insurance, and medical institutions. It did not officially move from Web2 into Web3 until the $SWARMS token was issued in December 2024.
From the perspective of the economic model, only launch platforms have so far achieved a self-sustaining economic closed loop. Take Virtuals as an example:
Agent creation: The creator launches a new AI Agent on the Virtuals platform;
Bonding curve setup: The creator pays 100 $VIRTUAL tokens, and a bonding curve is created for the newly issued agent token, paired with $VIRTUAL (a simplified bonding-curve sketch follows this list);
Liquidity pool creation: Once the bonding curve threshold is reached, the agent "graduates" and a liquidity pool is created pairing the agent token with $VIRTUAL, adhering to the principle of a fair launch with no insiders: no pre-mine or internal allocation, a fixed total supply, and liquidity locked for an extended period.
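The sketch below illustrates the bonding-curve mechanics in TypeScript. The constant-product formula, the reserve sizes, and the graduation threshold are assumptions chosen for illustration, not Virtuals Protocol's actual parameters: buyers deposit $VIRTUAL, receive agent tokens at a price that rises with demand, and the curve "graduates" once enough $VIRTUAL has accumulated.

```typescript
// Toy bonding curve for illustration only. The constants and the
// constant-product formula are assumptions, not Virtuals Protocol's real parameters.

class ToyBondingCurve {
  // Virtual reserves: price = virtualReserve / tokenReserve rises as tokens are bought.
  private virtualReserve = 1_000;           // $VIRTUAL side
  private tokenReserve = 1_000_000;         // agent-token side
  private readonly k = this.virtualReserve * this.tokenReserve;
  private readonly graduationThreshold = 42_000; // assumed $VIRTUAL accumulated before "graduation"

  /** Buy agent tokens with `amountIn` $VIRTUAL; returns the tokens received. */
  buy(amountIn: number): number {
    const newVirtualReserve = this.virtualReserve + amountIn;
    const newTokenReserve = this.k / newVirtualReserve; // x * y = k
    const tokensOut = this.tokenReserve - newTokenReserve;

    this.virtualReserve = newVirtualReserve;
    this.tokenReserve = newTokenReserve;
    return tokensOut;
  }

  /** Current marginal price in $VIRTUAL per agent token. */
  price(): number {
    return this.virtualReserve / this.tokenReserve;
  }

  /** Has the curve accumulated enough $VIRTUAL to graduate to a liquidity pool? */
  hasGraduated(): boolean {
    return this.virtualReserve >= this.graduationThreshold;
  }
}

// Each purchase pushes the price up; once the threshold is crossed,
// the accumulated reserves would seed a regular liquidity pool.
const curve = new ToyBondingCurve();
console.log(curve.buy(500), curve.price(), curve.hasGraduated());
```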
In addition to the launch fee for AI Agents, Virtuals charges transaction fees on every trade of agent tokens, and also charges inference fees when AI Agents access LLMs through the Virtuals API. Currently, ElizaOS and Swarms are planning to build their own launch platforms as well.
Of course, launch platforms have problems too. This model of asset issuance requires the issued assets to be attractive in order to form a positive flywheel. At present, most launched AI Agents are essentially memes with no intrinsic value behind them; once they lose the market's attention, they quickly go to zero. In the current cold market environment, launch platforms cannot even attract creators, so the economic model cannot actually work.
MCP's Web3 Exploration
The emergence of MCP has brought new directions of exploration to today's Web3 AI Agents. The most intuitive ones are:
Deploying MCP Servers on a blockchain network, solving the single point of failure of an MCP Server and providing censorship resistance;
Giving the MCP Server the ability to interact with the blockchain, such as conducting DeFi transactions and management, thereby lowering the technical threshold.
The first direction places extremely high demands on the storage system, data management capabilities, and asynchronous computing capabilities of the underlying blockchain. A blockchain like 0G could be chosen: 0G is a modular AI blockchain with a scalable, programmable DA layer suited to AI dapps. Its modular technology aims to achieve frictionless interoperability between chains while ensuring security, eliminating fragmentation and maximizing connectivity to create a decentralized AI ecosystem.
The second direction is similar to a variant of DeFAI, but currently the backend of DeFAI is a set of self-encapsulated tools exposed through Function Calling. UnifAI builds a unified DeFAI MCP Server to avoid reinventing the wheel. UnifAI is a platform that enables autonomous AI agents to perform on-chain and off-chain tasks in the Web3 ecosystem, with UniQ for task automation, an agent service marketplace, and infrastructure for tool discovery.
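As a concrete illustration of this second direction, the sketch below exposes an on-chain balance query as an MCP Tool, reusing the SDK pattern from the earlier MCP example together with ethers.js. The RPC endpoint is a placeholder, and a real DeFAI MCP Server would expose far richer tools (swaps, lending, portfolio management).

```typescript
// Sketch: an MCP Tool that lets an LLM query an on-chain ETH balance.
// The RPC URL is a placeholder; swap or lending tools would follow the same pattern.
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { ethers } from "ethers";
import { z } from "zod";

const provider = new ethers.JsonRpcProvider("https://example-rpc.invalid"); // placeholder endpoint
const server = new McpServer({ name: "defi-tools", version: "0.1.0" });

server.tool(
  "get_eth_balance",
  { address: z.string().describe("EVM address, e.g. 0xabc...") },
  async ({ address }) => {
    const wei = await provider.getBalance(address);
    return {
      content: [{ type: "text", text: `${ethers.formatEther(wei)} ETH` }],
    };
  }
);

await server.connect(new StdioServerTransport());
```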
In addition to the above two directions, @brucexu_eth, the founder of LXDAO and ETHPanda, has proposed building an OpenMCP.Network creator incentive network on Ethereum. MCP Servers need to be hosted and provide a stable service; users pay the LLM providers, and the LLM providers distribute the actual incentives through the network to the MCP Servers they call, maintaining the sustainability and stability of the whole network and encouraging MCP creators to keep producing high-quality content. Such a network would need smart contracts to make the incentives automated, transparent, trustworthy, and censorship-resistant; signatures, permission verification, and privacy protection during operation could all be implemented with technologies such as Ethereum wallets and ZK.
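The proposal above is still only a concept. As a back-of-the-envelope illustration of the incentive flow, and not any actual OpenMCP.Network design, the snippet below splits a fee pool among MCP Servers pro-rata to the calls each one served; on-chain, the same accounting would live in a smart contract.

```typescript
// Illustrative only: off-chain accounting for splitting an incentive pool
// among MCP Servers in proportion to the calls each one served.
// This is not an actual OpenMCP.Network specification.

interface CallRecord {
  serverId: string;  // identity of the MCP Server (e.g. an Ethereum address)
  calls: number;     // verified calls served during the payout period
}

function distributeIncentives(poolWei: bigint, records: CallRecord[]): Map<string, bigint> {
  const totalCalls = records.reduce((sum, r) => sum + r.calls, 0);
  const payouts = new Map<string, bigint>();

  if (totalCalls === 0) return payouts; // nothing to distribute

  for (const r of records) {
    // Pro-rata share of the pool; integer division mirrors on-chain arithmetic.
    payouts.set(r.serverId, (poolWei * BigInt(r.calls)) / BigInt(totalCalls));
  }
  return payouts;
}

// Example: a 1 ETH pool split between two servers that served 600 and 400 calls.
const pool = 10n ** 18n;
console.log(distributeIncentives(pool, [
  { serverId: "0xServerA", calls: 600 },
  { serverId: "0xServerB", calls: 400 },
]));
```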
Although in theory the combination of MCP and Web3 can inject a decentralized trust mechanism and an economic incentive layer into AI Agent applications, current zero-knowledge proof (ZKP) technology still makes it difficult to verify the authenticity of Agent behavior, and decentralized networks still have efficiency problems. This is not a solution that will succeed in the short term.
Summary
The release of Manus marks an important milestone for general AI Agent products. The Web3 world also needs a milestone product to dispel the outside world's doubt that Web3 has no practical use and is only hype.
The emergence of MCP has brought new directions of exploration for Web3 AI Agents, including deploying MCP Servers on blockchain networks, enabling MCP Servers to interact with the blockchain, and building an incentive network for MCP Server creators.
AI is the grandest narrative in history. For Web3, integration with AI is inevitable. We still need to maintain patience and confidence and continue exploring.
This article was written by pignard.eth (X account @pignard_web3) of the ZAN Team (X account @zan_team).
Note: This article is only for technical sharing and does not constitute any recommendation or suggestion.