
Following our previous discussion on the RoleEngine Core, it’s time to bridge the gap between a raw CLI “brain” and a real-world communication platform: WeChat.
In VisAgent, we don’t believe in heavy, bloated frameworks. Instead, we use a CLI-Native Bridge pattern. This post explores how we connected the “Clawbot” (our WeChat interface) to a hand-rolled Gemini CLI kernel.
The Request Flow: From Chat to CLI
The architecture is a chain of specialized tools, each doing one thing well. Here’s how a message travels from your phone to the AI:
```mermaid
sequenceDiagram
    participant User as 📱 WeChat User
    participant Daemon as ⚙️ WeClaw (Go)
    participant Bridge as 🌉 visagent_invoke.py
    participant Engine as 🧠 RoleEngine (Python)
    participant CLI as 🐚 Gemini CLI (Binary)
    User->>Daemon: Send Message
    Daemon->>Bridge: Spawn Process (JSON)
    Bridge->>Engine: Initialize Role
    Engine->>CLI: subprocess.run(stdin=prompt)
    CLI-->>Engine: Raw AI Output
    Engine->>Engine: Filter Agentic Noise
    Engine-->>Bridge: OK + Clean Output
    Bridge-->>Daemon: JSON Event
    Daemon-->>User: Reply to WeChat
```
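The Engine→CLI hop in the diagram is just a blocking subprocess call. Here is a minimal sketch of that step; the `gemini` binary name and the `cmd` parameter are assumptions for illustration, not the actual RoleEngine code:

```python
import subprocess

def run_cli(prompt: str, cmd=("gemini",), timeout: int = 120) -> str:
    # Feed the prompt over stdin, as the diagram's
    # subprocess.run(stdin=prompt) arrow suggests.
    result = subprocess.run(
        list(cmd),
        input=prompt,          # becomes the child process's stdin
        capture_output=True,   # collect the raw AI output
        text=True,
        timeout=timeout,
    )
    result.check_returncode()  # surface CLI failures to the engine
    return result.stdout
```

Because the call is synchronous, one spawned bridge process maps cleanly to one CLI invocation — no connection pooling or RPC state to manage.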
1. The Bridge: visagent_invoke.py
Since our WeChat daemon (WeClaw) is written in Go and our Agent logic is in Python, we needed a thin, high-performance bridge. Instead of a complex RPC system, we used a CLI Bridge.
Whenever a message arrives, WeClaw spawns visagent_invoke.py. This script:
- Maps the WeChat `conversation-id` to a persistent VisAgent `user_id`.
- Emits JSON events (matching the `claude stream-json` standard) that WeClaw can parse in real time.
- Fixes the `$HOME` environment variable to ensure the Gemini CLI finds its auth tokens regardless of how the daemon was started.
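The first two responsibilities can be sketched as follows. The hash-based id mapping and the exact event fields here are assumptions for illustration — the real bridge may use a persisted lookup table and a richer schema:

```python
import hashlib
import json
import sys

def to_user_id(conversation_id: str) -> str:
    # Hypothetical mapping: derive a stable VisAgent user_id from the
    # WeChat conversation id (the real lookup may be a stored table).
    return "wx_" + hashlib.sha1(conversation_id.encode()).hexdigest()[:12]

def emit_event(event_type: str, **payload) -> str:
    # One JSON object per line, in the spirit of the claude stream-json
    # format, so WeClaw can parse events as they arrive.
    line = json.dumps({"type": event_type, **payload}, ensure_ascii=False)
    print(line, flush=True)
    return line
```

Line-delimited JSON keeps the Go side trivial: WeClaw just scans stdout line by line and unmarshals each event.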
2. Isolation via Symlinks
Running multiple WeChat accounts on a single server presents a challenge: credential leakage. We solved this using Symlink-based Sandboxing.
In RoleEngineBase, each user gets an isolated home directory. We dynamically symlink the global .gemini tokens into this isolated space:
```python
def _link_auth_session(self):
    global_gemini = os.path.expanduser("~/.gemini")
    role_gemini_base = os.path.join(self.local_home, ".gemini")
    os.makedirs(role_gemini_base, exist_ok=True)
    # Link shared credentials into the isolated role execution directory
    for f in ["oauth_creds.json", "settings.json"]:
        target = os.path.join(role_gemini_base, f)
        if not os.path.lexists(target):  # skip links created on a prior run
            os.symlink(os.path.join(global_gemini, f), target)
```
This allows user_a and user_b to have independent histories and session files without interfering with each other’s tokens.
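Once the symlinks exist, isolating a run is just a matter of repointing `HOME` when the engine spawns the CLI — the linked credentials then resolve transparently. A sketch, where the `cmd` parameter is an illustration stand-in for the actual Gemini CLI invocation:

```python
import os
import subprocess

def run_in_role_home(local_home: str, prompt: str, cmd=("gemini",)) -> str:
    # Override HOME so the CLI resolves ~/.gemini inside the isolated
    # role directory, where _link_auth_session placed the symlinks.
    env = dict(os.environ, HOME=local_home)
    result = subprocess.run(
        list(cmd), input=prompt, env=env,
        capture_output=True, text=True,
    )
    return result.stdout
```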
3. Metabolic Handoff: The “Clean Context” Strategy
Long WeChat conversations quickly bloat the context, making the AI slower and more expensive. Our Metabolic strategy periodically distills the conversation:
- Handoff Threshold: every 20 turns, the system triggers a Distillation.
- Substance Sync: the core facts are extracted and appended to `SUBSTANCE.md` (Level 2 Context).
- History Reset: the CLI's `resume` history is truncated, and the Agent starts "fresh" but with the enriched `SUBSTANCE.md` as its long-term memory.
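The handoff loop can be sketched as below. Here `distill` is a placeholder for the actual extraction pass (an LLM call in VisAgent), and the threshold constant mirrors the 20-turn figure above:

```python
HANDOFF_THRESHOLD = 20  # turns between distillations

def distill(history):
    # Placeholder: the real system runs an LLM pass that extracts core
    # facts; here we just keep the last few turns as bullet points.
    return "\n".join(f"- {turn}" for turn in history[-5:])

def maybe_handoff(turn_count, history, substance_path):
    """Every HANDOFF_THRESHOLD turns: append distilled facts to
    SUBSTANCE.md, then truncate the working history."""
    if turn_count == 0 or turn_count % HANDOFF_THRESHOLD:
        return False
    with open(substance_path, "a", encoding="utf-8") as f:
        f.write(distill(history) + "\n")
    history.clear()  # History Reset: the agent starts "fresh"
    return True
```

Appending rather than rewriting `SUBSTANCE.md` means each distillation layers on top of the last, so long-term memory only grows at the (much slower) rate of extracted facts, not raw turns.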
4. Cleaning the “Agentic Noise”
One of the most annoying parts of using raw LLMs in a messaging app is “AI filler” (“Sure, let me check that for you…”). In the tiny bubble of a WeChat UI, every character counts.
We implemented a Noise Filter that uses anchored regular expressions, with Chinese-language support, to strip meta-talk:
```python
patterns = [
    r"^(I will|I am going to|Let me|First, I'll)\s+",
    r"^(Searching for|Checking|Listing|Reading|Analyzing)\s+",
    r"^(我将|我正在|让我来|首先)\s+",
    r"^(正在搜索|正在检查|正在分析)\s+",
]
```
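Applied line by line, the filter might look like this sketch, which simply drops any line that opens with a noise prefix (the real filter may rewrite rather than drop):

```python
import re

NOISE = [re.compile(p) for p in [
    r"^(I will|I am going to|Let me|First, I'll)\s+",
    r"^(Searching for|Checking|Listing|Reading|Analyzing)\s+",
    r"^(我将|我正在|让我来|首先)\s+",
    r"^(正在搜索|正在检查|正在分析)\s+",
]]

def strip_noise(text: str) -> str:
    # Keep only lines that do not start with a known filler phrase.
    kept = [line for line in text.splitlines()
            if not any(p.match(line) for p in NOISE)]
    return "\n".join(kept).strip()
```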
Conclusion
By treating the CLI as the “Atoms” of our architecture, we built a system that is transparent, portable, and remarkably robust. The Clawbot isn’t just a bot; it’s a mobile gateway to a fully governed, metabolic AI engine.
In part 4, we will look at how we use Sovereignty Manifests to restrict what these agents can do on the filesystem during autonomous tasks.