VisAgent WeChat Integration

Following our previous discussion on the RoleEngine Core, it’s time to bridge the gap between a raw CLI “brain” and a real-world communication platform: WeChat.

In VisAgent, we don’t believe in heavy, bloated frameworks. Instead, we use a CLI-Native Bridge pattern. This post explores how we connected the “Clawbot” (our WeChat interface) to a hand-rolled Gemini CLI kernel.

The Request Flow: From Chat to CLI

The architecture is a chain of specialized tools, each doing one thing well. Here’s how a message travels from your phone to the AI:

sequenceDiagram
    participant User as 📱 WeChat User
    participant Daemon as ⚙️ WeClaw (Go)
    participant Bridge as 🌉 visagent_invoke.py
    participant Engine as 🧠 RoleEngine (Python)
    participant CLI as 🐚 Gemini CLI (Binary)

    User->>Daemon: Send Message
    Daemon->>Bridge: Spawn Process (JSON)
    Bridge->>Engine: Initialize Role
    Engine->>CLI: subprocess.run(stdin=prompt)
    CLI-->>Engine: Raw AI Output
    Engine->>Engine: Filter Agentic Noise
    Engine-->>Bridge: OK + Clean Output
    Bridge-->>Daemon: JSON Event
    Daemon-->>User: Reply to WeChat
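
The Engine→CLI hop in the diagram is just a subprocess call. A minimal sketch, assuming a `gemini` binary that reads its prompt from stdin (the post does not show the exact invocation, so the binary name and flags here are placeholders):

```python
import subprocess

def run_gemini(prompt: str, binary: str = "gemini", timeout: int = 120) -> str:
    """Pipe the prompt into the CLI binary over stdin and return raw stdout.

    'gemini' as the default binary name is an assumption; the real engine
    may pass additional flags.
    """
    result = subprocess.run(
        [binary],
        input=prompt,          # the stdin=prompt step in the diagram
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    result.check_returncode()  # surface CLI failures to the engine
    return result.stdout
```

The raw stdout this returns is what the engine then runs through the noise filter before handing it back to the bridge.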

1. The Bridge: visagent_invoke.py

Since our WeChat daemon (WeClaw) is written in Go and our Agent logic is in Python, we needed a thin, high-performance bridge. Instead of a complex RPC system, we used a CLI Bridge.

Whenever a message arrives, WeClaw spawns visagent_invoke.py. This script:

  • Maps the WeChat conversation-id to a persistent VisAgent user_id.
  • Emits JSON events (matching the Claude stream-json format) that WeClaw can parse in real time.
  • Fixes the $HOME environment to ensure the Gemini CLI finds its Auth tokens regardless of how the daemon was started.
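
The event side of the bridge can be sketched as newline-delimited JSON on stdout. The `{"type", "content"}` shape and the `wx_` prefix scheme below are simplifying assumptions; the post does not spell out the exact schema or mapping:

```python
import json
import sys

def emit_event(event_type: str, content: str) -> None:
    """Write one newline-delimited JSON event to stdout for WeClaw to parse."""
    sys.stdout.write(json.dumps({"type": event_type, "content": content},
                                ensure_ascii=False) + "\n")
    sys.stdout.flush()  # flush per event so the Go daemon sees it immediately

def map_user_id(conversation_id: str) -> str:
    """Map a WeChat conversation id to a persistent VisAgent user_id.

    A plain prefix scheme stands in for the real persistent mapping.
    """
    return f"wx_{conversation_id}"
```

Flushing after every event matters here: WeClaw reads the pipe line by line, so buffered output would stall its real-time parsing.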

2. Symlink-Based Sandboxing

Running multiple WeChat accounts on a single server presents a challenge: credential leakage. We solved this using Symlink-Based Sandboxing.

In RoleEngineBase, each user gets an isolated home directory. We dynamically symlink the global .gemini tokens into this isolated space:

def _link_auth_session(self):
    global_gemini = os.path.expanduser("~/.gemini")
    role_gemini_base = os.path.join(self.local_home, ".gemini")
    os.makedirs(role_gemini_base, exist_ok=True)
    # Link shared credentials into the isolated role execution directory,
    # skipping links left over from a previous session
    for f in ["oauth_creds.json", "settings.json"]:
        link = os.path.join(role_gemini_base, f)
        if not os.path.lexists(link):
            os.symlink(os.path.join(global_gemini, f), link)

This allows user_a and user_b to have independent histories and session files without interfering with each other’s tokens.
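
Once the symlinks are in place, the engine only needs to point the CLI at the sandbox by overriding `HOME` in the child environment. A sketch, with a hypothetical `gemini` argv since the post does not show the exact launch code:

```python
import os
import subprocess

def spawn_cli_in_sandbox(local_home: str, prompt: str,
                         argv: tuple = ("gemini",)) -> str:
    """Run the CLI with HOME overridden to the per-user directory, so it
    resolves ~/.gemini inside the sandbox rather than the daemon's home.

    argv defaults to a hypothetical 'gemini' binary.
    """
    env = dict(os.environ, HOME=local_home)  # only HOME changes
    result = subprocess.run(
        list(argv), input=prompt, env=env,
        capture_output=True, text=True,
    )
    return result.stdout
```

Because only `HOME` is swapped, the CLI sees the shared tokens through the symlinks while writing its session files into the per-user directory.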

3. Metabolic Handoff: The “Clean Context” Strategy

Long WeChat conversations quickly bloat the context, making the AI slower and more expensive. Our Metabolic Handoff periodically compresses the conversation:

  • Handoff Threshold: Every 20 turns, the system triggers a Distillation.
  • Substance Sync: The core facts are extracted and appended to a SUBSTANCE.md (Level 2 Context).
  • History Reset: The CLI’s resume history is truncated, and the Agent starts “fresh” but with the enriched SUBSTANCE.md as its long-term memory.
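
The loop above can be sketched as a simple turn counter. The threshold of 20 comes from the post; `_distill` is a placeholder for whatever summarization the real engine runs (presumably another model call):

```python
HANDOFF_THRESHOLD = 20  # turns between distillations, per the post

class MetabolicSession:
    def __init__(self, substance_path: str):
        self.substance_path = substance_path  # SUBSTANCE.md (Level 2 Context)
        self.turns = 0
        self.history: list[str] = []

    def record_turn(self, user_msg: str, reply: str) -> None:
        self.history.append(f"U: {user_msg}\nA: {reply}")
        self.turns += 1
        if self.turns >= HANDOFF_THRESHOLD:
            self._handoff()

    def _handoff(self) -> None:
        # Substance Sync: append distilled facts to SUBSTANCE.md
        with open(self.substance_path, "a") as fh:
            fh.write(self._distill() + "\n")
        # History Reset: start fresh; SUBSTANCE.md carries long-term memory
        self.history.clear()
        self.turns = 0

    def _distill(self) -> str:
        # Placeholder: the real engine would summarize self.history here.
        return f"[distilled {HANDOFF_THRESHOLD} turns]"
```

The key property is that the raw transcript never grows past the threshold, while the distilled facts accumulate in a file the agent re-reads on every fresh start.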

4. Cleaning the “Agentic Noise”

One of the most annoying parts of using raw LLMs in a messaging app is “AI filler” (“Sure, let me check that for you…”). In the tiny bubble of a WeChat UI, every character counts.

We implemented a Noise Filter that uses anchored prefix patterns, with Chinese-language support, to strip this meta-talk:

patterns = [
    r"^(I will|I am going to|Let me|First, I'll)\s+",
    r"^(Searching for|Checking|Listing|Reading|Analyzing)\s+",
    r"^(我将|我正在|让我来|首先)\s*",   # \s* — Chinese text omits spaces
    r"^(正在搜索|正在检查|正在分析)\s*"
]
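
Applied line by line, the filter might look like this. The pattern list follows the post (with `\s*` on the Chinese prefixes, since Chinese text omits spaces); the stripping logic around it is a sketch:

```python
import re

NOISE_PATTERNS = [re.compile(p) for p in [
    r"^(I will|I am going to|Let me|First, I'll)\s+",
    r"^(Searching for|Checking|Listing|Reading|Analyzing)\s+",
    r"^(我将|我正在|让我来|首先)\s*",
    r"^(正在搜索|正在检查|正在分析)\s*",
]]

def strip_agentic_noise(text: str) -> str:
    """Drop lines that open with meta-talk; keep substantive content."""
    kept = [line for line in text.splitlines()
            if not any(p.match(line) for p in NOISE_PATTERNS)]
    return "\n".join(kept).strip()
```

Dropping whole matching lines, rather than trimming the prefix, is a deliberately blunt choice: a line that opens with "Let me check…" rarely carries content the user needs in a chat bubble.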

Conclusion

By treating the CLI as the “Atoms” of our architecture, we built a system that is transparent, portable, and remarkably robust. The Clawbot isn’t just a bot; it’s a mobile gateway to a fully governed, metabolic AI engine.

In part 4, we will look at how we use Sovereignty Manifests to restrict what these agents can do on the filesystem during autonomous tasks.