> [!TIP]
> Objective: Evolve from single-step prompts to a "Plan-Simulate-Execute" closed loop, introducing YAML-based asynchronous DAG orchestration.
## 1. From "Chatting" to "Execution": The AX (Architect-Executive) Paradigm
The watershed between a simple Chatbot and an Agent is the ability to decompose vague goals into actionable steps. In VISAGENT, we implemented the AX Planner logic:
- Architect: Receives requirements and outputs a `TODO.json`. Execution is forbidden; only planning is allowed.
- Simulation: Before execution, a "Security Expert" role performs a risk assessment on the plan.
- Executive: Executes each step via `execute_step`.
This “think before you act” mechanism is easily implemented on any CLI using simple Prompt constraints:
```python
# Core AX_PLANNING_MODE Instruction
plan_prompt = (
    "You are a specialized Task Architect. Output ONLY valid JSON.\n"
    "Schema: { 'goal': '...', 'steps': [ {'id': 1, 'desc': '...', 'type': 'EXECUTION'} ] }"
)
```
## 2. The Failure of Linearity: Why We Need a DAG
Early AX Planners were strictly linear. Real-world engineering tasks, however, form a dependency graph: some steps can run in parallel, while others must wait for upstream results. This led to the creation of the Flow Architect.
## 3. Orchestration Core: YAML-based DAG
We designed a minimalist YAML orchestration protocol. Instead of heavy frameworks like LangGraph, we drive it directly with Python:
```yaml
id: "deploy_service_v1"
steps:
  - id: "build_docker"
    role: "builder"
    desc: "Build Image"
  - id: "health_check"
    depends_on: ["build_docker"]  # Declare dependency, constructing the DAG
    skill: "web_fetcher"
    desc: "Check Deployment Status"
```
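A DAG like this can be driven with nothing but the standard library. The sketch below assumes the YAML has already been parsed (e.g. with PyYAML) into the list of step dicts shown, and records an execution order instead of calling a real `execute_step`:

```python
from graphlib import TopologicalSorter  # stdlib since Python 3.9

# Parsed form of the YAML flow above (assumed, not VISAGENT's actual loader).
steps = [
    {"id": "build_docker", "role": "builder", "desc": "Build Image"},
    {"id": "health_check", "depends_on": ["build_docker"],
     "skill": "web_fetcher", "desc": "Check Deployment Status"},
]

def run_flow(steps):
    # Map each node to the set of nodes it waits on.
    graph = {s["id"]: set(s.get("depends_on", [])) for s in steps}
    order = []
    ts = TopologicalSorter(graph)
    ts.prepare()
    while ts.is_active():
        # get_ready() yields every node whose dependencies are satisfied,
        # so each batch could be dispatched in parallel (e.g. via asyncio).
        for node in ts.get_ready():
            order.append(node)  # a real driver would call execute_step here
            ts.done(node)
    return order

print(run_flow(steps))  # build_docker is scheduled before health_check
```

`TopologicalSorter` also raises `CycleError` on circular `depends_on` declarations, which gives the orchestrator cycle detection for free.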
Implementation Details:
- State Persistence: Each Flow has its corresponding
.state.json. If a process is interrupted, the system can resume from the last completed node. - L3 Context Propagation: Output from preceding nodes is automatically appended to
l3_context, serving as the reasoning basis for subsequent nodes.
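Both details fit in a few lines. This is a minimal sketch of the idea, not VISAGENT's actual implementation: the file name and the `l3_context` field follow the conventions above, while `run_step` and its `fn` callback are hypothetical stand-ins for the real executor:

```python
import json
import os
import tempfile

def run_step(step_id, fn, state_path):
    """Run one node, skipping it on resume and propagating l3_context."""
    state = {}
    if os.path.exists(state_path):
        with open(state_path) as f:
            state = json.load(f)
    if step_id in state.get("done", []):
        return state  # resume: this node already completed in a prior run
    # Upstream outputs arrive as the node's reasoning context.
    output = fn(state.get("l3_context", ""))
    state.setdefault("done", []).append(step_id)
    # Append this node's output so downstream nodes can reason over it.
    state["l3_context"] = state.get("l3_context", "") + f"\n[{step_id}] {output}"
    with open(state_path, "w") as f:
        json.dump(state, f)
    return state

path = os.path.join(tempfile.mkdtemp(), "flow.state.json")
run_step("build_docker", lambda ctx: "image built", path)
state = run_step("health_check", lambda ctx: f"checked after: {ctx.strip()}", path)
```

Writing the state file after every node is what makes interruption cheap: re-running the flow replays `run_step` calls, and completed nodes return immediately from the `done` check.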
## 4. Shadow Reflection
To enhance the stability of our “hand-rolled” system, we added a Shadow Reflection mechanism before each step:
```python
reflection_prompt = f"Executing: '{desc}'. Assess risks. Reply 'SAFE' or 'WARNING'."
reflection_res = self.engine.invoke(reflection_prompt)
if "WARNING" in reflection_res:
    # Auto-cutoff or human intervention
    raise RuntimeError(f"Shadow Reflection flagged step: {desc}")
```
This “self-questioning” logic gives the Agent a much-needed sense of caution during write operations (like file edits or deployments).
## Conclusion
The transition from linear to graph-based (DAG) orchestration is essential for an Agent to move from an experimental toy to an engineering tool. Through simple YAML definitions and state persistence, we achieved reliable long-cycle task scheduling without heavy external dependencies.
Next part: How to make your Agent “see” and understand multi-modal inputs, and the design of dynamic Skill Trees.