AI Engine Integration (Generic Standard)

This document defines generic engineering standards for implementing an AI engine that:

  • Runs structured LLM calls with typed outputs (Pydantic models).
  • Supports a multi-agent architecture.
  • Exposes an extensible tooling layer (function calls).
  • Provides optional streaming progress/status updates for a better UX.

This page intentionally avoids product/domain specifics. It focuses on architecture and contracts.

Goals

  • One engine is the single entry point for model calls (consistent retries, logging, and safety defaults).
  • Typed outputs everywhere: each agent returns a Pydantic model so downstream code remains deterministic.
  • Tool gating: only mount tools when needed to reduce latency and error surface.
  • Prompt modularity: prompts are files, composed from reusable sections + per-agent templates.
  • Testability: agents and tools can be tested independently (prompt mapping, tool behavior, orchestration wiring).

Standard module layout

ai_engine/
├── core.py                   # AIEngine + config (model selection, retries) + tool mounting
├── deps.py                   # dependency container passed into tool contexts / agent runs
├── progress.py               # optional progress tracker for streaming status events
├── utils.py                  # prompt loading + shared helpers (e.g., embeddings, template rendering)
├── prompts/                  # reusable prompt sections shared across agents
├── agents/                   # specialized agents (one directory per agent)
│   └── <agent_name>/
│       ├── prompts/          # sys/user prompt templates
│       ├── generate.py       # reads prompt files + injects placeholders
│       ├── schemas.py        # Pydantic response models
│       ├── response.py       # calls AIEngine.generate(...) with schema + flags
│       ├── process.py        # orchestration (prepare inputs, call response, post-process)
│       └── tools.py          # optional agent-local helpers (avoid duplicating shared tools)
└── tools/                    # shared tools grouped by category
    └── <tool_category>/
        ├── tools.py
        └── prompts/
            └── tool_prompt.txt

Core engine contract (core.py)

AIEngineConfig

Expose a config object that controls at least:

  • model identifier (string)
  • retry count (int)

Keep it small and stable: config should be easy to override at call sites.
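
A minimal sketch of such a config as a frozen dataclass (the field names are illustrative, not prescribed):

from dataclasses import dataclass

@dataclass(frozen=True)
class AIEngineConfig:
    model: str           # provider model identifier string
    retries: int = 2     # retry count for transient or validation failures

Because the dataclass is frozen, call sites can override individual fields per call with dataclasses.replace(config, model=...) without mutating the shared default.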

AIEngine.generate(...)

Implement a single async method that:

  • Accepts system_prompt, user_prompt, and result_type (a Pydantic model class).
  • Accepts a typed dependency object (see deps.py) so tools can access shared context.
  • Allows feature flags for prompt sections and tool mounting (e.g., include knowledge tools, include pricing tools, include “important instructions”, etc.).
  • Returns either:
    • an instance of result_type (the common case), or
    • a full run result object when callers need tool history or debug info (optional).
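
A hedged sketch of the method signature, building on the AIEngineConfig and Deps sketches elsewhere on this page; the flag names mirror the examples above and are not prescriptive:

from __future__ import annotations

from typing import Type, TypeVar

from pydantic import BaseModel

T = TypeVar("T", bound=BaseModel)

class AIEngine:
    def __init__(self, config: AIEngineConfig) -> None:   # AIEngineConfig: see sketch above
        self.config = config

    async def generate(
        self,
        system_prompt: str,
        user_prompt: str,
        result_type: Type[T],
        deps: Deps,                                # typed dependency container (deps.py)
        *,
        include_knowledge_tools: bool = False,     # tool-gating flags (illustrative names)
        include_pricing_tools: bool = False,
        include_important_instructions: bool = False,
        return_full_result: bool = False,          # return the full run result for debugging
    ) -> T:
        # 1. Compose the final system prompt from reusable sections + flags.
        # 2. Configure the underlying agent with result_type as the structured output schema.
        # 3. Mount tool categories according to the include_*_tools flags.
        # 4. Run with self.config.retries and return the validated result_type instance
        #    (or the full run result when return_full_result is True).
        raise NotImplementedError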

Prompt composition inside the engine

Standard pattern:

  • Compose a final system prompt by concatenating reusable prompt sections:
    • a “job role” or “agent role” template
    • optional tenant/org context
    • process/agent-specific system prompt
    • optional tool guidelines (one per tool category)
    • optional extra instructions

Prefer file-based templates for each section so changes are reviewable and testable.
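
A minimal sketch of this composition, assuming file-based sections under ai_engine/prompts/ (the file and parameter names are illustrative):

from pathlib import Path
from typing import Optional

PROMPTS_DIR = Path("ai_engine/prompts")

def compose_system_prompt(
    agent_system_prompt: str,
    *,
    org_context: Optional[str] = None,
    tool_guidelines: Optional[list[str]] = None,
    extra_instructions: Optional[str] = None,
) -> str:
    # Reusable "agent role" section shared by every agent (illustrative file name).
    sections = [(PROMPTS_DIR / "agent_role.txt").read_text(encoding="utf-8")]
    if org_context:
        sections.append(org_context)            # optional tenant/org context
    sections.append(agent_system_prompt)        # process/agent-specific system prompt
    sections.extend(tool_guidelines or [])      # one guideline block per mounted tool category
    if extra_instructions:
        sections.append(extra_instructions)
    # Join sections with blank lines so each remains reviewable as a separate file.
    return "\n\n".join(s.strip() for s in sections if s and s.strip())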

Tool mounting

Tools should be attached to the agent conditionally:

  • Each tool category provides an attach_to_agent(agent) helper.
  • The engine attaches tool categories based on include_*_tools flags.
  • Tools should read the typed dependencies object from the run context rather than global state.
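
A sketch of the gating logic, assuming each category module exposes attach_to_agent(agent) as described above; the module paths and flag names are illustrative:

def mount_tools(agent, *, include_knowledge_tools: bool = False, include_pricing_tools: bool = False) -> None:
    # Import lazily so unused tool categories add no import cost or side effects.
    if include_knowledge_tools:
        from ai_engine.tools.knowledge.tools import attach_to_agent as attach_knowledge
        attach_knowledge(agent)
    if include_pricing_tools:
        from ai_engine.tools.pricing.tools import attach_to_agent as attach_pricing
        attach_pricing(agent)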

Dependency injection (deps.py)

Create a typed dependency container (dataclass) that can carry:

  • database session (optional, for tools that query storage)
  • provider client (optional, for embeddings or secondary model calls)
  • tenant/org id (or equivalent scope identifier)
  • conversation/transcript ids (optional, for auditing and retrieval)
  • process metadata (optional: versioning, intent/task ids, etc.)
  • progress tracker (optional, for streaming status)

Standards:

  • Keep the dependency surface minimal; add new fields only when a real tool/agent requires it.
  • Favor optional fields with safe behavior when missing (tools return empty/default results instead of raising).
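
A sketch of the container mirroring the fields listed above; every field is optional with a None default so tools can degrade gracefully (the field and type names are assumptions):

from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class Deps:
    db_session: Optional[Any] = None            # database session for storage-backed tools
    provider_client: Optional[Any] = None       # embeddings / secondary model calls
    org_id: Optional[str] = None                # tenant/org scope identifier
    conversation_id: Optional[str] = None       # auditing and retrieval
    transcript_id: Optional[str] = None         # auditing and retrieval
    process_meta: Optional[dict] = None         # versioning, intent/task ids, etc.
    progress: Optional[ProgressTracker] = None  # streaming status (see progress.py sketch)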

Prompt and template standards

  • Store prompt files as plain text under:
    • ai_engine/prompts/ for reusable shared sections
    • ai_engine/agents/<agent>/prompts/ for agent-specific templates
    • ai_engine/tools/<category>/prompts/tool_prompt.txt for tool instructions
  • Load prompt files with async IO.
  • Inject variables using a deterministic placeholder strategy (e.g., [...] markers replaced in generate.py).
  • When prompts require runtime fields from persistent configuration, render templates via a dedicated template renderer utility.
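
A minimal sketch of async prompt loading and [...]-style placeholder injection; asyncio.to_thread is used here in place of a dedicated async file library, and the file and placeholder names are illustrative:

import asyncio
from pathlib import Path

async def load_prompt(path: Path) -> str:
    # Prompt files are small; read them off the event loop without extra dependencies.
    return await asyncio.to_thread(path.read_text, encoding="utf-8")

def inject(template: str, values: dict[str, str]) -> str:
    # Deterministic replacement of [PLACEHOLDER] markers.
    for key, value in values.items():
        template = template.replace(f"[{key}]", value)
    return template

async def build_user_prompt(agent_dir: Path, question: str) -> str:
    template = await load_prompt(agent_dir / "prompts" / "user_prompt.txt")
    return inject(template, {"QUESTION": question})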

Tool standards (tools/<category>/tools.py)

Tool interface

Each tool should:

  • Be a single-purpose function or small set of functions.
  • Accept a typed RunContext[Deps] and explicit parameters (type annotated).
  • Return a predictable JSON-serializable shape (lists/dicts of primitives).
  • Validate inputs early (empty query, missing deps) and return safe defaults.
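
A sketch of one such tool, following the RunContext[Deps] convention above (pydantic-ai style); fetch_snippets is a hypothetical data-access helper and the tool name is illustrative:

from __future__ import annotations

from pydantic_ai import RunContext   # RunContext[Deps] as referenced above

async def search_knowledge(ctx: RunContext[Deps], query: str, limit: int = 5) -> list[dict]:
    """Return up to `limit` knowledge snippets matching `query`."""
    # Validate early and return a safe default instead of raising.
    if not query or not query.strip() or ctx.deps.db_session is None:
        return []
    # `fetch_snippets` is a hypothetical data-access helper used for illustration.
    rows = await fetch_snippets(ctx.deps.db_session, query.strip(), limit)
    # Keep the return shape stable and JSON-serializable.
    return [{"id": r["id"], "title": r["title"], "snippet": r["snippet"]} for r in rows]

def attach_to_agent(agent) -> None:
    # Register the tool on the given agent (assumes a pydantic-ai-style agent.tool hook).
    agent.tool(search_knowledge)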

Tool prompt (prompts/tool_prompt.txt)

Keep tool prompts short and structured:

  • Tool name
  • Purpose
  • When to use
  • How to call
  • Return shape
  • Rules / constraints
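
An illustrative tool_prompt.txt following this structure (the tool and its rules are invented for the example):

Tool name: search_knowledge
Purpose: retrieve short knowledge snippets relevant to the current question.
When to use: the user asks something that likely exists in stored knowledge.
How to call: search_knowledge(query: str, limit: int = 5)
Return shape: list of {id, title, snippet}; empty list when nothing matches.
Rules: call at most twice per request; never invent snippets that were not returned.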

Optional: streaming progress (progress.py)

For interactive UX, expose progress updates during processing:

  • Maintain a per-request progress tracker (e.g., an asyncio.Queue).
  • Provide a helper like emit_status("...") that tools/agents can call.
  • Surface status updates via SSE (or equivalent) as:
    • status events (short messages)
    • complete event (final payload)
    • error event (structured error)

Guidelines:

  • Progress messages should be short and meaningful (milestones, not spam).
  • Progress tracking should be in-memory and best-effort (never block core processing).
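
A minimal in-memory sketch of such a tracker; the class name, sentinel value, and queue size are assumptions:

import asyncio

class ProgressTracker:
    """Per-request, in-memory status channel; best-effort by design."""

    _COMPLETE = "__complete__"   # illustrative sentinel marking the end of the stream

    def __init__(self) -> None:
        self.queue: "asyncio.Queue[str]" = asyncio.Queue(maxsize=100)

    def emit_status(self, message: str) -> None:
        # Never block or fail core processing because of progress reporting.
        try:
            self.queue.put_nowait(message)
        except asyncio.QueueFull:
            pass

    def complete(self) -> None:
        self.emit_status(self._COMPLETE)

    async def stream(self):
        # Consumed by the SSE endpoint; yields short status messages until completion.
        while True:
            message = await self.queue.get()
            if message == self._COMPLETE:
                break
            yield message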

Testing standards

  • Agent tests:
    • Verify prompt generation replaces placeholders correctly.
    • Verify response parsing returns the expected Pydantic model shape.
  • Tool tests:
    • Verify tools handle missing deps and invalid inputs safely.
    • Verify return shapes are stable.
  • Orchestration tests:
    • Verify tool gating (which tools mount under which flags).
    • Verify progress emission doesn’t leak tasks or state across requests.
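
A hedged example of these tests in pytest, referring to the inject, Deps, and search_knowledge sketches above (the async test assumes pytest-asyncio or an equivalent plugin):

from types import SimpleNamespace

import pytest

def test_placeholder_injection():
    # Prompt generation must replace every [PLACEHOLDER] deterministically.
    template = "Answer the question: [QUESTION]"
    rendered = inject(template, {"QUESTION": "What is the refund policy?"})
    assert rendered == "Answer the question: What is the refund policy?"

@pytest.mark.asyncio
async def test_tool_returns_safe_default_without_deps():
    # Tools must not raise when optional dependencies are missing.
    ctx = SimpleNamespace(deps=Deps())   # lightweight stand-in for RunContext[Deps]
    assert await search_knowledge(ctx, "anything") == []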