Agently Docs

Agently documentation for building AI applications with stable outputs, observable actions, and durable workflows.

View the Project on GitHub AgentEra/Agently

Models Overview

Agently has three protocol-level request plugins, plus per-provider configuration recipes that select one of them.

Layered view

Application code
      │
      ▼
  ModelRequest  ──►  ModelResponse
      │
      ▼
ModelRequester plugin (the "protocol layer")
   ├── OpenAICompatible             ◄── most providers (Chat Completions)
   ├── OpenAIResponsesCompatible    ◄── Responses API variants
   └── AnthropicCompatible          ◄── Claude
      │
      ▼
HTTP to a model endpoint

The protocol plugin is what builds the HTTP request body and parses the wire response. Provider configuration is just a settings preset that targets one of these plugins.

Why three plugins, not one

Earlier versions of the docs implied “every provider goes through OpenAICompatible”. That is no longer accurate. OpenAICompatible, OpenAIResponsesCompatible, and AnthropicCompatible are separate requester plugins. Each one directly implements the ModelRequester protocol and owns its own protocol mapping. Anthropic in particular builds its own request bodies — anthropic_version, anthropic_beta, an explicit max_tokens requirement, and the messages/system field shape Claude expects. Those differences are real enough that lumping Claude under “OpenAICompatible” produces wrong configurations.

If you are pointing at https://api.anthropic.com (or a Claude-compatible proxy that speaks the same protocol), use AnthropicCompatible. For everything else (OpenAI, DeepSeek, Qwen, Ollama, Kimi, GLM, MiniMax, Doubao, SiliconFlow, Groq, ERNIE, Gemini’s OpenAI-compat endpoint, plus any private gateway speaking the OpenAI Chat Completions API), use OpenAICompatible.

Picking a plugin

You’re calling Use plugin
OpenAI, Azure OpenAI, Gemini-via-OpenAI OpenAICompatible
DeepSeek, Qwen, Kimi, GLM, MiniMax, Doubao, SiliconFlow, Groq, ERNIE OpenAICompatible
Ollama or any other OpenAI-compatible local server OpenAICompatible
Anthropic / Claude (native API) AnthropicCompatible
A private gateway speaking the OpenAI Chat Completions API OpenAICompatible
A private gateway speaking the OpenAI Responses API OpenAIResponsesCompatible
A private gateway speaking the Anthropic Messages API AnthropicCompatible

Minimal configuration

from agently import Agently

# OpenAI-compatible
Agently.set_settings("OpenAICompatible", {
    "base_url": "https://api.openai.com/v1",
    "api_key": "${ENV.OPENAI_API_KEY}",
    "model": "${ENV.OPENAI_MODEL}",
})

# Or Anthropic
Agently.set_settings("AnthropicCompatible", {
    "base_url": "https://api.anthropic.com",
    "api_key": "${ENV.ANTHROPIC_API_KEY}",
    "model": "${ENV.ANTHROPIC_MODEL}",
    "max_tokens": 4096,
})

Per-provider recipes (env vars, common model names, base URLs) live in Providers.

Switching Models With Model Pool

For applications that use more than one model, configure model aliases with model_pool, then switch the active Agent model with activate_model(...). The alias can be concrete and operational, such as ollama-qwen2.5 or deepseek-v4.

agent.set_settings("model_pool", {
    "ollama-qwen2.5": "qwen2.5:7b",
    "deepseek-v4": "deepseek-chat",
})
agent.set_settings("key_pool", {
    "local": "ollama",
    "deepseek-main": "${ENV.DEEPSEEK_API_KEY}",
    "deepseek-backup": "${ENV.DEEPSEEK_BACKUP_API_KEY}",
})
agent.set_settings("key_pool_strategy", {
    "qwen2.5:7b": {"mode": "fixed", "pool": ["local"]},
    "deepseek-chat": {"mode": "round_robin", "pool": ["deepseek-main", "deepseek-backup"]},
})

result = (
    agent
    .activate_model("ollama-qwen2.5")
    .input("Summarize this incident.")
    .output({"summary": (str, "incident summary", True)})
    .start()
)

activate_model(...) affects subsequent Agent-owned requests, including chain-style agent.input(...).start() and agent.create_execution(). For a one-off override, use agent.create_request(model_key="deepseek-v4").

API keys are selected at request time by key_pool_strategy: fixed, random, round_robin, or least_used. Agently 4.1.3 does not automatically retry a failed provider request with another key after auth, quota, or billing errors; those failures are surfaced so application code can decide whether switching credentials is safe for the business operation.

Where the plugin code lives

If a provider is missing or speaks an incompatible protocol, you can add a new requester plugin — but in practice almost every commercial endpoint either ships an OpenAI-compatible mode, a Responses-style mode, or matches Anthropic’s protocol, so these built-ins cover most cases.

See also