Agently documentation for building AI applications with stable outputs, observable actions, and durable workflows.
The validation pipeline runs the first time a structured response result is consumed, then caches the outcome on that response result. It has a fixed order, and each step contributes to the same retry budget.
For Agently 4.1.0.1+, the default authoring path is: mark fixed required leaves directly in .output(...) with the third-slot ensure flag, then let the runtime compile those flags into ensure_keys. Pass ensure_keys= manually only when the required path is runtime-dependent, conditional, or easier to express outside the static schema.
.output(...) defaults to format="auto". Auto chooses the simplest structured
format from the schema shape: flat string-only dicts become flat_markdown;
dicts that mix string fields with complex list/object fields become hybrid;
boolean/numeric control fields, all-complex, and non-dict schemas stay json.
Auto does not inspect business meaning in field names or descriptions. If
downstream code relies on a specific wire shape, set the format explicitly.
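The structural rules above can be read as a small decision function. The following is a minimal sketch, not Agently's actual resolver: `choose_format` is a hypothetical name, leaves are assumed to be tuples such as `(str, "description", True)`, and only top-level fields are inspected.

```python
from typing import Any


def choose_format(schema: Any) -> str:
    """Hypothetical re-statement of the documented auto rules (illustration only)."""
    # Non-dict schemas (and, by assumption, empty dicts) stay json.
    if not isinstance(schema, dict) or not schema:
        return "json"

    leaves = [v for v in schema.values() if isinstance(v, tuple)]
    complex_fields = [v for v in schema.values() if isinstance(v, (list, dict))]

    # Assumption: any value that is neither a leaf tuple nor a list/dict keeps json.
    if len(leaves) + len(complex_fields) != len(schema):
        return "json"

    # Boolean/numeric control fields keep the whole schema on json.
    if any((leaf[0] if leaf else None) is not str for leaf in leaves):
        return "json"

    if leaves and not complex_fields:
        return "flat_markdown"  # flat string-only dict
    if leaves and complex_fields:
        return "hybrid"         # string fields mixed with list/object fields
    return "json"               # all-complex dict


page_schema = {"html": (str, "complete HTML document"), "notes": (str, "short notes")}
mixed_schema = {"analysis": (str, "rationale"), "components": [{"refdes": (str,)}]}
control_schema = {"approved": (bool, "decision"), "reason": (str, "why")}
```

Under these assumptions, `page_schema` resolves to `flat_markdown`, `mixed_schema` to `hybrid`, and `control_schema` to `json`. Field names and descriptions play no part in the decision, which matches the note that auto does not inspect business meaning.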
| Mode | Use When | Avoid When |
|---|---|---|
| auto | You want the framework default and the schema itself should choose the most model-friendly structured format. Good for application code that consumes parsed data through Agently rather than the raw model text. | A legacy consumer, test fixture, external API, or saved prompt expects raw JSON text. Use format="json" there. |
| flat_markdown | The result is a flat dict of named scalar fields, especially when one or more fields may contain large code, HTML, SVG, Markdown, SQL, templates, or multi-paragraph prose. Section headers avoid JSON escaping problems and keep big text blocks readable. | You need nested lists/objects, arrays of records, or a single unstructured document. |
| hybrid | The result mixes prose/code fields with structured lists or objects, for example summary plus citations, analysis plus components, or notes plus next_steps. Scalar fields stay as text sections; complex fields use JSON blocks. | Every field is scalar, or every field is deeply structured and compact enough for JSON. |
| json | You need the strictest machine contract, nested data, arrays, interop with external systems, compatibility with old prompts/tests, or exact raw JSON behavior. | Large embedded documents or code blocks make escaping fragile or hard for the model to read. |
| Plain text | The request asks for one freeform artifact: an article, email, explanation, report, Markdown page, HTML page, or other single multi-paragraph document. Do not call output(); use start() / async_start() directly or read response.result.get_text(). | You need separately addressable fields, path validation, ensure_keys, typed objects, or downstream branching. |
Use get_generator(type="instant") or get_async_generator(type="instant")
when the caller benefits from field-level updates before the full response is
finished: progress panels, live forms, long reports with independently
renderable sections, model-stage dashboards, or workflow UIs that can show a
field as soon as it is complete. For one freeform text artifact, use
type="delta" instead; plain text has no structured field paths for instant
events.
| Output Mode | Instant Support | Practical Guidance |
|---|---|---|
| auto | Yes, after auto resolves to json, flat_markdown, or hybrid. | Good default for UI streaming when callers consume Agently StreamingData rather than raw model text. If auto later degrades to JSON during final parsing, treat instant events as provisional UI data and use the final parsed result for durable writes. |
| flat_markdown | Yes, field-level text deltas by ### field sections. | Strong fit for long scalar fields such as code, HTML, Markdown, or report sections. First update for a field appears after the model emits that field header; pure-numeric scalar schemas have higher header-adherence risk. |
| hybrid | Yes, field-level text deltas by section. JSON block contents stream as text and are parsed into lists/objects at finalization. | Best for mixed prose plus structured records. Use instant for UI/progress, then use get_data() / async_get_data() for the finalized typed structure. |
| json | Yes, via incremental JSON parsing. | Best when arrays or nested objects need path-level updates. More sensitive to malformed or delayed JSON syntax while streaming; final repair still happens at completion. |
| Plain text / text | No structured instant paths. | Use type="delta" for raw token streaming, or get_text() after completion. |
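To make the flat_markdown row concrete, here is a minimal sketch of how `### field` sections can yield field-level deltas. It is an illustration only and not Agently's streaming parser: `iter_field_deltas` and `collect_fields` are hypothetical helpers, and the input is simplified to whole lines rather than token chunks.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

HEADER_PREFIX = "### "


def iter_field_deltas(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (field_name, text_line) pairs as lines arrive."""
    current = None
    for line in lines:
        if line.startswith(HEADER_PREFIX):
            # A header switches the active field; nothing is emitted for it.
            current = line[len(HEADER_PREFIX):].strip()
            continue
        if current is not None:
            # Text before the first header is dropped in this sketch.
            yield current, line


def collect_fields(events: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Fold provisional deltas into final field values."""
    parts: Dict[str, List[str]] = {}
    for name, delta in events:
        parts.setdefault(name, []).append(delta)
    return {name: "\n".join(chunks) for name, chunks in parts.items()}


streamed = ["### html", "<h1>Hi</h1>", "<p>Body</p>", "### notes", "Uses inline styles."]
events = list(iter_field_deltas(streamed))
fields = collect_fields(events)
```

No update for a field can appear until its header has been seen, which is why a model that skips headers produces no usable instant events. A field whose own content contains a line starting with `### ` would be split incorrectly by this simplified version.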
The 2026-05-23 cross-model acceptance run covered 6 providers and 12 scenarios (72 total checks) using DeepSeek, Qwen, Qianfan ERNIE, MiniMax, GLM, and local Qwen2.5. A follow-up 2026-05-24 structured-output stability smoke run covered DeepSeek V4 Flash, Qwen3.6-35B-A3B, and GLM-4.5-Air on flat strings, scalar controls, nested EDA netlists, hybrid EDA netlists, and model-judge arrays. Treat these as observed compatibility data, not a mathematical guarantee:
| Mode / Scenario Set | Observed Result | Selection Implication |
|---|---|---|
| auto overall | The 2026-05-23 run passed 72/72 under the previous broader auto matrix. Current auto is structural: flat string-only dicts may resolve to flat_markdown; string-plus-complex dicts may resolve to hybrid; boolean/numeric control fields, all-complex, and non-dict schemas resolve to json. | Good default when the application consumes final parsed data and can tolerate retry latency, but use explicit format when compatibility matters. |
| flat_markdown native parsing | Earlier flat-markdown scenarios showed header adherence risk, especially pure numeric fields and some ERNIE/GLM runs. Current auto avoids booleans/numbers. | Good for large text/code fields; avoid for all-number scalar schemas or models known to ignore section headers. |
| json nested structures | In the 2026-05-24 smoke run, nested EDA netlists and nested model-judge arrays passed on DeepSeek V4 Flash, Qwen3.6-35B-A3B, and GLM-4.5-Air; no JSON failure was observed. | Do not avoid complex nested structures categorically. Prefer JSON for dense nested records, judges, booleans, numbers, and machine contracts. |
| hybrid nested structures | A prompt-contract gap was found and fixed: complex hybrid sections now include their JSON sub-schema. After the fix, EDA hybrid passed first-attempt on DeepSeek V4 Flash and Qwen3.6-35B-A3B; GLM-4.5-Air hit a 360s request failure with no progress events. | Auto may select hybrid for string fields plus records. Use explicit json when a dense nested machine contract is preferred. For reasoning or large MoE models, use a 360s+ timeout and observe streaming/meta events before declaring failure. |
| instant sampled scenarios | Instant was included for flat scalar output (S8) and hybrid mixed output (S11) across the provider set. | Supported for UI/progress, but final business decisions should consume the completed parsed result because streaming events are provisional. |
Typical usage:
```python
# Assumes `agent` is an already configured Agently agent.

# Default: auto, chosen from schema shape.
agent.input("Create a self-contained page.").output({
    "html": (str, "complete HTML document"),
    "notes": (str, "short implementation notes"),
}).start()

# Force JSON when a downstream contract expects raw JSON-like structure.
agent.input("Extract invoice fields.").output({
    "vendor": (str, "vendor name", True),
    "line_items": [{"sku": (str,), "amount": (float,)}],
}, format="json").start()

# Explicit hybrid when prose plus records are both useful.
agent.input("Create an EDA netlist with design notes.").output({
    "analysis": (str, "one paragraph design rationale", True),
    "components": [{"refdes": (str, "reference designator", True), "value": (str, "part value", True)}],
    "nets": [{"name": (str, "net name", True), "connections": [{"refdes": (str, "refdes", True), "pin": (str, "pin", True)}]}],
}, format="hybrid").start()

# Plain text: one artifact, no structured parser.
html = agent.input("Write a complete landing page as HTML.").start()
```
```text
model returns text
        │
        ▼
1. parse / repair     ← extract structured object from text
        │
        ▼
2. strict output      ← match against .output(...) shape; ensure_all_keys checks if set
        │
        ▼
3. ensure_keys        ← per-leaf required-path checks (compiled from the ensure flag)
        │
        ▼
4. custom validate    ← .validate(handler) and validate_handler= business rules
        │
        ▼
pass → return result   |   fail → retry (if budget remains) → top of pipeline
```
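The flow in the diagram can be sketched as a loop in which every failing step draws on one retry budget. This is a simplified illustration, not Agently's implementation: `run_pipeline`, `parse_json`, and `RetryBudgetExhausted` are hypothetical names, steps 2 to 4 are reduced to plain boolean checks, and the budget is assumed to mean one initial attempt plus up to `max_retries` retries.

```python
import json
from typing import Callable, Optional, Sequence


class RetryBudgetExhausted(Exception):
    """Hypothetical error type; the real exception class is not named here."""


def parse_json(text: str) -> Optional[dict]:
    """Step 1 stand-in: return a dict, or None when the text cannot be parsed."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return None
    return data if isinstance(data, dict) else None


def run_pipeline(
    request: Callable[[int], str],
    checks: Sequence[Callable[[dict], bool]],
    max_retries: int = 3,
    raise_ensure_failure: bool = True,
) -> Optional[dict]:
    latest_parsed = None
    # Assumption: one initial attempt plus up to max_retries retries.
    for attempt in range(max_retries + 1):
        parsed = parse_json(request(attempt))
        if parsed is None:
            continue  # a parse failure consumes budget like any other step
        latest_parsed = parsed
        # all() short-circuits, so later checks only run after earlier ones pass.
        if all(check(parsed) for check in checks):
            return parsed
    if raise_ensure_failure:
        raise RetryBudgetExhausted("validation failed after all retries")
    return latest_parsed


replies = ["not json", '{"answer": ""}', '{"answer": "ok"}']
result = run_pipeline(
    request=lambda attempt: replies[attempt],
    checks=[lambda d: "answer" in d, lambda d: bool(d["answer"])],
)
```

In this run the first reply fails parsing, the second fails the second check, and the third passes, so two retries are spent from the same budget even though they failed at different steps.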
A failure at any step retries the request. Retries share one budget controlled by max_retries (default 3). When the budget is exhausted:

- raise_ensure_failure=True (the default): raises.
- raise_ensure_failure=False: returns the latest parsed result anyway.

.validate(handler) registers a custom check. It runs after strict output and ensure_keys have already passed, on a canonical dict snapshot of the result.
```python
def must_be_short(result, ctx):
    if len(result.get("answer", "")) > 280:
        return {"ok": False, "reason": "answer too long", "validator_name": "length"}
    return True

agent.input("Summarize.").output({
    "answer": (str, "answer", True),
}).validate(must_be_short).start()
```
The handler runs only on structured-result getters: start(), async_start(), get_data(), async_get_data(), get_data_object(), async_get_data_object(). It does not run on get_text() / get_meta() (those don’t carry the parsed structure that validate would inspect).
Agently output schemas are ordered. When later fields depend on earlier judgment, put support fields first: evidence, assumptions, clarifications, source notes, calculation plans, concise rationale, rule checks, and intermediate facts. Put final booleans, verdicts, replies, summaries, and action decisions last. User-facing renderers can reorder sections for natural reading, but the model generation contract should keep support-before-conclusion order.
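As an illustration of support-before-conclusion ordering, the sketch below defines a schema dict whose final field is the verdict. The field names (`evidence`, `rule_checks`, `rationale`, `verdict`) are hypothetical; only the leaf tuple form `(type, "description", ensure)` comes from this page.

```python
# Python dicts preserve insertion order, so the key order here is the
# order the model is asked to generate in.
review_schema = {
    # Support fields first: what the judgment is based on.
    "evidence": [(str, "quoted passage that supports or contradicts the claim")],
    "rule_checks": [{
        "rule": (str, "name of the rule being checked", True),
        "passed": (bool, "whether the rule holds", True),
    }],
    "rationale": (str, "concise reasoning grounded in the evidence", True),
    # Conclusion last: generated after its supporting fields.
    "verdict": (str, "final decision: approve or reject", True),
}

field_order = list(review_schema)
```

A renderer that wants to show the verdict first can reorder `field_order` for display without changing the schema passed to `.output(...)`.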
For model-owned grading, confidence, trust, relevance, usability, or quality
judgments, prefer conceptual levels with explicit definitions over precise
numeric scores. For example, ask for high_trust, moderate_trust, or
low_trust, and define each level in the prompt. If downstream code needs a
score for thresholds, weighting, statistics, or index calculations, map levels
to deterministic numbers in code after generation.
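A minimal sketch of mapping conceptual levels to numbers after generation. The level names come from the paragraph above; the numeric values and the helper names `trust_score` and `passes_threshold` are illustrative assumptions.

```python
# Illustrative values only; choose numbers that fit your own thresholds.
TRUST_SCORES = {
    "high_trust": 1.0,
    "moderate_trust": 0.5,
    "low_trust": 0.0,
}


def trust_score(level: str) -> float:
    """Map a model-produced level to a deterministic number."""
    key = level.strip().lower()
    if key not in TRUST_SCORES:
        # Fail loudly instead of guessing a score for an unknown label.
        raise ValueError(f"unknown trust level: {level!r}")
    return TRUST_SCORES[key]


def passes_threshold(level: str, threshold: float = 0.5) -> bool:
    return trust_score(level) >= threshold
```

Because the mapping lives in code, changing a threshold or weight does not require re-prompting the model.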
For complex arithmetic, long-number calculation, weighting, aggregation, or statistical transformations, ask the model for an executable calculation plan or code, run it with tools, and pass the original question, code, and observed result into the next model step. Do not make text generation be the calculator.
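One way to keep arithmetic out of text generation is to have the model return an expression and evaluate it with a small tool. The sketch below is an assumption-laden example, not an Agently feature: `evaluate` is a hypothetical helper that accepts only numeric literals and the operators `+`, `-`, `*`, `/`.

```python
import ast
import operator

_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def evaluate(expression: str):
    """Evaluate a restricted arithmetic expression without using eval()."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant):
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
                return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression")

    return walk(ast.parse(expression, mode="eval"))


# Hypothetical hand-off: the question, the plan, and the observed result
# travel together into the next model step.
question = "What is the equally weighted average of 80 and 90?"
plan = "(0.5 * 80) + (0.5 * 90)"
next_step_input = {"question": question, "code": plan, "observed_result": evaluate(plan)}
```

The model then explains or uses `observed_result` rather than computing it, and anything outside the allowed operators is rejected instead of executed.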
You can also pass handlers per-call:
```python
agent.input("...").output({...}).start(validate_handler=must_be_short)
agent.input("...").output({...}).start(validate_handler=[check_a, check_b])
```
.validate(...) handlers run before validate_handler= handlers. Multiple .validate(...) calls preserve order.
| Return | Meaning |
|---|---|
| True | pass |
| False | fail; retry if budget remains |
| dict | structured result; see keys below |
Supported dict keys:
| Key | Effect |
|---|---|
| ok | True = pass, False = fail |
| reason | text shown in retry events / error messages |
| payload | structured details for downstream consumers |
| validator_name | tag the validator in events |
| no_retry / stop | fail but don’t retry |
| error / exception / raise | fail with the given exception |
Anything not in this list becomes a model.validation_error and consumes retry budget.
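The return-value contract can be sketched as a small normalizer. This is an interpretation of the tables above, not Agently's internal code: `Outcome` and `normalize` are hypothetical names, a dict without `ok` is assumed to be a failure, and whether an attached exception also suppresses retries is left open because the page does not say.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Outcome:
    ok: bool
    retry: bool = False            # whether a retry would be attempted if budget remains
    reason: str = ""
    validator_name: str = ""
    payload: Any = None
    exception: Optional[BaseException] = None
    is_error: bool = False         # True maps to model.validation_error


def normalize(returned: Any) -> Outcome:
    if returned is True:
        return Outcome(ok=True)
    if returned is False:
        return Outcome(ok=False, retry=True)
    if isinstance(returned, dict):
        # The first of error / exception / raise that holds an exception wins.
        exc = next(
            (returned[k] for k in ("error", "exception", "raise")
             if isinstance(returned.get(k), BaseException)),
            None,
        )
        ok = bool(returned.get("ok", False)) and exc is None  # assumption: missing ok = fail
        no_retry = bool(returned.get("no_retry") or returned.get("stop"))
        return Outcome(
            ok=ok,
            retry=not ok and not no_retry,
            reason=str(returned.get("reason", "")),
            validator_name=str(returned.get("validator_name", "")),
            payload=returned.get("payload"),
            exception=exc,
        )
    # Unsupported return values are errors and still consume retry budget.
    return Outcome(ok=False, retry=True, is_error=True,
                   reason=f"unsupported return type: {type(returned).__name__}")
```

For example, `normalize({"ok": False, "reason": "policy violation", "no_retry": True})` produces a failed outcome with `retry` set to `False`, while `normalize("yes")` is flagged as an error.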
Both sync and async handlers are supported. An async handler signature:
```python
async def check_remote(result, ctx):
    ok = await some_external_check(result["answer"])
    return ok
```
The second argument is an OutputValidateContext with at least:

- value, input, agent_name, response_id
- attempt_index, retry_count, max_retries
- prompt, settings, request_run_context, model_run_context
- response_text, raw_text, parsed_result, result_object, typed, meta

Use ctx.attempt_index if you want different behavior on later attempts (e.g., loosen the rule on retry).
Treat these fields as observational by default, but ctx.prompt and ctx.settings are live state objects for the current response-attempt chain. In advanced handlers, if you need to adjust the prompt / options / settings for a later retry, you can write back through them inside the validator.
For example, lower sampling parameters on the next retry:
```python
def check(result, ctx):
    if result.get("score", 0) < 0.8 and ctx.retry_count < ctx.max_retries:
        ctx.prompt.set("options", {"temperature": 0.2, "top_p": 0.7})
        return {"ok": False, "reason": "score too low"}
    return True
```
Or change settings:
```python
def check(result, ctx):
    if should_switch_mode(result):
        ctx.settings.set("my_plugin.some_flag", True)
        return False
    return True
```
Two caveats:

- Each new response is created from a new prompt/settings snapshot at the request/agent layer, so validator write-backs stay inside the current response’s retry chain.
- Do not mutate opts = ctx.prompt.get("options", {}) in place. get() returns a view/copy; use write APIs such as ctx.prompt.set(...), ctx.prompt.update(...), or ctx.settings.set(...) if you need the change to persist.

Validation runs once per ModelResponseResult and the outcome is cached. Repeated calls (get_data() then get_data() again, or get_data() then get_data_object()) do not rerun validators. If you try to inject a different handler on the same response after validation has already finalized, the new handler is ignored with a warning.
This means: don’t expect to swap validators per consumer. If you need different validation for different consumers, run the request twice.
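The run-once behavior can be pictured with a small cache. This sketch is only an analogy for the documented behavior: `CachedValidation` is a hypothetical class, handlers are simplified to take the data alone, and any handler passed after finalization is ignored with a warning.

```python
import warnings
from typing import Any, Callable, Optional, Sequence


class CachedValidation:
    """Hypothetical stand-in for a response result that validates once."""

    def __init__(self, data: Any) -> None:
        self._data = data
        self._outcome: Optional[bool] = None

    def validate(self, handlers: Sequence[Callable[[Any], bool]] = ()) -> bool:
        if self._outcome is not None:
            if handlers:
                # Late handlers never run; the cached outcome is returned.
                warnings.warn("validation already finalized; handlers ignored")
            return self._outcome
        self._outcome = all(handler(self._data) for handler in handlers)
        return self._outcome


calls = []


def lenient(data):
    calls.append("lenient")
    return True


def strict(data):
    calls.append("strict")
    return False


response = CachedValidation({"answer": "ok"})
first = response.validate([lenient])
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    second = response.validate([strict])
```

Here `second` equals `first` and `strict` never runs, which is why a second consumer with different rules needs its own request.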
Validation contributes two new observation event types:

- model.validation_failed: the handler returned a fail.
- model.validation_error: the handler raised, returned an unsupported value, etc.

There is intentionally no model.validation_passed event in phase 1; passing is the silent default.
The standard model.retrying event picks up validation-specific fields when the retry came from validate: retry_reason, validator_name, validation_reason, and validation_payload.

Agently-DevTools consumes these defensively. New event keys are additive and should not break existing dashboards.
ensure_keys and .validate(...) are layered:

- ensure_keys handles path presence (compiled from the ensure flag in .output(...)).
- .validate(...) handles value rules that depend on the actual content.

For fixed required leaves, prefer (TypeExpr, "description", True) in .output(...) rather than manually repeating the same paths in ensure_keys=. Use manual ensure_keys for conditional or runtime-only paths. Use .validate(...) for “this field must satisfy this business rule”.
Loosen on the last retry:
```python
def check(result, ctx):
    if ctx.attempt_index == ctx.max_retries:
        return True  # accept whatever came back
    return strict_check(result)
```
Fail without retrying (e.g., validation reveals a permanent business issue):
```python
def policy_check(result, ctx):
    return {"ok": False, "reason": "policy violation", "no_retry": True}
```
Raise a custom exception:
```python
def policy_check(result, ctx):
    return {"ok": False, "raise": MyDomainError("rejected by policy")}
```