After the control loop remediates a failing LLM call, you can retrieve a structured diff that shows exactly what changed between the original span and the remediated one. This guide explains how to fetch that diff, how to handle the lazy-computation pattern, and what each field means.
What is a remediation diff?
When a control action runs and produces a remediated span, TruLayer computes a structured before/after diff. The diff endpoint (GET /v1/control/actions/{id}/diff) compares the two span outputs and returns:
- Token length delta — how many tokens the remediated output added or removed
- Latency delta — how many milliseconds faster or slower the remediated call was
- Embedding similarity — cosine similarity between the original and remediated outputs, computed by Claude Haiku 4.5
- Score deltas — for each eval rule that scored both spans, the before and after scores and the delta
Which action types produce a diff?
All three action types that produce a remediated span support the diff endpoint:

| Action type | Produces a diff? |
|---|---|
| retry | Yes |
| fallback_model | Yes |
| prompt_modification | Yes |
No current action type returns 422 from the diff endpoint. HTTP 422 is reserved for future action types that do not produce a new output span.
Fetching a diff
Send a GET request to /v1/control/actions/{id}/diff, replacing {id} with the UUID of the control action. This endpoint is dashboard-accessible for all Team+ plan roles.
The 202 → poll-until-200 pattern
Diff computation is lazy: the first GET triggers the computation, which runs asynchronously. If the underlying evaluation has not yet completed, the API returns 202.
Repeat the request until the API returns 200.
Once the diff has been computed, subsequent requests return 200 immediately.
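The 202-then-200 pattern can be sketched as a small polling helper. This is a minimal sketch rather than an official client: `poll_diff`, the `fetch` callable, and the interval and attempt defaults are all assumptions. The HTTP transport is left to the caller, because the endpoint relies on dashboard authentication rather than an API key.

```python
import time
from typing import Callable, Optional, Tuple

# Hypothetical transport: takes the request path and returns
# (status_code, parsed_json_body_or_None). Supply your own implementation.
Fetch = Callable[[str], Tuple[int, Optional[dict]]]


def poll_diff(
    fetch: Fetch,
    action_id: str,
    interval_s: float = 2.0,   # assumed default, not a documented value
    max_attempts: int = 30,    # assumed default, not a documented value
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Request the diff until the API stops answering 202, then return the body."""
    path = f"/v1/control/actions/{action_id}/diff"
    for attempt in range(max_attempts):
        status, body = fetch(path)
        if status == 200:
            return body or {}
        if status != 202:
            # 402, 404, 422 and anything else are not retried.
            raise RuntimeError(f"diff request failed with HTTP {status}")
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"diff still not ready after {max_attempts} attempts")
```

Bounding the number of attempts keeps a stuck evaluation from turning into an endless loop; only 202 is treated as retryable.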
Response schema
A 200 response returns a RemediationDiff object.
embedding_similarity
Embedding similarity is the cosine similarity between the original and remediated output embeddings, computed by Claude Haiku 4.5. The value ranges from 0.0 to 1.0.
A value of -1.0 is a sentinel meaning the embedding computation failed — treat it as unavailable and do not render it as a score. This can occur when the model is temporarily unavailable or when the output is too short to embed meaningfully.
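One way to keep the sentinel out of the UI is to normalize the value before rendering. The sketch below is illustrative only: the function names are hypothetical, and treating any other out-of-range value as unavailable is an assumption, not documented behavior.

```python
from typing import Optional

EMBEDDING_FAILED = -1.0  # sentinel value described above


def usable_similarity(value: float) -> Optional[float]:
    """Return the similarity if it is a real score, otherwise None."""
    if value == EMBEDDING_FAILED:
        return None
    if not 0.0 <= value <= 1.0:
        # Assumption: anything outside the documented range is also
        # treated as unavailable rather than rendered as a score.
        return None
    return value


def format_similarity(value: float) -> str:
    """Render the similarity for display, hiding the failure sentinel."""
    score = usable_similarity(value)
    return "unavailable" if score is None else f"{score:.2f}"
```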
score_deltas
Each entry in score_deltas corresponds to one eval rule that scored both the original and remediated spans. delta is remediated_score - original_score, so a positive delta means the remediation improved the score on that rule.
If an eval rule did not score one or both spans — for example, because the rule was added after the original trace was ingested — it does not appear in the array.
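To make the sign convention concrete, the sketch below groups entries by the sign of their delta. It assumes each entry is a dict with a `delta` key as described above; the `rule_name` key used to identify the rule is hypothetical, so check the API reference for the actual ScoreDelta fields.

```python
from typing import Dict, List


def summarize_score_deltas(score_deltas: List[dict]) -> Dict[str, List[str]]:
    """Group eval rules by whether the remediation improved or regressed them."""
    summary: Dict[str, List[str]] = {"improved": [], "regressed": [], "unchanged": []}
    for entry in score_deltas:
        delta = entry["delta"]  # remediated_score - original_score
        if delta > 0:
            bucket = "improved"
        elif delta < 0:
            bucket = "regressed"
        else:
            bucket = "unchanged"
        # "rule_name" is a hypothetical key used only for this illustration.
        summary[bucket].append(entry["rule_name"])
    return summary
```

Rules that did not score both spans are absent from the array, so they do not appear in any bucket.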
Access and plan requirements
The diff endpoint is gated to the Team+ plan. Starter and Pro tenants receive 402 with code: "plan_upgrade_required". The endpoint is dashboard-accessible only; it cannot be reached via API key.
All three dashboard roles (owner, member, viewer) can read diffs. Only owners can execute control actions that produce them.
Error reference
| Status | Condition |
|---|---|
| 200 | Diff is available |
| 202 | Evaluation still running; poll again |
| 402 | Tenant is on the Starter or Pro plan (code: "plan_upgrade_required") |
| 404 | Action not found or belongs to a different tenant |
| 422 | Reserved: action type produces no new output span (none currently) |
Auto-escalation and the retry depth cap
When a policy’s action type is retry, the control loop automatically stops retrying and escalates to HITL (human-in-the-loop) once a single trace has been retried max_retry_depth times. This prevents unbounded retry cascades.
max_retry_depth
Every retry-action policy has a max_retry_depth field (integer, 1–10, default 3). Set it when creating or updating a policy.
Values outside [1, 10] return 422 with a field-level validation error.
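A client-side check that mirrors the documented bounds can catch a bad value before the request is sent. This is a sketch under assumptions: `validate_max_retry_depth` is a hypothetical helper, and the server remains the authority on what is accepted.

```python
from typing import Optional

DEFAULT_MAX_RETRY_DEPTH = 3   # documented default
MIN_RETRY_DEPTH = 1           # documented lower bound
MAX_RETRY_DEPTH = 10          # documented upper bound


def validate_max_retry_depth(value: Optional[int] = None) -> int:
    """Return a max_retry_depth that falls inside the documented range."""
    if value is None:
        return DEFAULT_MAX_RETRY_DEPTH
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("max_retry_depth must be an integer")
    if not MIN_RETRY_DEPTH <= value <= MAX_RETRY_DEPTH:
        raise ValueError(
            f"max_retry_depth must be between {MIN_RETRY_DEPTH} and {MAX_RETRY_DEPTH}"
        )
    return value
```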
How escalation fires
When the control loop is about to fire a retry and the trace’s retry_count is already equal to or greater than max_retry_depth, the action is automatically converted to an escalate action:
- require_approval is set to true.
- The action’s metadata includes escalation_reason: "retry_threshold_exceeded" and retry_count: <N>.
- The trace is routed to the HITL pending-approval queue; no further automatic retries occur for that policy on that trace.
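The escalation rule above can be modeled in a few lines. This is an illustrative model of the documented behavior, not TruLayer's implementation: the `PlannedAction` class and `plan_retry` function are hypothetical names introduced for the example.

```python
from dataclasses import dataclass, field


@dataclass
class PlannedAction:
    """Hypothetical stand-in for the action the control loop is about to fire."""
    action_type: str
    require_approval: bool = False
    metadata: dict = field(default_factory=dict)


def plan_retry(retry_count: int, max_retry_depth: int = 3) -> PlannedAction:
    """Return a retry, or an escalate action once the depth cap is reached."""
    if retry_count >= max_retry_depth:
        return PlannedAction(
            action_type="escalate",
            require_approval=True,
            metadata={
                "escalation_reason": "retry_threshold_exceeded",
                "retry_count": retry_count,
            },
        )
    return PlannedAction(action_type="retry")
```

With the default cap of 3, a trace whose retry_count is 2 still gets a retry, while a retry_count of 3 produces an escalate action that requires approval.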
Retry cap hit metric
Monitor how often auto-escalation fires across your project using the retry_cap_hit project metric.
Each hit corresponds to an escalated action whose metadata includes escalation_reason: "retry_threshold_exceeded", so you can inspect each case and decide whether to raise the cap, fix the policy trigger, or take no action.
control_loop_depth on traces
GET /v1/traces/{id} includes a control_loop_depth integer field counting the number of retry actions that executed on the trace. Escalation actions are not counted. Use this field to understand how many attempts the system made before succeeding or escalating.
See Control for policy configuration in the dashboard and Traces for how control_loop_depth appears in the trace detail view.
Next steps
- Control loop dashboard guide: view and manage control actions in the UI
- API reference: full RemediationDiff and ScoreDelta schema definitions
- Metrics: retry_cap_hit project metric reference
- Changelog: recent additions and changes