Changelog - TruLayer

2026-05-11

Changed: Semantic search now available on Pro and above

GET /v1/search/spans and POST /v1/search/spans are now accessible on Pro, Team, Business, and Enterprise plans. Previously the endpoints were restricted to Team and above. Starter tenants still receive 402 with code: "plan_upgrade_required" when calling either search endpoint. See the Semantic search guide for setup and usage.

Changed: Eval rule mutations now available on all plans, including Starter (3-rule cap)

POST /v1/eval-rules, PATCH /v1/eval-rules/:id, and DELETE /v1/eval-rules/:id are now available on all plans, including Starter. Previously these endpoints required a Pro plan or above. Starter tenants are limited to 3 eval rules per tenant. Attempting to create a fourth rule returns HTTP 402 with code: "eval_rules_quota_exceeded". Upgrade to Pro to remove the cap. See the Evaluations reference and the API reference error codes for details.
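A client can branch on the code value to tell a plan gate from the Starter rule cap. The sketch below assumes the 402 body is JSON with a top-level "code" string; that body shape, the function name classify_402, and the returned labels are illustrative assumptions, not the documented error schema.

```python
import json

# Codes quoted in the entries above. The response body shape is an assumption.
PLAN_GATE = "plan_upgrade_required"
RULE_CAP = "eval_rules_quota_exceeded"


def classify_402(status: int, body: str) -> str:
    """Return a short label for a response, assuming a top-level "code" key."""
    if status != 402:
        return "not_a_plan_error"
    try:
        code = json.loads(body).get("code")
    except (ValueError, AttributeError):
        # Body was not JSON, or was JSON but not an object.
        return "unknown_402"
    if code == PLAN_GATE:
        return "upgrade_plan"
    if code == RULE_CAP:
        return "delete_a_rule_or_upgrade"
    return "unknown_402"
```

Check the API reference error codes for the actual envelope before relying on the field path.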

Changed: Starter and Pro plan limits updated

Plan quotas and feature gating have been updated:

Starter (Free) plan — seat cap raised from 1 to 3. Monthly eval quota raised from 1,000 to 2,500. LLM evals are now available on the Starter plan using TruLayer’s platform judge model, consistent with Pro-tier behaviour.
Pro plan — monthly eval quota raised from 25,000 to 50,000.
Anomaly detection — previously gated to Pro and above; now available on all plans including Starter. The GET /v1/anomalies endpoint and the Anomalies toggle on the Metrics page are accessible to all tenants.
Semantic search — now available on Pro and above (was Team and above). See entry above.

See the usage reference for the full plan limits table and the evaluations reference for eval tier details.

2026-05-07

Added: Go SDK auto-instrumentation for OpenAI and Anthropic

InstrumentOpenAI and InstrumentAnthropic are now available as optional sub-modules of the Go SDK. Each wraps a provider client so every completion call is recorded as a TruLayer span automatically — no manual NewSpan / End calls required. Install only the sub-modules you need:

go get github.com/trulayer/client-go/instruments/openai
go get github.com/trulayer/client-go/instruments/anthropic

Usage:

import (
    "os"

    "github.com/openai/openai-go"
    "github.com/trulayer/client-go/trulayer"
    instruments_openai "github.com/trulayer/client-go/instruments/openai"
)

tl := trulayer.NewClient(os.Getenv("TRULAYER_API_KEY"))
client := instruments_openai.InstrumentOpenAI(openai.NewClient(), tl)

trace, ctx := tl.NewTrace(ctx, "my-op")
defer trace.End(ctx)

resp, err := client.Chat.Completions.New(ctx, req) // span auto-recorded

Both instruments read the active trace from context.Context: pass the context returned by tl.NewTrace down to the provider call and the instrument finds it automatically. If no trace is in the context, the call passes through unchanged.

Also added to the core module: trulayer.TraceFromContext(ctx), which retrieves the active *Trace stored in a context by NewTrace. It is useful for annotating a trace from middleware or helpers that only receive a context.

See the Go SDK auto-instrumentation guide and the Go SDK reference for full documentation.

Changed: `Span.error` is now always present in API responses

The error field on span response objects is now always included in the JSON response, even when no error occurred. Previously the field was omitted when empty (omitempty); it now serializes as null when the span has no error.

Before:

{ "id": "span-1", "name": "generate", "type": "llm" }

After:

{ "id": "span-1", "name": "generate", "type": "llm", "error": null }

The OpenAPI schema has always declared error as string | null; this change aligns the wire format with the documented schema.

Action required: If your code checks for the presence of the error key (e.g. "error" in span), that check now always evaluates to true. Use a truthiness or null check instead:

# Python — correct
if span.get("error"):
    handle_error(span["error"])

// TypeScript — correct
if (span.error !== null) {
  handleError(span.error)
}

Code that already reads span.error with a null/falsy check is unaffected. Both SDKs expose span.error as string | null and are unaffected by this change.

2026-05-05

Added: full evaluator catalog — 25 built-in evaluators

The Evaluations page now lists all 25 shipped evaluators, up from 15. The 10 previously undocumented evaluators are:

Bias Detection (bias) — detects demographic, political, or cultural bias in outputs
Context Utilization (context_utilization) — output meaningfully uses the relevant parts of provided context
Cost Threshold (cost_threshold) — flags traces where token cost exceeds a configured limit
Injection Resistance (prompt_injection_resistance) — model successfully resisted embedded prompt injection attempts
Language Match (language_conformance) — output is written in the language requested by the input
Latency Threshold (latency_threshold) — flags traces where wall-clock latency exceeds a configured limit
Multi-Turn Consistency (multi_turn_consistency) — conversation stays internally consistent across turns
Format Compliance (output_format_compliance) — output conforms to the specified schema or structure
Sentiment Match (sentiment_match) — output sentiment matches the expected sentiment label
Tool Choice (tool_choice_correctness) — model selected the right tool for the task

The catalog table also corrects the format tag to tool-use, which matches the tag returned by the API.

Added: `min_score_drop` on eval rules

Eval rules now accept a min_score_drop field, a per-rule regression threshold. Set it to the minimum score drop that should count as a regression in order to suppress false-positive RemediationRegression alerts caused by LLM-eval temperature variance. Defaults to 0.0 (existing behaviour preserved). The field is accepted on POST /v1/eval-rules and PATCH /v1/eval-rules/:id, and is returned in all eval-rule responses. See the eval rules section of the Evaluations reference for the full field reference and guidance on when to use it.
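To make the threshold concrete, here is a minimal sketch of how such a check could behave. The function name and the exact comparison (whether a drop equal to the threshold counts) are assumptions; the entry only states that 0.0 preserves the previous behaviour of flagging any drop below baseline.

```python
def is_regression(baseline: float, remediated: float, min_score_drop: float = 0.0) -> bool:
    """Illustrative check: flag only drops of at least min_score_drop.

    With the default of 0.0, any score below the baseline is flagged,
    which mirrors the behaviour described before this field existed.
    """
    drop = baseline - remediated
    return drop > 0 and drop >= min_score_drop
```

Under this sketch, a rule with min_score_drop set to 0.05 ignores a move from 0.80 to 0.78 but still flags a move from 0.80 to 0.70.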

Added: cursor-based pagination on `GET /v1/anomalies`

GET /v1/anomalies now supports cursor-based pagination in addition to its existing limit parameter. The offset parameter has been replaced by cursor.

Parameter	Type	Description
`cursor`	string	Opaque cursor from a previous response’s `next_cursor`. Omit for the first page.

The response now includes two new fields:

Field	Type	Description
`next_cursor`	`string | null`	Opaque cursor to pass as `cursor` on the next request. `null` when there are no further pages.
`has_more`	boolean	`true` when additional pages exist.

How to paginate:

Fetch the first page with no cursor parameter.
If has_more is true, pass the returned next_cursor value as cursor on your next request.
Continue until has_more is false.

# First page
curl "https://api.trulayer.ai/v1/anomalies?limit=50" \
  -H "Authorization: Bearer $TRULAYER_API_KEY"

# Subsequent pages
curl "https://api.trulayer.ai/v1/anomalies?limit=50&cursor=<next_cursor>" \
  -H "Authorization: Bearer $TRULAYER_API_KEY"

Requires viewer+ role or an API key with query scope. Available on all plans.
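The pagination steps above can be sketched as a small generator. Here fetch_page is a hypothetical callable that performs GET /v1/anomalies with the given cursor (or no cursor parameter when it is None) and returns the decoded JSON body; only the next_cursor and has_more fields come from this entry, everything else is an assumption.

```python
from typing import Callable, Iterator, Optional


def iter_pages(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield each response page, following next_cursor until has_more is false."""
    cursor: Optional[str] = None  # first request carries no cursor
    while True:
        page = fetch_page(cursor)
        yield page
        if not page.get("has_more"):
            break
        cursor = page.get("next_cursor")
        if cursor is None:
            # Defensive stop: has_more was true but no cursor came back.
            break
```

Keeping the HTTP call behind a callable lets the loop be exercised against canned pages without network access.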

Added: `from` and `to` time-range params on `GET /v1/control/actions`

GET /v1/control/actions now accepts two new optional date-range parameters to filter by when actions were created.

Parameter	Type	Description
`from`	ISO-8601 date-time	Lower bound on `created_at` (inclusive). Optional.
`to`	ISO-8601 date-time	Upper bound on `created_at` (inclusive). Optional. Must be `>= from` when both are set.

These parameters follow the same pattern as other time-filtered endpoints. Both are optional and additive — existing calls without them continue to work and return actions across the full history subject to the existing cursor and limit parameters.

curl "https://api.trulayer.ai/v1/control/actions?from=2026-05-01T00:00:00Z&to=2026-05-05T00:00:00Z" \
  -H "Authorization: Bearer $TRULAYER_API_KEY"

Requires Team+ plan and viewer+ role (dashboard-only; not reachable via API key). See Control and the API reference for full details.
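A caller can validate the range before sending it. The sketch below only builds the from/to query string and rejects a to value earlier than from; the function names are hypothetical, and treating naive datetimes as UTC is a choice made for this example rather than documented behaviour.

```python
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import urlencode


def _to_utc(value: datetime) -> datetime:
    # Naive datetimes are treated as UTC; that is an assumption of this sketch.
    if value.tzinfo is None:
        return value.replace(tzinfo=timezone.utc)
    return value.astimezone(timezone.utc)


def time_range_query(start: Optional[datetime] = None, end: Optional[datetime] = None) -> str:
    """Build a from/to query string, enforcing to >= from when both are set."""
    start_utc = _to_utc(start) if start is not None else None
    end_utc = _to_utc(end) if end is not None else None
    if start_utc is not None and end_utc is not None and end_utc < start_utc:
        raise ValueError("'to' must be >= 'from'")
    params = {}
    if start_utc is not None:
        params["from"] = start_utc.strftime("%Y-%m-%dT%H:%M:%SZ")
    if end_utc is not None:
        params["to"] = end_utc.strftime("%Y-%m-%dT%H:%M:%SZ")
    # Keep ':' unescaped so the result matches the curl example's form.
    return urlencode(params, safe=":")
```

The same helper applies to the other endpoints in this changelog that accept from and to.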

2026-05-04

Added: `GET /v1/feedback` — list feedback with cursor-based pagination

GET /v1/feedback is now fully documented in the API reference. The endpoint was previously undocumented; it accepts the following optional query parameters:

Parameter	Type	Description
`project_id`	uuid	Filter by project
`trace_id`	uuid	Filter by trace
`label`	`good` | `bad` | `neutral`	Filter by label
`from`	ISO-8601 date-time	Lower bound on `created_at` (inclusive)
`to`	ISO-8601 date-time	Upper bound on `created_at` (inclusive). Must be `>= from` when both are set.
`cursor`	string	Opaque cursor for the next page

Returns a FeedbackListResponse with a feedback array of Feedback objects and an optional next_cursor for pagination. Requires viewer+ role or an API key with query scope. See the API reference for full schema details.

Added: `from` and `to` query params on `GET /v1/anomalies`

GET /v1/anomalies now accepts two new optional date-range parameters:

Parameter	Type	Description
`from`	ISO-8601 date-time	Lower bound on `created_at` (inclusive)
`to`	ISO-8601 date-time	Upper bound on `created_at` (inclusive). Must be `>= from` when both are set.

These parameters are additive and non-breaking. Existing calls without them continue to work and return the full anomaly list subject to the existing limit/offset pagination. Requires viewer+ role or an API key with query scope. Available on all plans. See the API reference for full details.

2026-05-02

Added: Control Loop v0.1 — prompt deployments, remediation context on evals, and regression detection

Control Loop v0.1 ships three related surfaces:

Prompt deployments (GET/POST /v1/prompts/deployments/*): The platform now clusters production failures, asks an LLM to synthesise a candidate prompt diff, validates it in a sandboxed A/B replay, and proposes the result for review. Five new endpoints manage the deployment lifecycle:

Method	Path	What it does
`GET`	`/v1/prompts/deployments`	List deployments, filterable by `status` and `project_id`
`GET`	`/v1/prompts/deployments/{id}`	Get a single deployment
`POST`	`/v1/prompts/deployments/{id}/approve`	Approve an `ab_passed` deployment for ship (owner-only)
`POST`	`/v1/prompts/deployments/{id}/reject`	Reject any non-terminal deployment (owner-only)
`POST`	`/v1/prompts/deployments/{id}/rollback`	Roll back a shipped or regressed deployment (owner-only)

All endpoints require Team+ plan and Clerk session authentication (dashboard-only). The PromptDeployment schema is now in the API reference. See the control loop quickstart and the prompt improvements dashboard guide.

prompt_autoship_enabled on projects (default false): Controls whether ab_passed deployments ship automatically or wait for owner approval. Configure via PATCH /v1/projects/{id} or in project settings. See the control loop quickstart.

Remediation context on evaluations: Three new optional fields on Evaluation responses:

context ("fresh" | "remediation") — distinguishes a fresh eval from a re-score triggered by a control action.
source_action_id (uuid, nullable) — the control action that drove the re-score.
original_evaluation_id (uuid, nullable) — the baseline eval the regression detector compares against.

These fields are additive. Existing evaluations have context: "fresh" and both foreign-key fields (source_action_id and original_evaluation_id) set to null. No changes to ingestion or existing integrations are required.

RemediationRegression failure detection: When a remediated trace’s eval score drops below the original baseline, TruLayer emits a RemediationRegression event into the failure pipeline. These events appear in GET /v1/failures/clusters alongside existing failure types. No configuration is required to receive them.

max_cascade_depth on policies (default 5): A new per-policy field that caps the total number of remediation actions (retry, fallback_model, and prompt_modification combined) across all policies on a single trace. When the count reaches the cap, the next remediation is auto-converted to escalate and parked in the HITL queue with escalation_reason: "cascade_depth_exhausted". This is distinct from max_retry_depth (TRU-362), which counts only retry actions for a single policy. Both gates are enforced on every control-loop execution; the cascade gate runs first. Valid range: 1–20. See the cascade depth gate section of the control loop quickstart and the API reference for the full Policy schema.

See the migration guide for defaults that apply to existing policies and projects.
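The ordering of the two gates can be illustrated with a short decision function. This is a sketch of the rules as stated in this changelog, not the platform's implementation: the function name and argument names are hypothetical, and the default caps (5 and 3) and the escalation_reason strings are taken from the max_cascade_depth and max_retry_depth entries.

```python
from typing import Optional

# Action types that count toward the cascade cap, per the entry above.
REMEDIATIONS = {"retry", "fallback_model", "prompt_modification"}


def escalation_reason(
    action: str,
    remediations_on_trace: int,
    retries_for_policy: int,
    max_cascade_depth: int = 5,
    max_retry_depth: int = 3,
) -> Optional[str]:
    """Return the escalation reason for the next action, or None to let it run."""
    # Cascade gate runs first: all remediation types, across all policies.
    if action in REMEDIATIONS and remediations_on_trace >= max_cascade_depth:
        return "cascade_depth_exhausted"
    # Retry gate runs second: only retries executed by this policy.
    if action == "retry" and retries_for_policy >= max_retry_depth:
        return "retry_threshold_exceeded"
    return None
```

When both caps are reached at once, this sketch reports the cascade reason because that gate is evaluated first.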

2026-04-30

Added: `max_retry_depth` on policies, `control_loop_depth` on trace detail, and `GET /v1/projects/{id}/metrics`

Three new API surfaces shipped in TRU-362:

max_retry_depth (integer, 1–10, default 3) on POST /v1/policies and PATCH /v1/policies/:id. When a retry-action policy has executed this many retries on a single trace, the next retry is automatically converted to an escalate action and routed to the HITL pending-approval queue. Protects against unbounded retry loops. See Control and the API reference.
control_loop_depth (integer) on GET /v1/traces/:id. Counts how many retry control actions have executed on the trace across all policy executions. Escalation actions are excluded. Only present on the detail response, not the list. See Traces.
GET /v1/projects/{id}/metrics — new endpoint to query project-scoped scalar metrics. Currently supports metric=retry_cap_hit with a required window parameter (7d or 30d). Returns { "metric": "retry_cap_hit", "value": 42, "window": "30d" }. Requires viewer+ (Clerk session) or API key with query scope. See Metrics.

Added: `GET /v1/control/actions/{id}/diff` — remediation diff for all action types

GET /v1/control/actions/{id}/diff returns a RemediationDiff comparing the original and remediated span outputs after a control-loop action. All three action types (retry, fallback_model, prompt_modification) produce a remediated span and support this endpoint. The diff includes token length delta, latency delta, embedding similarity (cosine via Claude Haiku 4.5), and per-eval-rule score deltas. Returns 202 while evaluation is still running — poll until 200. Requires Team+ plan. See the remediation diffs guide and the API reference for the full RemediationDiff schema.
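The 202-then-200 behaviour suggests a simple polling loop. In the sketch below, get_diff is a hypothetical callable that issues the GET and returns a (status, body) pair; the interval and attempt limit are arbitrary illustrative values, not documented recommendations.

```python
import time
from typing import Callable, Tuple


def wait_for_diff(
    get_diff: Callable[[], Tuple[int, dict]],
    interval_s: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the endpoint returns 200, treating 202 as 'still evaluating'."""
    for _ in range(max_attempts):
        status, body = get_diff()
        if status == 200:
            return body
        if status != 202:
            raise RuntimeError(f"unexpected status {status}")
        sleep(interval_s)
    raise TimeoutError("diff not ready after polling")
```

The sleep function is injectable so the loop can be tested without real delays.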

Added: auto-escalation when `max_retry_depth` is exceeded

When a retry-action policy has retried a single trace max_retry_depth times, the control loop automatically converts the next retry into an escalate action (require_approval: true) and routes the trace to the HITL queue. No further automatic retries occur on that trace. The action’s metadata includes escalation_reason: "retry_threshold_exceeded" and retry_count: <N>. The Retry cap hit metric on the project overview counts distinct traces that triggered auto-escalation over the last 7 or 30 days. Clicking the metric opens a filtered trace list. See the remediation diffs guide — auto-escalation section and the Metrics reference.

Updated: all three action types now publish remediated spans and produce diffs

fallback_model and prompt_modification actions now re-publish their output as a remediated span on the originating trace, and eval rules re-fire on those spans. This means GET /v1/control/actions/{id}/diff now returns diffs for all three action types. Earlier documentation that said only retry produces a diff, or that fallback_model/prompt_modification return 422, is superseded by this change. See the updated remediation diffs guide.

Fixed: TypeScript tutorial — `scrubFn` renamed to `redact` in PII section

Section 12 of the TypeScript SDK tutorial used scrubFn as the config key, which does not exist on TruLayerConfig. The correct key is redact. Updated the code example and section heading, and corrected the same incorrect name in the best-practices guide prose. Python uses scrub_fn; TypeScript uses redact. No SDK behaviour changed.

Fixed: Python SDK reference — `TraceData` missing fields

The TraceData model listing in the Python SDK reference omitted five fields that exist in the actual Pydantic model: external_id, tag_map, model, latency_ms, and cost. Added all five with correct types. No wire format change.

2026-04-29 (upcoming)

Added: `set_cost` / `setCost` on `SpanContext` in both SDKs

Each span can now record its own cost in USD directly on the SpanContext, rather than relying on trace-level cost attribution. This is useful for multi-span traces where each LLM call has a distinct cost.

Python:

with trace.span("generate", span_type="llm") as span:
    span.set_tokens(prompt_tokens, completion_tokens)
    span.set_cost(0.0024)  # cost in USD

TypeScript:

await trace.span("generate", "llm", async (span) => {
  span.setTokens({ promptTokens, completionTokens });
  span.setCost(0.0024);  // cost in USD, chainable
});

See Python SDK reference — SpanContext and TypeScript SDK reference — SpanContext.

Added: Archiving the last active project is now blocked (HTTP 409)

POST /v1/projects/{id}/archive returns 409 with code: "error.project.cannot_archive_last" if the project is the tenant’s only active project. Every organization must have at least one active project at all times. Create or unarchive another project before archiving the one you want to retire. See Project lifecycle and the API reference error codes.

Added: Ingest returns HTTP 403 when the API key’s project is archived

POST /v1/ingest, POST /v1/ingest/batch, and POST /v1/otlp/traces now return 403 with code: "error.project.archived" if the API key’s associated project has been archived. The Python and TypeScript SDKs both handle this response by permanently disabling the exporter and logging an ERROR-level message — your application continues running normally. To resume: unarchive the project from Projects settings, then restart the process or create a new client instance (no key rotation needed). See Project lifecycle, TypeScript SDK — archived project, Python SDK — archived project.

2026-04-29

Fixed: API reference synced — `Span` response schema updated to match backend wire format

The Span response schema in the API reference has been updated to match the backend’s actual wire format. Changes:

required array corrected to [id, created_at, tenant_id, trace_id, name, type] (removed phantom span_type, error, started_at)
Deprecated shim fields removed: span_type, error_message, started_at, ended_at
error field type corrected from boolean to string | null
New fields added: created_at, tenant_id, parent_span_id, model, latency_ms, cost, otel_trace_id, otel_span_id
Timestamp fields are now start_time / end_time (matching wire keys)

These are documentation corrections — no wire format change. Existing integrations are unaffected.

Fixed: `SpanType` docs corrected — `"chain"` and `"default"` removed, `"other"` is the correct fallback

Several code examples and the SpanType reference table in the TypeScript SDK docs incorrectly listed "chain" and "default" as valid span type values. The valid enum is "llm" | "tool" | "retrieval" | "other" — this has always been the case on the wire and in both SDK implementations. The following files have been corrected:

/sdks/typescript/reference — example code and the TruLayerCallbackHandler span-type table
/sdks/typescript/tutorial — the span-type reference table and all example code blocks

No action required for existing integrations. If your code passed "chain" or "default" as a span type, the server accepts unknown string values and stores them as-is; switch to "other" for correct dashboard grouping.
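If span types reach your code from elsewhere (for example a framework callback), a small normalizer keeps them inside the valid enum. The helper name is hypothetical; the four valid values are the ones listed above.

```python
# The four span types listed as valid in this entry.
VALID_SPAN_TYPES = {"llm", "tool", "retrieval", "other"}


def normalize_span_type(value: str) -> str:
    """Map any value outside the enum (e.g. "chain", "default") to "other"."""
    return value if value in VALID_SPAN_TYPES else "other"
```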

Fixed: `error` field on `Trace` and `Span` clarified as `string | null`

Prose in the TypeScript SDK reference and tutorial incorrectly described the error field as a boolean (error: true). The field has always been string | null — it carries the error message string when an error occurred, or null on success. This matches the Python SDK documentation and both SDK implementations. No wire format change.

2026-04-27

Added: `POST /v1/webhooks/:id/test` — send a test ping to your webhook endpoint

A new endpoint lets you fire a synthetic ping event to a registered webhook and inspect the destination’s HTTP response, without waiting for a real event to fire. The delivery is signed with HMAC-SHA256 (same as live deliveries) and is ephemeral — it does not appear in the delivery log. Requires Pro plan or above and Member or Owner role. See Webhooks guide and the API reference for full details.
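On the receiving side, an HMAC-SHA256 signature is typically checked by recomputing it over the raw request body. The sketch below assumes the signature is hex-encoded and computed over the exact body bytes; how the signature is delivered (header name, any prefix) is not stated in this entry, so consult the Webhooks guide for those details.

```python
import hashlib
import hmac


def verify_signature(secret: bytes, payload: bytes, signature_hex: str) -> bool:
    """Recompute HMAC-SHA256 over the raw body and compare in constant time.

    Hex encoding of the signature is an assumption of this sketch.
    """
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex.lower())
```

Using hmac.compare_digest rather than == avoids leaking how many leading characters matched.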

Added: URL validation on `POST /v1/webhooks` (Create)

Webhook URLs are now validated at creation time. The endpoint returns 422 with a machine-readable error key in the error field if any check fails:

Error key	Meaning
`webhook.url.not_https`	URL scheme must be `https`
`webhook.url.private_ip`	URL resolves to a private, loopback, or link-local IP (SSRF protection)
`webhook.url.unresolvable`	URL hostname could not be resolved via DNS

Existing webhooks are unaffected. See the Webhooks guide for details.
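A client can run similar checks locally to catch obvious problems before calling the API. This is a rough approximation, not the server's logic: the function names are hypothetical, the order of checks is a guess, and the server's definition of a private address may differ from Python's ipaddress flags. Only the three error keys come from the table above.

```python
import ipaddress
import socket
from typing import Callable, List, Optional
from urllib.parse import urlsplit


def _resolve(host: str) -> List[str]:
    """Resolve a hostname to IP address strings via the system resolver."""
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def precheck_webhook_url(
    url: str, resolve: Callable[[str], List[str]] = _resolve
) -> Optional[str]:
    """Return the error key a URL would likely trigger, or None if it looks fine."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return "webhook.url.not_https"
    if not parts.hostname:
        return "webhook.url.unresolvable"
    try:
        addresses = resolve(parts.hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return "webhook.url.unresolvable"
    if not addresses:
        return "webhook.url.unresolvable"
    for addr in addresses:
        # Drop any IPv6 zone suffix such as "%eth0" before parsing.
        ip = ipaddress.ip_address(addr.split("%")[0])
        if ip.is_private or ip.is_loopback or ip.is_link_local:
            return "webhook.url.private_ip"
    return None
```

A local pass does not guarantee the server accepts the URL, since DNS answers can differ between your machine and the platform.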

SDK constructor parameter renamed: `project` → `projectName` (TypeScript) / `project_name` (Python)

The project constructor parameter was never a valid field on TruLayerConfig (TypeScript) or TruLayerClient (Python). The correct parameter names are:

TypeScript: projectName: string — pass to new TruLayer({ projectName: "..." }) or init({ projectName: "..." })
Python: project_name: str — pass to trulayer.init(project_name="...") or TruLayerClient(project_name="...")

The old projectId (TypeScript) / project_id (Python) aliases remain accepted but are deprecated and will be removed in SDK 0.3.x. Documentation examples in the quickstart and SDK references have been corrected.

2026-04-26

@trulayer/mcp — MCP server for AI-native tooling

The TruLayer MCP (Model Context Protocol) server is now available as @trulayer/mcp. Connect Claude Desktop, Cursor, VS Code Copilot, or any MCP-compatible host directly to your TruLayer workspace to query traces, evals, metrics, and anomalies using natural language. All 8 tools are available: list_traces, get_trace, list_evals, get_eval, get_eval_trends, list_eval_rules, get_metrics, list_anomalies.

Install guide: MCP & Skills

2026-04-25

SpanRequest: new optional fields `prompt_tokens` and `completion_tokens`

The POST /v1/ingest and POST /v1/ingest/batch span objects now accept two new optional integer fields:

prompt_tokens — number of tokens in the LLM prompt
completion_tokens — number of tokens in the LLM completion

These are additive (non-breaking). Existing ingestion calls are unaffected.

`Span.error` field: clarified as nullable string

The error field on span response objects has always carried a nullable string error message on the wire, despite earlier SDK documentation describing it as a boolean. The field declaration is now correctly documented as string | null. No wire format change — this is a documentation correction only.

`SpanType` default corrected to `"other"`

The default span type in SDK examples was incorrectly shown as "custom". The correct default is "other". No API behaviour change.
