
5th Grade Summary

Kam needs one map that shows how everything connects.

The map starts with a workload ID.

It connects the answer trace, the label, the judge result, the fixture, the release gate, and the work item.

That map is the Kam Intelligence Registry.

The most important product object in Kam AI is not the model response.

It is the relationship between evidence objects.

A failed answer is only useful if Kam can connect it to the workload that produced it, the label that explains it, the fixture that protects against it, and the release gate that proves the fix stayed fixed.

That is why Kam needs an Intelligence Registry.

Why a registry exists

Without a registry, AI quality work becomes scattered.

One trace is in logs. One label is in a review queue. One fixture is in a test directory. One judge result is in an eval report. One release gate is in CI. One work item is in an ops board.

Each object may be valid, but the system cannot answer basic questions:

  • Which workload is failing most often?
  • Which labels are approved but not protected by fixtures?
  • Which fixtures are failing after the latest release?
  • Which judge failures need human review?
  • Which agentic runs created work that is still pending?
  • Which release gate blocked a bad change?

The registry turns those questions into normal product queries.
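As a minimal sketch of what "normal product queries" could look like, the Python below answers two of the questions above over in-memory records. The record shapes, the state values ("approved", "failed"), and the `label_id` link on fixtures are assumptions made for illustration, not Kam's actual schema.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Trace:
    id: str
    workload_id: str
    status: str  # assumed values: "ok", "failed"


@dataclass(frozen=True)
class Label:
    id: str
    workload_id: str
    label_state: str  # assumed values: "draft", "approved"


@dataclass(frozen=True)
class Fixture:
    id: str
    workload_id: str
    label_id: Optional[str]  # hypothetical edge back to the label it protects


def unprotected_labels(labels, fixtures):
    """Approved labels that no fixture points back to."""
    protected = {f.label_id for f in fixtures if f.label_id is not None}
    return [
        label
        for label in labels
        if label.label_state == "approved" and label.id not in protected
    ]


def failures_by_workload(traces):
    """Failed-trace counts per workload, most frequent first."""
    counts = Counter(t.workload_id for t in traces if t.status == "failed")
    return counts.most_common()


# Illustrative sample data only.
labels = [
    Label("label-1", "chat.team_trends.v1", "approved"),
    Label("label-2", "chat.team_trends.v1", "approved"),
    Label("label-3", "agentic.trace_label_prep.v1", "draft"),
]
fixtures = [Fixture("fixture-1", "chat.team_trends.v1", "label-1")]
traces = [
    Trace("trace-1", "chat.team_trends.v1", "failed"),
    Trace("trace-2", "chat.team_trends.v1", "failed"),
    Trace("trace-3", "agentic.trace_label_prep.v1", "failed"),
    Trace("trace-4", "chat.team_trends.v1", "ok"),
]

print([label.id for label in unprotected_labels(labels, fixtures)])
print(failures_by_workload(traces))
```

Both questions reduce to a filter or a count once every record carries a workload ID and an explicit link to its neighbor.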

The object graph


Registry evidence flow

The registry connects evidence by workload, then lets KamOps and KamSRE render the right view for each human workflow.

  1. Trace arrives (evidence)

    A production answer records route, hot reads, source context, tool plan, and answer metadata.

  2. Label attaches (scope)

    A reviewer states what should have happened: intent, entities, route, reads, source rules, and expected answer shape.

  3. Fixture promotes (answer)

    Approved evidence becomes regression coverage with deterministic graders and expected contracts.

  4. Gate decides (answer)

    Release gates evaluate workload health before risky changes move forward.

The registry is not a dashboard first. It is a relationship model first.
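One way to picture "relationship model first" is an edge list that can be walked from any evidence object. The sketch below is an assumption-heavy illustration: the relation names (`labeled_by`, `promoted_to`, `checked_by`) and the object IDs are invented here, and it follows only the first outgoing edge of each object.

```python
from collections import defaultdict

# Hypothetical edge list: (source object id, relation, target object id).
EDGES = [
    ("trace-1", "labeled_by", "label-1"),
    ("label-1", "promoted_to", "fixture-1"),
    ("fixture-1", "checked_by", "gate-1"),
    ("trace-2", "labeled_by", "label-2"),
]


def build_index(edges):
    """Map each source id to its outgoing (relation, target) pairs."""
    index = defaultdict(list)
    for source, relation, target in edges:
        index[source].append((relation, target))
    return index


def evidence_chain(start_id, index):
    """Follow the first outgoing edge from each object until the chain ends."""
    chain = [start_id]
    seen = {start_id}
    current = start_id
    while index.get(current):
        _, target = index[current][0]
        if target in seen:  # guard against accidental cycles
            break
        chain.append(target)
        seen.add(target)
        current = target
    return chain


index = build_index(EDGES)
print(evidence_chain("trace-1", index))  # reaches the gate
print(evidence_chain("trace-2", index))  # stops at the label: no fixture yet
```

A chain that stops short of a gate is itself a finding: it shows where the evidence flow is incomplete for that trace.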

What goes in the registry

The registry should be compact. It should point to full artifacts instead of copying everything.

Registry object types

| Object | Compact fields | Full artifact |
|---|---|---|
| Trace | id, workloadId, route, status, timestamp, artifactUri | Full prompt, response, spans, tool events |
| Label | id, workloadId, labelState, failureType, reviewer | Expected route, entities, reads, rubric |
| Finding | id, severity, failureCode, owner, status | Diagnosis, examples, screenshots |
| Fixture | id, workloadId, graderSet, lastResult | Input, expected output, replay data |
| Judge evidence | id, judgeType, score, decision | Full rationale and rubric result |
| Agentic run | id, workloadId, state, approvalRequired | Run transcript and generated artifacts |
| Work item | id, owner, priority, state | Review packet or engineering task |
| Release gate | id, version, passRate, blockedReason | Full CI or eval report |

Takeaway: Keep indexes small and artifacts durable. The registry should help humans navigate evidence, not become a blob store.
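A minimal sketch of the "point, don't copy" idea for the Trace row: the index entry keeps only the compact fields and a pointer to the stored artifact. The field names follow the table above. The `to_index_entry` helper, the sample values, and the `s3://example-bucket/...` URI are hypothetical.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TraceIndexEntry:
    # Field names mirror the compact fields listed for Trace.
    id: str
    workloadId: str
    route: str
    status: str
    timestamp: str
    artifactUri: str


def to_index_entry(full_trace: dict, artifact_uri: str) -> TraceIndexEntry:
    """Keep only the compact fields; everything else stays in the stored artifact."""
    return TraceIndexEntry(
        id=full_trace["id"],
        workloadId=full_trace["workloadId"],
        route=full_trace["route"],
        status=full_trace["status"],
        timestamp=full_trace["timestamp"],
        artifactUri=artifact_uri,
    )


# Placeholder trace; prompt, response, spans, and tool events are stand-ins.
full_trace = {
    "id": "trace-1",
    "workloadId": "chat.team_trends.v1",
    "route": "team_trends",
    "status": "failed",
    "timestamp": "2025-01-01T00:00:00Z",
    "prompt": "(full prompt text)",
    "response": "(full response text)",
    "spans": [],
    "toolEvents": [],
}

entry = to_index_entry(full_trace, "s3://example-bucket/traces/trace-1.json")
print(asdict(entry))
```

The heavy fields never enter the index record, so the registry stays small no matter how large individual traces grow.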

Source of truth split

The registry should not become a backend data-ops control plane.

For Kam, the cleaner split is:

  • DynamoDB stores compact indexes, states, and edges.
  • S3 stores full traces, labels, reports, and packets.
  • Athena or offline query jobs analyze drift and trends.
  • KamOps renders review flows, receipts, scorecards, and work queues.
  • Backend APIs own mutation and storage contracts.

That keeps the Next.js dashboard focused on viewing and approving bounded objects through API routes. It does not run cleanup, archive, scheduler, or import orchestration from the UI.
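To make the split concrete, here is one possible key layout for the compact index, sketched in plain Python with no AWS calls. The `pk`/`sk` naming, the `WORKLOAD#` and `TYPE#` prefixes, and both helper functions are assumptions for illustration. The source only says that DynamoDB holds compact indexes, states, and edges, and that S3 holds the full artifacts.

```python
def index_item(workload_id, object_type, object_id, state, artifact_uri):
    """Build a compact index item that points at a full artifact stored elsewhere."""
    return {
        "pk": f"WORKLOAD#{workload_id}",              # hypothetical partition key
        "sk": f"{object_type.upper()}#{object_id}",   # hypothetical sort key
        "state": state,
        "artifactUri": artifact_uri,
    }


def query_prefix(items, workload_id, object_type):
    """In-memory stand-in for a key query: exact pk match, sk prefix match."""
    pk = f"WORKLOAD#{workload_id}"
    prefix = f"{object_type.upper()}#"
    return [i for i in items if i["pk"] == pk and i["sk"].startswith(prefix)]


# Illustrative items; bucket name and object IDs are made up.
items = [
    index_item("chat.team_trends.v1", "trace", "trace-1", "failed",
               "s3://example-bucket/traces/trace-1.json"),
    index_item("chat.team_trends.v1", "label", "label-1", "approved",
               "s3://example-bucket/labels/label-1.json"),
    index_item("agentic.trace_label_prep.v1", "trace", "trace-9", "ok",
               "s3://example-bucket/traces/trace-9.json"),
]

for item in query_prefix(items, "chat.team_trends.v1", "label"):
    print(item["sk"], item["state"], item["artifactUri"])
```

With a layout like this, "everything for one workload" and "all labels for one workload" are both single-partition reads, and the dashboard only ever sees small items plus a pointer.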

Why workload ID is the primary key

Workload ID is the stable unit of quality.

Routes change. Prompts change. Model providers change. UI screens change. But the product obligation remains.

chat.team_trends.v1 still means the user expects a team trend answer with the right sport, team, sample, dates, denominator, source context, and caveats.

agentic.trace_label_prep.v1 still means the internal workflow should read a failed trace, draft label expectations, run deterministic checks, and hand off a review item to a human.
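The two IDs above share a visible shape: a dotted prefix, a snake_case name, and a `vN` suffix. A minimal parser, assuming that shape holds for every workload ID, might look like the sketch below. The part names (`surface`, `name`, `version`) and the exact character rules are assumptions, not a documented Kam format.

```python
import re
from typing import NamedTuple


class WorkloadId(NamedTuple):
    surface: str   # e.g. "chat" or "agentic" (assumed meaning)
    name: str      # e.g. "team_trends"
    version: int   # the number after "v"


# Assumed grammar: <surface>.<name>.v<version>, lowercase with underscores.
_PATTERN = re.compile(r"^([a-z][a-z0-9_]*)\.([a-z][a-z0-9_]*)\.v(\d+)$")


def parse_workload_id(raw: str) -> WorkloadId:
    """Split a workload ID into its parts, or raise ValueError if malformed."""
    match = _PATTERN.match(raw)
    if match is None:
        raise ValueError(f"not a workload ID: {raw!r}")
    surface, name, version = match.groups()
    return WorkloadId(surface, name, int(version))


print(parse_workload_id("chat.team_trends.v1"))
print(parse_workload_id("agentic.trace_label_prep.v1"))
```

Validating IDs at the boundary keeps typos from silently creating a new, empty workload in every downstream slice.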

What workload IDs unlock

Slicing

Compare quality, latency, cost, and failures by product workload instead of raw endpoint.

Dedupe

Group repeated failures into one review packet instead of making humans inspect every duplicate.

Dataset creation

Build eval sets from approved labels tied to the same product obligation.

Release gates

Block changes that damage a specific workload even when global tests still pass.

Scorecards

Show whether one lane is getting healthier or drifting.

Agentic work

Create bounded internal tasks that know what workload they are improving.

Takeaway: A workload ID is the handle that turns AI behavior into product operations.
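The release-gate point can be shown with a little arithmetic. In the sketch below, the global pass rate clears a threshold while one workload is badly broken. The threshold of 0.9 and the sample counts are invented for illustration.

```python
from collections import defaultdict


def pass_rates(results):
    """results: iterable of (workload_id, passed) pairs -> {workload_id: pass rate}."""
    totals = defaultdict(lambda: [0, 0])  # workload_id -> [passed, total]
    for workload_id, passed in results:
        totals[workload_id][0] += int(passed)
        totals[workload_id][1] += 1
    return {w: p / t for w, (p, t) in totals.items()}


def blocked_workloads(results, threshold):
    """Workloads whose own pass rate falls below the threshold."""
    return sorted(w for w, rate in pass_rates(results).items() if rate < threshold)


# Illustrative run: 98 of 100 pass for one workload, 2 of 10 for the other.
results = (
    [("chat.team_trends.v1", True)] * 98
    + [("chat.team_trends.v1", False)] * 2
    + [("agentic.trace_label_prep.v1", True)] * 2
    + [("agentic.trace_label_prep.v1", False)] * 8
)

THRESHOLD = 0.9  # hypothetical gate threshold
global_rate = sum(passed for _, passed in results) / len(results)

print(f"global pass rate: {global_rate:.3f}")          # 100 of 110 overall
print("blocked:", blocked_workloads(results, THRESHOLD))
```

Here 100 of 110 checks pass overall, which clears 0.9, yet the smaller workload passes only 2 of 10. A gate keyed on workload ID catches what the global number hides.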

A useful registry receipt

Trust receipt

What Kam should prove before confidence

A useful answer should leave a small receipt: route, scope, freshness, evidence, missing data, and confidence state.

Route

chat.team_trends.v1

Scope

A team trend answer that must expose the games, dates, spreads, scores, and ATS outcomes behind the aggregate.

Freshness

Trace and label evidence are current; release-gate enforcement is not active yet.

Evidence loaded

  • failed deterministic denominator check
  • approved label: missing_historical_denominator
  • fixture candidate created

Missing or caveated

  • release gate not enforced
  • scorecard threshold not attached

Status: Candidate pending fixture promotion
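The receipt above maps naturally onto a small record. The sketch below is one possible shape, with the values copied from the example. The class name, field names, and the `has_open_caveats` helper are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Receipt:
    route: str
    scope: str
    freshness: str
    evidence_loaded: List[str] = field(default_factory=list)
    missing_or_caveated: List[str] = field(default_factory=list)
    status: str = ""

    def has_open_caveats(self) -> bool:
        """True while anything is still listed as missing or caveated."""
        return bool(self.missing_or_caveated)

    def render(self) -> str:
        """Plain-text view of the receipt, one field per line."""
        lines = [
            f"Route: {self.route}",
            f"Scope: {self.scope}",
            f"Freshness: {self.freshness}",
            "Evidence loaded:",
        ]
        lines += [f"  - {item}" for item in self.evidence_loaded]
        lines.append("Missing or caveated:")
        lines += [f"  - {item}" for item in self.missing_or_caveated]
        lines.append(f"Status: {self.status}")
        return "\n".join(lines)


receipt = Receipt(
    route="chat.team_trends.v1",
    scope="Team trend answer exposing games, dates, spreads, scores, and ATS outcomes",
    freshness="Trace and label evidence current; release-gate enforcement not active",
    evidence_loaded=[
        "failed deterministic denominator check",
        "approved label: missing_historical_denominator",
        "fixture candidate created",
    ],
    missing_or_caveated=[
        "release gate not enforced",
        "scorecard threshold not attached",
    ],
    status="Candidate pending fixture promotion",
)

print(receipt.render())
```

Keeping the missing items as data rather than prose lets a view or a gate check `has_open_caveats()` instead of parsing text.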

The lesson

The registry is the backbone of the better Kam framework.

It makes trace evidence searchable, label work repeatable, fixture promotion visible, and release quality measurable. It also keeps open-source integrations in their right place. Langfuse, Phoenix, Label Studio, Argilla, promptfoo, or Evidently can help with slices of the workflow, but the registry remains Kam-native because the product taxonomy is Kam-native.

The next action is clear:

Build the registry v1 around workload ID, keep it as a derived projection of the underlying artifacts rather than a second source of truth, and make every evidence object link back to the product obligation it supports.
