System Design

The Kam Intelligence Registry

May 24, 2026 · 8 min read

Kam AI

Product and research


5th Grade Summary

Kam needs one map that shows how everything connects.

The map starts with a workload ID.

It connects the answer trace, the label, the judge result, the fixture, the release gate, and the work item.

That map is the Kam Intelligence Registry.

The most important product object in Kam AI is not the model response.

It is the relationship between evidence objects.

A failed answer is only useful if Kam can connect it to the workload that produced it, the label that explains it, the fixture that protects against it, and the release gate that proves the fix stayed fixed.

That is why Kam needs an Intelligence Registry.

Why a registry exists

Without a registry, AI quality work becomes scattered.

One trace is in logs. One label is in a review queue. One fixture is in a test directory. One judge result is in an eval report. One release gate is in CI. One work item is in an ops board.

Each object may be valid, but the system cannot answer basic questions:

Which workload is failing most often?
Which labels are approved but not protected by fixtures?
Which fixtures are failing after the latest release?
Which judge failures need human review?
Which agentic runs created work that is still pending?
Which release gate blocked a bad change?

The registry turns those questions into normal product queries.
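As a minimal sketch of what "normal product queries" could look like, the Python below answers three of the questions above over in-memory records. The field names follow the compact fields listed later in this post. The state values ("approved", "failed", "fail"), the sample IDs, the `wrong_route` failure type, and the rule that a label counts as protected when its workload has at least one fixture are all assumptions made for illustration, not Kam's actual schema.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Trace:
    id: str
    workload_id: str
    status: str  # assumed values: "ok", "failed"


@dataclass(frozen=True)
class Label:
    id: str
    workload_id: str
    label_state: str  # assumed values: "draft", "approved"
    failure_type: str


@dataclass(frozen=True)
class Fixture:
    id: str
    workload_id: str
    last_result: str  # assumed values: "pass", "fail"


def most_failing_workload(traces):
    """Which workload is failing most often?"""
    counts = Counter(t.workload_id for t in traces if t.status == "failed")
    return counts.most_common(1)[0][0] if counts else None


def unprotected_approved_labels(labels, fixtures):
    """Approved labels whose workload has no fixture (assumed join rule)."""
    covered = {f.workload_id for f in fixtures}
    return [
        label
        for label in labels
        if label.label_state == "approved" and label.workload_id not in covered
    ]


def failing_fixtures(fixtures):
    """Which fixtures are failing right now?"""
    return [f for f in fixtures if f.last_result == "fail"]


traces = [
    Trace("tr-1", "chat.team_trends.v1", "failed"),
    Trace("tr-2", "chat.team_trends.v1", "failed"),
    Trace("tr-3", "agentic.trace_label_prep.v1", "failed"),
    Trace("tr-4", "agentic.trace_label_prep.v1", "ok"),
]
labels = [
    Label("lbl-1", "chat.team_trends.v1", "approved", "missing_historical_denominator"),
    Label("lbl-2", "agentic.trace_label_prep.v1", "approved", "wrong_route"),
    Label("lbl-3", "agentic.trace_label_prep.v1", "draft", "wrong_route"),
]
fixtures = [Fixture("fx-1", "chat.team_trends.v1", "fail")]

print(most_failing_workload(traces))
print([label.id for label in unprotected_approved_labels(labels, fixtures)])
print([f.id for f in failing_fixtures(fixtures)])
```

Every query joins on the workload ID, which is why the objects need to share that field before any dashboard is built.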

The object graph

Figure: Registry evidence flow

The registry connects evidence by workload, then lets KamOps and KamSRE render the right view for each human workflow.

01. Trace arrives (evidence)
A production answer records route, hot reads, source context, tool plan, and answer metadata.

02. Label attaches (scope)
A reviewer states what should have happened: intent, entities, route, reads, source rules, and expected answer shape.

03. Fixture promotes (answer)
Approved evidence becomes regression coverage with deterministic graders and expected contracts.

04. Gate decides (answer)
Release gates evaluate workload health before risky changes move forward.

The registry is not a dashboard first. It is a relationship model first.
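To make the fixture-promotion step concrete, here is a hedged Python sketch of one way an approved label could become a fixture with a deterministic grader. The `promote` function, the `required_fields` contract, and the sample answer payloads are hypothetical. The post only says that approved evidence becomes regression coverage, not how.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    id: str
    workload_id: str
    label_state: str  # assumed values: "draft", "approved"
    required_fields: tuple  # answer fields the reviewer expects (assumed contract)


@dataclass(frozen=True)
class Fixture:
    id: str
    workload_id: str
    source_label_id: str  # edge back to the label that justified this fixture
    required_fields: tuple

    def grade(self, answer: dict) -> bool:
        """Deterministic grader: every required field is present and non-empty."""
        return all(
            answer.get(name) not in (None, "", []) for name in self.required_fields
        )


def promote(label: Label) -> Fixture:
    """Turn an approved label into a fixture; refuse anything not approved."""
    if label.label_state != "approved":
        raise ValueError(f"label {label.id} is {label.label_state}, not approved")
    return Fixture(
        id=f"fx-{label.id}",
        workload_id=label.workload_id,
        source_label_id=label.id,
        required_fields=label.required_fields,
    )


approved = Label("lbl-1", "chat.team_trends.v1", "approved", ("games", "denominator"))
fixture = promote(approved)

# Illustrative answers only; real payloads would come from replayed traces.
good_answer = {"games": [{"date": "2026-01-04", "ats": "W"}], "denominator": 12}
bad_answer = {"games": [{"date": "2026-01-04", "ats": "W"}], "denominator": None}

print(fixture.grade(good_answer))
print(fixture.grade(bad_answer))
```

The fixture keeps both the workload ID and the source label ID, so the relationship survives even if the label is later edited or archived.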

What goes in the registry

The registry should be compact. It should point to full artifacts instead of copying everything.

Registry object types

Object	Compact fields	Full artifact
Trace	id, workloadId, route, status, timestamp, artifactUri	Full prompt, response, spans, tool events
Label	id, workloadId, labelState, failureType, reviewer	Expected route, entities, reads, rubric
Finding	id, severity, failureCode, owner, status	Diagnosis, examples, screenshots
Fixture	id, workloadId, graderSet, lastResult	Input, expected output, replay data
Judge evidence	id, judgeType, score, decision	Full rationale and rubric result
Agentic run	id, workloadId, state, approvalRequired	Run transcript and generated artifacts
Work item	id, owner, priority, state	Review packet or engineering task
Release gate	id, version, passRate, blockedReason	Full CI or eval report

Takeaway: Keep indexes small and artifacts durable. The registry should help humans navigate evidence, not become a blob store.
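The split between a compact index entry and a full artifact might look like the Python sketch below. The field names come from the Trace row in the table. The `index_trace` helper, the sample values, and the `s3://example-bucket/...` URI are placeholders invented for this example.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TraceIndexEntry:
    """Compact registry row: enough to find and filter a trace, nothing more."""

    id: str
    workloadId: str
    route: str
    status: str
    timestamp: str
    artifactUri: str  # pointer to the full artifact, not a copy of it


def index_trace(full_trace: dict, artifact_uri: str) -> TraceIndexEntry:
    """Project a full trace down to its compact fields."""
    return TraceIndexEntry(
        id=full_trace["id"],
        workloadId=full_trace["workloadId"],
        route=full_trace["route"],
        status=full_trace["status"],
        timestamp=full_trace["timestamp"],
        artifactUri=artifact_uri,
    )


# Illustrative full artifact; in practice this would live in durable storage.
full_trace = {
    "id": "tr-001",
    "workloadId": "chat.team_trends.v1",
    "route": "team_trends",
    "status": "failed",
    "timestamp": "2026-05-24T12:00:00Z",
    "prompt": "How has this team done against the spread lately?",
    "response": "(full model response)",
    "spans": [],
    "toolEvents": [],
}

entry = index_trace(full_trace, "s3://example-bucket/traces/tr-001.json")
print(json.dumps(asdict(entry), indent=2))
```

The prompt, response, spans, and tool events never enter the index entry, so the registry stays small no matter how large individual traces grow.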

Source of truth split

The registry should not become a backend data-ops control plane.

For Kam, the cleaner split is:

DynamoDB stores compact indexes, states, and edges.
S3 stores full traces, labels, reports, and packets.
Athena or offline query jobs analyze drift and trends.
KamOps renders review flows, receipts, scorecards, and work queues.
Backend APIs own mutation and storage contracts.

That keeps the Next.js dashboard focused on viewing and approving bounded objects through API routes. The UI does not run cleanup, archival, scheduling, or import orchestration.
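One possible way to lay out the compact indexes and edges is a single-table key scheme keyed by workload. The Python sketch below is an assumption, not Kam's actual table design: the `WORKLOAD#` and `TYPE#` key prefixes, the `pk`/`sk` attribute names, and the helper functions are all hypothetical. It uses an in-memory list instead of a real DynamoDB client so it runs anywhere. DynamoDB key conditions do support equality on the partition key and `begins_with` on the sort key, and that is the access pattern `query` imitates here.

```python
def registry_key(workload_id: str, object_type: str, object_id: str) -> dict:
    """Build a hypothetical partition/sort key pair for one registry row."""
    return {
        "pk": f"WORKLOAD#{workload_id}",
        "sk": f"{object_type.upper()}#{object_id}",
    }


def index_item(workload_id, object_type, object_id, state, artifact_uri):
    """Compact row: keys, a state, and a pointer to the full artifact."""
    item = registry_key(workload_id, object_type, object_id)
    item.update({"state": state, "artifactUri": artifact_uri})
    return item


def query(items, workload_id, object_type=None):
    """In-memory stand-in for a key query: exact pk, optional sk prefix."""
    pk = f"WORKLOAD#{workload_id}"
    prefix = f"{object_type.upper()}#" if object_type else ""
    return [i for i in items if i["pk"] == pk and i["sk"].startswith(prefix)]


# Placeholder bucket name and IDs, for illustration only.
table = [
    index_item("chat.team_trends.v1", "trace", "tr-001", "failed",
               "s3://example-bucket/traces/tr-001.json"),
    index_item("chat.team_trends.v1", "label", "lbl-1", "approved",
               "s3://example-bucket/labels/lbl-1.json"),
    index_item("chat.team_trends.v1", "fixture", "fx-1", "fail",
               "s3://example-bucket/fixtures/fx-1.json"),
    index_item("agentic.trace_label_prep.v1", "run", "run-9", "pending",
               "s3://example-bucket/runs/run-9.json"),
]

print([i["sk"] for i in query(table, "chat.team_trends.v1")])
print([i["sk"] for i in query(table, "chat.team_trends.v1", "label")])
```

With this layout, one partition read returns every evidence object for a workload, and the dashboard only needs read access plus a narrow approval API.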

Why workload ID is the primary key

Workload ID is the stable unit of quality.

Routes change. Prompts change. Model providers change. UI screens change. But the product obligation remains.

chat.team_trends.v1 still means the user expects a team trend answer with the right sport, team, sample, dates, denominator, source context, and caveats.

agentic.trace_label_prep.v1 still means the internal workflow should read a failed trace, draft label expectations, run deterministic checks, and hand off a review item to a human.

What workload IDs unlock

Slicing

Compare quality, latency, cost, and failures by product workload instead of raw endpoint.

Dedupe

Group repeated failures into one review packet instead of making humans inspect every duplicate.

Dataset creation

Build eval sets from approved labels tied to the same product obligation.

Release gates

Block changes that damage a specific workload even when global tests still pass.

Scorecards

Show whether one lane is getting healthier or drifting.

Agentic work

Create bounded internal tasks that know what workload they are improving.

Takeaway: A workload ID is the handle that turns AI behavior into product operations.
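The release-gate point is easy to show with numbers. In the hedged Python sketch below, the global pass rate clears a 0.9 threshold while one workload sits at 0.5, so a per-workload gate blocks the change that a global gate would let through. The threshold, the result counts, and the `evaluate_gate` function are assumptions made for illustration.

```python
from collections import defaultdict


def pass_rates(results):
    """results: iterable of (workload_id, passed) pairs from a fixture run."""
    totals = defaultdict(lambda: [0, 0])  # workload -> [passed, total]
    for workload_id, passed in results:
        totals[workload_id][0] += int(passed)
        totals[workload_id][1] += 1
    return {w: passed / total for w, (passed, total) in totals.items()}


def evaluate_gate(results, threshold=0.9):
    """Block the release if any single workload falls below the threshold."""
    rates = pass_rates(results)
    blocked = sorted(w for w, rate in rates.items() if rate < threshold)
    return {"allowed": not blocked, "blocked": blocked, "rates": rates}


# Made-up run: one small workload regresses, one large workload is healthy.
results = (
    [("chat.team_trends.v1", True)] * 2
    + [("chat.team_trends.v1", False)] * 2
    + [("agentic.trace_label_prep.v1", True)] * 36
)

global_rate = sum(passed for _, passed in results) / len(results)
decision = evaluate_gate(results)

print(f"global pass rate: {global_rate:.2f}")
print(f"allowed: {decision['allowed']}, blocked: {decision['blocked']}")
```

Here 38 of 40 fixtures pass overall, yet `chat.team_trends.v1` passes only 2 of 4, which is the kind of regression a global number hides.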

A useful registry receipt

Trust receipt

What Kam should prove before confidence

A useful answer should leave a small receipt: route, scope, freshness, evidence, missing data, and confidence state.

Route

chat.team_trends.v1

Scope

A team trend answer that must expose the games, dates, spreads, scores, and ATS outcomes behind the aggregate.

Freshness

Trace and label evidence are current; release-gate enforcement is not active yet.

Evidence loaded

failed deterministic denominator check
approved label: missing_historical_denominator
fixture candidate created

Missing or caveated

release gate not enforced
scorecard threshold not attached

Status: Candidate pending fixture promotion
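The receipt above could be carried as a small structured object rather than free text. The Python sketch below is one possible shape: the `Receipt` class and its `render` method are hypothetical, and the field values are copied from the receipt in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Receipt:
    route: str
    scope: str
    freshness: str
    evidence: tuple
    missing: tuple
    status: str

    def render(self) -> str:
        """Render the receipt as plain text, one field per line."""
        lines = [
            f"Route: {self.route}",
            f"Scope: {self.scope}",
            f"Freshness: {self.freshness}",
            "Evidence loaded:",
        ]
        lines += [f"  - {item}" for item in self.evidence]
        lines.append("Missing or caveated:")
        lines += [f"  - {item}" for item in self.missing]
        lines.append(f"Status: {self.status}")
        return "\n".join(lines)


receipt = Receipt(
    route="chat.team_trends.v1",
    scope=(
        "A team trend answer that must expose the games, dates, spreads, "
        "scores, and ATS outcomes behind the aggregate."
    ),
    freshness=(
        "Trace and label evidence are current; "
        "release-gate enforcement is not active yet."
    ),
    evidence=(
        "failed deterministic denominator check",
        "approved label: missing_historical_denominator",
        "fixture candidate created",
    ),
    missing=(
        "release gate not enforced",
        "scorecard threshold not attached",
    ),
    status="Candidate pending fixture promotion",
)

print(receipt.render())
```

Keeping missing items as an explicit field means a reviewer can filter for receipts that still carry caveats instead of reading each one.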

The lesson

The registry is the backbone of a better Kam framework.

It makes trace evidence searchable, label work repeatable, fixture promotion visible, and release quality measurable. It also keeps open-source integrations in their right place. Langfuse, Phoenix, Label Studio, Argilla, promptfoo, or Evidently can help with slices of the workflow, but the registry remains Kam-native because the product taxonomy is Kam-native.

The next action is clear:

Build the registry v1 around workload ID, keep it as a derived projection, and make every evidence object link back to the product obligation it supports.
