What is a source-type taxonomy for marketing claims?

A scheme that records how each commercial claim was obtained — human-verified, customer-provided, scraped from the public web, or inferred by an AI — and caps the confidence each origin can carry. It ranks claims by provenance rather than by how certain the last system to touch them sounded.

What are the source types and their confidence ceilings?

Assay's canonical set is six: first-party verified (ceiling 1.00), first-party unverified (0.85), second-party (0.75), third-party (0.60), AI-extracted (0.65, capped), and AI-imported (0.50). Final confidence is the minimum of the source ceiling and the computed score.

Why does an AI extraction need a confidence ceiling?

Without one, 'the model was confident' silently overrides 'a human verified it.' A ceiling caps an AI-derived claim below a human-verified one regardless of the model's certainty, so unverified inferences cannot outrank checked facts.

How is this different from an AI claim-checker?

Tools that check copy against an approved-claims document validate text after the fact. A source-type taxonomy types the claim itself as a first-class object with a provenance and a ceiling, so every downstream surface — decks, agents, proposals — inherits the same governed confidence.

A source-type taxonomy for marketing claims

Every claim your company makes about itself arrived somehow — a price approved by the deal desk, a competitor’s gap scraped off a review site, a proof point an AI agent inferred from three old decks. A source-type taxonomy records how each claim arrived and caps how much confidence it can carry based on that origin. It is the difference between “the model sounded sure” and “a human verified it.”

The Commercial Truth manifesto argues that marketing has never had infrastructure — no equivalent of the source documents that underpin an audited ledger, or the rules of evidence that govern what a court will weight. Other functions inherited those disciplines decades ago. Commercial claims never did, which is why an unverified guess and a board-approved fact sit side by side in the same deck, indistinguishable.

For the person who owns positioning, the cost is specific. An AI tool emits a confident variation of a claim nobody approved, and when Counsel asks which of these did we actually verify, the honest answer is that the document doesn’t say. Source typing is how you make the document say it.

Why does this matter now?

The question used to be academic. When humans wrote every claim by hand, provenance lived in their heads — the marketer knew which battlecard line was solid and which was a guess. Agents now generate claims at machine speed and have no such instinct unless it is encoded in the data they read.

Without an explicit record of origin, the loudest input wins. “The AI was confident” silently overrides “a human verified it” — which is exactly the failure Assay’s own audit found when an extraction overwrote a higher-authority source because the continuity rule did not fire on the AI-derived type. The fix is not a better model. It is a taxonomy.

The primitive: six source types and a ceiling

In a typed knowledge graph, every claim is a node, and every node carries a source type — a category that records how the information arrived. Each source type carries an explicit confidence ceiling: the maximum score a claim from that origin can ever reach. The canonical version Assay runs on:

Source type	What it means	Confidence ceiling
First-party verified	A human entered it; a known reviewer checked it	1.00
First-party unverified	A human entered it; not yet checked	0.85
Second-party	A customer or partner provided it	0.75
Third-party	Public web, news, external reference	0.60
AI-extracted	An AI pulled it from a known source (inherits that source’s ceiling)	0.65, capped
AI-imported	An AI inferred or reconciled it, with no direct source	0.50

Source: Assay’s canonical source-type vocabulary (d09).

The rule that makes the taxonomy load-bearing is one line: final_confidence = min(source_ceiling, computed_score). An AI extraction from a board-approved source inherits the 1.00 ceiling. The same extraction from a third-party blog caps at 0.60 — no matter how certain the model reports itself to be.

This is the rules-of-evidence move. A court does not weight hearsay the way it weights a signed contract, however sincerely the witness believes it. A source-type ceiling does the same for commercial claims: it ranks them by how they were obtained, not by how confident the last system to touch them sounded.

A worked example

Take one claim: we integrate with Slack. An AI agent notices Slack mentioned three times in the docs and writes the claim with high confidence — under a flat model, it ships. Under source typing it is tagged AI-imported (an inference with no direct source) and capped at 0.50, held below the bar for customer-facing copy until a human verifies it.

Now the deal desk approves a new enterprise price. A human enters it; a second human verifies it; it is tagged first-party verified and carries up to 1.00. When an AI SDR later quotes that price, it is quoting the highest-authority node in the graph — and the audit trail shows the price was verified, by whom, and when.

A third case shows the ceiling working in the other direction. A market-size figure pulled from an analyst report is tagged third-party and capped at 0.60 — usable in a blog, flagged as external the moment a rep tries to present it as a company fact. The taxonomy does not forbid the claim; it records what kind of claim it is, so the surface that consumes it can decide.

The taxonomy is enforced at the schema, not by convention. If any stored claim ever exceeds its source ceiling, that is a bug — Assay runs a standing check that the count of such claims must be zero.

What this is not

This is not an AI claim-checker. Tools that scan finished copy against an approved-claims document validate text after it is written, one asset at a time. Source typing works a layer earlier — it types the claim itself as a first-class object, so every surface that quotes it inherits the same provenance and the same ceiling, without re-checking the copy each time.

It is also not a confidence score bolted onto a model’s output. A model’s self-reported certainty is precisely the thing the ceiling is built to bound. The number that matters is not how sure the system sounds; it is how the claim was obtained.

Where this lives

Source typing is the provenance layer of the Grounded pillar — the part of the substrate that lets a buyer inspect where any claim came from before trusting it. It is also why a credible interval beats a point estimate: the ceiling is an honest statement that some claims are simply better-sourced than others.

The methodology Assay is developing for the Commercial Truth Index scores vendors partly on this — the source-type distribution of everything they publish, and whether their confidence scores respect source-type ceilings at all. A confidence number that ignores its own provenance is decoration. A confidence number capped by how the claim was obtained is evidence.

Closes / opens

Closes the source-type parallel cluster (LSO §F.6): the primitive that ranks claims by origin, not by tone. Opens the operator question — what share of your current claims could you assign a source type to today, and what share are confident guesses you have never checked?

This essay is grounded in Assay’s source-type vocabulary (d09) and the Truth Graph product canon, with buyer context from the brand canon. Methodology for the Commercial Truth Index is in development.