A Documentation Pattern for LLM-Integrated Systems

Prompt Decision Records

A lightweight system for documenting why our LLM interactions work — so the next developer doesn't have to rediscover what we already know.

"If the developer who built this left tomorrow, could someone else maintain it?"

Teams building with LLMs are creating a new category of institutional knowledge: prompt strategies, model configurations, and hard-won lessons about what works and what doesn't. Today, most of that knowledge lives in one person's head.

Architecture Decision Records (ADRs) solved this problem for system design more than a decade ago. Prompt Decision Records (PDRs) apply the same pattern to LLM interactions: capturing not just the prompt, but the reasoning, the failures, and the runtime configuration that makes it work.

The Gap

Prompt / Skill alone
  • What to tell the LLM
  • The executable instruction
  • Works if you wrote it
  • Silent on past failures
  • No model or config context
Prompt + PDR
  • Why it's framed that way
  • What failed before this
  • Works for the next developer
  • Known gotchas & edge cases
  • Model, temperature & config pinned

Anatomy of a PDR

PDR-001-document-comparison.md
  • Configuration: Model version, temperature, output format, API method (the runtime contract; sketched in code below)
  • Context: What problem triggered this prompt strategy
  • Prompt Strategy: How the prompt is structured and why
  • What We Tried: Failed approaches with reasons (the highest-value section)
  • Gotchas: Known failure modes and signs output is going sideways
  • Evidence: Test results per iteration, quantitative proof it works
  • Worked Example: Real input → output → commentary on why it worked
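
The Configuration section is the part most often lost when a prompt is copied between projects. A minimal sketch of how that runtime contract might be pinned in code, assuming a hypothetical PromptConfig dataclass and placeholder values; the PDR records the same facts in prose:

  from dataclasses import dataclass

  @dataclass(frozen=True)
  class PromptConfig:
      """Runtime contract from the PDR's Configuration section (hypothetical names)."""
      model: str            # exact model version the prompt was validated against
      temperature: float    # sampling temperature the Evidence section measured
      output_format: str    # e.g. "json", whatever downstream code expects to parse
      api_method: str       # e.g. "chat.completions", how the model is invoked

  # Pinned alongside PDR-001; changing any field invalidates the Evidence section.
  DOCUMENT_COMPARISON = PromptConfig(
      model="example-model-2025-01",  # placeholder model name
      temperature=0.2,
      output_format="json",
      api_method="chat.completions",
  )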

Why This Matters

These are the kinds of discoveries teams make through iteration — and lose when someone leaves. PDRs capture them.

Gotcha: Configuration Is Not Portable Across Models

Same prompt. Same test set. Only change: temperature 0.2 → 0.0.

Model A at temp=0: Accuracy regressed. Lower temperature produced more errors, not fewer. The model became less flexible in handling ambiguous inputs.
Model B at temp=0: Near-perfect accuracy on the same test set. The newer model handled deterministic output well; the older one didn't.
Without the PDR, the next developer assumes "lower temp = better" and breaks production.
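
A regression check makes this kind of discovery repeatable instead of anecdotal. A minimal sketch, assuming a hypothetical run_eval(model, temperature) helper that returns accuracy on the PDR's Evidence test set:

  # Hypothetical guard: re-run the PDR's test set before accepting a configuration change.
  def assert_no_regression(run_eval, model: str, baseline_temp: float, new_temp: float,
                           tolerance: float = 0.02) -> None:
      baseline = run_eval(model, baseline_temp)
      candidate = run_eval(model, new_temp)
      if candidate < baseline - tolerance:
          raise AssertionError(
              f"{model}: accuracy fell from {baseline:.2%} to {candidate:.2%} "
              f"when temperature changed {baseline_temp} -> {new_temp}; see the PDR's Gotchas."
          )
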
Gotcha: General Rules Don't Prevent Hallucination
General instruction: "Do NOT invent or infer values." Result: fabricated citations, invented data points, and bogus references, despite the explicit prohibition.
Enumerated rules: Specific prohibitions for each failure type: fabricated values, false citations, invented page numbers, blank field handling, ambiguous input defaults.
Explicit rules eliminated the entire category of error. General instructions did not.
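
In practice this means the prompt carries one prohibition per observed failure mode rather than a blanket instruction. A sketch of the difference; the enumerated wording here is illustrative, mapped to the failure types listed above:

  # Blanket rule that did not hold up in testing (quoted from the gotcha above):
  GENERAL_RULE = "Do NOT invent or infer values."

  # Enumerated rules, one per failure type recorded in What We Tried (illustrative wording):
  ENUMERATED_RULES = "\n".join([
      "1. Never fabricate a numeric value; if it is not in the source, output null.",
      "2. Never cite a document or reference that is not present in the input.",
      "3. Never invent page numbers; copy them verbatim or omit them.",
      "4. If a field is blank in the source, return it blank; do not guess.",
      "5. If the input is ambiguous, return the literal text and flag it for review.",
  ])
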
Gotcha: Absence ≠ Missing
Naive rule: "If a field is absent, treat as missing." The LLM flagged nearly every output, because some fields are legitimately absent in certain document types.
Context-aware rule: A field availability mapping told the LLM which fields to expect in which document types. Absence was only flagged when the field should have been present.
False positive rate dropped to near zero. The LLM needed domain context, not just instructions.
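
The field availability mapping is ordinary domain data that the prompt (or a post-processing check) can consult. A minimal sketch, assuming hypothetical document types and field names:

  # Hypothetical mapping: which fields each document type is expected to contain.
  FIELD_AVAILABILITY = {
      "invoice":        {"total", "date", "vendor", "line_items"},
      "purchase_order": {"total", "date", "vendor"},
      "delivery_note":  {"date", "vendor", "line_items"},  # legitimately has no total
  }

  def missing_fields(doc_type: str, extracted: dict) -> set[str]:
      """Flag a field only if this document type should have contained it."""
      expected = FIELD_AVAILABILITY.get(doc_type, set())
      return {field for field in expected if not extracted.get(field)}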

How It Fits

  • Source Code: Implementation
  • Prompt / Skill: What to do
  • PDR: Why it works

PDRs live alongside prompts and skills in the repo. A prompt references its PDR, a PDR references its prompt. Developers starting from either direction find the full picture — including the configuration the prompt depends on to function correctly.
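
The cross-references can be checked mechanically. A minimal sketch, assuming a hypothetical layout where prompts live under prompts/ and PDRs under docs/pdr/, and a prompt declares its record with a "PDR:" line:

  from pathlib import Path

  def unlinked_prompts(repo_root: str = ".") -> list[Path]:
      """List prompt files that never mention a PDR (hypothetical repo layout)."""
      root = Path(repo_root)
      return [p for p in sorted((root / "prompts").glob("*.md"))
              if "PDR:" not in p.read_text(encoding="utf-8")]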

Getting Started

Pick your most complex prompt. Write one PDR for the LLM interaction that would be hardest for someone else to maintain. One real example is worth more than a process document.

Automate the scaffolding. An LLM can generate 80% of a PDR draft from an existing prompt and its surrounding code. The developer fills in the tribal knowledge — what failed, what to watch for.
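
A scaffolding script can be a single completion call. A minimal sketch using the OpenAI Python SDK as one example provider; the model name and file path are placeholders, and the human-only sections are deliberately left as TODOs:

  from pathlib import Path
  from openai import OpenAI

  SECTIONS = ["Configuration", "Context", "Prompt Strategy", "What We Tried",
              "Gotchas", "Evidence", "Worked Example"]

  def draft_pdr(prompt_path: str, model: str = "gpt-4o") -> str:
      """Generate a first-pass PDR from an existing prompt file; the author fills in the rest."""
      prompt_text = Path(prompt_path).read_text(encoding="utf-8")
      client = OpenAI()
      response = client.chat.completions.create(
          model=model,
          messages=[{
              "role": "user",
              "content": (
                  "Draft a Prompt Decision Record with these sections: "
                  + ", ".join(SECTIONS)
                  + ". Leave 'What We Tried' and 'Gotchas' as TODO for the author.\n\n"
                  + prompt_text
              ),
          }],
      )
      return response.choices[0].message.content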

Make it part of code review. If a PR changes a prompt strategy, it should include a PDR update. No new tooling required — just a convention that reviewers enforce.
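
Teams that do want a safety net can automate the convention with a small CI check. A sketch, assuming the hypothetical prompts/ and docs/pdr/ layout from above:

  import subprocess
  import sys

  def main(base_ref: str = "origin/main") -> int:
      """Fail if files under prompts/ changed without a matching change under docs/pdr/."""
      changed = subprocess.run(
          ["git", "diff", "--name-only", base_ref, "HEAD"],
          capture_output=True, text=True, check=True,
      ).stdout.splitlines()
      prompt_changed = any(f.startswith("prompts/") for f in changed)
      pdr_changed = any(f.startswith("docs/pdr/") for f in changed)
      if prompt_changed and not pdr_changed:
          print("Prompt changed without a PDR update; document what changed and why.")
          return 1
      return 0

  if __name__ == "__main__":
      sys.exit(main())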

Standardize across teams. As LLM usage scales, PDRs become the shared documentation language for prompt strategies. Every team building with agents uses the same format.