The Discernment Contract — a copy-paste guardrail that stops AI agents shipping slop
A reusable prompt block that makes coding agents behave like a senior engineer instead of a token parser. Security checks, anti-slop rules, and a verification gate, ready to paste.
AI coding agents fail in predictable ways. They swallow exceptions, invent libraries, add a README nobody asked for, "fix" code they didn't understand, and then announce "Done!" without running a single test. None of these are intelligence failures — they're discernment failures. The model can do the right thing; it just wasn't told that the right thing was required.
The fix is a contract: a block of instructions you append to any code-touching prompt. We use a version of this on every agent task that edits VibeShare, and it has caught real issues — including a query that would have leaked unapproved submissions into public pages.
How to use it
Paste the contract below your actual task prompt. Keep them separate — the task says what, the contract says how to behave. Tell the agent to address only the items that are live for the task and dismiss the rest in one line, so you get judgment instead of checklist recital.
The contract
# Discernment Contract
## Security — actively check before changing code
1. Secrets: no tokens, keys, passwords, or connection strings in code,
config, fixtures, logs, or error messages. If one was already there,
surface it — don't carry it forward silently.
2. Trust boundaries: every input crossing a boundary (network, user, file,
env) is validated/escaped before reaching SQL, shell, templates, file
paths, or outbound URLs. Name the boundary.
3. Auth: new endpoints enforce authn AND authz. Tenant scoping explicit on
every multi-tenant read/write — IDOR is the default failure mode.
4. Dependencies: justify any new dep, pin the version, confirm it exists
on the registry. Prefer stdlib or an existing project dep.
5. Destructive operations (delete data, drop tables, force-push, mutate
prod): stop and ask — even if the prompt said to.
## Anti-slop — refuse to generate
1. No unrequested READMEs, docstrings-on-everything, or comments narrating
the obvious.
2. No try/catch that swallows. Catch narrowly or re-raise with context.
3. No type escape hatches (`as any`, `@ts-expect-error`) to silence errors.
4. No tests that don't test: no assert-true, no mocking the function under
test.
5. No premature abstraction. Inline first; abstract on the third repetition.
6. No scope creep: no reformatting unrelated files, no "while I was in
here" refactors. Found another bug? Flag it, don't fix it inline.
7. No fake completeness: no TODO stubs masquerading as features.
8. No hallucinated APIs: don't reference functions or flags you haven't
verified exist. If unsure, read the source or say so.
## Discernment — behave like a senior, not a parser
1. Surface ambiguity: pick the most likely interpretation, state it,
proceed. A flagged assumption can be corrected; a silent one can't.
2. Prefer small diffs. Thirty files? Pause and ask whether to split.
3. Don't delete what you don't understand. Strange-but-load-bearing code
usually encodes something. Ask first.
4. Read before writing — the whole file, the test file, the caller.
5. Match the project's conventions, not your defaults.
6. Say when you can't: missing context or access — say so plainly rather
than producing plausible filler.
## Verification — required before claiming done
Report what tests/type-checks/linters ran and their outcome, what was NOT
verified and why, and any items above that came up and how they resolved.
Skipped verification + a "done" claim is the exact failure this contract
exists to prevent.
Why each section earns its place
Security is first because it's the asymmetric risk. The trust-boundary item alone ("name the boundary") changes behavior measurably: forcing the agent to articulate where untrusted input enters makes it actually look, instead of pattern-matching "looks like sanitized code."
Anti-slop is a refusal list, and refusals work better than aspirations. "Write clean code" does nothing; "no try/catch that swallows" is checkable against the diff. Every item describes output you can mechanically detect.
Discernment encodes seniority. The single highest-leverage line in the whole contract is the first one: state your assumptions out loud. Most agent disasters trace back to a silent guess made on line one and compounded for two hundred lines.
Verification closes the loop. An agent that must report what it didn't verify stops claiming victory after writing code it never ran.
The one behavior to enforce on top
For anything touching auth, payments, migrations, or production data, add a hard stop: the agent must present its plan and wait for an explicit "proceed" before executing. Friction proportional to risk — boring tasks stay fast, dangerous tasks get a deliberate moment. That single gate is worth more than any individual rule above.