The anatomy of a production prompt

“A prompt isn't a wish you whisper to a genie. It's a spec you hand to a contractor.”

Treat prompts like code: clear instructions, examples, and a fixed output shape.

A prompt is a structured document

A production prompt is not a sentence. It's a layered document: a system message that sets the role and rules, clear instructions for the task, two or three worked examples, the user's actual input, and a specification of the output format. Each layer does a different job, and skipping any of them is where quality leaks out.

The reason structure beats cleverness: models are far better at imitation than at obeying abstract instructions. Show the model exactly what good output looks like and it will match the pattern. Describe it in adjectives and it will improvise.

The load-bearing techniques

Most prompt advice is a local optimum that breaks next model release. A few principles generalise because they're about information, not tricks:

Constraints: state the format, length, and what to exclude. A tight box gets filled well; an open prompt gets improvised.
Examples (few-shot): show input/output pairs. The single highest-leverage thing you can add to a prompt.
Decomposition: split a multi-step task into multiple prompts, or ask the model to work through steps before answering.
Role and audience: "you are a careful technical editor writing for non-experts" shapes register more reliably than any adjective in the instruction.
Output schema: tell it the exact shape (JSON keys, sections) you will parse back.

Let the model think before it answers

For anything involving reasoning, asking for the final answer directly leaves quality on the table. Letting the model produce intermediate reasoning first ("chain of thought") measurably improves results on multi-step problems, because each generated token can condition on the previous ones. You're giving it room to work.

The practical form: ask for the reasoning, then the answer, in a structured output, and parse out the answer. Or, with newer "reasoning" models, the thinking happens internally and you simply pay for it in latency and tokens. Either way, hard problems want deliberation, not a snap answer.

Version your prompts like code

A prompt is the most load-bearing string in your application, and teams routinely edit it in a text box in production with no history. That's insane the moment you say it out loud. Prompts belong in version control, with a changelog, tied to the eval run that justified the change.

The discipline: a prompt change is a code change. It gets a diff, a review, an eval run, and a rollback path. "I tweaked the prompt and it feels better" is how teams ship silent regressions they discover from angry users a week later.

In one line each

A prompt is a structured document: system, instructions, examples, input, output schema.
Examples beat instructions; constraints beat hope; let the model reason before hard answers.
Don't collect tricks. Internalise the principles, which survive model changes.
Prompts are code: version them, review them, and gate changes on an eval run.

Where to go next

Chapter 3: Retrieval done right