Data leakage & privacy

“Everything you put in the prompt, you've said out loud in a room you don't control.”

Data in prompts or logs can leak; minimise, redact, and isolate it.

Context leakage: the obvious one

The most common leak is the simplest: whatever you put in the prompt can come out of the model. System prompts get extracted. One user's data, left in a shared context, surfaces in another user's answer. Confidential documents fed in for summarisation get quoted back to someone who shouldn't see them. The model has no concept of "this part is secret"; context is context.

The defences are about hygiene, not cleverness: never put data into the context unless the requesting user is authorised to see it; isolate sessions so one user's data can't bleed into another's; and assume your system prompt is public, because a determined user will extract it. Design as if the prompt is readable by whoever you're serving.
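The first two defences can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: Document, Session, build_context and the allowed_users field are hypothetical names, and a real system would ask its own access-control layer rather than a set stored on the document.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    allowed_users: frozenset  # hypothetical ACL: user ids permitted to read this


@dataclass
class Session:
    user_id: str
    history: list = field(default_factory=list)  # per-session, never shared


def build_context(session: Session, candidates: list) -> str:
    """Return prompt context containing only documents this user may read."""
    visible = [d for d in candidates if session.user_id in d.allowed_users]
    return "\n\n".join(d.text for d in visible)


docs = [
    Document("hr-1", "Salary bands for 2024", frozenset({"alice"})),
    Document("faq-1", "Office opening hours", frozenset({"alice", "bob"})),
]

# One Session object per user; nothing lives in module-level shared state.
alice, bob = Session("alice"), Session("bob")
alice_context = build_context(alice, docs)
bob_context = build_context(bob, docs)  # "Office opening hours" only
```

The check runs before the prompt is assembled, so the model never sees what the user may not see. Asking the model to keep something secret after the fact is not a substitute.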

Training-data extraction

Models can memorise fragments of their training data and, under the right prompt, reproduce them: verbatim secrets, personal data, copyrighted text. This matters in two directions. If you use a public model, it may emit memorised content from its training set. If you fine-tune a model on your own data, that model can leak your data to anyone who queries it.

The third-party problem

When you call a hosted model API, your prompt leaves your infrastructure. That's a data-processing event with legal weight: where does the data go, who can read it, is it used for training, how long is it retained, and does that satisfy the contracts and regulations you operate under? "We send customer records to a model API" is a sentence your privacy and compliance people need to have signed off on.

Practical positions, from most to least control: self-host an open model so data never leaves your infrastructure (the strongest privacy story, at a higher operational cost); use an enterprise API tier with contractual no-training and data-residency guarantees; or redact and minimise, never sending more than the task needs and stripping identifiers before the call. SDEN's bias toward self-hosted infrastructure exists partly for this reason.
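The redaction option might look like the sketch below. It assumes a handful of regular expressions are enough to catch the identifiers in your text, which is rarely true in production; the PATTERNS table and the redact function are illustrative names, and the patterns cover only simple email, US social security number, and US phone formats.

```python
import re

# Illustrative patterns only; real identifiers need far broader coverage.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def redact(text: str) -> str:
    """Replace each matched identifier with a bracketed placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


raw = "Contact jane.doe@example.com or 555-867-5309; SSN 123-45-6789."
safe = redact(raw)
# safe == "Contact [EMAIL] or [PHONE]; SSN [SSN]."
```

Run the redaction on your side of the boundary, before the API call, so the raw identifiers never leave your infrastructure.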

Compliance is a design input, not an afterthought

If you handle personal data of people in North America, AI doesn't get a regulatory exemption. CCPA/CPRA (California), PIPEDA (Canada), and sector rules (HIPAA for health, GLBA for finance) all still apply to data you route through a model. The questions are the familiar ones: what's the lawful basis, can you honour a deletion request, where is the data processed, and can you produce an audit trail.

Data residency: can you guarantee where the data is processed and stored?
Right to deletion: if data is in a fine-tuned model, can you actually remove it? (Usually not without retraining.)
Purpose limitation: is the model provider allowed to train on your data? Get it in writing.
Auditability: can you show what data went where, for a regulator or a customer?
Minimisation: are you sending only what the task requires, or the whole record because it was easy?
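For the auditability question, one possible shape for an audit entry is sketched here. It assumes you want to record which fields went to which destination without storing the payload itself in the log; audit_record and its field names are hypothetical, not a standard.

```python
import hashlib
import json
from datetime import datetime, timezone


def audit_record(user_id: str, destination: str, payload: dict) -> dict:
    """Describe an outbound model call without storing the raw payload."""
    canonical = json.dumps(payload, sort_keys=True).encode("utf-8")
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "destination": destination,
        "fields_sent": sorted(payload),  # field names only, not values
        "payload_sha256": hashlib.sha256(canonical).hexdigest(),
    }


record = audit_record(
    "alice",
    "hosted-model-api",  # hypothetical destination label
    {"ticket_text": "Printer is broken", "product": "X200"},
)
print(json.dumps(record, indent=2))
```

The hash lets you later match a log entry against a payload you still hold. It is not anonymisation: a hash of a short, guessable value can be reversed by guessing.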

The cheapest privacy control is the oldest one: don't collect or send data you don't need. A model that never receives a social security number can never leak it. Minimisation at the boundary beats every downstream control.
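Minimisation at the boundary can be as plain as an allowlist of fields per task. The sketch below assumes records are dictionaries; ALLOWED_FIELDS, the task name, and the field names are all made up for illustration.

```python
# Hypothetical mapping from task to the fields that task actually needs.
ALLOWED_FIELDS = {
    "summarise_ticket": {"ticket_text", "product"},
}


def minimise(record: dict, task: str) -> dict:
    """Keep only the fields the named task is allowed to send."""
    allowed = ALLOWED_FIELDS[task]  # unknown task raises KeyError: fail closed
    return {k: v for k, v in record.items() if k in allowed}


customer = {
    "name": "Jane Doe",
    "ssn": "123-45-6789",
    "product": "X200",
    "ticket_text": "Printer is broken",
}
payload = minimise(customer, "summarise_ticket")
# payload == {"product": "X200", "ticket_text": "Printer is broken"}
```

An allowlist fails safe where a blocklist does not: a field added to the record later stays out of the prompt until someone decides it belongs there.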

In one line each

Context leakage is the common case: anything in the prompt can come out. Isolate sessions, authorise context, assume the system prompt is public.
Models memorise training data; fine-tuning on sensitive data bakes it into the weights and is not a privacy boundary.
Hosted APIs send your data off-site; control it with self-hosting, enterprise no-training contracts, or redaction.
CCPA/PIPEDA/HIPAA still apply; minimisation at the boundary is the cheapest control and the first one to reach for.

Where to go next

Chapter 4: Jailbreaks & misuse