Learn With Darin · The Primer

AI Safety.

The practical, day-to-day version. Not Skynet, not AGI, not a manifesto. Just the everyday risks any thoughtful user should understand: what gets logged, what it gets wrong, what to keep out of the chat box, and how to calibrate how much you trust the answer.

Updated May 2026 · ~16 min read · Privacy, accuracy, risk
Part 01

The honest framing

There are two conversations about AI safety happening at the same time, and they get mixed up constantly. One is about the long-term, speculative, "what happens if these systems become much more capable" question. That conversation is real and serious; it is also not what most people need on a Tuesday afternoon when they paste a tax form into ChatGPT. This guide is about the other one.

The everyday version is mundane and practical. It looks like: who sees this conversation, is the answer actually correct, should I really be using this for medical advice, what happens if my kid asks it to do their homework, can I paste a client's contract into it. These are not exciting questions. They are the ones that decide whether AI is a useful tool in your life or an expensive mistake.

My read of this is straightforward. AI tools are genuinely useful. They are also fluent enough to feel more reliable than they are, connected enough to leak data you didn't mean to share, and convincing enough to crowd out your own judgment if you let them. None of that is a reason to avoid them. It is a reason to use them with your eyes open.

This guide is not about existential risk, AGI, or AI takeover scenarios. Plenty of other writers cover that ground. This is about the boring, useful safety practice: the kind that protects your data, your decisions, and the people who depend on you.
Part 02

Privacy: what's actually logged, trained on, shared

The single most misunderstood thing about consumer AI tools is what happens to the words you type into them. The honest answer is "it depends on the tier you're using, the provider's current policy, and which toggles you've flipped." That sounds evasive. It isn't. Vendor policies are genuinely different across tiers, and they change.

Here is the rough shape of it, as of mid-2026. Treat it as a starting point, not a guarantee. Always check the current policy of the specific tool you use.

The three tiers, generalized

  • Free consumer tier. On most providers, conversations on the free tier are eligible to be reviewed by humans and used to train future models, unless you turn that off. An opt-out toggle exists, but it is usually off by default, which means training is on until you change it.
  • Paid consumer tier (Plus, Pro, and similar). The defaults vary. Some providers do not train on paid-tier conversations. Others do, with an opt-out. Read the data-controls page on whichever tool you pay for.
  • Enterprise and API. Different contract entirely. Conversations are typically not used for training, are subject to stricter retention, and are governed by a data-processing agreement. This is the tier built for companies with confidentiality obligations.

Generally trained on

  • Free-tier consumer chats with default settings
  • "Improve the model" toggles you left on
  • Publicly shared conversation links, like ChatGPT's shared-link pages (varies by provider)
  • Voice transcripts on free tiers (often)
  • Conversations you explicitly thumbs-upped or thumbs-downed

Generally NOT trained on

  • Enterprise and Team plans with data controls
  • API traffic with default provider settings
  • Paid-tier chats where you turned training off
  • Temporary, incognito, or "off-the-record" chats
  • Workspace and admin-managed deployments

Memory features

Several tools now offer "memory": an opt-in feature where the assistant carries facts about you across conversations. Memory is not the same as training. Facts in memory are stored against your account, used to personalize replies, and can be viewed and deleted in settings. Memory is mostly low-risk for low-stakes personalization. It is not the place to put anything you wouldn't want surfaced in a future, unrelated chat.

What stays in the chat after you close the tab

Closing the browser tab does not delete the conversation. Most tools store every chat you've ever had under your account, viewable in a sidebar. Deleting from the sidebar usually removes it from your view; full deletion from the provider's systems takes longer and varies by provider. If you want a conversation to leave no trace, use the provider's temporary or incognito mode before you start, not after.

Tip Once a quarter, open the data-controls page on every AI tool you use. Confirm training is off where you want it off. Confirm memory contains only what you expect. Purge old conversations you no longer need. Five minutes; high leverage.
Part 03

Accuracy: hallucinations and the confident-and-wrong problem

The canonical AI failure mode is not lying. It is plausibility. The model produces an answer that sounds right, reads cleanly, and is structurally correct, but is factually wrong. This is what people mean when they say "hallucination," and it is the single most expensive habit a new user has to break.

The reason it happens is mechanical. These systems are optimized to produce text that looks like the right answer, given everything they have seen. When the right answer is well-represented in their training data, they tend to produce it. When it isn't, they produce something that looks like the right answer would look. Same fluency, same confidence, no warning bell.

Where hallucinations show up most

  • Citations and references. Made-up paper titles, plausible-looking author lists, fabricated DOIs, invented page numbers. The format is right; the source does not exist.
  • Case law and statutes. Lawyers have already been sanctioned for filing AI-generated briefs that cited cases which were never decided.
  • Specific factual details about people. Wrong middle names, wrong birth years, mixed-up biographies, attributions to the wrong person.
  • Numbers in tables. Plausible-looking statistics that don't match the source. The model interpolates when it can't read precisely.
  • Recent events. Anything past the model's training cutoff is guesswork unless the tool is doing live web search and citing what it finds.

The two structural defenses

You will not catch hallucinations by being more careful. They are designed, accidentally, to evade careful reading. You catch them with structure.

  1. Cross-check between models. Paste the same question into a second AI. Different labs train differently and have different blind spots. When two models agree on a specific factual claim, your confidence rises. When they disagree, you have learned something useful. (See the Cross-check between models section in Best Practices for the longer version, and the optional sketch after this list if you'd rather script it.)
  2. Verify against a non-AI source for anything load-bearing. Open the citation. Read the actual document. Call the actual person. The cost of being wrong publicly is usually much higher than the cost of one minute of verification.
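
If you are comfortable with a little scripting, the cross-check can be automated. Below is a minimal sketch, not a prescribed workflow: it assumes the official openai and anthropic Python packages are installed, that API keys are already set in your environment, and that the model names are placeholders you would swap for whatever you actually use. For most readers, pasting the question into a second chat window works just as well.

```python
# Minimal cross-check sketch: ask two different providers the same question.
# Assumptions: the official `openai` and `anthropic` packages are installed,
# OPENAI_API_KEY and ANTHROPIC_API_KEY are set in the environment, and the
# model names below are placeholders to replace with ones you have access to.

from openai import OpenAI
import anthropic


def ask_two_models(question: str) -> dict[str, str]:
    """Send the same question to two providers and return both answers."""
    openai_client = OpenAI()
    anthropic_client = anthropic.Anthropic()

    openai_answer = openai_client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[{"role": "user", "content": question}],
    ).choices[0].message.content

    anthropic_answer = anthropic_client.messages.create(
        model="claude-3-5-sonnet-latest",  # placeholder model name
        max_tokens=500,
        messages=[{"role": "user", "content": question}],
    ).content[0].text

    # The comparison is deliberately left to a human: agreement raises your
    # confidence, disagreement tells you exactly what to go verify.
    return {"openai": openai_answer, "anthropic": anthropic_answer}


if __name__ == "__main__":
    question = "When was the first transatlantic telegraph cable completed?"
    for name, text in ask_two_models(question).items():
        print(f"--- {name} ---\n{text}\n")
```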
Warn Fluent and confident is not the same as correct. The more authoritative an AI answer sounds, the less it tells you about whether it is right. Treat smooth, certain prose as a reason to verify, not a reason to relax.
Part 04

Sensitive content: medical, legal, financial advice

This is the area where the gap between "useful" and "dangerous" is widest, and the framing matters most. AI tools are genuinely helpful for medical, legal, and financial topics. They are also a bad place to make medical, legal, or financial decisions. Both can be true.

What AI is genuinely useful for

  • Decoding terminology. What does "ejection fraction" mean. What is a 1031 exchange. What is the difference between a Chapter 7 and a Chapter 13 bankruptcy. The model is excellent at translating jargon into plain language.
  • Preparing questions. Before a doctor's appointment, a lawyer's intake call, a meeting with a financial advisor. Walking in with five sharp questions is dramatically more useful than walking in cold.
  • Understanding general concepts. How does a Roth IRA work. What does a typical knee replacement recovery look like. What is "consideration" in contract law. General-shape education is exactly what these models are built for.
  • Decoding documents in front of you. Lab results, an MRI report, a lease, a prospectus. The model is good at saying "here is what the document is telling you, in plain English."

What it should not be used for

  • Diagnosis. A symptom list and an AI are not a substitute for a clinician with eyes, ears, and tools.
  • Prescribing or dosage. Drug interactions, pediatric dosing, anything where the wrong number causes harm. This is not a place for plausibility.
  • Legal counsel for your specific situation. "Should I sign this" is not the same as "what does this clause typically mean." The first one needs a lawyer; the second one is fair game.
  • Specific investment decisions. "Should I sell" or "is this stock a buy" is not a question with a correct answer in the model's training data, and the answer it generates is not advice.

The framing I've found useful is "more informed patient" or "more informed client." You are not replacing the professional. You are showing up better prepared, with sharper questions, less intimidated by the vocabulary. That is genuinely valuable. The model gets you to the appointment ready. It does not replace the appointment.

Warn Anything irreversible (a procedure, a signed contract, a large transfer of money, a custody decision) needs a human professional. The cost of being wrong is too high, and the savings from skipping the professional are too small, for AI to be the final word.
Part 05

Children and AI

The right posture here is engaged, not panicked. AI tools are part of the world your kids will grow up in. The question is not whether they will encounter them, but whether they will encounter them with some context. The goal is not to keep AI away. The goal is to teach them how to use it.

What to teach, by roughly what age

  • Younger (under 10). Active supervision. Use AI together, the way you'd watch a movie together. Treat the assistant as a guest in the house, not a babysitter. Most providers' terms of service set the minimum age at 13 anyway.
  • Middle (10–13). Co-piloted use, with conversations about what the tool is. Three rules I'd land on early: it can be wrong, don't share personal information with it, don't believe everything it says.
  • Older (14+). More independence, with periodic check-ins. By this age the more interesting conversation is "how to use it well," not "whether to use it."

The homework question

This is the conversation parents have most often, and the answer is more nuanced than "always" or "never." AI helps learning when it explains a concept the kid is genuinely stuck on, walks them through a problem they then solve themselves, or quizzes them on material they need to internalize. It shortcuts learning when it produces a finished essay the kid copies, or solves a math problem without the kid working through it.

The test I'd suggest. After the AI session, can the kid explain the answer, in their own words, to a sibling? If yes, the tool helped. If not, the tool replaced the work the learning was supposed to do.

Personal information

The most important rule, repeated until it sticks: do not share full name, address, school, phone number, or photos with an AI assistant unless an adult has reviewed the tool and approved it. Same hygiene as any other internet service. Different surface, same rule.

Tip Have the conversation once, then revisit it casually a few times a year. "Hey, what did you ask the chatbot about this week?" gets you more honest answers than a one-time lecture about safety. Curiosity beats rules.
Part 06

Workplace data: what NOT to paste

This is the single most common way professionals get themselves and their employers into trouble. Someone has a confidential document, an AI tool open in another tab, and a deadline. The tool is helpful. The document is sensitive. They paste it in. The convenience is immediate; the problem is invisible until later.

The default rule worth internalizing. Anything you paste into a free or non-enterprise AI tool could end up in training data, in a security review, or in someone else's hands. Treat the chat box like email to an outside vendor: fine for what you'd send anyway, not the right surface for what you wouldn't.

The not-without-thinking list

  • Confidential client information. Names, financial figures, deal details, anything covered by a confidentiality clause.
  • Source code under NDA or proprietary terms. Especially the parts that aren't on a public repo.
  • Customer PII. Email lists, account numbers, anything regulated under privacy law.
  • Salary and performance data. Yours, your team's, anyone's. Always sensitive.
  • Draft contracts and term sheets. Especially before signature, especially with named counterparties.
  • Intellectual property in formation. Patent-eligible inventions, unfiled trade secrets, internal R&D.
  • Health information. Yours, your colleagues', a customer's. HIPAA-adjacent and beyond.

What changes on enterprise tiers

Enterprise and API tiers exist precisely so companies can use AI on sensitive material under contract. The data-processing agreement says, in writing, what the provider will and won't do with your data. Training is off. Retention is bounded. Access is logged. If your work involves the categories above, the right tool is an enterprise deployment your company has reviewed, not the consumer tool you signed up for personally.

Shadow IT

The most common version of this problem is innocent. Someone signs up for a tool with their personal email because the procurement queue is slow. Then they use it for work. The company has no contract, no audit trail, no idea where the data went. It is rarely malicious; it is almost always avoidable. Ask first; it is faster than you'd think, and the alternative is a security incident no one wants to write up.

Warn If you are about to paste something you'd hesitate to forward to an external vendor, that hesitation is the signal. Find a different surface. Either the enterprise version of the tool, an internal one, or a non-AI workflow.
Part 07

Account security

Your AI account is a higher-value target than it used to be. A few years ago, breaking into someone's chatbot account got you very little. Today, that account holds months of conversations: the questions someone asks privately, the documents they upload, the projects they save, sometimes their memory store. That is a richer target than most email inboxes.

The basics, applied here

  • Strong, unique password. A password manager is the only reasonable way to do this; if you're not using one yet, consider this the nudge.
  • Two-factor authentication on the AI account itself. This is the one people miss. They have 2FA on email, on banking, on social. They sign into ChatGPT or Claude with just a password. Turn on 2FA. It takes ninety seconds.
  • Don't share your login. Not with a colleague, not with a family member who wants to "try it." The conversation history is yours, the memory is yours, the saved files are yours. Sharing the login mixes them with someone else's, in both directions.
  • Review active sessions periodically. Most providers have a "devices" or "sessions" page. If something is signed in that you don't recognize, sign it out and rotate the password.

If you suspect a session is compromised

Sign out of all sessions immediately. Rotate the password. Re-enable 2FA if it wasn't already on. Review the chat history for anything that wasn't you. Check connected apps and integrations; revoke anything you don't recognize. If you store sensitive material in memory or projects, audit those next. Then, separately, think about what was in those conversations and whether anyone needs a heads-up.

Part 08

Bias and representation

The discussion about AI bias gets stuck in two unhelpful places. One side says it's a fatal problem that makes the tools unusable. The other says it's overblown and these systems are basically neutral. Neither is right. The realistic version is more useful.

These models are trained on the internet, mostly in English, weighted toward US and Western European sources. The defaults reflect that. Names in examples skew Anglo. Cultural references skew American. Measurement units default to one system. The "average user" the model imagines is a particular kind of person, and if you are not that person, the defaults will sometimes feel a beat off.

Where it actually shows up

  • Resume and candidate screening. Models trained on past hiring data inherit past hiring patterns. This is a known and studied issue. AI-assisted hiring needs human review, not delegation.
  • Names in generated examples. Default to Anglo names unless asked otherwise. Worth specifying when context matters.
  • Cultural and culinary defaults. "A typical breakfast" leans toward a Western breakfast. "A traditional wedding" defaults to one particular tradition unless you name the one you mean.
  • Medical and clinical guidance. Reference ranges, symptom presentations, and risk factors are often built on study populations that under-represent women, people of color, and older adults. Worth flagging when relevant.
  • Translation and idiom. Translations into and out of less-resourced languages are weaker than English-to-Spanish or English-to-French. The fluency hides the weakness.

What you can actually do about it

  • Specify context explicitly. "I'm writing this for a clinical audience in India," "Use names that reflect a Latinx workplace," "Assume the reader is a non-native English speaker." Specificity overrides defaults.
  • Ask for the question reframed. "How would this answer change if the user were 70 years old," or "if the company were based in Lagos rather than San Francisco." The model is often capable of better answers; the default is just one of many.
  • Cross-check on anything load-bearing. A second model trained slightly differently sometimes catches what the first one missed.
  • Don't outsource consequential judgment. Hiring, lending, sentencing, medical triage. The presence of an AI in the loop does not make the bias question go away; it usually makes it harder to see.
Part 09

The agreeable problem (sycophancy)

Modern AI assistants are tuned to be helpful and pleasant. The side effect, well documented now, is that they tend to agree with you. If you propose a plan, the model usually finds reasons to support it. If you suggest an interpretation, the model often endorses it. If you ask whether your work is good, the answer is almost always yes, with light suggestions for improvement.

This feels nice. It is also dangerous, because you cannot distinguish "the model agrees because I'm right" from "the model agrees because that is what it does." The flattery is invisible exactly when you most want it to be.

How to notice it

  • Watch for unbroken affirmation. If three turns in a row endorse your idea, the model is probably not adding signal.
  • Notice when the praise precedes the substance. "That's a great question" or "What a thoughtful approach" before any analysis is filler. Skim past it.
  • Be suspicious of "you're absolutely right." Especially when you've just contradicted what the model said two messages ago. Either the first answer was wrong, or this one is. Pick.

Counter-prompts that actually work

  • "Play devil's advocate. Argue the opposite position with the strongest case you can make."
  • "Give me three reasons this is wrong, assuming the strongest version of the critique."
  • "What would a skeptical expert in this field say about this? Don't soften it."
  • "Rate this 1 to 10, and tell me what would have to change to get it to a 9. Be honest about the gap."
  • "What am I missing? Not what's good. What's the thing I'd regret in three months."

When agreement is suspicious vs when it's correct

Agreement on a well-documented factual question is usually fine. Two plus two equals four; the capital of France is Paris; the model agreeing with you on these is not flattery. Agreement on a judgment call, a strategy, a creative direction, or a self-assessment of your own work is the one to question. The more the answer depends on taste, context, or your own situation, the more the model's default helpfulness can produce false comfort.

Part 10

Calibrating trust

The right relationship with an AI tool is not "trust it" or "don't trust it." It is calibrated trust: high in some contexts, low in others, and you should be able to say which is which. Here is the rubric I actually use.

When to believe

  • Well-documented general explanations. How does photosynthesis work, what is compound interest, what does a typical software architecture look like. Topics with deep, consistent training-data support.
  • Drafts you'll edit. The model produces a starting point; you do the work of making it correct. Errors get caught in editing.
  • Brainstorming and ideation. You're not looking for the right answer; you're looking for a list of possibilities to choose from. Fluency is the feature.
  • Decoding documents in front of you. The source material is on screen; the model is helping you read it.
  • Format conversion and structural work. Reformatting, summarizing, restructuring. Lower stakes for hallucination because the source is right there.

When to verify

  • Anything you'll cite. Names, dates, numbers, quotes, statistics. Open the source.
  • Specific factual details about real people. Middle names, biographies, attributions, credentials. Cross-check.
  • Technical claims you can't evaluate yourself. If you can't tell whether the answer is right, you need the structural defense of a second source.
  • Anything near or past the tool's knowledge cutoff. Live web search helps but does not eliminate the problem.
  • Anything that feels too clean. Suspiciously neat answers often paper over a hard part. The neatness is the tell.

When to escalate to a human professional

  • Medical decisions. Diagnosis, treatment, dosing, anything that touches a procedure.
  • Legal action. Filing, signing, suing, defending. The model is for preparation; the lawyer is for the action.
  • Large or irreversible financial moves. Selling a house, major portfolio rebalancing, taxable events with consequences.
  • Anything safety-critical. Electrical, structural, anything that could hurt someone.
  • Anything involving your kids' futures. Schooling, custody, healthcare. AI can prepare you; humans should decide.

The single rule that holds all of this together: if the cost of being wrong is greater than the cost of verifying, verify. That is the entire algorithm. Most of the time, verification is cheap and the wrongness would be expensive. Use the model anyway. Just don't use it as the final word on something that matters.
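
For what it's worth, the rule really is small enough to write down. A throwaway sketch, with both costs as rough estimates you supply yourself:

```python
def should_verify(cost_of_being_wrong: float, cost_of_verifying: float) -> bool:
    # The entire algorithm: check whenever being wrong would cost more than
    # checking does. Both numbers are your own rough estimates (time, money,
    # embarrassment), not anything the model can tell you.
    return cost_of_being_wrong > cost_of_verifying
```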

Part 11

Closing thought

Safety, in this everyday sense, is not paranoia. It is calibration. The people who use AI most heavily are not the ones who have stopped checking; they are the ones who have built habits for when to check. They flip training off. They cross-check load-bearing claims. They keep the sensitive stuff out of the chat box. They escalate to a professional when the stakes go up. Those habits do not slow them down. They are what makes the speed safe.

Calibrated trust beats blind trust, and it beats reflexive distrust too. The work is figuring out which is which. — TWD

If you take one thing from this guide, take this. The risks here are not exotic. They are the everyday ones, dressed up in new clothes: privacy, accuracy, judgment, professionalism. The same instincts that keep you safe in the rest of your digital life work here. They just need to be pointed at a tool that talks back fluently.

From here, the natural next reads are Best Practices for the cross-cutting habits that turn careful use into productive use, A gentle start if you're newer to all of this, and Everyday AI for the practical patterns of weaving these tools into a normal week. None of them are about safety in particular. All of them are about using AI in ways that make safety easier.