You’ve probably asked ChatGPT something personal by now—maybe a question about taxes, a family issue, or work you weren’t ready to share with a colleague.
And now that OpenAI has revamped its memory feature, you might be wondering: what happens to all of that data?
It’s a fair question. But here’s the problem—most people are still confusing three very different concepts: inference, memory, and training. That confusion can lead to misplaced fears, clumsy policies, or worse, risky decisions.
So let’s clear it up.
Late last week, Kevin Weil, OpenAI’s Chief Product Officer, posted:
“Starting today, Memory in ChatGPT can reference all of your past chats to provide more personal responses.”
If you use ChatGPT regularly, you’ll notice the change quickly—responses feel more tailored, more aware of your past conversations, and maybe a little more natural: less like a one-shot search experience and more like a true assistant. It might remember that you asked how to do a backdoor Roth IRA in Robinhood, what to do when your freezer door is frozen shut, or that you wanted ideas for your father’s 70th birthday gift.
But for many legal, compliance, and security teams, this update might feel like confirmation of a long-standing fear: that everything you say to an AI model is being trained into it. That your prompts are becoming part of a collective intelligence, accessible to anyone and everyone.
Let’s be clear: that’s not how it works. And with memory now a visible part of the ChatGPT experience, it’s more important than ever to clarify three separate concepts:
💡 Inference ≠ Training
When you type a prompt into ChatGPT, Claude, or Gemini, you’re performing inference. That means you’re querying a pre-trained model—asking it to generate a response based on its existing knowledge.
It’s like asking a very smart intern (or a PhD, if you’re using o1) a question. They give you the best answer they can based on what they’ve already learned; with memory enabled, that also draws on past conversations with you specifically. But they’re not gossiping to others about your questions. Unless, of course, you’ve explicitly opted into model training.
All the major paid versions—ChatGPT Plus, Claude Pro, Gemini Advanced—let you toggle training on or off in settings. Some popular tools, like Granola, do train their models on anonymized versions of user interactions by default unless you pay for an enterprise plan and opt out.
Inference is stateless—meaning the model doesn’t retain or remember what you just said—unless memory or another mechanism is specifically turned on. Even then, the system is storing your input so future prompts can reference it; the model isn’t learning from it unless it’s explicitly configured to train on it. Always read the T&Cs (or have an LLM do it for you).
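To make “stateless” concrete, here’s a minimal sketch using the OpenAI Python SDK. The model name, prompts, and setup are illustrative only, not a statement about how ChatGPT itself is built—each API call is an independent inference request, and the model only sees whatever context that call carries.

```python
# Minimal sketch: every API call is an independent inference request.
# Nothing persists between calls unless a memory feature (or your own
# code) re-sends prior context.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Call 1: answered from the pre-trained weights plus this prompt alone.
first = client.chat.completions.create(
    model="gpt-4o",  # illustrative model name
    messages=[{"role": "user", "content": "My freezer door is frozen shut. What should I do?"}],
)

# Call 2: a brand-new request. The model has no idea Call 1 happened,
# because that history isn't included in the messages list.
second = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What did I just ask you about?"}],
)
print(second.choices[0].message.content)  # it can't know -- the call is stateless

# To give the model continuity, the application has to carry the
# conversation forward explicitly in each request:
follow_up = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "My freezer door is frozen shut. What should I do?"},
        {"role": "assistant", "content": first.choices[0].message.content},
        {"role": "user", "content": "What did I just ask you about?"},
    ],
)
```

Conceptually, that last pattern (an application re-sending useful context on your behalf) is what a memory feature automates. The model’s weights don’t change either way.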
And I get it—you might worry there’s a principal-agent problem. Maybe you don’t fully trust the big AI labs to resist the temptation to train on your very rich chat history.
My take? The winners in AI products will get this right. They’ll look more like Apple in their posture toward privacy and security—and less like the data brokers of ad tech.
🧠 Memory ≠ Training
OpenAI’s revamped memory feature is also not training.
Memory allows the model to persist useful context across sessions—things like your name, preferences, or ongoing projects—tied to your account. The goal is to make interactions more helpful and efficient.
But memory is personalized recall, not model evolution. The base model isn’t improving because of your input. Your memory isn’t contributing to a global dataset. It’s not being shared with other users. Think of it more like a notebook (Apple Notes, Notion, etc.) and less like an update to the model’s weights.
And you’re in control: OpenAI allows you to view, manage, and delete your memory at any time.
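To be explicit about the notebook analogy, here’s a conceptual sketch—my own illustration, not OpenAI’s actual implementation, and every name in it is hypothetical. Stored memories live in a per-user store that gets injected into your prompts, while the shared model stays frozen.

```python
# Conceptual sketch only: "memory" as a per-user notebook injected into
# the prompt at inference time. The shared model weights never change.
from dataclasses import dataclass, field


@dataclass
class UserMemory:
    """Hypothetical per-account memory store."""
    user_id: str
    notes: list[str] = field(default_factory=list)

    def remember(self, fact: str) -> None:
        self.notes.append(fact)   # writes to this user's notebook only

    def forget_all(self) -> None:
        self.notes.clear()        # the user can wipe it at any time


def build_prompt(memory: UserMemory, user_message: str) -> list[dict]:
    """Prepend stored notes as context for a single inference request."""
    context = "Known about this user:\n" + "\n".join(f"- {n}" for n in memory.notes)
    return [
        {"role": "system", "content": context},
        {"role": "user", "content": user_message},
    ]


mem = UserMemory(user_id="u_123")
mem.remember("Shopping for a 70th birthday gift for their father")
messages = build_prompt(mem, "Any new gift ideas?")
# `messages` goes to the same frozen model everyone else uses; only this
# user's request carries their notebook, and deleting the notebook
# removes that context entirely.
```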
While we’re here—take a moment to think about the sensitive data you already have sitting in tools like Airtable, Superhuman, or the long tail of SaaS you’ve connected via OAuth and checked off access to email, calendar, etc. Who do you trust more with your data?
🧪 Training Is Deliberate and Controlled
Training a large language model is a completely separate, intensive process. It involves ingesting massive curated datasets and optimizing the model’s internal weights over thousands of compute hours. Incorporating user prompts into that process is intentional, neither automatic nor accidental.
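For contrast, here’s a toy sketch of what “optimizing the model’s internal weights” looks like in code—a tiny PyTorch model standing in for a frontier LLM, nothing like the real pipelines in scale or sophistication. The point is that weights only change inside a loop someone deliberately built and ran:

```python
# Toy illustration of training vs. inference (PyTorch). Real LLM training
# is orders of magnitude larger, but the distinction is the same: training
# writes to the weights, inference only reads them.
import torch
import torch.nn as nn

model = nn.Linear(10, 10)  # stand-in for billions of parameters
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

# A curated dataset someone deliberately assembled (random tensors here).
curated_dataset = [
    (torch.randn(8, 10), torch.randint(0, 10, (8,))) for _ in range(100)
]

# TRAINING: an intentional loop that updates the weights.
for inputs, targets in curated_dataset:
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()    # gradients with respect to the weights
    optimizer.step()   # the weights themselves change here

# INFERENCE: a forward pass that reads the weights but never writes them.
with torch.no_grad():
    prediction = model(torch.randn(1, 10))
```

Your prompt flowing through that last line changes nothing about the model; it only becomes training data if someone deliberately adds it to a dataset like the one above.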
And in practice? If you’re a paying user of ChatGPT Plus, Claude Pro, or Gemini Advanced—and you’ve configured your settings appropriately—your data is not used to train the model. That’s been confirmed publicly by OpenAI, Anthropic, and Google.
This isn’t a grey area. It’s a product design and policy decision with clear legal and reputational implications.
Of course, let’s be honest—all of these companies would love to train on your data. More data means better models, faster iteration, and potentially defensible improvements. But that’s not where the market is heading. Enterprise buyers, developers, and even individual users are demanding control, transparency, and the ability to opt out. And that’s exactly why OpenAI, Anthropic, and Google have made non-training an option (at least for paying users). Control is quickly becoming table stakes.
🤔 So What Should You Actually Worry About?
If you’re in legal, compliance, or security, it’s right to be cautious about where your data lives and who can access it. But the fear shouldn’t be “Is this AI training on my data?”—the better questions are:
- Who has access to this data?
- What is their retention policy?
- What agreements are in place?
- Can I see what’s stored and delete it?
And let’s be honest: if you’re fine giving Superhuman access to your inbox, or letting Asana store your company’s roadmap, or uploading sensitive docs into Airtable—but you panic at the idea of asking ChatGPT a question with sensitive context—you’re probably reacting to fear, not risk.
Companies like OpenAI and Anthropic are huge, well-capitalized, and security-obsessed. Their reputations—and entire business models—depend on keeping your data safe. In many cases, they’re more secure than the average productivity tool.
One real risk? ChatGPT wrappers. There are tons of third-party tools built on top of ChatGPT that offer convenience or verticalized features—but often with weak terms of service and murky data practices. Prompt and chat history data is incredibly valuable, especially to SEO optimizers, marketing toolkits, and AI content generators. If you’re using one of these wrappers, you may be handing over your queries to someone who’s reselling or repurposing them. That’s the kind of leakage that should keep legal teams up at night—not whether OpenAI is training on your deck.
✅ The Bottom Line
- Inference is not training.
- Memory is not training.
- Training on your data is a separate, deliberate process that you can opt out of.
ChatGPT’s new memory feature makes the product more personal—but it doesn’t make the model smarter in a global sense.
Shopify CEO Tobi Lütke recently sent a company-wide memo declaring that “reflexive AI usage is now a baseline expectation.” He made it clear: every team, including the executive team, is expected to use AI not just as a tool, but as a default mode of working. His point isn’t that AI is trendy—it’s that it’s now table stakes. If you’re not using it, you’re already falling behind. I tend to agree. This isn’t about hype. It’s about not being the last one clinging to a calculator while everyone else is using Excel.
If we want to use AI responsibly, we have to start by understanding what it’s actually doing—and what it’s not.