Please delete! What the right to be forgotten means for AI models


The most important facts at a glance
- ChatGPT & GPAI: ChatGPT is built on a general-purpose AI (GPAI) model and processes user input, which often contains personal data.
- GDPR requirements: GDPR-compliant use requires comprehensive transparency, documentation of all data processing, and the deliberate exclusion of sensitive data from the input.
- New obligations from August 2, 2025: The EU AI Act introduces additional requirements, especially for companies that integrate LLMs such as ChatGPT into internal processes.
- Challenges: Lack of transparency, complex data flows, and unclear model behavior make it difficult to implement data subject rights (e.g., the right to be forgotten) and increase regulatory risks.
- Governance & compliance: Companies should establish governance processes, employee training, and technical safeguards at an early stage to minimize legal and operational risks.
- Support from heyData: Practical risk assessments, AI literacy programs, and templates help to efficiently implement data protection and AI compliance.
Introduction: Between data protection and technical feasibility
AI models such as ChatGPT, image recognition systems, and applicant scoring algorithms have long been an integral part of many business processes in Europe. However, as their capabilities grow, so does the pressure to consistently implement data protection requirements such as the right to be forgotten. It gets especially tricky when personal data isn't just in databases, but also in the “learned knowledge” of the models themselves. This guide shows why deletion is so challenging in the AI world, what legal and technical pitfalls companies need to watch out for, and how practical, GDPR-compliant strategies can be implemented – especially with the EU AI Act's obligations for general-purpose AI applying from August 2, 2025.
What does the right to be forgotten mean?
The “right to be forgotten” under Article 17 of the GDPR obliges companies to permanently delete personal data upon request – e.g., in the event of withdrawal of consent, fulfillment of purpose, or unlawful processing.
Relevance for AI systems:
Many companies use personal data to improve algorithms, for example:
- for behavioral analysis in e-commerce
- for chatbots with personalized responses
- in applicant scoring systems
But if this data has to be deleted later, what happens to the AI model that was trained on it?
Problem: The GDPR does not distinguish between a classic database and a trained model. This leads to uncertainty – because even “learned knowledge” can be personal data if it can be traced back to an identifiable person.
Why AI models are problematic
An AI model “learns” by processing training data. The result is model weights or vector representations that are not structured like classic data sets and cannot simply be deleted record by record.
Example 1 – Chatbot trained with CRM data:
A chatbot is trained with real customer inquiries from the CRM, including names, problems, and order numbers. Even if the original data is deleted, the patterns remain in the model.
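A minimal sketch of why this happens, using invented ticket data: once a model is fitted on raw CRM text, the trained artifact itself retains traces of that text – here, the customer names literally end up in the model's vocabulary – even after the original records are deleted:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical CRM tickets containing personal data
tickets = [
    "Anna Schmidt: order 4711 arrived damaged",
    "Jonas Weber: invoice 0815 shows the wrong amount",
]
labels = ["shipping", "billing"]

model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(tickets, labels)

del tickets  # the "database" is gone ...

# ... but the fitted artifact still contains tokens from the
# training texts, including the names:
vocab = set(model.named_steps["countvectorizer"].get_feature_names_out())
print("anna" in vocab, "schmidt" in vocab)  # True True
```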
Example 2 – Applicant management with AI:
HR software uses applicant profiles to improve its scoring. Even after an applicant's profile is deleted, patterns derived from their language style or evaluation criteria can persist in the model – indirectly personal data.
Risk: If this data can be reconstructed later or feeds into decisions as a significant factor, this can constitute a data protection violation.
What to do when a deletion request is made?
When a request for deletion is made, the question arises: Does the AI model need to be retrained or adapted?
The GDPR does not provide a clear answer, but according to supervisory authorities, the following applies:
If the trained model stores personal information or makes it reproducible, the model is also subject to the obligation to delete.
Real-world example:
Meta has already been criticized by data protection authorities because training data with personal content could no longer be completely removed from its large language models (LLMs). There is a growing expectation that AI systems should be able to respond to data deletion requests in a transparent and reversible manner.
Typical problems in companies:
- The model was trained with live customer data, but no log exists
- No versioning → it is not known which data went into which model
- A deletion request also covers data that was relevant for training purposes
Technical approaches
This is where AI engineering and data protection come together – there are technical options for enabling the right to be forgotten in practice:
1. Machine unlearning
Targeted removal of individual training data points from a model without full retraining (see the sketch after this list).
- Works well with simple models (e.g., decision trees)
- More difficult with deep neural networks
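One practical unlearning scheme is SISA (Bourtoule et al.): shard the training data, train one sub-model per shard, and on a deletion request retrain only the shard that contained the record. A minimal sketch with synthetic data:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = rng.integers(0, 2, size=100)

N_SHARDS = 4
shards = np.array_split(np.arange(100), N_SHARDS)  # index sets per shard

def train_shard(idx):
    return DecisionTreeClassifier(random_state=0).fit(X[idx], y[idx])

models = [train_shard(idx) for idx in shards]

def forget(record_id: int):
    """Drop one record and retrain only the shard it was in."""
    for s, idx in enumerate(shards):
        if record_id in idx:
            shards[s] = idx[idx != record_id]
            models[s] = train_shard(shards[s])
            return

forget(42)  # only one of the four trees is retrained

def predict(x):
    # SISA aggregation: majority vote over the shard models
    votes = [int(m.predict(x.reshape(1, -1))[0]) for m in models]
    return max(set(votes), key=votes.count)
```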
2. Differential privacy
The model is trained in such a way that no reliable conclusions can be drawn about any individual training record (a simplified sketch follows below).
- Used by OpenAI, Google, Apple
- Well suited for aggregated data sets, but less so for individual contexts
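The core mechanism behind differentially private training (DP-SGD) can be sketched in a few lines: clip each example's gradient contribution, then add calibrated noise before the update. This NumPy toy version only illustrates the idea – production systems should use vetted libraries such as Opacus or TensorFlow Privacy:

```python
import numpy as np

CLIP, NOISE_STD, LR = 1.0, 0.8, 0.1  # illustrative values

def dp_sgd_step(w, per_example_grads, rng):
    # 1. Clip each example's gradient so no single record dominates
    clipped = [g * min(1.0, CLIP / max(np.linalg.norm(g), 1e-12))
               for g in per_example_grads]
    # 2. Add Gaussian noise calibrated to the clipping norm
    noise = rng.normal(0.0, NOISE_STD * CLIP, size=w.shape)
    noisy_mean = (np.sum(clipped, axis=0) + noise) / len(clipped)
    # 3. Take an ordinary gradient step on the noised average
    return w - LR * noisy_mean

rng = np.random.default_rng(0)
w = np.zeros(3)
grads = [rng.normal(size=3) for _ in range(8)]  # stand-ins for real gradients
w = dp_sgd_step(w, grads, rng)
```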
3. Retraining with deletion list
The model is retrained at regular intervals – without the deleted data (see the sketch after this list).
- Very reliable
- But: complex, expensive, only practical for large companies
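A sketch of what such a scheduled retrain can look like, assuming a hypothetical training table with a `subject_id` column and a maintained deletion list; the next model version then provably never saw the excluded records:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

def retrain(training_table: pd.DataFrame, deletion_list: set):
    # Filter out every record whose data subject asked for deletion
    kept = training_table[~training_table["subject_id"].isin(deletion_list)]
    model = LogisticRegression().fit(kept[["f1", "f2"]], kept["label"])
    # Provenance for the audit trail: which rows went in, which were excluded
    manifest = {"rows": kept.index.tolist(),
                "excluded": sorted(deletion_list)}
    return model, manifest
```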
4. Federated learning
Data remains decentralized – models learn locally and only the model updates are merged centrally (sketched below).
- Advantage: deletions can be implemented locally
- Complex to integrate
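A minimal federated-averaging sketch with plain logistic regression and synthetic data: each site trains on its own records, and only weight vectors are shared. A deletion is handled where the data lives – drop the record locally, retrain that site, and the next averaging round no longer reflects it:

```python
import numpy as np

def local_update(w, X, y, lr=0.1, epochs=5):
    # Plain logistic-regression gradient steps on one site's local data
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w = w - lr * X.T @ (p - y) / len(y)
    return w

def fedavg_round(w, sites):
    # sites: list of (X, y) pairs that never leave their owners;
    # only the updated weight vectors are averaged centrally
    return np.mean([local_update(w.copy(), X, y) for X, y in sites], axis=0)

rng = np.random.default_rng(1)
sites = [(rng.normal(size=(20, 3)), rng.integers(0, 2, 20)) for _ in range(3)]
w = fedavg_round(np.zeros(3), sites)
```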
Tip: For companies with a lot of data deletions, a “model-centric” deletion process with automated audit trails and versioning is worthwhile.
What companies should do now
The obligation to delete data does not end with the CRM or the cloud – it increasingly affects algorithmic systems as well. That is why companies need:
1. Privacy by design for AI systems
- Clarify early on: Which data may be used at all?
- Pseudonymize or aggregate training data as much as possible (see the sketch below)
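One simple way to pseudonymize direct identifiers before training is a keyed hash, so the mapping cannot be reversed without the secret key. The column names and key handling here are purely illustrative:

```python
import hashlib
import hmac
import os

# Keep the key out of the codebase, e.g. in a secret manager
SECRET_KEY = os.environ.get("PSEUDO_KEY", "dev-only-key").encode()

def pseudonymize(value: str) -> str:
    """Keyed, deterministic pseudonym for a direct identifier."""
    return hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()[:16]

row = {"name": "Anna Schmidt", "ticket": "order 4711 arrived damaged"}
row["name"] = pseudonymize(row["name"])  # train on the pseudonym only
```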
2. Model documentation
- Who trained what with which data?
- Which models are in use? Which version contains which data?
3. Process for deletion requests
- Interdisciplinary, involving data protection, IT, and data science
- Automated checks to see whether models are affected (see the sketch below)
- Model retraining or replacement where necessary
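A sketch of such an automated check, tying the documentation from point 2 to the deletion process: each trained model version records which data subjects went into it, so a deletion request can be matched against models, not just databases. All names and IDs are illustrative:

```python
from dataclasses import dataclass, field

@dataclass
class ModelVersion:
    name: str
    version: str
    subject_ids: set = field(default_factory=set)  # from the training manifest

registry = [
    ModelVersion("support-chatbot", "1.2", {"s-001", "s-002"}),
    ModelVersion("hr-scoring", "0.9", {"s-002", "s-003"}),
]

def models_affected_by(subject_id: str) -> list:
    """Which deployed model versions must be retrained or replaced?"""
    return [f"{m.name}:{m.version}" for m in registry
            if subject_id in m.subject_ids]

print(models_affected_by("s-002"))  # -> ['support-chatbot:1.2', 'hr-scoring:0.9']
```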
4. DPIA (Data Protection Impact Assessment)
- Especially for sensitive models (HR, health, behavior, risk scoring)
- Must assess whether data subject rights are technically enforceable
Conclusion
The right to be forgotten poses real challenges for AI developers and companies – but they are solvable. With documented data flows, technical “unlearning,” and clear responsibilities, the AI world can also become GDPR-compliant.
Companies building AI models today must design for reversibility – otherwise they risk expensive retraining, fines, and reputational damage.
Important: The content of this article is for informational purposes only and does not constitute legal advice. The information provided here is no substitute for personalized legal advice from a data protection officer or an attorney. We do not guarantee that the information provided is up to date, complete, or accurate. Any actions taken on the basis of the information contained in this article are at your own risk. We recommend that you always consult a data protection officer or an attorney with any legal questions or problems.