Generative AI and Personal Information: Legal Obligations and Practical Compliance

About 10 minutes

Target audience: Legal, compliance, and information security teams at companies using generative AI; AI product developers working to comply with personal data protection laws
Prerequisites: AI Governance Overview

As generative AI tools become embedded in business workflows, organizations increasingly face the question of what happens to the personal information — customer data, employee records, contract details — that flows into these systems.

This article identifies three key scenarios where generative AI and personal data intersect, and outlines what organizations need to do under Japan’s Act on the Protection of Personal Information (APPI) and the EU’s General Data Protection Regulation (GDPR).


Three Scenarios Where Generative AI Meets Personal Data

When an organization trains or fine-tunes an AI model using data that includes personal information, data protection law applies. Using customer data collected for service delivery as training data for a new AI product is a change of purpose that typically requires disclosure or consent.

When employees use SaaS AI tools (ChatGPT, Claude, etc.) in their daily work, they may paste customer names, contact details, contract terms, or health information into a prompt. That input is transmitted to a third party’s servers, and the transfer can trigger legal obligations such as the APPI’s rules on third-party provision or the GDPR’s requirements for engaging a processor.

AI models trained on data that includes personal information may reproduce that information in their outputs — a phenomenon known as memorization. A specific individual’s details could appear in a response to an unrelated query, constituting an unintended disclosure.


Japan: The Act on the Protection of Personal Information (APPI)

Under the APPI, “personal information” means information relating to a living individual that can identify a specific person, including information that can be cross-referenced with other data to enable identification.[1]

In the AI context, the following types of data are likely to qualify:

  • Names, addresses, phone numbers, email addresses
  • Facial images and voice data (biometric feature data derived from them can qualify as individual identification codes)
  • Purchase history, browsing history (may identify an individual when combined with other data)
  • Medical and health information (subject to stricter rules as “special care-required personal information”)

Prohibition on Use Beyond Specified Purpose

Personal information handling businesses must not process personal information beyond the scope necessary to achieve the specified purpose of use without the individual’s consent (Article 18).[1]

Using personal data collected for customer service purposes to train an AI model will usually constitute a new purpose. The organization must either obtain consent for this new purpose or have disclosed it as an intended use at the time of collection.

Providing personal information to a third party generally requires the individual’s consent (Article 27).[1]

Sending personal data to an external AI API may constitute third-party provision. Whether exceptions for outsourcing or joint use apply depends on the contractual structure and actual arrangements — legal review is advisable.

Stricter Rules for Special Care-Required Personal Information

Medical and health information, race, creed, and criminal history are classified as “special care-required personal information.” Acquiring this information requires the individual’s prior consent, subject to limited statutory exceptions (Article 20, Paragraph 2).[1]

When AI systems in healthcare, HR, or similar domains process this category of data, requirements are materially stricter than for ordinary personal information.


EU: General Data Protection Regulation (GDPR)

The GDPR may apply to organizations outside the EU, including Japanese organizations, that offer goods or services to individuals in the EU or monitor their behavior there.[2]

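Processing personal data requires a lawful basis under Article 6 of the GDPR. The bases most relevant to AI use are:

| Basis | Description |
| --- | --- |
| Consent | The individual has given consent to the processing for one or more specific purposes |
| Performance of a contract | Processing is necessary for a contract with the individual |
| Legal obligation | Processing is required by law |
| Legitimate interests | Processing is necessary for the legitimate interests of the controller or a third party, unless overridden by the individual’s rights |

Article 6 also lists two further bases, vital interests and public task, which are less often relevant to commercial AI use.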

When relying on legitimate interests for AI data processing, organizations must conduct and document a balancing test weighing those interests against the rights and interests of the individuals concerned.

Article 22 of the GDPR gives individuals the right not to be subject to a decision based solely on automated processing, including profiling, that produces legal effects concerning them or similarly significantly affects them.[2]

AI-driven hiring screening, loan decisions, and insurance assessments can fall into this category when no human is meaningfully involved in the outcome. Building a human review process (Human-in-the-Loop) into the decision workflow is a governance requirement in these contexts.
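A human review step can be enforced in code rather than left to convention. The following is a minimal sketch, assuming the application already labels each model decision with a type; `ModelDecision`, `Route`, `route_decision`, and the set of decision types are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    AUTO = "auto"
    HUMAN_REVIEW = "human_review"


# Hypothetical decision types treated as significantly affecting individuals.
SIGNIFICANT_DECISIONS = {"hiring", "loan", "insurance"}


@dataclass
class ModelDecision:
    decision_type: str
    outcome: str  # e.g. "approve" or "reject"


def route_decision(decision: ModelDecision) -> Route:
    """Send every decision of a significant type to a human reviewer."""
    if decision.decision_type in SIGNIFICANT_DECISIONS:
        return Route.HUMAN_REVIEW
    return Route.AUTO
```

Routing alone does not make the review meaningful: the reviewer also needs the authority and the information to change the outcome.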

Before undertaking processing that is likely to result in a high risk to individuals, for example large-scale processing of sensitive categories of data, a Data Protection Impact Assessment (DPIA) is mandatory (Article 35).[2]

Organizations should assess whether a DPIA is required before deploying a new AI system.


Before using an external AI service for work that involves personal data, check:

  • Whether input data is used to train or improve the model
  • Data retention period and storage location (country or region)
  • Availability of a Data Processing Agreement (DPA)
  • Whether an enterprise plan with data processing restrictions is available

Many AI service providers offer enterprise contracts that prohibit using customer input for model training. Verifying these terms before deploying AI tools in personal data workflows is essential.[3]
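The checklist above can be recorded in a structured form so that open items are visible before a tool is approved. This is a minimal sketch, assuming the answers are collected by hand from the provider's terms; `VendorReview`, its field names, and `open_issues` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VendorReview:
    """Answers gathered from a provider's terms (field names are hypothetical)."""
    trains_on_input: bool
    retention_days: Optional[int]   # None if the terms state no period
    storage_region: Optional[str]   # None if the terms state no location
    dpa_available: bool


def open_issues(review: VendorReview) -> list[str]:
    """List the checklist items that still need resolving."""
    issues = []
    if review.trains_on_input:
        issues.append("input is used for model training")
    if review.retention_days is None:
        issues.append("retention period not stated")
    if review.storage_region is None:
        issues.append("storage location not stated")
    if not review.dpa_available:
        issues.append("no Data Processing Agreement available")
    return issues
```

An empty list means only that the four checks passed; whether the terms are acceptable remains a legal judgment.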

Define what employees may and may not include in prompts when using AI tools.

  • Which categories of personal information are prohibited or restricted in prompts
  • Rules for anonymization or pseudonymization before AI input (replacing identifying details)
  • Specific handling rules for special care-required personal information (medical data, etc.)
  • Situations where customer disclosure or consent is required before using AI tools
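The pseudonymization rule above can be partly automated before a prompt leaves the organization. The following is a minimal sketch, assuming a regex-based first pass is acceptable; `redact_prompt`, the placeholder labels, and both patterns are illustrative assumptions. It does not detect names or free-text health details, which need other techniques.

```python
import re

# Illustrative patterns only; real deployments need broader, tested coverage.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Loose match for phone-like digit groups such as 03-1234-5678 or +81 90 1234 5678.
PHONE_RE = re.compile(r"\+?\d{1,4}[\s-]\d{1,4}[\s-]\d{3,4}(?:[\s-]\d{3,4})?")


def redact_prompt(text: str) -> str:
    """Replace email addresses and phone-like numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text
```

For example, `redact_prompt("Contact taro@example.com or 03-1234-5678 about the renewal.")` returns `"Contact [EMAIL] or [PHONE] about the renewal."`.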

For organizations developing or fine-tuning their own models:

  • Confirm purpose specification and consent where required
  • Apply privacy-preserving techniques (differential privacy, anonymization)
  • Minimize data: use only what is necessary for the intended purpose
  • Implement access controls and audit logging for training data

Both the APPI and GDPR recognize rights that individuals can exercise over their personal data:

| Right | Description | AI Implementation Consideration |
| --- | --- | --- |
| Disclosure | Right to know what personal data is held | Identify and provide the individual’s data used in or by AI systems |
| Correction/Deletion | Right to correct or delete inaccurate data | Deletion from trained models is technically difficult; preventing inclusion is the more practical control |
| Suspension of use | Right to stop processing for a specific purpose | Build a procedure for stopping processing when consent is withdrawn |

Deletion of personal information from a trained model is technically challenging. Research into “machine unlearning” is ongoing, but preventing personal data from entering training sets in the first place remains the most practical approach.
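Because removal from a trained model is hard, one workable control is to filter each training set against a list of individuals who have withdrawn consent or requested deletion. This is a minimal sketch, assuming every record carries a `subject_id` key and the organization maintains such a list; both names and `filter_training_records` are hypothetical.

```python
from typing import Iterable


def filter_training_records(records: Iterable[dict], excluded_subjects: set[str]) -> list[dict]:
    """Drop records whose subject has withdrawn consent or requested deletion."""
    return [r for r in records if r["subject_id"] not in excluded_subjects]
```

Running the filter at the start of every training or fine-tuning job keeps later model versions free of the excluded records, although it does not affect models that were already trained.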

Prepare a response procedure for personal data breaches.

  • Under the APPI, certain types of data breaches must be reported to the Personal Information Protection Commission and notified to the affected individuals (Article 26)[1]
  • Under the GDPR, a personal data breach must generally be notified to the supervisory authority within 72 hours of the organization becoming aware of it, unless the breach is unlikely to result in a risk to individuals (Article 33)[2]
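The 72-hour window is simple arithmetic, but it is easy to get wrong across time zones during an incident. The following is a minimal sketch, assuming the incident log records a timezone-aware timestamp for when the organization became aware of the breach; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

GDPR_NOTIFICATION_WINDOW = timedelta(hours=72)


def gdpr_notification_deadline(aware_at: datetime) -> datetime:
    """Return the latest notification time, 72 hours after becoming aware."""
    if aware_at.tzinfo is None:
        raise ValueError("aware_at must be timezone-aware")
    return aware_at + GDPR_NOTIFICATION_WINDOW


def hours_remaining(aware_at: datetime, now: datetime) -> float:
    """Hours left before the deadline; negative once it has passed."""
    return (gdpr_notification_deadline(aware_at) - now).total_seconds() / 3600
```

For a breach noticed at 09:00 UTC on 1 March 2024, the deadline is 09:00 UTC on 4 March 2024. APPI reporting timelines differ and are not modeled here.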

The intersection of generative AI and personal information creates obligations in three scenarios: training data, prompt input, and AI output. Under the APPI, the key issues are use beyond the specified purpose and restrictions on third-party provision. Under the GDPR, the key issues are establishing a lawful basis for processing and complying with restrictions on automated decision-making.

Practical priorities include reviewing AI service terms, establishing prompt input guidelines for employees, managing personal data in training sets, and building procedures for responding to data subject rights requests. Addressing these incrementally while staying current with regulatory developments is a realistic path forward.

For specific legal judgments, consult a lawyer or privacy specialist with data protection expertise.

Related topics: Generative AI and Privacy | AI and Copyright


  1. Personal Information Protection Commission, Act on the Protection of Personal Information, Articles 2, 18, 20, 26, 27
  2. European Union, Regulation (EU) 2016/679 (General Data Protection Regulation), Articles 6, 22, 33, 35
  3. Personal Information Protection Commission, Notice regarding generative AI services, June 2, 2023