Publications Global KLRI, Best Research, Better Legislation


Research Report

Legal Frameworks for Enhancing the Lawfulness of Personal Data Processing in AI Development
  • Issue Date 2025-12-31
  • Page 241
  • Price 9,000
Ⅰ. Background and Purpose of Research
▶ Background of the Discussion
○ As artificial intelligence technologies have developed rapidly in recent years, managing the resulting risks, ensuring the lawful processing of data, and securing transparency have emerged as major tasks.
 -  In particular, there is an urgent need to establish a balanced legal and institutional framework that can reasonably protect personal data without hindering the innovation brought about by artificial intelligence technologies.
○ Artificial intelligence has emerged as a core infrastructure for national competitiveness and industrial-economic value creation, but the personal data legal framework governing it has shown its limitations, remaining confined to the pre-existing regulatory framework.
- New forms of data processing that emerged after the introduction of artificial intelligence include areas not anticipated by existing laws, thereby expanding legal gray areas.
○ Developers or development companies of artificial intelligence point out, as the greatest difficulty, that “the lawful basis other than consent is unclear.”
- In order to develop high-performance artificial intelligence, large-scale and high-quality data are essential, but under the current system there is a major problem in that data acquisition is greatly restricted due to the limitations of de-identification and pseudonymization methods and the burden of consent-based processing.
○ There is a demand for establishing clear legal grounds and standards for personal data that may be used together with publicly available data in the development of artificial intelligence.
▶ Objectives of This Study
○ This study was conducted with the following two objectives.
○ First, it examined the legal issues arising in the process of artificial intelligence technologies processing personal data, and by reviewing the current regulatory status for the lawful processing of personal data, it presented a foundation for discussing institutional improvement measures.
- Beginning with a review of the legal issues of the Framework Act on Artificial Intelligence and its Enforcement Decree, which are scheduled to take effect on January 22, 2026, it evaluates whether the Personal Information Protection Act properly reflects the characteristics of artificial intelligence technologies and what problems exist.
- It reviews institutional improvement discussions to secure consistency between the artificial intelligence regulatory system and the Personal Information Protection Act by analyzing the demands for transparency and accountability under the current Personal Information Protection Act, the guarantee of the rights of data subjects, and the principle-based regulatory system.
○ Next, it reviewed and presented directions for institutional improvement as a means of enhancing the legality of personal data processing for the development of artificial intelligence.
- It sought improvement measures for Korea’s personal data regulatory legal system through comparative legal discussions centered on legislative examples of major foreign jurisdictions (the EU, the United States, Japan, Singapore, etc.).
- It closely analyzed the legislative background, purposes, and detailed provisions of recently submitted bills on special provisions for personal data regulation for the development of artificial intelligence, and reviewed the expected legal and institutional issues and institutional preconditions upon the introduction of each bill.
 
Ⅱ. Contents
▶ Issue 1: Legal Problems of Personal Data Processing in the Development of Artificial Intelligence
○ A review was conducted of the current domestic guidelines related to the development of artificial intelligence technologies.
  - The Personal Information Protection Commission’s Guide on the Processing of Publicly Available Personal Information (July 2024) presented that where publicly available personal information is used for AI training, Article 15(1)6 of the Personal Information Protection Act (legitimate interests) may be used as the legal basis, and emphasized that the legitimate interests of the personal information controller (AI developer, service provider) must clearly outweigh the rights of the data subject, and that measures to ensure safety and ways to guarantee rights must be established.
  - The Personal Information Protection Commission’s Guide on Personal Information Processing for the Development and Use of Generative AI (August 2025), in response to personal information protection issues arising from the spread of generative AI, recommends safety measures at the data level, model level, and system level in the stages of development and training; at the data level, it recommends verification of training data sources, deletion and pseudonymization/anonymization of unique identifying information and sensitive information, and the introduction of privacy-enhancing technologies (PETs); and at the model/system level, it recommends supplementation of risks through model fine-tuning, minimization of memorization and disclosure of personal data, access control to the operating environment, input-output filtering, and the introduction of functions for the automatic detection and blocking of personal data in prompts/results.
  - The National Human Rights Commission’s Human Rights Guidelines (April 2022), intended to prevent human rights violations and discrimination caused by the development and use of AI, are composed of six chapters: ① respect for human dignity, ② transparency and the duty to explain, ③ guarantee of self-determination, ④ prohibition of discrimination, ⑤ implementation of human rights impact assessments, and ⑥ establishment of risk grades and laws/systems. With respect to transparency and the duty to explain, they provide that the judgment process and results of artificial intelligence technology should be explained reasonably, that artificial intelligence used by public institutions should be explainable, that fully automated decision-making having a significant effect on an individual’s fundamental rights should be restricted, and that the parties should be guaranteed a right to refuse or a right to request human intervention. With respect to self-determination, they emphasize that data subjects have the right to identify and participate in the processing of their own information and that data should be processed accurately and completely within the minimum scope necessary for the purpose; with respect to the prohibition of discrimination, they emphasize that data inspection, adjustment, and periodic monitoring are needed throughout the AI lifecycle in order to exclude biased outcomes.
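The system-level safeguards recommended in the generative-AI guide above (input-output filtering, automatic detection and masking of personal data in prompts and results) can be illustrated with a minimal sketch. This is a hypothetical illustration, not the Commission’s specification: the two detection patterns (a Korean resident registration number format and a generic e-mail address) and the masking tokens are assumptions chosen for the example, and a production filter would need far more robust detection.

```python
import re

# Illustrative patterns only: a Korean resident registration number
# (6 digits, hyphen, 7 digits) and a generic e-mail address.
PII_PATTERNS = {
    "rrn": re.compile(r"\b\d{6}-\d{7}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def mask_pii(text: str) -> str:
    """Replace detected personal identifiers with a masking token."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label.upper()} REDACTED]", text)
    return text

def filter_prompt(prompt: str) -> str:
    """Input-side filter: mask personal data before it reaches the model."""
    return mask_pii(prompt)

def filter_output(completion: str) -> str:
    """Output-side filter: mask personal data before it reaches the user."""
    return mask_pii(completion)
```

The same masking routine is applied on both sides of the model, reflecting the guide’s recommendation to control personal data in prompts as well as in generated results.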
○ Regarding the relationship between artificial intelligence law and data law, because AI regulation is based on the collection, processing, and use of data, it has an inseparable relationship with data law, and therefore the European Union (EU) regulates it in an integrated manner through the AI Act, GDPR, Data Act, etc.
- (Double regulation and conflict) Where personal data are included in generative AI training, there may be a conflict between the lawful processing basis under the Personal Information Protection Act and the transparency requirements under AI law.
- (Gap in scope of application) The Personal Information Protection Act is limited to “personal data,” whereas the Framework Act on Artificial Intelligence encompasses non-personal data and industrial data as well, thereby creating a difference in regulatory scope.
- (Conflict of authority among supervisory agencies) There is a possibility of competition over jurisdiction and supervisory authority between AI regulatory agencies and personal information protection regulatory agencies.
- (Principle of prohibition of use beyond purpose) It is pointed out that where use as training data for artificial intelligence deviates from the original purpose of the data subject’s consent, separate consent or a legal basis is required.
- In order to resolve these problems, there is a need to establish integrated guidelines, secure consistency in the legal system, and strengthen the cooperative system among law enforcement agencies (joint regulation/supervisory model).
- In order to prevent confusion over the order of application between AI law and the Personal Information Protection Act, it is desirable to clearly provide that, in matters involving personal data, the Personal Information Protection Act shall apply preferentially.
○ Regarding the risks that the development of artificial intelligence poses to personal information protection, in particular, generative artificial intelligence based on large language models (LLMs) has the possibility of infringing personal information.
- (Personal identification and inference) In the course of training on publicly available unstructured text, there is a risk of reinforcing discrimination and prejudice because personal characteristics of the author (gender, race, etc.) may be accurately inferred.
- (Opacity and information exposure) Opaque practices continue as companies do not disclose algorithms on the grounds of “trade secrets,” and there is a risk that artificial intelligence may remember and expose personal data over the long term.
- (Hallucination) Artificial intelligence may provide false or manipulated personal information as if it were true, and where errors occur in human-rights-sensitive fields, controversies may arise concerning vulnerabilities in information security and the guarantee of the data subject’s right to request correction/deletion (right to be forgotten).
- For trustworthy artificial intelligence, personal information protection is a core element, and there is a need to establish and continuously improve legal and institutional devices in order to strike a reasonable balance between innovation in artificial intelligence technology and the guarantee of the data subject’s right to informational self-determination.
▶ Issue 2: Comparative Legal Review
○ Analysis of the EU legal framework
- The two most important normative systems related to artificial intelligence in the EU are the EU AI Act and the GDPR.
 - The EU AI Act and the GDPR coexist as normative systems and do not mutually exclude each other in regulating artificial intelligence.
 - Rather, Article 2(7) of the EU AI Act expressly states that the regulation concerning personal data and privacy, etc. prescribed at the European Union level is not excluded by the EU AI Act, so even if the EU AI Act applies with respect to artificial intelligence, where issues related to personal data or privacy exist in relation to artificial intelligence, the GDPR may apply.
 - There is an ambiguous aspect as to whether the legal regulation contained in the GDPR provides clear normative guidance with respect to the use of artificial intelligence and the development of artificial intelligence technology taking place in the process leading to its realization.
 - Discussions exist concerning an approach to securing the legality of personal data processing carried out in the course of the development and use of artificial intelligence through one of the lawful bases for personal data processing prescribed in the items of Article 6(1) of the GDPR, namely the legitimate interests of the personal data controller or of a third party.
 - However, relying on Article 6(1)(f) of the GDPR to secure the legality of personal data processing in AI development may be regarded as an alternative that fails to resolve the ambiguity of the GDPR itself, and the resulting ambiguity as to whether the GDPR has been violated, because it does not provide a concrete and clear standard for judging legality or illegality.
 - Meanwhile, the EU AI Act provides a sandbox system as a special mechanism related to the development of artificial intelligence technology involving the processing of personal data.
 - In the case of Article 59(1) of the EU AI Act, it is provided that personal data lawfully collected for another purpose within an AI regulatory sandbox may be processed within the sandbox only for the purpose of the development, training, and testing of a specific AI system.
 - It is pointed out that the exception for personal data processing through the AI sandbox is set very narrowly, and because it is necessary to continuously confirm whether the original personal data sought to be used beyond the purpose were lawfully collected, it is practically impossible to develop a general-purpose AI system in an AI sandbox.
 - Meanwhile, under Article 60(1) and (2) of the EU AI Act, with regard to the testing of high-risk AI in real-world conditions outside the AI regulatory sandbox, the provider of a high-risk AI system, or a person seeking to provide it, may conduct testing pursuant to a testing plan before placing it on the market or putting it into service, on its own or together with a deployer or a person seeking deployment.
 - In the case of France, even before the EU AI Act entered into force, it had already been operating AI-related sandboxes to some extent, aiming both to support new legal and technical issues and to provide professional advice, and in particular, such professional advice was mainly focused on legal advice concerning compliance with the GDPR.
 - In the case of Germany, efforts are being made to establish sandbox-related legal systems in relation to the EU AI Act.
○ Analysis of the U.S. legal framework
- In the United States, regulation through laws at the federal level is very restrained with respect to both artificial intelligence and personal information protection.
- With respect to personal information protection, laws are increasingly being enacted at the state level, but these too are being made from the perspective of protecting consumer rights.
- However, in the United States, it has been pointed out that big tech companies have continuously collected training data indiscriminately for the development of AI technologies and services, and together with the problem of infringing others’ copyrights, the problem of infringing personal information has also become an issue.
- Even though there is no general law on personal information protection at the federal level, there are laws such as COPPA for protecting children’s personal information.
- In individual cases, the FTC has issued orders to businesses to stop using such personal information for the purpose of product improvement and to delete it on the grounds of COPPA violations.
- Nevertheless, in the case of the United States, the scope of personal data processing for AI development is still not clearly defined legislatively. Since there is no strong legislative movement toward balancing the two values of technological development and industrial growth on the one hand and the protection of data subjects’ rights on the other, the implications for Korea in terms of legislative improvement do not appear to be very high.
○ Analysis of the Japanese legal framework
- Japan had been operating a regulatory system based on guidelines, but there existed a social atmosphere that it was difficult to ensure the safe use of artificial intelligence with only guidelines and grounds provisions under individual laws, and accordingly the Act on the Promotion of Research, Development and Utilization of Artificial Intelligence-Related Technologies was enacted.
- However, the Act on the Promotion of Research, Development and Utilization of Artificial Intelligence-Related Technologies does not contain any special provisions on personal information in relation to AI technology development.
- Looking at its Alert Regarding the Use of Generative Artificial Intelligence, the Japanese Personal Information Protection Commission basically takes the stance of recommending compliance with the basic obligations prescribed in the Japanese Personal Information Protection Act, and urges caution because, for users of generative AI who process personal information, the use of generative AI may give rise to violations of various provisions of that Act.
- Meanwhile, the Japanese Personal Information Protection Commission’s document entitled On the Approach to Institutional Issues under the Personal Information Protection Act (Draft) contains the need to ease the level of regulation contained in the existing Japanese Personal Information Protection Act, while at the same time also reviewing matters for which regulation needs to be strengthened in relation to personal information protection.
○ Analysis of the Singapore legal framework
- In Singapore, there is no general law on artificial intelligence, and the policy direction is mainly to balance the development of technology and industry against the safety, reliability, and ethics of artificial intelligence through soft law.
- However, although it does not directly regulate artificial intelligence, legislation in individual sectors may operate as regulation on products or services to which AI systems are applied, and at times may also exercise great influence over the overall development and use of artificial intelligence technology.
- Singapore’s PDPA, enacted in 2012 and amended in 2020, in principle requires organizations corresponding to personal data controllers to obtain the consent of data subjects where they intend to collect or use personal data.
- Attention should be paid to the fact that, under section 17 of the PDPA and its Schedule provisions, personal data may in certain cases be collected and used without obtaining the consent of the data subject.
- Personal data may be collected and used without the consent of the data subject in situations such as a person’s vital interests, matters affecting the public, legitimate interests, business asset transactions, etc., and in particular, because special treatment is recognized for business improvement and research purposes, this offers great implications for Korea’s legal system at the point where companies developing AI systems may become beneficiaries.
- Singapore’s PDPC and IMDA have also established recommendation guidelines so that AI development businesses, etc. can easily interpret and apply the relevant provisions.
- The exception for business improvement may become a provision that can be applied very broadly in that it may apply where a company develops an AI system for improving its existing products or services.
- The exception for research might seem limited in scope, since it applies to the development of AI systems in the public interest; however, because such public-purpose AI systems do not exclude cases of a commercial nature, and research likewise includes commercial research, it too may be applied broadly.
- However, even where such exception provisions exist, they are not permitted where the objective of AI system development can be achieved through anonymous information or pseudonymized information, or where there is a possibility of infringing the rights of data subjects.
- In addition, organizations that become beneficiaries of the exception need appropriate control measures for the safe protection of the relevant personal information, and at that time, various factors such as the type of risk of exposure, sensitivity, volume, and personnel with access to the personal information must be taken into account.
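The distinction drawn above between anonymized or pseudonymized information and original personal data can be made concrete with a small sketch of one common pseudonymization technique: keyed hashing (HMAC) of direct identifiers. This is a generic illustration, not a technique prescribed by the PDPA or the PDPC’s guidelines; the field names and key handling are assumptions for the example.

```python
import hmac
import hashlib

# The secret key must be stored separately from the data set; holding
# both together would make re-identification trivial.
SECRET_KEY = b"replace-with-a-key-kept-under-separate-control"

def pseudonymize(identifier: str) -> str:
    """Replace a direct identifier with a keyed-hash pseudonym (HMAC-SHA256).

    Unlike anonymization, pseudonymization remains linkable by the key
    holder, which is why pseudonymized data are still treated as personal
    data under regimes such as the GDPR.
    """
    return hmac.new(SECRET_KEY, identifier.encode("utf-8"),
                    hashlib.sha256).hexdigest()

# Hypothetical training record: the direct identifier is replaced,
# while non-identifying attributes are retained for model training.
record = {"name": "Hong Gildong", "age_band": "40s"}
record["name"] = pseudonymize(record["name"])
```

Because the same identifier always maps to the same pseudonym, records about one person can still be joined across data sets by the key holder, which is precisely the property that distinguishes pseudonymization from anonymization.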
▶ Issue 3: Analysis of Bills on Special Provisions for Personal Information in AI Development
○ In order not to fall behind amid the great transformation of artificial intelligence, efforts are being made to invigorate the data ecosystem, and representative examples include proposals for amendments to the Personal Information Protection Act to introduce special provisions for personal data processing in AI development (bills proposed by Representative Min Byung-deok, bills proposed by Representative Koh Dong-jin, etc.).
○ The common point of the amendment bills is that, for AI technology development and performance improvement, where personal data already lawfully collected are subjected to technical, managerial, and physical measures under the management and supervision of the Personal Information Protection Commission, they seek to establish a new legal basis enabling original personal data to be used beyond the original purpose without the separate consent of the data subject.
- In effect, this would guarantee, as a kind of legal basis, use substituting for consent through deliberation and resolution by the Personal Information Protection Commission.
- This special provision is an extraordinary exception that would allow the lawful use of original-form personal data, even without going through the alternative means of the data subject’s consent or pseudonymization, with respect to use beyond the original collection purpose of personal data, which is strictly restricted under the current Personal Information Protection Act.
- The legislative background of this special provision lies, above all, in the urgency of not falling behind in the hegemonic competition surrounding artificial intelligence, and at the same time it can be said to be the establishment of a new legal structure for seeking a harmonious coexistence of the protection and safe use of personal data even under the changed technological environment.
- The Koh Dong-jin bill added a basis for simplifying the deliberation procedure by providing a provision that “where there has been a deliberation/resolution by the Personal Information Protection Commission that is identical or similar, the deliberation/resolution procedure may be simplified.”
○ In addition, as a separate regulation on training data in AI development, a partial amendment bill to the Framework Act on the Promotion of Data Industry and Data Use was also proposed (sponsored by Representative Kim Tae-nyeon).
- It was proposed to promote transactions of AI training data and support the growth of new data-based industries by supplementing the data valuation system so that it is specialized for AI training data, preparing separate standard transaction contracts reflecting the characteristics of training data premised on repeated use and large-scale processing, and institutionally improving anonymization/de-identification standards and quality certification standards for the protection of personal data.
▶ Issue 4: Considerations in Introducing Bills on Special Provisions for Personal Data Processing in AI Development 
○ In order for the special provisions to apply, substantive requirements and procedural requirements must be satisfied.
 - As substantive requirements, it is necessary to satisfy ① difficulty of developing AI technology through anonymization or pseudonymization of personal data, ② the establishment of strengthened safety devices such as a space with technical, managerial, and physical measures, and ③ inclusion of the purpose of promoting public or social interest, combined with a significantly low concern of unjust infringement of the interests of data subjects or third parties.
 - As procedural requirements, exceptions are recognized through prior deliberation/resolution by the Commission as to whether the requirements are satisfied (and conditions may be attached where necessary).
○ Various issues regarding the special provisions should be discussed, such as controversy over excessive restrictions on the data subject’s right to informational self-determination, problems of consistency within the overall structure of the Personal Information Protection Act, and the ambiguity of public interest or social interest.
- Since the possibility of using AI training data has already been opened through pseudonymized information, questions arise as to whether it is necessary to establish a dual system that permits the use of original personal data without pseudonymization, and whether prescribing a broad exception for the purpose of AI development may end up rendering the entire regulatory system ineffective.
- In addition, although the requirements of “promotion of public interest” and “promotion of social interest” are purpose-limiting devices established to guard against indiscriminate expansion of the special provisions, doubts are raised as to whether judgments on publicness can really be made concretely.
- On the other hand, a strong counterargument is also raised that such publicness requirements may greatly reduce the usability of the special provisions in AI development; because the inclusion of “promotion of social interest” in the requirements is intended to exclude cases confined only to the pursuit of a company’s simple private interest, even where there is a profit-making purpose, it should be judged whether social interests may be brought about by comprehensively considering the social ripple effects that AI technology development may generate.
- The requirement that “it is difficult to develop artificial intelligence technology if processed anonymously or pseudonymously” is, in reality, very difficult for most personal data controllers to satisfy.
○ There is an urgent need to strengthen infrastructure, such as reinforcing the organization in accordance with the increased role of the Personal Information Protection Commission.
 - Deliberation and resolution on the special provisions will be decided through the plenary session of the Personal Information Protection Commission, but prior analysis thereof and evaluation of safety measures are essentially required.
 - If the usability of the special provisions increases, the number of deliberation cases will also expand, and above all, it is also absolutely necessary to strengthen review personnel and organizational capacity for professional review reflecting the characteristics of artificial intelligence in each field.
- Even after recognition of the special provisions, an organization for ex post management will also be necessary, and it is necessary to strengthen a management system and organization specialized for AI training data.
○ Since a large amount of original data is used, the parallel introduction of privacy-protecting technologies is required.
- It is important to reduce the exposure of personal data by integrating new technologies such as privacy-preserving machine learning (PPML), differential privacy, and federated learning.
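As a hedged illustration of one of the technologies named above, the sketch below applies the classical Laplace mechanism of differential privacy to a simple counting query over training records. The function names and parameters are choices made for this example, not part of any statute or guideline; real deployments would also track a privacy budget across repeated queries.

```python
import math
import random

def laplace_noise(scale: float) -> float:
    """Sample Laplace(0, scale) noise by inverse transform sampling."""
    u = random.random() - 0.5  # uniform on [-0.5, 0.5)
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_count(records, predicate, epsilon: float) -> float:
    """Epsilon-differentially-private count.

    A counting query changes by at most 1 when a single record is added
    or removed (sensitivity 1), so Laplace noise with scale 1/epsilon
    suffices for epsilon-differential privacy.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)

# Example: a noisy count over a hypothetical training set of ages.
ages = [21, 35, 44, 52, 61]
noisy = dp_count(ages, lambda a: a >= 40, epsilon=1.0)
```

Smaller values of epsilon add more noise and thus stronger protection; the design trade-off is exactly the protection-versus-utility balance discussed throughout this report.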
○ The clause requiring disclosure of the contents of deliberation was introduced for transparency, but from the perspective of companies, concerns may be raised that sensitive AI development strategies or details of data use may be exposed.
▶ Issue 5: Considerations in Introducing a Basis under the Data Industry Act for Establishing a Specialized Framework for AI Training Data
○ It is inevitable that personal data are included in AI training data, and it is necessary to properly link anonymization standards under the Data Industry Act with the pseudonymized information system under the Personal Information Protection Act.
- An essential collaborative system is needed so that the various guidelines issued by the Personal Information Protection Commission integrally reflect the guidelines of the relevant fields.
○ It is necessary to clearly define the relationship among the Framework Act on Artificial Intelligence, the Data Industry Act, and the Personal Information Protection Act, so that the concept and scope of AI training data are prescribed in a unified manner.
- In particular, it is necessary either to cite the definitional provisions of the Framework Act on Artificial Intelligence or to newly establish a definition of “AI training data” within the Data Industry Act.
- The current Data Industry Act requires consultation with the Fair Trade Commission and the hearing of opinions from interested parties and experts when preparing standard data contracts, and such matters also need to be reflected in the case of standard contracts for transactions of AI training data. 
 
Ⅲ. Expected Effects
○ This study emphasizes the need for a shift in perception that, although the Personal Information Protection Act, at the time of its enactment in 2011, started from the principal value of information protection to protect the rights of data subjects, in recent years, with deepened digitalization across all sectors of the state and society and the rapid development of artificial intelligence technologies, personal data should also be appropriately utilized, and that merely blocking the use of personal data in advance is not always the best course.
- As artificial intelligence technologies are recognized as strategic core assets of the state, there is also a growing consensus that the value of building a smooth data-utilization ecosystem, focusing on the point that data are the most important foundation for the development of artificial intelligence, is by no means lighter than the value that data subjects should individually possess disposition authority over personal data.
○ In addition, this study confirms that it is now time to move the discussion forward beyond simply finding a balance point between the protection and appropriate use of personal data, toward the creation and maximization of public value in our society through the use of data, including personal data.
- It is necessary not to view personal data only as if they were exclusive ownership, but also to pay attention to their character as community assets in modern society entering an era of deep digitalization.
- It is a task of the times to design the system appropriately so that, while respecting the data subject’s rights of management and control, it does not at the very least become an inert obstacle to increasing the public or social interest of our society as a whole through the use of personal data.
- Considering that, for the competitiveness of artificial intelligence, the use of high-value data such as personal data and copyrighted works is the most crucial foundation, there is an urgent need to seek an alternative institutional system to the rigid legality framework centered on prior consent, and herein lies the significance of special provisions on personal data processing in AI development.
○ Meanwhile, this study emphasizes that, even if the introduction of innovative exception procedures for AI training data is allowed, detailed safety devices must be established so that the rights of data subjects are not hollowed out through this.
- This is because the trust in artificial intelligence and the acceptability of our community will ultimately be measured by the extent to which the rights of data subjects are not marginalized but rather substantively realized.
- Beyond the traditional dichotomy of protection and use, the perspective of the public use of data and the effective guarantee of rights, as major assets of our society in the era of deep digitalization, should be projected across our legal system as a whole. This study may find significance as a reference in composing the contents of subordinate legislation and guidelines, in subsequent follow-up legislation, and in the progressive refinement of regulation under the Framework Act on Artificial Intelligence.