Analysis of AIGC R&D and Data Privacy Compliance Obligations

1. ChatGPT Data Privacy Compliance Issues Spark Global Regulatory Attention

Since the “sudden emergence” of ChatGPT, AIGC (AI-Generated Content, i.e., generative artificial intelligence) has drawn close attention from regulators worldwide. The chair of the U.S. Federal Trade Commission (FTC) stated that generative AI will be “highly disruptive” and that the FTC will regulate the area strictly.[1] In the realm of data privacy compliance, the Italian data protection authority (Garante per la Protezione dei Dati Personali, hereinafter GPDP) fired the first shot of national-level regulation: on March 31, 2023, GPDP announced a complete ban on ChatGPT and prohibited OpenAI from processing the data of Italian users. Data protection authorities in several other countries then announced investigations of their own. On April 3, Germany indicated it was considering a ban on ChatGPT; France and Ireland took steps that included discussing enforcement with Italy, while Spain asked the European Data Protection Board (EDPB) to assess ChatGPT’s privacy issues. On April 4, the chairman of South Korea’s Personal Information Protection Commission said it was investigating breaches of Korean users’ ChatGPT data, and Canada announced an investigation into OpenAI over data security the same day. On April 13, the EDPB decided to set up a dedicated task force on ChatGPT. Although ChatGPT does not provide services in mainland China or Hong Kong, the Cyberspace Administration of China likewise released the draft “Measures for the Management of Generative Artificial Intelligence Services” for public comment on April 11, addressing the domestic R&D and application of AIGC.

2. From Italy’s Ban on ChatGPT to the Resumption of Service

In firing the first shot of national-level regulation, GPDP used a sequence of measures (temporary restriction, complete ban, investigation of violations, rectification deadlines, and finally lifting of the ban) to push OpenAI toward meeting its data privacy compliance requirements, a process worth studying. The timeline from Italy’s ban on ChatGPT to the resumption of service is as follows:

On March 30, GPDP issued a decision finding that ChatGPT violated personal data protection rules and imposed a temporary limitation on ChatGPT’s data processing activities.[2]

On March 31, GPDP announced a complete ban on ChatGPT, barring OpenAI from processing Italian users’ information and opening an investigation into OpenAI’s unlawful collection of user data. GPDP also noted that a March 20 data breach involving ChatGPT had exposed users’ conversation data and payment information, and that user data was being collected and processed without fulfilling the obligation to inform users. OpenAI has no establishment in the EU but has designated a representative in the European Economic Area; it was required to report within 20 days on the measures taken to comply with GPDP’s demands or face fines of up to €20 million or up to 4% of global annual turnover.[3]

On April 6, GPDP stated that OpenAI had expressed willingness to cooperate toward a positive resolution. On the evening of April 5, OpenAI met with GPDP; the meeting was attended by OpenAI’s CEO, board members, two deputy general counsels, and the head of public policy. OpenAI committed to delivering a document to GPDP by April 6 outlining the measures it would take, including strengthening the mechanisms for exercising data subject rights, increasing transparency about the use of personal data, and implementing protections for minors, and it requested that the temporary processing restriction be lifted. GPDP emphasized that it had no intention of hindering the development of artificial intelligence and technological innovation, but aimed to protect the personal data of Italian and European citizens, and it reiterated the importance of respecting data protection rules.[4]

On April 12, GPDP issued a series of demands to OpenAI on data privacy and security, including publicly disclosing ChatGPT’s data processing logic, screening users by age, and clearly informing data subjects of their rights. GPDP stated that if OpenAI implemented these measures by April 30, the temporary restriction would be lifted and ChatGPT could relaunch in Italy.[5]

On April 28, OpenAI wrote to GPDP describing the measures it had implemented to comply with GPDP’s orders, including extending privacy notices to European users and non-users, revising and clarifying several mechanisms, and providing users and non-users with accessible, user-friendly ways to exercise their rights. On the strength of these improvements, ChatGPT resumed service in Italy.[6]

GPDP acknowledged the steps OpenAI took to reconcile technological progress with respect for individual rights, and expressed the hope that the company will continue to comply with European data protection law and fulfill the remaining requirements, in particular implementing age verification measures and planning and conducting a publicity campaign to inform the Italian public of what happened and of their right to object to the processing of their personal data for algorithm training. GPDP will continue its investigation of OpenAI with the support of the dedicated task force established by the EDPB.

3. Analysis of ChatGPT’s Violations, Corresponding Compliance Rectification Orders, and Implementation Status

Here is a summary of the series of decisions, violations, enforcement bases, rectification orders issued by GPDP during the investigation process, and the rectification actions taken by ChatGPT:

[Table omitted: GPDP’s decisions, the violations found, the enforcement bases, the rectification orders, and OpenAI’s corresponding rectification actions.]

ChatGPT’s phased rectification has been recognized by GPDP, but some longer-term requirements remain to be fulfilled. As for the March 20 data breach, GPDP has issued no decision on it; it cannot be ruled out that the investigation will conclude the breach did not involve Italian users, or the investigation may simply still be ongoing.

4. Identifying Data Privacy Compliance Obligations in AIGC R&D and Application, and How to Implement Them

New technologies emerge and iterate rapidly, while the formulation and implementation of law inevitably lag behind. How can those developing and applying AIGC quickly and accurately identify their privacy compliance obligations amid highly uncertain legislation and enforcement and frequent, vague rule changes? Italy’s regulatory demands and the series of compliance steps OpenAI took in response illustrate the obligations that can already be identified with reasonable certainty. In summary, when an enterprise develops and uses AIGC to provide services to the public, it should fulfill the following privacy compliance obligations:

(1) The Core Business of AIGC: Ensuring a Lawful Basis for Processing Personal Data for Algorithm Training

After being ordered to stop relying on contract performance as the legal basis for using personal data in algorithm training, OpenAI switched the basis to legitimate interests rather than user consent, a choice GPDP has since accepted; this offers a reference point for securing the lawfulness of AIGC’s core business. In countries and regions whose law does not recognize legitimate interests as a legal basis, however, obtaining user consent remains the more solid option.

Specifically, to ensure the lawfulness of training data sources, those developing and using AIGC should publish a separate notice covering the processing of personal data for algorithm training, publicly disclosing what data the AIGC processes and the logic by which it is processed. This notice should stand apart from the privacy policy and the terms of service/user agreement, so that users are fully informed. Three suggestions on the choice of lawful basis follow:

    1. If the training data contains personal information, consent should be obtained from the data subjects. Where consent is the legal basis, the product should be designed so that users can read the separate notice in full before clicking to consent, and users must be able to withdraw consent at any time, similar to the separate-consent requirements of China’s Personal Information Protection Law;

    2. In countries/regions where legitimate interests is an available legal basis, choosing it requires carrying out a balancing test and guaranteeing users the right to object to the processing. One concrete approach is OpenAI’s online application form, which filters conversations associated with the applicant out of the data used for algorithm training (a minimal sketch of such a filter follows this list);

    3. In line with GPDP’s rectification order, contract performance is not recommended as the legal basis, since regulators do not appear to accept it as solid.
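To make suggestion 2 concrete, here is a minimal sketch in Python of excluding opted-out users’ conversations from a training corpus. Everything in it (the record fields, the registry class) is a hypothetical illustration; OpenAI’s actual opt-out pipeline is not public.

```python
from dataclasses import dataclass


@dataclass
class ConversationRecord:
    conversation_id: str
    user_id: str
    text: str


class OptOutRegistry:
    """Tracks users whose objection to training use (e.g. via an online form) was accepted."""

    def __init__(self) -> None:
        self._opted_out: set[str] = set()

    def register(self, user_id: str) -> None:
        # Called once a data subject's opt-out request has been verified and approved.
        self._opted_out.add(user_id)

    def is_opted_out(self, user_id: str) -> bool:
        return user_id in self._opted_out


def filter_training_corpus(
    records: list[ConversationRecord], registry: OptOutRegistry
) -> list[ConversationRecord]:
    """Drop every conversation belonging to an opted-out user before training begins."""
    return [r for r in records if not registry.is_opted_out(r.user_id)]
```

In a real pipeline the registry would be persistent, and the filter would run before each training snapshot is frozen, so that objections take effect for all subsequent training rounds.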

(2) Fulfillment of Transparency Obligations

Privacy policies, terms of service/user agreements, and related notices must be easy to find; they must tell users, in plain and concise language, what personal data will be collected and how it will be used; and they should be readable by users and non-users (such as website visitors) before registration.

When the privacy policy changes substantially, users should be prompted and given the chance to re-read it; for registered users, a pop-up reading prompt should appear before they next use the service.
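As one possible implementation of the re-reading prompt, the service can record which policy version each user last accepted and compare it against the current version at login. A minimal sketch, with hypothetical names and an in-memory map standing in for a real database:

```python
CURRENT_POLICY_VERSION = "2023-04-28"  # illustrative version identifier

# Maps user IDs to the policy version they last accepted (a real system would persist this).
accepted_versions: dict[str, str] = {}


def must_reaccept_policy(user_id: str) -> bool:
    """True if the user has never accepted the policy, or accepted an older version."""
    return accepted_versions.get(user_id) != CURRENT_POLICY_VERSION


def record_acceptance(user_id: str) -> None:
    # Call only after the user has actually been shown and confirmed the updated policy.
    accepted_versions[user_id] = CURRENT_POLICY_VERSION


# At login: if must_reaccept_policy(user_id) is True, block service use behind the
# policy pop-up, then call record_acceptance(user_id) once the user confirms.
```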

For AIGC, a technology with significant public impact and many unknown risks, disclosure need not stop at the service interface: mainstream media can also be used to publicize how data is processed, so that the general public knows their personal data may be used for algorithm training.

(3) Guaranteeing the Principle of Accuracy

Those developing and using generative AI should take measures to prevent the generation of false information and ensure that generated content is true and accurate. Users should have the right, and an easy means, to correct inaccurate personal data, or to have it deleted where correction is technically impossible (and that technical impossibility should itself be truthfully communicated to users), for example through an online application or an in-product option to delete the data directly.
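The correct-or-delete fallback described above can be sketched as follows. The store, the exception, and the trigger for “technically impossible to correct” are hypothetical placeholders for whatever a real system exposes:

```python
from enum import Enum, auto


class Outcome(Enum):
    CORRECTED = auto()
    DELETED = auto()


class UncorrectableError(Exception):
    """Raised when a stored item cannot technically be corrected in place."""


class InMemoryStore:
    def __init__(self) -> None:
        self.records: dict[str, str] = {}

    def correct(self, record_id: str, value: str) -> None:
        # Illustrative trigger: correction fails when the record cannot be located.
        if record_id not in self.records:
            raise UncorrectableError(record_id)
        self.records[record_id] = value

    def delete(self, record_id: str) -> None:
        self.records.pop(record_id, None)


def handle_rectification(store: InMemoryStore, record_id: str, corrected: str) -> Outcome:
    """Correct the record if possible; otherwise delete it, mirroring the obligation above."""
    try:
        store.correct(record_id, corrected)
        return Outcome.CORRECTED
    except UncorrectableError:
        # Correction is technically impossible here, so delete instead; the user
        # should also be told truthfully why correction could not be performed.
        store.delete(record_id)
        return Outcome.DELETED
```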

(4) Protection of Data Subject Rights and Their Forms

Users should be informed, in a prominent location on the product interface, of their data rights, the channels for receiving and handling complaints, and the corresponding response mechanisms. Enterprises should also ensure that requests to exercise rights such as correction, deletion, and restriction of processing are handled in a timely manner, as sketched below.
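As a sketch of timely handling, each incoming request can carry a response deadline against which overdue items are escalated. The 30-day window below mirrors the GDPR’s one-month response period and is an assumption, not a universal rule:

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from enum import Enum, auto


class Right(Enum):
    CORRECTION = auto()
    DELETION = auto()
    RESTRICTION = auto()


@dataclass
class SubjectRequest:
    user_id: str
    right: Right
    received_at: datetime
    due_at: datetime = field(init=False)

    def __post_init__(self) -> None:
        # Assumed 30-day response window, cf. the GDPR's one-month period.
        self.due_at = self.received_at + timedelta(days=30)


def overdue(requests: list[SubjectRequest], now: datetime) -> list[SubjectRequest]:
    """Requests whose response deadline has passed and which should be escalated."""
    return [r for r in requests if now > r.due_at]
```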

(5) Protection of Minors

Measures to protect minors should be taken, such as establishing age verification systems, screening out underage users, or ensuring that minors use the service only with parental consent (the definition of a minor varies by country and should follow the special provisions of local law).[7]

In terms of specific implementation forms, the following can be referenced:

Privacy policies, terms of service/user agreements, and related notices should state clearly that the service is for adults only and does not collect children’s data, and a birth-date confirmation step should be inserted into the registration form, requiring users to confirm before registering that they are over 18, or over 13 and have the consent of a parent or guardian. As noted above, OpenAI’s measures for protecting minors currently go only this far; it must still implement an age determination and verification mechanism by September 30, 2023. For such a mechanism, the following product interaction designs can be adopted during account registration:

    1. Verification through confirmation of the registered user’s phone number, such as through SMS verification codes;

    2. Verification through third-party payment channels on the client side, such as credit cards;

    3. Verification through client-side real-name authentication, facial recognition, voice recognition, etc.;

    4. Verification through client-side intelligence tests, such as question banks demanding a certain level of knowledge (commonly somewhat complex arithmetic) to make a rough judgment of the user’s age, with restrictions placed on accounts that fail verification.

When adopting the age determination mechanisms above, enterprises should assess the necessity and effectiveness of each verification method, and be especially cautious with methods that collect sensitive personal data, such as real-name authentication, facial recognition, and voice recognition. A minimal sketch of the basic birth-date gate follows.
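The birth-date gate described earlier (18+, or 13+ with guardian consent) might look like the sketch below; the thresholds are illustrative only and, as footnote [7] notes, must follow local law:

```python
from datetime import date

ADULT_AGE = 18          # illustrative threshold; set per local law
CONSENT_FLOOR_AGE = 13  # illustrative minimum age with guardian consent


def age_on(birth_date: date, today: date) -> int:
    """Whole years of age on the given day."""
    years = today.year - birth_date.year
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def may_register(birth_date: date, has_guardian_consent: bool, today: date | None = None) -> bool:
    """Allow adults, or 13-to-17-year-olds who declare guardian consent."""
    age = age_on(birth_date, today or date.today())
    if age >= ADULT_AGE:
        return True
    return age >= CONSENT_FLOOR_AGE and has_guardian_consent
```

A self-declared birth date is, of course, only a first line of defense, which is why GPDP is demanding the stronger verification mechanisms listed above.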

In addition, there are obligations that must be observed, such as not profiling users based on their inputs and usage, not generating discriminatory content based on users’ race, nationality, gender, and the like, not providing user inputs to third parties, and not unlawfully retaining inputs from which a user’s identity can be inferred. Some countries and regions may also impose real-name authentication and other privacy-related obligations. This list of compliance obligations is likely to keep expanding as AIGC evolves and legal requirements are updated; how each obligation should be implemented in practice still needs interpretation by data privacy lawyers and engineers, and awaits working examples from ChatGPT and other AIGC pioneers and from regulators.

Furthermore, OpenAI’s handling of this episode also illustrates a response strategy for multinational enterprises facing local data privacy regulation: a general compliance baseline plus localized adjustments; that is, identifying and meeting generally applicable legal obligations while separately assessing special local legal and regulatory requirements and their applicability. TikTok, for instance, does not localize data storage globally, but in response to strict data privacy regulation in the U.S. and Europe it launched Project Texas (localized storage of U.S. user data) and Project Clover (localized storage of European user data). As ChatGPT itself answered when asked, “What differences do you currently have regarding data privacy for Italian users versus users from other countries?”:

[Screenshot omitted: ChatGPT’s response.]
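The “general baseline plus localized adjustments” strategy can be expressed as configuration: a shared default policy with per-jurisdiction overrides. The keys and values below are illustrative assumptions, not the actual settings of OpenAI or TikTok:

```python
# Shared compliance baseline applied everywhere by default.
BASE_POLICY = {
    "training_opt_out_available": True,
    "age_gate_minimum": 13,
    "data_residency": "default-region",
}

# Localized adjustments layered on top of the baseline.
JURISDICTION_OVERRIDES = {
    "IT": {"publicity_campaign_required": True},  # cf. GPDP's outstanding demand
    "US": {"data_residency": "us-local"},         # cf. TikTok's Project Texas
    "EU": {"data_residency": "eu-local"},         # cf. TikTok's Project Clover
}


def effective_policy(jurisdiction: str) -> dict:
    """The general baseline merged with any localized adjustments for the jurisdiction."""
    return {**BASE_POLICY, **JURISDICTION_OVERRIDES.get(jurisdiction, {})}


# effective_policy("IT") -> the baseline plus Italy-specific requirements
```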

Footnotes

[1] The New York Times. Lina Khan: We Must Regulate A.I. Here’s How〔J/OL〕. https://www.nytimes.com/2023/05/03/opinion/ai-lina-khan-ftc-technology.html?searchResultPosition=3

[2] GPDP Provision of March 30, 2023〔EB/OL〕. https://www.garanteprivacy.it/web/guest/home/docweb/-/docweb-display/docweb/9870832

[3] Artificial intelligence: stop to ChatGPT by the Italian SA. Personal data is collected unlawfully, no age verification system is in place for children〔EB/OL〕. https://www.garanteprivacy.it/web/guest/home/docweb/-/docweb-display/docweb/9870847

[4] ChatGPT: OpenAI collaborates with the Privacy Guarantor with commitments to protect Italian users〔EB/OL〕. https://www.garanteprivacy.it/web/guest/home/docweb/-/docweb-display/docweb/9872832

[5] ChatGPT: Italian SA to lift temporary limitation if OpenAI implements measures 30 April set as deadline for compliance〔EB/OL〕. https://www.garanteprivacy.it/web/guest/home/docweb/-/docweb-display/docweb/9874751

[6] ChatGPT: OpenAI reinstates service in Italy with enhanced transparency and rights for European users and non-users〔EB/OL〕. https://www.garanteprivacy.it/web/guest/home/docweb/-/docweb-display/docweb/9881490

[7] In fact, the definition of “children” varies significantly across countries according to cultural tradition and legislative needs. COPPA in the U.S. mainly protects children under 13; the EU’s General Data Protection Regulation lets member states set their own age of protection, anywhere from 13 to 16. In addition, Korea has set the age at 14, Japan at 18, and Australia at 13. Across national legislation, then, the protected group generally falls between 13 and 16 years old, with exceptions expressly provided.

Disclaimer

The information compiled for this article comes from lawfully public channels; we make no guarantee of any kind as to its authenticity, completeness, or accuracy. This article is intended solely for sharing, exchange, and learning, and no one should rely on it, in whole or in part, as a basis for decision-making; any consequences arising from doing so are borne by the actor.
