AIGC Platform Infringement Case: Boundaries of Directed Learning

AIGC Platform Infringement Case: Boundaries of Directed Learning

Human literary and artistic creations cannot emerge out of thin air; they are either imitations of nature or inspired by existing works. As we enter the AIGC era, users utilizing AI tools for creation cannot avoid issuing commands based on predecessors’ works or styles. This necessitates that AI must learn to a certain extent from existing works, especially those with high recognition. Consequently, this leads to a very serious intellectual property issue: should machines learn specifically from certain IPs have certain boundaries? Particularly, should there be a heightened duty of care regarding well-known IP content that is still under legal protection? It is essential to deeply consider whether certain measures should be taken to technically ‘avoid’ these IP contents.

AIGC Platform Infringement Case: Boundaries of Directed Learning

1. Technical Principles of AIGC Directed Learning

Taking the example of generating a new image similar to a well-known IP character using Artificial Intelligence Generated Content (AIGC) technology, we can explain this process in detail:

Selecting the Target IP: Assume the target IP is “Donald Duck and Mickey Mouse.” This character has a unique appearance, clothing style, language habits, and story background.

Step 1: Data Collection and Processing

Content Scraping: The system collects images, video clips, comic books, character voiceovers, etc., of this character from various available resources.

Data Cleaning and Annotation: Remove duplicate or low-quality data, and annotate images, such as different expressions, actions, scenes, etc., of the character.

Step 2: Model Training

Selecting an Appropriate Model: For image generation, deep learning models like GAN (Generative Adversarial Network) or Variational Autoencoders may be chosen.

Directed Learning: Input a large amount of data about the character into the model, especially their visual style and expressions. During the training process, the model will attempt to learn and imitate these characteristics.

Step 3: Content Generation

Generation Trials: Input specific prompts or conditions, such as “generate a new clothing design for the character” or “character image in different backgrounds,” etc.

Iterative Optimization: The generated images may need to be selected and fine-tuned through manual review to ensure the new character maintains its original characteristics while introducing novel creativity.

Step 4: Optimization and Feedback

Performance Evaluation: Assess the similarity and innovation of the newly generated character through intuitive comparisons with the original character and user feedback.

Feedback Loop: Adjust model parameters or training methods based on feedback to improve future generation results.

Ultimately, the AIGC system may create a character that is similar in shape, color, and temperament to the original animated character but with new elements or details. For example, retaining the original character’s classic hairstyle and clothing elements but innovating and varying the color scheme, clothing details, or background story.

Through this process, AIGC can not only learn and imitate existing IP content but also innovate and generate entirely new works based on this foundation. However, such practices also involve considerations and challenges regarding the sources of training data, copyright, and unfair competition.

2. Should AI Technology Neutrality Have Boundaries?

In fact, from the perspective of AI technology providers, this topic inevitably relates to the applicable scope of the “technology neutrality” principle. The “neutrality” of AI as a technology in the field of literary and artistic creation seems unquestionable. At first glance, users are the ultimate users of the technology, and the law should allow AI to have the maximum “implementation capacity,” including the ability to learn and regenerate from all IPs (regardless of whether they are still under legal protection). Only in this way can users access a super tool, and theoretically, the responsibility for how this tool is used should lie with the users.

This viewpoint can find its most compelling justification in the real world: humans can also engage in directed learning. The training of an artist can even be said to be achieved through continuous learning from the works of masters. Once they have learned, if they independently or at the behest of others reproduce a well-known work, they only need to bear responsibility for the act of reproduction; the prior learning is not considered infringement.

Using existing works based on learning is generally considered fair use under copyright law. However, in the context of AIGC, the machine learning based on large models presents three situations distinct from human learning:

1. The learning efficiency of large models is extremely high; as long as there is a certain amount of corpus, they can learn in a very short time. A human artist may require over a decade to reach a professional level, while a large model can achieve this in a day. This overwhelming efficiency leads to a significant impact on human creators in a very short time (a decade of hard work is not as good as a single day of learning);

2. In terms of effectiveness, large models can often achieve results that approach or even surpass those of the original authors;

3. Once a large model has completed its learning, it becomes a general skill and can be provided almost without limitation (the computational power constraint is virtually limitless), meaning users can invoke this capability to generate large quantities of similar or derivative works, leading to a rapid dilution of original human-created works.

The above new situations were not encountered in the past when individuals learned from each other, so the analogy of individual learning to machine learning to advocate for technical neutrality in the machine learning process is likely to be untenable.

Of course, this does not mean that the author believes all learning from protected works should be prohibited; rather, directed learning of works and authors with a certain level of recognition should be restricted. The reason is that these works have undergone market validation, gained public recognition, and are a core component of human intellectual wealth. In using these results, technology providers should have a prior duty of care during training, which is the “active avoidance” proposed in the title of this article.

AIGC Platform Infringement Case: Boundaries of Directed Learning

(It can be seen that ChatGPT has already begun to apply the “avoidance principle” to well-known IPs, restricting user generation instructions)

For works created by general authors, it is not that machines can “arbitrarily” learn under the guise of “fair use,” but rather that the duty of care for AIGC technology providers can be relaxed. They can train without prior knowledge to maximize technological advancement, and once a rights holder discovers that the machine has directed this mimicking ability to the point of diluting or confusing their work, they can file a complaint, and the technology provider must take necessary measures.

3. Economic Analysis of the “Avoidance Principle”

AIGC’s targeted learning training on well-known IP images, followed by generating similar or modified images based on user instructions, can cause damage to the IP parties. From an economic perspective, failing to regulate such behavior could lead to multiple adverse consequences:

1. Impact on IP Rights Holders

  • Brand Dilution: AIGC generating content similar to well-known IPs may lead to a market flooded with similar products or content, diluting the uniqueness and value of the original brand.

  • Reduced Income: If AIGC-generated content replaces the consumption of the original IP, this could directly affect the sales and profits of the original IP rights holders.

  • Lower Return on Investment: The creation and maintenance of well-known IPs require significant investment. Directed learning and generation by AIGC may reduce the uniqueness of original content, thereby affecting investors’ willingness to invest in original content and their expectations of returns.

  • Harm to Innovation Incentives: If AIGC-generated content easily replaces original content, this could reduce investment and development in original IPs, thereby stifling innovation and diversity.

2. Impact on Users

Market Failure: In the absence of appropriate regulation, AIGC could lead to market saturation, making it difficult for consumers to distinguish between original and AI-generated content, resulting in decreased market efficiency and consumer welfare loss.

  • Value Transfer: AIGC technology may shift value from original IP rights holders to technology providers, especially when technology providers profit from these generated contents without reasonable sharing. This not only affects the economic interests of IP rights holders but may also lead to a restructuring of the entire creative ecosystem’s value chain.

  • Quality Decline and Homogenization: In an unregulated environment, to pursue cost-effectiveness, the market may be flooded with large quantities of low-quality and highly homogenized content, damaging consumer experience and lowering overall cultural quality.

If certain degrees of “avoidance” are not applied to existing IPs, the enthusiasm of creators and users to participate in the cultural and artistic market will gradually decrease. AI could become a predatory “IP gold-consuming beast,” gradually driving the most creative humans out of the market. While this may benefit technology providers in the short term (by avoiding learning costs), in the long run, AI could also fall into a “corpus desert” due to the lack of human participation. Unless one day AI can completely learn in the literary and artistic fields by itself, freeing itself from human involvement, technology providers should still maintain a long-term vision and design a win-win mechanism for all parties involved.

4. What Can Be Done at the Regulatory Level

Besides the spontaneous mutual constraints of the market, the law cannot be absent in constructing a human-machine collaborative content generation ecosystem. It should formulate scientific and reasonable rules to reduce transaction costs, promote the release of technological power, and allow society as a whole to benefit.

To address the potential impact of AIGC directed learning on IP rights, a series of adjustments can be made at the regulatory level to balance innovation promotion with intellectual property protection. Here are some proposed solutions:

1. Legislation to Impose “Active Avoidance” Obligations on AIGC Technology Providers.

For works with high recognition, “avoidance” should be implemented during the training or generation phase, similar to the “red flag principle.” However, there is a distinction: the “red flag principle” applies to scenarios where users upload information to a platform, while “active avoidance” pertains to scenarios where AI technology providers directly conduct data training and content provision. As the direct implementers of the behavior, their duty of care should be higher.

2. Clarifying the Rights Ownership and Infringement Composition of AIGC Generated Content.

The legal protection of AIGC-generated content remains a highly controversial issue. In previous articles, I have proposed solutions, suggesting that certain AIGC works should be granted copyright protection as soon as possible. Only in this way can rights holders assert their rights when infringement occurs.

3. Introducing a Collective Management Mechanism for AIGC Training Data Sets

Similar to the practices of the Music Copyright Association and Film Copyright Association, a collective management organization for training data sets should be established to coordinate interest distribution and ensure fair compensation for IP rights holders when their works are used for AI training. Additionally, the power of associations can be leveraged to develop unified authorization and tracking systems: advanced technological solutions should be developed to track and manage the usage of AIGC content, ensuring a transparent and traceable authorization process.

4. Establishing Standards and Guidelines

With the rapid development of AI, legislation will inevitably follow, and the era of standards is approaching. Non-mandatory standards regarding copyright protection standards for AIGC, data training standards, and AIGC platform regulation standards should be formulated as soon as possible to provide guidelines for the industry, steering practitioners toward healthy and positive development. During this process, new and effective practices should be continuously referenced, and those practices unsuitable for standardization can be led by enterprises, associations, and professional institutions to formulate guidelines.

Through the implementation of the above measures, we can not only protect the legitimate interests of IP rights holders but also maintain the normal operation of the market and promote innovation and prosperity in society as a whole. This is a process that requires the joint participation of legislative bodies, technology providers, IP rights holders, and consumers. All parties should work together to create a fair, reasonable, and transparent future for the healthy development of AIGC technology and effective protection of intellectual property.

— END —

Article Submission Contact

AIGC Platform Infringement Case: Boundaries of Directed Learning

Author|Zhang Yanlai

AIGC Platform Infringement Case: Boundaries of Directed Learning

Zhejiang Kending Law Firm Founder, Chief Lawyer, Patent Agent

Founder of “Internet Law Society” and “Internet Law Practice Circle”

Practical Mentor at China University of Political Science and Law

Practical Mentor at Southwest University of Political Science and Law

Member of the Expert Guidance Committee on Anti-Monopoly in Zhejiang Province

Director of Hangzhou Lawyers Association

Arbitrator at Hangzhou Arbitration Commission

Since practicing, has been fully focused on internet legal practice, serving as a long-term legal advisor for dozens of top internet companies, and representing landmark internet litigation cases such as the first case of NFT infringement regarding “Fat Tiger,” the first case of group control, the first case of WeChat Mini Programs, the first case of smartphone flashing, the first case of 5G cloud gaming, and the first case of facial recognition, with the cases being selected for the “Top Ten Typical Intellectual Property Cases of the Supreme Court,” “Top Fifty Typical Intellectual Property Cases of the Supreme Court,” “Most Research-Value Intellectual Property Cases in China,” and “Top Ten Constitutional Cases in China.” Deeply involved in the legislative work of China’s “E-commerce Law,” the State Administration for Industry and Commerce’s “Regulations on Online Transactions,” and Hangzhou’s “Regulations on Online Transactions.” Personal works include “Legal Eye on E-commerce,” “Notes from the Battlefield of Internet Law,” and “No Technology, No Law,” published by Law Press and Legal Publishing House, respectively.

Leave a Comment