
ANI v. OpenAI: The Intersection of Copyright and Artificial Intelligence in India

  • Writer: Centre for Advanced Studies in Cyber Law and AI CASCA
  • Feb 5

Updated: Mar 14

By Raima Singh & Vishwaroop Chatterjee (2nd Year Members, CASCA)

Introduction

 

The legal dilemma posed by Artificial Intelligence (AI) and its impact on copyright law reached Indian courts when the news media organisation ANI filed a petition before the Delhi High Court against OpenAI, alleging several copyright violations. This is not unprecedented for the Microsoft-backed company, which faces several lawsuits in countries such as the USA, Canada, and Germany. The ANI case reflects the larger concern that the training of large language models (LLMs) for Generative AI (GenAI) infringes the intellectual property rights (IPR) of news publications. The dilemma is whether to create a legal environment that safeguards the IPR of original creators or to pave the way for innovation and technological advancement.


This blog analyses the approaches available under existing copyright law in India and compares them with other jurisdictions to understand the global scenario, as Indian law is still in the early phases of adapting to the growth of GenAI. Drawing on this analysis, it offers suggestions to mitigate the issues that give rise to copyright infringement by LLMs.

 

Context, Conflict, and Analysis

 

The conundrum of the ANI case lies not only in the training of OpenAI's ChatGPT product, but also in the contention that the company used data that was not publicly available for that training, which adds to the complexity of the issue. Furthermore, ANI has claimed that ChatGPT has, in some instances, incorrectly cited it as the source of information that did not originate with ANI.


The ANI case also shares these quandaries with its global counterparts. Cases such as the New York Times lawsuit and the GEMA lawsuit raise pertinent questions. Who owns the copyright in content produced by GenAI? Does storing datasets purely for training purposes constitute copyright infringement? And do AI models reproduce these protected datasets for consumers in a manner that breaches copyright?


To establish copyright under the Indian Copyright Act, 1957 (hereinafter, “the Act”), one must be a natural person as per Section 17 of the Act. This narrows OpenAI’s scope for ownership, since generative AI content is produced by a programme and not by a natural person. But if not the programme, does the user have a copyright claim? Guidance comes from the widely cited Zarya of the Dawn matter, which concerned a graphic novel illustrated with AI-generated images; the US Copyright Office refused protection for those images on the ground that they lacked human authorship. The decision indicates that copyright in generative AI content turns on the extent of human input and involvement. In India, the case of Eastern Book Company v. D.B. Modak adopted the Canadian test, which evaluates an author's skill and judgement as a reflection of human creativity and treats it as the primary basis for establishing copyright ownership.


On the other hand, OpenAI has repeatedly raised fair use and its opt-out mechanism in its defence. Section 52 of the Act sets out what constitutes fair dealing, which permits the use of copyrighted material without the copyright holder's permission in certain instances. This concept strikes a balance between the rights of copyright holders and the public's interest in accessing and using creative works. Additionally, OpenAI offers an opt-out mechanism that lets content owners exclude their material from its training data, that is, to refuse to participate in or exempt themselves from particular processes, agreements, or data uses, which are often related to data collection, intellectual property, or technology-driven services. The opt-out method provided in OpenAI’s terms and conditions shifts the responsibility onto rightsholders, but the question arises whether it is even effective if OpenAI had already trained its LLM before it offered the mechanism. Does this not simply give OpenAI a head start and a competitive advantage over AI start-ups?


This deliberate decision substantially widens the learning gap facing future AI developers. OpenAI's commitment to allow opt-outs from its training data, while maintaining its non-infringement stance, creates an untenable barrier. New entrants must now build equivalent capabilities despite limited access to content, while competing against systems that have already been trained on it. This normalises opt-out as a viable balancing mechanism for OpenAI under the pretence of ethical conduct, even as the company continues to claim that using content for training purposes is not infringement under copyright law.

 

Inter-Jurisdictional Analysis

 

EU’s Directive on Copyright and Related Rights in the Digital Single Market (hereinafter, “the Directive”)


The Directive aims to avoid distortion of internal competition and to ensure harmonisation of laws. It addresses several issues, including text and data mining and out-of-commerce works. It primarily focuses on promoting the development of AI while protecting rightsholders. Article 3 of the Directive protects text and data mining conducted for scientific research, while Article 4 permits such activity provided the rightsholders have not explicitly reserved their rights. This balanced approach streamlines content usage while clearly defining its objectives. Moreover, Article 5 permits the use of copyrighted material strictly for educational purposes by institutions, while offering rightsholders the ability to override such exceptions through explicit reservations. Article 17 emphasises platform responsibility for both authorised and unauthorised use of copyrighted material. The Directive provides an ideal framework for fostering AI-driven innovation while protecting the rights of content creators. While both the EU and Indian legal frameworks incorporate the necessary exceptions to copyright for educational purposes, the Directive takes a more comprehensive approach by providing concrete guidelines on text and data mining and by addressing the rapidly evolving nature of the digital age.

 

United States’ Digital Millennium Copyright Act (hereinafter, “DMCA”)


The DMCA implements two treaties designed by the World Intellectual Property Organization to address the copyright challenges arising from digital technologies. The Technological Protection Measures (hereinafter, “TPM”) serve as a safeguarding mechanism through two key components: anti-circumvention provisions and penalties for violations. Section 1201 outlines multiple exceptions, including those for educational and scientific works, aligning with global legal frameworks.

However, a standout feature of the provision is the rule-making process, which adapts copyright law to the evolving business and technical landscape. This involves triennial proceedings, which conclude with proposals for revising the exceptions. The recently introduced recommendations address the increasing integration of AI in society through exceptions in favor of GenAI and AI model training. This expands the boundaries of copyright to encourage innovation, accessibility, and research. Moreover, the criterion of ‘data minimization’ ensures that copyrighted content is accessed only to the minimal extent needed to comply with legal standards.

 

Suggestions

 

The Indian copyright regime is governed by the Act, which provides a comprehensive legal framework. As arguments arise for classifying transformative uses of AI under fair use, it is crucial to recognise that the existing fair use doctrine under Indian law remains narrow and restrictive. As cases like ANI v. OpenAI and the recent Kneschke v. LAION show, it is essential for copyright law to evolve and address the challenges of growing AI usage, the regulation of Internet Service Providers, and other emerging players overlooked in traditional legal frameworks. The following recommendations are proposed to complement the existing framework:

 

1.     Adopting the Rule Making Process 



 

2.     Streamlining the Opt-Out Mechanism


Although opt-out is an essential safeguard, the absence of a clear or legally enforceable procedure for exercising it limits its effectiveness for rightsholders. For instance, the Act remains ambiguous on whether consent is required for training GenAI, leaving room for interpretation. Incorporating opt-out provisions like those in the EU Directive can address these ‘consent gaps’ and give rightsholders greater control over copyrighted content. Such refinement would also make Indian copyright provisions conducive to cross-border partnerships in a globalised world. Additionally, a clear opt-out mechanism may reduce further disadvantages or barriers for emerging AI players by offering a well-defined legal landscape for growth and innovation.
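To picture what a machine-readable opt-out can look like in practice, consider the robots.txt convention that websites already use to instruct crawlers. The sketch below is illustrative only and is not drawn from the ANI pleadings, the Act, or the Directive. It assumes a publisher blocks GPTBot, the crawler user agent that OpenAI documents, and it uses Python's standard urllib.robotparser module to check what a compliant crawler would be allowed to fetch. The domain news.example and the helper crawler_may_fetch are hypothetical names chosen for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a news publisher might serve (assumption, for illustration):
# it opts out of GPTBot crawling while leaving other crawlers unaffected.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def crawler_may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    story = "https://news.example/2024/some-story"  # hypothetical URL
    print("GPTBot allowed:", crawler_may_fetch(ROBOTS_TXT, "GPTBot", story))
    print("Other crawler allowed:", crawler_may_fetch(ROBOTS_TXT, "SomeOtherBot", story))
```

A rule of this kind binds only crawlers that choose to honour it, and it governs future crawling rather than content already collected. That limitation is one reason a legally enforceable opt-out procedure matters.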

 

3.     Regulatory Approach


A statute must be established that requires AI companies to provide a detailed report on the data used to train their AI, including any copyrighted works contained in the datasets used to build or alter their training data. A similar approach has been proposed in the USA’s AI Foundation Model Transparency Act. Though the bill is not enforceable as of now, India can certainly draw inspiration from its regulatory policy. The proposed Act seeks to regulate AI data training by authorising the Federal Trade Commission (FTC) to create regulations requiring the developers of such models to publish information on their training data, taking into account the information copyright owners need to enforce their copyright.

 

Moreover, the contention in the ANI case that information not accessible to the public was used to train ChatGPT could have been resolved had a detailed report been submitted to maintain transparency. Thus, a regulated approach would deter such malpractice and facilitate safeguard mechanisms.

 

Conclusion


The ANI case serves as a pivotal moment for Indian copyright law, highlighting the urgent need to harmonise the regulation of AI with innovation. Global jurisdictions offer suitable legal frameworks and can serve as benchmarks for enhancing the protection of rightsholders under the Act. Expanding the Act to include transparency reports and streamlined opt-out systems can provide legal certainty in disputes of a similar nature. As global players increasingly engage with and navigate the intellectual property landscape in India, it becomes imperative to prioritise an equitable and novel legal environment.

 
 
 
