Today, the European Parliament (EP) adopted its position in plenary on the Artificial Intelligence (AI) Act. This is the culmination of a months-long process whereby thousands of pages of amended text have been pored over by policymakers, civil society and industry alike. The strong, cross-party endorsement (499 votes in favor, 28 against and 93 abstentions) paves the way for tough negotiations with the European Council, which concluded its position at the end of last year. Since then, the bulk of the EP’s political focus has been on so-called “foundation models,” which are trained on vast ranges of data for a wide set of downstream tasks. In particular, they have focused on “generative AI,” with Members of the European Parliament (MEPs) seeking to provide a legal framework for recent innovations such as ChatGPT or Bard.
In a rare move, the EP, European Commission, and European Council agreed to start three-way negotiations — the so-called trilogues — immediately after the vote at 9pm CET today. This need for speed underscores the political imperative of reaching a deal before next year’s EP elections on the high-stakes, hot-button draft AI Act, which is the regulation that will set the rules around AI in the EU space. In fact, the Act is bound to shape how policymakers approach regulating AI in many other jurisdictions and at the international level.
Creative Commons (CC) has actively engaged in the AI Act process (see here and here) and welcomes the EU’s leadership on defining a regulatory framework around this impactful technology. In this blog post, we highlight the issues most likely to impact the topics we focus on: growing the commons and better sharing of knowledge and culture.
Background and key issues for CC
The AI Act process started in 2021 with a proposal by the Commission and has since been debated in Parliament and Council, whose approaches will need to converge in order for an agreement to be reached and the Act adopted.
In its initial conception, the Act focused on regulating certain uses of AI. In particular, it seeks to ban certain uses of AI, such as broad-based real-time biometric identification for law enforcement in public places. It also seeks to ensure that certain precautions are taken before deployment of uses deemed “high-risk,” such as the use of AI for access to education, employment, financial credit, or other essential services.
However, in the last year, the focus expanded. The Council incorporated provisions with respect to “general purpose AI” (GPAI), and the Parliament subsequently created requirements specifically for “foundation models.” Rather than addressing specific high-risk uses, these provisions impact technologies that have a wide range of uses, both potentially beneficial and harmful, and of varying degrees of risk. Moreover, the Parliament added specific requirements for generative AI, including requirements related to transparency of copyright works used to train these models.
At CC, we support the overall aims of the draft legislation, but we want to ensure that these new points of focus are handled in a careful, narrowly-tailored way. Specifically, here is what we will be focusing on as we engage policymakers going forward.
CC comments on specific issues
For many years, CC has focused on the interplay between copyright and AI, because this technology can foster better sharing, helping people build on and contribute to the commons and spurring new creativity and knowledge sharing. The Act poses several challenges to these aims, and we address them in turn below.
Free and open source software (FOSS)
FOSS provides important benefits, including by improving transparency and auditability of AI systems and by making it easier for a wide variety of players, including nonprofits, start-ups, researchers/academics and SMEs, to innovate, test and compete in the market. As such, CC's view is that collaborative development of FOSS and merely making FOSS available in public repositories should not subject developers to the AI Act's requirements.
GPAI, foundation models, and FOSS
We appreciate the concerns policymakers have about how general purpose tools can be used “downstream” by other actors in harmful ways. It is particularly important that downstream users have sufficient information about the underlying model in order to address possible risks.
At the same time, it is important to treat general purpose tools distinctly from tools aimed at a particular, high-risk use. For multi-purpose tools, it can be impractical for developers to implement risk management in ways suited for narrowly defined, "high risk" AI uses. In turn, imposing the same rules on GPAI creators may create significant barriers to innovation and drive market concentration, so that AI development occurs only within a small number of large, well-resourced commercial operators. With that in mind, we also want to ensure there are proportional requirements for FOSS "foundation models" that are "put on the market" or "put into service," tailored to different services and providers.
In particular, we have concerns when it comes to FOSS developers. As noted above, merely developing a FOSS "foundation model" or other general purpose tool and making it available in a repository should not subject developers to the Act's requirements.
Transparency of training data and copyright
At CC, we are convinced that greater openness and transparency in the development of AI models can serve the public interest and facilitate better sharing by building trust among creators and users. As such, we generally support more transparency around the training data for regulated AI systems.
The Parliament version of the text includes specific provisions with respect to generative AI models, requiring providers to "document and make publicly available a sufficiently detailed summary of the use of training data protected under copyright law." On the one hand, this can be a sensible way to ensure transparency, particularly for rightsholders who wish to exercise their right to "opt out" of exceptions to copyright pertaining to AI training pursuant to Article 4 of the EU Directive on Copyright in the Digital Single Market. On the other hand, it is important that this requirement is applied proportionately. Developers should not be expected to literally list out every item in the training data, but rather provide useful summaries, such as noting the use of a particular dataset like Common Crawl or LAION-5B, which rightsholders can then use to determine whether their works were used.
Generative AI and copyright generally
The Parliament’s text also requires that generative AI model providers take “adequate safeguards against the generation of content in breach of Union law, in line with the generally acknowledged state of the art, and without prejudice to fundamental rights, including the freedom of expression.” While this is perhaps intended to reaffirm that developers should comply with copyright law, it is likely to create much more uncertainty; at worst, it could be read to suggest a more sweeping requirement for developers to implement copyright filtering tools that could block perfectly lawful uses. We encourage policymakers to take steps to ensure this does not become a backdoor expansion of copyright law; to the extent policymakers want to consider this broader topic, they should do so separately, rather than tacking it onto the AI Act at the eleventh hour.
Regarding the start of the trilogues, Creative Commons CEO Catherine Stihler said: “Creative Commons remains committed to finding fair and lasting solutions to ensure AI can support creators and grow the commons, in line with our strategy of better sharing and our values of openness, transparency, fairness, and creativity. At CC, we will continue to proactively engage with EU institutions as the trilogues commence, in order to achieve our mission to empower individuals and communities around the world by equipping them with technical, legal and policy solutions to enable sharing of knowledge and culture in the public interest.”
If you are in Brussels on 28 June, 2023, do not miss Creative Commons’ Brigitte Vézina speaking at the European Internet Forum’s Generative AI, art & copyright: from creative machines to human-powered tools event, an in-person panel organized by EIF with opening remarks by MEP Dragos Tudorache, Parliament co-rapporteur on the AI Act — you can find more information on the program and register on the EIF’s website.
Like the rest of the world, CC has been watching generative AI and trying to understand the many complex issues raised by these amazing new tools. We are especially focused on the intersection of copyright law and generative AI. How can CC’s strategy for better sharing support the development of this technology while also respecting the work of human creators? How can we ensure AI operates in a better internet for everyone? We are exploring these issues in a series of blog posts by the CC team and invited guests that look at concerns related to AI inputs (training data), AI outputs (works created by AI tools), and the ways that people use AI. Read our overview on generative AI or see all our posts on AI.
Note: We use “artificial intelligence” and “AI” as shorthand terms for what we know is a complex field of technologies and practices, currently involving machine learning and large language models (LLMs). Using the abbreviation “AI” is handy, but not ideal, because we recognize that AI is not really “artificial” (in that AI is created and used by humans), nor “intelligent” (at least in the way we think of human intelligence).