Supporting Open Source and Open Science in the EU AI Act

An abstract European Union flag of diffused gold stars linked by golden neural pathways on a deep blue mottled background.

“EU Flag Neural Network” by Creative Commons was cropped from an image generated by the DALL-E 2 AI platform with the text prompt “European Union flag neural network.” OpenAI asserts ownership of DALL-E generated images; Creative Commons dedicates any rights it holds to the image to the public domain via CC0.

As the EU seeks to finalize its landmark AI Act, CC has joined with EleutherAI, GitHub, Hugging Face, LAION, and Open Future in offering suggestions for how the Act can better support open source and open science.

As we’ve said before, we welcome the EU’s leadership on defining a regulatory framework around AI and the Act’s overall approach. At the same time, it’s critical that well-intentioned proposals do not have unintended, harmful consequences for the extensive ecosystem of open scientific research and open source development.

As we note in our full paper, open and accessible sharing of the software, datasets, and models that make up AI systems allows for more widespread scrutiny and understanding of both their capabilities and shortcomings. Open source development can enable competition and innovation by new entrants and smaller players, including in the EU. Projects like EleutherAI and BigScience have brought together researchers and a range of institutions, including ones in the EU, to develop and share resources and skills to train high quality models.

Unfortunately, as the negotiations currently stand, we believe the proposals threaten to create impractical barriers and disadvantages for contributors to this open ecosystem. For instance, the text could impede the simple act of making open source components available in public repositories and collaborating on them, which is the very process through which open source develops.

To be clear, we don’t think that developing AI in the open should fully exempt its use from the Act’s requirements, and we recognize that open source AI can also make harmful uses of AI accessible to more people. Instead, our recommendations underscore the need for a tailored, proportionate approach to open source and open science, one that supports collaborative models of AI development by a wide range of players.

Like the rest of the world, CC has been watching generative AI and trying to understand the many complex issues raised by these amazing new tools. We are especially focused on the intersection of copyright law and generative AI. How can CC’s strategy for better sharing support the development of this technology while also respecting the work of human creators? How can we ensure AI operates in a better internet for everyone? We are exploring these issues in a series of blog posts by the CC team and invited guests that look at concerns related to AI inputs (training data), AI outputs (works created by AI tools), and the ways that people use AI. Read our overview on generative AI or see all our posts on AI.

Note: We use “artificial intelligence” and “AI” as shorthand terms for what we know is a complex field of technologies and practices, currently involving machine learning and large language models (LLMs). Using the abbreviation “AI” is handy, but not ideal, because we recognize that AI is not really “artificial” (in that AI is created and used by humans), nor “intelligent” (at least in the way we think of human intelligence).

Posted 26 July 2023