Many wonder what role CC licenses, and CC as an organization, can and should play in the future of generative AI. The legal and ethical uncertainty over using copyrighted inputs for training, the uncertainty over the legal status and best practices around works produced by generative AI, and the implications of this technology for the growth and sustainability of the open commons have led CC to examine these issues more closely. We want to address some common questions, while acknowledging that the answers may be complex or still unknown.
We use “artificial intelligence” and “AI” as shorthand terms for what we know is a complex field of technologies and practices, currently involving machine learning and large language models (LLMs). Using the abbreviation “AI” is handy, but not ideal, because we recognize that AI is not really “artificial” (in that AI is created and used by humans), nor “intelligent” (at least in the way we think of human intelligence).
CC licensing and training AI on copyrighted works
Can you use CC licenses to restrict how people use copyrighted works in AI training?
This is among the most common questions that we receive. While the answer depends on the exact circumstances, we want to clear up some misconceptions about how CC licenses function and what they do and do not cover.
You can use CC licenses to grant permission for reuse in any situation that requires permission under copyright. However, the licenses do not supersede existing limitations and exceptions; in other words, as a licensor, you cannot use the licenses to prohibit a use if it is otherwise permitted by limitations and exceptions to copyright.
This is directly relevant to AI, given that the use of copyrighted works to train AI may be protected under existing exceptions and limitations to copyright. For instance, we believe there are strong arguments that, in most cases, using copyrighted works to train generative AI models would be fair use in the United States, and such training can be protected by the text and data mining exception in the EU. However, whether these limitations apply may depend on the particular use case.
It’s also useful to look at this from the perspective of the licensee — the person who wants to use a given work. If a work is CC licensed, does that person need to follow the license in order to use the work in AI training? Not necessarily — it depends on the specific use.
- To the extent your AI training is covered by an exception or limitation to copyright, you need not rely on CC licenses for the use.
- To the extent you are relying on CC licenses to train AI, you will need to follow the relevant requirements under the licenses.
Another common question we hear is “Does complying with CC license conditions mean you’re always legally permitted to train AI on that CC-licensed work?”
Not necessarily — it is important to note here that CC licenses only give permission for rights granted by copyright. They do not address where other laws may restrict training AI, such as privacy laws, which are always a consideration where material contains personal data and are not addressed by copyright licensing. (Many kinds of personal data are not covered by copyright at all, but may still be covered by privacy-related regulations.)
For more explanation, see our flowchart regarding the CC licenses in this context, and read more in our FAQ on AI and CC licenses.
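The reasoning in this section can be loosely sketched as a small decision function. This is only an illustration under simplifying assumptions, not a legal test: the `TrainingUse` class, its two boolean fields, and the `obligations` function are hypothetical names introduced here, and in practice whether an exception or limitation applies depends on the particular use case and jurisdiction.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TrainingUse:
    """Hypothetical description of one AI training use of a CC-licensed work."""
    # Assumed to be decided elsewhere, e.g. fair use in the United States
    # or the text and data mining exception in the EU.
    covered_by_exception: bool
    # Privacy laws may apply whether or not copyright does.
    contains_personal_data: bool


def obligations(use: TrainingUse) -> List[str]:
    """Return plain-language notes that mirror the reasoning in the text."""
    notes = []
    if use.covered_by_exception:
        # An exception or limitation permits the use, so the license
        # is not what the licensee relies on.
        notes.append("No need to rely on the CC license for this use.")
    else:
        # Permission comes from the license, so its requirements apply.
        notes.append("Relying on the CC license: follow its requirements.")
    if use.contains_personal_data:
        # CC licenses only cover rights granted by copyright.
        notes.append("Check privacy laws: CC licenses do not address them.")
    return notes


if __name__ == "__main__":
    example = TrainingUse(covered_by_exception=False, contains_personal_data=True)
    for note in obligations(example):
        print(note)
```

The two checks are kept independent on purpose: complying with a CC license, or falling under a copyright exception, says nothing about whether other laws such as privacy rules restrict the training.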
CC licenses and outputs of generative AI
In the current context of rapidly developing AI technologies and practices, governments scrambling to regulate AI, and courts hearing cases regarding the application of existing law, our intent is to give our community the best guidance available right now. If you create works using generative AI, you can still apply CC licenses to the work you create with the use of those tools and share your work in the ways that you wish. The CC license you choose will apply to the creative work that you contribute to the final product, even if the portion produced by the generative AI system itself may be uncopyrightable. We encourage the use of CC0 for those works that do not involve a significant degree of human creativity, to clarify the intellectual property status of the work and to ensure the public domain grows and thrives.
Though using CC licenses and legal tools for training data and works produced by generative AI may address some legal uncertainty, it does not solve all the ethical concerns raised, which go far beyond copyright — involving issues of privacy, consent, bias, economic impacts, and access to and control over technology, among other things. Neither copyright nor CC licenses can or should address all of the ways that AI might impact people. There are no easy solutions, but it is clear we need to step outside of copyright to work together on governance, regulatory frameworks, societal norms, and many other mechanisms to enable us to harness AI technologies and practices for good.
We must empower and engage creators
Generative AI presents an amazing opportunity: it can be a transformative tool that supports creators (both individuals and organizations), provides new avenues for creation, facilitates better sharing, enables more people to become creators, and benefits the commons of knowledge, information, and creativity for all.
But there are serious concerns, such as issues around author recognition and fair compensation for creators (and the labor market for artistic work in general), the potential flood of AI-generated works on the commons making it difficult to find relevant and trustworthy information, and the disempowering effect of the privatization and enclosure of AI services and outputs, to name a few.
For many creators, these and other issues may be a reason not to share their works at all under any terms, not just via CC licensing. CC wants AI to augment and support the commons, not detract from it, and we want to see solutions to these concerns to avoid AI turning creators away from contributing to the commons altogether.
We believe that trustworthy, ethical generative AI should not be feared, but instead can be beneficial to artists, creators, publishers, and to the public more broadly. Our focuses going forward will be:
- To develop and share principles, best practices, guidance, and training for using generative AI to support the commons. We don’t have all the answers — or necessarily all the questions — and we will work collaboratively with our community to establish shared principles.
- To continue to engage our community and broaden it to lift up diverse, global voices and find ways to support different types of sharing and creativity.
- To engage more with AI developers and services to increase their support for transparency and ethical, public-interest tools and practices. CC will be seeking to collaborate with partners who share our values and want to create solutions that support a thriving commons.
For over two decades we have stewarded the legal infrastructure that enables open sharing on the web. We now have an opportunity to reimagine sharing and creativity in this new age. It is time to build new infrastructure that supports better sharing with generative AI.
We invite you to join us in this work, as we continue to openly discuss, deliberate, and take action in this space. Follow along with our blog series on AI, subscribe to our newsletter, support our work, or join us at one of our upcoming events. We’re particularly excited to welcome our community back in person to Mexico City in October for the CC Global Summit, where the theme is focused squarely on AI & the commons.
Like the rest of the world, CC has been watching generative AI and trying to understand the many complex issues raised by these amazing new tools. We are especially focused on the intersection of copyright law and generative AI. How can CC’s strategy for better sharing support the development of this technology while also respecting the work of human creators? How can we ensure AI operates in a better internet for everyone? We are exploring these issues in a series of blog posts by the CC team and invited guests that look at concerns related to AI inputs (training data), AI outputs (works created by AI tools), and the ways that people use AI. Read our overview on generative AI or see all our posts on AI.