Better Sharing for Generative AI

by Catherine Stihler

Better Internet, Licenses & Tools, Open Creativity, Technology

Like the rest of the world, CC has been watching generative AI and trying to understand the many complex issues raised by these amazing new tools. We are especially focused on the intersection of copyright law and generative AI. How can CC’s strategy for better sharing support the development of this technology while also respecting the work of human creators? How can we ensure AI operates in a better internet for everyone? We are exploring these issues in a series of blog posts by the CC team and invited guests that look at concerns related to AI inputs (training data), AI outputs (works created by AI tools), and the ways that people use AI. Read our overview on generative AI or see all our posts on AI.

Note: We use “artificial intelligence” and “AI” as shorthand terms for what we know is a complex field of technologies and practices, currently involving machine learning and large language models (LLMs). Using the abbreviation “AI” is handy, but not ideal, because we recognize that AI is not really “artificial” (in that AI is created and used by humans), nor “intelligent” (at least in the way we think of human intelligence).

A bluish surrealist painting generated by the DALL-E 2 AI platform showing a small grayish human figure holding a gift out to a larger robot that has its arms extended and a head like a cello.

“Better Sharing With AI” by Creative Commons was generated by the DALL-E 2 AI platform with the text prompt “A surrealist painting in the style of Salvador Dali of a robot giving a gift to a person playing a cello.” CC dedicates any rights it holds to the image to the public domain via CC0.

Over the last year, innovation in and use of generative artificial intelligence (AI) have exploded, providing new ways for people to create content of all sorts. For example, it’s been used to help create award-winning art, develop educational materials, expedite software development, and craft business materials. Recently, three artists filed a class action lawsuit in the USA against StabilityAI, Midjourney, and DeviantArt, companies that use the Stable Diffusion tool to enable people to generate images using simple text prompts. That suit follows on the heels of litigation brought by the same attorneys and other plaintiffs against GitHub and OpenAI for their Copilot and Codex tools for generating software code.

AI is an area that Creative Commons has long focused on, including most recently in a webinar series we held last fall. We are going to expand on our views in future posts, including exploring why we think the legal arguments in the US court case against StabilityAI, Midjourney, and DeviantArt are ill-founded. (Getty Images also subsequently filed a similar suit against StabilityAI in the US, as well as apparently commencing litigation in the UK, but we have yet to see that complaint.)

But before digging into all of the legal issues, we wanted to take a step back and restate our general approach to generative AI.

CC on Generative AI

Creative Commons has always sought out ways to harness new technology to serve the public interest and to support better sharing of creative content: sharing that is inclusive, just, equitable, reciprocal and sustainable. We support creators in sharing their works as broadly and openly as they want, so that people can enjoy them globally without unnecessary barriers. We also advocate for policies that ensure new and existing creators are able to build on a shared commons, while respecting creators’ legitimate interests in control and compensation for their creative expressions.

A founding insight of Creative Commons is that all creativity builds on the past. When people learn to play the cello or paint a picture, for instance, they necessarily learn from and train their own skills by engaging with pre-existing works and artists, such as by noticing the style in which cellists like Yo-Yo Ma arrange notes, or by building on surrealist styles initiated by artists like Dali. Similarly, while Star Wars invented the character of Luke Skywalker, it built on the idea of the hero’s journey, among many other elements from past works. People observe the ideas, styles, genres, and other tropes of past creativity, and use what they learn to create anew. No creativity happens in a vacuum, purely original and separate from what’s come before.

Generative AI can function in a similar way. Just as people learn from past works, generative AI is trained on previous works, analyzing past materials to extract underlying ideas and other information that it then uses to build new works. Image generation tools like Stable Diffusion develop representations of what images are supposed to look like by examining pre-existing works, associating terms like “dog” or “table” with shapes and colors so that a text prompt containing those terms can then produce images.

Given how digital technologies function, training AI in this way necessarily involves making an initial copy of images in order to analyze them. As we’ve explored in the past and will discuss in future posts about these recent lawsuits, we think this sort of copying can and should be permissible under copyright law. There are certainly nuances when it comes to copyright’s interaction with these tools — for instance, what if the tools are later used by someone to generate an output that does copy from a specific creative expression? But treating copying to train AI as per se infringing copyright would in effect shrink the commons and impede others’ creativity in an over-broad way. It would expand copyright to give certain creators a monopoly over ideas, genres, and other concepts not limited to a specific creative expression, as well as over new tools for creativity.

Copyright, and intellectual property law in general, is only one lens through which to think about AI. It’s still important to grapple with legitimate concerns about this technology and consider what responsible development and use should be. For instance, what impact will these tools have on artists’ and creators’ jobs and compensation? How can we ensure that AI that is trained on the commons contributes back to the commons as well, supporting all types of creators? What about the use of these tools to develop harmful misinformation, to exploit people’s privacy (e.g., their biometric data), or in ways that perpetuate biases? More generally, how can we establish human oversight and responsibility so that these tools work well for society?

These are just some of the tricky issues that will need to be worked out to ensure people can harness AI tools in ways that support creativity and the public interest. Along with other policy and legal approaches to governing AI, it’s important to look to community-driven solutions that support responsible development and use. Already, StabilityAI will let artists opt out of its training data set, as well as opt in to provide greater information about their works. While this precise approach has drawn a variety of views, indexing of the web has functioned well using a similar sort of opt-out approach, one set through global technical standards and norms rather than law. Creators of some generative AI tools are using licenses that constrain how the tools are deployed, which also carries various trade-offs.
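For readers curious what the opt-out norm in web indexing looks like in practice, here is a minimal sketch. It assumes the widely used robots.txt convention, in which a site publishes a plain-text file telling crawlers what they may fetch. The crawler name "ExampleAIBot", the example.org URL, and the file contents are hypothetical, and this is not a description of how StabilityAI or any other AI developer handles opt-outs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the site opts out of one (made-up) crawler
# while leaving all other crawlers free to index it.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A well-behaved crawler checks before fetching and skips pages it is told to avoid.
print(may_fetch("ExampleAIBot", "https://example.org/gallery/painting.png"))
print(may_fetch("OtherBot", "https://example.org/gallery/painting.png"))
```

The file itself does not enforce anything: the approach works only because crawlers choose to honor it, which is why it is better described as a norm than as a law.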

What’s Next? Community Input

Supporting community-driven solutions has always been at the heart of Creative Commons’ approach to creativity. If you’re interested in this subject, we are going to be holding meetings with the Creative Commons community, and we also plan to continue meeting with diverse stakeholders to explore what sorts of solutions may be helpful in this area. As we go along, we’ll continue to report on what we’ve learned and seek out more community feedback.

Join the CC team at a community discussion about generative AI: How can we make it work better for everyone and support better sharing in the commons?

To enable participation around the world, we’ve scheduled three times for this conversation. Come to the one that works best for your schedule, or join as many as you like. We’ll be focused on the same questions and issues at each meeting, but different participants will bring different perspectives, reshaping each conversation. To enable participants to speak freely, these meetings will not be recorded, but the CC team will be taking notes to share outcomes from the conversations.

Community Meetings: Wednesday 22 February 2023

Register for 02:00–03:00 UTC

(check the schedule in your local timezone)

Register for 14:00–15:00 UTC

(check the schedule in your local timezone)

Register for 18:00–19:00 UTC

(check the schedule in your local timezone)

Stay in touch with CC: subscribe to our mailing list, follow us on social media (Facebook, Instagram, LinkedIn & Twitter), or join CC on Slack.

Posted 06 February 2023
