“Generative AI” has been the subject of much online conversation over the past few months. “Generative AI” refers to artificial intelligence (AI) models that can create different kinds of content by following user input and instructions. These models are trained on massive datasets of content (images, audio, text) harvested from the internet, and they use this content to produce new material. They can create all kinds of things, including images, music, speech, computer programs, and text, and can either work as stand-alone tools or be incorporated into other creative tools.
The rapid development of this technology has caught the attention of many, offering the promise of revolutionizing how we create art, conduct work, and even live our daily lives. At the same time, these impressive new tools have also raised questions about the nature of art and creativity and what role law and policy should play in both fostering the development of AI and protecting individuals from possible harms that can come from AI.
At Creative Commons, we have been paying attention to generative AI for several years. We recently hosted a pair of panel discussions on AI inputs and outputs, and we have been working with lawmakers in the EU and elsewhere as jurisdictions debate the best approach to legal regulation of artificial intelligence. With our focus on better sharing, we have been particularly interested in how intellectual property policy intersects with AI. In this post, we will explore some of the challenges with applying copyright laws to the works created by generative AI.
Recently, text-to-image models like DALL-E 2, Midjourney, and Stable Diffusion have received significant attention because of their ability to create complex pieces of visual art just by following simple user text prompts. These systems essentially work by connecting user keywords to elements from images in their training datasets in order to create entirely new images. The length and complexity of the prompts that users input can vary dramatically, and the models are able to quickly create works across seemingly endless styles and genres.
To better understand the process of creating a piece of visual art using a text-to-image model and how it raises complex intellectual property issues, let’s look at an example. If I ask DALL-E to create an image of a “bicycle”, the AI takes that prompt, compares it to images and text descriptions in its training data, and creates a few examples of what it thinks I mean. I am mostly a spectator in this process, and not meaningfully in control of the end product. Without further instructions, the model produces what it thinks a “bicycle” should look like, and not necessarily what I had in mind when I started the process.
This image from my single prompt may or may not be what I was looking for, and in some circumstances, it may serve my needs. But if I have a particular vision for what I want my bicycle image to look like, I need to work more with DALL-E to bring that to life. I can add more prompts into the system, and the more specific I am, the closer DALL-E may get to what I want. That is, the more I describe what I want my image to look like and the more thought I give to what I want my end product to be, the more material the AI has to work with to find elements from its training data to bring out my artistic vision. I’m never entirely in control in this process, since the model does the physical creation of the work. And this doesn’t always work as planned: more specifics can lead to unexpected (and sometimes very strange) new elements added to the image. But ideally the more I work with the tool, the closer I may get to my vision.
The process of creating content using something like DALL-E 2 takes a considerable amount of trial and error, and over time you can develop skills in prompting the AI to generate what you want it to produce. Indeed, there are entire “prompt books” available to give people shortcuts to get the most out of DALL-E 2 without having to get over the learning curve. Yet even without these books, you can learn to use the system to create content that fits your artistic vision, given enough time and experience.
This simple example begins to illustrate how generative AI can blur the line between what is the work of a human artist and what is the work of a machine, and when it involves both, it reveals the difficulty of applying classic copyright laws to AI-generated content. Creative Commons has argued for several years that, absent significant and direct human creative input, AI outputs should not qualify for copyright protection. In part, this is because we believe that copyright law’s fundamental purpose is to foster human creativity. Where human creativity is not involved, the default should be no copyright protection. Freedom from copyright protection offers important benefits; most notably, it enables downstream users to build on, share, and create new works without worrying about infringing on anyone’s rights. Simply stated, autonomously created content produced by AI doesn’t involve human creative expression and simply isn’t within the subject matter of copyright. It should be free for all to use.
But what happens when human creativity is more deeply involved in the generative AI process? It would be difficult to argue that adding minimal inputs into DALL-E 2, like my “bicycle” example, is “creative” in any substantial way. However, as I manipulate the tool more by adding in more substantive and creative prompts to get it to produce the work that I have in mind, I have more creative input in the process and less is left to the AI alone. In this way, DALL-E begins to look more like an artist’s tool and less like an autonomous or semi-autonomous content generator.
In fact, these generative AI models can be powerful tools to encourage and enhance human creativity. Over the summer, an artist named Jason Allen won the Colorado State Fair digital arts competition with an image generated by Midjourney. Allen spent nearly 80 hours creating his work, adjusting text prompts to create hundreds of images, from which he selected three and manipulated them with other digital tools, until finally printing the works on canvas — certainly this goes beyond simply entering a few keywords into the tool. What is more, AI can give non-artists the ability to create new works. I, for example, do not have much talent or training in the visual arts. I have never been able to draw what I have in my head. But with these generative AI tools, I can make my artistic vision a reality in a way that I have not been able to do in the past. Imagine what someone with deep artistic vision but physical or visual challenges may be able to do with these tools — the possibilities are amazing!
If this is true — if generative AI can be an engine for human creativity instead of a substitute for it — then perhaps we need to consider if, when, and how copyright protection should attach to parts of some AI outputs, separating the unprotectable elements from what may be, on a case-by-case basis, protectable. On the other hand, rights restrictions come with potential downsides. While copyright can help incentivize creation in some circumstances, legal protection should not be granted where it disproportionately harms the public’s right to access information, culture, and knowledge, as well as freedom of expression. The questions for us, then, are when does the creativity that a user puts into a work based on a generative model rise to a level where rights protections should attach and when do the benefits of protection outweigh the costs.
In a blog post from 2020, P. Bernt Hugenholtz, João Pedro Quintais, and Daniel Gervais offered an interesting way to look at the generative AI creation process. They divided the process into three parts: conception (designing and specifying the final output), execution (producing draft versions of the output), and redaction (refining the output). Humans are primarily involved at the conception and redaction phases: coming up with the idea for the output, entering prompts into the system, iterating and refining the final product. AI, on the other hand, is essential to the execution phase, assembling the output following human inputs. The authors wrote that whether copyright should protect a piece of AI-generated content should be a case-by-case determination. By breaking down the process into these parts, we can evaluate what kind and how much human creative input goes into the production of AI outputs. Where human creative choices are expressed in a final output, that output should qualify for copyright protection; where an AI creates an output without any creative choices of a human author reflected in the final product, that output should not qualify for copyright.
There are no easy answers here. While we believe that AI-generated content should not be protected by copyright by default, the line between what are the works of human artists and what are the productions of AI algorithms will only become more complex as AI technologies continue to develop and are incorporated into other creative tools.
For over twenty years, Creative Commons has argued for a copyright system that encourages more sharing and a freer use of creative works, because we believe that an open approach to intellectual property rights benefits us all. The law should support and foster human creativity, and right now it is at best unclear how AI-generated content fits into this system. However copyright law applies to this new technology, it is essential for the law to strike a balance between the rights of people to use, share, and express themselves using creative works and incentivizing creativity through exclusive rights.