Six Insights on Preference Signals for AI Training
At the intersection of rapid advancements in generative AI and our ongoing strategy refresh, we’ve been deeply engaged in researching, analyzing, and fostering conversations about AI and value alignment. Our goal is to ensure that our legal and technical infrastructure remains robust and suitable in this rapidly evolving landscape.
In these uncertain times, one thing is clear: there is an urgent need to develop new, nuanced approaches to digital sharing. This is Creative Commons’ speciality and we’re ready to take on this challenge by exploring a possible intervention in the AI space: preference signals.
Understanding Preference Signals
We’ve previously discussed preference signals, but let’s revisit this concept. Preference signals would empower creators to indicate the terms by which their work can or cannot be used for AI training. Preference signals would represent a range of creator preferences, all rooted in the shared values that inspired the Creative Commons (CC) licenses. At the moment, preference signals are not meant to be legally enforceable. Instead, they aim to define a new vocabulary and establish new norms for sharing and reuse in the world of generative AI.
For instance, a preference signal might be “Don’t train,” “Train, but disclose that you trained on my content,” or even “Train, only if using renewable energy sources.”
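To make the idea concrete, here is a minimal sketch, in TypeScript, of how a machine-readable preference signal might be structured. This is purely illustrative: the field names, the signal values, and the idea of attaching the signal to a work's metadata are our assumptions for the sake of the example, not a specification Creative Commons has adopted.

```typescript
// Hypothetical, illustrative types only; not a CC specification.

// A small vocabulary of training preferences, mirroring the examples above.
type TrainingPreference =
  | "no-training"                  // "Don't train"
  | "training-with-disclosure"     // "Train, but disclose that you trained on my content"
  | "training-renewable-energy";   // "Train, only if using renewable energy sources"

// A preference signal a creator might attach to a work's metadata.
interface PreferenceSignal {
  workUrl: string;                 // the work the signal applies to
  preference: TrainingPreference;  // the creator's stated preference
  statedBy: string;                // who is asserting the preference
  statedOn: string;                // ISO 8601 date the preference was declared
}

// Example: a creator asking for disclosure if their photo is used in training.
const example: PreferenceSignal = {
  workUrl: "https://example.org/photos/sunset.jpg",
  preference: "training-with-disclosure",
  statedBy: "https://example.org/about-me",
  statedOn: "2024-08-23",
};

console.log(`${example.workUrl}: ${example.preference}`);
```

The point of the sketch is only that a small, shared vocabulary could be read by both humans and machines; where such a signal would live and how it would be expressed in practice are among the open questions for the research phase.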
Why Do We Need New Tools for Expressing Creator Preferences?
Empowering creators to signal how they wish their content to be used in training generative AI models is crucial for several reasons:
- The use of openly available content within generative AI models may not be consistent with creators’ intentions in openly sharing it, especially when that sharing took place before the public launch and proliferation of generative AI.
- With generative AI, unanticipated uses of creator content are happening at scale, by a handful of powerful commercial players concentrated in a very small part of the world.
- Copyright is likely not the right framework for defining the rules of this newly formed ecosystem. As the CC licenses exist within the framework of copyright, they are also not the correct tools to prevent or limit uses of content to train generative AI. We also believe that a binary opt-in or opt-out system of contributing content to AI models is not nuanced enough to represent the spectrum of choice a creator may wish to exercise.
We’re in the research phase of exploring what a system of preference signals could look like, and over the next several months we’ll be hosting more roundtables and workshops to discuss and gather feedback from a range of stakeholders. In June, we took a big step forward by organizing our most focused and dedicated conversation about preference signals in New York City, hosted by the Engelberg Center at NYU.
Six Highlights from Our NYC Workshop on Preference Signals
- Creative Commons as a Movement
Creative Commons is a global movement, making us uniquely positioned to tackle what sharing means in the context of generative AI. We understand the importance of stewarding the commons and the balance between human creation and public sharing.
- Defining a New Social Contract
Designing tools for sharing in an AI-driven era involves collectively defining a new social contract for the digital commons. This process is essential for maintaining a healthy and collaborative community. Just as the CC licenses gave options for creators beyond no rights reserved and all rights reserved, preference signals have the potential to define a spectrum of sharing preferences in the context of AI that goes beyond the binary options of opt-in or opt-out.
- Communicating Values and Consent
Should preference signals communicate individual values and principles such as equity and fairness? Adding content to the commons with a CC license is an act of communicating values; should preference signals do the same? Workshop participants emphasized the need for mechanisms that support informed consent by both the creator and user.
- Supporting Creators and Strengthening the Commons
The most obvious and prevalent use case for preference signals is to limit the use of content within generative AI models to protect artists and creators. There is also a paradox: users may want to benefit from more permissive preferences on others’ content than they are willing to grant on their own. We believe that preference signals that meet the sector-specific needs of creators and users, as well as social and community-driven norms that continue to strengthen the commons, are not mutually exclusive.
- Tagging AI-Generated vs. Human-Created Content
While tags for AI-generated content are becoming common, what about tags for human-created content? The general goal of preference signals should be to foster the commons and encourage more human creativity and sharing. For many, discussions about AI are inherently discussions about labor issues and a risk of exploitation. At this time, the law has no concept of “lovingly human”, since humanness has been taken for granted until now. Is “lovingly human” the new “non-commercial”? Generative AI models also force us to consider what it means to be a creator, especially as most digital creative tools will soon be driven by AI. Is there a specific set of activities that need to be protected in the process of creating and sharing? How do we address human and generative AI collaboration inputs and outputs?
- Prioritizing AI for the Public Good
We must ensure that AI benefits everyone. Increased public investment and participatory governance of AI are vital. Large commercial entities should provide a public benefit in exchange for using creator content for training purposes. We cannot rely on commercial players to set forth industry norms that influence the future of the open commons.
Next Steps
Moving forward, our success will depend on expanded and representative community consultations. Over the coming months, we will:
- Continue to convene our community members globally to gather input in this rapidly developing area;
- Continue to consult with legal and technical experts to consider feasible approaches;
- Actively engage with the interconnected initiatives of other civil society organizations whose priorities are aligned with ours;
- Define the use cases for which a preference signals framework would be most effective;
- Prototype openly and transparently, seeking feedback and input along the way to shape what the framework could look like;
- Build and strengthen the partnerships best suited to help us carry this work forward.
These high-level steps are just the beginning. Our hope is to be piloting a framework within the next year. Watch this space as we explore and share more details and plans. We’re grateful to Morrison Foerster for providing support for the workshop in New York.
Join us by supporting this ongoing work
You have the power to make a difference in a way that suits you best. By donating to CC, you are not only helping us continue our vital work; your contribution is also tax deductible. Thank you for your support.
Posted 23 August 2024