Two weeks ago we wrote about the U.S. Executive Order and announcement of Project Open Data, an open source project (managed on GitHub) that lays out the implementation details behind the President’s Executive Order and memo. The project offers more information on open licenses, and gives examples of acceptable licenses for U.S. federal data. Some of this information is clear, while other pieces require more clarification. Below we’ve provided some commentary and notes on the licensing parts of Project Open Data.
The Open Licenses page on Project Open Data says that a license will be considered “open” if the following conditions are met:
Reuse. The license must allow for reproductions, modifications and derivative works and permit their distribution under the terms of the original work.
Users can copy and make adaptations of the data. The government may use a copyleft license, thus requiring that adapted works be shared under the same license as the original. In our view, the reference to the government using a license is confusing. Works created by federal government employees are in the public domain, and a license is not appropriate, at least as a matter of U.S. copyright law. More on this below.
The rights attached to the work must not depend on the work being part of a particular package. If the work is extracted from that package and used or distributed within the terms of the work’s license, all parties to whom the work is redistributed should have the same rights as those that are granted in conjunction with the original package.
Everyone is offered the work under the same public license.
Redistribution. The license shall not restrict any party from selling or giving away the work either on its own or as part of a package made from works from many different sources.
Third parties can sell the data verbatim or produce adaptations of the data and sell those.
The license shall not require a royalty or other fee for such sale or distribution.
Users don’t have to pay to use the licensed data.
The license may require as a condition for the work being distributed in modified form that the resulting work carry a different name or version number from the original work.
When the data gets remixed the licensor can require that the remixer note that their remixed version is different from the original.
The rights attached to the work must apply to all to whom it is redistributed without the need for execution of an additional license by those parties.
Public licenses must be used, which means that everyone gets offered the data under the same terms, without the need to negotiate individual licenses.
The license must not place restrictions on other works that are distributed along with the licensed work. For example, the license must not insist that all other works distributed on the same medium are open.
The license doesn’t infect other data or content that is distributed alongside the openly licensed data. It’s important that the open data is marked as such; the same goes for marking the non-open data.
If adaptations of the work are made publicly available, these must be under the same license terms as the original work.
This is a confusing statement, because it seems to require that all data be licensed under a copyleft license. This does not align with the licensing options listed in the Open License Examples page.
No Discrimination against Persons, Groups, or Fields of Endeavor. The license must not discriminate against any person or group of persons. The license must not restrict anyone from making use of the work in a specific field of endeavor. For example, it may not restrict the work from being used in a business, or from being used for research.
Anyone may use the licensed data for any reason.
Open License Examples
The Open License Examples page offers a helpful guide as to which open licenses will be accepted for government data released by federal agencies. As we noted in our earlier post, there is some confusion in that the Open Data Policy Memo says, “open data are made available under an open license that places no restrictions on their use.” Saying that data should be placed under a license with no restrictions doesn’t make sense, since even the most “open” license (such as CC BY) makes attribution to the author a condition on using the license. If the United States truly wishes to make federal government data available without restriction, it could consider mandating only those tools that accomplish this, for example the CC0 Public Domain Dedication or the Open Data Commons Public Domain Dedication and License.
Data and content created by government employees within the scope of their employment are not subject to domestic copyright protection under 17 U.S.C. § 105.
The fact that data and content created by federal government employees is not subject to copyright protection in the United States is a longstanding positive feature of the U.S. Code. But as noted here, this copyright-free zone only applies when talking about domestic protection, i.e. inside the United States. Outside its borders, the United States government could assert that, for example, one of its works is protected under French copyright law, and then enforce its copyright in France. It’s unclear how much this legal nuance is leveraged outside of the United States. But it does seem to create a challenge for U.S. federal agencies in utilizing public domain dedication tools like CC0. This is because CC0 puts content into the worldwide public domain, whereas under Section 105 works created by federal government employees are only in the public domain in the United States. So, while it’s useful that works created by U.S. federal government employees are in the public domain in the United States, it’s a shame that this seems to preclude federal agencies from utilizing public domain tools like CC0, which would help communicate broad reuse rights easily and in machine-readable form. This raises the larger question: if information created by federal government employees is in the public domain in the United States, then is it inappropriate to license this data and content under one of the licenses noted below? And, if that is true, then what content will be licensed under the conformant licenses? Third party content?
When purchasing data or content from third-party vendors, however care must be taken to ensure the information is not hindered by a restrictive, non-open license. In general, such licenses should comply with the open knowledge definition of an open license. Several examples of common open licenses are listed below:
- Creative Commons BY, BY-SA, or CC0
- GNU Free Documentation License
- Open Data Commons Public Domain Dedication and Licence (PDDL)
- Open Data Commons Attribution License
- Open Data Commons Open Database License (ODbL)
- Creative Commons CC0
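To make the gap between “no restrictions” and the tools listed above concrete, here is a minimal sketch in Python. The `LicenseTerms` class, the `LICENSES` table, and the helper names are our own illustration, not part of Project Open Data, and the two flags are a deliberate simplification of each tool's main conditions (the GNU FDL, for instance, carries further requirements not modeled here).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LicenseTerms:
    """Simplified, illustrative view of a license's main conditions."""
    attribution: bool  # reusers must credit the source
    share_alike: bool  # adaptations must be shared under the same (copyleft) terms


# Hypothetical summary table of the tools named in the list above.
LICENSES = {
    "CC BY": LicenseTerms(attribution=True, share_alike=False),
    "CC BY-SA": LicenseTerms(attribution=True, share_alike=True),
    "CC0": LicenseTerms(attribution=False, share_alike=False),
    "GNU FDL": LicenseTerms(attribution=True, share_alike=True),
    "ODC PDDL": LicenseTerms(attribution=False, share_alike=False),
    "ODC-By": LicenseTerms(attribution=True, share_alike=False),
    "ODbL": LicenseTerms(attribution=True, share_alike=True),
}


def unrestricted(licenses):
    """Return the tools that impose neither attribution nor share-alike."""
    return sorted(
        name
        for name, terms in licenses.items()
        if not (terms.attribution or terms.share_alike)
    )


def copyleft(licenses):
    """Return the tools that require adaptations to keep the same terms."""
    return sorted(name for name, terms in licenses.items() if terms.share_alike)


if __name__ == "__main__":
    print("No conditions:", unrestricted(LICENSES))
    print("Copyleft:", copyleft(LICENSES))
```

Under these assumptions, only the public domain tools (CC0 and the PDDL) come out as placing no conditions on use, which is the point made above about the Memo's “no restrictions” language.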
Notwithstanding the questions above about licensing options for the work produced by federal government employees, the Administration is taking a great step in recommending that licenses should align with the Open Definition. In addition, the Administration might include information about appropriate software licenses, should those come into play when they release data.
Yesterday President Barack Obama issued an Executive Order requiring federal government information to be open and machine-readable by default. This Order is the latest in a series of actions going back to 2009 in support of increasing access to and transparency of government information.
In addition to the Executive Order, the White House released a Memorandum (PDF) explaining how federal government agencies will comply with the new open data policy.
This Memorandum requires agencies to collect or create information in a way that supports downstream information processing and dissemination activities. This includes using machine readable and open formats, data standards, and common core and extensible metadata for all new information creation and collection efforts. It also includes agencies ensuring information stewardship through the use of open licenses and review of information for privacy, confidentiality, security, or other restrictions to release.
It provides a forward-thinking set of guidelines for open data to be released by U.S. federal agencies:
Open data: For the purposes of this Memorandum, the term “open data” refers to publicly available data structured in a way that enables the data to be fully discoverable and usable by end users. In general, open data will be consistent with the following principles:
- Public. Consistent with OMB’s Open Government Directive, agencies must adopt a presumption in favor of openness to the extent permitted by law and subject to privacy, confidentiality, security, or other valid restrictions.
- Accessible. Open data are made available in convenient, modifiable, and open formats that can be retrieved, downloaded, indexed, and searched. Formats should be machine-readable (i.e., data are reasonably structured to allow automated processing). Open data structures do not discriminate against any person or group of persons and should be made available to the widest range of users for the widest range of purposes, often by providing the data in multiple formats for consumption. To the extent permitted by law, these formats should be non-proprietary, publicly available, and no restrictions should be placed upon their use.
- Described. Open data are described fully so that consumers of the data have sufficient information to understand their strengths, weaknesses, analytical limitations, security requirements, as well as how to process them. This involves the use of robust, granular metadata (i.e., fields or elements that describe data), thorough documentation of data elements, data dictionaries, and, if applicable, additional descriptions of the purpose of the collection, the population of interest, the characteristics of the sample, and the method of data collection.
- Reusable. Open data are made available under an open license that places no restrictions on their use.
- Complete. Open data are published in primary forms (i.e., as collected at the source), with the finest possible level of granularity that is practicable and permitted by law and other requirements. Derived or aggregate open data should also be published but must reference the primary data.
- Timely. Open data are made available as quickly as necessary to preserve the value of the data. Frequency of release should account for key audiences and downstream needs.
- Managed Post-Release. A point of contact must be designated to assist with data use and to respond to complaints about adherence to these open data requirements.
The Memorandum provides some more information about how U.S. government information will be made reusable:
Ensure information stewardship through the use of open licenses – Agencies must apply open licenses, in consultation with the best practices found in Project Open Data, to information as it is collected or created so that if data are made public there are no restrictions on copying, publishing, distributing, transmitting, adapting, or otherwise using the information for non-commercial or for commercial purposes.
Depending on the exact implementation details, this could be a fantastic move that would remove any legal confusion about using federal government data. By leveraging open licenses, the U.S. federal government would be doing a great service to reusers by communicating those rights available in advance. And, if the U.S. truly wishes to make federal government information available without restriction, it could consider using a tool such as the CC0 Public Domain Dedication. CC0 is used by many data providers to place open data directly in the public domain. We’ve already suggested this (PDF) as an option for sharing federally funded research data.
The White House should be commended for taking another positive step forward to ensure that U.S. government data is made legally and technically accessible and useable.
As research communities worldwide look for new ways to make the scientific process and its data and results more open and participatory, New Zealand is showing us how it is done.
In July 2010, the Cabinet approved the New Zealand Government Open Access and Licensing framework (NZGOAL), which provides guidance for agencies to follow when releasing copyright works and non-copyright material for re-use by others. NZGOAL seeks to standardise the licensing of government copyright works for re-use via Creative Commons New Zealand law licences and recommends the use of ‘no-known rights’ statements for non-copyrighted material.
Then in August 2011, the Declaration on Open and Transparent Government was also approved by the Cabinet whereby the government committed to actively release high value public data “to enable the private and community sectors to use it to grow the economy, strengthen the social and cultural fabric, and sustain the environment… to encourage business and community involvement in government decision-making.”
And earlier this month, in December 2012, the Education and Science Committee presented to the House of Representatives of the 50th Parliament its report on the Inquiry into 21st century learning environments and digital literacy. Among its recommendations were that the Government:
- review the intellectual property framework for (NZ) education system to resolve copyright issues that have been raised, including considering Creative Commons policy.
- consider the advantages and disadvantages of whether all documentation produced by the Ministry of Education for teaching and learning purposes should be released under a Creative Commons licence.
In keeping with this spirit, a group of researchers committed to bringing an Open Research conference to Australia and New Zealand is organizing a three-day event on February 6-8, 2013 in Auckland.
The purpose of this conference is to explore new, open models of research that speed up the effective transfer of research results and improve economic, environmental and social impacts. A growing community of researchers around the world are investigating new commercial and academic models to enhance the reach of their research. These new ways of doing research openly are akin to changes happening in the IT and business world, where open innovation has enabled people to achieve more together than they ever could alone.
Creative Commons plays a key role in promoting openness in science. Events such as this one in Auckland demonstrate the concern about open science that the community shares with Creative Commons. In the end, only good things can come out of openness, sharing and broad participation. Creative Commons is very pleased to see this event take place, and wishes it utmost success.
In the past few weeks, the Foundation Center and the philanthropic world have taken two big steps forward in transparency. First, 15 of the nation’s largest foundations joined the “Reporting Commitment,” agreeing to release grant information regularly through Foundation Center’s Glasspockets repository. Then last week, the Foundation Center relaunched IssueLab, an extensive repository of third-sector research. IssueLab’s mission is to “gather, index, and share the collective intelligence of the social sector” more effectively.
All of the IssueLab metadata is licensed under CC BY-NC-SA and all of the content is accessible (for reading, if not necessarily for other uses) for free. Everything released to Glasspockets under the Reporting Commitment is licensed under CC BY-NC.
Taken together, these initiatives present some interesting possibilities for the future of open data in the foundation space. Foundation Center president Bradford K. Smith discussed the implications of both initiatives in a blog post:
If you think foundations are only ATM machines and nonprofits just service providers, think again. With the launch of IssueLab, there is one place you can go to find more than eleven thousand knowledge products published, funded, produced, and/or generated by foundations and nonprofits in the U.S. and around the globe.
Last month, the Foundation Center announced the Reporting Commitment, an effort by fifteen of America’s largest philanthropic foundations to make their grants data — who they give money to, how much, where, and for what purpose — available in an open, machine-readable format. Starting today, through IssueLab, the social sector can also access what it knows as a result of that funding. A service of the Foundation Center, IssueLab gathers, indexes, and shares the sector’s collective intelligence on a free, open, and searchable platform, and encourages users to share, copy, distribute, and even adapt the work. It’s a big step for philanthropy and “open knowledge.”
Smith went on to explain why it’s important that these resources aren’t just freely available; they’re openly licensed too:
Free is good, but IssueLab promotes openness in a number of other ways. First, the metadata — the abstracts and “tags” developed for all reports in the collection — is available under a Creative Commons license and can be grabbed and/or remixed by anyone as long as they use it for non-commercial purposes. Second, only work that is available for free is included in the IssueLab collection. These are public “assets,” in that the organizations which produced them already have tax-exempt status and/or have received government funding, and they should be easy for the public to find. Sorry but Kardashian Konfidential will not be found on IssueLab. Third, IssueLab itself is an open-source platform whose underlying codebase/framework is continually being improved by a community of developers. And fourth, our own developers embrace the Open Archives Initiative (OAI), which develops and promotes interoperability standards to facilitate the efficient dissemination of online content.
Here at Creative Commons, we’re big proponents of foundations and other institutions sharing their data — and the works they produce or fund — under an open license. It makes sense for foundations to reciprocate the public’s trust by showing how philanthropic dollars have been spent, and the foundations that join in the Reporting Commitment make that information available much sooner and much more easily than it is under the federally-required information returns. By use of Glasspockets, the public can see and compare the activities of the participating foundations. Private foundations are tax-exempt because they are dedicated to the public benefit; those that share their data and research in ways that invite the reuse and contributions of others add a valuable new dimension to their public service.
Some of these developments may be dated by a month or more, but we want to make sure they are on your radar by pointing them out here.
Several open data portals have launched, including a Brazilian Open Data portal powered by the open-source data cataloguing software CKAN (run by the Open Knowledge Foundation – OKFN). The Ministry of Planning in Brazil worked with the OKFN to develop the portal, cultivating citizen participation through an open and transparent development process. Furthermore, the portal itself carries a default license of CC BY-SA. Since its May 4 launch, the portal has grown and now hosts 79 data sets and 893 resources. As noted on the OKFN blog, “the portal is part of a larger project called the National Infrastructure Open Data, or INDA. The general idea of INDA is to establish technical standards for open data, promote training and support public bodies in the task of publishing open data. This entire process is done through intra-government cooperation and cooperation between government and citizens, always aiming to achieve a real platform for open government.”
You should also take note of the Open GLAM data portal. This portal also runs on CKAN and is a hub for open data sets from GLAM institutions, aka Galleries, Libraries, Archives, and Museums. The datasets are licensed under various open licenses, and some with no rights attached thanks to the use of the CC0 public domain waiver.
In addition to open data portals, open data initiatives like the School of Data and the Open Data Institute are taking off. The School of Data is a collaboration between the OKFN and the Peer 2 Peer University (P2PU) to “create a set of courses for people to learn how to do interesting things with data, from beginners to experts.” In late May, the School of Data held a week-long kick-off sprint in Berlin with a virtual component, which I participated in by helping to start an open data challenge with virtual colleagues. The challenge is still in development, and once completed it will be a part of the School of Open as well as the School of Data. You can help to build it at the P2PU platform.
The kick-off yielded a great foundation for many other data tracks as part of the School of Data, which you can read about here.
The Open Data Institute is an initiative by the UK government to “incubate, nurture and mentor new businesses exploiting Open Data for economic growth” and to “promote innovation driven by the UK Government Open Data policy.” £10m will be invested over five years by the Technology Strategy Board, a non-departmental public body. The UK government has published its implementation plan as a PDF online. You can learn more in the Guardian article from last May.
The data-driven economy is also a hot topic within the EU, with the emergence of a data session at the European Commission’s 2nd Digital Agenda Assembly taking place today and tomorrow. The workshop will “explore the potential of data, some of the most promising economic and business aspects involved, and discuss how policy for data and our investment in R&D can better address the challenges of businesses and the public sector and further support innovative business development.”
Lastly, to put all the current activity around data into perspective, there is a thoughtful article by the OKFN’s Jonathan Gray on “What data can and cannot do.” The Guardian article reinforces the point that data, while valuable, is not very effective when divorced from context and left without interpretation. He encourages us to “cultivate a more critical literacy” towards data:
“Data can be an immensely powerful asset, if used in the right way. But as users and advocates of this potent and intoxicating stuff we should strive to keep our expectations of it proportional to the opportunity it represents.”
Essentially, opening up data is just the first step — and arguably, a necessary step toward ensuring that data can be reused, contextualized, and interpreted in meaningful ways.
To learn more about how CC tools may be applied to data, see our landing page and FAQ on data.
This Saturday’s International Journalism Festival in Perugia, Italy will unveil a months-long collaborative effort — the Data Journalism Handbook, a free, CC BY-SA licensed book to help journalists find and use data for better news reporting.
A joint initiative of the European Journalism Centre and the Open Knowledge Foundation, the collaborative book effort was kicked off at the 2011 Mozilla Festival: Media, Freedom and the Web — which gathered reporters, data journalism practitioners, advocates, and journalism and related organizations from around the globe. Over three days, participants researched, wrote, and edited chapters of the handbook. Contributors include the Australian Broadcasting Corporation, the BBC, the Chicago Tribune, Deutsche Welle, the Guardian, the Financial Times, La Nacion, The New York Times, ProPublica, The Washington Post, and many others — including Creative Commons. Creative Commons contributed to various pieces of the “Getting Data” section, including “Using and Sharing Data: the Black Letter, Fine Print, and Reality.” You can preview the outline here.
From the announcement,
Now more than ever, journalists need to know how to work with data. From covering public spending to elections, the Wikileaks cables to the financial crisis – journalists need to know where to find and request key datasets, how to make sense of them, and how to present them to the public.
Jonathan Gray, lead editor for the handbook, says: “The book gives us an unprecedented, behind-the-scenes look at how data is used by journalists around the world – from big news organisations to citizen reporters. We hope it will serve to inform and inspire a new generation of data journalists to use the information around us to communicate complex and important issues to the public.”
You can sign up to get the handbook when it goes live at http://www.datajournalismhandbook.org. The entire handbook will be available for free under CC BY-SA, with an alternative printed version and e-book to be published by O’Reilly Media.
Creative Commons is seeking a Project Coordinator for Science and Data! The Project Coordinator will organize, coordinate and manage projects related to data policy and governance and perform research and analysis on data governance topics across relevant sectors — particularly for science — and communicate results and recommendations from the project via writing and related outreach.
We are looking for someone who is experienced in policy analysis, development and processes, in addition to Open Source Software, Open Access/Open Data and other Open content projects. A science and/or legal background with international experience is highly desirable — especially as the position will be representing Creative Commons at global events in the Open Data and Open Science communities! See the job posting and apply at our opportunities page.
We will stop accepting applications after 11:59 p.m. PDT, May 25, 2012.
The last few months have seen a growth in open data, particularly from governments and libraries. Among the more recent open data adopters are the Austrian government, Italian Ministry of Education, University and Research, Italian Chamber of Deputies, and Harvard Library.
The Italian Ministry of Education, University and Research launched its Open Data Portal under CC BY, publishing the data of Italian schools (such as address, phone number, web site, administrative code), students (number, gender, performance), and teachers (number, gender, retirement, etc.). The Ministry aims to make all of its data eventually available and open for reuse, in order to improve transparency, aid in the understanding of the Italian scholastic system, and promote the creation of new tools and services for students, teachers and families.
Lastly, Harvard Library in the U.S. has released 12 million catalog records into the public domain using the CC0 public domain dedication tool. The move is in accordance with Harvard Library’s Open Metadata Policy. The policy’s FAQ states,
“With the CC0 public domain designation, Harvard waives any copyright and related rights it holds in the metadata. We believe that this will help foster wide use and yield developments that will benefit the library community and the public.”
Harvard’s press release cites additional motivations for opening its data,
John Palfrey, Chair of the DPLA, said, “With this major contribution, developers will be able to start experimenting with building innovative applications that put to use the vital national resource that consists of our local public and research libraries, museums, archives and cultural collections.” He added that he hoped that this would encourage other institutions to make their own collection metadata publicly available.
We are excited that CC tools are being used for open data. For questions related to CC and data, see our FAQ about data, which also links to many more governments, libraries, and organizations that have opened their data.
Today we’re pleased to announce that Athabasca University, BCcampus, and the Samuelson-Glushko Canadian Internet Policy and Public Interest Clinic have joined together to re-establish a CC affiliate team in Canada. All three organizations will take part in the official relaunch at the Creative Commons Salon Ottawa: Open Data on Friday, March 30.
This is not a new affiliate so much as a re-ignition of our existing Canadian community. Since 2004, a number of volunteers, interns and affiliate leads have supported and promoted CC and the use of open licenses generally in a Canadian context. This new team, representing three organizations spread across the geographic and cultural expanse of Canada, will be a key asset to support and lead the CC activities of this community.
Through public outreach, community building, tools, research, and resources this team will work with a network of open supporters to maximize digital creativity, sharing and innovation across Canada. The work of CC Canada is aligned with the overarching vision of Creative Commons — to help provide universal access to research and education, and full participation in culture to drive a new era of development, growth and productivity.
Whether you’re an artist, teacher, scientist, librarian, policymaker or just a regular citizen, Creative Commons provides you with a free, public, and standardized set of tools and licenses that create a balance between the reality of the Internet and the reality of copyright laws. CC Canada joins over four hundred other affiliates working in seventy-two jurisdictions around the world in supporting the use of Creative Commons infrastructure. Collectively this global network is creating a vast and growing digital commons of content that can be copied, distributed, edited, remixed, and built upon, all within the boundaries of copyright law.
Be sure to check out the CC Canada roadmap on the wiki. Congratulations to the CC Canada affiliate team!
In November we wrote that the White House Office of Science and Technology Policy (OSTP) was soliciting comments on two related Requests for Information (RFI). One asked for feedback on how the federal government should manage public access to scholarly publications resulting from federal investments, and the other wanted input on public access to the digital data funded by federal tax dollars.
Creative Commons submitted a response to both RFIs. Below is a brief summary of the main points. Several other groups and individuals have submitted responses to OSTP, and all the comments will eventually be made available on the OSTP website.
- The public funds tens of billions of dollars in research each year. The federal government can support scientific innovation, productivity, and economic efficiency of the taxpayer dollars it expends by instituting an open licensing policy.
- Scholarly articles created as a result of federally funded research should be released under full open access. Full open access policies will provide to the public immediate, free-of-cost online availability to federally funded research without restriction except that attribution be given to the source.
- The standard means for granting permission to the public aligned with full open access is through a Creative Commons Attribution (CC BY) license.
- If the federal government wants to maximize the impact of digital data resulting from federally funded scientific research, it should provide explicit, easy-to-understand information about the rights available to the public.
- The federal government should establish policies that ensure the public has cost-free, unimpeded access to the digital data resulting from federally funded scientific research. Access to this data should be made available as soon as possible, with due consideration to confidentiality and privacy issues, as well as the researchers’ need to receive credit and benefit from the work.
- The federal government can grant these permissions to the public by supporting policies whereby 1) data is made available by dedicating it to the public domain or 2) data is made available through a liberal license where at most downstream data users must give credit to the source of the data. CC offers tools such as the CC0 waiver and CC BY license in support of these goals.