Seal Of The Executive Office Of The President / Public Domain
Yesterday President Barack Obama issued an Executive Order requiring federal government information to be open and machine-readable by default. This Order is the latest in a series of actions going back to 2009 in support of increasing access to and transparency of government information.
In addition to the Executive Order, the White House released a Memorandum (PDF) explaining how federal government agencies will comply with the new open data policy.
This Memorandum requires agencies to collect or create information in a way that supports downstream information processing and dissemination activities. This includes using machine-readable and open formats, data standards, and common core and extensible metadata for all new information creation and collection efforts. It also requires agencies to ensure information stewardship through the use of open licenses and to review information for privacy, confidentiality, security, or other restrictions before release.
It provides a forward-thinking set of guidelines for open data to be released by U.S. federal agencies:
Open data: For the purposes of this Memorandum, the term “open data” refers to publicly available data structured in a way that enables the data to be fully discoverable and usable by end users. In general, open data will be consistent with the following principles:
- Public. Consistent with OMB’s Open Government Directive, agencies must adopt a presumption in favor of openness to the extent permitted by law and subject to privacy, confidentiality, security, or other valid restrictions.
- Accessible. Open data are made available in convenient, modifiable, and open formats that can be retrieved, downloaded, indexed, and searched. Formats should be machine-readable (i.e., data are reasonably structured to allow automated processing). Open data structures do not discriminate against any person or group of persons and should be made available to the widest range of users for the widest range of purposes, often by providing the data in multiple formats for consumption. To the extent permitted by law, these formats should be non-proprietary, publicly available, and no restrictions should be placed upon their use.
- Described. Open data are described fully so that consumers of the data have sufficient information to understand their strengths, weaknesses, analytical limitations, security requirements, as well as how to process them. This involves the use of robust, granular metadata (i.e., fields or elements that describe data), thorough documentation of data elements, data dictionaries, and, if applicable, additional descriptions of the purpose of the collection, the population of interest, the characteristics of the sample, and the method of data collection.
- Reusable. Open data are made available under an open license that places no restrictions on their use.
- Complete. Open data are published in primary forms (i.e., as collected at the source), with the finest possible level of granularity that is practicable and permitted by law and other requirements. Derived or aggregate open data should also be published but must reference the primary data.
- Timely. Open data are made available as quickly as necessary to preserve the value of the data. Frequency of release should account for key audiences and downstream needs.
- Managed Post-Release. A point of contact must be designated to assist with data use and to respond to complaints about adherence to these open data requirements.
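The principles above translate naturally into practice as a structured dataset record. The sketch below shows what such a record might look like; the field names echo the kind of "common core" metadata the Memorandum describes (title, description, keywords, a contact point, and distributions in open formats), but the specific names, agency URL, and format list are assumptions for illustration, not an official schema.

```python
import json

# Hypothetical dataset record modeling the Memorandum's principles:
# described (metadata fields), accessible (open-format distributions),
# managed post-release (a contact point). All values are invented.
record = {
    "title": "Annual Widget Production Statistics",
    "description": "Counts of widgets produced per state, 2010-2012.",
    "keyword": ["widgets", "manufacturing", "statistics"],
    "modified": "2013-05-09",
    "contactPoint": "open-data@example.agency.gov",  # hypothetical contact
    "accessLevel": "public",
    "distribution": [
        {"format": "CSV", "downloadURL": "https://example.agency.gov/widgets.csv"},
        {"format": "JSON", "downloadURL": "https://example.agency.gov/widgets.json"},
    ],
}

def is_machine_readable(rec):
    """Toy check of the 'accessible' principle: every distribution
    uses a non-proprietary, machine-readable format."""
    open_formats = {"CSV", "JSON", "XML", "RDF"}
    return all(d["format"] in open_formats for d in rec["distribution"])

# Serializing to JSON makes the record itself machine-readable.
print(json.dumps(record, indent=2))
print(is_machine_readable(record))  # True
```

The point of the check is the "reasonably structured to allow automated processing" clause: a PDF table of the same numbers would satisfy "public" but fail "accessible."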
The Memorandum provides some more information about how U.S. government information will be made reusable:
Ensure information stewardship through the use of open licenses – Agencies must apply open licenses, in consultation with the best practices found in Project Open Data, to information as it is collected or created so that if data are made public there are no restrictions on copying, publishing, distributing, transmitting, adapting, or otherwise using the information for non-commercial or for commercial purposes.
Depending on the exact implementation details, this could be a fantastic move that would remove any legal confusion about using federal government data. By leveraging open licenses, the U.S. federal government would be doing a great service to reusers by communicating those rights available in advance. And, if the U.S. truly wishes to make federal government information available without restriction, it could consider using a tool such as the CC0 Public Domain Dedication. CC0 is used by many data providers to place open data directly in the public domain. We’ve already suggested this (PDF) as an option for sharing federally funded research data.
The White House should be commended for taking another positive step forward to ensure that U.S. government data is made legally and technically accessible and usable.
Celebrating Open Data
Open Data Day 2013 can be described as a success. Why? Because hundreds of people participated in more than 100 events across six continents, celebrating open data and all that we can do with it. Here at CC, we planned and executed a community-supported event to build open learning resources around the topic of Open Science, done in a hackathon-style sprint that gathered people with diverse backgrounds and experience levels: an undergraduate student and a post-doc researcher, both from Stanford; an instructional designer from Los Angeles; an associate professor from Auburn University; plus a handful more very talented people. Oh, and a mother and high-school-aged daughter duo who simply wanted to see what “open” is about. We all connected to help build an open course to teach others about Open Science. Here’s how we did it.
Open Content for Learning
It’s worth mentioning that the course materials produced during the sprint will be openly licensed CC BY and shared so that their benefit to Open Education and Open Science is not restricted by legal boundaries. The material is being curated and will undergo a review process over the next couple of weeks before being ported to the School of Open, a collaborative project by Creative Commons, P2PU, and a strong volunteer community of “open” experts and organizations. Though fitting the content to P2PU’s online course platform was in the back of our minds, we spent most of our time identifying important ideas that explain what Open Access, Open Research, and Open Data mean for Open Science, and how we can engage more “young scientists” (an ever-broadening term) in the ways of Open.
The Net Works Effect*
Layered on top of the open content itself, which is elastic in nature, our approach to this hackathon-style event focused on being very lean: the type of event that can be run by anyone, anywhere, with very few resources. We created a Google Drive folder and a set of publicly editable documents to collect openly licensed resources, map out a tentative module/lesson plan, coordinate communications between participants, and generally provide a single place to collaborate on Open Science learning materials. We connected with other event organizers at the OKFN and PLOS, and established mailing lists, Twitter hashtags, and other channels of communication so that there was a support network for those organizing events and those interested in participating in Open Data Day events on some level. David Eaves, Rufus Pollock, Ross Mounce, and many others were loud and clear on the Open Data Day mailing list, making sure news about each event was passed around.
Before the event, a registration page was created for the course sprint. We offered a handful of in-person tickets for folks to come down to our office in Mountain View, as well as a number of remote participant tickets for those who were in different geographical locations. Google Hangout “rooms” were set up on laptop computers placed in physical conference rooms at the CC HQ, allowing remote participants to work in real-time with persons on the ground. To see a more detailed description of the day’s event, see the schedule document here.
So what did we make? The sprinters involved in the project collected and organized resources that explain common aspects of Open Science. The main sections (access, methods, data) were helpful in searching for content, but there was a great deal of overlap between sections, which highlighted the relationships between them. Beyond the collection of resources, sets of tasks were built that are meant to guide learners out beyond the course and into the communities of Open Science, interacting with the ideas, technical systems, and people who are opening up science. The Introduction to Open Science course on P2PU is still in a lightly-framed state, but the plan is to include the course in the launch of the School of Open during Open Education Week, March 11-15. If you’re interested in helping make this transition, or in helping build or review other courses that we call “open,” come introduce yourself in the School of Open Google Group. Or check out what else is happening on P2PU.
Beyond the course itself, we’re going to take a look at the sprint process we used, and work out some of the kinks. This rapid open-content creation technique is manageable, low-cost, and builds the Commons. There’s enough openly-licensed content existing on the web to produce a range of learning experiences, so now it seems that it’s a matter of developing open technology tools to the point where we can build education on the web together, easily. For more information about this and other Open Education projects being worked on by Creative Commons, see this page.
We Got Together for Open
Thanks to those who were able to participate in the Open Science course sprint, as well as those who contributed to the planning documents leading up to the event. We’ve done well.
PLOS Sci-Ed Blog, Guest Post: Open Data Day, Course Sprints, and Hackathons!
David Eaves’ Blog, International #OpenDataDay: Now at 90 Cities (and… the White House)
Debbie Morrison’s Blog, A Course Design ‘Sprint’: My Experience in an Education Hackathon
Also: The Flickr album from the event can be found here.
*This phrase was coined by P. Kishor here, describing the interconnectedness of Open Data Day events.
On this 10th anniversary of CC, there’s much to celebrate: Creative Commons licenses and tools have been embraced by millions of photographers, musicians, videographers, bloggers, and others sharing countless creative works freely online. One area of growth in the use of CC licenses and public domain tools is government works. Government adoption of Creative Commons may prove to be one of the most significant movements looking into the future. As David Bollier said well, “Governments are coming to realize that they are one of the primary stewards of intellectual property, and that the wide dissemination of their work—statistics, research, reports, legislation, judicial decisions—can stimulate economic innovation, scientific progress, education, and cultural development.” If governments around the world are going to unleash the power of hundreds of billions of dollars of publicly funded education, research and scientific resources, we need broad adoption of open policies aligned with the belief that the public should have access to the resources they paid for. At a fundamental level, “all publicly funded resources [should be] openly licensed resources.”
CC licenses and tools have been implemented by government entities and public sector bodies around the world. And over the last few years, there’s been an increasing focus on governments aligning to the principle that the public should have access to the materials it pays for. These funding mandates, which require that grantees release content produced with grant funds under an open license, have been an increasingly common way for governments to support openness. Legislation involving the open licensing of publicly funded educational materials has been passed in Brazil, Poland, the United States, and Canada. The UK has championed an open access policy for publicly funded research under the Creative Commons Attribution (CC BY) license. Governments in Australia and New Zealand have opted for comprehensive open licensing policies for all government-produced works, by default releasing public information and data under CC BY. The Dutch government has taken this one step further, opting to release government information directly into the public domain using the CC0 Public Domain Dedication.
In addition to governments, other publicly minded institutions like philanthropic foundations and intergovernmental organizations are supporting open licensing. Several foundations, including the William and Flora Hewlett Foundation, the Open Society Foundations, and the Bill & Melinda Gates Foundation, already require their grantees to release content they build with grant money under open licenses. And CC continues to explore how to evaluate current copyright policies within the foundation world and suggest how foundations (and their grantees) can benefit from open licensing for their grant-funded materials. Intergovernmental organizations like the Commonwealth of Learning and the World Bank have adopted open licensing policies to share their publications too.
Open advocates – whether it be in support of open sharing of publicly funded educational materials, open access to scientific research articles, access to a huge trove of cultural heritage resources from libraries and museums, or open licensing for public sector information and government datasets – have been increasingly active over the last few years, particularly in working to educate policymakers about the importance and benefits of open licensing. These efforts include the development of declarations such as the Budapest Open Access Initiative, Cape Town and Paris Declarations on Open Educational Resources, the Washington Declaration on Intellectual Property and the Public Interest, the Panton Principles, and many others. Advocates have been key in communicating the need for governments to consider open licensing, whether it be for federal agencies, governing bodies like the European Commission, or through multilateral negotiations such as WIPO. And the grassroots open community has been extremely active in raising awareness of open licensing, whether it be through the tireless work of CC Affiliates, the broad network of open data activists from the Open Knowledge Foundation, legal experts championing Open Government Data Principles, and persons participating in events from Open Access Week to Open Education Week to Public Domain Day. All of these actions have rallied around the common theme that governments and public bodies should release content they create or fund under open licenses, for the benefit of all.
Since the beginning of Creative Commons, governments and public sector bodies have leveraged CC licenses and public domain tools to share their data, publicly funded research, educational and cultural content, and other digital materials. Governments are increasingly leveraging CC licenses as part of their strategy to proactively share resources, promote effective spending, and champion innovation. A massive amount of work is ahead, but with a committed community of advocates, interested governmental departments, and open-minded policymakers, we can work together toward a close integration of open licensing inside the public sector. If we do so, governments can better support their populations with the information they need, increase the effectiveness of the public’s investment, and contribute to a true global commons.
In the past few weeks, the Foundation Center and the philanthropic world have taken two big steps forward in transparency. First, 15 of the nation’s largest foundations joined the “Reporting Commitment,” agreeing to release grant information regularly through Foundation Center’s Glasspockets repository. Then last week, the Foundation Center relaunched IssueLab, an extensive repository of third-sector research. IssueLab’s mission is to “gather, index, and share the collective intelligence of the social sector” more effectively.
All of the IssueLab metadata is licensed under CC BY-NC-SA, and all of the content is accessible (for reading, if not necessarily for other uses) for free. Everything released to Glasspockets under the Reporting Commitment is licensed under CC BY-NC.
Taken together, these initiatives present some interesting possibilities for the future of open data in the foundation space. Foundation Center president Bradford K. Smith discussed the implications of both initiatives in a blog post:
If you think foundations are only ATM machines and nonprofits just service providers, think again. With the launch of IssueLab, there is one place you can go to find more than eleven thousand knowledge products published, funded, produced, and/or generated by foundations and nonprofits in the U.S. and around the globe.
Last month, the Foundation Center announced the Reporting Commitment, an effort by fifteen of America’s largest philanthropic foundations to make their grants data — who they give money to, how much, where, and for what purpose — available in an open, machine-readable format. Starting today, through IssueLab, the social sector can also access what it knows as a result of that funding. A service of the Foundation Center, IssueLab gathers, indexes, and shares the sector’s collective intelligence on a free, open, and searchable platform, and encourages users to share, copy, distribute, and even adapt the work. It’s a big step for philanthropy and “open knowledge.”
Smith went on to explain why it’s important that these resources aren’t just freely available; they’re openly licensed too:
Free is good, but IssueLab promotes openness in a number of other ways. First, the metadata — the abstracts and “tags” developed for all reports in the collection — is available under a Creative Commons license and can be grabbed and/or remixed by anyone as long as they use it for non-commercial purposes. Second, only work that is available for free is included in the IssueLab collection. These are public “assets,” in that the organizations which produced them already have tax-exempt status and/or have received government funding, and they should be easy for the public to find. Sorry but Kardashian Konfidential will not be found on IssueLab. Third, IssueLab itself is an open-source platform whose underlying codebase/framework is continually being improved by a community of developers. And fourth, our own developers embrace the Open Archives Initiative (OAI), which develops and promotes interoperability standards to facilitate the efficient dissemination of online content.
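The Open Archives Initiative protocol Smith mentions (OAI-PMH) is deliberately simple: a harvester issues plain HTTP GET requests with a `verb` parameter and parses the XML reply, so IssueLab's metadata can be pulled into other catalogs with a few lines of code. The sketch below builds a `ListRecords` request and parses a trimmed sample response; the repository base URL is a hypothetical stand-in, since IssueLab's actual endpoint address isn't given here.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical OAI-PMH endpoint; a real harvester would substitute the
# repository's published base URL.
BASE = "https://repository.example.org/oai"
params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
request_url = BASE + "?" + urlencode(params)

# A trimmed sample of the XML an OAI-PMH server returns for that call:
sample = """<?xml version="1.0"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1234</identifier></header>
    </record>
  </ListRecords>
</OAI-PMH>"""

# Elements inherit the OAI-PMH default namespace, so we match on the
# fully qualified tag name.
root = ET.fromstring(sample)
identifiers = [
    el.text
    for el in root.iter("{http://www.openarchives.org/OAI/2.0/}identifier")
]

print(request_url)
print(identifiers)
```

This is the interoperability payoff of the standard: any OAI-compliant repository can be harvested with the same generic loop, rather than one scraper per site.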
Here at Creative Commons, we’re big proponents of foundations and other institutions sharing their data — and the works they produce or fund — under an open license. It makes sense for foundations to reciprocate the public’s trust by showing how philanthropic dollars have been spent, and the foundations that join the Reporting Commitment make that information available much sooner and much more easily than it is under the federally required information returns. Through Glasspockets, the public can see and compare the activities of the participating foundations. Private foundations are tax-exempt because they are dedicated to the public benefit; those that share their data and research in ways that invite the reuse and contributions of others add a valuable new dimension to their public service.
We’re psyched to be a part of OKFestival: Open Knowledge in Action. The OKFestival takes place September 17-22, 2012 in Helsinki, Finland, and features “a series of hands-on workshops, talks, hackathons, meetings and sprints” exploring a variety of areas including open development, open cultural heritage, and gender and diversity in openness. You can buy tickets to the festival for any number of days until September 16 at http://okfestival.org/early-bird-okfest-tickets/. The OKFestival website has all the details, including the preliminary schedule.
We are particularly interested in and helped to shape the Open Research and Education topic stream, where we are leading an “Open Peer Learning” workshop on Wednesday (Sept 19) from 11:30am to 3:30pm. For the workshop the School of Open (co-led by Creative Commons and P2PU) is combining forces with the OKFN’s School of Data to explore, test and develop learning challenges around open tools and practices in data, research, and education. Participation in the workshop is free (you don’t even have to buy a festival ticket), but space is limited, so RSVP at: http://peerlearningworkshop.eventbrite.com/
The workshop will be held in this awesome space, reserved for four HACK workshops:
For those of you able to come to Helsinki, look out for our CC staff reps, Jessica Coates and Timothy Vollmer, along with many of our European affiliates who will be holding a regional meeting on Day four of the fest.
For the rest of you, you can still participate in helping to build initiatives like the School of Open from wherever you are by visiting http://schoolofopen.org/ and signing up for the mailing lists there.
Some of these developments may be dated by a month or more, but we want to make sure they are on your radar by pointing them out here.
Several open data portals have launched, including a Brazilian Open Data portal powered by the open-source data cataloguing software CKAN (run by the Open Knowledge Foundation – OKFN). The Ministry of Planning in Brazil worked with the OKFN to develop the portal, cultivating citizen participation through an open and transparent development process. Furthermore, the portal itself carries a default license of CC BY-SA. Since its May 4 launch, the portal has grown and now hosts 79 data sets and 893 resources. As noted on the OKFN blog, “the portal is part of a larger project called the National Infrastructure Open Data, or INDA. The general idea of INDA is to establish technical standards for open data, promote training and support public bodies in the task of publishing open data. This entire process is done through intra-government cooperation and cooperation between government and citizens, always aiming to achieve a real platform for open government.”
You should also take note of the Open GLAM data portal. This portal also runs on CKAN and is a hub for open data sets from GLAM institutions, aka Galleries, Libraries, Archives, and Museums. The datasets are licensed under various open licenses, and some with no rights attached thanks to the use of the CC0 public domain waiver.
In addition to open data portals, open data initiatives like the School of Data and the Open Data Institute are taking off. The School of Data is a collaboration between the OKFN and the Peer 2 Peer University (P2PU) to “create a set of courses for people to learn how to do interesting things with data, from beginners to experts.” In late May, the School of Data held a week-long kick-off sprint in Berlin with a virtual component, which I participated in by helping to start an open data challenge with virtual colleagues. The challenge is still in development, and once completed it will be a part of the School of Open as well as the School of Data. You can help to build it at the P2PU platform.
The kick-off yielded a great foundation for many other data tracks as part of the School of Data, which you can read about here.
The Open Data Institute is an initiative by the UK government to “incubate, nurture and mentor new businesses exploiting Open Data for economic growth” and to “promote innovation driven by the UK Government Open Data policy.” £10m will be invested over five years by the Technology Strategy Board, a non-departmental public body. The UK government has published its implementation plan as a PDF online. You can learn more in The Guardian’s article from last May.
The data-driven economy is also a hot topic within the EU, with the emergence of a data session at the European Commission’s 2nd Digital Agenda Assembly taking place today and tomorrow. The workshop will “explore the potential of data, some of the most promising economic and business aspects involved, and discuss how policy for data and our investment in R&D can better address the challenges of businesses and the public sector and further support innovative business development.”
Lastly, to put all the current activity around data into perspective, the OKFN’s Jonathan Gray has written a thoughtful article on “What data can and cannot do.” The Guardian piece reinforces the point that data, while valuable, is not very effective when divorced from context and interpretation. He encourages us to “cultivate a more critical literacy” towards data:
“Data can be an immensely powerful asset, if used in the right way. But as users and advocates of this potent and intoxicating stuff we should strive to keep our expectations of it proportional to the opportunity it represents.”
Essentially, opening up data is just the first step — and arguably, a necessary step to ensuring that data can be reused, contextualized, and interpreted in meaningful ways.
To learn more about how CC tools may be applied to data, see our landing page and FAQ on data.
This Saturday’s International Journalism Festival in Perugia, Italy will unveil a months-long collaborative effort — the Data Journalism Handbook, a free, CC BY-SA licensed book to help journalists find and use data for better news reporting.
A joint initiative of the European Journalism Centre and the Open Knowledge Foundation, the collaborative book effort was kicked off at the 2011 Mozilla Festival: Media, Freedom and the Web — which gathered reporters, data journalism practitioners, advocates, and journalism and related organizations from around the globe. Over three days, participants researched, wrote, and edited chapters of the handbook. Contributors include the Australian Broadcasting Corporation, the BBC, the Chicago Tribune, Deutsche Welle, the Guardian, the Financial Times, La Nacion, The New York Times, ProPublica, The Washington Post, and many others — including Creative Commons. Creative Commons contributed to various pieces of the “Getting Data” section, including “Using and Sharing Data: the Black Letter, Fine Print, and Reality.” You can preview the outline here.
From the announcement,
Now more than ever, journalists need to know how to work with data. From covering public spending to elections, the Wikileaks cables to the financial crisis – journalists need to know where to find and request key datasets, how to make sense of them, and how to present them to the public.
Jonathan Gray, lead editor for the handbook, says: “The book gives us an unprecedented, behind-the-scenes look at how data is used by journalists around the world – from big news organisations to citizen reporters. We hope it will serve to inform and inspire a new generation of data journalists to use the information around us to communicate complex and important issues to the public.”
You can sign up to get the handbook when it goes live at http://www.datajournalismhandbook.org. The entire handbook will be available for free under CC BY-SA, with an alternative printed version and e-book to be published by O’Reilly Media.
Creative Commons is seeking a Project Coordinator for Science and Data! The Project Coordinator will organize, coordinate and manage projects related to data policy and governance and perform research and analysis on data governance topics across relevant sectors — particularly for science — and communicate results and recommendations from the project via writing and related outreach.
We are looking for someone who is experienced in policy analysis, development and processes, in addition to Open Source Software, Open Access/Open Data and other Open content projects. A science and/or legal background with international experience is highly desirable — especially as the position will be representing Creative Commons at global events in the Open Data and Open Science communities! See the job posting and apply at our opportunities page.
We will stop accepting applications after 11:59 p.m. PDT, May 25, 2012.
The last few months have seen a growth in open data, particularly from governments and libraries. Among the more recent open data adopters are the Austrian government, the Italian Ministry of Education, University and Research, the Italian Chamber of Deputies, and Harvard Library.
The Italian Ministry of Education, University and Research launched its Open Data Portal under CC BY, publishing the data of Italian schools (such as address, phone number, web site, administrative code), students (number, gender, performance), and teachers (number, gender, retirement, etc.). The Ministry aims to make all of its data eventually available and open for reuse, in order to improve transparency, aid in the understanding of the Italian scholastic system, and promote the creation of new tools and services for students, teachers and families.
Lastly, Harvard Library in the U.S. has released 12 million catalog records into the public domain using the CC0 public domain dedication tool. The move is in accordance with Harvard Library’s Open Metadata Policy. The policy’s FAQ states,
“With the CC0 public domain designation, Harvard waives any copyright and related rights it holds in the metadata. We believe that this will help foster wide use and yield developments that will benefit the library community and the public.”
Harvard’s press release cites additional motivations for opening its data,
John Palfrey, Chair of the DPLA, said, “With this major contribution, developers will be able to start experimenting with building innovative applications that put to use the vital national resource that consists of our local public and research libraries, museums, archives and cultural collections.” He added that he hoped that this would encourage other institutions to make their own collection metadata publicly available.
We are excited that CC tools are being used for open data. For questions related to CC and data, see our FAQ about data, which also links to many more governments, libraries, and organizations that have opened their data.
Yesterday, Nature Publishing Group announced the launch of a new linked data platform, providing access to “20 million Resource Description Framework (RDF) statements, including primary metadata for more than 450,000 articles published by NPG since 1869. The datasets include basic citation information (title, author, publication date, etc) as well as NPG specific ontologies.” All datasets are published using the CC0 public domain dedication, which is not a license, but a legal tool that may be used by anyone wishing to permanently surrender the copyright and database rights (where they exist) they may have in a work, thereby placing it as nearly as possible into the public domain.
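RDF statements like NPG's are just subject-predicate-object triples, and in the line-oriented N-Triples serialization each statement is one line ending in a period, so basic citation metadata can be extracted with a few lines of standard-library code. The sketch below parses two invented statements; the article URI and values are hypothetical, though the Dublin Core predicate URIs are the standard ones for titles and dates.

```python
# Two invented N-Triples statements in the style of basic citation
# metadata (the article URI and literal values are made up; the
# dc/terms predicates are real Dublin Core property URIs).
triples_src = """\
<http://data.example.org/articles/abc123> <http://purl.org/dc/terms/title> "On Widgets" .
<http://data.example.org/articles/abc123> <http://purl.org/dc/terms/date> "1869-11-04" .
"""

def parse_ntriples(text):
    """Minimal parser: split each line into a (subject, predicate,
    object) tuple. Handles only IRIs and simple quoted literals."""
    triples = []
    for line in text.strip().splitlines():
        subj, pred, rest = line.split(" ", 2)
        obj = rest.rsplit(" .", 1)[0].strip('"<>')
        triples.append((subj.strip("<>"), pred.strip("<>"), obj))
    return triples

for s, p, o in parse_ntriples(triples_src):
    # Print just the local name of each predicate with its value.
    print(p.rsplit("/", 1)[-1], "->", o)
```

A production harvester would use a full RDF library rather than string splitting, but the example shows why CC0 plus a standard serialization matters: twenty million such statements can be loaded, merged, and queried by anyone without legal or format barriers.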
This is an excellent move by NPG, especially following an opinion piece they published in 2009 explicitly recommending open sharing and the use of CC0 to put data in the public domain, entitled, “Post-publication sharing of data and tools”:
“Although it is usual practice for major public databases to make data freely available to access and use, any restrictions on use should be strongly resisted and we endorse explicit encouragement of open sharing, for example under the newly available CC0 public domain waiver of Creative Commons.”
Many more organizations and institutions are using CC0 to release their data, which you can peruse at our wiki page for CC0 uses with data and databases. CC licenses are also used for data; read more about this and other issues plus an FAQ on CC and data at http://wiki.creativecommons.org/Data.