Today Open Knowledge and the Open Definition Advisory Council announced the release of version 2.0 of the Open Definition. The Definition “sets out principles that define openness in relation to data and content,” and is the baseline from which various public licenses are measured. Any content released under an Open Definition-conformant license means that anyone can “freely access, use, modify, and share that content, for any purpose, subject, at most, to requirements that preserve provenance and openness.” The CC BY and CC BY-SA 4.0 licenses are conformant with the Open Definition, as are all previous versions of these licenses (1.0 – 3.0, including jurisdiction ports). The CC0 Public Domain Dedication is also aligned with the Open Definition.
The Open Definition is an important standard that communicates the fundamental legal conditions that make content and data open. One of the most notable updates to version 2.0 is that it separates and clarifies the requirements under which an individual work will be considered open from the conditions under which a license will be considered conformant with the Definition.
Public sector bodies, GLAM institutions, and open data initiatives around the world are looking for recommendations and advice on the best licenses for their policies and projects. It’s helpful to be able to point policymakers and data publishers to a neutral, community-supported definition with a list of approved licenses for sharing content and data (and of course, we think that CC BY, CC BY-SA, and CC0 are some of the best, especially for publicly funded materials). And while we still see some governments and other institutions attempting to create their own custom licenses, hopefully the Open Definition 2.0 will help guide these groups toward understanding the benefits of using an existing OD-compliant license. The more that content and data providers use one of these licenses, the more they’ll add to a huge pool of legally reusable and interoperable content for anyone to use and repurpose.
To the extent that new licenses continue to be developed, the Open Definition Advisory Council has been honing a process to assist in evaluating whether licenses meet the Open Definition. Version 2.0 continues to urge potential license stewards to think carefully before attempting to develop their own license, and requires that they understand the common conditions and restrictions that should (or should not) be contained in a new license in order to promote interoperability with existing licenses.
Open Definition version 2.0 was collaboratively and transparently developed with input from experts involved in open access, open culture, open data, open education, open government, open source and wiki communities. Congratulations to Open Knowledge and the Open Definition Advisory Council on this important improvement.
Mangrove forests have been described by the World Wildlife Fund as one of the world’s most threatened tropical ecosystems. In an effort to protect mangroves and raise awareness of this problem, MapWorks Learning launched the first in what they plan to be an annual series of Mapathons for ecological preservation and learning. The inaugural event engaged schools, universities, and environmental groups around the world to document the health and well-being of mangrove populations using the Mapping the Mangroves tool.
The Mapping the Mangroves (MTM) toolkit is a project originally funded by Qatar Foundation International, and is now a keystone project of MapWorks Learning. MTM uses a mapping application built on the open source Ushahidi software platform, relying on crowdsourcing to collect geographic and descriptive data about mangrove forests. The project’s reporting system allows anyone to submit a report about mangrove forests, describing the area’s biodiversity and pairing it with geographic coordinates and other sensor data. The data are then displayed on an interactive map on the project’s homepage, with all reports searchable and explorable by geographic region and other habitat or report traits. The data are freely available for download and licensed under a CC0 Public Domain Dedication, too.
The MTM project is supporting the development of an OER curriculum that introduces learners to mangrove forest ecosystems and basic species identification, and explains how they can take part in the monitoring and protection of forests around the world. The toolkit’s learning material is available under a CC BY-NC-ND license on OER Commons.
CC is supporting the Bouchout Declaration for Open Biodiversity Knowledge Management by becoming a signatory. The Declaration’s objective is to help make biodiversity data openly available to everyone around the world. It offers the biodiversity community a way to demonstrate its commitment to open science, one of the fundamental components of CC’s vision for an open and participatory internet.
In April 2013, CC participated in a workshop on names attribution, rights, and licensing convened by the Global Names Project, which led to a report titled “Scientific names of organisms: attribution, rights, and licensing” that concluded:
“There are no copyright impediments to the sharing of names and related data. The system must reward those who make the contributions upon which we rely. Building an attribution system remains one of the more urgent challenges that we need to address together.”
Many of the workshop attendees and authors of the report cited above are among those who met in June in Meise, Belgium, and released the Bouchout Declaration.
The declaration calls for free and open use of digital resources about biodiversity and associated access services. It exhorts the use of licenses or waivers that grant or allow all users a free, irrevocable, worldwide right to copy, use, distribute, transmit, and display the work publicly, as well as to build on the work and make derivative works, subject to proper attribution consistent with community practices, while recognizing that providers may develop commercial products with more restrictive licensing. This is not only aligned with CC’s own vision; CC is also the creator and steward of the legal and technical infrastructure that allows open licensing of content.
The declaration also promotes “tracking the use of identifiers in links and citations to ensure that sources and suppliers of data are assigned credit for their contributions” and “persistent identifiers for data objects and physical objects such as specimens, images and taxonomic treatments,” with standard mechanisms to take users directly to content and data. CC has participated from the beginning in the activities that led to the Joint Declaration of Data Citation Principles, which promotes the use of persistent identifiers to allow discovery and attribution of resources.
Finally, the declaration calls for policy developments that will foster free and open access to biodiversity data. CC works assiduously to create, foster, and promote open policies and practices that advance the public good by supporting open policy advocates, organizations, and policymakers.
We have a few concerns. Most copyright laws around the world do not protect data with copyright, so data would not require licensing in the first place. We are also aware that some cultures wish to preserve and protect traditional knowledge, so we want to make sure information is released only by those who have the right to do so, without impinging on the rights of communities that might otherwise be negatively affected by its release. Overall, however, we believe that open biodiversity information is crucial for science and society. Be it heralding the Seeds of Change, participating in the Group on Earth Observations (GEO), or assisting the Paleobiology Database in moving to a CC BY license, CC is playing a vital role in the progress of open science in the areas of biodiversity and natural resources. CC has committed to assisting organizations joining Google in the White House Climate Data Initiative. On a personal front, I have released the entire codebase of Earth-Base under the CC0 Public Domain Dedication, making possible applications such as Mancos on the iOS App Store.
Most of the world’s biodiversity is in developing countries, yet ironically, most biodiversity information and collections are in developed countries. Agosti calls this “Biopiracy: taking biodiversity material from the developing world for profit, without sharing benefit or providing the people who live there with access to this crucial information.” (Agosti, D. 2006. Biodiversity data are out of local taxonomists’ reach. Nature 439, 392) Opening up the data will benefit developing countries by giving them free and easy access to information about their own biological riches. Friction-free access to and reuse of data, software, and APIs is essential to answering pressing questions about biodiversity and furthering the move toward better understanding and stewarding our planet and its resources. Signing the Bouchout Declaration strengthens this movement.
Today the White House released the U.S. Open Data Action Plan, reaffirming their belief that “freely available data from the U.S. Government is an important national resource… [and] making information about government operations more readily available and useful is also core to the promise of a more efficient and transparent government.” The report (PDF) outlines the commitments to making government data more accessible and useful, and documents how U.S. federal agencies are sharing government information. From a legal standpoint, some agencies have decided to place their datasets into the worldwide public domain using the CC0 Public Domain Dedication. This means that all copyright and related rights to the data are waived, so it may be used by anyone, for any purpose, anywhere in the world, without having to ask permission in advance, and even without needing to give attribution to the author of the data.
Use of CC0 for U.S. government works has always been a challenging topic for federal agencies. This is due to the hybrid nature of copyright for government works under Section 105 of U.S. copyright law. That statute guarantees that U.S. government works do not receive copyright protection–they are in the public domain. However, while these works are not granted copyright protection inside the U.S., the legislative history of the law notes that the works may receive copyright protection outside of U.S. borders:
The prohibition on copyright protection for United States Government works is not intended to have any effect on protection of these works abroad. Works of the governments of most other countries are copyrighted. There are no valid policy reasons for denying such protection to United States Government works in foreign countries, or for precluding the Government from making licenses for the use of its works abroad.
Historically, the U.S. government has been apprehensive about applying CC0 to federal government works, because the CC0 Public Domain Dedication is a tool to waive copyright and neighboring rights globally. At the same time, it’s clear that many high-value U.S. government datasets, such as the weather data produced by the National Oceanic and Atmospheric Administration (NOAA), are being widely (and freely) used by meteorological and research organizations around the world. It seems that in the vast majority of cases, the U.S. federal government doesn’t wish to leverage its copyrights abroad. So perhaps it makes sense to simply clarify that these works will be made available in the worldwide public domain using a standard tool such as CC0. While we had some initial questions about acceptable licenses for federal government information, it seems that agencies are moving in the right direction in utilizing the public domain dedication, as opposed to the other copyright licensing tools that were laid out in Project Open Data.
In addition to showcasing federal agencies that are using CC0 on some of the datasets they’re releasing, the U.S. Open Data Action Plan document itself is also published under CC0.
As a work of the United States Government, this document is in the public domain within the United States. Additionally, the United States Government waives copyright and related rights in this work worldwide through the CC0 1.0 Universal Public Domain Dedication.
Over the last several years, many have called upon the federal government to adopt CC0 for U.S. government works. Most recently, a group of advocates drafted recommendations urging federal agencies to release federal government works, contractor-produced works, and primary legal materials into the worldwide public domain under CC0. Today’s announcement is a move in the right direction for data re-users in the United States and beyond.
This is part three of a five week series on the Affiliate Team project grants. So far, you’ve heard from our affiliates in Africa and the Arab World. Today, we’re showcasing projects in our Asia-Pacific region, including open data workshops from Japan, a media studies textbook from New Zealand, and software tools and guidelines for public domain materials from Taiwan.
Japan: Workshops and Symposium for Open Data in Japan
by Puneet Kishor (project lead: Tomoaki Watanabe)
Last June, CommonSphere won a grant to hold three workshops and a public symposium on the use of CC tools (licenses and the CC0 Public Domain Dedication) in the context of open data. The aim of the workshops was to respond to informal input from government and other stakeholders on their implementation of CC tools for open data, a new frontier of openness in Japan over the last few years. The team planned to invite involvement from Japanese national and municipal government agencies and Open Knowledge Foundation Japan.
The first event was a workshop on open data licensing at the Information Processing Agency (IPA), an independent administrative agency. The panel included a member of Open Knowledge Foundation Japan as well. The whole session was video-recorded by the IPA staff and is now available online, along with the presentation materials. Around 50 people attended, mostly government officials and agency staff, and a post-event survey indicated reasonable success.
The second meeting was an invitation-only discussion of licensing and other legal issues among key figures in open data and other relevant initiatives. CCJP provided logistics support and expertise. The attendees decided that the discussion would remain informal and unpublished.
The third event was a symposium on implementation issues of open data, including licensing, organized by a third party, Innovation Nippon, a joint project between Google Japan and GLOCOM. Both CCJP and OKF Japan helped with pre-event publicity and provided expertise. It featured and was attended by local government officials and municipal lawmakers, along with business people and academics. The event was videocast, and the archive is already available, along with the slides.
Based on these events, the team identified several factors that could move the government toward more open licensing:
- Political will; however, key politicians are not necessarily expected to support liberal licensing that allows uses going against public order.
- Evidence, anecdotal or scientific, showing that more liberal licensing results in better outcomes. However, such evidence is not abundant, and some government agencies have very specific uses in mind that may make them hesitate.
- Evidence showing that the governments of other developed countries are doing things differently from what Japan is doing or planning to do. The UK, France, the US, Australia, and New Zealand all use CC BY or a CC BY-compatible license, and their licensing all appears to be open in the Open Definition sense. Japan may end up a bit differently.
- Prospective users actively asking for a change.
The challenges faced by the team so far have been 1) the above-mentioned development away from CC tools and 2) the lack of stable access to licensing and editing talent.
The team is in talks with a local government to hold at least one more workshop to discuss licensing issues as they relate to local governments. The symposium was originally planned to be at the end, but given the emerging development above, it may be timed differently.
New Zealand: Media Text Hack
by project lead Matt McGregor
In the middle of 2013, a few New Zealand academics and librarians began to toss around an exciting-but-preposterous-sounding idea: what if they could hack a media studies textbook in a weekend, and then release the results to the world under an open Creative Commons license?
The social benefit – the why – was clear. With textbook prices continuing to rise (and rise) well above inflation, and student debt levels ballooning, the Pacific region desperately needs a new model for producing and distributing educational resources. As Dr Erika Pearson, who led the Media Text Hack project, put it, “Textbooks currently available for New Zealand first year students are often produced overseas, usually the US, and can have a cripplingly high price tag.”
The how was a bit more difficult. Academics and librarians are already rather busy people, and the process of building and managing a team of contributors is labor intensive, with plenty of emailing, documenting, cat-herding, and problem-solving. Thankfully, with the help of a $4000 affiliate grant from Creative Commons, the team could hire a project manager — Bernard Madill — to help build the network of contributors, document progress, and make sure the hack weekend progressed smoothly.
Cut to 16-17 November, 2013: the team, largely made up of early career researchers from across New Zealand and Australia, got together and successfully produced the ‘beta’ version of the textbook. For the last few months, they have been progressively editing and re-editing content, to ensure that the textbook is classroom ready in time for the first down-under semester, which starts in late February.
As the book is shared, edited, and reused by students and teachers across the world, the team will incorporate new ideas, explanations, and examples, producing a text that can be hacked and re-hacked over the years ahead.
This is new territory: while there have been a few textbook hacks in other disciplines – including this inspirational group of Finnish mathematicians – this is (to our knowledge) the first text-hack of its kind in the humanities.
For this reason, the team is putting together a parallel ‘cookbook’, to enable other projects to understand what worked – as well as what did not work – about the project. This will be released in the first half of 2014, and will hopefully inspire other projects around the world to attempt open textbook projects of their own.
The team is hopeful that open textbooks will become more prevalent in public higher education. As University of Otago Copyright Officer Richard White, a core member of the text-hack team, puts it, the open textbook marks a return to the “core principles of academia: sharing knowledge, learning from, and building on the work of others.”
Taiwan: Practices and Depositories for The Public Domain
by project lead Tyng-Ruey Chuang
The project “Practices and Depositories for The Public Domain” (PD4PD) aims to develop software tools and practical guidelines to put public domain materials online more easily. It is a joint undertaking of the GNU MediaGoblin project, NETivism Ltd., and Creative Commons Taiwan, with the latter coordinating the team effort. The overall project goal is to firm up access to and reuse of the many digital manifestations of public domain cultural works by means of replicable tools, practices, and communities.
Tools: The plan is to extend the functionality of the GNU MediaGoblin software package to make it more suitable for hosting large collections of public domain materials. To this end, new features have been suggested for GNU MediaGoblin to help users self-host their media archives. These features include batch upload of media (with proper metadata annotations), customizable themes and pages, and an “easy install” script (to install GNU MediaGoblin itself).
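As a rough illustration of what the batch-upload feature might look like from a site maintainer's side, here is a minimal sketch that builds one MediaGoblin `gmg addmedia` invocation per image in a directory, attaching a title and a license URL to each. The `--title` and `--license` flags and the exact CLI shape are assumptions for illustration; check the MediaGoblin documentation for the actual interface.

```python
# Hypothetical batch-upload wrapper for MediaGoblin's `gmg addmedia` CLI.
# Flag names are assumptions, not the confirmed MediaGoblin interface.
from pathlib import Path

PD_LICENSE = "https://creativecommons.org/publicdomain/zero/1.0/"

def addmedia_commands(user, directory, license_url=PD_LICENSE, pattern="*.jpg"):
    """Build one `gmg addmedia` command per matching file in `directory`."""
    commands = []
    for path in sorted(Path(directory).glob(pattern)):
        commands.append([
            "gmg", "addmedia", user, str(path),
            "--title", path.stem,       # derive a title from the filename
            "--license", license_url,   # annotate each upload with a license
        ])
    return commands

# Each command could then be executed with subprocess.run(cmd, check=True).
```

Keeping the command construction separate from execution makes the batch easy to review (or dry-run) before anything is actually uploaded.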
Practices: The plan is to develop guidelines and how-tos on self-hosting public domain materials. Two versions are planned: one in English and one in the Chinese used in Taiwan. An educational website on the public domain and self-hosting is also planned.
Community: The plan is to reach out to content holders in Taiwan and to work with them on releasing some of their holdings to the public domain. The results will be demonstrated on a website built using the tools mentioned above.
This six-month project started in December 2013 and plans to finish in June 2014. The GNU MediaGoblin project has been focusing on tool development, while NETivism Ltd. is concentrating on community outreach. Creative Commons Taiwan is working on the practical guidelines. Several interns have been recruited to help with the project.
Last month, Creative Commons and several other groups responded to the European Commission’s consultation on licensing, datasets and charging for the re-use of public sector information (PSI). See our response here. There were 355 submissions to the questionnaire (spreadsheet download), apparently from all EU Member States except Cyprus. The Commission hosted a hearing (PDF of meeting minutes) on the issue on 25 November.
This week the Commission released a final summary report (PDF) to the consultation. There were several interesting data points from the report concerning licensing. First, the questionnaire respondents preferred a “light-weight approach, limited to a mere disclaimer or consisting of allowing the reuse of data without any particular restrictions…” (pg5). In our submission, we said that there should be no conditions attached to the re-use of public sector information, with the best case scenario being for public sector information to be in the public domain, exempt from copyright protection altogether by amending national copyright laws.
Second, when asked about licensing conditions that would comply with the PSI Directive’s requirement of ‘not unnecessarily restricting possibilities for re-use’, most respondents indicated support for the requirement to acknowledge the source of the data. In our submission we said we believed every condition would be deemed restrictive, since ideally PSI would be removed from the purview of copyright protection through law. At the same time, we realize that if the Commission were to permit public sector bodies to incorporate a limited set of conditions through licensing, then those bodies should be expected to use standard public licenses aligned with the Open Definition. The preference should be for “attribution only” licenses, like CC BY.
The report noted that a majority (62%) of respondents believed that greater interoperability would be best achieved through the use of standard licences. And 71% of respondents said that the adoption of Creative Commons licenses would be the best option to promote interoperability. The report states, “this may be interpreted as both a high awareness of the availability of standard licences and a genuine understanding of their role in ensuring licencing interoperability across jurisdictions” (pg7).
The report also mentions the fact that several respondents chose to provide feedback on which Creative Commons licenses would be deemed suitable for PSI re-use. It noted that the most prevalent licenses mentioned were CC0 and CC BY, while a few respondents suggested BY-SA. Others provided a more general answer, such as “the most open CC license could be used…But [the] BEST OPTION is no use of any of license: public domain” (pg9).
The report concludes (pg16):
There is also a widespread acceptance of the need to offer interoperable solutions, both on the technical and licencing levels. And even if opinions differ as to the exact shape of re-use conditions, the answers show that a general trend towards a more open and interoperable licencing system in Europe, largely based on available standard licences is gaining ground.
Paleontology, the description and biological classification of fossils, has spawned countless field expeditions, museum trips, and hundreds of thousands of publications. The construction of databases that aggregate these descriptive data on fossils in a way that allows large-scale, synthetic questions to be addressed, such as the long-term history of biodiversity and rates of biological extinction and origination during global environment change, has greatly expanded the intellectual reach of paleontology and has led to many important new insights into macroevolutionary and macroecological processes.
One of the largest compendia of fossil data assembled to date is the Paleobiology Database (PBDB), founded in 1998 by John Alroy and Charles Marshall. These two pioneers assembled a small team of scientists who were motivated to generate the first geographically explicit, sampling-standardized global biodiversity curve. The PBDB has since grown to include an international group of more than 150 contributing scientists with diverse research agendas. Collectively, this body of volunteer and grant-supported investigators has spent more than nine continuous person-years entering more than 280,000 taxonomic names, nearly 500,000 published opinions on the status and classification of those names, and over 1.1 million taxonomic occurrences. Some PBDB data derive from the original fieldwork and specimen-based studies of the contributors, but the majority of the data were extracted from the text, figures, and tables of over 48,000 published papers, books, and monographs that span the range of topics covered by paleontology. These efforts have been well rewarded by enabling new science: as of December 2013, the PBDB had produced almost two hundred official peer-reviewed publications, all of which address scientific questions that cannot be adequately answered without such a database.
Image: Ptyagnostus atavus or Leiopyge calva Zone (Cambrian of the United States). PaleoDB collection 262: authorized by Jack Sepkoski, entered by Mike Sommers on 20.11.1998.
Shift to CC BY
From its inception, the paleontologists who have invested the most effort in entering data have made the decisions about data management and access policies, which ultimately raises the important questions of proper licensing and citation. Under the first application of the PBDB licensing policy, individual contributors chose their own CC license for each fossil collection record. As a result there were three kinds of contributors: those who didn’t know what to do, didn’t care, or didn’t know about the new policy requiring them to specify how existing collections should be licensed (55% of the data); those who selected the most restrictive option available to them (34% of the data); and those who selected the least restrictive option available to them (10% of the data).
This received a mostly negative response via social media and other outlets, partly because of the increased attention the database was receiving during a leadership and governance transition. Naturally, the governance group responded to the community feedback. The first actual action came from individual contributors: many of those who either didn’t know about CC licenses or hadn’t thought fully about their meaning and implications changed their own individual licenses. These changes always went from a more restrictive license to the least restrictive option available: CC BY. That wave of individual choices toward the least restrictive license immediately shifted the balance of records in the database. At that point, only one contributor had a restrictive license, and the governance group quickly moved to adopt one single unifying license for the database: CC BY. Now, all new records are explicitly CC BY as a matter of database policy, although individual contributors still have the option of placing a moratorium on the public release of their own new data so as to protect their individual scientific interests.
Future of PBDB
In addition to being a scientific asset to the field of paleontology, the PBDB and other databases like it provide an additional means by which to participate in rapidly emerging initiatives and developments in cyberinfrastructure. To increase its reach in this area, the PBDB now has an Application Programming Interface (API), which makes data more easily and transparently accessible, both to individual researchers and to applications such as the open source web application PBDB Navigator and the Mancos iOS mobile application. Both of these applications are built on the public API and are designed to make the history of life and environment documented by the PBDB more discoverable. These new modes of interactivity and visualization highlight unintended, but potentially useful, aspects of the PBDB. The PBDB API has facilitated a loosely coupled integration with other related but independently managed biological and paleontological database initiatives and online resources, such as the Neotoma Paleoecology Database, Morphobank, and the Encyclopedia of Life. The PBDB API can also be harnessed by geoscientists outside of paleontology, thereby facilitating the integration of paleontological data with diverse types of data and model output, such as paleogeographic plate rotations and geophysical models in GPlates. The liberal CC BY license ensures the interoperability and data access necessary to facilitate fundamentally new science, and it expands the reach of paleontology to a broader community of researchers and educators than is possible via any single website or application.
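To give a flavor of how a researcher or application might consume the PBDB API described above, here is a small sketch that builds an occurrence-list query URL and (optionally, over the network) fetches the matching records. The base URL, endpoint path, and parameter names are assumptions drawn from the public PBDB data service; consult the official API documentation before relying on them.

```python
# Sketch of querying the Paleobiology Database API for fossil occurrences.
# Endpoint path and parameter names are assumptions; verify against the
# PBDB data service documentation.
import json
from urllib.parse import urlencode
from urllib.request import urlopen

PBDB_BASE = "https://paleobiodb.org/data1.2"  # assumed data-service base URL

def occurrence_url(taxon, interval=None, limit=100):
    """Build a PBDB occurrence-list query URL for a taxon."""
    params = {"base_name": taxon, "limit": limit}
    if interval:
        params["interval"] = interval  # e.g. "Cambrian"
    return f"{PBDB_BASE}/occs/list.json?{urlencode(params)}"

def fetch_occurrences(taxon, **kwargs):
    """Fetch occurrence records as a list of dicts (requires network access)."""
    with urlopen(occurrence_url(taxon, **kwargs)) as resp:
        return json.load(resp)["records"]

# Example: query URL for Cambrian trilobite occurrences.
print(occurrence_url("Trilobita", interval="Cambrian", limit=10))
```

Because the data are CC BY, a downstream tool only needs to attribute the PBDB and its contributors; there is no further permission step between building a URL like this and reusing the records.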
A few weeks ago, CC co-hosted an open education meetup in London with P2PU, the Open Knowledge Foundation (OKFN), and the FLOSS Manuals Foundation. At the Mozilla Festival immediately following the meetup, we also led or participated in sessions and tracks on open science, makes for cultural archives, collaborations across the open space, and open education data. Several interesting projects have arisen from both the meetup and these sessions, so we thought it worthwhile to mention them here in case others would like to get involved.
Hit the Road Map: A Human Timeline of the Open Education Space
In addition to networking and sharing our common open education interests, participants of the Open Ed Meetup at the William Goodenough house collectively built a timeline of events that they felt marked important (and personal) milestones in the open education space, from the beginning of the Open University in 1969 to Lessig’s countersuit against Liberation Music this year. The timeline was a great collaborative exercise for the group, and one that we hope is only beginning. As Marieke from the OKFN writes in her post,
“…the plan is to digitise what we have by moving all the ideas in to Google Docs and then create a TimeMapper of them. This may form part of the Open Education handbook. At that point we will be able to share the document with you so you can add more information, correct the date and add in your own ideas. We may even try to run more open education timeline events.”
In fact, CC affiliates in Europe will be co-hosting the second Open Education Handbook booksprint with the OKFN and Wikimedia in Berlin as a result!
Getting hands-on with tools on the web for Open Science
by Billy Meinke
In another team-up with the Open Knowledge Foundation (OKFN), we ran a session investigating tools on the web that help make science more open. Hinging on the theme of alternative ways to measure (altmetrics) scholarly impact, collaborators joined us in the session and got hands-on with tools that we can use to see how publications and other research outputs are talked about and shared on the web. To help build content for lessons linked to the Open Science course in the School of Open, participants tested a handful of free tools to see what they were able to measure, how usable the tools were, and considered ways to share this with others who aren’t familiar with altmetrics. We will be organizing the content over the next few weeks, and offering the altmetrics lesson as a standalone exercise once it’s complete. For more information about how the session went, see this blog post.
Collaborations across the Open Space
We also participated in a session with Wikimedia, OKFN, and other organizations to talk about how we could better collaborate and share news among ourselves so we don't keep reinventing the wheel. I won't go into detail here, as the wiki session writeup does it much better, and has continued to grow since the festival. For example, something as simple as a blog aggregator for all "open"-related news would help those working in this space tremendously. To join our efforts, head over to the wiki, add your thoughts, and sign up to be notified of follow-up meetings.
Digital Self Preservation Toolkit
One neat thing to come out of this year’s Mozfest was the beginnings of a Digital Self Preservation Toolkit exploring the idea of what happens to your body of creative, educational, or scientific work when you die. Some questions we asked and discussed were: In your country, what happens to your work when you die? What steps can you take to ensure its posterity? How would you want it shared and who would you want to own it? Our initial aim was to develop a set of tools and tips to help people think through how they might want to release their work upon death, building on an idea that the Question Copyright folks had last year around a free culture trust. Skirting the technical and legal issues for the time being, we came up with a prototype IP donor badge that creators might use to signify their intent, a concept form that they would fill out, and a mock-up website where such a toolkit might reside. We are now continuing our efforts in collaboration with folks from numerous organizations interested in the same questions, and you can join us to move the project forward at the Free Culture Trust wiki.
OER Research Hub’s Open Education Data Detective
Lastly, we’d like to highlight our collaboration with the OER Research Hub, who held a “scrum” on visualizing open education data called the Open Ed Data Detective. Participants experimented with open education data that the OER Research Hub made available, including data on School of Open courses.
Seal Of The Executive Office Of The President / Public Domain
Yesterday President Barack Obama issued an Executive Order requiring federal government information to be open and machine-readable by default. This Order is the latest in a series of actions going back to 2009 in support of increasing access to and transparency of government information.
In addition to the Executive Order, the White House released a Memorandum (PDF) explaining how federal government agencies will comply with the new open data policy.
The Memorandum requires agencies to collect or create information in a way that supports downstream information processing and dissemination. This includes using machine-readable and open formats, data standards, and common core and extensible metadata for all new information creation and collection efforts. It also requires agencies to ensure information stewardship through the use of open licenses, and to review information for privacy, confidentiality, security, or other restrictions before release.
It provides a forward-thinking set of guidelines for open data to be released by U.S. federal agencies:
Open data: For the purposes of this Memorandum, the term “open data” refers to publicly available data structured in a way that enables the data to be fully discoverable and usable by end users. In general, open data will be consistent with the following principles:
- Public. Consistent with OMB’s Open Government Directive, agencies must adopt a presumption in favor of openness to the extent permitted by law and subject to privacy, confidentiality, security, or other valid restrictions.
- Accessible. Open data are made available in convenient, modifiable, and open formats that can be retrieved, downloaded, indexed, and searched. Formats should be machine-readable (i.e., data are reasonably structured to allow automated processing). Open data structures do not discriminate against any person or group of persons and should be made available to the widest range of users for the widest range of purposes, often by providing the data in multiple formats for consumption. To the extent permitted by law, these formats should be non-proprietary, publicly available, and no restrictions should be placed upon their use.
- Described. Open data are described fully so that consumers of the data have sufficient information to understand their strengths, weaknesses, analytical limitations, security requirements, as well as how to process them. This involves the use of robust, granular metadata (i.e., fields or elements that describe data), thorough documentation of data elements, data dictionaries, and, if applicable, additional descriptions of the purpose of the collection, the population of interest, the characteristics of the sample, and the method of data collection.
- Reusable. Open data are made available under an open license that places no restrictions on their use.
- Complete. Open data are published in primary forms (i.e., as collected at the source), with the finest possible level of granularity that is practicable and permitted by law and other requirements. Derived or aggregate open data should also be published but must reference the primary data.
- Timely. Open data are made available as quickly as necessary to preserve the value of the data. Frequency of release should account for key audiences and downstream needs.
- Managed Post-Release. A point of contact must be designated to assist with data use and to respond to complaints about adherence to these open data requirements.
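To make the principles above concrete, here is a minimal sketch of what a machine-readable dataset record covering them might look like, with a completeness check. The field names (`title`, `formats`, `contact`, and so on) are invented for illustration and are not an actual federal schema:

```python
# Hypothetical dataset record carrying the metadata the Memorandum's
# principles call for, plus a minimal completeness check.
# Field names are illustrative only, not a real government schema.

REQUIRED_FIELDS = {
    "title",        # Described: human-readable name
    "description",  # Described: purpose, population, collection method
    "formats",      # Accessible: machine-readable, non-proprietary formats
    "license",      # Reusable: an open license with no use restrictions
    "contact",      # Managed Post-Release: designated point of contact
    "modified",     # Timely: when the data was last refreshed
}

def missing_fields(record: dict) -> set:
    """Return the principle-backed fields a record fails to supply."""
    return REQUIRED_FIELDS - {k for k, v in record.items() if v}

record = {
    "title": "Agency Widget Counts",
    "description": "Monthly widget counts collected at the source.",
    "formats": ["CSV", "JSON"],
    "license": "CC0-1.0",
    "contact": "opendata@example.gov",
    "modified": "2013-05-09",
}

print(missing_fields(record))  # empty set when every principle is covered
```

A record like this could be published alongside the data itself so that the "Described" and "Managed Post-Release" requirements are themselves machine-readable.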
The Memorandum provides some more information about how U.S. government information will be made reusable:
Ensure information stewardship through the use of open licenses – Agencies must apply open licenses, in consultation with the best practices found in Project Open Data, to information as it is collected or created so that if data are made public there are no restrictions on copying, publishing, distributing, transmitting, adapting, or otherwise using the information for non-commercial or for commercial purposes.
Depending on the exact implementation details, this could be a fantastic move that would remove any legal confusion about using federal government data. By leveraging open licenses, the U.S. federal government would be doing a great service to reusers by communicating those rights available in advance. And, if the U.S. truly wishes to make federal government information available without restriction, it could consider using a tool such as the CC0 Public Domain Dedication. CC0 is used by many data providers to place open data directly in the public domain. We’ve already suggested this (PDF) as an option for sharing federally funded research data.
The White House should be commended for taking another positive step forward to ensure that U.S. government data is made legally and technically accessible and usable.
Celebrating Open Data
Open Data Day 2013 can be described as a success. Why? Because hundreds of people participated in more than 100 events across six continents, celebrating open data and all that we can do with it. Here at CC, we ran a community-supported, hackathon-style sprint to build open learning resources around Open Science, gathering people with diverse backgrounds and experience levels. An undergraduate student and a post-doc researcher, both from Stanford. An instructional designer from Los Angeles and an associate professor from Auburn University, plus a handful more of very talented people. Oh, and a mother and high school-aged daughter duo that simply wanted to see what “open” is about. We all connected to help build an open course to teach others about Open Science. Here’s how we did it.
Open Content for Learning
It’s worth mentioning that the course materials produced during the sprint will be openly licensed CC BY and shared so that their benefit to Open Education and Open Science is not restricted by legal boundaries. The material is being curated and will undergo a review process over the next couple weeks before being ported to the School of Open, a collaborative project by Creative Commons, P2PU, and a strong volunteer community of “open” experts and organizations. Though fitting the content to P2PU’s online course platform was in the back of our minds, time and consideration were largely placed on identifying important ideas that explain what Open Access, Open Research, and Open Data mean for Open Science, and how we can engage more “young scientists” (an ever-broadening term) in the ways of Open.
The Net Works Effect*
Layered on top of the open content itself, which is elastic in nature, our approach to this hackathon-style event was deliberately lean: the type of event that anyone can run, anywhere, with very few resources. We created a Google Drive folder and a set of publicly editable documents to collect openly licensed resources, map out a tentative module/lesson plan, coordinate communications between participants, and generally provide a single place to collaborate on Open Science learning materials. Working with other event organizers at the OKFN and PLOS, we established mailing lists, Twitter hashtags, and other channels of communication so that there was a support network both for those organizing events and for those interested in participating in Open Data Day on some level. David Eaves, Rufus Pollock, Ross Mounce, and many others were loud and clear on the Open Data Day mailing list, making sure news about each event was passed around.
Before the event, a registration page was created for the course sprint. We offered a handful of in-person tickets for folks to come down to our office in Mountain View, as well as a number of remote participant tickets for those who were in different geographical locations. Google Hangout “rooms” were set up on laptop computers placed in physical conference rooms at the CC HQ, allowing remote participants to work in real-time with persons on the ground. To see a more detailed description of the day’s event, see the schedule document here.
So what did we make? The sprinters involved in the project collected and organized resources that explain common aspects of Open Science. The main sections (access, methods, data) were helpful in searching for content, but there was a great deal of overlap between sections, which highlighted the relationships between them. Beyond the collection of resources, sets of tasks were built that are meant to guide learners out beyond the course and into the communities of Open Science, interacting with the ideas, technical systems, and people who are opening up science. The Introduction to Open Science course on P2PU is still in a lightly-framed state, but the plan is to include the course in the launch of the School of Open during Open Education Week, March 11-15. If you’re interested in helping make this transition, or in helping build or review other courses that we call “open,” come introduce yourself in the School of Open Google Group. Or check out what else is happening on P2PU.
Beyond the course itself, we’re going to take a look at the sprint process we used, and work out some of the kinks. This rapid open-content creation technique is manageable, low-cost, and builds the Commons. There’s enough openly-licensed content existing on the web to produce a range of learning experiences, so now it seems that it’s a matter of developing open technology tools to the point where we can build education on the web together, easily. For more information about this and other Open Education projects being worked on by Creative Commons, see this page.
We Got Together for Open
Thanks to those who were able to participate in the Open Science course sprint, as well as those who contributed to the planning documents leading up to the event. We’ve done well.
PLOS Sci-Ed Blog, Guest Post: Open Data Day, Course Sprints, and Hackathons!
David Eaves’ Blog, International #OpenDataDay: Now at 90 Cities (and… the White House)
Debbie Morrison’s Blog, A Course Design ‘Sprint’: My Experience in an Education Hackathon
Also: The Flickr album from the event can be found here.
*This phrase coined by P. Kishor here, describing the interconnectedness of Open Data Day events.