Last week the Wikimedia Foundation announced it is adopting an open access policy for research works created using foundation funds. According to their blog post, the new open access policy “will ensure that all research the Wikimedia Foundation supports through grants, equipment, or research collaboration is made widely accessible and reusable. Research, data, and code developed through these collaborations will be made available in Open Access venues and under a free license, in keeping with the Wikimedia Foundation’s mission to support free knowledge.”
The details of the open access policy can be found on the Wikimedia Foundation website. There will be an expectation that researchers receiving funds from the foundation will provide “unrestricted access to and reuse of all their research output…”. Published materials, proposals, and supporting materials will be covered under the open access policy. The policy states that media files must be made available under the Creative Commons Attribution-ShareAlike 3.0 license (the version currently used by Wikipedia), or any other free license. In addition, the policy requires that data be made available under an Open Definition-conformant license (with the CC0 Public Domain Dedication preferred), and that any source code be licensed under the GNU General Public License version 2.0 or any other Open Source Initiative-approved license.
The Wikimedia Foundation’s open access policy joins those of other institutions, including governments, philanthropic foundations, universities, and intergovernmental organizations, that have adopted policies to increase access to important and useful information and data for the public good. Thanks to Wikimedia for their continued leadership in support of free knowledge for all.
Philanthropic foundations fund the creation of scholarly research, education and training materials, and rich data with the public good in mind. Creative Commons has long advocated for foundations to add open license requirements to their grants. Releasing grant-funded content under permissive open licenses means that materials may be more easily shared and re-used by the public, and combined with other resources that are also published under open licenses.
Yesterday the Bill & Melinda Gates Foundation announced it is adopting an open access policy for grant-funded research. The policy “enables the unrestricted access and reuse of all peer-reviewed published research funded, in whole or in part, by the foundation, including any underlying data sets.” Grant-funded research and data must be published under the Creative Commons Attribution 4.0 license (CC BY). The policy applies to all foundation program areas and takes effect January 1, 2015.
Here are more details from the Foundation’s Open Access Policy:
- Publications Are Discoverable and Accessible Online. Publications will be deposited in a specified repository(s) with proper tagging of metadata.
- Publication Will Be On “Open Access” Terms. All publications shall be published under the Creative Commons Attribution 4.0 Generic License (CC BY 4.0) or an equivalent license. This will permit all users of the publication to copy and redistribute the material in any medium or format and transform and build upon the material, including for any purpose (including commercial) without further permission or fees being required.
- Foundation Will Pay Necessary Fees. The foundation would pay reasonable fees required by a publisher to effect publication on these terms.
- Publications Will Be Accessible and Open Immediately. All publications shall be available immediately upon their publication, without any embargo period. An embargo period is the period during which the publisher will require a subscription or the payment of a fee to gain access to the publication. We are, however, providing a transition period of up to two years from the effective date of the policy (or until January 1, 2017). During the transition period, the foundation will allow publications in journals that provide up to a 12-month embargo period.
- Data Underlying Published Research Results Will Be Accessible and Open Immediately. The foundation will require that data underlying the published research results be immediately accessible and open. This too is subject to the transition period and a 12-month embargo may be applied.
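The embargo rules in the list above reduce to simple date arithmetic. The following is a minimal Python sketch, not part of the foundation’s policy. The function names `add_months` and `latest_open_date` are hypothetical, and the sketch assumes that the transition period is decided by the publication date, that January 1, 2017 itself falls outside the transition, and that work published before the effective date is out of scope.

```python
import calendar
from datetime import date

# Dates and embargo length taken from the policy text quoted above.
POLICY_EFFECTIVE = date(2015, 1, 1)
TRANSITION_END = date(2017, 1, 1)
MAX_TRANSITION_EMBARGO_MONTHS = 12


def add_months(d, months):
    """Shift a date forward by whole months, clamping the day to the month's length."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def latest_open_date(published):
    """Return the latest date by which a publication must be openly accessible.

    Returns None for work published before the policy takes effect
    (an assumption of this sketch, not a statement from the policy).
    """
    if published < POLICY_EFFECTIVE:
        return None
    if published < TRANSITION_END:
        # Transition period: an embargo of up to 12 months is tolerated.
        return add_months(published, MAX_TRANSITION_EMBARGO_MONTHS)
    # After the transition: open immediately upon publication.
    return published


if __name__ == "__main__":
    for pub in (date(2015, 6, 15), date(2016, 12, 31), date(2017, 1, 1)):
        print(pub, "->", latest_open_date(pub))
```

Under these assumptions, a paper published in mid-2015 could stay behind a paywall for up to a year, while one published on or after January 1, 2017 would need to be open on its publication date.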
Trevor Mundel, President of Global Health at the foundation, said that Gates “put[s] a high priority not only on the research necessary to deliver the next important drug or vaccine, but also on the collection and sharing of data so other scientists and health experts can benefit from this knowledge.”
Congratulations to the Bill & Melinda Gates Foundation on adopting a default open licensing policy for its grant-funded research. This terrific announcement follows a similar move by the William and Flora Hewlett Foundation, which recently extended its CC BY licensing policy from the Open Educational Resources grants to now apply foundation-wide for all project-based grant funds.
Regarding deposit and sharing of data, the Gates Foundation might consider permitting grantees to utilize the CC0 Public Domain Dedication, which allows authors to dedicate data to the public domain by waiving all rights to the data worldwide under copyright law. CC0 is widely used to provide barrier-free re-use of data.
We’ve updated the information we’ve been tracking on foundation intellectual property policies to reflect the new policy from Gates, and continue to urge other philanthropic foundations to adopt open policies for grant-funded research and projects.
Earlier this month, CODATA and World Data System, both interdisciplinary committees of the International Council for Science, jointly organized SciDataCon, an international conference on data sharing for global sustainability. The conference was held Nov 2-5, 2014, on the campus of Jawaharlal Nehru University, New Delhi, India. Creative Commons Science had a busy schedule at the conference attended by 170+ delegates from all over the world, many from the global south.
We started early with a full-day workshop on text and data mining (TDM) in cooperation with ContentMine. The workshop was attended by a mix of PhD students and researchers from the fields of immunology and plant genomics research. It was really rewarding to see the participants get a handle on the software and go through the exercises. Finally, the conversation about legal uncertainty around TDM apprised them of the challenges, but bottom-up support for TDM can be a strong ally in ensuring that this practice remains out of the reach of legal restrictions.
During the main conference we joined panel discussions on data citation with Bonnie Carroll (Iia), Brian Hole (Ubiquity Press), Paul Uhlir (NAS) and Jan Brase (DataCite) and international data sharing with Chaitanya Baru (NSF), Rama Hampapuram (NASA) and Ross Wilkinson (ANDS). We also participated in a daily roundup of the state of data sharing as presented at the conference organized by Elizabeth Griffin (CNRC).
SciDataCon, which used to be called CODATA, is held every two years, and is an important showcase of open science around the world. It is an important gathering because it brings together many scientists from the global south. A lot remains to be done to make real-time, pervasive data sharing and reuse a reality in much of the world, but there are heartening signs. At a national level, India’s data portal holds promise, but making data licensing information more explicit and data easily searchable by license would make it more useful. Citizen science projects in the Netherlands, India and Taiwan demonstrated how crowds can be involved in experiments while ensuring the user-generated content is made available for reuse, and SNEHA’s work on understanding perspectives on data sharing for public health research offered particular insight into the value of listening to feedback from participants.
We look forward to continuing to work with CODATA and WDS to promote and support open science and data initiatives around the world, particularly in the global south, and hope for more success stories at the next SciDataCon.
We are in New Delhi and Mumbai for a number of presentations, workshops and meetings. Please come say hello if you are at these events or in the area.
SciDataCon2014 in New Delhi
The International Conference on Data Sharing and Integration for Global Sustainability (SciDataCon) is motivated by the conviction that most research challenges cannot be addressed without attending to issues relating to research data essential to all scientific endeavors. However, several cultural and technological challenges are still preventing the research community from realizing the full benefits of progress in open access and sharing. CODATA and WDS, interdisciplinary committees of the International Council for Science (ICSU), are co-sponsoring and organizing a high-profile international biennial conference at Jawaharlal Nehru University, New Delhi.
Nov 2: A day long Text and data mining (TDM) workshop offered in collaboration with ContentMine
TDM is an important scientific technique for analyzing large corpora of articles used to uncover both existing and new insights in unstructured data sets that typically are obtained programmatically from many different sources. While the science and technology of TDM are complex enough, its legal complications are equally dizzying. Not only is its legal status unclear at best, it also varies from jurisdiction to jurisdiction, making cross-national collaboration difficult. Besides the license status of the original material, contractual agreements between research institutions and publishers, who are often the gatekeepers of the corpora, can create significant hurdles. The workshop offers an introduction to TDM, presenting the legal considerations through hands-on exercises.
Effective and efficient application of scientific data for the benefit of humanity entails agreed goals, clear and reproducible methods, and transparent communication throughout the data chain from producer to user via data organizer and research publisher. How well is that working? A Panel Discussion at the close of each day will summarise that day’s conclusions, and respond to the question of how well the data chain may be working from a trio of perspectives: Conference Organizer, data-management expert, and data producer.
Synthesis Data Citation Principles and Their Implications for TDM: Importance, Credit and Attribution, Evidence, Unique Identification, Access, Persistence, Specificity and Verifiability, and Interoperability and Flexibility: these eight important phrases describe the data citation principles agreed upon by the community, published under a joint declaration, and endorsed by 185 individuals and 83 organizations. But what are the implications of these principles beyond just citation, particularly with respect to automated analysis of large corpora of articles? This presentation will briefly present the principles, and then explore some of the issues that we have to come to grips with in order to make text and data mining (TDM) easy for scientists.
Maximizing Legal Interoperability Through Open Licenses: Many scientists do think about interoperability as they have to work with colleagues from other domains. However, common interoperability efforts are focused on technical, and if we are lucky, semantic interoperability. Rarely do scientists think of legal interoperability in the design of their science experiments. Can my work be legally mixed with someone else’s work without violating any intellectual property (or worse, privacy and security) laws? Is my work portable across not just scientific domains but also across judicial boundaries? We attempt to shed light on some of these questions in this presentation.
Nov 5: Talk on CC/OKF open science activities to be given at the computer science dept., Indian Institute of Technology-Delhi
Jenny Molloy of OKFN Open Science and I will be introducing the young computer science students at IIT-Delhi to the various open science and data activities around the world. This talk is organized by Prof. Aaditeshwar Seth, Computer Science, IIT-Delhi.
Nov 6-8: Meetings on citizen science and sensors at the Homi Bhabha Centre for Science Education (HBCSE), Mumbai
HBCSE at Tata Institute of Fundamental Research (TIFR), Mumbai is a National Center with the broad goals to promote equity and excellence in science and mathematics education from primary school to undergraduate college level, and encourage the growth of scientific literacy in the country. We will be discussing with HBCSE’s metaStudio potential areas of collaboration in citizen science and the use of sensors in projects to accelerate the growth of scientific awareness in the country through direct public participation in science.
What were five hundred folks from 30 countries doing in 40+ different sessions running concurrently in three rooms of two gorgeous buildings in Ciudad de México? They were showing, sharing and learning from the best of each other’s work utilizing open data, pushing governments to adopt open policies, and hacking for social, environmental and humanitarian change in Latin America and the Caribbean. Condatos may be the most important regional conference on open data held in Latam, but it is undoubtedly a showcase of the diversity, ingenuity, vibrancy and perseverance of the changemakers in that historic yet energetic region.
Creative Commons was invited to a panel discussion on user licenses. Some of the innovative sessions that stood out were on Migrahack, health education in favelas in Brasil, a session on the Internet of Things, a hacking workshop, and mapping labs including one on using drones for mapping.
The two buildings of the conference venue were definitely symbolic of the dynamic nature of the gathering: the historic and gorgeous Biblioteca de México, with Octavio Paz looking down on the young crowd and its high stone walls inscribed with words from the giants of Mexican literature, were like bookends in time; the soaring, modernistic architecture of Cineteca Nacional was a nod to the exponential change in thinking and practice that was being hacked by the young crowd.
We are grateful for the chance to present our vision for a public commons of information that can both drive and be driven by the energy and innovation on display at the conference, and are thrilled at the new partnerships that hold promise for further expansion of the powerful concepts of open and sharing.
To the extent possible under law, Puneet Kishor has waived all copyright and related or neighboring rights to all photos and PDF in this blog post.
Whether as patients, as part of traffic, or while exercising or simply walking with one of the behavioral trackers du jour, we are constantly giving data about ourselves and our surroundings to data collectors with few returns. From privacy regulations to bureaucratic barriers to collecting and locking up information just in case it might create monetary value in the future, there are a multitude of barriers between those who collect information and those who want to use it.
With support from Robert Wood Johnson Foundation (RWJF), we are launching two projects exploring different aspects that often get in the way of easy sharing of citizen-sourced information.
In collaboration with the Institute for Human Genetics and EngageUC at UCSF, and Personal Genome Project at Harvard University, we will explore the practical, ethical and legal implications of emphasizing benefits of sharing over the need for privacy at a workshop planned for Spring 2015 in Washington DC. A few of the questions to be tackled at the workshop: What if, instead of emphasizing the imperative of protecting privacy, we emphasized the potential benefits from sharing? Would most patients agree to let their information be shared?
Partnering with Manylabs, a San Francisco-based sensor tools and education nonprofit, and Urban Matter, Inc., a Brooklyn-based design studio, and in collaboration with the City of Louisville, Kentucky, and Propeller Health, maker of a mobile platform for respiratory health management, we will design, develop and install a network of sensor-based hardware that will collect environmental information at high temporal and spatial scales and store it in a software platform designed explicitly for storing and retrieving such data.
Further, we will design, create and install a public data art installation that will be powered by the data we collect, thereby communicating back to the public what has been collected about them.
One year ago, CC announced the Affiliate Project Grants to support and expand CC’s global network of dedicated experts. With a little help from Google, we were able to increase the capacity of CC’s Affiliates to undertake projects around the world benefiting a more free, open, and innovative internet.
We received over 70 applications, and we were able to fund 18 to tackle important work in their countries: work like using music to break down physical barriers and give Palestinians a voice, gathering leaders in Tanzania to discuss how sharing information can help prevent diabetes, and helping Romanian librarians provide quality educational materials to all.
Watching these projects unfold over the last several months has been reaffirming for everyone at CC. The Affiliates are central to CC’s work; without them we would simply not be closer to our goal of a more open internet.
Click here to find out the full details of the different grants, and read on to see what our 18 teams had to say on the results they achieved, motivations for their projects, the work still to be done, and lessons learned.
“We are pleased that we were able to impact the way the people who shared their stories with us think about the concept of sharing stories. Some people when they were asked before to share their suffering and their personal stories on video were not totally sure they wanted to do it, but after seeing the output of their stories reflected on by poets and artists from all over the world, we think we were able to provide them a platform to express themselves and feel part of a greater community that is sharing the same hopes and fears.
[We want to expand] the project concept to other marginalized communities around the world.”
-Bashar Lubbad, Palestine, “Hope Spoken/Broken: Change in the Eyes of Palestinian Refugees”
“The result was publication of a guide on free culture movements in Arabic and a website where it can be downloaded freely in e-book format: www.freecultureguide.net. We target artists, journalists, bloggers and other content creators and the general public who is unfamiliar to the free culture movement and concepts, as this is the first book of its kind in Arabic about this topic.”
-Ahmed Mansour, CC Morocco, “Creative BookSprint”
“Lack of consumer level tools is still seen as a major obstacle in CC adoption. WpLicense is now a tool that can be applied to millions of blogs.”
-Tarmo Toikkanen, CC Finland, “WordPress License Revived”
“More concretely, participants learnt how to: adapt traditional services to a non-traditional model; locate learning objects that can be reused under CC licence; investigate and use alternative publishing platforms; and apply project management processes to a hack project.”
-Matt McGregor, CC New Zealand, “Media Text Hack”
“Museums and other memory institutions in Taiwan often have their collections digitized.
A major part of the digitized works shall be in the public domain. However, many of these institutions often keep these works in the equivalents of digital safes, and there are no easy ways to access and reuse them. Together with Netivism Ltd. (a social enterprise based in Taipei) CC Taiwan engaged with memory institutions and independent collectors in Taiwan about the tools and practices for public domain repositories.
Exemplary public domain repositories are being setup using MediaGoblin (a free software package for hosting media collections) with new extensions developed for and supported by this project grant.”
-Tyng-Ruey Chuang, CC Taiwan, “Practices and Depositories for the Public Domain”
“As a result of the interaction, the students were able to experience the Open culture which has caused a boom in the Kenyan tech scene. They identified industries that were etched on the sole foundation of Open tools in Kenya and were able to understand more experientially than before, the importance of such ideals.”
-Simeon Oriko, CC Kenya, “School of Open Kenya Initiative”
“Obami, a platform for resource exchange for elementary school students, has seen a number of copyright violations. Instead of policing kids’ actions, the Creative Commons for Kids program will teach kids how to open and share their creative and educational works legally through the use of CC licenses […] introducing Creative Commons to the next generation of Africa.”
-Kelsey Wiens, CC South Africa, “Creative Commons For Kids”
“Despite all the work we have done, CC is still an unknown concept to most people in the Arab region. We live in a copy/paste region where it will take a lot of hard work for people to understand the concepts of attribution. After a series of CC presentations in local schools (ages 12 to 18), we found that CC awareness is almost non-existent. On the other hand, our videos at wezank.com have been very popular online and we believe that using this asset to spread CC’s mission & vision would be highly effective across the region. [… This project] is about creating content in Arabic for the CC community, and at any stage, anyone wishing to present CC in Arabic will be able to use those videos.”
-Maya Zankoul, CC Lebanon, “CC Simply Explained in Arabic”
“[Information is power]… In Africa, this rich geography of information doesn’t yet exist. And not because there isn’t the richness of knowledge, history or place, but, for a number of reasons, because there is little culture of contribution to the Internet.”
-Kelsey Wiens, Cross Regional Africa, “Activate Africa”
“If the government [in Japan] adopts CC BY or CC zero, data released under these terms will bring scalable impact on the public in a sense that it will help reuse of government data with minimum restrictions. The workshop materials are open to the public, and some of the attendees will learn to teach others, which give the project some ripple effects beyond its immediate outcomes.”
-Tomoaki Watanabe, CC Japan, “Workshops and Symposium for Open Data in Japan”
“In the Arab world there were several personalities who have a positive influence in the history of their country, in different areas. That’s why I wish to publish with the help of the Arab community, an Arabic book under CC license, which tells us their lives, stories, and their influence on their own countries.”
-Faiza Souici, CC Algeria, “Arabic Icons”
“In Colombia, libraries and librarians have become one of the important civil society groups that are collectively seeking information, understanding and participating in public spaces trying to redefine copyright as a tool for access to knowledge and not just as a source of income for some people. […] The material in this course will be open as a self-guided course that can be tapped on demand — individually, at a user-preferred time and date. Moreover, the course can be harnessed as a group, from a collective or specific institution, to be facilitated according to the possibilities and conditions of a given community.”
-Maritza Sanchez, CC Colombia / El Salvador / Uruguay, “An Online Course on Basic Copyright for Latinamerican Librarians”
Work on the Horizon
“Latin Americans are creating and freely making available high quality and innovative music independently from big companies. But it is necessary to work better on both musicians understanding their rights and the power of sharing.”
-Renata Avila, CC Guatemala, “Promoting Free Music in Central and South America”
“While Chile has encouraged the creation of open access journals nationwide, researchers with high rates of publication and citation do not see them as a real possibility when publishing. Any policy to promote the creation of journals in Chile should consider factors that give them an edge in the scientific circuit and thus becoming a real possibility by leading Chilean scientists.”
-Francisco Vera, CC Chile, “Promotion of Open Knowledge in the Chilean Academia: Ways to Facilitate Adoption of Creative Commons in the Academic World”
“The conclusion of this project is that there are only building blocks for Open Educational Resources (OER) in Romania since at the moment there is not a clear OER practice – only grassroots initiatives or projects with huge potential of becoming OER. Most of the projects we discovered in essence share the same philosophy behind OER, but they nevertheless omit to attribute a license for the created resources. In conclusion, more awareness and training activities are needed in order to reach a level of maturity regarding OER and their use.”
-Bogdan Manolea, CC Romania, “OER Awareness Activities for Librarians and Academics in Romania”
“Because many pupils and students cannot access hard copy textbooks which are discouragingly expensive, the importance of Creative Commons licenses in closing the literacy gaps which have been brought about by income inequality cannot be overstated.”
-Moses Mulumba, CC Uganda, “Promoting Creative Commons Initiatives in Uganda”
“The lessons that I learnt and which I can share is that grants from CC headquarters however, small [has great] potential impact to CC Affiliates as it acts as catalysts to the Affiliates to keep things going and mobilizing other funds locally.”
-Paul Kihwelo, CC Tanzania, “Tanzania Creative Commons Salon”
“We learnt that there is a high level of interest in Creative Commons in Ireland, and a need to continuously engage with people who are interested in Creative Commons.”
-Darius Whelan, CC Ireland, “Awareness-raising Event in Dublin, January 2014”
What do you get when you write software that becomes the basis of just about every geospatial application out there? You get perspective. Frank Warmerdam has been authoring, improving, supporting, and shepherding Shapelib, libtiff, GDAL and OGR for the past 15 years. Frank believes that by sharing effort, by adopting open, cooperatively developed standards, and avoiding proprietary licenses, adoption of open technologies could be supercharged. And lucky for us, he is right. To paraphrase him, open standards facilitate communication, capture common practice, and externalize arbitrary decisions.
Frank has done it all: he has worked as an independent consultant, for a proprietary remote sensing company, for a large search engine and mapping company, and now for a small, innovative space hardware maker. But most importantly, he has been a leader in the open geospatial world, at the helm of the Open Source Geospatial Foundation (OSGeo), which I myself have been involved with for as long as I have personally known Frank, that is, for a good part of the past decade.
While OSGeo has faced a number of challenges, it has also enjoyed tremendous success through a growing number of projects and chapters, local conferences, recognition as a legitimate player, and, recently, representation in its Charter Membership from 37 countries.
Frank says working on data libraries is a grungy job. Everyone wants ‘em but no one wants to work on ‘em. We relate to that, as licenses are kinda like that: essential infrastructure that requires getting the legal and technical details right, yet is most effective when it recedes into the background and lets us enjoy the content to the fullest.
Per Frank, the next set of challenges revolves around getting open geodata with easy-to-understand, interoperable license terms. As micro-satellite imagery becomes ubiquitous with frequent imagery collects, the resulting flood of imagery may lead to more ready adoption of open terms, perhaps even a current, live, or almost-live global, medium resolution basemap for OpenStreetMap. We can dream, and with my friend Frank to lead us with his quiet actions and measured wisdom, our dreams will come true.
Two weeks ago we wrote about the U.S. Executive Order and announcement of Project Open Data, an open source project (managed on GitHub) that lays out the implementation details behind the President’s Executive Order and memo. The project offers more information on open licenses, and gives examples of acceptable licenses for U.S. federal data. Some of this information is clear, while other pieces require more clarification. Below we’ve provided some commentary and notes on the licensing parts of Project Open Data.
The Open Licenses page on Project Open Data says that a license will be considered “open” if the following conditions are met:
Reuse. The license must allow for reproductions, modifications and derivative works and permit their distribution under the terms of the original work.
Users can copy and make adaptations of the data. The government may use a copyleft license, thus requiring that adapted works be shared under the same license as the original. In our view, the reference to the government using a license is confusing. Works created by federal government employees are in the public domain, and a license is not appropriate, at least as a matter of U.S. copyright law. More on this below.
The rights attached to the work must not depend on the work being part of a particular package. If the work is extracted from that package and used or distributed within the terms of the work’s license, all parties to whom the work is redistributed should have the same rights as those that are granted in conjunction with the original package.
Everyone is offered the work under the same public license.
Redistribution. The license shall not restrict any party from selling or giving away the work either on its own or as part of a package made from works from many different sources.
Third parties can sell the data verbatim or produce adaptations of the data and sell those.
The license shall not require a royalty or other fee for such sale or distribution.
Users don’t have to pay to use the licensed data.
The license may require as a condition for the work being distributed in modified form that the resulting work carry a different name or version number from the original work.
When the data gets remixed the licensor can require that the remixer note that their remixed version is different from the original.
The rights attached to the work must apply to all to whom it is redistributed without the need for execution of an additional license by those parties.
Public licenses must be used, which means that everyone gets offered the data under the same terms, without the need to negotiate individual licenses.
The license must not place restrictions on other works that are distributed along with the licensed work. For example, the license must not insist that all other works distributed on the same medium are open.
The license doesn’t infect other data or content that is distributed alongside the openly licensed data. It’s important that the open data is marked as such; the same goes for marking the non-open data.
If adaptations of the work are made publicly available, these must be under the same license terms as the original work.
This is a confusing statement, because it seems to require that all data be licensed under a copyleft license. This does not align with the licensing options listed in the Open License Examples page.
No Discrimination against Persons, Groups, or Fields of Endeavor. The license must not discriminate against any person or group of persons. The license must not restrict anyone from making use of the work in a specific field of endeavor. For example, it may not restrict the work from being used in a business, or from being used for research.
Anyone may use the licensed data for any reason.
Open License Examples
The Open License Examples page offers a helpful guide as to which open licenses will be accepted for government data released by federal agencies. As we noted in our earlier post, there is some confusion in that the Open Data Policy Memo says, “open data are made available under an open license that places no restrictions on their use.” Saying that data should be placed under a license with no restrictions doesn’t make sense, since even a very “open” license (such as CC BY) requires attribution to the author as a condition of using the licensed work. If the United States truly wishes to make federal government data available without restriction, it could consider mandating only those tools that accomplish this, for example the CC0 Public Domain Dedication or the Open Data Commons Public Domain Dedication and License.
Data and content created by government employees within the scope of their employment are not subject to domestic copyright protection under 17 U.S.C. § 105.
The fact that data and content created by federal government employees are not subject to copyright protection in the United States is a longstanding positive feature of the U.S. Code. But as noted here, this copyright-free zone only applies to domestic protection, i.e. inside the United States. Outside its borders, the United States government could assert that, for example, one of its works is protected under French copyright law, and then enforce its copyright in France. It’s unclear how much this legal nuance is leveraged outside of the United States. But it does seem to create a challenge for U.S. federal agencies in utilizing public domain dedication tools like CC0. This is because CC0 puts content into the worldwide public domain, whereas under Section 105 works created by federal government employees are only in the public domain in the United States. So, while it’s useful that works created by U.S. federal government employees are in the public domain in the United States, it’s a shame that this seems to preclude federal agencies from utilizing public domain tools like CC0, which would help communicate broad reuse rights easily and in machine-readable form. This raises a larger question: if information created by federal government employees is in the public domain in the United States, then is it inappropriate to license this data and content under one of the licenses noted below? And, if that is true, then what content will be licensed under the conformant licenses? Third party content?
When purchasing data or content from third-party vendors, however care must be taken to ensure the information is not hindered by a restrictive, non-open license. In general, such licenses should comply with the open knowledge definition of an open license. Several examples of common open licenses are listed below:
- Creative Commons BY, BY-SA, or CC0
- GNU Free Documentation License
- Open Data Commons Public Domain Dedication and Licence (PDDL)
- Open Data Commons Attribution License
- Open Data Commons Open Database License (ODbL)
- Creative Commons CC0
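To make the gap between “no restrictions” and the licenses listed above concrete, here is a minimal Python sketch. It records, for each listed license, whether it asks for attribution or share-alike, and then picks out the ones that impose neither. The `License` class, its two boolean fields, and the helper names are our own illustrative assumptions, and the flags reflect only how these licenses are commonly summarized, not a legal analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class License:
    """Hypothetical, simplified model of a license's reuse conditions."""
    name: str
    attribution: bool  # reusers must credit the source
    share_alike: bool  # adaptations must carry the same (or compatible) terms


# The licenses named in the list above, with conditions as commonly
# summarized (an assumption for illustration, not legal advice).
LISTED_LICENSES = [
    License("CC BY", attribution=True, share_alike=False),
    License("CC BY-SA", attribution=True, share_alike=True),
    License("CC0", attribution=False, share_alike=False),
    License("GNU Free Documentation License", attribution=True, share_alike=True),
    License("ODC PDDL", attribution=False, share_alike=False),
    License("ODC Attribution License", attribution=True, share_alike=False),
    License("ODC ODbL", attribution=True, share_alike=True),
]


def places_no_conditions(lic: License) -> bool:
    """True when the license asks for neither attribution nor share-alike."""
    return not (lic.attribution or lic.share_alike)


def unrestricted(licenses: list[License]) -> list[str]:
    """Names of the licenses that place no conditions on reuse."""
    return [lic.name for lic in licenses if places_no_conditions(lic)]


if __name__ == "__main__":
    print(unrestricted(LISTED_LICENSES))
```

Under these assumptions only the two public domain dedication tools, CC0 and the PDDL, come out as placing no conditions on reuse. Every other license on the list asks for at least attribution.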
Notwithstanding the questions above about licensing options for the work produced by federal government employees, the Administration is taking a great step in recommending that licenses should align with the Open Definition. In addition, the Administration might include information about appropriate software licenses, should those come into play when they release data.
Seal Of The Executive Office Of The President / Public Domain
Yesterday President Barack Obama issued an Executive Order requiring federal government information to be open and machine-readable by default. This Order is the latest in a series of actions going back to 2009 in support of increasing access to and transparency of government information.
In addition to the Executive Order, the White House released a Memorandum (PDF) explaining how federal government agencies will comply with the new open data policy.
This Memorandum requires agencies to collect or create information in a way that supports downstream information processing and dissemination activities. This includes using machine readable and open formats, data standards, and common core and extensible metadata for all new information creation and collection efforts. It also includes agencies ensuring information stewardship through the use of open licenses and review of information for privacy, confidentiality, security, or other restrictions to release.
It provides a forward-thinking set of guidelines for open data to be released by U.S. federal agencies:
Open data: For the purposes of this Memorandum, the term “open data” refers to publicly available data structured in a way that enables the data to be fully discoverable and usable by end users. In general, open data will be consistent with the following principles:
- Public. Consistent with OMB’s Open Government Directive, agencies must adopt a presumption in favor of openness to the extent permitted by law and subject to privacy, confidentiality, security, or other valid restrictions.
- Accessible. Open data are made available in convenient, modifiable, and open formats that can be retrieved, downloaded, indexed, and searched. Formats should be machine-readable (i.e., data are reasonably structured to allow automated processing). Open data structures do not discriminate against any person or group of persons and should be made available to the widest range of users for the widest range of purposes, often by providing the data in multiple formats for consumption. To the extent permitted by law, these formats should be non-proprietary, publicly available, and no restrictions should be placed upon their use.
- Described. Open data are described fully so that consumers of the data have sufficient information to understand their strengths, weaknesses, analytical limitations, security requirements, as well as how to process them. This involves the use of robust, granular metadata (i.e., fields or elements that describe data), thorough documentation of data elements, data dictionaries, and, if applicable, additional descriptions of the purpose of the collection, the population of interest, the characteristics of the sample, and the method of data collection.
- Reusable. Open data are made available under an open license that places no restrictions on their use.
- Complete. Open data are published in primary forms (i.e., as collected at the source), with the finest possible level of granularity that is practicable and permitted by law and other requirements. Derived or aggregate open data should also be published but must reference the primary data.
- Timely. Open data are made available as quickly as necessary to preserve the value of the data. Frequency of release should account for key audiences and downstream needs.
- Managed Post-Release. A point of contact must be designated to assist with data use and to respond to complaints about adherence to these open data requirements.
The Memorandum provides some more information about how U.S. government information will be made reusable:
Ensure information stewardship through the use of open licenses – Agencies must apply open licenses, in consultation with the best practices found in Project Open Data, to information as it is collected or created so that if data are made public there are no restrictions on copying, publishing, distributing, transmitting, adapting, or otherwise using the information for non-commercial or for commercial purposes.
Depending on the exact implementation details, this could be a fantastic move that would remove any legal confusion about using federal government data. By leveraging open licenses, the U.S. federal government would be doing a great service to reusers by communicating the available rights in advance. And, if the U.S. truly wishes to make federal government information available without restriction, it could consider using a tool such as the CC0 Public Domain Dedication. CC0 is used by many data providers to place open data directly in the public domain. We’ve already suggested this (PDF) as an option for sharing federally funded research data.
The White House should be commended for taking another positive step forward to ensure that U.S. government data is made legally and technically accessible and useable.