Yesterday President Barack Obama issued an Executive Order requiring federal government information to be open and machine-readable by default. This Order is the latest in a series of actions going back to 2009 in support of increasing access to and transparency of government information.
In addition to the Executive Order, the White House released a Memorandum (PDF) explaining how federal government agencies will comply with the new open data policy.
This Memorandum requires agencies to collect or create information in a way that supports downstream information processing and dissemination activities. This includes using machine readable and open formats, data standards, and common core and extensible metadata for all new information creation and collection efforts. It also includes agencies ensuring information stewardship through the use of open licenses and review of information for privacy, confidentiality, security, or other restrictions to release.
The Memorandum sets out a forward-thinking set of guidelines for the open data that U.S. federal agencies release:
Open data: For the purposes of this Memorandum, the term “open data” refers to publicly available data structured in a way that enables the data to be fully discoverable and usable by end users. In general, open data will be consistent with the following principles:
- Public. Consistent with OMB’s Open Government Directive, agencies must adopt a presumption in favor of openness to the extent permitted by law and subject to privacy, confidentiality, security, or other valid restrictions.
- Accessible. Open data are made available in convenient, modifiable, and open formats that can be retrieved, downloaded, indexed, and searched. Formats should be machine-readable (i.e., data are reasonably structured to allow automated processing). Open data structures do not discriminate against any person or group of persons and should be made available to the widest range of users for the widest range of purposes, often by providing the data in multiple formats for consumption. To the extent permitted by law, these formats should be non-proprietary, publicly available, and no restrictions should be placed upon their use.
- Described. Open data are described fully so that consumers of the data have sufficient information to understand their strengths, weaknesses, analytical limitations, security requirements, as well as how to process them. This involves the use of robust, granular metadata (i.e., fields or elements that describe data), thorough documentation of data elements, data dictionaries, and, if applicable, additional descriptions of the purpose of the collection, the population of interest, the characteristics of the sample, and the method of data collection.
- Reusable. Open data are made available under an open license that places no restrictions on their use.
- Complete. Open data are published in primary forms (i.e., as collected at the source), with the finest possible level of granularity that is practicable and permitted by law and other requirements. Derived or aggregate open data should also be published but must reference the primary data.
- Timely. Open data are made available as quickly as necessary to preserve the value of the data. Frequency of release should account for key audiences and downstream needs.
- Managed Post-Release. A point of contact must be designated to assist with data use and to respond to complaints about adherence to these open data requirements.
The Memorandum goes on to detail how U.S. government information will be made reusable:
Ensure information stewardship through the use of open licenses – Agencies must apply open licenses, in consultation with the best practices found in Project Open Data, to information as it is collected or created so that if data are made public there are no restrictions on copying, publishing, distributing, transmitting, adapting, or otherwise using the information for non-commercial or for commercial purposes.
Depending on the exact implementation details, this could be a fantastic move that would remove any legal confusion about using federal government data. By applying open licenses, the U.S. federal government would do reusers a great service by communicating in advance which rights are available. And if the U.S. truly wishes to make federal government information available without restriction, it could consider a tool such as the CC0 Public Domain Dedication, which many data providers use to place open data directly in the public domain. We’ve already suggested this (PDF) as an option for sharing federally funded research data.
The White House should be commended for taking another positive step toward ensuring that U.S. government data is legally and technically accessible and usable.