4 Stars for Metadata: an Open Ranking System for Library, Archive, and Museum Collection Metadata
MacKenzie Smith, June 17th, 2011
The library, archive and museum (LAM) community is increasingly interested in the potential of Linked Open Data to enable new ways of leveraging and improving our digital collections, as recently illustrated by the first international Linked Open Data in Libraries, Museums and Archives Summit (LOD-LAM) in San Francisco. The Linked Open Data approach combines knowledge and information in new ways by linking data about cultural heritage and other materials coming from different museums, archives and libraries. This not only allows for the enrichment of metadata describing individual cultural objects, but also makes our collections more accessible to users by supporting new forms of online discovery and data-driven research.
But as cultural institutions start to embrace Linked Open Data practices, the intellectual property rights associated with their digital collections become a more pressing concern. Cultural institutions often struggle with rights issues related to the content in their collections, primarily because they often do not hold the (copy)rights to the works in those collections. Instead, copyrights often rest with the authors or creators of the works, or with intermediaries who have obtained these rights from the authors, so that cultural institutions must get permission before they can make their digital collections available online.
However, the situation is generally less complex for the metadata that describes these cultural collections, both individual metadata records and collections of records. Factual data are not protected by copyright, and where descriptive metadata records or record collections are covered by rights (either because they are not strictly factual, or because they are vested with other rights such as the European Union’s sui generis database right) it is generally the cultural institutions themselves who are the rights holders. This means that in most cases cultural institutions can independently decide how to publish their descriptive metadata records, individually and collectively, allowing them to embrace the Linked Open Data approach if they so choose.
As the word “open” implies, the Linked Open Data approach requires that data be published under a license or other legal tool that allows everyone to freely use and reuse the data. This requirement is one of the most basic elements of the LOD architecture. And, according to Tim Berners-Lee’s 5 star scheme, the most basic way of making data available online is to make it ‘available on the web (whatever format), but with an open licence’. However, there is still considerable confusion in the field as to what exactly qualifies as “open” and as an “open license”.
While there are a number of definitions available such as the Open Knowledge Definition and the Definition of Free Cultural Works, these don’t easily translate into a licensing recommendation for cultural institutions that want to make their descriptive metadata available as Linked Open Data. To address this, participants of the LOD-LAM summit drafted ‘a 4-star classification-scheme for linked open cultural metadata’. The proposed scheme (obviously inspired by Tim Berners-Lee’s Linked Open Data star scheme) ranks the different options for metadata publishing — legal waivers and licenses — by their usefulness in the LOD context.
In line with the Open Knowledge Definition and the Definition of Free Cultural Works, licenses that impose restrictions on the ways the metadata may be used (such as ‘non-commercial only’ or ‘no derivatives’) are not considered truly “open” licenses in this context. This means that metadata made available under a more restrictive license than those proposed in the 4-star system should not be considered Linked Open Data.
According to the classification there are 4 publishing options suitable for descriptive metadata as Linked Open Data, and libraries, archives and museums trying to maximize the benefits and interoperability of their metadata collections should aim for the approach with the highest number of stars that they’re comfortable with. Ideally the LAM community will come to agreement about the best approach to sharing metadata so that we all do it in a consistent way that makes our ambitions for new research and discovery services achievable.
Finally, it should be noted that the ranking system only addresses metadata licensing (individual records and collections of records) and does not specify how that metadata is made available, e.g., via APIs or downloadable files.
The proposed classification system is described in detail on the International LOD-LAM Summit blog, but to give you a sneak preview, here are the rankings:
★★★★ Public Domain (CC0 / ODC PDDL / Public Domain Mark)
★★★ Attribution License (CC-BY / ODC-BY) where the licensor considers linkbacks to meet the attribution requirement
★★ Attribution License (CC-BY / ODC-BY) with another form of attribution defined by the licensor
★ Attribution Share-Alike License (CC-BY-SA / ODC-ODbL)
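For anyone who wants to check a metadata collection against the ranking programmatically, the scheme can be read as a simple lookup. The following is only a rough sketch in Python: the identifier strings (such as "PDM" for the Public Domain Mark), the function name `star_rating`, and the `linkback_attribution` flag are all assumptions made for illustration and are not part of the proposal itself. Licenses outside the scheme, such as non-commercial or no-derivatives variants, get zero stars here to reflect that they are not considered open in this context.

```python
# Illustrative labels for the legal tools named in the proposed ranking.
# These strings are assumptions for this sketch, not an official vocabulary.
PUBLIC_DOMAIN = {"CC0", "ODC-PDDL", "PDM"}
ATTRIBUTION = {"CC-BY", "ODC-BY"}
SHARE_ALIKE = {"CC-BY-SA", "ODC-ODbL"}


def star_rating(license_id: str, linkback_attribution: bool = False) -> int:
    """Return the number of stars (1-4) for a license, or 0 if it is not in the scheme.

    linkback_attribution is a hypothetical flag: True means the licensor
    considers a linkback sufficient to meet the attribution requirement.
    """
    if license_id in PUBLIC_DOMAIN:
        return 4
    if license_id in ATTRIBUTION:
        # Same license, different rank depending on the form of attribution.
        return 3 if linkback_attribution else 2
    if license_id in SHARE_ALIKE:
        return 1
    # Anything else (e.g. non-commercial or no-derivatives licenses)
    # falls outside the proposed scheme.
    return 0


if __name__ == "__main__":
    for lic, linkback in [("CC0", False), ("CC-BY", True), ("ODC-BY", False),
                          ("CC-BY-SA", False), ("CC-BY-NC", False)]:
        print(lic, linkback, star_rating(lic, linkback))
```

Note that the two middle ranks cannot be told apart by the license name alone, which is why the sketch takes the form of attribution as a separate input.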
We encourage discussion of this proposal as we work towards a final draft this summer, so please take a look and tell us what you think!
Paul Keller, Creative Commons and Knowledgeland (Netherlands)
Adrian Pohl, Open Knowledge Foundation and hbz (Germany)
MacKenzie Smith, MIT Libraries (USA)
John Wilbanks, Creative Commons (USA)