CC Science’s Indian November

Puneet Kishor, October 28th, 2014

We are in New Delhi and Mumbai for a number of presentations, workshops and meetings. Please come say hello if you are at these events or in the area.

SciDataCon2014 in New Delhi

The International Conference on Data Sharing and Integration for Global Sustainability (SciDataCon) is motivated by the conviction that most research challenges cannot be addressed without attending to issues relating to research data essential to all scientific endeavors. However, several cultural and technological challenges are still preventing the research community from realizing the full benefits of progress in open access and sharing. CODATA and WDS, interdisciplinary committees of the International Council for Science (ICSU), are co-sponsoring and organizing a high-profile international biennial conference at Jawaharlal Nehru University, New Delhi.

Nov 2: A day-long text and data mining (TDM) workshop offered in collaboration with ContentMine

TDM is an important scientific technique for analyzing large corpora of articles. It is used to uncover both existing and new insights in unstructured data sets that typically are obtained programmatically from many different sources. While the science and technology of TDM are complex enough, its legal complications are equally dizzying. Not only is its legal status unclear at best, it also varies from jurisdiction to jurisdiction, making cross-national collaboration difficult. Besides the license status of the original material, contractual agreements between research institutions and publishers, who are often the gatekeepers of the corpora, can create significant hurdles. The workshop offers an introduction to TDM, presenting the legal considerations through hands-on exercises.

Nov 3: How well is the data chain working?

Effective and efficient application of scientific data for the benefit of humanity entails agreed goals, clear and reproducible methods, and transparent communication throughout the data chain from producer to user via data organizer and research publisher. How well is that working? A panel discussion at the close of each day will summarise that day’s conclusions and respond to the question of how well the data chain may be working from a trio of perspectives: conference organizer, data-management expert, and data producer.

Nov 5: Citing Data to Facilitate Multidisciplinary Research

Synthesis Data Citation Principles and Their Implications for TDM: Importance; Credit and Attribution; Evidence; Unique Identification; Access; Persistence; Specificity and Verifiability; and Interoperability and Flexibility. These eight important phrases describe the data citation principles agreed upon by the community, published under a joint declaration, and endorsed by 185 individuals and 83 organizations. But what are the implications of these principles beyond just citation, particularly with respect to automated analysis of large corpora of articles? This presentation will briefly present the principles, and then explore some of the issues that we have to come to grips with in order to make text and data mining (TDM) easy for scientists.

Nov 5: Challenges and Benefits of Open Science Data and International Data Sharing

Maximizing Legal Interoperability Through Open Licenses: Many scientists do think about interoperability, as they have to work with colleagues from other domains. However, common interoperability efforts are focused on technical and, if we are lucky, semantic interoperability. Rarely do scientists think of legal interoperability in the design of their science experiments. Can my work be legally mixed with someone else’s work without violating any intellectual property (or worse, privacy and security) laws? Is my work portable not just across scientific domains but also across jurisdictional boundaries? We attempt to shed light on some of these questions in this presentation.

Nov 5: Talk on CC/OKF open science activities to be given at the computer science dept., Indian Institute of Technology-Delhi

Jenny Molloy of OKFN Open Science and I will be introducing the young computer science students at IIT-Delhi to the various open science and data activities around the world. This talk is organized by Prof. Aaditeshwar Seth, Computer Science, IIT-Delhi.

Nov 6-8: Meetings on citizen science and sensors at the Homi Bhabha Centre for Science Education (HBCSE), Mumbai

HBCSE at Tata Institute of Fundamental Research (TIFR), Mumbai is a National Center with the broad goals to promote equity and excellence in science and mathematics education from primary school to undergraduate college level, and encourage the growth of scientific literacy in the country. We will be discussing with HBCSE’s metaStudio potential areas of collaboration in citizen science and the use of sensors in projects to accelerate the growth of scientific awareness in the country through direct public participation in science.

Vaya Con Datos

Puneet Kishor, October 5th, 2014

What were five hundred folks from 30 countries doing in 40+ different sessions running concurrently in three rooms of two gorgeous buildings in Ciudad de México? They were showing, sharing and learning from the best of each other’s work utilizing open data, pushing governments to adopt open policies, and hacking for social, environmental and humanitarian change in Latin America and the Caribbean. Condatos may be the most important regional conference on open data held in Latam, but it is undoubtedly a showcase of the diversity, ingenuity, vibrancy and perseverance of the changemakers in that historic yet energetic region.

Creative Commons was invited to a panel discussion on user licenses. Some of the innovative sessions that stood out were on Migrahack, health education in favelas in Brasil, a session on the Internet of Things, a hacking workshop, and mapping labs including one on using drones for mapping.

The two buildings of the conference venue were definitely symbolic of the dynamic nature of the gathering: the historic and gorgeous Biblioteca de México, with Octavio Paz looking down on the young crowd and its high stone walls inscribed with words from the giants of Mexican literature, were like bookends in time; the soaring, modernistic architecture of Cineteca Nacional was a nod to the exponential change in thinking and practice that was being hacked by the young crowd.

We are grateful for the chance to present our vision for a public commons of information that can both drive and be driven by the energy and innovation on display at the conference, and are thrilled at the new partnerships that hold promise for further expansion of the powerful concepts of open and sharing.

CC0 To the extent possible under law, Puneet Kishor has waived all copyright and related or neighboring rights to all photos and PDF in this blog post.

Examining deficiencies of and limitations on data sharing

Puneet Kishor, August 18th, 2014

Whether as patients, as part of traffic, while exercising, or simply while walking with one of the behavioral trackers du jour, we are constantly giving data about ourselves and our surroundings to data collectors with few returns. From privacy regulations to bureaucratic barriers to collecting and locking up information just in case it might create monetary value in the future, there are a multitude of barriers between those who collect information and those who want to use it.

With support from Robert Wood Johnson Foundation (RWJF), we are launching two projects exploring different aspects that often get in the way of easy sharing of citizen-sourced information.

Sharing v. Privacy

Original image by Puneet Kishor released under a CC0 Public Domain Dedication

In collaboration with the Institute for Human Genetics and EngageUC at UCSF, and Personal Genome Project at Harvard University, we will explore the practical, ethical and legal implications of emphasizing benefits of sharing over the need for privacy at a workshop planned for Spring 2015 in Washington DC. A few of the questions to be tackled at the workshop: What if, instead of emphasizing the imperative of protecting privacy, we emphasized the potential benefits from sharing? Would most patients agree to let their information be shared?

Sensored City

Original image by Puneet Kishor released under a CC0 Public Domain Dedication

Partnering with Manylabs, a San Francisco-based sensor tools and education nonprofit, and Urban Matter, Inc., a Brooklyn-based design studio, and in collaboration with the City of Louisville, Kentucky, and Propeller Health, maker of a mobile platform for respiratory health management, we will design, develop and install a network of sensor-based hardware that will collect environmental information at high temporal and spatial scales and store it in a software platform designed explicitly for storing and retrieving such data.

Further, we will design, create and install a public data art installation that will be powered by the data we collect, thereby communicating back to the public what has been collected about them.

Silent Lights Image © Urban Matter, Inc., used with permission.

Please follow our progress on Sharing v. Privacy and the Sensored City projects, and get in touch with us if you want to learn more.

CC Signs Bouchout Declaration for Open Biodiversity

Puneet Kishor, July 11th, 2014

CC is supporting the Bouchout Declaration for Open Biodiversity Knowledge Management by becoming a signatory. The Declaration’s objective is to help make biodiversity data openly available to everyone around the world. It offers the biodiversity community a way to demonstrate their commitment to open science, one of the fundamental components of CC’s vision for an open and participatory internet.

In April 2013 CC participated in a workshop on Names attribution, rights, and licensing convened by the Global Names Project which led to a report titled Scientific names of organisms: attribution, rights, and licensing that concluded:

“There are no copyright impediments to the sharing of names and related data. The system must reward those who make the contributions upon which we rely. Building an attribution system remains one of the more urgent challenges that we need to address together.”

Many of the attendees of the workshop and authors of the report cited above are among those who met in June in Meise, Belgium and released the Bouchout Declaration.

Donat Agosti introducing the Bouchout Declaration at the OpenDataWeek, RMLL, Montpellier, France, July 11, 2014. Photo by P. Kishor released under CC0 Public Domain Dedication

The declaration calls for free and open use of digital resources about biodiversity and associated access services and exhorts the use of licenses or waivers that grant or allow all users a free, irrevocable, world-wide, right to copy, use, distribute, transmit and display the work publicly as well as to build on the work and to make derivative works, subject to proper attribution consistent with community practices, while recognizing that providers may develop commercial products with more restrictive licensing. Not only is this aligned with the vision of CC itself; CC is also the creator and steward of the legal and technical infrastructure that allows open licensing of content.

Screenshot of phylogeny from PhyLoTA as displayed in BioNames. The user can zoom in and out and pan, as well as change the layout of the tree. From BioNames: linking taxonomy, texts, and trees by Roderick D. M. Page, used under a CC BY License.

The declaration also promotes “Tracking the use of identifiers in links and citations to ensure that sources and suppliers of data are assigned credit for their contributions” and “Persistent identifiers for data objects and physical objects such as specimens, images and taxonomic treatments with standard mechanisms to take users directly to content and data.” CC has participated from the beginning in the activities that led to the Joint Declaration of the Data Citation Principles, which promotes the use of persistent identifiers to allow discovery and attribution of resources.

Finally, the declaration calls for “Policy developments that will foster free and open access to biodiversity data.” CC works assiduously on creating, fostering, nurturing and assisting in the promulgation of open policies and practices that advance the public good by supporting open policy advocates, organizations and policy makers.

We have a few concerns: most copyright laws around the world treat data as not protected by copyright, and thus data would not require licensing. We are also aware that some cultures wish to preserve and protect traditional knowledge, so we want to make sure information is released only by those who have the right to do so, without impinging on the rights of such segments that might otherwise be negatively affected by its release. However, overall we believe that open biodiversity information is crucial for science and society.

Be it heralding the Seeds of Change, participating in the Group on Earth Observations (GEO), or assisting the Paleobiology Database to move to a CC BY license, CC is playing a vital role in the progress of open science in the areas of biodiversity and natural resources. CC has committed to assisting organizations joining Google in the White House Climate Data Initiative. On a personal front, I have released the entire codebase of Earth-Base under the CC0 Public Domain Dedication, making possible applications such as Mancos on the iOS App Store.

Bouchout Signatories. Image by Plazi released under a CC0 Public Domain Dedication

Most of the world’s biodiversity is in developing countries, and ironically, most biodiversity information and collections are in developed countries. Agosti calls this, “Biopiracy: taking biodiversity material from the developing world for profit, without sharing benefit or providing the people who live there with access to this crucial information.” (Agosti, D. 2006. Biodiversity data are out of local taxonomists’ reach. Nature 439, 392) Opening up the data will benefit the developing countries by giving them free and easy access to information about their own biological riches. Friction-free access to and reuse of data, software and APIs is essential to answering pressing questions about biodiversity and furthering the move to better understanding and stewarding our planet and its resources. Signing the Bouchout Declaration strengthens this movement.

Liberating the Haystack for the Needles

Puneet Kishor, June 2nd, 2014

This post was written with invaluable assistance from the CC legal and policy teams.

Text and data mining (TDM) is becoming an increasingly important scientific technique for analyzing large amounts of data. The technique is used to uncover both existing and new insights in unstructured data sets that typically are obtained programmatically from many different sources.

PBDB Navigator screenshot released under a CC0 1.0 Public Domain Dedication

A few of the innovative examples include GeoDeepDive, a system that helps geoscientists discover information and knowledge buried in the text, tables, and figures of geology journal articles; improving human curation of chemical-gene-disease networks for the Comparative Toxicogenomics Database; and discovering a new link between genes and osteoporosis.

Legal Uncertainty

While the science and technology of TDM are complex enough, involving information retrieval (IR), optical character recognition (OCR), and natural language processing (NLP), the legal complications are, sadly, equally dizzying. The legal status of TDM is unclear at best, both because there are a multitude of techniques to engage in TDM, and because the implications of various techniques vary from jurisdiction to jurisdiction. This makes cross-national collaboration, integral to science, difficult at best. For example, TDM is generally considered to not implicate copyright in the U.S. There are several theories as to why TDM falls outside copyright, but the most obvious is that it uses copyrighted material for a transformative purpose and is therefore a fair use. Judge Baer, writing in Authors Guild, Inc., et al. v. HathiTrust, et al. (Case 1:11-cv-06351-HB), stated:

“The use to which the works in the HDL are put is transformative because the copies serve an entirely different purpose than the original works: the purpose is superior search capabilities rather than actual access to copyrighted material. The search capabilities of the HDL have already given rise to new methods of academic inquiry such as text mining.”

Judge Baer goes on to state:

“I cannot imagine a definition of fair use that would not encompass the transformative uses made by Defendants’ MDP and would require that I terminate this invaluable contribution to the progress of science and cultivation of the arts.”

The clarity, however, is far from universal as the situation outside the U.S. gets muddy. While there have been a few welcome developments in the U.K., the copyright laws of many other countries have little to no clarity on whether TDM falls outside of the reach of copyright and related laws. Where TDM does implicate copyright, the license status of the original material can make automated access and analysis very complicated, requiring additional checks to ensure any material is only being used as permitted by the license. And, even where the relevant licenses are free and open, and conducive to TDM, contractual agreements between research institutions and publishers, who are often the gatekeepers of the corpora, can create significant hurdles.
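To make the "additional checks" mentioned above a little more concrete, here is a minimal sketch in Python of how a mining pipeline might set aside articles whose license does not clearly permit reuse. Everything in it is an assumption for illustration only: the `license` field, the record layout, and the `MINABLE_LICENSES` allowlist are hypothetical, and which licenses actually permit TDM in a given jurisdiction is a legal question this sketch does not answer.

```python
# Hypothetical allowlist: license labels this sketch ASSUMES permit mining.
# Whether a license really permits TDM depends on the jurisdiction.
MINABLE_LICENSES = {"CC0", "CC BY", "CC BY-SA"}


def normalize_license(raw):
    """Collapse whitespace, uppercase, and turn a leading 'CC-' into 'CC '."""
    label = " ".join(raw.split()).upper()
    if label.startswith("CC-"):
        label = "CC " + label[3:]
    return label


def partition_corpus(records):
    """Split article records into (minable, needs_review) by license label.

    Each record is assumed to be a dict with an optional 'license' key.
    Records with a missing or unrecognized license go to needs_review.
    """
    minable, needs_review = [], []
    for record in records:
        label = normalize_license(record.get("license") or "")
        if label in MINABLE_LICENSES:
            minable.append(record)
        else:
            needs_review.append(record)
    return minable, needs_review


# Made-up example records, not real articles.
corpus = [
    {"id": "a", "license": "cc-by"},
    {"id": "b", "license": "All rights reserved"},
    {"id": "c"},
    {"id": "d", "license": " CC0 "},
]
ok, review = partition_corpus(corpus)
print([r["id"] for r in ok], [r["id"] for r in review])
```

Anything without a recognized label is routed to human review rather than mined, which is the cautious default when the license status is unclear.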

Public Sentiment

In a comment on the proposed U.K. exception for information mining, both iCommons and the Open Knowledge Foundation (OKFN) supported the UK Government’s opinion that it is inappropriate for “Certain activities of public benefit such as medical research obtained through text mining to be in effect subject to veto by the owners of copyrights in the reports of such research, where access to the reports was obtained lawfully.” PLOS opined, “Enabling content mining is a core part of the value offering for Open Access publication services.” In its response to the EU copyright review, LIBER stated, “All exceptions related to education, learning and access to knowledge to be made mandatory. In particular, we would like to see a specific exception for text and data mining for all research purposes.” OKFN’s Working Group on Open Access stated:

“We assert that there is no legal, ethical or moral reason to refuse to allow legitimate accessors of research content (OA or otherwise) to use machines to analyse the published output of the research community. Researchers expect to access and process the full content of the research literature with their computer programs and should be able to use their machines as they use their eyes.”

Support for text and data mining under the banner of “The right to read is the right to mine” has been demonstrated by other organizations, including in the declarations by Copyright for Creativity (July 2013) and the International Federation of Library Associations and Institutions (December 2013). If we as a society wish to realize the incredible potential for text and data mining, the practice should not be controlled through contractual terms or licensing.

Instead of relying on contractual restrictions or licensing to engage in text and data mining, non-consumptive uses of texts should be expressly eliminated from the reach of copyright and contract. The UK’s Hargreaves Report (PDF, p. 47) suggested the adoption of an exception to copyright law for non-consumptive uses, which are “uses of a work enabled by technology which does not trade on the underlying creative and expressive purpose of the work.”

Most recently, the UK copyright reform legislation introduced changes that make it easier to engage in TDM for non-commercial purposes, allow storing of the corpus locally as long as it remains protected from general public access, and perhaps most importantly, disallow contractual negotiations that would make it difficult to conduct TDM.

The above sentiments are laudable, and copyright reforms friendly to TDM are very important, and we support such efforts. However, we believe the more knowledgeable potential users of TDM are about the technology and related issues, the better they will be able to negotiate conditions that make their research easy and efficient. Hence, we want to push forward with education and awareness building as a bottom-up effort.

Building Bottom-Up Support

Image by R. Mounce extracted from doi: 10.11646/phytotaxa.163.5.1, licensed under the Creative Commons Attribution (CC BY) 3.0 license

We are working with the ContentMine team to develop an agenda for a workshop that would provide training in TDM and educate the participants regarding the legal considerations through hands-on exercises. We will introduce the topic, the tools and techniques, tackle a specific problem, and then use that to expose researchers to the legal complications that they may encounter in conducting their research and the legal considerations they should keep in mind when choosing a license for their works. We have three objectives for this series of workshops:

  1. Introduce participants to the basic tools and techniques of text and data mining (TDM);
  2. Make participants aware of the legal intricacies of TDM and the implications of choosing the right licenses that enable TDM for downstream users;
  3. Nurture a community of practice whose members may draw upon each other for continued help.

To be clear, we do not intend the workshop to be a detailed and comprehensive training in TDM, and it is certainly not a replacement for expertise in this deep and comprehensive technique. Instead, the workshop is designed to be both an introduction to basic technical and legal concepts and an opportunity to network with experts and novices interested in the field. We hope participants intending to use TDM for their work will be better informed when seeking collaboration with TDM experts.

Original artwork by Puneet Kishor released under CC0 Public Domain Dedication

The first instance of this workshop will be held at the 2014 Open Knowledge Festival. We hope to follow it with one in Nairobi in Aug 2014 at the International Workshop on Open Data for Science and Sustainability in Developing Countries (OpenDataSSDC) organized by the CODATA Task Group on Preservation of and Access to Scientific and Technical Data in Developing Countries (CODATA PASTD), and one possibly at SciDataCon in New Delhi in Nov 2014. We hope to make these workshops a recurring event, building a roster of interesting exercises and problems to solve, and constantly improving the content based on audience feedback and ongoing research.

In cooperation with computing, legal and library experts, we will adapt the workshop agenda to make it more suitable and relatable to the host institutions. Our aim is to reach communities of researchers in countries that are otherwise under-represented in the global conversation on open science and data. We have identified researchers, and will continue to identify more, on both the technical and legal sides with whom we intend to start building a network. If you are working with TDM, intend to work with TDM, or have expertise either in its technology or in related legal issues specific to your jurisdiction, please contact us.

We also intend to develop a community of practice for TDM, either standalone or via existing platforms such as StackExchange, and will utilize online resources such as forums, mailing lists, and a roster of technical, legal and institutional experts available to provide assistance with TDM.

Seeds of Change

Puneet Kishor, May 21st, 2014

I received a fat packet in the mail, full of seeds with unusual names (Magma Mustard; Flashy Lightning Lettuce; Lemon Pastel Calendula; Cherry Vanilla Quinoa) and an even more unusual but evocative note stuck on the packets.

This Open Source Seed pledge is intended to ensure your freedom to use the seed contained herein in any way you choose, and to make sure those freedoms are enjoyed by all subsequent users. By opening this packet, you pledge that you will not restrict others’ use of these seeds and their derivatives by patents, licenses, or any other means. You pledge that if you transfer these seeds or their derivatives they will also be accompanied by this pledge.

Welcome to the Open Source Seed Initiative, a group that includes scientists, citizens, plant breeders, farmers, seed companies, and gardeners, and has its origins in both the open source software movement and in the realization among plant breeders and social scientists that continued restrictions on seed may hinder our ability to improve our crops and provide access to genetic resources.

Jack Kloppenburg, Professor, Department of Community and Environmental Sociology, and one of the founders of OSSI, contacted me a couple of years ago, just around the time I joined CC full-time. He was hoping for a CC-type license for the seeds. CC’s focus, however, is restricted to copyright. And, at least for now, copyright is an area that keeps our hands full. However, OSSI’s goals are very much in line with CC’s mission: to free information, to make it flow from those who create it to those who want to use it, with the least impedance. And what better example of information than a seed in which the very blueprint of life is embedded?

Jack’s email signature reads, “Well,” she said, “you have a high tolerance for lunatics, don’t you?” Knowing Jack, that sounds about right. You’ve got to be crazy to be able to change the world.

Yes Jack, let’s talk, heck, let’s not just talk, but let’s actually collaborate and spread the seeds of change.

Precocious One Year Old Turning Academic Publishing On Its Head

Puneet Kishor, February 12th, 2014

 

“If we can set a goal to sequence the Human Genome for $99, then why shouldn’t we demand the same goal for the publication of research?”

 

PeerJ started with that bold challenge. Now, the scrappy startup that dared has done it. One year old today, PeerJ, the peer-reviewed journal, has seen startling growth, having published 232 articles under CC BY 3.0 last year. By the way, per Scimago that number is more than what 90% of other journals publish in a year. Then in April 2013 PeerJ started publishing PeerJ PrePrints, the non-peer-reviewed preprint server, with 186 PrePrints in 2013, all under CC BY 3.0.

Now PeerJ has more than 800 Academic Editors, from a wide variety of countries and institutions. There are also five Nobel Prize winners on the PeerJ Board. PeerJ receives submissions from all over the world, and covers all of the biological, health, and medical sciences. As of the time of this post’s publication, the top subject areas for PeerJ submissions were:

Subject                      Articles
Ecology                      106
Bioinformatics               69
Evolutionary Studies         66
Zoology                      54
Computational Biology        49
Microbiology                 48
Psychiatry and Psychology    47
Marine Biology               45
Biodiversity                 45
Biochemistry                 45

Not everything has been easy. Starting an entire publishing company from scratch has been a learning experience for the entire team. Going from no brand recognition, no history, and no infrastructure to having successfully established themselves in all the places that a publishing company should be in (archiving solutions, DOI issuing services, indexing services, membership of professional bodies, ISSN registrations, and so on), PeerJ has done very well. Last year PeerJ won the ALPSP Award for Publishing Innovation.

PeerJ’s vision/mission are deceptively simple:

  • Keep Innovating
  • Remember Whom We Serve
  • Pass on the Savings

PeerJ’s decision-making process is fast, very fast. Authors get their first decision back in a median of 24 days. Being small and non-traditional means they can take risks. They have built interesting functionality and models such as optional open peer review. Their business model is based on individuals purchasing low-cost lifetime publication plans, and that has resulted in a lot of their functionality being very individual-centric.

Compared to traditional publishers, PeerJ is a very tech-focused company. They built all the technology themselves, which is quite unusual in the academic publishing world, where publishers normally use third parties for their peer-review software and publication platforms. By doing it themselves they have much more control over their destiny and costs, and can build functionality that suits their unique needs. The high percentage of authors describing their experience with PeerJ as their best publishing experience is arguably a direct result of this. Much of PeerJ’s software is open source, and their techie roots are evident in their engagement with the community via events such as Hack4ac, a hackday to specifically celebrate, ahem, CC BY!

Peter Binfield, Co-Founder, says:

We firmly believe that Open Access publishing is the future of the academic journal publishing system. With the current trends we see in the marketplace (including governmental legislation; institutional mandates; the rapid growth of the major OA publishers; and the increasing education and desire from authors) we believe that Open Access content will easily make up >50% of newly published content in the next 4 or 5 years.

 
Once all academic content is OA and under an appropriate re-use license we believe that significant new opportunities will emerge for people to use this content; to build on it for new discoveries and products; and to accelerate the scientific discovery process.

Binfield continues:

We regard the CC-BY license as the gold standard for OA Publications. Some other publishers provide authors with “NC” options, or try to write their own OA licenses, but we have a firm belief in the CC BY flavor. If there are many different OA licenses in play then it becomes increasingly difficult for users to determine what rights they have for any given piece of work, and so it is cleaner and simpler if everyone agrees on a single (preferably liberal) license. We were pleased to see the license updated to 4.0 and were quick to adopt it.

In Jan 2014, PeerJ moved to CC BY 4.0 for all articles newly submitted from that point onwards (prior articles remain under CC BY 3.0 of course). Today, on PeerJ’s first birthday, we at CC send PeerJ our best wishes, and look forward to ever more courageous, even outrageous innovations from this precocious one year old.

CC is now a Group on Earth Observations (GEO) Participating Organization

Puneet Kishor, January 16th, 2014

Yesterday (January 15, 2014), the Group on Earth Observations approved Creative Commons as a Participating Organization (PO) at its GEO-X Plenary in Geneva.

GEO was launched in response to calls for action by the 2002 World Summit on Sustainable Development and by the G8 (Group of Eight) leading industrialized countries to exploit the growing potential of Earth observations to support decision making in an increasingly complex and environmentally stressed world. GEO is coordinating efforts to build a Global Earth Observation System of Systems (GEOSS).

GEOSS provides decision-support tools to a wide variety of users via a global and flexible network of content providers. GEOSS lets decision makers access a range of information by linking together existing and planned observing systems around the world, and supports the development of new systems where gaps exist. GEOSS promotes common technical standards so that data from the thousands of different instruments can be combined into coherent data sets. The GEOPortal offers a single Internet access point for users seeking data, imagery, and analytical software packages relevant to all parts of the globe. For users with limited or no access to the internet, similar information is available via the GEONETCast network of telecommunication satellites.

GEO is a voluntary partnership of governments and international organizations providing a framework to develop new projects and coordinate their strategies and investments. As of 2013, GEO’s Members include 89 Governments and the European Commission. In addition, 67 intergovernmental, international, and regional organizations with a mandate in Earth observation or related issues have been recognized as Participating Organizations (PO).

Dr. Robert Chen, CC’s Science Advisory Board member, was at the Plenary, and he offered the following comment: “The GEO Executive Director, Barbara Ryan, pointed out in plenary that there was an extensive discussion in the GEO Executive Committee about making sure that new POs are active contributors to GEO activities. She noted that all of the proposed POs in today’s slate met this criterion.”

Creative Commons has been contributing to the GEO Data Sharing Task Force’s Legal Interoperability Sub-Group and its draft white paper on “Legal Options for the Exchange of Data through the GEOSS Data-CORE” (PDF). (I was a part of the Sub-Group as a Science Fellow, and our Senior Counsel, Sarah Pearson, reviewed the paper.) We intend to continue to be active contributors by guiding GEO and its members on the legal aspects of data sharing.

Thanks to Paul Uhlir of the Board on Research Data and Information, National Academies, for making the right introductions, and to John Wilbanks, another Science Advisory Board member, for initially encouraging CC to get involved with GEO.


Paleobiology Database now CC BY

Puneet Kishor, December 19th, 2013

[written in collaboration with Shanan Peters, Professor, Department of GeoScience, University of Wisconsin-Madison and the Principal Investigator of the Paleodb Project]

The Paleobiology Database now available under CC BY

After a year of community feedback and discussion, the Paleobiology Database has decided that “All records are made available to the public based on a Creative Commons license that requires attribution before use.” The Paleobiology Database is now licensed under a CC BY 4.0 International License.

Paleontology

Paleontology, the description and biological classification of fossils, has spawned countless field expeditions, museum trips, and hundreds of thousands of publications. The construction of databases that aggregate these descriptive data on fossils in a way that allows large-scale, synthetic questions to be addressed, such as the long-term history of biodiversity and rates of biological extinction and origination during global environmental change, has greatly expanded the intellectual reach of paleontology and has led to many important new insights into macroevolutionary and macroecological processes.

Paleobiology Database

One of the largest compendia of fossil data assembled to date is the Paleobiology Database (PBDB), founded in 1998 by John Alroy and Charles Marshall. These two pioneers assembled a small team of scientists who were motivated to generate the first geographically explicit, sampling-standardized global biodiversity curve. The PBDB has since grown to include an international group of more than 150 contributing scientists with diverse research agendas. Collectively, this body of volunteer and grant-supported investigators has spent more than 9 continuous person-years entering more than 280,000 taxonomic names, nearly 500,000 published opinions on the status and classification of those names, and over 1.1 million taxonomic occurrences. Some PBDB data derive from the original fieldwork and specimen-based studies of the contributors, but the majority of the data were extracted from the text, figures, and tables of over 48,000 published papers, books, and monographs that span the range of topics covered by paleontology. Their efforts have been well rewarded by enabling new science. As of December 2013, the PBDB had produced almost two hundred official peer-reviewed publications, all of which address scientific questions that cannot be adequately answered without such a database.

Ptyagnostus atavus or Leiopyge calva Zone (Cambrian of the United States)
Olenoides superbus, Late Middle Cambrian, Upper Marjum Formation, House Range, Millard County, Utah, USA - Houston Museum of Natural Science

Photo by Wikipedia user Dwergenpaartje under CC0 Public Domain Dedication

  • Where: Utah (38.9° N, 113.4° W: paleocoordinates 4.1° S, 92.0° W)
  • When: Ptyagnostus atavus or Leiopyge trilobite zone, Marjum Limestone Formation, Marjumian (513.0 – 498.5 Ma)
  • Environment/lithology: offshore ramp; burrowed, peloidal packstone
  • Size classes: macrofossils, mesofossils
  • Primary reference: A. J. Rowell and N. E. Caruso. 1985. The evolutionary significance of Nisusia sulcata, an early articulate brachiopod. Journal of Paleontology 59(5):1227-1242 [A. Hendy/A. Hendy]
  • Purpose of describing collection: taxonomic analysis
PaleoDB collection 262: authorized by Jack Sepkoski, entered by Mike Sommers on 20.11.1998

Shift to CC BY

From its inception, the paleontologists who have invested the most effort in entering data have made decisions about data management and access policies, which ultimately brings up the important questions of proper licensing and citation. In the first application of the PBDB licensing policy, the individual contributors chose their own CC license for each fossil collection record. As a result, there were three kinds of contributors: those who didn’t know what to do, didn’t care, or didn’t know about the new policy that required them to specify how existing collections should be licensed (55% of the data); those who selected the most restricted option available to them (34% of the data); and those who selected the most unrestricted option available to them (10% of the data).

This policy received a mostly negative response via social media and other outlets, partly because of the increased attention the database was receiving during a leadership and governance transition. Naturally, the governance group responded to the community feedback. The first concrete action came from individual contributors. Many of the contributors who either didn’t know about CC licenses or who hadn’t thought fully about their meaning and implications changed their own individual licenses. These changes always went from a more restrictive license to the least restrictive option available to them: CC BY. That wave of individual choices towards the least restrictive license immediately shifted the balance for records in the database. At that point, only one contributor had a restrictive license, and the governance group quickly moved to adopt one single unifying license for the database: CC BY. Now, all new records are explicitly CC BY as part of database policy, although individual contributors still have the option of placing a moratorium on the public release of their own new data so as to protect their individual scientific interests.

Future of PBDB

In addition to being a scientific asset to the field of paleontology, the PBDB and other databases like it provide an additional means by which to participate in rapidly emerging initiatives and developments in cyberinfrastructure. To increase its reach in this area, the PBDB now has an Application Programming Interface (API), which makes data more easily and transparently accessible, both to individual researchers and to applications, such as the open source web application PBDB Navigator and the Mancos iOS mobile application. Both of these applications are built on the public API and are designed to allow the history of life and environment documented by the PBDB to be more discoverable. These new modes of interactivity and visualization highlight unintended, but potentially useful, aspects of the PBDB. The PBDB API has facilitated a loosely coupled integration with other related but independently managed biological and paleontological database initiatives and online resources, such as the Neotoma Paleoecology Database, Morphobank, and the Encyclopedia of Life. The PBDB API can also be harnessed by geoscientists outside of paleontology, thereby facilitating the integration of paleontological data with diverse types of data and model output, such as paleogeographic plate rotation and geophysical models in GPlates. The liberal CC BY license ensures the interoperability and data access necessary to facilitate fundamentally new science, and it expands the reach of paleontology to a broader community of researchers and educators than is possible via any single website or application.
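
For readers curious what "built on the public API" might look like in practice, here is a minimal sketch of how an application could assemble a request for fossil occurrences. It is only an illustration: the base URL, the `occurrences` path, and the `taxon` and `interval` parameter names are all assumptions made for this example and are not taken from the PBDB API documentation, which should be consulted for the real endpoint and parameters.

```python
from urllib.parse import urlencode

# Hypothetical base URL (an assumption for this sketch); substitute the
# endpoint given in the actual PBDB API documentation.
BASE_URL = "https://example.org/data"


def build_occurrence_query(taxon, interval=None, fmt="json"):
    """Return a URL asking for fossil occurrences of `taxon`.

    The path and parameter names are illustrative assumptions,
    not the documented PBDB API.
    """
    params = {"taxon": taxon}
    if interval is not None:
        params["interval"] = interval
    # urlencode escapes the values and keeps the dict's insertion order.
    return f"{BASE_URL}/occurrences.{fmt}?{urlencode(params)}"


url = build_occurrence_query("Olenoides", interval="Cambrian")
print(url)
```

Because the records are CC BY, any application that reuses data retrieved this way should also carry an attribution to the Paleobiology Database alongside the results.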


BioMed Central moves to CC BY 4.0 along with CC0 for data

Puneet Kishor, December 18th, 2013

CC 4.0 at BioMed Central, Chemistry Central, and SpringerOpen

BioMed Central (BMC) is one of the largest open access (OA) publishers in the world with 250 peer-reviewed OA journals, and more than 100,000 OA articles published yearly. BMC is also a long-time user of CC licenses to accomplish its mission of husbanding and promoting open science. BMC has been publishing articles under a CC license since 2004.

In June of last year, BMC’s Iain Hrynaszkiewicz and Matthew Cockerill published an editorial titled “Open by default” in which they proposed a copyright license and waiver agreement for open access research and data in peer-reviewed journals. The gist of the editorial was that

Copyright and licensing of scientific data, internationally, are complex and present legal barriers to data sharing, integration and reuse, and therefore restrict the most efficient transfer and discovery of scientific knowledge, (and that implementing) a combined Creative Commons Attribution license (for copyrightable material) and Creative Commons CC0 waiver (for data) agreement for content published in peer-reviewed open access journals… in science publishing will help clarify what users—people and machines—of the published literature can do, legally, with journal articles and make research using the published literature more efficient.

Starting September 3, 2013, in keeping with its forward-looking mission, BMC started requiring a CC0 Public Domain Dedication for data supporting the published articles.

This is good because CC0 reduces all impedance to sharing and reuse by placing the work in the public domain. Good scientific practices assure proper credit is given via citation, something scientists have already been doing for centuries. Marking data with CC0 sends a clear signal of zero impedance to reuse. CC0 is a public domain dedication; however, wherever such a dedication is not possible, CC0 has a public license fallback. Either way, the impedance to data reuse is eliminated or minimized. Making CC0 the default removes uncertainty, and speeds up the process of accessible, collaborative, participatory and inclusive science.

But wait, there is more… starting February 3, 2014, BMC, Chemistry Central and the entire SpringerOpen family of journals are also moving forward to the latest CC BY 4.0 license. The changes in CC BY version 4.0, released on Nov 25, 2013, represent more than two years of community process, public input and feedback to develop a truly open, global license suitable for copyright, related rights and, where applicable, database rights. By moving to CC BY 4.0, BMC is not only adopting a reliable, globally recognizable mark of open, it is also setting a high bar for the future of open science.

We at Creative Commons are big fans of BMC, and we applaud their move to creating a stronger, more vibrant open commons of science.


