Text-based search is powerful. However, as more and more information is digitized and made available on the internet, text-based search alone is no longer enough and needs to be supplemented with other technologies.
Aunt Bertha, an Austin, TX–based B Corporation, focuses on helping people find government and charitable human service programs on the web. In the United States, there are 89,000 governments, a million charities, and more than three hundred thousand congregations. Many of these organizations provide food, health, housing, or education programs to those who need them (the “Seekers”). Aunt Bertha’s goal is to index all these programs so that the Seekers can find help in seconds.
Soon after launching in the fall of 2010, Aunt Bertha’s founders learned something very interesting. In a medium-sized city, a Seeker can have at least 500 government and charitable programs to choose from. The user experience designer must ensure that the Seekers can easily find the program that fits their need, a task that’s harder than it might seem: not only are the Seekers multi-faceted and complex; so are the programs that serve them. A common language that described both the Seekers and the available human services would go a long way to help, as text-based search alone would not work. Enter the Open Eligibility Project.
Other organizations were facing the same problem. There had been attempts at categorizing these types of programs before, but the terms and methodologies used were full of bureaucratic jargon. The Open Eligibility Project set out to simplify the taxonomy, that is, the terms that describe human services.
There are two important facets to human services taxonomy: Human Services and Human Situations. Human Services are simply the services provided by the organization; examples include clothes for school, computer classes, and counseling. Human Situations are simply the attributes of the Seeker, for example, mothers, ex-offenders, or veterans. Here is one example of the use of this taxonomy on Aunt Bertha:
It is not always easy to find the balance between comprehensiveness and ease of use. For this project to be successful, a tension should always exist between these two goals. Lean too far one way and it becomes suitable only for the policy wonks. Lean the other way, and it loses specificity and the Seekers cannot find what they are seeking.
Since the launch of the Open Eligibility Project, there has been some interesting traction in the area of human services taxonomy. Just this year, a new Civic Services Schema was submitted to and accepted by Schema.org. The ServiceAudience field of the spec, in particular, is a great fit for Open Eligibility’s Human Situations tags. If government agencies adopt this spec, it will make their programs more findable by people who fit those situations (e.g., programs for veterans, programs for foster children, etc.).
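To make the two facets concrete, here is a minimal Python sketch of how tag-based matching might work. It is an illustration only, not Aunt Bertha's implementation or the actual Open Eligibility taxonomy: the `Program` class, the `find_programs` helper, and the program names are all hypothetical, and the tags simply reuse the examples mentioned above. It also assumes that a program with no situation tags is open to everyone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Program:
    """A human service program tagged along the two facets (hypothetical model)."""
    name: str
    services: frozenset    # Human Services: what the program provides
    situations: frozenset  # Human Situations: who the program serves


def find_programs(programs, service, seeker_situations):
    """Return programs that offer `service` and fit the Seeker.

    Assumption for this sketch: a program with no situation tags
    is treated as open to every Seeker.
    """
    wanted = set(seeker_situations)
    return [
        p for p in programs
        if service in p.services and (not p.situations or p.situations & wanted)
    ]


# Hypothetical programs; the tag names reuse the examples from the text.
PROGRAMS = [
    Program("Back-to-School Closet", frozenset({"clothes for school"}), frozenset({"mothers"})),
    Program("Reentry Tech Lab", frozenset({"computer classes"}), frozenset({"ex-offenders"})),
    Program("Vet Center", frozenset({"counseling", "computer classes"}), frozenset({"veterans"})),
    Program("Library Computer Basics", frozenset({"computer classes"}), frozenset()),
]

matches = find_programs(PROGRAMS, "computer classes", {"veterans"})
print([p.name for p in matches])
```

A veteran looking for computer classes is matched to the veteran-specific program and to the untagged general program, but not to the program reserved for ex-offenders. A free-text search for "computer classes" alone could not make that distinction.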
Aunt Bertha seeded the Open Eligibility Project with all of the types of services and situations listed on Aunt Bertha. There are more out there, though, and help from others would make the taxonomy even better. That is why the founders were attracted to Creative Commons and decided to release the taxonomy on GitHub under a CC BY-SA 3.0 license. Hackers, coders, and those concerned generally with human services are invited to join the Google+ community, to contribute to the project on the GitHub page, or to connect with Aunt Bertha on Facebook or Twitter.
The structure of human proteins defines, in part, what it is to be human. Determining the structure of a human membrane protein is very expensive, costing as much as a couple of million USD. Improvements in methods and computers, together with access to the complete sequence of our DNA, have made it possible to adopt more systematic approaches and thus reduce the time and cost of determining the shapes of proteins. Structural genomics helps determine the 3D structures of proteins at a rapid rate and in a cost-effective manner. Structural information provides one of the most powerful means to discover how proteins work and to define ligands that modulate their function. Such ligands are starting points for drug discovery.
The Structural Genomics Consortium (SGC) at the Universities of Oxford and Toronto solves the structures of human proteins of medical relevance and places all its findings, reagents, and know-how into the public domain without restriction. Using these structures, the reagents generated as part of the structure determination process, and the chemical probes identified, the SGC works with organizations across the world to further the understanding of the biological roles of these proteins. The SGC is particularly interested in human protein kinases, metabolism-associated proteins, integral membrane proteins, and proteins associated with epigenetics and rare diseases.
Drug discovery tends to be a crapshoot. We are not good at target validation, which essentially occurs in patients, so more than 90% of the pioneer targets fail in Phase 2. Nevertheless, many academics and pharmas work on the same small group of targets in competition with each other, wasting resources and careers and needlessly exposing patients to molecules destined for failure. The SGC chooses not to work under the lamp post, focusing instead on those targets for which there is little or no literature, because it is such pioneer targets that will deliver pioneer, breakthrough medicines.
The SGC is a not-for-profit, public-private partnership funded by public and charitable funders in Canada and the UK and by eight large pharmaceutical companies: GSK, Pfizer, Novartis, Lilly, Boehringer Ingelheim, Janssen, Takeda, and AbbVie. Its mandate is to promote the development of new medicines by determining 3D structures on a large scale and cost-effectively, targeting human proteins of biomedical importance and proteins from human parasites that represent potential drug targets.
The SGC is now responsible for between a quarter and half of all structures deposited into the Protein Data Bank (PDB) each year. The SGC has released the structures of nearly 1500 proteins with implications for the development of new therapies for cancer, diabetes, obesity, and psychiatric disorders. As evident from the chart, the SGC has published as many protein kinase structures as the rest of academia combined.
The SGC’s structural biology insights have allowed us to make significant progress toward the understanding of signal transduction, epigenetics and chromatin biology, and metabolic disease. The SGC has adopted the following Open Access policy: the SGC and its scientists are committed to making their research outputs (materials and knowledge) available without restriction on use. This means that the SGC promptly places its results in the public domain and agrees not to file for patent protection on any of its research outputs. This not only provides the public with this fundamental knowledge, but also allows commercial efforts and other academics to use the data freely and without any delay. The SGC seeks the same commitment from any research collaborator. The structural information is made available to everyone either when the structure is released by the PDB or when it is pre-released on www.thesgc.org.
Prof. Chas Bountra at the University of Oxford says:
“Society desperately needs new treatments for many chronic (AD, bipolar disorder, pain…) or rare diseases. This need is growing because of aging societies and diseases of modern living. As a biomedical community, we have yet to deliver truly novel treatments for many such conditions. This is not for lack of effort or resources. It is simply that these disorders are complex and there are too many variables or unknowns. It is clear that no one group or organisation can do this on their own. What we are trying to do is to bring together the best scientists from across the world, irrespective of affiliation, pooling resources and infrastructures, reducing wasteful duplicative activity to catalyse the creation of new medicines for patients. Secrecy and competition in early phases of target identification/discovery are slowing down drug discovery, making the process more difficult and more expensive.”
We at CC applaud the SGC’s commitment to open access and look to them for leadership in this arena. We believe the SGC’s findings would be a great candidate for the CC0 Public Domain Dedication because of the CC0 mark’s global recognition and common legal status.
This past August, I facilitated an online peer-learning course in the School of Open introducing open science to newcomers, and Michelle Sidler worked behind the scenes to keep things glued together. This guest post was written by Michelle, and gives a look at how things went teaching an entirely free course on open science over the web. It’s pretty cool.
Guiding Students through the Course
During last month’s round of School of Open courses, I helped out with a facilitated version of the Open Science course supported by Creative Commons, the Open Knowledge Foundation, and PLOS. On four Tuesdays in August, Billy Meinke hosted online discussions with a handful of well-known members of the open science community while participants from around the world completed course modules and blogged about their experiences. Here’s how things went down.
Note: The course materials and online discussions are available on the Open Science P2PU course page, and will continue to grow over the next few weeks as participants blog about their experiences working with aspects of science that are either open or not.
While completing course units, participants blogged their experiences, offering reflections and insights about open science and sharing online resources they found. Participants were researchers and scientists from around the world, including biologists, climatologists, librarians, and even musicians.
Though we are still working through many of the blog posts, here are some examples of people learning about open access, open data, and open research for free through the School of Open:
The first of three modules introduced the topic of open access (OA), and after browsing through content about OA, learners were to report on the openness of published research articles they found on the web. A learner named Peter Desmet provided a fine overview of the history of open access and the different “flavours” of open access in an entry on his blog. The second module led folks to the topic of open data for science, where a peer by the name of Odon shared her process of learning through her blog, Odonlife. Her writings offered definitions and descriptions of open data and assessed the openness of datasets she found online. Drawing from these lessons, she also described her experiences contributing to open data crowdsourcing projects and how they inspired her to start a similar project. For the third unit on open research, a peer in the course named Nicki Clarkson described the work of Jon Tennant, a paleontologist and open science advocate who deposited the data from his PhD research into the Paleobiology Database, a repository for similar data. Jon even commented on her post, thanking her for the shout-out, which is another example of the ways in which open information brings researchers together!
In addition to supporting the online course participants, Billy Meinke hosted online discussions with open science friends and advocates from around the world, each involved with science in a different way. Guests from a variety of organizations joined open, broadcast Google Hangouts and shared their experiences in open science with dozens of learners watching each stream. Thanks to all the guests who took the time to chat with us about open science! Links to the video and etherpad notes (taken during the live sessions) can be found on the Open Science course page.
Taking the Open Science course further
The Open Science course doesn’t end when we complete the units and assignments. Continue the conversation by spreading the word to other scientists about this resource and encouraging them to participate. There has been interest in volunteer translation efforts and other adaptations of the material. Anyone is free to do so, in compliance with the CC BY-SA license on the course. Much of the material is licensed CC BY or CC0, which give even more open reuse rights!
If you’d like to find out more about what’s happening with this course and others in the School of Open, head on over to the School of Open Google Group and join the discussion! You can also sign up to be notified when the next facilitated course launches, likely in Spring 2014.
I met Peter Sand a few months ago at a #Sensored meetup in SoMa. The setting was exactly like the hardware labs from my undergraduate engineering days, and Peter was there exactly like one of my buddies showing kits and circuits cobbled together to do science (except, Peter is quieter and more polite than most of my buddies). Peter founded ManyLabs, a San Francisco-based nonprofit that wants:
students of any age to become comfortable with data, scientific processes, and mathematical representations of the world. We want people to learn about the strengths and limitations of using math and data to address real-world problems.
Hmmmm… think about that for a minute. Peter is thinking really long-term. He wants to invest in kids today (although ManyLabs kits are suitable for and to be enjoyed by anyone of any age) so they become good at using math and data in the future. Now, that is my kind of guy.
ManyLabs has released a collection of interactive science activities and projects under the Creative Commons BY-SA license. Many of these activities and projects are based on Arduino, an open-source microcontroller board. While most Arduino-based education projects are focused on electronics, programming, or robotics, ManyLabs is instead aiming for compatibility with the existing curricula of biology, physics, math, data, and my favorite, environment classrooms.
Previously ManyLabs was using a CC BY-NC-SA license. “We moved away from a non-commercial license because we want to make usage of the content more flexible. We want the materials to make the widest possible contribution to education,” explained Peter.
While the initial content has been seeded by a small group of contributors, ManyLabs hopes to make the site more community-driven by releasing authoring tools that will allow anyone to create, share, and modify interactive lessons. They also plan to release a platform for CC-licensed data that will allow students, teachers, and others in the community to share data gathered from sensors and manual observations. Together these tools aim to promote scientific reasoning and data literacy, both in schools and in the world at large.
We are fully behind Peter and his mission. So, go ahead, share, sign in or sign up, and create a lesson. What better way to make the world more open than by teaching kids today about Open, to ensure that tomorrow’s world will be full of young people who will have known nothing else?
What do you get when you write software that becomes the basis of just about every geospatial application out there? You get perspective. Frank Warmerdam has been authoring, improving, supporting, and shepherding Shapelib, libtiff, GDAL and OGR for the past 15 years. Frank believes that by sharing effort, by adopting open, cooperatively developed standards, and avoiding proprietary licenses, adoption of open technologies could be supercharged. And lucky for us, he is right. To paraphrase him, open standards facilitate communication, capture common practice, and externalize arbitrary decisions.
Frank has done it all: he has worked as an independent consultant, for a proprietary remote sensing company, for a large search engine and mapping company, and now for a small, innovative space hardware maker. But most importantly, he has been a leader in the open geospatial world, at the helm of the Open Source Geospatial Foundation (OSGeo), which I myself have been involved with for as long as I have personally known Frank, that is, for a good part of the past decade.
While OSGeo has faced a number of challenges, it has also enjoyed tremendous success through a growing number of projects and chapters, local conferences, being perceived as a legitimate player, and, recently, getting representation in its Charter Membership from 37 countries.
Frank says working on data libraries is a grungy job. Everyone wants ‘em but no one wants to work on ‘em. We relate to that, as licenses are kinda like that: an essential infrastructure play that requires getting the legal and technical details right, yet they are most effective when they recede into the background and let us enjoy the content to the fullest.
Per Frank, the next set of challenges revolves around getting open geodata with easy-to-understand, interoperable license terms. As micro-satellite imagery becomes ubiquitous with frequent imagery collects, the resulting flood of imagery may lead to more ready adoption of open terms, perhaps even a current, live, or almost-live global, medium resolution basemap for OpenStreetMap. We can dream, and with my friend Frank to lead us with his quiet actions and measured wisdom, our dreams will come true.
About 400 map makers, coders, cartographers, designers, business services providers and data mungers of chiefly spatial persuasion gathered in San Francisco to “talk OpenStreetMap, learn from each other, and move the project forward.” These conference attendees are the tip of an iceberg composed of 1.1 million registered users who have collectively gathered 3.2 billion GPS points around the world since OpenStreetMap was launched in 2004 as a free, editable map of the whole world. Unlike proprietary datasets, OpenStreetMap allows free access to the full map dataset. About 28 GB of data representing the entire planet can be downloaded in full, and the data is also available in immediately useful forms like maps and commercial services. OpenStreetMap is open data licensed under the Open Data Commons Open Database License (ODbL), with the cartography in its tiles and its documentation licensed under a CC BY-SA 2.0 license.
The program ranged from building and nurturing OSM communities, to technical wizardry, to improving infrastructure. Martijn van Exel provided an insight into the OSM community in the United States (see table below). Big countries and large areas pose challenges that are already in the queue to be tackled.
| Metric | Value |
|---|---|
| Land area | 3.7 million sq miles |
| Casual mappers (< 100 edits) | 71.0% |
| Active mappers (> 100 edits, active in last 3 months) | 6.8% |
| Power mappers (> 1000 edits, active in last 3 months, active for > 1 year) | 2.6% |
| Total edits, all time | 723,000,000 |
| Edits by top 10 mappers (incl. bots and import accounts) | 69.8% |
| Edits by power mappers (excl. most bots and import accounts) | 57.3% |
Scientific authoring workflow is a beast. You keep notes on paper (hopefully, a notebook, and not just loose pages), in word-processing documents unhelpfully named “notes” followed by “notes1,” “notes2” or worse, “notes_old,” “notes_old1.” You manage your bibliography on your desktop or on the web, you have a directory folder full of images, charts, photos and other media, and you collaborate with your co-authors by emailing attachments back and forth.
Sooner or later you start doubting your sanity but you soldier on. Finally you publish your paper, heave a sigh of relief, and move on, thereby ensuring your data can’t be reused and your work can’t be reproduced easily.
Several coders, designers, scientists, and publishers met at PLOS to brainstorm toward a better, more modern way. The Markdown for Science workshop was organized by Martin Fenner and Stian Håklev and supported by a 1K Challenge Grant from FORCE11.
Photos by Puneet Kishor, CC0 PD Dedication
While a lot of good ideas were generated, we have a long way to go. Keep an eye on this project, and better yet, pitch in with your ideas and code. Together we can tame this beast.
Today the Public Library of Science announced the Accelerating Science Award Program (ASAP). The award program seeks nominations of individuals who have used, applied, or remixed scientific research — published through open access — in order to realize innovations in science, medicine, and technology. The goal of ASAP is to build awareness of and encourage the use of scientific research published through open access. Major sponsors include the Wellcome Trust and Google.
Three winners will each receive $30,000. The nomination period opens today and runs through June 15, 2013. Potential nominees may include individuals, teams, or groups of collaborators (such as scientists, researchers, educators, social services, technology leaders, entrepreneurs, policy makers, patient advocates, public health workers, and students) who have used scientific research in transformative ways. The winners will be announced in Washington, DC, in October 2013 at an Open Access Week event hosted by SPARC and the World Bank.
Creative Commons is a supporter of ASAP, along with several library organizations, publishers, and research organizations.
For more information, including the full details of the ASAP program, nomination process, and the award specifics, go to http://asap.plos.org/. For program rules visit http://asap.plos.org/nominate/rules/.
Hanging around with our own kind, we in the open science community might get lulled into thinking that everyone out there thinks like us. In reality, most scientists actually do science instead of worrying about whether or not it is open. However, even though some of their practices align with open science objectives, there is much more that can be done proactively to engender an open commons of science.
Sophie Kershaw, a doctoral student in computational biology at the University of Oxford, came up with the idea of injecting Open Science Training into the formal curriculum, teaching young scientists about Open while they are still learning about the scientific method, as part of her Open Knowledge Foundation-supported Panton Fellowship. In Sophie’s words:
As the Open Science movement gathers pace, we are seeing developments in policy and infrastructure to support the transition of academia towards Open practices. Despite this, there is a considerable lag in awareness within the academic community itself – many researchers either haven’t heard about Open, or know the term but don’t know how to put it into practice! From a show of hands on the first day of my Open Science Training Initiative (OSTI), only ONE grad student out of 43 had heard of open science. It is now time for us all to step up our efforts in educating our academics in licensing, open access and data management, preferably through provision of pre-doctoral training. Our first research group plays a huge role in shaping our research outlook, but this leaves us with a huge variability in the level of awareness that students develop. Some will pitch up in a very forward-thinking group, where licensing, collaboration and data archiving is the order of the day, while others are left without this kind of information. Pre-doctoral training will ensure continuity of provision for ALL our science grads, enabling them to make their own decisions with confidence.
This kind of practical intervention delivered right to young scientists sounds like a great idea, and as Sophie says, reactions to the first edition of OSTI seem to confirm that:
Students from the inaugural OSTI came out strongly in favour of receiving training in licensing and engaging in debate on development of the publication process: furthermore, they’ve shown that while lectures are handy, hands-on experience is the best way to learn about how to license, how to release data, how to communicate science. We need to emphasize delivery of a coherent research story – comprising appropriately licensed data, code and writing – rather than merely the traditional written report. We need to make our young researchers see themselves as research users as much as research producers. Over time, this should help our newest grads deliver verifiable, reproducible research with vast potential for further development and scientific impact.
The Open Science Training Initiative is not an idea with immediate returns. Instead, it is meant to bring about long-term change so that the next generation of scientists and beyond proactively default to open. There are challenges ahead, such as creating the right formats for different conditions and audiences, finding the right partners who would incorporate OSTI in their courses, and scaling to reach the next generation of scientists all over the world. But it is an idea we consider worth supporting, because the potential returns are lasting in nature.
There are many ways we can measure the effect of the work we do: count the number of objects licensed with CC licenses, count the number of users who have used CC licenses, count the number of works created by reuse of works licensed with CC licenses, and perhaps many other ways. But the one that is most immediately visible, and most satisfying, is seeing events of spontaneous openness appear, as is happening for the international celebration of Open Data Day taking place tomorrow, February 23. Well, perhaps not so spontaneous, because organizing events takes planning, work, contacts, brainstorming, publicizing, and more.
Our own Billy Meinke is organizing an event at the CC HQ in Mountain View, and has also written more on the event here and around the world in a guest blog post on PLOS. Check it out, organize an event, or attend one near you. Heck, attend one far away by joining in over the web where possible. After all, that is how the net works.