Making every human gene accessible and linkable

What if every human had access to the sum total of all biomedical knowledge? Not only through articles in popular science periodicals or sensational TV documentaries, but real access that goes both ways, where the general public can engage with and influence biomedical scientists. Biomedical research is typically reported through articles in scientific journals (a collection that grows by more than 1 million articles per year). Some information is structured and shared through thousands of different, partially overlapping databases. Both of these information resources are almost completely opaque to the general public because of paywalls and the challenges of querying multiple non-interoperable databases. Even professional scientists with institutionally funded access to both journals and databases face major challenges in assimilating the knowledge needed to generate effective new hypotheses. It is not uncommon for scientists to spend a large part of their precious time on the technicalities of data access and integration, time that would be better spent on finding a cure for cancer, for example.

From its inception, Wikipedia has taken an entirely different approach to distributing knowledge. Topic-focused, encyclopaedic articles naturally lend themselves to interlinking and to evolution over time. Community ownership ensures broad access and, in the great majority of cases, results in high quality. Since 2005, Wikipedians have made a concerted effort to organize and improve the content of biomedically relevant articles via, for example, WikiProject Molecular and Cellular Biology and WikiProject Medicine. Recognizing the value of Wikipedia as a repository and foundry of this kind of knowledge, the NIH funded the Gene Wiki in 2010 to help stimulate continued growth and improve content focused on human genes.

Now in its second iteration, the Gene Wiki project is coordinating with the recently released Wikidata platform as it continues to advance its goal of democratizing access to biomedical knowledge. Wikidata is the centralized linked knowledge base for Wikimedia projects, such as the Wikipedias in all the different languages. Structured elements of Wikipedia articles, such as tables, can now be built dynamically from knowledge captured in Wikidata. As with Wikipedia articles, any Wikidata entry can be edited by anyone (both humans and computer programs). In the other direction, Wikidata provides interfaces, including a SPARQL endpoint, for external applications to query.
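To make the SPARQL interface more concrete, here is a minimal Python sketch of how an external application might ask Wikidata for a handful of human genes. It is an illustration under stated assumptions, not the Gene Wiki project's own code. It assumes the public endpoint at https://query.wikidata.org/sparql, and it assumes that P351 (Entrez Gene ID), P703 (found in taxon) and Q15978631 (Homo sapiens) are the relevant Wikidata identifiers; verify these on Wikidata before relying on them. The function names and the User-Agent string are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed public SPARQL endpoint of the Wikidata Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(limit=5):
    """Build a SPARQL query for items with an Entrez Gene ID found in humans.

    P351, P703 and Q15978631 are assumed identifiers (see the lead-in).
    """
    return f"""
SELECT ?gene ?geneLabel ?entrez WHERE {{
  ?gene wdt:P351 ?entrez ;
        wdt:P703 wd:Q15978631 .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
LIMIT {int(limit)}
"""


def parse_bindings(response):
    """Flatten a SPARQL JSON result into a list of plain dictionaries."""
    rows = []
    for binding in response["results"]["bindings"]:
        rows.append({
            "item": binding["gene"]["value"],
            "label": binding.get("geneLabel", {}).get("value", ""),
            "entrez": binding["entrez"]["value"],
        })
    return rows


def run_query(query):
    """Send the query over HTTP GET and return the decoded JSON response."""
    url = ENDPOINT + "?" + urllib.parse.urlencode(
        {"query": query, "format": "json"}
    )
    request = urllib.request.Request(
        url,
        headers={
            # Hypothetical client name; public endpoints generally expect
            # a descriptive User-Agent.
            "User-Agent": "gene-wiki-example/0.1 (illustrative sketch)",
            "Accept": "application/sparql-results+json",
        },
    )
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)


def print_sample_genes(limit=5):
    """Fetch a few genes and print Entrez ID, label and item URI."""
    for row in parse_bindings(run_query(build_query(limit))):
        print(row["entrez"], row["label"], row["item"])
```

Calling `print_sample_genes()` performs the network request. Query construction and result parsing are kept in separate functions so that they can be checked without touching the live service.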

Just as Wikipedia provides open access to an evolving collection of human-readable articles, Wikidata provides access to an evolving trove of structured data. Together, these public resources provide the means for research communities to efficiently share their insights with each other as well as with the public at large. The knowledge is there for anyone to use, regardless of professional affiliation. Not only is this a great way of disseminating knowledge, it also opens up scientific knowledge to public scrutiny. Anyone has access to the data, to references about where it came from, and to the discussion pages behind each Wikidata item, where they can discuss the quality of the knowledge added. Community input is broadcast back to the original data owners, which has already led to improvements in the source data. We are working to promote this two-way traffic so that it leads to higher-quality scientific data and thus improves our collective understanding of ourselves and the world around us.