Nov 2017

SciGraph publishes 1 billion facts as Linked Open Data

Last Thursday we reached a major milestone for the SciGraph project: nearly 1 billion facts (= RDF statements) have been released as Linked Open Data, most of them under a CC-BY license!

This data release follows and improves on the previous one (February 2017), which included metadata for all journal articles published in the last 5 years.

What's in this release:

  • Dataset downloads. Almost 1 billion triples (23.2 GB compressed, or 205.2 GB uncompressed) comprising our SciGraph ontology, SKOS taxonomies and instance data covering the complete archive of Springer Nature publications, i.e. books and journals (1801-2017), conferences, affiliations, funders, research projects and grants. The data is current to the end of 2017 Q3.
  • Data Explorer. The Data Explorer allows users to visualize every node in the graph and to move to related nodes interactively. Furthermore, the Explorer lets users get rich data descriptions for SciGraph things by traversing the knowledge graph and using content negotiation on SciGraph URLs. In other words, the Explorer works like a Linked Data API for developers: the RDF data is dereferenceable (Turtle, N-Triples, RDF/XML) and both HTTP and HTTPS protocols are supported.
  • Dual Licence. The majority of SciGraph data is being released under a Creative Commons Attribution (CC BY) 4.0 International License, with a small portion of the data (specifically abstracts and grants) separately licensed under a Creative Commons Attribution-NonCommercial (CC BY-NC) 4.0 International License.
  • Model Mappings. To align the SciGraph ontology with other well-known vocabularies we include several mappings, and we have made extensive use of two external datasets: ANZSRC (Australian and New Zealand Standard Research Classification) Fields of Research codes, and GRID (Global Research Identifier Database) identifiers.
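
For developers, the content negotiation mentioned under Data Explorer can be sketched roughly as follows. This is a minimal illustration, not official SciGraph client code: the function name `build_rdf_request` and the example URL are hypothetical, and the media types are the standard ones for Turtle, N-Triples and RDF/XML rather than anything confirmed in this post. The snippet only builds the request and does not send it.

```python
from urllib.request import Request

# Standard media types for the three serializations named in the post.
RDF_MEDIA_TYPES = {
    "turtle": "text/turtle",
    "n-triples": "application/n-triples",
    "rdf/xml": "application/rdf+xml",
}


def build_rdf_request(url: str, fmt: str = "turtle") -> Request:
    """Build (but do not send) a GET request asking for one RDF serialization.

    Content negotiation works by putting the desired media type in the
    HTTP Accept header; the server then picks the matching representation.
    """
    try:
        media_type = RDF_MEDIA_TYPES[fmt.lower()]
    except KeyError:
        raise ValueError(f"unsupported RDF format: {fmt!r}") from None
    return Request(url, headers={"Accept": media_type})


# Hypothetical URL, used only for illustration; substitute a real SciGraph URL.
request = build_rdf_request("https://example.org/things/articles/some-id", "turtle")
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

To actually fetch the data, the request object can be passed to `urllib.request.urlopen` and the response body read as text. Keeping request construction separate from sending makes it easy to check the headers before hitting the network.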

Who is this for?

In general, it is for people who are interested in reusing our metadata, e.g. for data analysis tasks or for developing applications that benefit from linking to Springer Nature content. For example:

  • Researchers and (linked) open data enthusiasts, e.g. see the Linked Data Cloud.
  • Metadata and information specialists, e.g. librarians.
  • Developers and Data Scientists.

Furthermore, we are in contact with various organisations that are interested in reusing large parts of our datasets, e.g. Wikidata, DBpedia and EMBL-EBI.


Any questions or feedback? Leave a comment or email

We'd love to hear from you! You can also follow the #scigraph tag on Twitter for the latest news.

Cite this blog post:

Michele Pasin. SciGraph publishes 1 billion facts as Linked Open Data. Blog post. Published on Nov. 14, 2017.

