Title:

Interlinking SciGraph and DBpedia datasets using Link Discovery and Named Entity Recognition Techniques


Abstract:

In recent years we have seen a proliferation of Linked Open Data (LOD) compliant datasets becoming available on the web, leading to an increased number of opportunities for data consumers to build smarter applications which integrate data coming from disparate sources. However, the integration is often not easily achievable, since it requires discovering and expressing associations across heterogeneous datasets. The goal of this work is to increase the discoverability and reusability of scholarly data by integrating it with highly interlinked datasets in the LOD cloud. To do so, we applied techniques that a) improve identity resolution between Springer Nature (SN) SciGraph and DBpedia using Link Discovery for the structured data (i.e. by annotating SN SciGraph entities with links to DBpedia entities), and b) enrich SN SciGraph unstructured text content (document abstracts) with links to DBpedia entities using Named Entity Recognition (NER). We published the results of this work using standard vocabularies and provided an interactive exploration tool which presents the discovered links with respect to the breadth and depth of the DBpedia classes.

Full reference:

Beyza Yaman, Michele Pasin, Markus Freudenberg. Interlinking SciGraph and DBpedia datasets using Link Discovery and Named Entity Recognition Techniques. Second biennial conference on Language, Data and Knowledge (LDK 2019), Leipzig, Germany, May 2019.