COVID-19 data as linked data
The new coronavirus SARS-CoV-2 (Q82069695), which causes the disease COVID-19 (Q84263196), is shaking the world, affecting and challenging everyone. The more we know and the quicker we know it, the better the chance of minimising the damage.
The application of knowledge graphs to this end is a natural choice. The digital twin of the virus spread is a graph, as this example from Singapore shows.
Different data sets related to COVID-19 are published by independent publishers in various formats. Linking them and giving them explicit semantics can increase their value and re-use.
Linking with data graphs
The benefit of having COVID-19 data as Linked Data comes from the ability to link and explore independent sources. For example, COVID-19 sources often do not include regional or mobility data. Even the simplest step, identifying countries by their Wikidata and DBpedia URIs rather than by a label, opens rich possibilities for analysis by exploring and correlating geographic, demographic, relief and mobility data.
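As a minimal sketch of that first step, the snippet below replaces country labels in a few records with Wikidata and DBpedia URIs. The lookup table is written by hand for illustration (a real pipeline would reconcile labels automatically), and the record layout and the `link_countries` helper are assumptions, not part of any data set mentioned here.

```python
# Hand-maintained lookup from country label to (Wikidata URI, DBpedia URI).
# Illustrative only: a real pipeline would reconcile labels automatically.
COUNTRY_URIS = {
    "Italy": ("http://www.wikidata.org/entity/Q38",
              "http://dbpedia.org/resource/Italy"),
    "Singapore": ("http://www.wikidata.org/entity/Q334",
                  "http://dbpedia.org/resource/Singapore"),
    "Japan": ("http://www.wikidata.org/entity/Q17",
              "http://dbpedia.org/resource/Japan"),
}


def link_countries(records):
    """Return copies of the records with URI fields added where a match exists."""
    linked = []
    for record in records:
        enriched = dict(record)  # do not mutate the caller's data
        uris = COUNTRY_URIS.get(record["country"].strip())
        if uris is not None:
            enriched["wikidata"], enriched["dbpedia"] = uris
        linked.append(enriched)
    return linked


if __name__ == "__main__":
    # Hypothetical input rows; only the country label matters here.
    for row in link_countries([{"country": "Italy"}, {"country": "Atlantis"}]):
        print(row)
```

Records whose label is not in the table are passed through unchanged, so unmatched countries remain visible instead of being silently dropped.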
For example, this query already contains DBpedia URIs and COVID-19 data from Worldometer (this is just an example; the data is not up to date). This data set can then be linked with data from DBpedia, including the corresponding Wikidata URI (owl:sameAs), so that additional data can be linked from Wikidata. Here's an example of a Wikidata query returning population, the ratio between total and city population, life expectancy and bordering countries. Here is a new collection of Wikidata queries on COVID-19 data. This is the result of one of them, showing how different coronaviruses are related.
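A query of that kind could look roughly like the sketch below, which builds a SPARQL query for a country's population (P1082), life expectancy (P2250) and bordering countries (P47) and sends it to the public Wikidata Query Service. It is not the query linked above: it covers only a subset of those fields, and the helper names and the User-Agent string are assumptions made for illustration.

```python
import json
import urllib.parse
import urllib.request

WIKIDATA_SPARQL = "https://query.wikidata.org/sparql"


def country_profile_query(qid):
    """Build a SPARQL query for population, life expectancy and neighbours."""
    if not (qid.startswith("Q") and qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item ID: {qid!r}")
    # The wd:, wdt:, wikibase: and bd: prefixes are predefined by the service.
    return f"""
SELECT ?population ?lifeExpectancy ?neighbour ?neighbourLabel WHERE {{
  OPTIONAL {{ wd:{qid} wdt:P1082 ?population. }}      # population
  OPTIONAL {{ wd:{qid} wdt:P2250 ?lifeExpectancy. }}  # life expectancy
  OPTIONAL {{ wd:{qid} wdt:P47 ?neighbour. }}         # shares border with
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
""".strip()


def run_query(query, endpoint=WIKIDATA_SPARQL):
    """Send the query over HTTP GET and return the list of result bindings."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(url, headers={
        "Accept": "application/sparql-results+json",
        "User-Agent": "covid-lod-example/0.1",  # hypothetical client name
    })
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["results"]["bindings"]


if __name__ == "__main__":
    # Only prints the query; call run_query(...) to hit the live endpoint.
    print(country_profile_query("Q38"))  # Q38 is Italy
```

Because the query takes the Wikidata item ID as input, it can be driven directly by the owl:sameAs links obtained from DBpedia.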
See also testing and other COVID-19 data turned into LOD here.
Linking with COVID-19 publications
Kaggle published a dataset of COVID-19 publications, which was soon turned into an RDF knowledge graph by a team from the University of Ghent. The graph is available here. It can be queried using Linked Data Fragments.
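With Linked Data Fragments, and Triple Pattern Fragments in particular, a client asks the server for all triples matching a single subject/predicate/object pattern and combines the results itself. The sketch below only builds such a request URL. The base URL is a placeholder, not the address of the Ghent graph, and the `subject`, `predicate` and `object` parameter names follow common convention; a server advertises its actual parameters in each fragment's hypermedia controls.

```python
import urllib.parse

# Placeholder fragment address, not the real endpoint of the graph.
EXAMPLE_FRAGMENTS_URL = "https://example.org/covid19"


def fragment_url(base_url, subject=None, predicate=None, obj=None):
    """Build a request URL for one triple pattern; None means 'any value'."""
    pattern = {"subject": subject, "predicate": predicate, "object": obj}
    params = {name: value for name, value in pattern.items() if value is not None}
    if not params:
        return base_url  # the unconstrained fragment: every triple
    return base_url + "?" + urllib.parse.urlencode(params)


if __name__ == "__main__":
    # All triples that use the Dublin Core "title" predicate.
    print(fragment_url(EXAMPLE_FRAGMENTS_URL,
                       predicate="http://purl.org/dc/terms/title"))
```

Keeping the server interface this simple is the design trade-off of the approach: the server stays cheap to host, and more complex queries are evaluated on the client side.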
CORD-19-on-FHIR is a superset of the COVID-19 Open Research Dataset (CORD-19) data, provided by the Allen Institute to support research on COVID-19 / SARS-CoV-2 / Novel Coronavirus. It is represented in FHIR RDF and was produced by data mining the CORD-19 dataset and adding semantic annotations. The purpose is to facilitate linkage with other biomedical datasets and enable answering research questions.
COVID-19 and related publications are also available from the Microsoft Academic Knowledge Graph through its SPARQL endpoint.
Linking with life sciences
The life sciences have traditionally used Linked Data and Semantic Technologies. Linked life sciences knowledge graphs make up a big part of the web of data.
In fact, the biggest LOD data set is that of UniProt. Its SPARQL endpoint provides access to over 55 billion triples (knowledge on protein sequences). There is already a pre-release of a dataset on COVID-19, which is accessible here. Soon after it was announced, it was sponged and made available via SPARQL, ahead of its availability at the UniProt endpoint.
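To give an idea of what querying such an endpoint involves, the sketch below builds a SPARQL query for proteins annotated with a given organism and wraps it in an HTTP request for the public UniProt endpoint. It uses the UniProt core vocabulary and the NCBI taxonomy ID of SARS-CoV-2 (2697049). The helper names are assumptions, and whether the COVID-19 entries are returned depends on the release the endpoint is serving.

```python
import urllib.parse
import urllib.request

UNIPROT_SPARQL = "https://sparql.uniprot.org/sparql"
SARS_COV_2_TAXON = 2697049  # NCBI Taxonomy ID of SARS-CoV-2


def proteins_for_taxon_query(taxon_id, limit=10):
    """Build a query listing protein entries annotated with one organism."""
    return f"""PREFIX up: <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
SELECT ?protein WHERE {{
  ?protein a up:Protein ;
           up:organism taxon:{int(taxon_id)} .
}} LIMIT {int(limit)}"""


def build_request(query, endpoint=UNIPROT_SPARQL):
    """Wrap the query in a GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"})


if __name__ == "__main__":
    request = build_request(proteins_for_taxon_query(SARS_COV_2_TAXON))
    # Pass the request to urllib.request.urlopen(...) to run it for real.
    print(request.full_url)
```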
The disease pathways of COVID-19 are now available at this SPARQL endpoint and are expected to be updated weekly.
AmeliCA has just published its knowledge graph on epidemics, accessible also through SPARQL. Their ontology looks like this:
Plenty of ontologies are already available.
Although not available as LOD, an important data source for COVID-19 is ECDC. You can find here some test visualisations by Sascha LEIB, using ECDC data.
Linking with public authoritative data sources
The European Environment Agency provides linked open data. It has recently reported a tremendous decrease in air pollution in Europe.
Some countries' official statistics are provided as LOD, for example in Japan. EUROSTAT also provides some data as LOD, for example the NUTS classification. All other statistical data can be converted to LOD.
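One common way to publish statistics as LOD is the W3C RDF Data Cube vocabulary, in which every figure becomes a `qb:Observation`. The sketch below converts CSV rows into N-Triples along those lines. It is a simplified illustration, not a description of how any agency does it: the `http://example.org/stats/` namespace, the column names and the sample row are made up, and a complete cube would also need a data structure definition.

```python
import csv
import io

QB = "http://purl.org/linked-data/cube#"
SDMX_DIM = "http://purl.org/linked-data/sdmx/2009/dimension#"
SDMX_MEASURE = "http://purl.org/linked-data/sdmx/2009/measure#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD = "http://www.w3.org/2001/XMLSchema#"
BASE = "http://example.org/stats/"  # hypothetical namespace


def rows_to_ntriples(csv_text, dataset="population"):
    """Turn CSV rows (columns: area, year, value) into qb:Observation triples."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        area, year, value = row["area"], row["year"], row["value"]
        obs = f"<{BASE}{dataset}/{area}/{year}>"  # one URI per observation
        lines += [
            f"{obs} <{RDF_TYPE}> <{QB}Observation> .",
            f"{obs} <{QB}dataSet> <{BASE}{dataset}> .",
            f"{obs} <{SDMX_DIM}refArea> <{BASE}area/{area}> .",
            f'{obs} <{SDMX_DIM}refPeriod> "{year}"^^<{XSD}gYear> .',
            f'{obs} <{SDMX_MEASURE}obsValue> "{value}"^^<{XSD}decimal> .',
        ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder row: the area code and the value are invented.
    print(rows_to_ntriples("area,year,value\nAREA1,2020,1000\n"))
```

In practice the `refArea` object would point at an existing URI, such as a NUTS region or a Wikidata item, which is what makes the statistics linkable to the other sources.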
There are relevant datasets available from:
Knowledge graphs to tackle COVID-19
There are several ongoing knowledge graph projects that aim to represent all useful information for tackling the coronavirus.