
Representing COVID-19 information in collaborative knowledge graphs: the case of Wikidata (lay summary)

This is a lay summary of the article published under the DOI: 10.3233/sw-210444

Published on Apr 24, 2023

Researchers explain how Wikidata helped us during the COVID-19 pandemic

Researchers described how a public and free tool called Wikidata helped the world to quickly collect and spread trustworthy information about the COVID-19 disease. This helped people react quickly to the changes brought by the pandemic, which started in 2020.

Wikidata tried to provide this information to the world quickly. Wikidata is a large store of information that is freely available to everyone. Many people from all over the world worked together to build Wikidata. Thus, it is multilingual and accessible to both humans and computers.

Because anyone can edit Wikidata, the tool must validate information as it comes in. The data also needs to be stored so that new information can be efficiently incorporated. 

The tool also creates visualisations of its information so that people can easily interpret it, and all of the information must be easy to share. All of this must happen quickly so that people can use and react to new information, in this case about COVID-19, as soon as possible.

These researchers wanted to highlight some of the challenges Wikidata faced and how it overcame these challenges. They also wanted to show how Wikidata was helping users.

Wikidata stores its information in a computer database. It covers many topics, like COVID-19, hospitals, and vaccines, that all have links between them. This lets people and computers quickly retrieve information relevant to a topic.
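For readers who know a little programming, the idea of linked topics can be sketched in a few lines of Python. This is only a toy illustration, not how Wikidata is actually built: the statements in `TRIPLES` and the helper `match` are made up for this example, and real Wikidata holds far more entries and is queried with the SPARQL language rather than with Python lists.

```python
from typing import List, Optional, Tuple

# One statement links a topic to another topic: (subject, link, object).
Triple = Tuple[str, str, str]

# Hypothetical statements invented for this sketch (not real Wikidata records).
TRIPLES: List[Triple] = [
    ("COVID-19", "caused by", "SARS-CoV-2"),
    ("COVID-19", "has symptom", "fever"),
    ("COVID-19", "has symptom", "cough"),
    ("COVID-19 vaccine", "vaccine for", "COVID-19"),
]


def match(
    triples: List[Triple],
    subject: Optional[str] = None,
    link: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Triple]:
    """Return every statement that fits the given parts; None means 'anything'."""
    return [
        t
        for t in triples
        if (subject is None or t[0] == subject)
        and (link is None or t[1] == link)
        and (obj is None or t[2] == obj)
    ]


# Ask: which symptoms are linked to COVID-19?
symptoms = [o for _, _, o in match(TRIPLES, subject="COVID-19", link="has symptom")]
print(symptoms)
```

Because every fact is stored as a link between two topics, the same small `match` helper can answer different questions, such as "what is linked to COVID-19?" or "what causes COVID-19?", without changing how the data is stored.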

The database was made flexible enough to allow new and ever-changing topics to be added. It was designed to be able to draw data from several other databases. Many people run computer programs that automatically update Wikidata as they receive new data.

Many researchers have used Wikidata to further studies about COVID-19. Scientists have used it to analyse the SARS-CoV-2 virus’s genome, for example. It is also an effective educational tool on how to prevent the spread of COVID-19, since many people check what symptoms are common and how they might contribute to the spread.

These researchers were the first to study a crowd-sourced database like Wikidata in this way. This study was also the first to analyse how COVID-19 changed the types of medical information that people looked for.

The authors said the creators of Wikidata want to keep expanding the database. They want to improve the data quality and include new data types, such as the results of ongoing COVID-19 research. Similarly, they want to help people be even more prepared for pandemics.

The authors of this study were from Tunisia, Australia, Brazil, Poland, Spain, America, Jordan, India, and Germany. Many African countries were severely impacted by COVID-19. Because Wikidata is multilingual, it likely helped many Africans access information in languages other than English.

Abstract

Information related to the COVID-19 pandemic ranges from biological to bibliographic, from geographical to genetic and beyond. The structure of the raw data is highly complex, so converting it to meaningful insight requires data curation, integration, extraction and visualization, the global crowdsourcing of which provides both additional challenges and opportunities. Wikidata is an interdisciplinary, multilingual, open collaborative knowledge base of more than 90 million entities connected by well over a billion relationships. It acts as a web-scale platform for broader computer-supported cooperative work and linked open data, since it can be written to and queried in multiple ways in near real time by specialists, automated tools and the public. The main query language, SPARQL, is a semantic language used to retrieve and process information from databases saved in Resource Description Framework (RDF) format. Here, we introduce four aspects of Wikidata that enable it to serve as a knowledge base for general information on the COVID-19 pandemic: its flexible data model, its multilingual features, its alignment to multiple external databases, and its multidisciplinary organization. The rich knowledge graph created for COVID-19 in Wikidata can be visualized, explored, and analyzed for purposes like decision support as well as educational and scholarly research.

Disclaimer

This summary is a free resource intended to make African research, and research that affects Africa, more accessible to non-expert global audiences. It was compiled by ScienceLink's team of professional African science communicators as part of the Masakhane MT: Decolonise Science project. ScienceLink has taken every precaution possible during the writing, editing, and fact-checking process to ensure that this summary is easy to read and understand, while accurately reporting on the facts presented in the original research paper. Note, however, that this summary has not been fact-checked or approved by the authors of the original research paper, so it should be used as a secondary resource. Therefore, before using, citing or republishing this summary, please verify the information presented with the original authors of the research paper, or email [email protected] for more information.

