Wikidata Workshop : Home

Overview

Wikidata is an open knowledge base hosted by the Wikimedia Foundation that can be read and edited by both humans and machines. Wikidata acts as the central source of common, open structured data used by Wikipedia, Wiktionary, Wikisource, and others. It is used in a variety of academic and industrial applications.

In recent years, we have seen an increase in the number of scientific publications around Wikidata. While there are a number of venues where the Wikidata community can exchange ideas, none of them publishes original research. We want to bridge the gap between these community venues and research events, and give the research-focused part of the Wikidata community a venue to meet and exchange information and knowledge.

The Wikidata Workshop 2022 focuses on the challenges and opportunities of working on a collaborative open-domain knowledge graph such as Wikidata, which is edited by an international and multilingual community. We encourage submissions that observe the influence such a knowledge graph has on the web of data, as well as those working on improving this knowledge graph itself. This workshop brings together everyone working around Wikidata in both the scientific field and industry to discuss trends and topics around this collaborative knowledge graph.

What is Wikidata?

Call for Papers

This workshop will have two tracks: Novel Work, and Previously Published Work.

Papers in the Novel Work track will be published as part of the workshop proceedings. The Previously Published Work track is for papers already published at other conferences, giving the community the chance to access and discuss, as part of the workshop, relevant work that has been presented elsewhere.

Novel Work Track

The papers will be peer-reviewed by at least three researchers. Selected papers will be published on CEUR (we only publish to CEUR if the authors agree to have their papers published).

For the Novel Work track, we will accept papers of up to 12 pages, excluding references; the contribution of the paper should justify its length. We invite the following types of papers:

Full research paper

Novel research contributions (7-12 pages)

Short research paper

Novel research contributions of smaller scope than full papers (3-6 pages)

Position paper

Presenting a novel idea that is not yet in the scope of a research contribution (6-8 pages)

Resource paper

Presenting a new dataset or other resource, including the publication of that resource (8-12 pages)

Demo paper

Presenting a system based on research concepts (6-8 pages)

Previously Published Work Track

Published papers will be reviewed by the organising committee in terms of topical fit and prominence of the publication venue. They will not be published as part of the proceedings.

For the Previously Published Work track, we will accept papers with no page limit, prioritizing instead the importance and relevance of the publication. We invite the following types of papers:

Full research paper

Previously published full papers

Resource paper

Previously published datasets or other resources that are important or interesting to the community

Demo paper

Presenting a previously published system based on research concepts

Submission

Papers have to be submitted through EasyChair.

We ask authors to declare the track they intend to submit to. To do so, please add, at the beginning of the "title" field on the EasyChair submission, either the string "[Novel]", for the Novel Work track, or the string "[Published]", for the Previously Published Work track.

Submission Link: https://easychair.org/conferences/?conf=wikidataworkshop2022
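
To illustrate the track tag convention described above, here is a minimal sketch of how a tagged submission title could be checked. The helper name `split_track` and the example title are hypothetical and not part of EasyChair or the workshop tooling; only the two tag strings come from the call for papers.

```python
# Track tags as given in the call for papers.
TRACK_TAGS = {
    "[Novel]": "Novel Work",
    "[Published]": "Previously Published Work",
}


def split_track(title: str) -> tuple[str, str]:
    """Return (track name, bare title) for a tagged submission title.

    Hypothetical helper: raises ValueError if the title does not start
    with one of the two known tags.
    """
    for tag, track in TRACK_TAGS.items():
        if title.startswith(tag):
            return track, title[len(tag):].strip()
    raise ValueError(f"title must start with one of {sorted(TRACK_TAGS)}")


# Placeholder title, for illustration only.
print(split_track("[Novel] An Example Paper Title"))
```

A title without a leading tag raises `ValueError`, which mirrors the requirement that the tag appears at the very beginning of the title field.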

Important Dates

Deadline extended!

Papers due: Friday, 5 August 2022

Notification of accepted papers: Friday, 30 September 2022

Camera ready papers due: Monday, 10 October 2022

Workshop date: Monday, 24 October 2022

Submission Guidelines

Submissions must be in PDF format. For the [Novel] track, papers must be formatted in the Springer Lecture Notes in Computer Science (LNCS) style. For details on the LNCS style, see Springer’s Author Instructions. For the [Published] track, no reformatting of the original PDFs is needed.

Journal extensions

Extended versions of journal papers are invited for submission to a special issue of the Semantic Web Journal.

Schedule Detail

The workshop time is: 2 - 6 pm (CEST), 1 - 5 pm (UK), 5 - 9 am (California, US)

All times below in CEST.

14:00 - 14:10
Welcome
Welcome from the organisers, agenda, rules of engagement

14:10 - 14:55
Keynote 1: Lydia Pintscher

14:55 - 15:15
Lightning Talks 1

15:15 - 15:35
Poster Session 1

15:35 - 15:45
Break

15:45 - 16:30
Keynote 2: Tiago Lubiana

16:30 - 16:50
Lightning Talks 2

16:50 - 17:10
Poster Session 2

17:10 - 17:30
Lightning Talks 3

17:30 - 17:50
Poster Session 3

17:50 - 18:00
Closing
Concluding remarks, closing
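
Since the schedule is given in CEST only, the following sketch shows how a slot can be converted to the other two time zones named in the workshop hours. It uses fixed UTC offsets rather than a time zone database, on the assumption that daylight saving time applies in all three regions on the workshop date; the function name `convert` is illustrative.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for 24 October 2022 (assumption: daylight saving time is
# in effect in central Europe, the UK, and California on that date).
CEST = timezone(timedelta(hours=2), "CEST")
BST = timezone(timedelta(hours=1), "BST")
PDT = timezone(timedelta(hours=-7), "PDT")


def convert(hhmm: str, target: timezone) -> str:
    """Convert an HH:MM time on the workshop day from CEST to `target`."""
    hour, minute = map(int, hhmm.split(":"))
    start = datetime(2022, 10, 24, hour, minute, tzinfo=CEST)
    return start.astimezone(target).strftime("%H:%M")


# Workshop start and end times.
for slot in ("14:00", "18:00"):
    print(f"{slot} CEST = {convert(slot, BST)} UK = {convert(slot, PDT)} California")
```

Any other slot from the schedule can be passed to `convert` in the same way.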

Sessions / Papers

Session 1: Wikidata Ontology and Schema

Talks: 14:55 - 15:15 (CEST); Posters: 15:15 - 15:35 (CEST)

Armin Haller, Axel Polleres, Daniil Dobriy, Nicolas Ferranti and Sergio J. Rodríguez Méndez: An Analysis of Links in Wikidata (Room 1)
Nicolas Ferranti, Axel Polleres, Jairo Francisco De Souza and Shqiponja Ahmetaj: Formalizing Property Constraints in Wikidata (Room 2)
Valentina Anita Carriero, Paul Groth and Valentina Presutti: Towards improving Wikidata reuse with emerging patterns (Room 3)
Jose Emilio Labra Gayo: WShEx: A language to describe and validate Wikibase entities (Room 4)
Leila Feddoul, Frank Löffler and Sirko Schindler: Analysis of Consistency between Wikidata and Wikipedia categories (Room 5)
Sofia Baroncini, Margherita Martorana, Mario Scrocca, Zuzanna Smiech and Axel Polleres: Analysing the Evolution of Community-Driven (Sub-)Schemas within Wikidata (Room 6)
Daniil Dobriy and Axel Polleres: Analysing and promoting ontology interoperability in Wikibase (Room 7)
Wolfgang Fahl, Tim Holzheim, Andrea Westerinen, Christoph Lange and Stefan Decker: Property cardinality analysis to extract truly tabular query results from Wikidata (Room 8)

Session 2: Wikidata Querying and Quality

Talks: 16:30 - 16:50 (CEST); Posters: 16:50 - 17:10 (CEST)

Bohui Zhang, Filip Ilievski and Pedro Szekely: Enriching Wikidata with Linked Open Data (Room 1)
Wolfgang Fahl, Tim Holzheim, Andrea Westerinen, Christoph Lange and Stefan Decker: Getting and hosting your own copy of Wikidata (Room 2)
Antoine Willerval, Dennis Diefenbach and Pierre Maret: Easily setting up a local Wikidata SPARQL endpoint using the qEndpoint (Room 3)
Nicholas Klein, Filip Ilievski, Hayden Freedman and Pedro Szekely: Identifying Surprising Facts in Wikidata (Room 4)
Kartik Shenoy, Filip Ilievski, Daniel Garijo, Daniel Schwabe and Pedro Szekely: A Study of the Quality of Wikidata (Room 4)
Seyed Amir Hosseini Beghaeiraveri: Towards Automated Technologies in the Referencing Quality of Wikidata (Room 5)
Hans Chalupsky and Pedro Szekely: Hybrid Structured and Similarity Queries over Wikidata plus Embeddings with Kypher-V (Room 6)

Session 3: Personalisation and usability

Talks: 17:10 - 17:30 (CEST); Posters: 17:30 - 17:50 (CEST)

Nicholas Klein, Filip Ilievski and Pedro Szekely: Generating Explainable Abstractions for Wikidata Entities (Room 1)
Fariz Darari: COVIWD: COVID-19 Wikidata Dashboard (Room 2)
Seyed Amir Hosseini Beghaeiraveri, Alasdair Gray and Fiona McNeill: Experiences of Using WDumper to Create Topical Subsets from Wikidata (Room 3)
Philipp Scharpf, Moritz Schubotz, Andreas Spitz, André Greiner-Petter and Bela Gipp: Collaborative and AI-aided Exam Question Generation using Wikidata in Education (Room 4)
Sola Shirai, Aamod Khatiwada, Oktie Hassanzadeh and Debarun Bhattacharjya: Rule-Based Link Prediction over Event-Related Causal Knowledge in Wikidata (Room 5)
Lozana Rossenova, Paul Duchesne and Ina Blümel: Wikidata and Wikibase as complementary research data management services for cultural heritage data (Room 6)
Lucas Jarnac and Pierre Monnin: Wikidata to Bootstrap an Enterprise Knowledge Graph: How to Stay on Topic? (Room 7)

Accepted Papers

Sofia Baroncini, Margherita Martorana, Mario Scrocca, Zuzanna Smiech and Axel Polleres
Analysing the Evolution of Community-Driven (Sub-)Schemas within Wikidata

Seyed Amir Hosseini Beghaeiraveri
Towards Automated Technologies in the Referencing Quality of Wikidata

Seyed Amir Hosseini Beghaeiraveri, Alasdair Gray and Fiona McNeill
Experiences of Using WDumper to Create Topical Subsets from Wikidata

Valentina Anita Carriero, Paul Groth and Valentina Presutti
Towards improving Wikidata reuse with emerging patterns

Hans Chalupsky and Pedro Szekely
Hybrid Structured and Similarity Queries over Wikidata plus Embeddings with Kypher-V

Fariz Darari
COVIWD: COVID-19 Wikidata Dashboard

Daniil Dobriy and Axel Polleres
Analysing and promoting ontology interoperability in Wikibase

Wolfgang Fahl, Tim Holzheim, Andrea Westerinen, Christoph Lange and Stefan Decker
Property cardinality analysis to extract truly tabular query results from Wikidata

Wolfgang Fahl, Tim Holzheim, Andrea Westerinen, Christoph Lange and Stefan Decker
Getting and hosting your own copy of Wikidata

Leila Feddoul, Frank Löffler and Sirko Schindler
Analysis of Consistency between Wikidata and Wikipedia categories

Nicolas Ferranti, Axel Polleres, Jairo Francisco De Souza and Shqiponja Ahmetaj
Formalizing Property Constraints in Wikidata

Jose Emilio Labra Gayo
WShEx: A language to describe and validate Wikibase entities

Armin Haller, Axel Polleres, Daniil Dobriy, Nicolas Ferranti and Sergio J. Rodríguez Méndez
An Analysis of Links in Wikidata

Lucas Jarnac and Pierre Monnin
Wikidata to Bootstrap an Enterprise Knowledge Graph: How to Stay on Topic?

Nicholas Klein, Filip Ilievski, Hayden Freedman and Pedro Szekely
Identifying Surprising Facts in Wikidata

Nicholas Klein, Filip Ilievski and Pedro Szekely
Generating Explainable Abstractions for Wikidata Entities

Lozana Rossenova, Paul Duchesne and Ina Blümel
Wikidata and Wikibase as complementary research data management services for cultural heritage data

Philipp Scharpf, Moritz Schubotz, Andreas Spitz, André Greiner-Petter and Bela Gipp
Collaborative and AI-aided Exam Question Generation using Wikidata in Education

Kartik Shenoy, Filip Ilievski, Daniel Garijo, Daniel Schwabe and Pedro Szekely
A Study of the Quality of Wikidata

Sola Shirai, Aamod Khatiwada, Oktie Hassanzadeh and Debarun Bhattacharjya
Rule-Based Link Prediction over Event-Related Causal Knowledge in Wikidata

Antoine Willerval, Dennis Diefenbach and Pierre Maret
Easily setting up a local Wikidata SPARQL endpoint using the qEndpoint

Bohui Zhang, Filip Ilievski and Pedro Szekely
Enriching Wikidata with Linked Open Data

Our Speakers

Lydia Pintscher

Wikimedia Deutschland

Keynote

10 years of Wikidata - insights for researchers and current challenges

Abstract

Wikidata is about to turn 10. In those 10 years Wikidata has opened up data and got a large number of people excited about knowledge graphs. With Wikidata the Wikimedia Movement has managed to create a place where humans and machines come together every day to give more people more access to more knowledge. Wikidata's data is powering a lot of the technology we use every day, from digital personal assistants to investigative journalism to library systems and more. We will take a look behind the scenes of the project and get to know the community that stands behind it. We will explore insights that will help researchers better understand how and why Wikidata works. Ten years are, of course, just the beginning, so we will also take a look at the current challenges that researchers can support the project with, as well as exciting upcoming developments.

Bio

Lydia Pintscher is the Product Manager for Wikidata at Wikimedia Deutschland. She studied computer science with a focus on innovation and language at the University of Karlsruhe. She is a long-time free software contributor, most notably as a member of the board of KDE e.V.

Tiago Lubiana

WikiProject COVID-19

Keynote

WikiProject COVID-19: modelling the pandemic in real time

Abstract

The Wikidata WikiProject COVID-19 sprouted during the pandemic as a gathering of Wikidata editors curating and organizing structured information about all aspects of the COVID-19 pandemic. The scale and pace of a global pandemic have highlighted issues around the consistent structuring of information, such as the scope of geographically-bound statements and the period of outbreaks, along with challenges for quickly updating information on pages across multiple languages. This keynote will be about the life cycle of the WikiProject COVID-19: how it came to be, what the participants contributed to Wikidata, the academic outputs, and the legacy for the community.

Bio

Tiago Lubiana is a co-founder of WikiProject COVID-19 and a Ph.D. candidate at the University of São Paulo, studying the modeling of cell types on Wikidata. He is a member of the Equity, Diversity, and Inclusion Committee of the International Society for Biocuration, a regular contributor to the Cell Ontology, and an active Wikidata editor on biomedical sciences, running three bots: the CellosaurusBot, the ComplexPortalBot, and the CovidDatahubbot. He is also a Research Scholar at the Ronin Institute and a member of WikiMovimento Brasil, an NGO supporting activities related to Wikimedia. In 2021, he was awarded a Shuttleworth Flash Grant, a prize to "social change agents, no strings attached, in support of their work."

Location

Co-located with ISWC 2022

Online event

Image: Wikimedia Hackathon 2020, CC-BY-SA 4.0

Organization

Organizing Committee

Lucie-Aimée Kaffee, University of Southampton, lucie.kaffee[[@]]gmail.com

Lucie-Aimée Kaffee is a postdoctoral research fellow at the University of Copenhagen. She acquired her PhD from the University of Southampton and was previously a research intern at Bloomberg, London, a research fellow at TIB Hannover and software developer in the Wikidata team, Wikimedia Germany. Her research focus is multilingual linked data in collaborative knowledge graphs and natural language processing. Lucie was part of the OC of the Wikidata Workshop co-located with ISWC'20, proceedings chair of ISWC'20 and ISWC'21, OC of AMAR: First International Workshop on Approaches for Making Data Interoperable at SEMANTiCS'19 and participated in the PC of a variety of conferences and workshops.

Simon Razniewski, Max Planck Institute for Informatics, srazniew[[@]]mpi-inf.mpg.de

Simon Razniewski is a senior researcher at the Max Planck Institute for Informatics in Saarbrücken, Germany, where he heads the Knowledge Base Construction and Quality research area. His research focuses on methods for knowledge base construction, as well as quality assessment, with applications in Wikidata and beyond. He has held senior roles in program committees of major conferences such as IJCAI'21 (area chair), and ISWC'20 and CIKM'20 (senior PC member). He has held visiting positions at AT&T Labs-Research, the University of Queensland, and UCSD, among others, and his research has been recognized with multiple awards and research grants.

Gabriel Amaral, King's College London, gabriel.amaral[[@]]kcl.ac.uk

Gabriel Amaral is a computer scientist, graduated summa cum laude from the Federal University of Ceará, and a PhD candidate at King's College London. He is part of the Marie Curie European training network Cleopatra, delivering technologies to build and use large-scale, multilingual knowledge graphs. His research tackles the quality of references and the verification of claims found in Wikidata.

Kholoud Saad Alghamdi, King's College London, kholoud.alghamdi[[@]]kcl.ac.uk

Kholoud Saad Alghamdi is a PhD candidate at King's College London. She obtained her master's degree in Computer Science from the University of Southampton. Her PhD project develops an item recommender system for Wikidata editors. Before that, she was a lecturer at King Abdulaziz University and previously worked as a data analyst in industry.

Program Committee

Seyed Amir Hosseini Beghaeiraveri, Heriot-Watt University

Niel Chah, University of Toronto

Houcemeddine Turki, Faculty of Medicine of Sfax

David Abián, King's College London

John Samuel, CPE Lyon, LIRIS - UMR 5205

Luis Galárraga, Inria

Filip Ilievski, Information Sciences Institute, USC

Lydia Pintscher, Wikimedia Deutschland

Elisavet Koutsiana, King's College London

Pierre-Henri Paris, CNAM

Alessandro Piscopo, BBC

Mahir Morshed, University of Illinois at Urbana-Champaign

Dennis Diefenbach, The QA Company

Alasdair Gray, Heriot-Watt University

Daniel Garijo, Universidad Politécnica de Madrid

Andrew D. Gordon, Microsoft Research and University of Edinburgh

Thomas Pellissier Tanon, Télécom ParisTech

Cristina Sarasua, University of Zurich

Pavlos Vougiouklis, University of Southampton
