Wikidata is an open knowledge base hosted by the Wikimedia Foundation that can be read and edited by both humans and machines. Wikidata acts as the central source of common, open structured data used by Wikipedia, Wiktionary, Wikisource, and others. It is used in a variety of academic and industrial applications.
In recent years, we have seen an increase in the number of scientific publications around Wikidata. While there are a number of venues where the Wikidata community can exchange ideas, none of them publishes original research. We want to bridge the gap between these communities and the research events and give the research-focused part of the Wikidata community a venue to meet and exchange information and knowledge.
The Wikidata Workshop 2022 focuses on the challenges and opportunities of working on a collaborative open-domain knowledge graph such as Wikidata, which is edited by an international and multilingual community. We encourage submissions that observe the influence such a knowledge graph has on the web of data, as well as those working on improving this knowledge graph itself. This workshop brings together everyone working around Wikidata in both the scientific field and industry to discuss trends and topics around this collaborative knowledge graph.
Call for Papers
This workshop will have two tracks: Novel Work, and Previously Published Work.
Papers in the Novel Work track will be published as part of the workshop proceedings. The Previously Published Work track is for papers already published at other conferences, giving the community the chance to access and discuss, as part of the workshop, relevant work that has been presented elsewhere.
Novel Work Track
The papers will be peer-reviewed by at least three researchers. Selected papers will be published on CEUR (we only publish to CEUR if the authors agree to have their papers published).
For the Novel Work track, we will accept papers of up to 12 pages (excluding references; the contribution of the paper should justify its length). We invite the following types of papers:
Previously Published Work Track
Published papers will be reviewed by the organising committee in terms of topical fit and prominence of the publication venue. They will not be published as part of the proceedings.
For the Previously Published Work track, we will accept papers with no page limit, prioritizing instead the importance and relevance of the publication. We invite the following types of papers:
Papers have to be submitted through EasyChair.
We ask authors to declare the track they intend to submit to. To do so, please add, at the beginning of the "title" field on the EasyChair submission, either the string "[Novel]", for the Novel Work track, or the string "[Published]", for the Previously Published Work track.
Submission Link: https://easychair.org/conferences/?conf=wikidataworkshop2022
Papers due: Friday, 5 August 2022
Notification of accepted papers: Friday, 30 September 2022
Camera ready papers due: Monday, 10 October 2022
Workshop date: Monday, 24 October 2022
Submissions must be in PDF format. For the [Novel] track, papers must be formatted in the Springer Lecture Notes in Computer Science (LNCS) style. For details on the LNCS style, see Springer’s Author Instructions. For the [Published] track, no reformatting of the original PDFs is needed.
Extended versions of journal papers are invited for submission to a special issue of the Semantic Web Journal.
The workshop time is: 2 - 6 pm (CEST), 1 - 5 pm (UK), 5 - 9 am (California, US)
All times below are in CEST.
14:00 - 14:10: Welcome (welcome from the organisers, agenda, rules of engagement)
14:10 - 14:55: Keynote 1: Lydia Pintscher
14:55 - 15:15: Lightning Talks 1
15:15 - 15:35: Poster Session 1
15:35 - 15:45
15:45 - 16:30: Keynote 2: Tiago Lubiana
16:30 - 16:50: Lightning Talks 2
16:50 - 17:10: Poster Session 2
17:10 - 17:30: Lightning Talks 3
17:30 - 17:50: Poster Session 3
17:50 - 18:00: Closing (concluding remarks, closing)
Sessions / Papers
Session 1: Wikidata Ontology and Schema
Talks: 14:55 - 15:15 (CEST); Posters: 15:15 - 15:35 (CEST)
- Armin Haller, Axel Polleres, Daniil Dobriy, Nicolas Ferranti and Sergio J. Rodríguez Méndez: An Analysis of Links in Wikidata (Room 1)
- Nicolas Ferranti, Axel Polleres, Jairo Francisco De Souza and Shqiponja Ahmetaj: Formalizing Property Constraints in Wikidata (Room 2)
- Valentina Anita Carriero, Paul Groth and Valentina Presutti: Towards improving Wikidata reuse with emerging patterns (Room 3)
- Jose Emilio Labra Gayo: WShEx: A language to describe and validate Wikibase entities (Room 4)
- Leila Feddoul, Frank Löffler and Sirko Schindler: Analysis of Consistency between Wikidata and Wikipedia categories (Room 5)
- Sofia Baroncini, Margherita Martorana, Mario Scrocca, Zuzanna Smiech and Axel Polleres: Analysing the Evolution of Community-Driven (Sub-)Schemas within Wikidata (Room 6)
- Daniil Dobriy and Axel Polleres: Analysing and promoting ontology interoperability in Wikibase (Room 7)
- Wolfgang Fahl, Tim Holzheim, Andrea Westerinen, Christoph Lange and Stefan Decker: Property cardinality analysis to extract truly tabular query results from Wikidata (Room 8)
Session 2: Wikidata Querying and Quality
Talks: 16:30 - 16:50 (CEST); Posters: 16:50 - 17:10 (CEST)
- Bohui Zhang, Filip Ilievski and Pedro Szekely: Enriching Wikidata with Linked Open Data (Room 1)
- Wolfgang Fahl, Tim Holzheim, Andrea Westerinen, Christoph Lange and Stefan Decker: Getting and hosting your own copy of Wikidata (Room 2)
- Antoine Willerval, Dennis Diefenbach and Pierre Maret: Easily setting up a local Wikidata SPARQL endpoint using the qEndpoint (Room 3)
- Nicholas Klein, Filip Ilievski, Hayden Freedman and Pedro Szekely: Identifying Surprising Facts in Wikidata; Kartik Shenoy, Filip Ilievski, Daniel Garijo, Daniel Schwabe and Pedro Szekely: A Study of the Quality of Wikidata (Room 4)
- Seyed Amir Hosseini Beghaeiraveri: Towards Automated Technologies in the Referencing Quality of Wikidata (Room 5)
- Hans Chalupsky and Pedro Szekely: Hybrid Structured and Similarity Queries over Wikidata plus Embeddings with Kypher-V (Room 6)
Session 3: Personalisation and Usability
Talks: 17:10 - 17:30 (CEST); Posters: 17:30 - 17:50 (CEST)
- Nicholas Klein, Filip Ilievski and Pedro Szekely: Generating Explainable Abstractions for Wikidata Entities (Room 1)
- Fariz Darari: COVIWD: COVID-19 Wikidata Dashboard (Room 2)
- Seyed Amir Hosseini Beghaeiraveri, Alasdair Gray and Fiona McNeill: Experiences of Using WDumper to Create Topical Subsets from Wikidata (Room 3)
- Philipp Scharpf, Moritz Schubotz, Andreas Spitz, André Greiner-Petter and Bela Gipp: Collaborative and AI-aided Exam Question Generation using Wikidata in Education (Room 4)
- Sola Shirai, Aamod Khatiwada, Oktie Hassanzadeh and Debarun Bhattacharjya: Rule-Based Link Prediction over Event-Related Causal Knowledge in Wikidata (Room 5)
- Lozana Rossenova, Paul Duchesne and Ina Blümel: Wikidata and Wikibase as complementary research data management services for cultural heritage data (Room 6)
- Lucas Jarnac and Pierre Monnin: Wikidata to Bootstrap an Enterprise Knowledge Graph: How to Stay on Topic? (Room 7)
Sofia Baroncini, Margherita Martorana, Mario Scrocca, Zuzanna Smiech and Axel Polleres
Analysing the Evolution of Community-Driven (Sub-)Schemas within Wikidata
Seyed Amir Hosseini Beghaeiraveri
Towards Automated Technologies in the Referencing Quality of Wikidata
Seyed Amir Hosseini Beghaeiraveri, Alasdair Gray and Fiona McNeill
Experiences of Using WDumper to Create Topical Subsets from Wikidata
Valentina Anita Carriero, Paul Groth and Valentina Presutti
Towards improving Wikidata reuse with emerging patterns
Hans Chalupsky and Pedro Szekely
Hybrid Structured and Similarity Queries over Wikidata plus Embeddings with Kypher-V
Fariz Darari
COVIWD: COVID-19 Wikidata Dashboard
Daniil Dobriy and Axel Polleres
Analysing and promoting ontology interoperability in Wikibase
Wolfgang Fahl, Tim Holzheim, Andrea Westerinen, Christoph Lange and Stefan Decker
Property cardinality analysis to extract truly tabular query results from Wikidata
Wolfgang Fahl, Tim Holzheim, Andrea Westerinen, Christoph Lange and Stefan Decker
Getting and hosting your own copy of Wikidata
Leila Feddoul, Frank Löffler and Sirko Schindler
Analysis of Consistency between Wikidata and Wikipedia categories
Nicolas Ferranti, Axel Polleres, Jairo Francisco De Souza and Shqiponja Ahmetaj
Formalizing Property Constraints in Wikidata
Jose Emilio Labra Gayo
WShEx: A language to describe and validate Wikibase entities
Armin Haller, Axel Polleres, Daniil Dobriy, Nicolas Ferranti and Sergio J. Rodríguez Méndez
An Analysis of Links in Wikidata
Lucas Jarnac and Pierre Monnin
Wikidata to Bootstrap an Enterprise Knowledge Graph: How to Stay on Topic?
Nicholas Klein, Filip Ilievski, Hayden Freedman and Pedro Szekely
Identifying Surprising Facts in Wikidata
Nicholas Klein, Filip Ilievski and Pedro Szekely
Generating Explainable Abstractions for Wikidata Entities
Lozana Rossenova, Paul Duchesne and Ina Blümel
Wikidata and Wikibase as complementary research data management services for cultural heritage data
Philipp Scharpf, Moritz Schubotz, Andreas Spitz, André Greiner-Petter and Bela Gipp
Collaborative and AI-aided Exam Question Generation using Wikidata in Education
Kartik Shenoy, Filip Ilievski, Daniel Garijo, Daniel Schwabe and Pedro Szekely
A Study of the Quality of Wikidata
Sola Shirai, Aamod Khatiwada, Oktie Hassanzadeh and Debarun Bhattacharjya
Rule-Based Link Prediction over Event-Related Causal Knowledge in Wikidata
Antoine Willerval, Dennis Diefenbach and Pierre Maret
Easily setting up a local Wikidata SPARQL endpoint using the qEndpoint
Bohui Zhang, Filip Ilievski and Pedro Szekely
Enriching Wikidata with Linked Open Data
10 years of Wikidata - insights for researchers and current challenges
Wikidata is about to turn 10. In those 10 years, Wikidata has opened up data and got a large number of people excited about knowledge graphs. With Wikidata, the Wikimedia Movement has managed to create a place where humans and machines come together every day to give more people more access to more knowledge. Wikidata's data is powering a lot of the technology we use every day, from digital personal assistants to investigative journalism to library systems and more. We will take a look behind the scenes of the project and get to know the community that stands behind it. We will explore insights that will help researchers better understand how and why Wikidata works. Ten years are, of course, just the beginning, so we will also take a look at the current challenges that researchers can support the project with, as well as exciting upcoming developments.
Lydia Pintscher is the Product Manager for Wikidata at Wikimedia Deutschland. She studied computer science with a focus on innovation and language at the University of Karlsruhe. She is a long-time free software contributor, most notably as a member of the board of KDE e.V.
WikiProject COVID-19: modelling the pandemic in real time
The Wikidata WikiProject COVID-19 sprouted during the pandemic as a gathering of Wikidata editors curating and organizing structured information about all aspects of the COVID-19 pandemic. The scale and pace of a global pandemic have highlighted issues around the consistent structuring of information, such as the scope of geographically-bound statements and the period of outbreaks, along with challenges for quickly updating information on pages across multiple languages. This keynote will be about the life cycle of the WikiProject COVID-19: how it came to be, what the participants contributed to Wikidata, the academic outputs, and the legacy for the community.
Tiago Lubiana is a co-founder of WikiProject COVID-19 and a Ph.D. Candidate at the University of São Paulo, studying the modeling of cell types on Wikidata. He is a member of the Equity, Diversity, and Inclusion Committee of the International Society for Biocuration, a regular contributor to the Cell Ontology and an active Wikidata editor on biomedical sciences, running three bots: the CellosaurusBot, the ComplexPortalBot, and the CovidDatahubbot. He is also a Research Scholar at the Ronin Institute and a member of WikiMovimento Brasil, an NGO supporting activities related to Wikimedia. In 2021, he was awarded a Shuttleworth Flash Grant, a prize to "social change agents, no strings attached, in support of their work."
Lucie-Aimée Kaffee, University of Southampton, lucie.kaffee[[@]]gmail.com
Lucie-Aimée Kaffee is a postdoctoral research fellow at the University of Copenhagen. She obtained her PhD from the University of Southampton and was previously a research intern at Bloomberg, London, a research fellow at TIB Hannover, and a software developer in the Wikidata team at Wikimedia Germany. Her research focus is multilingual linked data in collaborative knowledge graphs and natural language processing. Lucie was part of the OC of the Wikidata Workshop co-located with ISWC'20, proceedings chair of ISWC'20 and ISWC'21, and part of the OC of AMAR: First International Workshop on Approaches for Making Data Interoperable at SEMANTiCS'19, and has participated in the PCs of a variety of conferences and workshops.
Simon Razniewski, Max Planck Institute for Informatics, srazniew[[@]]mpi-inf.mpg.de
Simon Razniewski is a senior researcher at the Max Planck Institute for Informatics in Saarbrücken, Germany, where he heads the Knowledge Base Construction and Quality research area. His research focuses on methods for knowledge base construction, as well as quality assessment, with applications in Wikidata and beyond. He has held senior roles in program committees of major conferences such as IJCAI'21 (area chair) and ISWC'20 and CIKM'20 (senior PC member). He has held visiting positions at places such as AT&T Labs-Research, the University of Queensland, and UCSD, and his research has been recognized with multiple awards and research grants.
Gabriel Amaral, King's College London, gabriel.amaral[[@]]kcl.ac.uk
Gabriel Amaral is a computer scientist, graduated summa cum laude from the Federal University of Ceará, and a PhD candidate at King's College London. He is part of the Marie Curie European training network Cleopatra, delivering technologies to build and use large-scale, multilingual knowledge graphs. His research tackles the quality of references and the verification of claims found in Wikidata.
Kholoud Saad Alghamdi, King's College London, kholoud.alghamdi[[@]]kcl.ac.uk
Kholoud Saad Alghamdi is a PhD candidate at King's College London. She obtained her master's degree in Computer Science from the University of Southampton. Her PhD project develops an item recommender system for Wikidata editors. Before that, she was a lecturer at King Abdulaziz University and previously worked as a data analyst in industry.
Seyed Amir Hosseini Beghaeiraveri, Heriot-Watt University
Niel Chah, University of Toronto
Houcemeddine Turki, Faculty of Medicine of Sfax
David Abián, King's College London
John Samuel, CPE Lyon, LIRIS - UMR 5205
Luis Galárraga, Inria
Filip Ilievski, Information Sciences Institute, USC
Lydia Pintscher, Wikimedia Deutschland
Elisavet Koutsiana, King's College London
Pierre-Henri Paris, CNAM
Alessandro Piscopo, BBC
Mahir Morshed, University of Illinois at Urbana-Champaign
Dennis Diefenbach, The QA Company
Alasdair Gray, Heriot-Watt University
Daniel Garijo, Universidad Politécnica de Madrid
Andrew D. Gordon, Microsoft Research and University of Edinburgh
Thomas Pellissier Tanon, Télécom ParisTech
Cristina Sarasua, University of Zurich
Pavlos Vougiouklis, University of Southampton