Wikidata Workshop : Home

Overview

Wikidata is an open knowledge base hosted by the Wikimedia Foundation that can be read and edited by both humans and machines. Wikidata acts as the central source of common, open structured data used by Wikipedia, Wiktionary, Wikisource, and others. It is used in a variety of academic and industrial applications.

In recent years, the number of scientific publications around Wikidata has increased. While there are a number of venues where the Wikidata community can exchange ideas, none of them publishes original research. We want to bridge the gap between these communities and research events, and give the research-focused part of the Wikidata community a venue to meet and exchange information and knowledge.

The Wikidata Workshop 2021 focuses on the challenges and opportunities of working on a collaborative open-domain knowledge graph such as Wikidata, which is edited by an international and multilingual community. We encourage submissions that observe the influence such a knowledge graph has on the web of data, as well as those working on improving this knowledge graph itself. This workshop brings together everyone working around Wikidata in both the scientific field and industry to discuss trends and topics around this collaborative knowledge graph.

Call for Papers

Each paper will be peer-reviewed by at least three researchers. Selected papers will be published on CEUR, provided the authors agree to have their papers published. Papers must be submitted through EasyChair.

Submission Link: https://easychair.org/conferences/?conf=wikidata21

Important Dates

Papers due: Friday, 6 August 2021 (extended from 30 July 2021)

Notification of accepted papers: Friday, 24 September 2021

Camera ready papers due: Monday, 4 October 2021

Workshop date: 24 October 2021

Submission Guidelines

Submissions must be in PDF, formatted in the Springer Lecture Notes in Computer Science (LNCS) style. For details on the LNCS style, see Springer’s Author Instructions.

We accept papers of up to 12 pages, excluding references; the contribution of the paper should justify its length. The following paper types are accepted:

Full research paper

Novel research contributions (7-12 pages)

Short research paper

Novel research contributions of smaller scope than full papers (3-6 pages)

Position paper

Presenting a novel idea that is not yet in the scope of a research contribution (6-8 pages)

Resource paper

Presenting a new dataset or other resource, includes the publication of that resource (8-12 pages)

Demo paper

Presenting a system based on research concepts (6-8 pages)

Schedule Detail

The workshop time is: 3 - 8 pm (CET), 2 - 7 pm (UK), 6 - 11 am (California, US)

All times below in CET.

15:00 - 15:15
Welcome: welcome from the organisers, agenda, rules of engagement

15:15 - 16:00
Keynote 1: Érica Azzellini

16:00 - 16:20
Lightning Talks 1

16:20 - 16:40
Poster Session 1

16:40 - 17:00
Break

17:00 - 17:45
Keynote 2: Andrew Lih

17:45 - 18:10
Lightning Talks 2

18:10 - 18:30
Poster Session 2

18:30 - 19:00
Break

19:00 - 19:20
Lightning Talks 3

19:20 - 19:40
Poster Session 3

19:40 - 20:00
Closing: concluding remarks

Sessions / Papers

Session 1: Links and evolution

Talks: 16:00 - 16:20 (CET); Posters: 16:20 - 16:40 (CET)

Armand Boschin, Thomas Bonald: Enriching Wikidata with Semantified Wikipedia Hyperlinks (Room 1)
Houcemeddine Turki, Mohamed Ali Hadj Taieb, Mohamed Ben Aouicha: Coupling Wikipedia Categories with Wikidata Statements for Better Semantics (Room 2)
Mahir Morshed: Modeling Syntactic Dependency Relationships in Wikidata Lexicographical Data (Room 3)
Lukas Schmelzeisen, Corina Dima, Steffen Staab: Wikidated 1.0: An Evolving Knowledge Graph Dataset of Wikidata’s Revision History (Room 4)

Session 2: Domains and bias

Talks: 17:45 - 18:10 (CET); Posters: 18:10 - 18:30 (CET)

Wessel Radstok, Melisachew Wudage Chekol, Mirko T. Schäfer: Are knowledge graph embedding models biased, or is it the data that they are trained on? (Room 1)
Seyed Amir Hosseini Beghaeiraveri, Alasdair Gray, Fiona McNeill: Reference Statistics in Wikidata Topical Subsets (Room 2)
Philipp Scharpf, Moritz Schubotz, Bela Gipp: Mathematics in Wikidata (Room 3)
Bernhard Krabina, Axel Polleres: Seeding Wikidata with Municipal Finance Data (Room 4)
Oktie Hassanzadeh: Building a Knowledge Graph of Events and Consequences Using Wikidata (Room 5)

Session 3: Personalisation and usability

Talks: 19:00 - 19:20 (CET); Posters: 19:20 - 19:40 (CET)

Hans Chalupsky, Pedro Szekely, Filip Ilievski, Daniel Garijo, Kartik Shenoy: Creating and Querying Personalized Versions of Wikidata on a Laptop (Room 1)
Daniel Henselmann, Andreas Harth: Constructing demand-driven Wikidata Subsets (Room 2)
Niel Chah, Periklis Andritsos: WikiMetaData Studio: Dashboards From Data Profiling the Languages, Properties, and Items of Wikidata (Room 3)
Filip Ilievski, Pedro Szekely, Gleb Satyukov, Amandeep Singh: User-friendly Comparison of Similarity Algorithms on Wikidata (Room 4)

Our Speakers

Érica Azzellini

Wiki Movimento Brasil

Keynote

Reimagining Wikidata from the margins: a vision for decolonizing the internet

Abstract

Although Wikidata intends to structure the sum of all human knowledge, it is still missing the Global South and marginalized communities from the North, both in its data and among its contributors. This gap can distort how topics concerning communities from the margins are represented in the project, which sources are used for them, and even whether they exist or are made visible. As Wikidata celebrates nine years, with its sustainability being the focus of WikidataCon 2021, discussions about how Wikidata’s infrastructure, content and community can support a better understanding and representation of the diversity of human knowledge are happening through the “Reimagining Wikidata from the margins” project. This talk is about the vision that inspired the project and how the process is bringing together different stakeholders to decolonize the internet, one item at a time.

Andrew Lih

Smithsonian Institution and Metropolitan Museum of Art

Keynote

The Art and Science of Wiki Co-creation with GLAM Partners
or
Wiki Co-creation with GLAM Partners

Abstract

Twenty years after Wikipedia's founding, we are seeing a rich set of mutually beneficial partnerships between the Wikimedia community and the cultural and heritage sector. Andrew Lih will describe the evolution of participation models related to contribution, collaboration and co-creation with GLAM partners over the past decade. He will also cover the foundational work the Wikidata and Wikimedia communities have embarked on, including comprehensive image donations, Wikidata modeling, data roundtripping, best practices, the use of machine learning and image recognition for depiction metadata, and the current work on Structured Data on Commons.

Location

Co-located with ISWC 2021

Online event

Image: Wikimedia Hackathon 2020, CC-BY-SA 4.0

Organization

Organizing Committee

Lucie-Aimée Kaffee, University of Southampton, lucie.kaffee[[@]]gmail.com

Lucie-Aimée Kaffee is a PhD candidate at ECS, University of Southampton, and a research intern at Bloomberg’s AI Group in London. She was previously a research fellow at TIB, Hannover, and a software developer in the Wikidata team at Wikimedia Germany. Her research focuses on multilingual linked data in collaborative knowledge graphs, and she has published on Wikidata research. Lucie was the proceedings chair of ISWC 2018 and on the OC of AMAR: First International Workshop on Approaches for Making Data Interoperable at SEMANTiCS 2019. She has served on the PCs of The Web Conference 2020, AAAI-20 Students, ESWC 2019, ISWC 2019 and SEMANTiCS 2019, and of the workshops Wikidata Quality 2019 and Workshop on Contextualized Knowledge Graphs at ISWC 2018 and ISWC 2019. She has organized Ladies that FOSS (2016), an event to enable a more diverse open source development community, and Wikidata meetings in London (2018), and was part of the committee of WikidataCon 2019.

Simon Razniewski, Max Planck Institute for Informatics, srazniew[[@]]mpi-inf.mpg.de

Simon Razniewski is a senior researcher at the Max Planck Institute for Informatics in Saarbrücken, Germany, where he heads the Knowledge Base Construction and Quality research area. His research focuses on methods for knowledge base construction, as well as quality assessment. He has held senior roles in program committees of major conferences such as IJCAI'21 (area chair), and ISWC'20 and CIKM'20 (senior PC member). He has held visiting positions at places such as AT&T Labs-Research, the University of Queensland, and UCSD, and his research on data management and knowledge bases has been recognized with multiple awards and grants.

Aidan Hogan, University of Chile, ahogan[[@]]dcc.uchile.cl

Aidan Hogan is an Associate Professor at the Department of Computer Science, University of Chile, and an Associate Researcher of the Millennium Institute for Foundational Research on Data (IMFD). He previously worked at the Digital Enterprise Research Institute (DERI) (now called INSIGHT) based in the National University of Ireland, Galway, where he completed his PhD in 2011 and a postdoc under the supervision of Prof. Dr. Axel Polleres. He has served in various roles in the OC of the ISWC and ESWC conferences, including co-chair of the In-Use, Workshop & Tutorial, and Poster & Demo Tracks. He has been the PC Chair for AMW and will be a PC Chair for JIST-KG. He has helped co-organise several workshops, including several years of the COLD and QuWeDa series at ISWC. He is also on the EB of the Semantic Web Journal and the Journal of Web Semantics.

Program Committee

Miriam Redi, Wikimedia Foundation

John Samuel, CPE Lyon

Dennis Diefenbach, University Jean Monnet

Lydia Pintscher, Wikimedia Deutschland

Edgar Meij, Bloomberg L.P.

Thomas Pellissier Tanon, Lexistems

Hiba Arnaout, MPI for Informatics

Fabian Suchanek, Télécom ParisTech

Fariz Darari, University of Indonesia

Filip Ilievski, ISI

Marco Ponza, Bloomberg L.P.

Cristina Sarasua, University of Zurich

Pavlos Vougiouklis, Huawei Technologies, Edinburgh

Finn Årup Nielsen, Technical University of Denmark

Andrew D. Gordon, Microsoft Research & University of Edinburgh

Michael Luggen, Fribourg University

Shrestha Ghosh, MPI for Informatics

Daniel Garijo, ISI

Gong Cheng, Nanjing University

Anastasia Dimou, Ghent University

Sebastián Ferrada, Universidad de Chile
