Wikidata is an open knowledge base hosted by the Wikimedia Foundation that can be read and edited by both humans and machines. Wikidata acts as the central source of common, open structured data used by Wikipedia, Wiktionary, Wikisource, and others. It is used in a variety of academic and industrial applications.
In recent years, we have seen an increase in the number of scientific publications around Wikidata. While there are a number of venues for the Wikidata community to exchange ideas, none of them publish original research. We want to bridge the gap between these communities and research events, and give the research-focused part of the Wikidata community a venue to meet and exchange information and knowledge.
The Wikidata Workshop 2021 focuses on the challenges and opportunities of working on a collaborative open-domain knowledge graph such as Wikidata, which is edited by an international and multilingual community. We encourage submissions that observe the influence such a knowledge graph has on the web of data, as well as those working on improving this knowledge graph itself. This workshop brings together everyone working around Wikidata in both the scientific field and industry to discuss trends and topics around this collaborative knowledge graph.
Call for Papers
The papers will be peer-reviewed by at least three researchers. Selected papers will be published on CEUR (we only publish to CEUR if the authors agree to have their papers published). Papers have to be submitted through EasyChair.
Submission Link: https://easychair.org/conferences/?conf=wikidata21
Papers due: Friday, 6 August 2021 (extended from 30 July 2021)
Notification of accepted papers: Friday, 24 September 2021
Camera ready papers due: Monday, 4 October 2021
Workshop date: 24 October 2021
Submissions must be in PDF, formatted in the style of the Springer Publications format for Lecture Notes in Computer Science (LNCS). For details on the LNCS style, see Springer’s Author Instructions.
We will accept papers of up to 12 pages (excluding references; the contribution of the paper should justify its length), including the following:
The workshop time is: 3 - 8 pm (CET), 2 - 7 pm (UK), 6 - 11 am (California, US)
All times below in CET.
15:00 - 15:15
Welcome: Welcome from the organisers, agenda, rules of engagement
15:15 - 16:00
Keynote 1: Érica Azzellini
16:00 - 16:20
Lightning Talks 1
16:20 - 16:40
Poster Session 1
16:40 - 17:00
17:00 - 17:45
Keynote 2: Andrew Lih
17:45 - 18:10
Lightning Talks 2
18:10 - 18:30
Poster Session 2
18:30 - 19:00
19:00 - 19:20
Lightning Talks 3
19:20 - 19:40
Poster Session 3
19:40 - 20:00
Closing: Concluding remarks, closing
Sessions / Papers
Session 1: Links and evolution
Talks: 16:00 - 16:20 (CET); Posters: 16:20 - 16:40 (CET)
- Armand Boschin, Thomas Bonald: Enriching Wikidata with Semantified Wikipedia Hyperlinks (Room 1)
- Houcemeddine Turki, Mohamed Ali Hadj Taieb, Mohamed Ben Aouicha: Coupling Wikipedia Categories with Wikidata Statements for Better Semantics (Room 2)
- Mahir Morshed: Modeling Syntactic Dependency Relationships in Wikidata Lexicographical Data (Room 3)
- Lukas Schmelzeisen, Corina Dima, Steffen Staab: Wikidated 1.0: An Evolving Knowledge Graph Dataset of Wikidata’s Revision History (Room 4)
Session 2: Domains and bias
Talks: 17:45 - 18:10 (CET); Posters: 18:10 - 18:30 (CET)
- Wessel Radstok, Melisachew Wudage Chekol, Mirko T. Schäfer: Are knowledge graph embedding models biased, or is it the data that they are trained on? (Room 1)
- Seyed Amir Hosseini Beghaeiraveri, Alasdair Gray, Fiona McNeill: Reference Statistics in Wikidata Topical Subsets (Room 2)
- Philipp Scharpf, Moritz Schubotz, Bela Gipp: Mathematics in Wikidata (Room 3)
- Bernhard Krabina, Axel Polleres: Seeding Wikidata with Municipal Finance Data (Room 4)
- Oktie Hassanzadeh: Building a Knowledge Graph of Events and Consequences Using Wikidata (Room 5)
Session 3: Personalisation and usability
Talks: 19:00 - 19:20 (CET); Posters: 19:20 - 19:40 (CET)
- Hans Chalupsky, Pedro Szekely, Filip Ilievski, Daniel Garijo, Kartik Shenoy: Creating and Querying Personalized Versions of Wikidata on a Laptop (Room 1)
- Daniel Henselmann, Andreas Harth: Constructing demand-driven Wikidata Subsets (Room 2)
- Niel Chah, Periklis Andritsos: WikiMetaData Studio: Dashboards From Data Profiling the Languages, Properties, and Items of Wikidata (Room 3)
- Filip Ilievski, Pedro Szekely, Gleb Satyukov, Amandeep Singh: User-friendly Comparison of Similarity Algorithms on Wikidata (Room 4)
Wiki Movimento Brasil
Reimagining Wikidata from the margins: a vision for decolonizing the internet
Although Wikidata intends to structure the sum of all human knowledge, we’re still missing the Global South and other marginalized communities from the North, both in data and in contributors. This can distort how topics regarding communities from the margins are represented in the project, which sources are used for them, and even whether they exist or are made visible. As Wikidata celebrates nine years, with its sustainability being the focus of WikidataCon 2021, discussions around how Wikidata’s infrastructure, content and community can be used to support a better understanding and representation of the diversity of human knowledge are happening through the “Reimagining Wikidata from the margins” project. This is a talk about the vision that inspired the project and how this process is articulating different stakeholders to decolonize the internet - one item at a time.
Smithsonian Institution and Metropolitan Museum of Art
The Art and Science of Wiki Co-creation with GLAM Partners
Twenty years after Wikipedia's founding, we are seeing a rich set of mutually beneficial partnerships between the Wikimedia community and the cultural and heritage sector. Andrew Lih will describe the evolution of participation models related to contribution, collaboration and co-creation with GLAM partners over the past decade. He will also cover the foundational work the Wikidata and Wikimedia communities have embarked on, including comprehensive image donations, Wikidata modeling, data roundtripping, best practices, the use of machine learning and image recognition for depiction metadata, and the current work on Structured Data on Commons.
Lucie-Aimée Kaffee, University of Southampton, lucie.kaffee[[@]]gmail.com
Lucie-Aimée Kaffee is a PhD candidate at ECS, University of Southampton and research intern at Bloomberg’s AI Group in London. She was previously a research fellow at TIB, Hannover and software developer in the Wikidata team, Wikimedia Germany. Her research focus is multilingual linked data in collaborative knowledge graphs, and she has published on Wikidata research. Lucie was the proceedings chair of ISWC 2018, OC of AMAR: First International Workshop on Approaches for Making Data Interoperable at SEMANTiCS 2019 and participated in the PC of The Web Conference 2020, AAAI-20 Students, ESWC 2019, ISWC 2019 and SEMANTiCS 2019 and of the workshops Wikidata Quality 2019 and Workshop on Contextualized Knowledge Graphs at ISWC 2018 and ISWC 2019. She has organized Ladies that FOSS (2016), an event to enable a more diverse open source development community, Wikidata meetings in London (2018) and was part of the committee of WikidataCon 2019.
Simon Razniewski, Max Planck Institute for Informatics, srazniew[[@]]mpi-inf.mpg.de
Simon Razniewski is a senior researcher at the Max Planck Institute for Informatics in Saarbrücken, Germany, where he heads the Knowledge Base Construction and Quality research area. His research focuses on methods for knowledge base construction, as well as quality assessment. He has held senior roles in program committees of major conferences such as IJCAI'21 (area chair), or ISWC'20 and CIKM'20 (senior PC member). He has held visiting positions at places such as AT&T Labs-Research, the University of Queensland, and UCSD, and his research on data management and knowledge bases has been recognized with multiple awards and grants.
Aidan Hogan, University of Chile, ahogan[[@]]dcc.uchile.cl
Aidan Hogan is an Associate Professor at the Department of Computer Science, University of Chile, and an Associate Researcher of the Millennium Institute for Foundational Research on Data (IMFD). He previously worked at the Digital Enterprise Research Institute (DERI) (now called INSIGHT) based in the National University of Ireland, Galway, where he completed his PhD in 2011 and a PostDoc under the supervision of Prof. Dr. Axel Polleres. He has served various roles in the OC of the ISWC and ESWC conferences, including co-chair of In-Use, Workshop & Tutorial, and Poster & Demo Tracks. He has been the PC Chair for AMW and will be a PC Chair for JIST-KG. He has helped co-organise several workshops, including several years of the COLD and QuWeDa series at ISWC. He is also on the EB of the Semantic Web Journal, and the Journal of Web Semantics.
Miriam Redi, Wikimedia Foundation
John Samuel, CPE Lyon
Dennis Diefenbach, University Jean Monnet
Lydia Pintscher, Wikimedia Deutschland
Edgar Meij, Bloomberg L.P.
Thomas Pellissier Tanon, Lexistems
Hiba Arnaout, MPI for Informatics
Fabian Suchanek, Télécom ParisTech
Fariz Darari, University of Indonesia
Filip Ilievski, ISI
Marco Ponza, Bloomberg L.P.
Cristina Sarasua, University of Zurich
Pavlos Vougiouklis, Huawei Technologies, Edinburgh
Finn Årup Nielsen, Technical University of Denmark
Andrew D. Gordon, Microsoft Research & University of Edinburgh
Michael Luggen, Fribourg University
Shrestha Ghosh, MPI for Informatics
Daniel Garijo, ISI
Gong Cheng, Nanjing University
Anastasia Dimou, Ghent University
Sebastián Ferrada, Universidad de Chile