Call for Papers
Motivations for the Workshop
Since the appearance of transformers (Vaswani et al., 2017), Deep Learning (DL) and neural approaches have made a huge contribution to Natural Language Processing (NLP), either with highly specialized models for specific applications or via Large Language Models (LLMs) (Devlin et al., 2019; Brown et al., 2020; Touvron et al., 2023) that are efficient few-shot learners for many NLP tasks. Such models usually build on huge web-scale data (raw multilingual corpora and annotated, specialized, task-related corpora) that are now widely available on the Web. This approach has clearly shown many successes, but it still suffers from several weaknesses, such as the cost and impact of training on raw data, biases, hallucinations, and a lack of explainability, among others (Nah et al., 2023).
The Linguistic Linked Open Data (LLOD) (Chiarcos et al., 2013) community aims at creating and distributing explicitly structured data (modelled as RDF graphs) and at interlinking such data across languages. This collection of datasets, gathered inside the LLOD Cloud (Chiarcos et al., 2020), contains a huge amount of multilingual ontological (e.g., DBpedia (Lehmann et al., 2015)), lexical (e.g., DBnary (Sérasset, 2015), WordNet (McCrae et al., 2014), Wikidata (Vrandečić and Krötzsch, 2014)), and linguistic (e.g., Universal Dependencies Treebank (Nivre et al., 2020; Chiarcos et al., 2021), DBpedia Abstract Corpus (Brümmer et al., 2016)) information, structured using common metadata (e.g., OntoLex (McCrae et al., 2017), NIF (Hellmann et al., 2013)) and standardised data categories (e.g., LexInfo (Cimiano et al., 2011), OLiA (Chiarcos and Sukhareva, 2015)).
Both communities bring striking contributions that seem to be highly complementary. However, while (ontological) knowledge graphs are now routinely used in DL, there is still very little research studying the value of linguistic/lexical knowledge in the context of DL. We think that, today, there is a real opportunity to bring both communities together. Indeed, with a growing body of work on Graph Neural Networks (Wu et al., 2023) and embeddings of RDF graphs (Ristoski et al., 2019), there are more and more opportunities to apply DL techniques to build, interlink or enhance Linguistic Linked Open Datasets, to borrow data from the LLOD Cloud for enhancing neural models on NLP tasks, or to take the best of both worlds for specific NLP use cases.
Submission Topics
This workshop aims at gathering researchers who work on the interaction between DL and LLOD in order to discuss what each approach has to bring to the other. To this end, we welcome contributions on original work involving some of the following (non-exhaustive) topics:
- Deep Learning for Linguistic Linked Data, among which (but not exclusively):
  - Modelling, Resources & Interlinking
  - Relation Extraction
  - Corpus annotation
  - Ontology localization
  - Knowledge/Linguistic Graphs creation or expansion
- Linguistic Linked Data for Deep Learning, among which (but not exclusively):
  - Linguistic/Knowledge Graphs as training data
  - Fine tuning LLMs using Linguistic Linked (meta)Data
  - Graph Neural Networks
  - Knowledge/Linguistic Graphs embeddings
  - LLOD for model explainability/sourcing
  - Neural models for under-resourced languages
- Joint Deep Learning and Linguistic Data applications, among which (but not exclusively):
  - Use cases combining Language Models and Structured Linguistic Data
  - LLOD and DL for Digital Humanities
  - Question-Answering on graph data
All application domains (Digital Humanities, FinTech, Education, Linguistics, Cybersecurity…) as well as approaches (NLG, NLU, Data Extraction…) are welcome, provided that the work is based on the use of BOTH Deep Learning techniques and Linguistic Linked (meta)Data.
Important Dates
All deadlines are 11:59PM UTC-12:00 (“anywhere on Earth”)
- Final submissions due: 9th March 2024 (no further extension should be expected)
- Notification of acceptance: 2nd April 2024
- Camera-ready due: 9th April 2024