
๐Ÿ‘Ž๐Ÿ‘ LeWiDi

Download the data

๐Ÿ‘Ž๐Ÿ‘ News


๐Ÿ‘Ž๐Ÿ‘ Overview

In recent years, the assumption that natural language (NL) expressions have a single and clearly identifiable interpretation in a given context has increasingly been recognized as just a convenient idealization. The objective of the Learning with Disagreement shared task is to provide a unified testing framework for learning from disagreements, using datasets containing information about disagreements in interpreting language. Learning with Disagreement (Le-Wi-Di) 2021 created a benchmark consisting of 6 existing and widely used datasets, focused primarily on semantic ambiguity and image classification.

For SemEval 2023, we ran a second shared task on the topic of Learning with Disagreements. Compared to the first edition:

  1. the focus is entirely on subjective tasks, where training with aggregated labels makes much less sense, and
  2. while relying on the same infrastructure, it involves new datasets.

We believe that the shared task is extremely timely, given the current high degree of interest in subjective tasks such as offensive language detection in general, and in the issue of disagreement in such data in particular (Basile et al., 2021; Leonardelli et al., 2021; Akhtar et al., 2021; Davani et al., 2022; Uma et al., 2021).


๐Ÿ‘Ž๐Ÿ‘ The Datasets

To this end, we collected a benchmark of four (textual) datasets with different characteristics in terms of genre (social media and conversations), language (English and Arabic), task (misogyny, hate speech, and offensiveness detection) and annotation method (experts, specific demographic groups, AMT crowd). All datasets, however, provide a multiplicity of labels for each instance.

The four datasets presented are:


๐Ÿ‘Ž๐Ÿ‘ Aim of the task and data format

We encourage participants to develop methods able to capture agreement and disagreement, rather than focusing on developing the best-performing model. To this end, we release all datasets in a harmonized JSON format: features that are common to all datasets are provided in a homogeneous format, making it easier for participants to test their methods across all the datasets.

Among the information released that is common to all datasets, and of particular relevance for the task, are the disaggregated crowd-annotation labels and the annotators' references. Moreover, dataset-specific information is also released and varies for each dataset: from the demographics of the annotators (ArMIS and HS-Brexit datasets), to the other annotations made by the same annotators within the same dataset (all datasets), or additional annotations given for the same item by the same annotator (HS-Brexit and ConvAbuse datasets). Participants can leverage this dataset-specific information to improve performance on a specific dataset.
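Because all datasets share the same harmonized layout, a single loader can be reused across them. The sketch below is only illustrative: the field names ("annotations", "annotators") and the file name are assumptions about the released JSON, not a verbatim specification, so please check the downloaded data for the exact schema. It shows the general idea of deriving a soft label (a distribution over labels) from the disaggregated per-annotator labels of each item.

```python
import json
from collections import Counter

# Illustrative sketch only: field names and file names below are assumptions
# about the harmonized LeWiDi JSON layout, not the official schema.

def load_items(path):
    """Load a LeWiDi-style JSON file mapping item ids to item records."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)

def soft_label_from_annotations(item):
    """Turn the disaggregated labels of one item into a label -> fraction map."""
    labels = [a.strip() for a in item["annotations"].split(",")]  # assumed comma-separated
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}

if __name__ == "__main__":
    items = load_items("HS-Brexit_train.json")  # hypothetical file name
    for item_id, item in list(items.items())[:3]:
        print(item_id, soft_label_from_annotations(item))
```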


๐Ÿ‘Ž๐Ÿ‘ The competition

The shared task was hosted on Codalab. Please refer to the Codalab platform for more detailed information about the competition.


๐Ÿ‘Ž๐Ÿ‘ Organisers


๐Ÿ‘Ž๐Ÿ‘ Communication

Contact us directly if you have further inquiries. Join our Google group for news about the task. Follow us on Twitter for news about learning with disagreements and more!


๐Ÿ‘Ž๐Ÿ‘ Previous Editions