SemEval 2024 shared task on "Multilingual Detection of Persuasion Techniques in Memes"

Registration to the website is open. After registering, you can get access to the data and submit your predictions.

Memes are one of the most popular types of content used in online disinformation campaigns. They are mostly effective on social media platforms, where they can easily reach a large number of users. Memes in a disinformation campaign achieve their goal of influencing users through a number of rhetorical and psychological techniques, such as causal oversimplification, name calling, and smears.
The goal of the shared task is to build models for identifying such techniques in the textual content of a meme only (one subtask) and in a multimodal setting in which both the textual and the visual content are to be analysed together (two subtasks).

Technical Description

We refer to propaganda whenever information is purposefully shaped to foster a predetermined agenda. Propaganda uses psychological and rhetorical techniques to reach its purpose. Such techniques include the use of logical fallacies and appealing to the emotions of the audience. Logical fallacies are usually hard to spot since the argumentation, at first sight, might seem correct and objective. However, a careful analysis shows that the conclusion cannot be drawn from the premise without the misuse of logical rules. Another set of techniques makes use of emotional language to induce the audience to agree with the speaker only on the basis of the emotional bond that is being created, provoking the suspension of any rational analysis of the argumentation.

Memes consist of an image superimposed with text. The role of the image in a deceptive meme is either to reinforce or complement a technique in the text, or to convey one or more persuasion techniques by itself.

Technical Description

We defined the following subtasks:

Subtask 1 - Given only the “textual content” of a meme, identify which of the 20 persuasion techniques, organized in a hierarchy, it uses. If the ancestor node of a technique is selected, only a partial reward is given. This is a hierarchical multilabel classification problem. You can find a view of the hierarchy in the figure below (note that there are 22 techniques in the image, but "Transfer" and "Appeal to Strong emotion" are not present in subtask 1, so just picture the hierarchy without them). Full details on it are available here. If you need additional annotated data to solve this task, you can check the PTC corpus as well as the SemEval 2023 task 3 data.
Subtask 2a - Given a meme, identify which of the 22 persuasion techniques, organized in a hierarchy, are used both in the textual and in the visual content of the meme (multimodal task). If the ancestor node of a technique is selected, only partial reward will be given. This is a hierarchical multilabel classification problem. You can find info on the hierarchy here.
Subtask 2b - Given a meme (both the textual and the visual content), identify whether it contains a persuasion technique (at least one of the 22 techniques we considered in this task), or no technique. This is a binary classification problem. Note that this is a simplified version of subtask 2a in which the hierarchy is cut at the first two children of the root node.

Note that, for all subtasks, there will be three surprise test datasets in different languages (a fourth one in English will be released as well). The languages will be revealed only at the final stage of the shared task, i.e., together with the release of the test data. The goal is to test zero-shot approaches.

The hierarchy is a directed acyclic graph (DAG) that groups subsets of techniques sharing similar characteristics.

Hierarchy of the techniques for Subtask 2a (in Subtask 1 "Transfer" and "Appeal to Strong emotion" are not present). The hierarchy is also inspired by this document

Data Description

The corpus is available in your team page after registering for the shared task github page. Beware that the content of some memes might be considered offensive or too strong by some viewers. Subscribe to the task mailing list and the Twitter accounts (see the bottom of the page) to get updates on the task (the mailing list will be the official channel of communications).
Note that, for all subtasks, you are free to use the annotations of the PTC corpus (more than 20,000 sentences). The domain of that corpus is news articles, but the annotations are made using the same guidelines, although fewer techniques were considered. A similar, albeit multilingual, corpus is also available. In this case too, the domain of the corpus is news articles, in 9 languages. Beware that the number of techniques and the annotation guidelines are slightly different.

We provide a training set to build your systems locally. We will provide a development set and a public leaderboard to share your results in real time with the other participants involved in the task. We will further provide a test set (without annotations) and an online submission website to score your systems.

Input and Submission File Format

The input data for subtask 1 is the text extracted from the meme. The training, the development and the test sets for all subtasks are distributed as json files, one single file per subtask.

The input data for subtasks 2a and 2b, in addition to the text extracted from the meme, is the image of the meme itself. The images are distributed together with the subtask json in a zip file, which is available, upon registration, from the personal page of your team.

Here is an example of a meme:

Subtask 1

The entry for that example in the json file for subtask 1 is


		{
			"id": "125",
			"text": "I HATE TRUMP\n\nMOST TERRORIST DO",
			"labels": [
				"Loaded Language",
				"Name calling/Labeling"
			],
			"link": "https://..."
		},

where

id is the unique identifier of the example across all three subtasks
text is the textual content of the meme, as a single UTF-8 string. While the text is first extracted automatically from the meme, we manually post-process it to remove errors and to format it in such a way that each sentence is on a single row and blocks of text in different areas of the image are separated by a blank row. Note that task 1 is an NLP task since the image is not provided as an input.
labels is a list of valid technique names (the full list is available in your team page after registration) used in the text. Since these are the gold labels, they will be provided for the training set only. In this case two techniques were spotted: Loaded Language and Name calling/Labeling.
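
The layout of the text field described above (one sentence per row, blocks separated by a blank row) can be unpacked with plain string splitting. This is a minimal sketch; the function name split_meme_text is hypothetical and not part of any official tooling.

```python
def split_meme_text(text):
    """Split a meme's text into blocks, each a list of sentences.

    Blocks (text from different areas of the image) are separated by a
    blank row; within a block, each sentence sits on its own row.
    """
    blocks = []
    for raw_block in text.split("\n\n"):
        sentences = [row.strip() for row in raw_block.split("\n") if row.strip()]
        if sentences:
            blocks.append(sentences)
    return blocks


example_text = "I HATE TRUMP\n\nMOST TERRORIST DO"
blocks = split_meme_text(example_text)
print(blocks)  # [['I HATE TRUMP'], ['MOST TERRORIST DO']]
```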

A submission for subtask 1 is a single json file with the same format as the input file, but where only the fields id and labels are required. Note that if your algorithm detects no technique in a meme, the field "labels" should be an empty list.
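
As an illustration of the submission format, the sketch below serializes predictions keeping only the id and labels fields. It assumes the file is a json list of entries (as the trailing comma in the example entry suggests); the function name and the meme id "126" are made up for the example.

```python
import json


def build_submission(predictions):
    """Turn {meme id: predicted technique names} into submission entries."""
    return [
        {"id": meme_id, "labels": sorted(set(labels))}
        for meme_id, labels in predictions.items()
    ]


# Hypothetical predictions; meme "126" has no detected technique,
# so its "labels" field is an empty list.
predictions = {
    "125": ["Loaded Language", "Name calling/Labeling"],
    "126": [],
}
submission = build_submission(predictions)
serialized = json.dumps(submission, indent=2, ensure_ascii=False)
print(serialized)
```

The string in serialized can then be written to a file and uploaded from the team page.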

Subtask 2a

The input for subtask 2a is a json and a folder with the images of the memes. The entry in the json file for the meme above is


		{
			"id": "125",
			"text": "I HATE TRUMP\n\nMOST TERRORIST DO",
			"labels": [
				"Reductio ad hitlerum",
				"Smears",
				"Loaded Language",
				"Name calling/Labeling"
			],
			"image": "125_image.png",
			"link": "https://..."
		},

where image is the name of the file with the image of the meme in the folder. The meaning of id, text and labels is the same as for subtask 1. However, the list of technique names is different (the full list is available in your team page after registration). Note that the field labels will be provided for the training set only, since it corresponds to the gold labels. Notice, however, that now we are able to see the image of the meme, hence we might be able to spot more techniques. In this example, Smears and Reductio ad hitlerum become evident only after we are able to understand who the two sentences are attributed to. There are other cases in which a technique is conveyed by the image only (see the example with id 189 in the training set).
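
A possible way to pair each json entry with its image file is sketched below. It assumes the json file holds a list of entries; the file name subtask2a.json, the folder name images and the function name load_examples are placeholders, not the names used in the distributed zip file.

```python
import json
import tempfile
from pathlib import Path


def load_examples(json_path, image_dir):
    """Read a subtask json file and attach the path of each meme image."""
    entries = json.loads(Path(json_path).read_text(encoding="utf-8"))
    examples = []
    for entry in entries:
        examples.append({
            "id": entry["id"],
            "text": entry["text"],
            # Gold labels are provided for the training set only.
            "labels": entry.get("labels"),
            "image_path": Path(image_dir) / entry["image"],
        })
    return examples


# Demo on a throwaway copy of the example entry (file names are made up).
with tempfile.TemporaryDirectory() as tmp:
    json_path = Path(tmp) / "subtask2a.json"
    json_path.write_text(json.dumps([{
        "id": "125",
        "text": "I HATE TRUMP\n\nMOST TERRORIST DO",
        "labels": ["Smears", "Loaded Language"],
        "image": "125_image.png",
    }]), encoding="utf-8")
    examples = load_examples(json_path, Path(tmp) / "images")

print(examples[0]["image_path"].name)  # 125_image.png
```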

A submission for subtask 2a consists of a single json file with the same format as the input file, but where only the fields id and labels are required for each example.

Subtask 2b

Subtask 2b is the same as subtask 2a. However, it is going to be evaluated as a binary task: either at least one technique is present in the meme or no technique is present ("propagandistic" and "non_propagandistic", respectively). Note that these two labels correspond to the children of the root node of the hierarchy.

The entry for that example in the json file for subtask 2b is

 
		{
			"id": "125",
			"text": "I HATE TRUMP\n\nMOST TERRORIST DO",
			"label": "propagandistic"
		},
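
Since the binary label only states whether at least one technique is present, a system that already predicts a list of techniques for subtask 2a can derive a subtask 2b label from it. The helper below is a hypothetical sketch of that mapping, not a recommended approach.

```python
def to_binary_label(techniques):
    """Map a list of predicted techniques to the subtask 2b label."""
    return "propagandistic" if techniques else "non_propagandistic"


print(to_binary_label(["Smears", "Loaded Language"]))  # propagandistic
print(to_binary_label([]))                             # non_propagandistic
```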

Evaluation

Upon registration, participants will have access to their team page, where they can also download the scripts we use for computing the results on the leaderboard. You can use the scorers to test your models locally. Subtasks 1 and 2a depend on a hierarchy. Taking the figure above as reference, the gold label is always a leaf node of the DAG. However, any node of the DAG can be a predicted label:

If the prediction is a leaf node and it is the correct label, then a full reward is given. For example, Red Herring is predicted and it is the gold label as well.
If the prediction is NOT a leaf node and it is an ancestor of the correct gold label, then a partial reward is given (the reward depends on the distance between the two nodes). For example, the gold label is Red Herring and the predicted label is Distraction or Appeal to Logic.
If the prediction is not an ancestor node of the correct label, then a null reward is given. For example, the gold label is Red Herring and the predicted label is Black and White Fallacy or Appeal to Emotions. A graphical example is given here.
Note, however, that the hierarchy can be ignored by restricting the predictions to technique names only. This way, the task would be identical to SemEval 2023 task 3.
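
The three reward cases can be illustrated with a common formulation of a hierarchical F₁, in which both the gold and the predicted labels are extended with their ancestors before precision and recall are computed. The sketch below is an assumption about how such a measure can behave, not the official scorer, which may differ in details such as averaging over the dataset. The PARENTS map is a made-up fragment that uses only node names from the examples above; the direct edges between them are assumed.

```python
# Hypothetical fragment of the hierarchy: node -> list of parent nodes.
# A DAG allows several parents per node; the root is left out.
PARENTS = {
    "Red Herring": ["Distraction"],
    "Distraction": ["Appeal to Logic"],
    "Appeal to Logic": [],
    "Appeal to Emotions": [],
}


def with_ancestors(labels):
    """Return the labels together with all of their ancestors."""
    closed = set()
    stack = list(labels)
    while stack:
        node = stack.pop()
        if node not in closed:
            closed.add(node)
            stack.extend(PARENTS.get(node, []))
    return closed


def hierarchical_f1(gold, pred):
    """F1 computed on the ancestor-extended gold and predicted label sets."""
    gold_closed = with_ancestors(gold)
    pred_closed = with_ancestors(pred)
    overlap = len(gold_closed & pred_closed)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_closed)
    recall = overlap / len(gold_closed)
    return 2 * precision * recall / (precision + recall)


gold = ["Red Herring"]
for pred in (["Red Herring"], ["Distraction"], ["Appeal to Logic"], ["Appeal to Emotions"]):
    print(pred[0], round(hierarchical_f1(gold, pred), 2))
```

Under these assumptions the exact match scores 1.0, the closer ancestor Distraction scores 0.8, the more distant ancestor Appeal to Logic scores 0.5, and the unrelated node Appeal to Emotions scores 0.0, which mirrors the full, partial and null rewards described above.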

Here is a brief description of the evaluation measures the scorers compute.

Subtask 1

Subtask 1 is a hierarchical multilabel classification problem. Taking the figure above with the hierarchy as example, any node of the DAG can be a predicted label. The gold label is always a leaf node of the DAG. We use hierarchical-F₁ (see section 6) as the official evaluation measure. A graphical example of the evaluation function is available here.

Subtask 2a

Subtask 2a is a hierarchical multilabel classification problem. We use hierarchical-F₁ (see section 6) as the official evaluation measure. A graphical example of the evaluation function is available here.

Subtask 2b

Subtask 2b is a binary classification problem. The two labels indicate whether there is at least one persuasion technique in the meme or none. We use macro-F₁ as the official evaluation measure.
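
For reference, macro-F₁ is the unweighted mean of the per-class F₁ scores. The sketch below computes it by hand on made-up gold and predicted labels; the function name is hypothetical, and the scorer distributed on the team page remains the reference implementation.

```python
def macro_f1(gold, pred, labels=("propagandistic", "non_propagandistic")):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denominator = 2 * tp + fp + fn
        # A class that is never present nor predicted contributes 0 here.
        scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(scores) / len(scores)


# Made-up example with four memes.
gold = ["propagandistic", "propagandistic", "non_propagandistic", "non_propagandistic"]
pred = ["propagandistic", "non_propagandistic", "non_propagandistic", "non_propagandistic"]
print(round(macro_f1(gold, pred), 4))  # 0.7333
```

In this example the F₁ of "propagandistic" is 2/3 and the F₁ of "non_propagandistic" is 4/5, so their mean is about 0.7333.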

The final version of the hierarchy will also be inspired by this document

How to Participate

Go to the registration page and fill in the form. You'll be able to access the data and have the possibility to submit your predictions.
When filling in the form, think carefully about the name of your team. Changing the name is complicated in the current version of the website, and doing so at the last minute confuses other participants who want to refer to your team in their papers. Therefore, team name changes will not be allowed.
In order to disseminate the results, we give the chance to the participants to share a link to a paper or a website describing their systems (the link can be updated in the team page at any time).
You will get an email with your team passcode. If you do not receive the email, check your SPAM folder first and then send us an email. We recommend you write down the passcode (and bookmark your team page).
We promise to use your email only to send you updates on the corpus or to let you know if we organise any event on the topic.
Use the passcode on the top-right box to enter your team page. There you can download the data and submit your runs.
You may not share your passcode with others or give access to the dataset to unauthorised users.
Phase 1. Submit your predictions on the development set to check how your performance evolves. You will get immediate feedback for each submission and you can check other participants' performances.
Avoid submitting an abnormal number of submissions with the purpose of guessing the gold labels.
Manual predictions are forbidden; the whole process should be automatic.
Phase 2. Once the test set is available, you will be able to submit your predictions on it, but you won't get any feedback until the end of the evaluation phase.
You can make as many submissions as you like, but we will evaluate only the latest one. We plan to reopen and keep the leaderboard open after the end of the shared task in order to keep track of the state-of-the-art approaches. This implies that we don't plan to release the gold labels for the test set right after the end of phase 2. If you plan to do error analysis, do so on the development set.
Once phase 2 is over, any team, regardless of the number of subtasks it participated in, will be able to write a paper describing its approach.
The dataset may include content which is protected by copyright of third parties.
- The dataset may be used only for scientific research purposes. The dataset may not be redistributed or shared in part or full with any third party. Any other use is explicitly prohibited.
- When writing your paper, do not use any of the images in the dataset. We (you and us organisers) will recreate some of the memes to avoid copyright claims.

Dates

The deadlines related to SemEval-level events, i.e. paper submission, notification to authors and camera-ready are only indicative. See SemEval website for up-to-date ones.

August 12	Registration opens
September 11 (originally September 7)	Release of the training set with gold labels. Release of the development set
January 13, 2024	Release of the gold labels of the dev set
January 20, 2024	Release of the test set
January 31, 2024	Registration closes
January 31, 2024 at 23:59 (Anywhere on Earth)	Test submission site closes
February 19, 2024	Paper Submission Deadline
March 18, 2024	Notification to authors
April 1, 2024	Camera ready papers due
TBA	SemEval 2024 workshop

Contact

We have created a google group for the task. Join it to ask any question and to interact with other participants.

If you need to contact the organisers only, send us an email.

Organisation:

Dimitar Dimitrov, Sofia University "St. Kliment Ohridski"
Giovanni Da San Martino, University of Padova, Italy
Preslav Nakov, Mohamed bin Zayed University of Artificial Intelligence, UAE
Firoj Alam, Qatar Computing Research Institute, HBKU, Qatar
Maram Hasanain, Qatar Computing Research Institute, HBKU, Qatar
Abul Hasnat, Blackbird.ai
Fabrizio Silvestri, Sapienza University, Rome

SemEval 2024 Task 4 "Multilingual Detection of Persuasion Techniques in Memes"
