====== PARSEME 2.0 Multilingual Shared Task on Identification and Paraphrasing of Multiword Expressions ======

  * **Event title**: culminating workshop of the PARSEME 2.0 shared task
  * **Proposal**: submitted to [[https://semeval.github.io/SemEval2026/|SemEval 2026]]
  * **Location**: TBA
  * **Dates**: TBA
  * **Data**: to be provided by the ongoing [[:wg1:wg1:task1.2:parseme-2.0-annotation-campaign|PARSEME/UniDive annotation campaign on multiword expressions]]
  * **Shared task organizers**:
    * Manon Scholivet, Université Paris-Saclay, France
    * Takuya Nakamura, Université Paris-Saclay, France
    * [[https://perso.lisn.upsaclay.fr/savary/|Agata Savary]], Université Paris-Saclay, France
    * Eric Bilinski, Université Paris-Saclay, France
    * [[https://pageperso.lis-lab.fr/carlos.ramisch/|Carlos Ramisch]], Aix-Marseille Université, France

|[[https://www.cost.eu/|{{ :cost_logo_rgb_lowresolution-cropped.jpg?100 |}}]]|{{ :en-funded_by_the_eu-pos.png?200 |}}|[[https://www.universite-paris-saclay.fr/|{{:other-events:logo-univ-saclay.png?100|}}]]|[[https://www.lisn.upsaclay.fr/|{{:other-events:logo-lisn.jpeg?100|}}]]|[[https://www.univ-amu.fr/|{{:other-events:logo-amu.png?100|}}]]|[[https://www.lis-lab.fr/|{{:other-events:logo-lis.jpg?100|}}]]|

===== Subtask 1: MWE identification =====

This subtask is an extension of [[https://gitlab.com/parseme/corpora/-/wikis/home#shared-tasks|PARSEME shared tasks]] on automatic identification of verbal MWEs.

  * Task: Given a raw text, automatically underline MWEs in it
  * Data: PARSEME 2.0 annotated corpora (not necessarily all the texts from release 1.3)
  * Language teams willing to participate with PARSEME data
    * Albanian
    * Egyptian (ca. 2700-2000 BC): MWEs from the UD-EUJA treebank.
    * Georgian
    * Greek (Modern)
    * Greek (Ancient)
    * Hebrew
    * Japanese
    * Lithuanian
    * Persian (Farsi)
    * Polish
    * Romanian
    * Serbian
    * Slovene
    * Swedish
    * Ukrainian
  * Minimum annotation effort: 2000 annotated MWEs

===== Subtask 2: MWE paraphrasing =====

  * Task: Given a sentence with an MWE, rephrase the sentence so that it contains no MWEs but keeps the same meaning
  * Examples:
    * //She made up her mind to…// => //She finally decided to…//
    * //He kicked the bucket// => //He died// (But __not__ //He passed away//)
  * Data:
    * Selected sentences from PARSEME annotated corpora
    * The same sentences manually paraphrased
    * One to several hundred examples per language
  * Language teams willing to participate
    * Albanian
    * French
    * Greek (Modern)
    * Japanese
    * Hebrew
    * Lithuanian
    * Persian (Farsi)
    * Polish
    * Brazilian Portuguese
    * Romanian
    * Serbian
    * Slovene
    * Swedish
    * Ukrainian

===== Timeline =====

  * 31 March 2025: SemEval proposal submission
  * 19 May 2025: SemEval notification
  * 15 July 2025: Sample data ready
  * 1 September 2025: Training data ready
  * 1 December 2025: Evaluation data ready (internal deadline; not for public release)
  * 10 January 2026: Evaluation start
  * 31 January 2026: Evaluation end (latest date; task organizers may choose an earlier date)
  * February 2026: Paper submission
  * March 2026: Notification to authors
  * April 2026: Camera ready
  * Summer 2026: SemEval workshop (co-located with a major NLP conference)