wg3:wg3_meeting_2025-03-12_edit
wg3:wg3_meeting_2025-03-12_edit [2025/03/12 06:30] – created gulsen.eryigit | wg3:wg3_meeting_2025-03-12_edit [2025/03/12 11:40] (current) – gulsen.eryigit | ||
---|---|---|---|
Line 3: | Line 3: | ||
Subtask reports: | Subtask reports: | ||
- | Task 3.2: Shared task on morphosyntactic parsing | + | * Task 3.2: Shared task on morphosyntactic parsing |
(Omer Goldman, Leonie Weissweiler, | (Omer Goldman, Leonie Weissweiler, | ||
- | Google group | + | [[https:// |
- | Training data and future evaluation code | + | [[https:// |
- | UniDive webpage | + | [[https:// |
- | Task 3.4: Evaluation campaign PARSEME 2.0 | + | |
(Manon Scholivet, Agata Savary) | (Manon Scholivet, Agata Savary) | ||
- | Task 3.5: Evaluation campaign AdMIRe | + | * Task 3.5: Evaluation campaign AdMIRe |
(Thomas Pickard, Aline Villavicencio) | (Thomas Pickard, Aline Villavicencio) | ||
General discussion | General discussion | ||
Next meeting: May 6, 12:00 CEST (online) | Next meeting: May 6, 12:00 CEST (online) |
- | List of Participants | + | ====== |
- | Gülşen Eryiğit (chair) | + | |
- | Joakim Nivre (co-chair) | + | |
- | Roberto Antonio Díaz Hernández | + | |
- | Ali Basirat | + | |
- | Csilla Horváth | + | |
- | Manon Scholivet | + | |
- | Rob van der Goot | + | |
- | Agata Savary | + | |
- | Ranka Stanković | + | |
- | Thomas Pickard | + | |
- | Aline Villavicencio | + | |
- | Tanja Samardzic | + | |
- | Dawit J | + | |
- | Alina Wróblewska | + | |
- | Luka Terčon | + | |
- | Olha Kanishcheva | + | |
- | Dan Zeman | + | |
- | Takuya Nakamura | + | |
- | Federica Gamba | + | |
- | Carlos Ramisch | + | |
- | Flavio Massimiliano Cecchini | + | |
- | Gosse Bouma | + | |
- | Rusudan Makhachashvili | + | |
- | Voula Giouli | + | |
- | Ebru Çavuşoğlu | + | |
- | Omer Goldman | + | |
- | Reut Tsarfaty | + | |
- | Chaya Liebeskind | + | |
- | Faruk Mardan | + | |
- | Adriana Pagano | + | |
- | Ilan Kernerman | + | |
- | Kutay Acar | + |
- | Ludmila Malahov | + | |
- | Teresa Lynn | + | |
- | Lucía Amorós-Poveda | + | |
- | PARSEME shared task (Manon, Agata) | + | |
- | subtask 1 (PARSEME 2.0) | + | |
- | quite established framework | + | |
- | novelty: non-verbal MWEs, diversity measures | + | |
- | subtask 2 (MWE generation) | + | |
- | Given a context from which an MWE has been removed, restore that MWE | + |
- | Problems: how to evaluate the system | + | |
- | [ALINE] Consider taking into account the level of difficulty of the items? For example, some items will be more ambiguous and more difficult to determine | + | |
- | [JOAKIM] It is unclear which capability of the models we would be testing | + |
- | [TOM] Very difficult to evaluate, even manually. | + | |
- | subtask 3 (MWE comprehension/ | + | |
- | Given a sentence and a span of a potential idiomatic expression, | + |
- | [GULSEN] There are some datasets for this task. Maybe the 3rd category complicates things. | + |
- | [JOAKIM] | + | |
- | [TOM] The same as SemEval 2022 (EN, PT, Galician). There are artefact issues (the models don’t really pay attention to the context). | + | |
- | subtask 4 (paraphrasing) | + | |
- | Given a sentence, rephrase it so that there are no MWEs | + | |
- | [AGATA] The input should be raw text, without a span. Objective: simplification of a text. | + | |
- | [JOAKIM] The most natural task among subtasks 2, 3 and 4. Close to what people do with LLMs. | + |
- | Can we avoid doing manual evaluation? (LLM as judge) | + | |
- | [TOM] His favorite | + | |
- | [ALINE] They work with questionnaires for humans for this problem. There is a synonym dataset. Another task: collect sentences with synonyms of MWEs. | + | |
- | [ALINE] Sometimes the simplest way to express a meaning is with an MWE. | + |
- | Questions: | + | |
- | Which subtasks to choose? | + | |
- | How to evaluate them? | + | |
- | AdMIRe extension | + | * Gülşen Eryiğit (chair) |
- | Tom’s [[slides]][[https:// | + | * Joakim Nivre (co-chair) |
- | [[Task website]][[https:// | + | * Roberto Antonio Díaz Hernández |
- | Data curation guidelines & notes | + | * Ali Basirat |
+ | * Csilla Horváth | ||
+ | * Manon Scholivet | ||
+ | * Rob van der Goot | ||
+ | * Agata Savary | ||
+ | * Ranka Stanković | ||
+ | * Thomas Pickard | ||
+ | * Aline Villavicencio | ||
+ | * Tanja Samardzic | ||
+ | * Dawit J | ||
+ | * Alina Wróblewska | ||
+ | * Luka Terčon | ||
+ | * Olha Kanishcheva | ||
+ | * Dan Zeman | ||
+ | * Takuya Nakamura | ||
+ | * Federica Gamba | ||
+ | * Carlos Ramisch | ||
+ | * Flavio Massimiliano Cecchini | ||
+ | * Gosse Bouma | ||
+ | * Rusudan Makhachashvili | ||
+ | * Voula Giouli | ||
+ | * Ebru Çavuşoğlu | ||
+ | * Omer Goldman | ||
+ | * Reut Tsarfaty | ||
+ | * Chaya Liebeskind | ||
+ | * Faruk Mardan | ||
+ | * Adriana Pagano | ||
+ | * Ilan Kernerman | ||
+ | * Kutay Acar | ||
+ | * Ludmila Malahov | ||
+ | * Teresa Lynn | ||
+ | * Lucía Amorós-Poveda | ||
+ | ====== PARSEME shared task (Manon, Agata) ====== | ||
+ | |||
+ | * subtask 1 (PARSEME 2.0) | ||
+ | * quite established framework | ||
+ | * novelty: non-verbal MWEs, diversity measures | ||
+ | * subtask 2 (MWE generation) | ||
+ | * Given a context from which an MWE has been removed, restore that MWE | ||
+ | * Problems: how to evaluate the system | ||
+ | * [ALINE] Consider taking into account the level of difficulty of the items? For example, some items will be more ambiguous and more difficult to determine | ||
+ | * [JOAKIM] It is unclear which capability of the models we would be testing | ||
+ | * [TOM] Very difficult to evaluate, even manually. | ||
+ | * subtask 3 (MWE comprehension/ | ||
+ | * Given a sentence and a span of a potential idiomatic expression, | ||
+ | * [GULSEN] There are some datasets for this task. Maybe the 3rd category complicates things. | ||
+ | * [JOAKIM] | ||
+ | * [TOM] The same as SemEval 2022 (EN, PT, Galician). There are artefact issues (the models don’t really pay attention to the context). | ||
+ | * subtask 4 (paraphrasing) | ||
+ | * Given a sentence, rephrase it so that there are no MWEs | ||
+ | * [AGATA] The input should be raw text, without a span. Objective: simplification of a text. | ||
+ | * [JOAKIM] The most natural task among subtasks 2, 3 and 4. Close to what people do with LLMs. | ||
+ | * Can we avoid doing manual evaluation? (LLM as judge) | ||
+ | * [TOM] His favorite | ||
+ | * [ALINE] They work with questionnaires for humans for this problem. There is a synonym dataset. Another task: collect sentences with synonyms of MWEs. | ||
+ | * [ALINE] Sometimes the simplest way to express a meaning is with an MWE. | ||
+ | * Questions: | ||
+ | * Which subtasks to choose? | ||
+ | * How to evaluate them? | ||
+ | |||
+ | ====== AdMIRe extension ====== | ||
+ | * Tom’s [[https:// | ||
+ | * [[https:// | ||
+ | * [[https:// | ||
wg3/wg3_meeting_2025-03-12_edit.1741757448.txt.gz · Last modified: by gulsen.eryigit