====== Working Group 3: Multilingual and Cross-Lingual Language Technology ======

  * **Leader**: [[https://jnivre.github.io|Joakim Nivre]] (Sweden)
  * **Vice-leader**: [[https://web.itu.edu.tr/gulsenc/|Gülşen Cebiroğlu Eryiğit]] (Turkey)

==== Workplan ====

Unified modelling helps solve NLP tasks with higher accuracy and better awareness of diversity. This WG is therefore dedicated to NLP, coordinating the development of tools that leverage universality and promote diversity:

  - Multilingual and cross-lingual **syntactic parsers** which:
    * pay attention to hard and underrepresented phenomena (unbounded dependencies, MWEs, ...),
    * leverage transfer of annotations or models in order to cope with data scarcity;
  - Prototypes of multilingual and cross-lingual **semantic parsers** which:
    * derive bi-lexical semantic dependencies from syntactic trees,
    * resolve idiosyncrasies in the syntax-semantics interface;
  - Multilingual **MWE discovery** tools which:
    * exploit large non-annotated data to compensate for the sparseness of MWEs in annotated corpora,
    * are coupled with both lexicons and MWE identifiers;
  - Multilingual **MWE identifiers** which:
    * are coupled with MWE discovery tools and lexicons to better handle unseen data,
    * pay attention to underrepresented phenomena, e.g., discontinuity/variability of MWEs;
  - Prototypes of tools for automatic identification of idiosyncratic **constructions**.

The tools themselves will be funded at the national level. WG3 will provide a federating effect for these activities, notably by organizing multilingual **evaluation campaigns** on parsing and MWE identification. Diversity-based evaluation measures from WG4 will be promoted. The outcomes should validate the computational tractability of the terminologies unified in WG1.

==== Members and organisation ====

  * [[https://www.cost.eu/actions/CA21167/#tabs+Name:Working%20Groups%20and%20Membership|List]] of current WG3 members
  * Activities are currently structured around four primary [[#wg3 tasks|WG3 tasks]] detailed below, but proposals for new activities are always welcome.

==== Upcoming meetings ====

==== Minutes of past meetings ====

  * [[wg3:wg3_meeting_2023-03-17|WG3 Meeting 1 Minutes]] **16-17 March 2023**, Paris-Saclay University, France (co-located with [[meetings:general_meetings:1st_unidive_general_meeting|UniDive 1st general meeting]])
  * [[wg3:wg3_meeting_2023-09-08|WG3 Meeting 2 Minutes]] **8 September 2023**, Istanbul Technical University, Türkiye
  * [[wg3:wg3_meeting_2023-11-20|WG3 Meeting 3 Minutes]] **20 November 2023**, online
  * [[wg3:wg3_meeting_2023-12-18|WG3 Meeting 4 Minutes]] **18 December 2023**, online
  * [[wg3:wg3_meeting_2024-01-15|WG3 Meeting 5 Minutes]] **15 January 2024**, online
  * [[meetings:general_meetings:2nd_unidive_general_meeting|WG3 Meeting 6 (including joint meetings with WG1 and WG4)]] **9 February 2024**, University of Naples L'Orientale, Italy
  * [[wg3:wg3_meeting_2024-03-11|WG3 Meeting 7 Minutes]] **11 March 2024**, online
  * [[wg3:wg3_meeting_2024-04-29|WG3 Meeting 8 Minutes]] **29 April 2024**, online
  * [[https://docs.google.com/document/d/1RHzoiwi1_w6dx09GCd0gPPDtz1EG2zo05w2X_aZED-E/edit|WG3 Meeting 9 Minutes]] **10 June 2024**, online

==== WG3 Tasks ====

  * **Task 3.1 Documentation of multilingual tools and resources**
    * __Leaders / Contacts__: A. Seza Doğruöz, Teresa Lynn, Maria Giagkou
    * __Objectives__: Assessing the "discoverability" of NLP tools and resources, and analyzing the NLP tool availability in the ELG catalogue.
    * __Workplan__: Data Collection and Analysis.
    * __How can I contribute?__ Please fill out the shared [[https://docs.google.com/spreadsheets/d/17_5jhUWeYy7WD6OY79Kdyn3ngoB-3Y82YF0yaRpQSv8/edit#gid=0|document]] for your focus languages, following the task description on slide three of this {{ :wg3:wg3_doctask_April2024.pdf |presentation }}.
    * __Documents / Links__: {{ :wg3:wg3_doctask_April2024.pdf |task description}}
  * **Task 3.2 Evaluation campaign: morphosyntactic parsing**
    * __Leaders / Contacts__: Omer Goldman, Leonie Weissweiler, Reut Tsarfaty
    * __Objectives__: Organization of the first WG3 evaluation campaign on morphosyntactic parsing, which aims to combine syntactic parsing with morphological analysis in a way that avoids (most) theoretical debates on word boundaries.
    * __Workplan__: Data preparation in 2024, shared task in 2025.
    * __How can I contribute?__ You can contribute to data preparation and join the campaign by writing to msap-discussion-group@googlegroups.com
    * __Documents / Links__: ({{ :wg3:EvalCampaign_Omer.pdf|abstract}}, {{ :wg3:EvalCampaign_Omer_Slides.pdf|slides}})
  * **Task 3.3 Conceptions of multilinguality**
    * __Leaders / Contacts__: Adriana Pagano (apagano@letras.ufmg.br), Ilan Kernerman (ilan@lexicala.com)
    * __Objectives__: Define the concepts of //multilingual//, //cross-lingual//, and //translingual// in the context of Language Technology.
    * __Workplan__: Devise a survey for the Action members, then analyze and publish the results.
    * __How can I contribute?__ If you would like to contribute to the data analysis, please contact the task leaders.
    * __Documents / Links__: [ {{ :wg3:WG3_ConceptMulti_April2024.pdf | survey results }} ]

==== Channels ====

  * [[https://unidive.lisn.upsaclay.fr/doku.php?id=mailing_lists|WG3 mailing list]] for general announcements and proposals
  * [[https://t.me/+496XKdbSyqI1MDli|WG3 Telegram group]] for special announcements and discussions