wg1:wg1

Differences

This shows you the differences between two versions of the page.

wg1:wg1 [2024/01/29 10:01] – [Upcoming meetings] kaja.dobrovoljc
wg1:wg1 [2024/04/10 14:25] (current) – [WG1 Tasks] bruno.guillaume
Line 25: Line 25:
 ==== Members and organisation ====
   * [[https://www.cost.eu/actions/CA21167/#tabs+Name:Working%20Groups%20and%20Membership|List]] of current WG1 members
-  * Expression of interest in WG1 tasks [[https://docs.google.com/spreadsheets/d/1ANohImm94mhug_Sf0n7StIKQVipW2_63n93Vhl-dP4w/edit#gid=21715724|[lists per task]]] - proposals for new tasks are welcome
-    * Task 1.1: Linguistic typology and multilingual corpus annotation
-    * Task 1.2: Extensions and updates to MWE annotation guidelines and UD-PARSEME unification
-    * Task 1.3: Extensions and updates to morphosyntactic annotation guidelines
-    * Task 1.4: Sharing tools, formats, and infrastructure
 +  * Activities are currently structured around four primary [[#wg1 tasks|WG1 tasks]] detailed below, but proposals for new activities are always welcome.
        
  
 ==== Upcoming meetings ====
-  * **[[meetings:general_meetings:2nd_unidive_general_meeting|WG1 Meeting 7, 7th February 2024]]**co-located with the [[meetings:general_meetings:2nd_unidive_general_meeting|2nd General Meeting]] in Naples on 8-Feb 2024
 +  * WG Meeting 8 (online) - 11 April 2024, 09:00 **CEST**
 +  * WG Meeting 9 (online) - 11 June 2024, 13:30 **CEST** 
  
  
 ==== Minutes of past meetings ====
 +  * WG1 Meeting 7 (Naples, Italy) - 7 February 2024: co-located with the [[meetings:general_meetings:2nd_unidive_general_meeting|2nd General Meeting]] in Naples on 8-9 Feb 2024, [[https://docs.google.com/presentation/d/1ygvOkl3MymPtEB-Wt6OA66Di5pvZBaAPrWZHnAel1T8/edit?usp=sharing|[short report]]]
   * WG1 Meeting 6 (online) - 17 January 2024: Presentation of the WG1 activities in Naples [[https://docs.google.com/document/d/1jX70gHQTBAl80H3fLLOGE9SSPS6ugL4G0iP-tRBF2MY/edit?usp=sharing|[minutes]]]
   * WG1 Meeting 5 (online) - 20 December 2023: Updates on WG1 tasks and discussion of the activities proposed for Naples [[https://docs.google.com/document/d/1YifrnPGam_nUJU2SBMPU5D8uRV1oJsLTV8OVdtpmG3U/edit?usp=sharing|[minutes]]]
Line 45: Line 45:
  
  
-==== Documents ====
-  * **Task 1.1:** Linguistic typology and multilingual corpus annotation
 +==== WG1 Tasks ====
 +  * **Task 1.1: Linguistic typology and multilingual corpus annotation**
       * [[https://docs.google.com/document/d/1fNRToU-LR7MQAQl3CzkDHqxxPXDBy5sWZAByxiXVyQg/edit?usp=sharing|Minutes]] from the task meetings
 +      * [[https://docs.google.com/document/d/1QbO0bTfWXSIIuD5M-W_nmy-62m6XGHkVjvV7ta2aHag/edit|Agenda]] and [[https://docs.google.com/presentation/d/1ygvOkl3MymPtEB-Wt6OA66Di5pvZBaAPrWZHnAel1T8/edit#slide=id.g2b7b733a998_0_6|report]] from the Naples 2024 meeting
  
-  * **Task 1.2** on MWE annotation guidelines and UD-PARSEME unification
-    * [[https://docs.google.com/document/d/1jvOGO2Q_pJpm1rB0B6sAprKzEh2n95Jc-VTktaAW_j8/edit?usp=sharing|Minutes]] from the task meetings
-    * White paper proposition the [[https://nejlt.ep.liu.se/article/view/4453|roadmap for UD/PARSEME unification]]
 +  * **Task 1.2 on MWE annotation guidelines and UD-PARSEME unification**
 +    * __Leaders / Contacts__: Agata Savary, Voula Giouli, Stella Markantonatou, Sara Stymne, Carlos Ramisch
 +    * __Objectives__: Model and annotate multiword expressions in a way that is unified across many languages. Make the UD and PARSEME initiatives converge in this respect.
 +    * __Workplan__: After performing pilot annotation based on the [[https://docs.google.com/document/d/1bvjSwHpj8I2zJXmftCpx19u3BNWdKtdeg21f4YVHhWw/edit#heading=h.gt53hu7d9q5p|draft guidelines for nominal MWEs]] during WG1 Day in Naples, we are currently working on transforming the annotator feedback into GitLab issue discussions. We plan to have a consolidated version of the guidelines for all MWEs (verbal, nominal, modifier, functional) ready by the end of 2024, so as to conduct a large-scale annotation campaign in early 2025. The results will be used for organizing a shared task on automatic MWE identification.
 +    * __How can I contribute:__ <TBA>
 +    * __Documents / Links__
 +        * [[https://docs.google.com/document/d/1jvOGO2Q_pJpm1rB0B6sAprKzEh2n95Jc-VTktaAW_j8/edit?usp=sharing|Minutes]] from the Task 1.2 meetings
 +        * White paper proposition of the [[https://nejlt.ep.liu.se/article/view/4453|roadmap for UD/PARSEME unification]]
      
-  * **Task 1.3:** Extensions and updates to morphosyntactic annotation guidelines
 +  * **Task 1.3: Extensions and updates to morphosyntactic annotation guidelines**
       * [[https://docs.google.com/document/d/1Z6MkRiOWWud5Yj5DIY2KH-pEV5VCZwWhqs4IVuMqovc/edit?usp=sharing|Minutes]] from the task meetings
 +      * [[https://docs.google.com/document/d/1V2844LA8VU76T6vojQ4LEVxYgB_sI4AZ1_QF1WVQIkE/edit#heading=h.jepvhma8ziah|Agenda]] and [[https://docs.google.com/presentation/d/1ygvOkl3MymPtEB-Wt6OA66Di5pvZBaAPrWZHnAel1T8/edit#slide=id.g2b7b733a998_0_19|report]] from the Naples 2024 meeting
  
 +  * **Task 1.4: Sharing tools, formats, and infrastructure** 
 +    * __Leaders / Contacts__: Frantisek Forgac, Bruno Guillaume 
 +    * __Objectives__: The general objective of the task is to improve the technical part of annotation activities, focusing on tools, file formats and storage infrastructures. We are currently focusing on two more specific objectives:
 +       * Subtask **A**: Provide an overview of existing software and/or tools that support manual linguistic annotation
 +       * Subtask **B**: Evaluate the pros and cons of tabular formats (such as CoNLL-U) currently used in the UD and PARSEME projects
 +    * __Workplan__:  
 +       * Subtask **A**: The specific objective is to create a comparison table of available manual annotation tools, with a focus on UD and PARSEME interests (i.e. morpho-syntactic and multiword expression annotations). The next steps are:
 +          * Consolidate the set of features to be used in the comparison (the rows of the table)
 +          * Create a survey to collect information about each annotation tool 
 +          * Analyse the results of the survey and produce the final version of the table. 
 +       * Subtask **B**: Conduct a detailed analysis of the advantages and disadvantages of the tabular annotation formats, specifically CoNLL-U, as utilized in the Universal Dependencies (UD) and PARSEME projects. The next steps are: 
 +          * Develop a Schema/Definition for Structured Data Format: Consider framing this as part of a shared task in the future 
 +          * Refine Data Encoding Standards: Currently, UD prescribes both WHAT to encode (the content) and HOW to encode it (the format). Ideally, these aspects should be decoupled: 
 +             * The format should dictate HOW to encode data, providing the structural means. 
 +             * Guidelines like UD or others should specify WHAT can be encoded, focusing on content restrictions. This separation would make the format more flexible and adaptable to new types of annotation, while the guidelines ensure the relevance of the data.
 +          * Generate Initial Working Examples 
 +             * Convert existing datasets to test the new format. 
 +             * Evaluate and compare these results with those of CoNLL-U and possibly enhanced formats such as CoNLL-U Plus. 
 +    * __How can I contribute?__  
 +      * Join the ongoing discussions on GitHub (links below)
 +      * Stay tuned for the call to complete the survey 
 +      * Join the task co-leaders team 
 +    * __Documents__ 
 +      * [[https://docs.google.com/spreadsheets/d/1FZo6sSdIkxXCm9p9FcV8PzVKCeXot6apnnA-zXYhRwk/edit#gid=0|Comparison table]] (WIP) 
 +      * GitHub discussions about [[https://github.com/UniDive/WG1/discussions/1|the comparison table]] and about [[https://github.com/UniDive/WG1/discussions/2|file formats]] 
 +      * Document used in the Task 1.4 session at the WG1 meeting in Naples (February 2024): [[https://docs.google.com/presentation/d/1mCdRAEb7KDgvJEd_QXwzgJHv2Jc3KGOnInFEERQmSUc/edit#slide=id.g2b694e49d96_0_0|Slides]] and [[https://docs.google.com/document/d/1H0-C2bqSD5EzoISxUYnfE-5ZLMhuE8XOa-FrfZANDfk/edit#heading=h.pmv33xdtvdy1|Agenda]]
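
The separation between format (HOW) and guidelines (WHAT) discussed under Subtask B can be illustrated with a minimal Python sketch. It assumes only the ten standard CoNLL-U columns. The names `parse_token_line` and `check_upos`, and the deliberately incomplete tag set, are hypothetical illustrations and not part of any UniDive, UD or PARSEME tool.

```python
# The ten standard CoNLL-U columns, in order.
CONLLU_FIELDS = ("ID", "FORM", "LEMMA", "UPOS", "XPOS",
                 "FEATS", "HEAD", "DEPREL", "DEPS", "MISC")

# Illustrative subset of UD universal POS tags (assumption: incomplete on purpose).
ALLOWED_UPOS = {"NOUN", "VERB", "ADJ", "DET", "PUNCT"}


def parse_token_line(line, fields=CONLLU_FIELDS):
    """HOW: split one tab-separated token line into named columns."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(fields):
        raise ValueError(f"expected {len(fields)} columns, got {len(values)}")
    return dict(zip(fields, values))


def check_upos(token, allowed=ALLOWED_UPOS):
    """WHAT: a guideline-level check that does not depend on the file layout."""
    return token["UPOS"] in allowed


line = "1\tdogs\tdog\tNOUN\tNNS\tNumber=Plur\t2\tnsubj\t_\t_"
token = parse_token_line(line)
print(token["FORM"], token["UPOS"], check_upos(token))
```

Because the column list is a parameter of the parser, a layout with an extra column (for example an eleventh `PARSEME:MWE` column) could be read by passing a longer `fields` tuple, leaving the guideline check untouched.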
 ==== Training ====
   * [[https://unidive.lisn.upsaclay.fr/doku.php?id=other-events:webinar-1#outcomes|UniDive webinar]] for newcomers to Universal Dependencies, PARSEME and/or Grew-match
 +  * [[https://unidive.lisn.upsaclay.fr/doku.php?id=meetings:other-events:1st_unidive_training_school|1st UniDive training school]] will take place in Chișinău, Moldova on 8-12 July 2024
  
 ==== Channels ====
wg1/wg1.1706518866.txt.gz · Last modified: 2024/01/29 10:01 by kaja.dobrovoljc