wg1:wg1
Differences
This shows you the differences between two versions of the page.
wg1:wg1 [2023/12/05 03:41] – [Documents] atul.kumar.ojha | wg1:wg1 [2024/04/10 14:25] – [WG1 Tasks] bruno.guillaume | ||
---|---|---|---|
Line 25: | Line 25: | ||
==== Members and organisation ==== | ==== Members and organisation ==== | ||
* [[https:// | * [[https:// | ||
- | * Expression of interest in WG1 tasks [[https:// | + | * Activities are currently structured around four primary |
- | * Task 1.1: Linguistic typology and multilingual corpus annotation | + | |
- | * Task 1.2: Extensions and updates to MWE annotation guidelines and UD-PARSEME unification | + | |
- | * Task 1.3: Extensions and updates to morphosyntactic annotation guidelines | + | |
- | * Task 1.4: Sharing tools, formats, and infrastructure | + | |
==== Upcoming meetings ==== | ==== Upcoming meetings ==== | ||
- | * **WG1 Meeting | + | * WG Meeting |
- | | + | * WG Meeting |
- | | + | |
- | * **WG1 Meeting | + | |
- | * Task 1.2 Meeting, 24 January 2024 from 13:00 to 14:00 CET, online | + | |
- | | + | |
==== Minutes of past meetings ==== | ==== Minutes of past meetings ==== | ||
+ | * WG1 Meeting 7 (Naples, Italy) - 7 February 2024: co-located with the [[meetings: | ||
+ | * WG1 Meeting 6 (online) - 17 January 2024: Presentation of the WG1 activities in Naples [[https:// | ||
+ | * WG1 Meeting 5 (online) - 20 December 2023: Updates on WG1 tasks and discussion of the activities proposed for Naples [[https:// | ||
+ | * WG1 Meeting 4 (online) - 27 November 2023: Updates on WG1 wiki, WG1 task activities and general [[https:// | ||
* WG1 Meeting 3 (online) - 25 October 2023: Updates on WG1 tasks activities [[https:// | * WG1 Meeting 3 (online) - 25 October 2023: Updates on WG1 tasks activities [[https:// | ||
* WG1 Meeting 2 (online) - 13 September 2023: launching WG1 tasks [[https:// | * WG1 Meeting 2 (online) - 13 September 2023: launching WG1 tasks [[https:// | ||
Line 47: | Line 45: | ||
- | ==== Documents | + | ==== WG1 Tasks ==== |
- | * **Task 1.1:** Linguistic typology and multilingual corpus annotation | + | * **Task 1.1: Linguistic typology and multilingual corpus annotation** |
* [[https:// | * [[https:// | ||
+ | * [[https:// | ||
- | * **Task 1.2** on MWE annotation guidelines and UD-PARSEME unification | + | * **Task 1.2 on MWE annotation guidelines and UD-PARSEME unification** |
- | * [[https:// | + | |
- | * White paper proposition the [[https:// | + | * __Objectives__: |
+ | * __Workplan__: | ||
+ | * __How can I contribute: | ||
+ | * __Documents / Links__ | ||
+ | | ||
+ | * White paper proposition | ||
| | ||
- | * **Task 1.3:** Extensions and updates to morphosyntactic annotation guidelines | + | * **Task 1.3: Extensions and updates to morphosyntactic annotation guidelines** |
* [[https:// | * [[https:// | ||
+ | * [[https:// | ||
+ | * **Task 1.4: Sharing tools, formats, and infrastructure** | ||
+ | * __Leaders / Contacts__: Frantisek Forgac, Bruno Guillaume | ||
+ | * __Objectives__: | ||
+ | * Subtask **A**: Provide an overview of existing software and/or tools that support manual linguistic annotation | ||
+ | * Subtask **B**: Evaluate the pros and cons of tabular formats (such as CoNLL-U) currently used in the UD and PARSEME projects | ||
+ | * __Workplan__: | ||
+ | * Subtask **A**: The specific objective is to create a comparison table of available manual annotation tools, with a focus on UD and PARSEME interests (i.e. morpho-syntactic and multiword expression annotations). The next steps are: | ||
+ | * Consolidate the set of features to be used in the comparison (the rows of the table) | ||
+ | * Create a survey to collect information about each annotation tool | ||
+ | * Analyse the results of the survey and produce the final version of the table. | ||
+ | * Subtask **B**: Conduct a detailed analysis of the advantages and disadvantages of the tabular annotation formats, specifically CoNLL-U, as utilized in the Universal Dependencies (UD) and PARSEME projects. The next steps are: | ||
+ | * Develop a Schema/ | ||
+ | * Refine Data Encoding Standards: Currently, UD prescribes both WHAT to encode (the content) and HOW to encode it (the format). Ideally, these aspects should be decoupled: | ||
+ | * The format should dictate HOW to encode data, providing the structural means. | ||
+ | * Guidelines like UD or others should specify WHAT can be encoded, focusing on content restrictions. This separation would enhance the format' | ||
+ | * Generate Initial Working Examples | ||
+ | * Convert existing datasets to test the new format. | ||
+ | * Evaluate and compare these results with those of CoNLL-U and possibly enhanced formats such as CoNLL-U Plus. | ||
+ | * __How can I contribute? | ||
+ | * Join the ongoing discussions on GitHub (links above) | ||
+ | * Stay tuned for the call to complete the survey | ||
+ | * Join the task co-leaders team | ||
+ | * __Documents__ | ||
+ | * [[https:// | ||
+ | * GitHub discussions about [[https:// | ||
+ | * Document used in the Task 1.4 session at the WG1 meeting in Naples (February 2024): [[https:// | ||
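The HOW/WHAT separation proposed under Subtask B can be illustrated with a minimal sketch. It is not part of the WG1 workplan or of any UD or PARSEME tool: the function names `parse_token_line` and `check_upos` are hypothetical, and the example only assumes the standard ten CoNLL-U columns and the UD universal POS tag set. The parser knows only the tabular structure (HOW), while the content rule (WHAT) is a separate, replaceable check.

```python
# The ten CoNLL-U column names, in their standard order.
CONLLU_FIELDS = ("ID", "FORM", "LEMMA", "UPOS", "XPOS",
                 "FEATS", "HEAD", "DEPREL", "DEPS", "MISC")

# UD universal POS tags: a content restriction, kept apart from the format.
UD_UPOS = frozenset(
    "ADJ ADP ADV AUX CCONJ DET INTJ NOUN NUM PART PRON PROPN "
    "PUNCT SCONJ SYM VERB X".split()
)


def parse_token_line(line, fields=CONLLU_FIELDS):
    """HOW: split one tab-separated token line into named columns.

    Hypothetical helper; it checks only the column count, not the content.
    """
    values = line.rstrip("\n").split("\t")
    if len(values) != len(fields):
        raise ValueError(f"expected {len(fields)} columns, got {len(values)}")
    return dict(zip(fields, values))


def check_upos(token, allowed=UD_UPOS):
    """WHAT: hypothetical content rule; returns a list of problems found."""
    upos = token["UPOS"]
    if upos != "_" and upos not in allowed:
        return [f"unknown UPOS {upos!r}"]
    return []


line = "1\tdogs\tdog\tNOUN\tNNS\tNumber=Plur\t2\tnsubj\t_\t_"
token = parse_token_line(line)
print(token["FORM"], token["UPOS"], check_upos(token))
```

Because the column list and the allowed tag set are passed as parameters, another guideline could reuse the same parser with its own columns or its own content checks, which is the kind of decoupling the workplan describes.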
==== Training ==== | ==== Training ==== | ||
* [[https:// | * [[https:// | ||
+ | * [[https:// | ||
==== Channels ==== | ==== Channels ==== |
wg1/wg1.txt · Last modified: 2024/06/11 15:38 by dan.zeman