Differences
This shows you the differences between two versions of the page.
meetings:other-events:1st_unidive_training_school:courses [2024/04/17 14:19] – agata.savary | meetings:other-events:1st_unidive_training_school:courses [2024/04/29 13:47] (current) – [Dependency syntax, Surface-Syntactic UD, and UD] agata.savary | ||
---|---|---|---|
Line 18: | Line 18: | ||
* **Exercises**: | * **Exercises**: | ||
- | * understanding the SUD (and UD) annotation scheme by exploring some treebanks with Grew-match (SUD_English, | + | * understanding the [[https:// |
* example of a SUD annotation from scratch based on data from the participants which are glossed and translated in English | * example of a SUD annotation from scratch based on data from the participants which are glossed and translated in English | ||
* creation of a project on ArboratorGrew | * creation of a project on ArboratorGrew | ||
Line 28: | Line 28: | ||
* ideally, having some data you want to annotate (please get in touch before the summer school to prepare the data) | * ideally, having some data you want to annotate (please get in touch before the summer school to prepare the data) | ||
- | * **Preparatory work**: | + | * **Preparatory work** |
- | * looking at treebanks on Grew-Match | + | * looking at treebanks on Grew-Match |
- | * comparing UD and SUD annotation | + | * comparing UD and SUD annotation |
- | * reading Gerdes et al. 2018 | + | |
- | * reading a book or a tutorial on dependency syntax: Mel’cuk 1988, Tesnière 2015, Osborne 2019, Kahane 2013 | + | |
- | =====Annotation multiword expressions for newcomers===== | + | * **Further readings**: |
+ | * Lucien Tesnière (2015), [[https:// | ||
+ | * Igor Mel’čuk (1988), Dependency Syntax: Theory and Practice. SUNY Press. | ||
+ | * Timothy Osborne (2019), A Dependency Grammar of English. Benjamins. | ||
+ | * Sylvain Kahane, 2003, [[https:// | ||
+ | * De Marneffe, M. C., Manning, C. D., Nivre, J., & Zeman, D. (2021). [[https:// | ||
+ | * Gerdes K., Guillaume B., Kahane S., Perrier G. (2018) [[https:// | ||
+ | * Gerdes K., Guillaume B., Kahane S., Perrier G. (2021) [[https:// | ||
+ | |||
+ | =====Annotation | ||
* **Trainers** | * **Trainers** | ||
* [[https:// | * [[https:// | ||
- | * [[https:// | + | * [[https:// |
* **Objectives**: | * **Objectives**: | ||
Line 62: | Line 69: | ||
* **Preparatory work**: To be done by the trainees before the training school: | * **Preparatory work**: To be done by the trainees before the training school: | ||
* prepare a parallel corpus or a monolingual one; it would preferably contain a new language, a new dialect, or a new genre; by “new” we mean “not already covered in the PARSEME 1.3 corpus”. | * prepare a parallel corpus or a monolingual one; it would preferably contain a new language, a new dialect, or a new genre; by “new” we mean “not already covered in the PARSEME 1.3 corpus”. | ||
+ | |||
+ | =====Corpus annotation infrastructure===== | ||
+ | |||
+ | * **Trainers** | ||
+ | * [[https:// | ||
+ | * [[https:// | ||
+ | * [[https:// | ||
+ | |||
+ | * **Objectives**: | ||
+ | * Understand and efficiently use the technical infrastructure supporting UD and PARSEME corpus annotation and query | ||
+ | |||
+ | * **Form of instruction** | ||
+ | * mostly practical exercises in corpus querying and processing | ||
+ | |||
+ | * **Contents (not necessarily in chronological order)** | ||
+ | * Session 1 (by Bruno Guillaume), joined with Sylvain' | ||
+ | * Storage formats of data: CoNLL-U, CUPT | ||
+ | * Basic usage of Grew-match on morpho-syntactic treebanks | ||
+ | * Hands-on: observe the main differences between UD and SUD | ||
+ | * ArboratorGrew basic usage: user roles, graphical editing, CoNLL-U editing, metadata | ||
+ | * Sessions 2-3 (by Bruno Guillaume) | ||
+ | * Advanced usages of Grew-match | ||
+ | * On PARSEME data | ||
+ | * Usage of clustering / tables for corpus maintenance, | ||
+ | * Advanced usage of ArboratorGrew | ||
+ | * usage of rewriting rules for corpus pre-annotation / maintenance | ||
+ | * usage of Parser for pre-annotation | ||
+ | * usage of GitHub synchronisation | ||
+ | * Session 4 (by Agata Savary) | ||
+ | * Git for beginners: | ||
+ | * a repository, a clone, a commit | ||
+ | * Git operations: clone, pull, add, commit, push | ||
+ | * branches | ||
+ | * GitLab vs. GitHub | ||
+ | * PARSEME Git infrastructure | ||
+ | * PARSEME project on Git and its repositories | ||
+ | * Managing language repositories | ||
+ | * PARSEME utilities | ||
+ | * PARSEME/UD consistency | ||
+ | * Sessions 5-6 (by Daniel Zeman) | ||
+ | * UD GitHub repositories | ||
+ | * Branches, push access, pull requests | ||
+ | * How to upload: Use git diff before committing and pushing | ||
+ | * TortoiseGit | ||
+ | * Prescribed structure of the dev branch | ||
+ | * Do not pull history from the master branch | ||
+ | * The docs repository, language-specific documentation | ||
+ | * Working with personal UD repositories | ||
+ | * Validator | ||
+ | * On-line report after uploading data | ||
+ | * How to run locally (there are two scripts!) | ||
+ | * How to locate and fix the error | ||
+ | * Demonstrate some common errors, validation levels | ||
+ | * How to register language-specific features, relation subtypes, auxiliaries | ||
+ | * How to fix documentation errors (demonstrate) | ||
+ | * Fixing the errors | ||
+ | * Annotation tool (cf. Grew) | ||
+ | * Text editor (do not use Word!) | ||
+ | * Udapi | ||
+ | * UD GitHub issues: asking for linguistic help in docs, reporting bugs in treebank-specific repos | ||
+ | * Referring to particular commits, files and lines in the repo. | ||
+ |
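The storage formats listed under Session 1 (CoNLL-U and CUPT) can be illustrated with a small reading routine. This is only a sketch under stated assumptions, not part of the course material: it treats CUPT as CoNLL-U with an eleventh `PARSEME:MWE` column, the function name `parse_sentences` is hypothetical, and the three-token sample sentence with its annotation was made up for illustration.

```python
# Minimal sketch: read CoNLL-U / CUPT token lines into dictionaries.
# Assumption: CUPT is handled as CoNLL-U plus an 11th PARSEME:MWE column.

CONLLU_FIELDS = ["ID", "FORM", "LEMMA", "UPOS", "XPOS",
                 "FEATS", "HEAD", "DEPREL", "DEPS", "MISC"]


def parse_sentences(text):
    """Return a list of sentences; each sentence is a list of token dicts."""
    sentences, tokens = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if tokens:
                sentences.append(tokens)
                tokens = []
            continue
        if line.startswith("#"):
            # Comment / metadata lines (e.g. "# text = ...") are skipped here.
            continue
        columns = line.split("\t")
        token = dict(zip(CONLLU_FIELDS, columns))
        if len(columns) > len(CONLLU_FIELDS):
            # Extra 11th column: multiword-expression annotation (CUPT).
            token["PARSEME:MWE"] = columns[10]
        tokens.append(token)
    if tokens:
        sentences.append(tokens)
    return sentences


# Hypothetical sample, invented for this illustration.
SAMPLE = (
    "# text = They gave up\n"
    "1\tThey\tthey\tPRON\t_\t_\t2\tnsubj\t_\t_\t*\n"
    "2\tgave\tgive\tVERB\t_\t_\t0\troot\t_\t_\t1:VPC.full\n"
    "3\tup\tup\tADP\t_\t_\t2\tcompound:prt\t_\t_\t1\n"
)

sentences = parse_sentences(SAMPLE)

# Collect the word forms that take part in a multiword expression.
mwe_forms = [t["FORM"] for t in sentences[0]
             if t.get("PARSEME:MWE", "*") not in ("*", "_")]
print(mwe_forms)
```

Keeping each token as a plain dictionary keyed by column name makes it easy to compare the same sentence in a UD and a SUD treebank: only the `HEAD` and `DEPREL` values are expected to differ.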
meetings/other-events/1st_unidive_training_school/courses.1713356373.txt.gz · Last modified: 2024/04/17 14:19 by agata.savary