Delexicalized Word Embeddings for Cross-lingual Dependency Parsing - Université de Lille
Conference paper, Year: 2017

Abstract

This paper presents a new approach to the problem of cross-lingual dependency parsing, aiming at leveraging training data from different source languages to learn a parser in a target language. Specifically, this approach first constructs word vector representations that exploit structural (i.e., dependency-based) contexts, considering only the morpho-syntactic information associated with each word and its contexts. These delexicalized word embeddings, which can be trained on any set of languages and capture features shared across languages, are then used in combination with standard language-specific features to train a lexicalized parser in the target language. We evaluate our approach through experiments on a set of eight different languages that are part of the Universal Dependencies Project. Our main results show that using such delexicalized embeddings, whether trained in a monolingual or a multilingual fashion, achieves significant improvements over monolingual baselines.
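To make the idea of delexicalization concrete, the sketch below builds count-based vectors over dependency contexts from a toy treebank, replacing every word form with its morpho-syntactic tag. This is a simplified illustration under assumed data and tag formats, not the paper's actual embedding-training procedure (which learns dense embeddings); the treebank, tag strings, and function names here are hypothetical.

```python
from collections import defaultdict

# Toy treebank: each sentence is a list of (form, morpho-syntactic tag, head index).
# Head index is 0-based within the sentence; -1 marks the root.
# (Hypothetical example data, not taken from the paper.)
sentences = [
    [("the", "DET", 1), ("cat", "NOUN|Sing", 2), ("sleeps", "VERB|3sg", -1)],
    [("a", "DET", 1), ("dog", "NOUN|Sing", 2), ("barks", "VERB|3sg", -1)],
]

def delexicalized_context_pairs(sent):
    """Yield (tag, context-tag) pairs from dependency arcs,
    ignoring word forms entirely (delexicalization)."""
    for i, (form, tag, head) in enumerate(sent):
        if head >= 0:
            head_tag = sent[head][1]
            yield (tag, "head=" + head_tag)   # word -> its governor's tag
            yield (head_tag, "dep=" + tag)    # governor -> its dependent's tag

def count_vectors(sentences):
    """Build sparse co-occurrence count vectors over dependency contexts."""
    vecs = defaultdict(lambda: defaultdict(int))
    for sent in sentences:
        for tag, ctx in delexicalized_context_pairs(sent):
            vecs[tag][ctx] += 1
    return {t: dict(c) for t, c in vecs.items()}

vectors = count_vectors(sentences)
print(vectors["NOUN|Sing"])  # -> {'dep=DET': 2, 'head=VERB|3sg': 2}
```

Because the vectors are indexed by tags rather than word forms, the same representation space can be shared across treebanks in different languages that use a common tag inventory, which is what makes cross-lingual transfer possible.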
Main file: E17-1023.pdf (239.65 KB). Origin: publisher files authorized for deposit in an open archive.

Dates and versions

hal-01590639 , version 1 (20-09-2017)

Identifiers

Cite

Mathieu Dehouck, Pascal Denis. Delexicalized Word Embeddings for Cross-lingual Dependency Parsing. EACL, Apr 2017, Valencia, Spain. pp. 241-250, ⟨10.18653/v1/E17-1023⟩. ⟨hal-01590639⟩