Turkish synonym identification from multiple resources: monolingual corpus, mono/bilingual online dictionaries, and WordNet

Authors: TUĞBA YILDIZ, BANU DİRİ, SAVAŞ YILDIRIM

Abstract: This study proposes a model that determines synonymy by combining several resources. The model extracts features from monolingual online dictionaries, a bilingual online dictionary, WordNet, and a monolingual Turkish corpus. Once it has built a candidate list for a given word, it uses those features to decide which candidates are synonyms. Each resource and approach is evaluated. With all features taken into account and machine learning algorithms applied, the model achieves an F-measure of 81.4%. The study contributes to the literature by integrating several resources and by attempting the first corpus-driven synonym detection system for Turkish.

Keywords: Synonym, dependency relations, corpus-based statistics
