

WordNet and wordnets

Princeton’s WordNet is a lexical database showing semantic relationships between words in the English language. It focuses on nouns, verbs, adjectives and adverbs, because these word classes consist of content words, which carry meaning by themselves (as opposed to function words). Princeton’s WordNet groups these content words into ‘synsets’: sets of cognitive synonyms, or words that share the same meaning or sense (Sources: PARTS OF SPEECH, WordNet | A Lexical Database for English).

Wordnets have emerged in other languages based on this concept, including in my languages of study – Irish, Spanish and German. Wordnets for each of these languages can be found by following these links:

  • EuroWordNet database: a multilingual database providing wordnets for several European languages, including Spanish and German. Free samples from each language can be downloaded here.
  • Líonra Séimeantach na Gaeilge (LSG), or the Irish Language Semantic Network: an Irish-language wordnet, providing a comprehensive database of Irish words and the semantic links between them. The PDF version can be downloaded here.

The PDF version of the LSG displays the wordnet in alphabetical order. As in the Princeton WordNet, content words are presented in synsets, showing relationships between words. The label ‘comhchiall’ marks synonymous words, ‘aicmí’ marks the classes to which the word belongs and ‘fo-aicmí’ the subclasses stemming from the word. ‘Gaolta’ marks related words that are not synonymous. In the screenshot below from the PDF, for example, one can see that the word ‘teangeolaíocht’ (linguistics) is shown to be in the class of ‘eolaíocht’ (science), with one subclass being ‘pragmataic’ (pragmatics). It is shown to be related to, but not synonymous with, ‘gramadach’ (grammar).

This shows that synsets in the LSG, like those in Princeton’s WordNet, have a hierarchical element. Using the relations expressed through hypernyms (broader terms) and hyponyms (narrower terms), the LSG shows where each word lies within a hierarchy of related words. In the example above, ‘eolaíocht’ is a hypernym of ‘teangeolaíocht’, and ‘pragmataic’ a hyponym of ‘teangeolaíocht’. Unlike Princeton’s WordNet, the LSG PDF file does not show antonyms.
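To make the hierarchy concrete, here is a minimal Python sketch of the ‘teangeolaíocht’ entry described above. The `Entry` class and its field names are my own hypothetical illustration, loosely named after the labels in the PDF; they are not the LSG’s actual data format, and the only data used is the single entry from the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entry:
    """One headword, loosely following the labels in the LSG PDF.

    Hypothetical structure for illustration only; not the LSG's real format.
    """
    headword: str
    comhchiall: List[str] = field(default_factory=list)  # synonyms
    aicmi: List[str] = field(default_factory=list)       # classes (hypernyms)
    fo_aicmi: List[str] = field(default_factory=list)    # subclasses (hyponyms)
    gaolta: List[str] = field(default_factory=list)      # related, not synonymous


# The entry from the example in the text.
entry = Entry(
    headword="teangeolaíocht",
    aicmi=["eolaíocht"],
    fo_aicmi=["pragmataic"],
    gaolta=["gramadach"],
)


def hierarchy(e: Entry) -> List[str]:
    """Return lines with hypernyms above the headword and hyponyms below it."""
    lines = [f"{h} (hypernym)" for h in e.aicmi]
    lines.append(f"  {e.headword}")
    lines += [f"    {h} (hyponym)" for h in e.fo_aicmi]
    return lines


print("\n".join(hierarchy(entry)))
```

The related word ‘gramadach’ is stored but deliberately left out of the printed hierarchy, since ‘gaolta’ links sit beside the headword rather than above or below it.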

The entries are linked to the synsets available in the Princeton WordNet, a link that the LSG’s creator, Kevin Scannell, states is helpful for his work on English-Irish machine translation. The entries are not mapped directly, however, partly because Irish makes distinctions that do not exist in English (such as the difference between ‘rua’ and ‘dearg’) (Sources: LSG: Home, LSG: Details).
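One way to picture why a direct mapping is awkward is a toy lookup table in Python. This is only a sketch under my own assumption that both ‘rua’ and ‘dearg’ are glossed with the single English word ‘red’; the dictionary below is hypothetical and is not taken from the LSG’s data.

```python
from collections import defaultdict

# Toy glosses (an assumption for illustration, not LSG data):
# both Irish words are glossed here with the same English word.
irish_to_english = {"rua": "red", "dearg": "red"}


def invert(mapping):
    """Group Irish words by their English gloss."""
    inverted = defaultdict(list)
    for irish, english in mapping.items():
        inverted[english].append(irish)
    # Sort each group so the result does not depend on insertion order.
    return {english: sorted(words) for english, words in inverted.items()}


english_to_irish = invert(irish_to_english)
print(english_to_irish)
```

Going from English back to Irish, the one word ‘red’ leads to two candidates, so a translation system would need extra information to choose between them.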
