
Concordances and Voyant Tools

Concordance programs allow the user to search for instances of a given word or phrase within a text or a corpus. In the most common concordance format, each concordance line displays an occurrence of the word as it appears in the text or database, along with the words occurring on either side of it. This shows the context in which the word appears. This format is called a KWIC concordance, or a key-word-in-context concordance.
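To make the KWIC format concrete, here is a minimal sketch in Python. It is not how Voyant Tools or any particular concordancer is implemented; the function name `kwic`, the `width` parameter, and the short sample sentence are all assumptions made for illustration.

```python
import re

def kwic(text, keyword, width=3):
    """Return (left context, hit, right context) for each occurrence of keyword.

    `width` is the number of words kept on either side of the hit.
    Matching ignores case, so "gott" also finds "Gott".
    """
    tokens = re.findall(r"\w+", text)  # crude word tokenizer (assumption)
    hits = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, token, right))
    return hits

# Illustrative sample text, not the full corpus used in the post.
sample = "Am Anfang schuf Gott Himmel und Erde. Und Gott sprach: Es werde Licht."

for left, hit, right in kwic(sample, "gott", width=2):
    # Right-align the left context so the keyword lines up in a column.
    print(f"{left:>20} [{hit}] {right}")
```

Each returned tuple corresponds to one concordance line: the keyword in the middle, with a fixed number of words of context on either side.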

Concordances provide a convenient format for analysing words in their context and examining the patterns in which they occur. This is useful for lexicographers, who need to know how words are used in a language, and how frequently, when writing dictionaries. By studying concordances, one can also obtain data on collocations, which is very useful in discourse research (Sources: Using a Concordance for Discourse Research; https://ota.ox.ac.uk/documents/searching/handbook.html).
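As a rough illustration of how collocation data can be derived from the same idea, the sketch below counts the words that appear within a small window around a node word. The function name `collocates`, the `window` parameter, and the sample text are hypothetical; real collocation studies usually add statistical measures on top of raw counts.

```python
import re
from collections import Counter

def collocates(text, node, window=2):
    """Count the words occurring within `window` words of each hit of `node`."""
    tokens = [t.lower() for t in re.findall(r"\w+", text)]
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node.lower():
            # Words to the left and right of the hit, excluding the hit itself.
            neighbours = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            counts.update(neighbours)
    return counts

# Illustrative sample text, not the full corpus used in the post.
sample_text = "Und Gott sprach: Es werde Licht. Und Gott sah, dass das Licht gut war."

for word, count in collocates(sample_text, "gott", window=1).most_common():
    print(word, count)
```

Words that recur near the node word rise to the top of the count, which is the raw material for spotting collocations.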

I used Voyant Tools to experiment with concordances using a plain text file version of Luther’s Bible. I downloaded the text from the Oxford Text Archive and uploaded it to Voyant Tools. This created the concordance that can be viewed here. In the bottom right-hand corner of the screen, one can see the concordance lines created.

The most frequent word in the text makes it clear that I encountered a problem with the representation of German characters. Instead of the German word ‘dass’, which means ‘that’ and which would have been written ‘daß’, with a scharfes S (ß), at the time, ‘daãÿ’ is displayed. The appearance of this word also raises the issue of stop words. Stop words are words that should be excluded from the results of a concordance and are typically function words. In Voyant Tools, you can choose to use pre-existing stop word lists or create your own. In this case, because some function words, such as ‘daß’, were displayed incorrectly, a list containing the correct spellings would not match them, so it would be difficult to use this function.
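One plausible explanation for ‘daãÿ’, which the post does not confirm, is that the file was stored as UTF-8 but read as Windows-1252, and the result was then lowercased. The sketch below reproduces that chain and shows why a garbled token slips past a stop word list; the three-word list is a hypothetical stand-in, not one of Voyant's own.

```python
original = "daß"

# Assumption: UTF-8 bytes were decoded with the wrong codec (Windows-1252).
garbled = original.encode("utf-8").decode("cp1252")
print(garbled)          # daÃŸ
print(garbled.lower())  # daãÿ

# Under this assumption the damage is reversible, as long as the repair
# happens before lowercasing.
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired)         # daß

# Hypothetical mini stop word list with the correct spellings.
stop_words = {"daß", "und", "die"}
tokens = ["und", garbled.lower(), "gott"]

# The garbled form is not in the list, so it survives the filter.
kept = [t for t in tokens if t not in stop_words]
print(kept)
```

If this is indeed the cause, re-reading or re-saving the file with the correct encoding before uploading would be a cleaner fix than adding garbled spellings to a custom stop word list.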

One can see in the cirrus, or the word cloud, that some predictable content words were frequent in the text, such as ‘gott’ (God) and ‘sohn’ (son). It is interesting to note that the word cloud does not show these with a capital G and S. As all German nouns start with capital letters, this is an important feature of the language that is left out of the word cloud.
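The loss of capital letters is what one would expect if tokens are lowercased before counting, which is presumably what happens here, although the post does not say so. A small sketch with made-up tokens shows the trade-off: case folding merges ‘Und’ and ‘und’ into one entry, but it also erases the capital that marks German nouns.

```python
from collections import Counter

# Made-up token list for illustration.
tokens = ["Und", "Gott", "sprach", "und", "Gott", "sah"]

# Case-folded counts: sentence-initial "Und" merges with "und",
# but "Gott" becomes "gott".
folded = Counter(t.lower() for t in tokens)

# Case-preserving counts: nouns keep their capitals,
# but "Und" and "und" are counted separately.
preserved = Counter(tokens)

print(folded.most_common())
print(preserved.most_common())
```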

Published in Corpora, Tools and Information Technologies
