
Tag: Sketch Engine

‘Fire’: Some observations

At the end of class on Tuesday, November 21, we were asked to further investigate the word ‘fire’ as a noun and a verb. I made the following observations during my short investigation:

The fact that ‘fire’ is a noun and a verb can be confirmed by looking at ‘fire’ in the Oxford English Dictionary. Although polysemy is often studied by using Princeton’s WordNet or the various wordnets in other languages (please see my post WordNet and wordnets for more information), it is clear that the dictionary entry in this case gives plenty of detail about the polysemy of the word, as both the entry for ‘fire’ as a verb and the entry for ‘fire’ as a noun contain dozens of different meanings.

I wanted to compare the usage of ‘fire’ as a noun and as a verb to see which is more frequent. I decided to refer to the British National Corpus on Sketch Engine. Using the Wordlist feature, I discovered that ‘fire’ appears 17,348 times as a lemma. It appears 14,172 times as a noun and 3,176 times as a verb, showing that it is used far more frequently as a noun.
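To put these counts in proportion, the arithmetic can be sketched in a few lines of Python. This is only an illustration using the three figures reported above; the function name `pos_shares` is my own and is not part of Sketch Engine.

```python
def pos_shares(noun_hits: int, verb_hits: int) -> dict:
    """Return the total and the relative shares of noun and verb hits."""
    total = noun_hits + verb_hits
    return {
        "total": total,
        "noun_share": noun_hits / total,
        "verb_share": verb_hits / total,
        "noun_to_verb": noun_hits / verb_hits,
    }


# Counts for 'fire' in the BNC, as reported in the post.
shares = pos_shares(14_172, 3_176)

print(shares["total"])                   # 17348
print(f"{shares['noun_share']:.1%}")     # 81.7%
print(f"{shares['verb_share']:.1%}")     # 18.3%
print(f"{shares['noun_to_verb']:.2f}")   # 4.46
```

In other words, roughly four out of five occurrences of the lemma are nouns, and the noun is about four and a half times as frequent as the verb.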


WL4102 Main Blog Essay: Critical Discourse Analysis and Corpora – A Corpus-Based Project

In my post on Concordances and Voyant Tools, I touched on the use of corpora in discourse research. In this extended blog post, I will explore this idea further, by looking at the connectedness of critical discourse analysis (CDA) and linguistic corpora, using Sketch Engine to study two corpora, one in Spanish and the other in German.

Introduction

CDA is an approach to studying written and spoken communication and the relationship between this communication and society. Van Dijk states that a central focus of CDA is “(group) relations of power, dominance and inequality and the way these are reproduced or resisted by social group members through text and talk” (van Dijk 2). It emerged from different areas of linguistics, including text linguistics and sociolinguistics, and it relates to several modules that we have studied as part of the BA World Languages programme, particularly WL2102: Introduction to Semiotics. The approach is multidisciplinary and draws on methodological approaches that are effective in examining forms of social inequality, such as inequality based on class, sexuality and religion. (Sources: Critical Discourse Analysis: Theory and Interdisciplinarity: pages 11-15, Aims of Critical Discourse Analysis)

As language forms such an important part of the approach, I wanted to examine more closely how power structures relevant to CDA can be seen in linguistic corpora and, in turn, to observe how corpus linguistics and CDA connect. I decided to focus on power structures relating to skin colour and gender, using a German-language corpus to examine skin colour and a Spanish-language corpus to examine gender.

Methodology

For my investigation, I used Sketch Engine’s Word Sketch Difference feature, which compares a set of collocates for one lemma in a certain corpus to a set of collocates for another lemma within the corpus. Each lemma is given a colour (red or green) and the collocates that tend to combine with each one are given the same colour. Collocates in white tend to combine with both. If a collocate is shown in dark green or dark red, the collocation is stronger, meaning that the collocate combines far more often with the lemma of that colour and far less often with the other lemma. (Source: Word Sketch Difference lesson | Sketch Engine)
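The colour-coding idea can be illustrated with a small Python sketch. Please note that this is a simplified illustration and not how Sketch Engine calculates its results: it assumes raw co-occurrence counts and two cut-off values (`lean` and `strong`) that I have chosen arbitrarily, and the collocate names and counts in the example are invented placeholders rather than corpus data.

```python
from typing import Dict


def classify_collocates(
    freq_first: Dict[str, int],
    freq_second: Dict[str, int],
    lean: float = 0.6,     # assumed cut-off for "tends to combine with one lemma"
    strong: float = 0.9,   # assumed cut-off for the darker colour
) -> Dict[str, str]:
    """Label each collocate by the lemma it leans towards."""
    labels = {}
    for collocate in sorted(set(freq_first) | set(freq_second)):
        a = freq_first.get(collocate, 0)
        b = freq_second.get(collocate, 0)
        if a + b == 0:
            continue  # no evidence for either lemma
        share_first = a / (a + b)
        if share_first >= strong:
            labels[collocate] = "strongly first lemma"
        elif share_first >= lean:
            labels[collocate] = "first lemma"
        elif share_first > 1 - lean:
            labels[collocate] = "both lemmas"
        elif share_first > 1 - strong:
            labels[collocate] = "second lemma"
        else:
            labels[collocate] = "strongly second lemma"
    return labels


# Invented placeholder counts, for illustration only.
first = {"alpha": 9, "beta": 5, "gamma": 7}
second = {"beta": 5, "gamma": 3, "delta": 8}

for collocate, label in classify_collocates(first, second).items():
    print(collocate, "->", label)
```

Here a collocate that only ever appears with the first lemma would receive the darker shade of that lemma's colour, while one that is split evenly between the two would be shown in white.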

To look at group inequalities in a simple way, I decided to use lemmas that represent an opposing power relationship. It is important to note that these terms are not binary oppositions, but I see them as opposing in terms of societal power structures. In the German-language corpus, I searched for the lemmas ‘schwarzhäutig’ (black-skinned) and ‘weißhäutig’ (white-skinned), drawing on the discrimination that people of colour have faced historically and still face today. In the Spanish-language corpus, I searched for the lemmas ‘mujer’ (woman) and ‘hombre’ (man), based on the gender discrimination frequently experienced by women living in a patriarchal society.

The German-language corpus used was the German Web 2013 corpus (deTenTen13), which contains over 16 billion words. The Spanish-language corpus used was the Spanish Web 2018 corpus (esTenTen18), which contains over 17 billion words and two subcorpora for European Spanish and American Spanish. Both of these corpora are made up of collected web-based texts. (Sources: deTenTen – German corpus from the web | Sketch Engine, esTenTen – Spanish corpus from the web | Sketch Engine)

Results

Skin colour in the German Web 2013 corpus:

The result of my search can be seen here

I decided to look at oppositions relating to skin colour in the German-language corpus for a specific reason: when using the Word Sketch Difference feature on Sketch Engine, the user can only enter a lemma. This means that one cannot enter a term such as ‘black person’ or ‘white person’. As the colours ‘white’ and ‘black’ often refer to any object with that colour, this was also not a helpful search. In German, the adjectives ‘schwarzhäutig’ (black-skinned) and ‘weißhäutig’ (white-skinned) are used, which means they can function as a lemma automatically connected to skin colour.

Although the search only results in one column due to the infrequency of the words in the corpus, this column alone gives us plenty of information. The three nouns that are modified by the adjective ‘schwarzhäutig’ six times or more and that are never modified by ‘weißhäutig’ in the corpus are ‘Hüne’ (giant/hulk), ‘Bastard’ (bastard) and ‘Afrikaner’ (African male). These collocates give us an impression of the language used around the word ‘schwarzhäutig’ in online texts. The appearance of the word ‘Bastard’ shows how negative some of this language can be. The presence of the word ‘Afrikaner’ also shows an association between black skin and African males, which is often problematic for people of colour from countries outside of Africa, as exemplified by CNN’s social media campaign ‘No, where are you really from?’.

These results contrast with ‘Amerikaner’ (American male), ‘Europäer’ (European male), ‘Fremde’ (stranger, foreign person) and ‘Blondine’ (blonde woman), which are shown to collocate only with ‘weißhäutig’. Although these terms also position skin colour in terms of country of origin or nationality, no pejorative words such as ‘Bastard’ appear, which points to the underlying power structure.

Gender in the Spanish Web 2018 corpus:

The result of my search can be seen here

The words ‘mujer’ and ‘hombre’ appear more frequently in the Spanish-language corpus than ‘weißhäutig’ and ‘schwarzhäutig’ do in the German-language corpus.

The two verbs that mainly collocate with ‘mujer’ rather than ‘hombre’ as an object are ‘embarazar’ (to impregnate) and ‘violar’ (to violate/rape). At the bottom of the column, we can see that a verb that commonly collocates with ‘hombre’ rather than ‘mujer’ as an object is ‘armar’ (to arm). These collocates present ‘hombre’ as associated with weapons and ‘mujer’ as a receiver of violence, which reflects the power structure. This is reinforced when we consider the verb that mainly collocates with ‘mujer’ as a subject: ‘sufrir’ (to suffer).

By examining the column that shows collocates involving the preposition ‘sin’ (without), we can see that ‘mujer’ is associated with collocates such as ‘sin pareja’ (without a partner) and ‘sin hijo’ (without a child), which also shows certain expectations surrounding the role of women.

Conclusion

As a central focus of critical discourse analysis is group relations of power as shown through language, a corpus analysis can help greatly. Although my above examination of two corpora on Sketch Engine only consisted of a short investigation, it provided me with results that showed that power structures can be seen in corpora by examining collocates. In the German-language corpus, the word ‘Bastard’ showed the negativity surrounding the term ‘schwarzhäutig’ and, in the Spanish-language corpus, ‘mujer’ was shown to receive violence and involve societal expectations surrounding partnership and children. In this sense, a corpus-based study relates to the central focus of CDA in studying the reproduction of power relations through communication. I feel that this information will be helpful for me going forward, especially as I will be taking a World Languages module next semester called WL4101: Language and Power. This task has shown me how relevant corpora are in my studies as a language student.

For more content relating to corpora and gender, please see my shorter blog essay.



‘Strong’, ‘Grasp’, ‘Consequence’: Some observations

At the end of class on Tuesday, November 13, we were asked to further investigate the words ‘strong’, ‘grasp’ and ‘consequence’ in terms of how we would go about describing them grammatically using the tools we had focused on in class. The following are some observations I made during my investigation:

 

Strong:

In the Oxford English Dictionary, ‘strong’ is shown to be a noun in the case of a group of strong people. When considering how this noun functions grammatically, it is important to note that it is a collective noun and cannot be made plural. It is also shown to be an adjective, meaning grammatically we have to consider the comparative and superlative forms, in this case ‘stronger’ and ‘strongest’ respectively. As it is a monosyllabic adjective and therefore must take the -er and -est suffixes, I did not need to consult a corpus. In the case of a disyllabic adjective, it can be useful to consult a corpus to see what the most common comparative and superlative forms are, as grammatically speaking, the adjective could take the -er and -est suffixes or the ‘more’ and ‘most’ forms.

 

Grasp:

‘Grasp’ is shown to be a noun and a verb in the Oxford English Dictionary. I reflected on the plural form and thought that, in terms of my own usage of the word, I would never say ‘grasps’ as a plural of ‘grasp’. I thought about ‘within their grasp’ as an expression and the fact that one can only say ‘I have a good grasp of maths’ in the singular. I decided to use the British National Corpus (BNC) to see if it can in fact be used as a plural. On Sketch Engine, I created a concordance based on the lemma ‘grasp’ as a noun, which can be viewed here. This showed that in the BNC, ‘grasp’ as a noun is not used in the plural form. As for the verb, it is important to note that it is regular: both the past tense (‘I/you/he/etc. grasped’) and the past participle are ‘grasped’.

 

Consequence:

The Oxford English Dictionary shows ‘consequence’ to be a noun and a verb. The verb ‘consequence’ is described as rare and obsolete. As a noun, ‘consequence’ can easily be made plural by adding an ‘s’. I created a concordance on Sketch Engine within the BNC based on the lemma ‘consequence’ as a noun, and one can see that it is commonly used. The concordance can be seen here.
