Introduction

This study explores migration, one of the most relevant phenomena of the twenty-first century. Migration offers undoubted possibilities for receiving (and sending) cultures (Williams and Baláž 2008). In addition, Zapata-Barrero (2009) argued that migration is a focal point for assessing reactions to multiculturalism and, therefore, for ‘measuring’ the health of our societies. More precisely, this study examines the linguistic representation of migration in cross-cultural contexts, specifically in European parliamentary settings, using corpus-assisted tools. It also attempts to link the linguistic disciplinary field and the socio-political realm. This combination is highly innovative as it covers a large theoretical and methodological terrain, which may (1) refute or extend prior studies, (2) strengthen the pillars of generalisation, and (3) organise results in a clearer and deeper manner, including their levels of abstraction, impact, and signification. Although several studies in the field of linguistics have employed corpus-assisted analyses (see section “A critical cross-linguistics corpus-assisted approach to migration” below), few have used this methodology to investigate the representation of migration. The present study fills this research lacuna, establishing a unique association between linguistic results and their socio-political contexts.

In this study, linguistic representation refers to the use of language to convey (and construct) ‘any constellation of beliefs or ideas bearing on an aspect of social reality’ (Verschueren 1999, Preface). In a cross-cultural setting, I analysed data from three European parliaments: the Spanish Chamber (CD), the European Parliament (EP), and the British House of Commons (HC). Similar to Baker and McEnery (2005), I delve into texts in which migrants are not the subjects of their own representations; ‘rather, they have their identities and discourses surrounding these identities constructed for them by more powerful spokespeople’ (p. 200). This approach provides information about the daily use of (mainstream) concepts and allows for further understanding of texts, conversations, and ideological positions. Corpus-assisted tools were selected because they provide multifaceted perspectives on reality. Specifically, they allowed me to undertake quantitative and qualitative analyses, as well as create synchronic and diachronic pictures of migration in parliamentary settings. Finally, the back-and-forth movement between the socio-political realms and linguistics resulted in the production of a second level of abstraction, understanding, and assessment.

I explored the following research questions:

  (1) What differences and similarities can be determined in the corpus-assisted synchronic images of migration representation at the CD, EP and HC?

  (2) What differences and similarities can be determined in the corpus-assisted diachronic images of the evolution of migration representation at the CD, EP and HC?

  (3) What are the comparative and socio-political consequences of these representations?

The remainder of this paper is organised as follows. The “A critical cross-linguistics corpus-assisted approach to migration” section briefly sketches the theoretical backdrop of my examination, complementing prior corpus-assisted studies using the framework proposed by Zapata-Barrero (2009), Zapata-Barrero et al. (2008, 2022), Zapata-Barrero and Yalaz (2018). The “Methodology” section describes the main protocols, technology, and artefacts used in the subsequent quantitative and qualitative analyses. The “Critical cross-linguistics synchronic and diachronic CADS analyses” section discusses the data yielded by synchronic and diachronic analyses of English and Spanish semantic nodes such as ‘migrant*’ and ‘immigrant*/inmigrant*’.Footnote 1 The main results, which also confirm the virtuous synergies that may be established between corpus-assisted analysis and socio-political frameworks, are presented and highlighted in the “Conclusion” section.

A critical cross-linguistics corpus-assisted approach to migration

As a phenomenon, migration has been approached from various standpoints. In the field of linguistics, the extant studies (e.g. Adserà and Pytliková 2016; Canagarajah 2017; Siegel 2018) have focused on migrant language use or the effects of migration on language. A smaller number of investigations have concentrated on the language used to describe migrants (Blommaert and Verschueren 1998) than on the language used by migrants themselves, and even fewer studies have discussed the linguistic representation of migration in parliaments (Martín Rojo and Van Dijk 1997; Van Der Valk 2003). The latter tend to fall under the umbrella of racism research (Van Dijk 2018, p. 231). Moreover, most of this research is qualitative and manual, resulting in informative descriptions and explanations of a relatively, or even ostensibly, small number of documents. This approach can be complemented with semi-automatic research to handle large amounts of (electronic) data, which would allow us to see not only the trees but also part of the forest.

Analysing linguistic representation with semi-automatic procedures, referred to in this study as ‘corpus-assisted discourse studies’ (CADS; coined by Partington 2004), is by no means new. Several studies have adopted this methodology (see Gabrielatos 2021, bibliography). Recently, a few studies have used corpus-assisted tools to examine migration representation. The ESRC-funded RASIM project at Lancaster University is one such study, conducted by Paul Baker as the principal investigator and several other scholars. Their research focuses on the identity construction of refugees, asylum seekers, immigrants, and migrants (thus, the acronym ‘RASIM’) in the British press over 10 years from 1996–2005 (Baker et al. 2008, p. 276). They show how a qualitative perspective, usually inspired by critical discourse analysis, and the quantitative/qualitative perspective provided by CADS create fruitful synergies that ‘do justice to both’ fields (Baker et al. 2008, p. 276). RASIM researchers rely on statistical data and employ both traditional and innovative tools (Calzada Pérez 2017) ‘to objectively identify widespread patterns of naturally occurring language and rare but telling examples, both of which may be overlooked by a small-scale analysis’ (Baker and McEnery 2005, p. 198). As such,

  (1) They manage to grasp the semantic associations (e.g. movement, tragedy, official reactions, crime, and nuisances) of the terms ‘refugee/s’ and ‘asylum seeker/s’ in the 2003 news corpus and UNHCR corpus (e.g. Baker and McEnery 2005).

  (2) They identify query terms (e.g. ‘refugee/s’, ‘asylum seeker/s’, ‘immigrant/s’, and ‘migrant/s’) and systematise representation results into eight categories of reference, which are often related to negative connotations: (1) provenance/transit/destination, (2) number, (3) entry, (4) economic problems, (5) residence, (6) return/repatriation, (7) legality, and (8) plight (Gabrielatos and Baker 2006).

  (3) They monitor synchronic variations and diachronic changes in the portrayal of migration protagonists (Gabrielatos and Baker 2008).

RASIM researchers have inspired countless scholars who have attempted to utilise these theoretical and methodological drives across various fields (e.g. Aleixandre Becerra 2018; Islentyeva 2021; Islentyeva and Abdel Kafi 2021). Charlotte Taylor is one of the most active among them. Her work on migration representation is noteworthy for various reasons, including her use of different focal points (e.g. naming social identities, metaphor identification, and retrieval), pertinent bibliographical reviews, and being guided by the values of transparency, replicability (Taylor 2014, p. 395), and self-reflexivity (Taylor and Marchi 2018). Three additional aspects of Taylor’s work were particularly helpful for the present study. First, Taylor (2014) located her endeavours within ‘cross-linguistic corpus-assisted research’ (see Partington et al. 2013). Despite the challenges associated with this type of research, for example, the uncertainty of functional equivalence or the impact of different language use on reporting results, Taylor highlighted the ‘lack of comparative research into discourse at the level of social practice and representation’ (Taylor 2014, p. 372). Second, she drew attention to mechanisms such as opposition and conflation (Taylor 2020). In her view, migration is depicted in terms of binaries, such as ‘legal’ or ‘illegal’, that prime each other automatically; that is, the term ‘legal’ immediately evokes the idea of ‘illegal’. These binaries also intertwine the two messages; thus, if illegal immigrants are associated with terrorism in particular discourses, legal migrants are also qualified in the same way, and eventually, all migrants are labelled terrorists. Third, referring to McEnery (2005), Taylor (2014) utilised Cohen’s (1972) understanding of ‘moral panic’, thereby enhancing the level of abstraction, that is, moving beyond words and associations into the sociology of migration. 
Thus, like other researchers, Taylor informs readers about what the terms denote and connote; however, she goes one step further by providing an organised, explanatory framework for their causes or consequences.

This article mainly focuses on cross-linguistic corpus-assisted discourse studies (CL-CADS), as understood by Taylor (2014). Additionally, it aspires to achieve higher levels of abstraction, which may help both organise results and facilitate a deeper understanding of the nature and consequences of representation. Nevertheless, I believe that a moral panic framework is not the only way to achieve this goal. I argue that the moral panic framework reinforces an a priori negative prejudice towards migrants; in other words, as researchers, we expect society and political agents to refer to them as a source of panic by default. In the realms of sociology and political science, I hope to find other sources of inspiration for an abstraction framework, including the various studies of Ricard Zapata-Barrero (e.g. Zapata-Barrero et al., 2008, 2022; Zapata-Barrero 2009; Zapata-Barrero and Yalaz 2018).

Briefly, Zapata-Barrero’s approach to migration (one of the key terms of multiculturalism in his view) was first fully articulated in 2008 and 2009 when studying communication in the Spanish CD during the sixth (1996–2000), seventh (2000–2004), and eighth (2004–2008) terms. These analyses present the main features of his theory. Zapata-Barrero’s understanding of ‘discourse’ is inspired by critical discourse analysts, notably Fairclough (1992). However, he inextricably linked this notion to four legitimising principles: (1) the efficiency of resources, (2) stability and security, (3) cohesion and trust, and (4) equality and no discrimination. These principles sustain discourses that can be classified as reactive or proactive. Reactive discourses, which relate to principles (1) and (2) above, ‘[react] against the historical process and [seek] to restore the monocultural past’ (Zapata-Barrero et al. 2008, p. 118). Meanwhile, proactive discourses, which are associated with principles (3) and (4) above, ‘[assume] history’s irreversibility, shaping its processes as frameworks which guide changes in society’ (Zapata-Barrero et al. 2008, p. 118–119).

Methodology

This study examines the representation of migration in the Spanish CD, the Spanish and English versions of EP interventions, and the British HC using corpus linguistics tools. It draws on Zapata-Barrero’s research but attains comparative complexity in at least the following aspects: (1) the corpora under study are larger, (2) multiple settings are studied, and (3) multiculturalism and multilingualism are openly exposed.

European Comparable and Parallel Archive of Parliamentary Speeches

The corpora used in this study were obtained from the European Comparable and Parallel Archive of Parliamentary Speeches (ECPC). Compiled at Universitat Jaume I (Castellón, Spain),Footnote 2 the archive contains transcribed speeches and writings from (1) the Spanish CD, (2) the EP in (original and translated) English and Spanish, and (3) the British HC. The specific sub-corpora selected for this study are listed below:

  (1) CD_04-08: 371 speeches and written interventions by Spanish members of parliament (MPs), published in the Diario de Sesiones [parliamentary transcripts] from 2004 to 2008 (16,014,420 tokens, as counted by ECPC-Web).

  (2) EP-ES_04-08: 287 speeches and written interventions by EP members (MEPs) in Spanish, published in the Official Journal of the European Union from 2004 to 2008 (19,440,223 tokens, as counted by ECPC-Web).

  (3) EP-EN_04-08: 287 speeches and written interventions by MEPs in English, published in the Official Journal of the European Union from 2004 to 2008 (18,970,278 tokens, as counted by ECPC-Web).

  (4) HC_04-08: 721 speeches and written interventions by British MPs, published in Hansard from 2004 to 2008 (50,598,543 tokens, as counted by ECPC-Web).

EP-ES_04-08 and EP-EN_04-08 were composed of the same materials; the former is the Spanish version of parliamentary debates, and the latter is the English version. Both EP-ES_04-08 and EP-EN_04-08 comprise originals and translations. All the interventions (Spanish and English originals and translations) were equally valid and had comparable legal standing. In this study, translated text was not segregated from the original text; they were studied together since that is how they function and interact in the world.

The time span chosen for the analysis (2004–2008) coincided with the eighth CD legislature. According to Zapata-Barrero et al. (2008), this was part of the period of discourse formation on migration in Spain, which included the sixth (1996–2000) and seventh (2000–2004) legislatures. According to the authors, prior to this period, while migration (especially immigration) was mentioned in the Spanish parliament, it lacked a fully articulated and structured range of discourse. It was only when Spain started consolidating and strengthening its position in Europe, decades after the Francoist dictatorship, that immigration representation in parliament began catching up with that of other democracies. Further, the eighth legislature was the most prolific of the three in terms of migration (Zapata-Barrero et al. 2008, p. 96–99). The initial stages of institutionalised language representation arguably constitute a topic worth exploring because when the pillars of discourse are built, the influences of contemporary counterparts are the strongest and notions and approaches are formed; hence, it is also potentially a more unstable time. Comparing this representation with those in other, more ‘mature’ chambers also serves to widen the possible scope for interpretation.

ECPC-Web

ECPC-Web is a ‘fourth-generation concordancer’ (McEnery and Hardie 2012, p. 43–48) based on CQPweb and developed by Andrew Hardie. In particular, it ‘combine[s] an SQL database with a CQP back-end’ (McEnery and Hardie 2012, p. 46). It is a powerful and user-friendly web-based interface—with a complete gamut of statistical measures—that allows quick and reliable queries related to the selected nodes. ECPC-Web is one of the concordancers hosting the ECPC corpora. Figure 1 depicts an overview of ECPC-Web.

Fig. 1: ECPC-Web. Querying ‘migration’ at EP-EN_04-08 with ECPC-Web.

Frequencies and collocations

Among the many methods used by CADS proponents, frequency data and collocations are the most common. Frequency refers to the number of times queried terms appear in corpora and can be absolute (raw) or relative (normalised). Absolute frequency is the exact number of times the items appear in the corpora, without further consideration of the context or comparability with other corpora. For example, in the CD_04-08 corpus, the term ‘migración’ appears 59 times; thus, the absolute/raw frequency of ‘migración’ was 59. However, the comparison of differently sized corpora requires relative (or normalised) frequencies. This can be easily calculated as:

$$\mathrm{relative}\,\mathrm{freq} = \left( \frac{\mathrm{raw}\,\mathrm{freq}}{\mathrm{corpus}\,\mathrm{size}\,(\mathrm{tokens})} \right) \times \mathrm{basis}\,\mathrm{for}\,\mathrm{normalisation}$$

In this study, the basis for normalisation in all cases was one million words. Thus, the relative frequency of ‘migración’ in CD_04-08 was 3.68 per million words (pmw). The robustness of this method was ensured by using other frequency-related statistical tools, such as significance data as part of the inferential statistics and effect size measures as part of the descriptive statistics. Significance and effect size are often misunderstood and are by no means the same. According to Brezina (2018, p. 823–825), significance, measured here with a p value based on log likelihood (LL), indicates that ‘the difference observed in the corpus (sample) is likely to be a true difference in the population (all language use). If the p value is equal to or is larger than 0.05 (or 5%), we conclude that there is not enough evidence in the corpus to reject the null hypothesis’.
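The normalisation can be checked with a few lines of Python. This is only a minimal sketch: the function name `relative_frequency` and its parameter names are illustrative assumptions, whereas the inputs (59 hits of ‘migración’ and 16,014,420 tokens in CD_04-08) are the figures reported in this paper.

```python
def relative_frequency(raw_freq: int, corpus_size: int, basis: int = 1_000_000) -> float:
    """Normalise a raw frequency to hits per `basis` tokens (hypothetical helper)."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return raw_freq / corpus_size * basis


# 'migración' in CD_04-08: 59 hits in 16,014,420 tokens
migracion_pmw = relative_frequency(59, 16_014_420)
print(f"{migracion_pmw:.2f} pmw")  # 3.68 pmw
```

Dividing by the size of the corpus, rather than by another frequency, is what makes figures from differently sized corpora comparable.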

The values p < 0.05 and p < 0.01 are common significance thresholds in corpus linguistics and were, thus, considered significant in the present study. However, in discussing keyness, Wilson (2013) noted that a stricter p-threshold (i.e. p < 0.000018) is advisable. Therefore, I reinforced LL significance using the Bayesian information criterion (BIC). The following table clarifies the interpretation of BIC (Wilson 2013, p. 2–6):

| BIC | Degree of evidence against H0 | Correspondence between p values and degrees of evidence |
|---|---|---|
| <0 | No evidence (favours H0) | Below threshold |
| 0–2 | Not worth more than a bare mention | Below threshold |
| 2–6 | Positive evidence against H0 | <0.00018 |
| 6–10 | Strong evidence against H0 | <0.000014 |
| >10 | Very strong evidence against H0 | <0.0000024 |

Positive BIC values show evidence against similarity in the data; in contrast, negative values indicate similarity in the data. Since the p and BIC values were produced by UCREL’s log likelihood and effect size calculator,Footnote 3 I stopped at a minimum level of p < 0.00001 in practice, although the p values may be considerably smaller than this measure.
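For readers who wish to retrace these statistics, the sketch below computes LL and BIC for a pair of corpora. It rests on two assumptions that are not stated in this paper: that LL is the standard two-corpus log-likelihood, and that BIC can be approximated as LL minus the natural log of the combined corpus size (one degree of freedom). The function names are hypothetical. Under these assumptions, the collocation counts for CD_04-08 (37) and EP-ES_04-08 (57) reported in the collocation analysis give values of approximately LL = 1.29 and BIC = −16.09.

```python
import math


def log_likelihood(freq_a: int, size_a: int, freq_b: int, size_b: int) -> float:
    """Two-corpus log-likelihood for one item (assumed standard formula)."""
    total_freq = freq_a + freq_b
    total_size = size_a + size_b
    expected_a = size_a * total_freq / total_size
    expected_b = size_b * total_freq / total_size
    ll = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll


def bic_approx(ll: float, size_a: int, size_b: int) -> float:
    """BIC approximated as LL - ln(N), with N the combined corpus size (assumption)."""
    return ll - math.log(size_a + size_b)


def evidence_label(bic: float) -> str:
    """Map a BIC value onto the evidence bands of the table above."""
    if bic < 0:
        return "no evidence (favours H0)"
    if bic < 2:
        return "not worth more than a bare mention"
    if bic < 6:
        return "positive"
    if bic <= 10:
        return "strong"
    return "very strong"


cd_size, ep_es_size = 16_014_420, 19_440_223
ll = log_likelihood(37, cd_size, 57, ep_es_size)
bic = bic_approx(ll, cd_size, ep_es_size)
print(round(ll, 2), round(bic, 2), evidence_label(bic))
```

The labelling function follows the table only; it does not encode the reading of large negative BIC values as evidence of similarity.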

Effect size is explained by Brezina (2018, p. 889–894) as ‘a standardised measure, that is a measure comparable across different studies (…) that expresses the practical importance of the effect observed in the corpus or corpora’. In other words, having significant data does not guarantee that the researcher knows the magnitude of the difference, which is revealed by the effect size. Among the various effect size metrics available, I selected Hardie’s Log Ratio (or LogR) for its easy operationalisation and interpretation (for more information about effect size, see Gabrielatos 2018). The following guide helps in understanding Hardie’s LogR.Footnote 4

A word has the same relative frequency in A and B: the binary log of the ratio is 0.

A word is 2 times more common in A than in B: the binary log of the ratio is 1.

A word is 4 times more common in A than in B: the binary log of the ratio is 2.

A word is 8 times more common in A than in B: the binary log of the ratio is 3.

A word is 16 times more common in A than in B: the binary log of the ratio is 4.

A word is 32 times more common in A than in B: the binary log of the ratio is 5.
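The guide above amounts to taking the binary logarithm of the ratio between two relative frequencies. A minimal sketch follows; the function name `log_ratio` is an assumption, and the handling of zero frequencies that a full implementation would need is deliberately left out. The example values (148.76 pmw and 91.05 pmw) are the relative frequencies of migration-related terms in the English and Spanish EP versions reported in the frequency analysis.

```python
import math


def log_ratio(rel_freq_a: float, rel_freq_b: float) -> float:
    """Binary log of the ratio of two relative frequencies (A over B)."""
    if rel_freq_a <= 0 or rel_freq_b <= 0:
        raise ValueError("this sketch only handles positive relative frequencies")
    return math.log2(rel_freq_a / rel_freq_b)


print(log_ratio(2.0, 1.0))                   # twice as common in A: 1.0
print(round(log_ratio(148.76, 91.05), 2))    # migra* in EP-EN vs EP-ES: 0.71
```

Conversely, a reported LogR can be turned back into a frequency ratio by raising 2 to that power, which is how a LogR slightly above 1 translates into ‘more than twice as many’ occurrences.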

In addition to frequencies and the related statistics, I also applied collocation methods. According to McEnery and Hardie (2012, p. 123), ‘a collocation is a co-occurrence pattern that exists between two items that frequently occur in proximity to one another—but not necessarily adjacently or, indeed, in any fixed order’.

Following Baker et al. (2013), I generated statistics-based associations of nodes within a symmetrical span of five words to the right (5 R) and left (5 L). I incorporated collocations with at least five appearances in each ECPC sub-corpus. This entailed taking a broad perspective on collocation, as the aim was to capture an ample range of nuances related to the node. All the collocations presented below were highly significant (as per LL-based p) and strongly associated with the node (as per LogR).

Specifically, the frequency data and collocations of the following central nodes were identified: migra*, immigra*/inmigra*, and emigra*. Correspondingly, the frequencies/collocations of terms formally related to these nodes (e.g. migration, immigration, and emigration) were examined. By doing so, I aimed to identify repeated meanings and associations since the more a node is used within a certain context, the more speakers are exposed to it and prone to incorporating it in their discourse (Hoey 2005). For comparison, I also searched for the frequency data of the secondary nodes (i.e. refuge* and asyl* in the English corpora; refuge* and asil* in the Spanish corpora). These query nodes were drawn from the work of RASIM researchers and Taylor (2014), with the addition of the emigra* family of terms.
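The span and frequency settings described above can be illustrated with a toy extraction routine. This is a sketch under stated assumptions, not the procedure implemented in ECPC-Web: the function `window_collocates` is hypothetical, tokens are assumed to be already split, wildcard nodes such as migra* are approximated by prefix matching, and no association statistic (LL or LogR) is computed.

```python
from collections import Counter


def window_collocates(tokens, node_prefixes, span=5, min_freq=5):
    """Count word forms found within `span` tokens either side of a node.

    `node_prefixes` is a tuple of prefixes standing in for wildcard
    queries such as migra* (hypothetical simplification).
    """
    lowered = [tok.lower() for tok in tokens]
    counts = Counter()
    for i, tok in enumerate(lowered):
        if tok.startswith(node_prefixes):
            left = lowered[max(0, i - span):i]       # up to 5 L
            right = lowered[i + 1:i + 1 + span]      # up to 5 R
            counts.update(left + right)
    # keep only candidates that reach the minimum frequency
    return {word: n for word, n in counts.items() if n >= min_freq}


# Toy input: the same invented sentence repeated five times
toy_tokens = "we must control illegal migration flows".split() * 5
print(window_collocates(toy_tokens, ("migra",), span=1, min_freq=5))
```

With a span of one word, only ‘illegal’ and ‘flows’ reach the threshold in the toy input; widening the span to five words admits more distant candidates, which is the broad perspective on collocation adopted here.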

Critical cross-linguistics synchronic and diachronic CADS analyses

Frequency as a corpus-related tool to visualise the spatio-temporal process of representation

Synchronic and diachronic frequencies and statistical results are important for studying the representation of migratory phenomena. Figures 2 and 3 show the raw and relative frequencies for the chosen nodes (migra*, immigra*/inmigra*, emigra*, refuge*, and asyl*/asil*) in all the ECPC sub-corpora from 2004 to 2008. The results are presented in the following order in this section, as if on a continuum that moves from Spanish to English, with the common (linguistic) trunk of the EP interventions: CD_04-08; EP-ES_04-08; EP-EN_04-08; HC_04-08.

Fig. 2: Raw frequencies. Raw frequencies for nodes in all corpora.

Fig. 3: Relative frequencies. Relative frequencies for nodes in all corpora.

I used the UCREL log-likelihood and effect size calculator to determine the magnitude of the statistical difference in the attention paid to all migration-related phenomena. Comparing CD_04-08 with EP-ES_04-08, LogR was 0.37 (LL = 278.53, p < 0.0001, BIC = 261.15). Hence, LogR showed slightly more node-related terms in EP-ES than in the CD. In other words, the difference between node frequencies was small, although according to LL, it was statistically significant; the BIC indicated ‘very strong evidence’ of the existence of difference.

A comparison between EP-ES_04-08 and EP-EN_04-08 yielded a LogR of 0.03 (LL = 2.43, p > 0.05, BIC = −15.04). Consequently, LogR showed ‘virtually the same frequency’Footnote 5 in both corpora. The small difference observed was not statistically significant and requires further confirmation. According to the BIC, there is ‘very strong evidence’ of similarity.

When comparing EP-EN_04-08 with HC_04-08, LogR was 1.23 (LL = 3942.16, p < 0.0001, BIC = 3924.10). According to LogR, there were more than twice as many node-related terms in the EP-EN group as in the HC group.Footnote 6 This difference was statistically significant (as per LL); the BIC indicated ‘very strong evidence’Footnote 7 of the existence of difference. Comparisons between the CD and HC are not discussed in this article owing to space constraints.

Thus, according to the global image portrayed in a study including all the selected terms (migra*, inmigra*/immigra*, emigra*, refuge*, asil*/asyl*), the Spanish and English versions of the interventions from the EP devoted the same amount of attention to migration between 2004 and 2008. In turn, the CD and EP (Spanish) discussed the migratory issue at relatively similar frequencies during the same period. In contrast, this issue was the least discussed by the HC.

In addition to the overall results, in which all the node-related terms were counted together, it was found that each node-related family of terms generated its own profile, as shown in Fig. 4.

Fig. 4: Frequency graph. Comparative chart for all nodes in all chambers.

Figure 4 may be complemented by the significance and effect size of individual node use during 2004–2008. This article only presents the statistics for migra* owing to space constraints (Table 1).

Table 1 Statistics on the use of migra* in the CD, EP-ES, EP-EN and HC.

As shown in Fig. 4 and Table 1, immigration-related terms were the most frequently used in all parliaments. Emigration-related terms were especially prominent in the Spanish CD (with a relative frequency of 34.28 pmw) and virtually non-existent in the HC (1.26 pmw). In contrast, the least frequent terms in the CD were those related to refuge* (10.80 pmw) and asil* (17.80 pmw). Notably, MEPs in both the English and Spanish versions of debates used all the different nodes significantly more frequently than their counterparts in the HC and CD groups. Additionally, the EP used most node-related terms in an unsurprisingly similar manner in both the Spanish and English versions. The striking exception was the frequencies for migration-related terms, which differed significantly (LL = 269.59, p < 0.0001, LogR = 0.71, BIC = 252.13) in the Spanish (91.05 pmw) and English (148.76 pmw) versions. This can be attributed to translational interactions among other factors; however, this hypothesis requires further investigation. Finally, the HC was statistically significantly different from the other chambers.

It was also observed that, quantitatively, migration representation underwent an overlapping and dissimilar chronological progression in each parliament. Owing to space constraints, only the migra* occurrences during 2004–2008 are shown in Fig. 5.

Fig. 5: Diachronic graph. Diachronic chart for migration-related hits.

As seen above, the progression of CD_04-08 was especially spiky and unstable, with potential maximum exposure to contextual events. By contrast, the progression of HC_04-08 was unique because of its steady growth. Moreover, the chronological line of the CD was closer to that of EP-EN_04-08 and EP-ES_04-08. In the context of the EP, the Spanish and English versions mirrored each other, albeit with frequency differences.

Examining different settings during different periods provided an all-encompassing view of migration, highlighting when and where inflections were identified and when trajectories changed direction. The results offer a global picture of the object of the study and hint at further partial niches in the analyses.

Collocations as a corpus-related gateway to the representation of migratory movements

The study of collocations also concentrated on migra*-related terms. In this case, ECPC-Web generated 37 collocations for the nodes under study in CD_04-08, 57 in EP-ES_04-08, 97 in EP-EN_04-08, and 105 in HC_04-08. After normalising the raw figures, the following findings were observed:

  (1) The corpus that generated the most collocations was EP-EN_04-08 (5.11).

  (2) The corpus that generated the fewest collocations was HC_04-08 (2.08).

  (3) The normalised figure for collocations of EP-ES_04-08 was 2.93.

  (4) The normalised figure for collocations of CD_04-08 was 2.31.

  (5) CD_04-08 generated 21.16% fewer collocations than EP-ES_04-08. However, this difference was not statistically significant (LL = 1.29; p > 0.05). Moreover, the BIC (−16.09) indicated ‘very strong evidence’ in favour of similarity.

  (6) HC_04-08 generated 59.30% fewer collocations than EP-EN_04-08; the difference was statistically significant (LL = 39.24; p < 0.00001), and according to the BIC (21.18), there was ‘very strong evidence’ of difference.

  (7) EP-ES_04-08 generated 42.66% fewer collocations than its English equivalent. The difference was statistically significant (LL = 11.51; p < 0.001);Footnote 8 according to the BIC (−5.95), there was only ‘positive evidence’ in favour of similarity due to the especially stringent requirement stipulated by Wilson (2013).
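The normalised figures in this list can be retraced from the raw collocation counts and the token counts of each sub-corpus. The sketch below assumes that normalisation is per million tokens and that the percentages in items (5)–(7) express the gap between two rounded normalised figures as a share of the larger one; both assumptions are inferred from the reported numbers rather than stated in the text, and the helper names are hypothetical.

```python
corpus_tokens = {
    "CD_04-08": 16_014_420,
    "EP-ES_04-08": 19_440_223,
    "EP-EN_04-08": 18_970_278,
    "HC_04-08": 50_598_543,
}
collocation_counts = {
    "CD_04-08": 37,
    "EP-ES_04-08": 57,
    "EP-EN_04-08": 97,
    "HC_04-08": 105,
}


def per_million(count: int, tokens: int) -> float:
    """Collocations per million tokens, rounded to two decimals."""
    return round(count / tokens * 1_000_000, 2)


def gap_share_of_larger(a: float, b: float) -> float:
    """Gap between two figures as a percentage of the larger one."""
    larger, smaller = max(a, b), min(a, b)
    return round((larger - smaller) / larger * 100, 2)


normalised = {
    name: per_million(collocation_counts[name], corpus_tokens[name])
    for name in corpus_tokens
}
print(normalised)
print(gap_share_of_larger(normalised["EP-ES_04-08"], normalised["CD_04-08"]))
print(gap_share_of_larger(normalised["EP-EN_04-08"], normalised["HC_04-08"]))
print(gap_share_of_larger(normalised["EP-EN_04-08"], normalised["EP-ES_04-08"]))
```

Working from the rounded figures matters here: the unrounded values yield slightly different percentages.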

Again, these figures distinguish the HC. The remaining chambers had similar non-significant or negative BIC scores, with collocational activity being particularly frantic in EP-EN_04-08.

The collocations must be narrowed further for a qualitative analysis to make the results comparable. Thus, I limited the focus to the top 35 collocations with the largest LogR in each chamber, highlighting the items or hits with the strongest effect size. I then classified them using thematic tags, following the inductive procedures proposed by the RASIM researchers, Zapata-Barrero (2009) and Taylor (2014). These tags were as follows: denomination of migration-related events, participant(s) in migratory events, institutions related to migration, movements as distinctive features of migration, places of interest, causes and effects of migration, proposals (or plans of actions), quantity as an objectifying mechanism of representation, and others (particularly cohesive/grammar-based collocations). These themes enabled the identification of the common and diverging areas of representation and simplified the understanding of the image built around collocations and their reactive or proactive nuances.
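The two steps just described, ranking by effect size and grouping by theme, can be sketched as follows. Everything in this example is illustrative: the LogR scores are invented, the helper names are hypothetical, and the tag assignments are made by hand for a handful of collocates mentioned in this paper; the actual classification was carried out manually on the full lists.

```python
from collections import defaultdict

# Hypothetical (collocate, LogR) pairs; the scores are invented for illustration.
toy_collocates = [
    ("workers", 4.9),
    ("pressures", 6.1),
    ("mass", 3.8),
    ("routes", 5.4),
    ("regulate", 4.2),
    ("the", 0.4),
]

# Hand-assigned thematic tags (illustrative subset of the tag set above).
toy_tags = {
    "pressures": "denomination",
    "workers": "participants",
    "routes": "movements",
    "mass": "quantity",
    "regulate": "proposals",
}


def top_n_by_logr(collocates, n=35):
    """Keep the n collocates with the largest LogR."""
    return sorted(collocates, key=lambda pair: pair[1], reverse=True)[:n]


def group_by_tag(collocates, tags):
    """Group collocates under their thematic tag; untagged items go to 'others'."""
    grouped = defaultdict(list)
    for word, _score in collocates:
        grouped[tags.get(word, "others")].append(word)
    return dict(grouped)


selected = top_n_by_logr(toy_collocates, n=5)
print(group_by_tag(selected, toy_tags))
```

Fixing the cut-off at the same number of items in every chamber is what keeps the thematic profiles comparable across sub-corpora of very different sizes.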

For efficiency, the discussion below begins with the sub-corpora that had the greatest overlap of thematic areas, EP-ES_04-08 and EP-EN_04-08, and concludes with the remaining sub-corpora (CD_04-08 and HC_04-08). However, Table 2 shows that a few collocations were common amongst all chambers. The translations of Spanish collocations are my own and appear between square brackets.

Table 2 Common collocations in all the chambers.

This means that, from 2004 to 2008, the common representation—derived from all the chambers at once through collocations of comparable migra* items—is that of an illegal flow of participants and events that requires control and management. EP-EN_04-08 even specifies that this management is to be done properly: ‘well-managed’. In the context of this study’s theoretical framework, this represents a reactive image, built upon terms easily associated with the principles of security and stability (‘illegal’, ‘flow’, and ‘control’) and the efficiency of resources (‘manage’). Therefore, the overlapping discourses across parliaments were fully reactive. See an example below:

Together, we must fight the degrading spread of illegal migration, which is why we need a strong and decisive European Union, a Union that enjoys the trust of those who live in it. (EP-EN_070523)

Focusing on individual settings, I analysed the English and Spanish versions of the EP discourses. Table 3 shows that both versions reinforce the common reactive representation described above:

  (1) The denominations of migratory events as threatening or evoking the principle of security and stability (‘pressures’, ‘legal’).

  (2) The participants were viewed as efficient resources for labour (‘workers’, ‘circular’, and ‘skilled’).

  (3) The institutionalisation of discourse at member states’ ‘ministerial’ levels. I believe this contributes to a reactive discourse in that the centre of responsibility is mainly displaced away from the EU.

  (4) The depiction of migration as a series of inward movements (note the use of the plural, which quantifies the flow of participants as multiple events) with established ‘routes’ for ‘arriving’ and external participants who are potential threats to the stability and security principle.

  (5) The quantification of participants and events (‘mass’, ‘massive’), which is a typically reactive strategy according to Hart (2021), and an indisputably reactive proposal of action (‘regulate’).

Table 3 EP-EN_04-08 and EP-ES_04-08 collocations.

For example:

In fact, this is a key point since it relates to participation in a single space and a single market which, furthermore, must face serious and heavy migratory pressures at its external borders. (EP-EN_20051025)

EP-ES_04-08 and EP-EN_04-08 only shared three neutral collocations around migra*: one general reference to migratory events (‘phenomena’) and two official labels, one for a particular type of phenomenon (‘asylum’) and another for a concrete kind of participant (‘refugees’). Official labels may be seen as contributing to a proactive discourse by acknowledging, in a neutral manner, the participants’ status (which accords with international laws but not always with international practice). These labels refer to what Taylor (2020) calls legitimising items.

EP-ES_04-08 had the unique proactive collocation of ‘causa’ [causes], which at least acknowledges the existence of a motivation. However, this corpus also adds to the overall reactive representation (discussed above) with its own reference to the economic [económica] side of migratory events and the quantification that the collocation ‘altamente’ [highly] conveys. Both types of language associations are closely linked to the principle of resource efficiency. The Spanish version of the EP interventions also highlights the ‘Euroafricana’ [Euro-African] situation, a collocation that was statistically significant only in Spanish. The following example presents the collocational data in context:

… otro de los objetivos de esta Conferencia fue sin duda debatir las formas de reducir las causas de la migración; es decir, emprender la acción preventiva pertinente. (EP-ES_060516)

[another of the aims of this conference was undoubtedly to discuss ways to reduce the causes of migration; that is, to undertake the pertinent preventive action; author’s own translation]

The English version of the EP discourse (i.e. EP-EN_04-08) further added to the largely reactive representation with the following:

  1. Eurocentric denominations of migratory events, especially ‘admission’, emphasising that migrants need member states’ acquiescence. Furthermore, events are qualified as ‘irregular’ and, hence, suspicious for the stability and security of receiving states. They are also conveyed as ‘patterns’, implying an underlying order that could be interpreted as orchestrated and devised.

  2. A frequent emphasis on participants as (often somewhat defective) resources (‘unskilled’).

  3. A metaphoricalisation of migratory movements as ‘channels’, ‘influx’, and ‘waves’, which is a typical mechanism of reactive discourses (Zapata-Barrero 2009).

The following sentence serves as a contextualising example:

Let us not forget that our partners are now transit countries for the waves of migration from sub-Saharan Africa. (EP-EN_080605)

However, EP-EN_04-08 proposed proactive plans of action (‘integrating’) that support the principles of cohesion and integration:

Against the backdrop of these challenges, the need to develop common procedures for integrating migrants and to devise common procedures over immigration policy appears both urgent and unavoidable. (EP-EN_061023)

Nevertheless, this result should be considered with caution since integration may have egalitarian nuances or imperialistic connotations.

Next, I examined the results for the two monolingual chambers, the CD and the HC, beginning with the CD (Table 4).

Table 4 CD_04-08 collocations.

Table 4 indicates that CD_04-08 shared some migra*-related collocations with both the EP-ES_04-08 and EP-EN_04-08 corpora, contributing mainly to reactive discourses that were conveyed through the use of:

  1. Migration/migrant-related collocations, such as ‘presión’ [pressure], ‘legal’ [legal] and ‘circular’ [circular]. These collocations denote either a threat to the principle of security and stability or an attempt to reinforce it. Incidentally, the only general collocation associated with migratory events was ‘fenómeno/s’ [phenomenon/a], which seems to refer to these events in a neutral manner.

  2. Collocations that particularly highlight movements [movimientos] associated with migratory phenomena, especially when these movements are portrayed as occurring on established routes [rutas] that may be seen as consolidated paths that threaten EU countries.

  3. Collocations that encapsulate proposals typically associated with the principle of resource efficiency (e.g. ‘regulación’ [regulation]) and are, hence, reactive according to this study’s theoretical framework.

A contextualised example is presented below:

Una globalización sin reglas, los problemas relacionados con el medio ambiente, los problemas del terrorismo internacional, los conflictos internacionales o los movimientos migratorios tienen su solución, (CD_040615)

[A globalisation without rules, problems regarding the environment, problems of international terrorism, international conflicts or migratory movements have a solution; author’s own translation].

The only CD_04-08 collocation shared with EP-ES_04-08 but not with EP-EN_04-08 (‘economic*’) served to reinforce reactive mechanisms, foregrounding the efficiency of resources:

Si esto no se hace en los ámbitos institucionales a nivel internacional, esta realidad no la parará nadie; no habrá agencias capaces de contener el fenómeno migratorio por razones económicas. (CD_060621)

[If this is not done within international institutional realms, nobody will stop this reality; there will be no agencies capable of containing the migratory phenomenon driven by economic reasons; author’s own translation].

CD_04-08 and EP-ES_04-08 also shared the collocation ‘materia’; however, this was not taken into consideration since it forms part of a fixed linking phrase (‘en materia de’ [as regards]) that does not affect discursive nuances. Incidentally, in terms of the top 35 collocations, CD_04-08 had nothing in common with HC_04-08.

Finally, Table 5 shows some reactive and proactive collocations that HC_04-08 shared with both versions of the EP interventions.

Table 5 HC collocations.

Among the reactive collocations, the following observations are noteworthy:

  1. Hits that particularly highlight the labour-related nuances of migra* terms, supporting the principle of resource efficiency, such as participants who are described as ‘skilled’, ‘worker/s’, or even ‘lawful’ (the HC_04-08 equivalent to ‘legal’ in EP-EN_04-08 and EP-ES_04-08).

  2. Hits (like ‘inward’) that emphasise the self-centred viewpoint of the description of events.

  3. The quantification in the description of moving participants, who are described as a ‘mass’.

Regarding slightly more proactive results, HC_04-08 shared two neutral collocations (‘asylum’ and ‘refugees’) with both versions of the EP corpora. These collocations can be viewed as proactive nuances, as they serve to acknowledge such realities. The following example serves to contextualise the discussion:

Save for those elements that may be introduced earlier, including Tier 1 and the Highly Skilled Migrants Programme, I will say that I am happy not to go down the route of introducing the provisions of the Bill that refer to appeals until the points system is in place. (HC_051116)

HC_04-08 statistically shared reactive collocations with the English version of the EP (‘patterns’, ‘irregular’, ‘unskilled’, ‘influx’) that mainly uphold the principles of security, stability, and resource efficiency. Again, there were few neutral collocations (‘seekers’); at best, these collocations could be viewed as proactive for acknowledging the existence of international statuses.
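The distinction between shared and exclusive collocations drawn throughout this section can be sketched as a simple set comparison. The following Python snippet is only a minimal illustration and not the procedure or software used in this study: the function name `compare_collocates` is hypothetical, and the two lists contain just a few of the collocates quoted in this section rather than the full top-35 lists.

```python
def compare_collocates(corpus_a, corpus_b):
    """Split two collocate lists into shared and corpus-exclusive items."""
    a, b = set(corpus_a), set(corpus_b)
    return {
        "shared": sorted(a & b),   # collocates present in both lists
        "only_a": sorted(a - b),   # collocates found only in the first list
        "only_b": sorted(b - a),   # collocates found only in the second list
    }


# Partial, illustrative lists (assumption: a small subset of the collocates
# quoted in the text, not the complete tables).
ep_en = ["patterns", "irregular", "unskilled", "influx",
         "waves", "channels", "admission", "integrating"]
hc = ["patterns", "irregular", "unskilled", "influx",
      "awful", "unauthorised", "indigenous", "outward"]

result = compare_collocates(ep_en, hc)
print(result["shared"])
```

With these partial lists, `result["shared"]` contains ‘influx’, ‘irregular’, ‘patterns’, and ‘unskilled’. The exclusive items hold only relative to the partial lists entered here; a comparison across languages would additionally require mapping collocates onto translation equivalents.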

Compared with the other chambers, some collocations among the top 35 were exclusive to the HC_04-08 corpus. At face value, these are all reactive, particularly those that:

  1. Describe migration-related events using terms associated with the principles of security and stability, such as ‘awful’ and ‘unauthorised’.

  2. Differentiate between national (‘indigenous’) citizens and others.

  3. Place emphasis on negative effects through hits such as ‘impact/s’ and ‘exploitation’.

  4. Quantify events and especially people (‘net’, ‘highly’, ‘large-scale’, ‘point-based’).

  5. Accept the possibility of people moving inward (as a threat) but also being expelled ‘outward’.

The top 35 collocations of HC_04-08 reflect greater institutionalisation in dealing with migratory events than those of the other chambers, with hits such as ‘advisory’, ‘forum’, and ‘balanced’ (in the names ‘Migration Advisory Committee’, ‘Migration Impact Forum’, and ‘the group on Balanced Migration’, respectively). The main migratory loci of concern for the HC lie both within the EU (i.e. ‘A8’ and ‘Eastern European countries’) and outside it (‘non-EU’ regions). The following example provides contextualisation:

But British or foreign, migrant or indigenous, legal or illegal, a worker is a worker. (HC_040521)

Conclusion

The data utilised in this study are so profuse and varied that it is only appropriate to briefly review their comparative and sociological repercussions. As seen above, this study undertook two cross-linguistic/cultural analyses, one of frequency and one of collocation. Both analyses contribute to the generation of complex information (global and partial) about migration representation in parliamentary settings. This complex information facilitates a deeper understanding of events at both micro- and macro-levels. On their own, the values mean little: it is only when we activate the synchronic and diachronic lenses that we can start to make sense of them. Thus, by looking inwards (i.e. within a parliament) and outwards (i.e. across parliaments), I was able to determine that the EP had the highest normalised number of queried linguistic nodes. In contrast, the HC had the lowest normalised number of queried terms. This result was consistent both for the data from all chambers and all term nodes and for the isolated use of different term families (for example, ‘migra*’ and ‘immigra*’). Thus, the comparative data suggest that migration is a larger question at the European level than at the national level. Further, the comparative data also show that the CD is closest to the EP and farthest from the HC, which is, in (almost) all synchronic respects, the odd one out.

In addition, the diachronic graphs add to this complexity. If we focus on the CD, we observe a spiky evolution that denotes instability. Nevertheless, when we look outwards (i.e. towards the EP and, especially, the HC), this instability acquires its full dimension vis-à-vis the HC’s flat line. This is an unsurprising yet interesting result because it confirms that we are examining a period of migration representation building in the CD. In this respect, most associations in the Spanish chamber occurred within areas linked to the principles of security, stability, and resource efficiency. This suggests that, from the early stages, migration has been represented (and, thus, conceived) in the CD as a challenge that needs control rather than as an opportunity. In addition, the fact that the HC is the odd one out in all frequency-related respects brings to mind Brexit and is another piece contributing to its portrayal and discussion.

Regarding collocation, quantity also plays a role; however, it is the comparative procedure that helps put the figures into perspective. For instance, collocational activity was particularly intense in EP-EN_04-08 and certainly poorer in HC_04-08. Discussions were not only less frequent in the House of Commons, but they also created fewer connections around query nodes.

Furthermore, this study’s theoretical framework adds nuances to the complex image of migration representation in parliaments. By moving from collocations to semantic grouping to legal principles and proactive and reactive discourses, I moved beyond the linguistic field and entered higher levels of abstraction. Thereby, the analysis revealed ‘widespread patterns’ (Baker and McEnery 2005, p. 198) of socio-political reactivity and indicated the ‘rare but telling examples’ (Baker and McEnery 2005, p. 198) of proactivity. Cases of Taylor’s (2020) opposition and conflation were partially highlighted by the cumulative associations between ‘manage’ and ‘migration’ (in contrast with uncontrollable, chaotic flows) that eventually channel reactive representations. Although rare, proactive representations do exist. This study’s analyses identified how they are conveyed through neutral collocations (official, legitimising denominations such as ‘asylum’ or ‘refugees’) and those especially related to the mechanisms of cohesion and trust (e.g. ‘integrating’ or ‘integration’). Proactive representations can lay fertile ground for even more proactive discourse in the times ahead. Future research could confirm how the proactive messages discussed here materialise in today’s ideologies.