Abstract
Keywords
Introduction
Although some early studies warned of negative effects of bilingualism on children's cognitive development, such as poor school performance, lower verbal intelligence, emotional problems, and split personalities (for information on research in the early 1900s, see Grosjean, 1982), others, mostly conducted in the late 1990s and early 2000s, provide evidence that bilingualism enhances executive control. Executive control has been used as a term encompassing cognitive skills such as switching attention between tasks or inhibiting attention to stimuli that need to be ignored (Bialystok et al., 2008; Bialystok & Martin, 2004; Yow & Li, 2015, etc.). Some other authors (Antón et al., 2016; Miyake & Friedman, 2012, etc.) work with the concept of executive functions such as inhibition (the ability to suppress dominant responses), shifting (the ability to switch between tasks), and monitoring (the ability to update information in the working memory). The enhanced executive control in bilinguals has been believed to stem from their continuous need to switch between their languages as well as the need to suppress information related to the other language, which is irrelevant in the given language context. In this respect, bilinguals are often described as individuals capable of greater interference suppression, or having an improved inhibitory control, compared with their monolingual counterparts (Bialystok & Martin, 2004; Dagenbach & Carr, 1994; Dempster, 1992; Martin-Rhee & Bialystok, 2008).
Some studies also propose the Adaptive Control Hypothesis, whose proponents point out that using a language in various contexts activates different types of control processes in the individual’s brain. Green and Abutalebi (2013) distinguish eight control processes (goal maintenance, conflict monitoring, interference suppression, salient cue detection, selective response inhibition, task disengagement, task engagement, opportunistic planning), which get activated in three different real-world interactional contexts (single language, dual language, and dense code-switching) and which adapt to the demands imposed on them by these different contexts. They stress that language use in bilingual speakers increases the demand on the processes involved in utterance selection over and above those that are imposed on monolingual speakers.
The inhibitory control advantage that bilingual speakers reportedly have over the monolingual ones has also been supported by two other hypotheses: the Bilingual Inhibitory Control Advantage hypothesis proposing a bilingual advantage specific to the presence of conflict and the Bilingual Executive Processing Advantage hypothesis advocating a global advantage in processing across all contexts (Hannaway et al., 2019).
To measure an individual’s inhibition control capacity, several kinds of tests have been used, ranging from the Simon task, which measures subjects’ reaction times in situations where there is a match between the location of a stimulus and that of the response, to the Eriksen flanker task, where the relevant stimulus is “flanked” by irrelevant ones. In card-sort tasks, subjects are asked to sort cards according to one set of rules and subsequently another one. They are, for instance, asked to sort cards according to the colors with which the shapes drawn on the cards are filled and subsequently to sort them according to the shapes per se (Bialystok, 1999; Bialystok & Martin, 2004). In the color-based Stroop test the focus is on reactions to language-related stimuli. In the Simon task, the speed of a spatial response (e.g., pressing a left or right button) to a spatial stimulus (e.g., the ear into which a sound is played) (Simon, 1969) or a non-spatial one (e.g., its shape, color, or the pitch of the tone played) (Hübner & Mishra, 2013; Proctor, 2011) is measured. The Eriksen flanker task, as originally described by Eriksen and Eriksen (1974), tested subjects’ reactions to stimuli presented in the form of letters flanked by “noise letters” and measured their reaction times in situations where the target letters and the noise ones were either congruent (the noise letters identical with the target one) or incongruent (the noise letters different from the target one). Since then, other variants of the task have been used, including ones developed especially for children with pictures of fish (swimming either left or right) (Yang & Lust, n.d.).
When devising the Stroop test, its founder, Ridley Stroop, drew on the law of associative inhibition which Kline describes as “If
Since then, a number of studies (Bialystok et al., 2008; Heidlmayr et al., 2014; Wang et al., 2016, etc.) showing that bilinguals tend to exhibit a lower Stroop effect have been conducted. The Stroop effect size is calculated as the difference between the naming reaction time in the situation where the color and the word it is printed in are incongruent and the naming reaction time in the situation where these two are congruent. In this respect, the lower Stroop effect suggests a higher degree of semantic contents inhibition—a phenomenon identified in the individuals using more than one language. As the studies point out, the increased inhibitory potential manifested by bilinguals in the Stroop test seems to relate to their language-switching experience.
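The calculation described above can be written compactly. The notation is introduced here only for illustration: $\overline{RT}$ denotes the mean naming reaction time over the trials of a given condition.

```latex
\[
\text{Stroop effect} \;=\; \overline{RT}_{\text{incongruent}} \;-\; \overline{RT}_{\text{congruent}}
\]
```

Under this definition, a smaller value means that incongruent word meaning slows color naming less, that is, less interference.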
Some studies emphasize the relationship between the Stroop effect size in bilinguals and other factors such as language proficiency and type of bilingualism (balanced vs. unbalanced), age of the bilinguals at the time of Stroop test, duration of immersion in the second-language (L2) environment, or even some hidden demographic factors that tend to be differently distributed among the bilingual and monolingual groups. When it comes to language proficiency and the type of bilingualism, Yow and Li (2015) found that an earlier age of the L2 acquisition and a more balanced use of the two languages result in a smaller interference effect in the Stroop task. A similar, that is, greater, interference effect was also observed in older unbalanced bilinguals by Zied et al. (2004) in the situations where they responded with their L2 to visual stimuli written in their dominant language with balanced bilinguals showing equivalent interference effects between all conditions. Rosselli et al. (2002), studying Stroop test performance among Spanish-dominant and English-dominant bilinguals, observed that the former were significantly slower in all the test conditions in English and the latter significantly slower in all the test conditions in Spanish. The level of language proficiency as a factor affecting multilinguals’ cognitive functioning also appears to be central to the threshold hypothesis (Cummins, 1976; Farrell, 2011), according to which there is a threshold level of linguistic competence which an individual must attain to enable the potentially beneficial aspects of being multilingual to influence their cognitive functions. As regards the age at which the test is conducted, Bialystok et al. (2008) reported a greater bilingual advantage in the Stroop task in older adults. 
However, the same research group reports no differences in performance among university undergraduates and attributes this to the fact that cognitive performance is at its peak in this age group (Bialystok et al., 2005). Ivnik et al. (1996) and West and Alain (2000) showed that there is a significant decline in Stroop test performance with advancing age, while Comalli et al. (1962) demonstrated that older adults and children show longer response latencies than young adults. The duration of immersion in an L2 environment has been documented as a factor influencing the Stroop effect in the subjects’ first language (L1) by Heidlmayr et al. (2014), who point out a positive correlation between the two variables while reporting a negative correlation between this effect and the frequency of L2 and third language (L3) use as well as the L2 proficiency. Poarch and Van Hell (2012) report enhanced conflict resolution and orientation in the Attention Network Task in early childhood (EC; 4–6 years) bilingual and trilingual children compared with the performance of L2 children (studying English), and marginally enhanced conflict resolution of the former compared with the latter in the Simon task. Paap et al. (2015) also claim that executive functions are affected by immigrant status, educational level, socio-economic status, and other factors.
There is also a growing body of research indicating that the language the Stroop task is conducted in affects the Stroop effect size, together with the subjects’ proficiency in this language. Badzakova-Trajkov et al. (2004), comparing the effect between Macedonian–English bilingual subjects and English monolingual subjects, arrived at the conclusion that bilinguals, who performed the test in both Macedonian and English, had displayed longer reaction times than monolinguals. This also applies to the times related to congruent and control conditions in their L1, which, as the researchers believe, is indicative of a change in the L1 processing due to the acquisition of an L2. In another study, Rosselli et al. (2002) tested Spanish–English bilinguals, English monolinguals, and Spanish monolinguals using the Stroop test to find whether there were any differences between the groups. They found that unbalanced Spanish-dominant bilinguals were significantly slower than the unbalanced English-dominant ones and the balanced bilinguals in the English color-naming condition. Similarly, Yow and Li (2015), who conducted the Stroop test among English–Mandarin young adult bilinguals, show that the more balanced use of two languages and the earlier age of L2 acquisition is, the smaller the interference effect in the Stroop task.
However, when the Stroop test requires subjects to respond in a language other than the one in which the stimuli are presented, high proficiency in the stimulus language appears to have the opposite effect, that is, it results in greater between-language interference and thus higher Stroop effect values. This was observed by Sumiya and Healy (2008), who conducted an experiment with native English speakers and showed that between-language interference was greater for English stimuli with Japanese responses than for Japanese stimuli with English responses. Moreover, the researchers stress that the Stroop effect is also partly determined by the phonological similarity between the color words in the two languages, since this similarity leads to greater interference.
Similarly, Marian et al. (2012) conducting the Stroop test on multilinguals in three languages found that the multilinguals were faster and more accurate in the within-language-competition condition than in the between-language-competition condition. They also concluded that another factor affecting the multilingual Stroop test performance in terms of speed, accuracy, and error patterns was the participants’ language proficiency.
However, research has recently appeared disputing the effect of (bi-) multilingualism on executive functioning, of which the inhibition control measured by the Stroop test is a part. Grundy and Timmer (2017), for instance, in their meta-analysis of 27 studies focusing on children and adults, report that bilinguals have a small advantage over monolinguals in working memory. Similarly, Adesope et al. (2010), investigating the effects of bilingualism on cognition in children and young adults in their meta-analysis, found a moderate effect of bilingualism on cognition, while Lukasik et al. (2018), comparing early and late bilingual versus monolingual working memory performance using verbal and visuospatial working memory composites, also came to conclusions not supporting the bilingual executive advantage hypothesis. Similarly, Paap et al. (2015) point out that more than 80% of the tests for bilingual advantages conducted after 2011 yield null results. They also claim that “the cumulative effect of confirmation biases and common research practices has either created a belief in a phenomenon that does not exist or has inflated the frequency and effect size of a genuine phenomenon that is likely to emerge only infrequently and in restricted and undetermined circumstances” (p. 265).
Likewise, the results of the study conducted by Kousaie and Phillips (2012), focusing on Stroop interference, do not show any differences in the variable between young bilingual adults and their monolingual counterparts making the authors question the evidence of bilingual advantage and the robustness of previous findings. Von Bastian et al. (2016), investigating 118 young adults in relation to 4 most prominent hypotheses of bilingual advantages for inhibitory control, conflict monitoring, shifting, and general cognitive performance, found no cognitive advantages related to bilingualism. Dick et al. (2019) in their study of 9- to 10-year-olds across the United States show that there is little evidence for a bilingual advantage for inhibitory control, attention and task switching, or cognitive flexibility.
Last but not least, de Bruin et al. (2015) mention the publication bias regarding studies investigating bilingual advantage claiming that while the studies fully supporting the bilingual-advantage theory were most likely to be published (followed by studies with mixed results), those challenging the bilingual advantage were published the least.
Goals of the study
The goal of this study was to find whether there are any differences in Stroop effect between Swedish EC monolingual and EC (bi-) multilingual university students in the language which is not their EC language and which the students learnt during the course of their previous studies at primary and secondary schools, that is, English. The terms “EC monolingualism” and “EC (bi-) multilingualism” were used in line with Poarch and Van Hell (2012) and Yang et al. (2011). The study was preceded by a pilot one involving a lower number of subjects, whose purpose was to test the feasibility of a study of a larger scale, that is, the one described in this paper.
Unlike in the previous studies, the aim was to find whether the number of languages one acquires as a young child in a natural acquisition setting affects their inhibition control in their early adulthood within the language they learnt later in an instruction-based setting, that is, during language lessons, and through exposure to activities of their interest outside the school. The reason why the inhibition control was measured in English, was twofold:
(a) To use the language in which the subjects were approximately on the same proficiency level (CEFR B2-C1—a requirement the students need to meet in order to be admitted to the university) to avoid the situation where different language proficiencies have an undesirable effect on the Stroop effect size (Yow & Li, 2015; Rosselli et al., 2002);
(b) English as a school subject studied by all the students represents knowledge adopted to a great extent at school, and thus the Stroop test with its incongruent conditions required the subjects to apply a new rule to this knowledge—a situation resembling an educational one of the type where new concepts are learnt through re-structuring the old ones; the data collected in this study were also intended to be used in another analysis investigating the relationship between the students’ inhibition control and their academic performance.
Another aim of the study was to find whether one can observe a relationship between the perceived proficiency in the subjects’ mother tongues and the Stroop effect since, according to the threshold hypothesis (Cummins, 1976), the higher proficiency in these might correlate with the Stroop effect size, that is, the higher the proficiency, the lower the Stroop effect. The Stroop test has been chosen as a method to measure the participants’ inhibitory control instead of the Simon test since some previous findings suggest that bilingualism may engage Stroop-type cognitive control mechanisms more than Simon-type mechanisms (Blumenfeld & Marian, 2014).
Method
Participants
The Stroop test was conducted on a total of 111 students attending Södertörn University, Stockholm, Sweden. The subjects were selected through convenience sampling with only one condition applied, namely that none of the participants had English as their L1. Six of them were excluded as outliers: four due to their age (older than 40), one due to his number of color identification errors exceeding 40, that is, a risk-taker prioritizing speed over accuracy, and one who failed to specify the number of languages she had acquired in her EC. Out of the remaining 105 participants, 41 were EC monolinguals (acquiring one language in their EC) and 64 had more than one language acquired in their EC (55 EC bilinguals and 9 EC trilinguals). All the EC bilinguals and trilinguals were second-generation immigrants. The average age was 25.55 years (
At the time of the study, 10 students were enrolled in the Elementary School K-3 teacher training program, 51 in the Elementary School 4–6 teacher training program, 18 in the Secondary School teacher training program, and 26 in the English Studies program. All of the EC monolingual students acquired Swedish as their mother tongue, and all of the EC bilingual and multilingual students were early (bi-) multilinguals who learnt Swedish mostly through interaction with their teachers and peers as early as preschool and during out-of-school activities involving their Swedish-speaking schoolmates and friends at that age. Early pre-school exposure to Swedish was also one of the selection criteria for all the participants. In this respect, for the (bi-) multilingual participants, Swedish was considered their second (and for some of them third) EC language. It was also the language in which all of them studied most subjects within their university programs. The languages the bilinguals and multilinguals indicated as their first or second EC ones, that is, those they had spoken with their parent(s) since birth, were Arabic, Aramaic, Bengali, Berber, Bosnian, Bulgarian, Finnish, Greek, Kurdish, Luganda, Rutooru, Malaysian, Norwegian, Punjabi, Russian, Serbian, Somali, Spanish, Syrian, Thai, Tigrinya, Turkish, and Urdu. All the participants had studied English in the past, that is, at their elementary schools and high schools, and their knowledge of the language was at the B2-C1 level of the Common European Framework of Reference for languages (required by Swedish universities), which also enabled them to study English-related courses in it, namely those focusing on methodology of teaching English, literary studies, basics of linguistics, and university grammar of English.
Data collection methods
The data pertinent to the participants’ mother tongues such as their number, their perceived oral and written proficiency in these, and the information about whether the students still used their mother tongues at home as well as to what extent were collected through questionnaires distributed among the students at the beginning of each experimental session. The questions targeting the mother tongues required the participants to specify which of them they had spoken at home as children and with whom. The students were asked to mark the degree of their perceived oral and written proficiency in these on the Likert-type scale of 1–5 with 1 representing
To measure the participants’ inhibitory control, a computerized version of the Stroop test available at https://www.psytoolkit.org/ was used. The task consisted of two conditions on which the participants were tested: (1) congruent trials, with the words denoting the same colors as the ones in which they were displayed on the screen (see Figure 1) and (2) incongruent trials, with the words denoting other colors than the ones they were displayed in (see Figure 2). For each trial type the students were instructed to identify the color in which the word was displayed on the screen as quickly as possible by pressing a corresponding key on their keyboards. The keys the subjects were instructed to press were those that had on themselves the initial letters of names of the colors. Therefore, when the word “red,” for instance, got displayed in blue color, the students were supposed to press the

Figure 1. Stroop test: congruent condition.

Figure 2. Stroop test: incongruent condition.
There were four colors used in the test (red, yellow, blue, and green) and the students were instructed to press the
Prior to this analysis, a pilot study of a smaller scale was conducted with a lower number of participants during which some minor issues were identified. These were mostly related to the technical aspects of the study and its execution such as recruiting volunteers willing to participate in it (not many students willing to take part without any kind of reward, so a reward-based system was established) and making participants understand that it is the accuracy of the color identification process that matters, not only its speed. Special attention was also paid to giving the questionnaires the form that required as little further explanation on the part of the researcher during the time they were being filled in as possible since requests for explanations appeared to disrupt the overall survey phase. A formula was also created in Microsoft Excel to calculate Stroop effect from the raw values saved on a server and to count the color identification errors the subjects made during the test since neither of these results was provided by the Stroop test program. The reaction times for wrong answers (incorrect color identification) have been excluded from the calculation.
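The calculation that the spreadsheet formula performed can be sketched in a few lines of Python. This is a minimal illustration and not the formula used in the study: the trial record layout `(condition, reaction_time_ms, correct)` and the function name `stroop_summary` are assumptions made for the example, and the reaction times are invented.

```python
from statistics import mean


def stroop_summary(trials):
    """Return (stroop_effect_ms, error_count) for one participant.

    `trials` is a list of (condition, reaction_time_ms, correct) tuples.
    This record layout is hypothetical, not the PsyToolkit output format.
    """
    # Reaction times of wrong answers are excluded from the means.
    congruent = [rt for cond, rt, ok in trials if cond == "congruent" and ok]
    incongruent = [rt for cond, rt, ok in trials if cond == "incongruent" and ok]
    # Color identification errors are counted separately.
    errors = sum(1 for _, _, ok in trials if not ok)
    if not congruent or not incongruent:
        raise ValueError("need at least one correct trial per condition")
    # Stroop effect: mean incongruent RT minus mean congruent RT.
    return mean(incongruent) - mean(congruent), errors


# Invented example data for a single participant.
trials = [
    ("congruent", 600, True),
    ("congruent", 640, True),
    ("congruent", 900, False),    # error: excluded from the mean
    ("incongruent", 720, True),
    ("incongruent", 760, True),
    ("incongruent", 500, False),  # error: excluded from the mean
]
effect, errors = stroop_summary(trials)
print(effect, errors)  # 120 2
```

Keeping the error count next to the effect makes it possible to apply an error-based exclusion rule without a second pass over the raw data.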
Data analysis methods
For the between-the-groups comparison, the number of (bi-) multilinguals was reduced to 41 by using the random-sample-of-cases selection option in SPSS to obtain groups of equal size, that is, 41 each. The main Stroop effect size for those who were included in the experiment (
The Stroop effect sizes were subsequently compared for EC monolinguals and bilinguals (+trilinguals) and statistically analyzed using the independent-samples
The information obtained from the questionnaires about the perceived proficiency in the EC bilingual and multilingual students’ mother tongues was used to analyze any possible relationship between this variable and the inhibitory control indicated by Stroop test results with the aid of Pearson bivariate analysis. In this analysis, all the 60 EC (bi-) multilinguals who indicated their language proficiencies in their questionnaires were included.
Study results
The Stroop effect means are presented in Table 1, which shows the mean Stroop effect for the EC monolinguals being 119 ms (
Table 1. Stroop effect in EC monolinguals and EC bilinguals (multilinguals).
Table 1 also shows that the EC monolinguals involved in the experiment have longer average reaction times in both the conditions than their bilingual/multilingual counterparts, that is, in the congruent trials they appear to react on average about 36 ms, and in the incongruent trials about 23 ms, later than the latter.
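A short worked step shows how these two reaction-time differences relate to the difference in Stroop effect between the groups. The notation is introduced here for illustration, and the figures are the rounded values quoted above, so the result is approximate.

```latex
\begin{align*}
SE_{\text{multi}} - SE_{\text{mono}}
  &= \bigl(\overline{RT}^{\,\text{inc}}_{\text{multi}} - \overline{RT}^{\,\text{con}}_{\text{multi}}\bigr)
   - \bigl(\overline{RT}^{\,\text{inc}}_{\text{mono}} - \overline{RT}^{\,\text{con}}_{\text{mono}}\bigr) \\
  &= \bigl(\overline{RT}^{\,\text{inc}}_{\text{multi}} - \overline{RT}^{\,\text{inc}}_{\text{mono}}\bigr)
   - \bigl(\overline{RT}^{\,\text{con}}_{\text{multi}} - \overline{RT}^{\,\text{con}}_{\text{mono}}\bigr) \\
  &\approx (-23\ \text{ms}) - (-36\ \text{ms}) = 13\ \text{ms}
\end{align*}
```

Because the multilingual group's speed advantage is larger in congruent than in incongruent trials, being faster in both conditions still goes together with a slightly larger Stroop effect.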
The independent samples

Distribution of Stroop effect values in the sample.
The Stroop effect sizes were also compared with the degrees of self-rated oral and written proficiency that the EC multilingual participants (
Correlation between Stroop effect and proficiency in bilinguals’ (multilinguals’) native language (other than Swedish).
Discussion and conclusion
Despite numerous similar studies conducted among monolinguals and multilinguals identifying enhanced inhibitory control among the latter (Bialystok, 1999; Bialystok et al., 2008; Bialystok & Martin, 2004; Wang et al., 2016, etc.), there has recently been a range of those questioning the effect of (bi-) multilingualism on executive functioning (de Bruin et al., 2015; Dick et al., 2019; Kousaie & Phillips, 2012; Paap et al., 2015; von Bastian et al., 2016). To add to the existent body of knowledge in the field, this study investigated whether any cognitive advantage in the form of enhanced executive functioning, namely inhibitory control, can be found in young EC multilingual adults in the language they learnt later in their lives in relation to their EC multilingualism, as this theme does not seem to have been sufficiently explored hitherto.
English was chosen as the language in which the Stroop test was conducted, since all the participants were on approximately the same proficiency level in it. In this respect, the undesirable effect of varying language proficiency on the Stroop effect described by, for example, Singh and Mishra (2013) or Rosselli et al. (2002), was minimized. Moreover, all the participants had learnt English in an instruction-based setting, that is, as a school subject, and through other channels they were exposed to outside of school such as movies, podcasts, and computer games, that is, channels that are also often used in schools for language-learning purposes. It was thus hypothesized that any inhibitory control advantage indicated in it might be a predictor of the same advantage the students might benefit from when studying other school subjects in situations where inhibitory control needs to be activated, a novel aspect considered in another ongoing study investigating factors influencing university students’ performance. That is, the color identification rule in the Stroop test had to be followed in the context of previous knowledge the students had acquired at school. In this respect, the experiment made the participants apply a new cognitive concept requiring them to inhibit the semantic contents they had learnt at school and to prioritize visual stimuli in the form of color instead. In this way, an attempt was made to induce a situation activating cognitive processes that resemble those put into operation in school environments, especially in situations where new concepts are learnt through restructuring the ones students have adopted before.
The question was also whether the exposure of children to two or more L1s can have some long-lasting effects in their early adulthood.
Finally, the study sought to investigate whether there could be any relationship between the degree of proficiency in the native language (other than Swedish) in the (bi-) multilingual group and the Stroop effect. The initial hypothesis was that the higher proficiency in the native language (other than Swedish) among EC (bi-) multilinguals, presumably indicative of their sustained use, might have a positive effect on their capacity to inhibit irrelevant stimuli.
The comparison of the Stroop effect calculated from the participants’ reaction times indicates a slightly higher (by approximately 13 ms), albeit statistically non-significant, value for the participants with more than one language acquired in EC. The differences in the mean reaction times between the two groups in congruent and incongruent trials appear to be statistically significant, though. In this respect, (bi-) multilinguals’ reactions were on average 36 ms faster in congruent trials and 23 ms faster in incongruent trials than EC monolinguals’. As regards the relationship between the Stroop effect and the proficiency in the EC languages the EC multilingual participants (
One might hypothesize about why the results indicate no inhibitory control advantage for the EC (bi-) multilinguals—a phenomenon that has also been observed in other studies (e.g., Bialystok et al., 2005; Kousaie & Phillips, 2012; Morton & Harper, 2007). One of the explanations might be that the experiment was conducted in the language (English) that none of the participants uses as their EC one and which all of them have learnt in a language instruction-based setting and through extracurricular activities—a primary reason why English was chosen. That is, the selection of the language for the Stroop test is shown to affect the Stroop effect size by Heidlmayr et al. (2014), which, in their study, was significantly higher for L1 than L2, or by Sumiya and Healy (2008), who point out that this variable can even be affected by phonological similarity between the language the subjects are proficient in and the one the Stroop test is performed in. Another reason why no significant difference has been found within the study group might be that most participants are at the age at which, as Bialystok et al. (2005) point out, the smallest differences in the Stroop effect are observed. This is probably due to young adults usually being at the peak of their cognitive potential at this stage of their life and thus the differences in their inhibitory control are not as marked as they appear to be in young children or the elderly. This also appears to be the case with the study conducted by Kousaie and Phillips (2012), who found no Stroop effect differences in young and older adults.
Some studies also mention the proficiency in the language in which the Stroop test is conducted as a factor affecting its results. Singh and Mishra (2013), for instance, tested high- and low-proficiency bilinguals speaking Hindi and English and found that highly proficient bilinguals were faster in the Stroop task, showing that L2 proficiency influences monitoring and resolution skills. Similarly, Rosselli et al. (2002) claim that while in balanced bilinguals the language used in the Stroop test did not matter, in the unbalanced subjects the best-spoken language showed better results. The fact that proficiency in the language in which the Stroop effect is measured appears to affect its values represented another reason why English had been selected for the experiment, as all the participants were within a relatively narrow proficiency range (B2-C1 level of the Common European Framework of Reference for languages). However, no individual testing indicating the exact level of their proficiency in English was carried out; such testing might be considered in similar research in the future, as even differences within this relatively narrow proficiency range might affect the Stroop effect values. Moreover, the computer-based format of the Stroop test, with the students working individually at their computers without their choices being monitored by anyone else during the process, might have influenced their motivation to perform the task and the degree of conscientiousness with which they did it, potentially affecting their reaction times. Some studies (Kousaie & Phillips, 2012; Mezzacappa, 2004; Morton & Harper, 2007; Noble et al., 2005; Paap et al., 2015) question the relationship between what is presented by others as a bilingual advantage in executive functioning and bilingualism per se.
These authors ascribe any possible correlation between the two variables to demographic factors such as socio-economic or immigrant status and educational level, which tend to be distributed differently among monolinguals and (bi-) multilinguals. Therefore, some authors stress the importance of carefully matching the groups participating in experiments for these factors (e.g., Antón et al., 2016).
Finally, English as well as other languages the subjects involved in the study have studied at school make all of them either bilinguals or multilinguals. This aspect might also somewhat “cognitively equalize” the EC monolinguals and their EC bi- or multilingual counterparts at a later age, thereby rendering the initial cognitive advantages stemming from EC (bi-) multilingualism hard to observe or non-existent later. That is, none of the participants was monolingual at the time the study was conducted, which can also explain the incongruence in results between this study and the previous ones focusing on the same phenomenon that were conducted between monolinguals and multilinguals. In this respect, it is also the total number of all the languages that the subjects have a command of that might influence their inhibitory control—an aspect to be focused on in a follow-up study. Another factor related to the multilingual participants’ linguistic repertoire, which could also have a confounding effect on the results, is the high degree of heterogeneity of languages spoken by the participants in their homes. As the study was conducted in Sweden and, in addition, at an institution with a multicultural profile, it was impossible to match the participants on that factor while retaining the current number of subjects.
Other possible studies conducted in the future exploring the relationship between students’ inhibitory control in a language learnt at school and their academic performance (similar to Dvorak, 2024) in certain school subjects might also answer questions related to a possible relationship between these two variables.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
