The ability of young learners to construct word meaning in context

This study examines young English readers’ ability to infer word meanings in context and to use metacognitive knowledge for constructing word meanings in relation to their reading performance. The participants were 61 fourth-grade students in the United States, comprising 24 monolingual English-speaking (ME) students and 37 English-as-a-second-language (L2) students; each group was also divided into strong and emergent readers in English. Participants were asked to read aloud paragraphs containing words unfamiliar to them in two different contextual conditions (i.e., explicit and implicit conditions), to guess the unfamiliar word meanings, and to tell a teacher how they arrived at the inferred meanings. Quantitative analyses found significant differences between strong and emergent readers in their oral fluency as well as in their ability to infer word meanings and articulate their use of metacognitive knowledge. Although significant differences were found in the ability to infer word meanings and the use of metacognitive reasoning between ME and L2 students, such differences disappeared after controlling for the size of students’ receptive vocabulary. Qualitative analyses also revealed differences in the kinds of knowledge and strategies that strong and emergent readers relied on when constructing the meaning of unknown words in both explicit and implicit contexts.


Introduction
It is well known that vocabulary plays a critical role in reading comprehension (Marulis & Neuman, 2010; Moghadam, Zainal, & Ghaderpour, 2012; Takanishi & Menestrel, 2017), although the precise mechanism linking learners' vocabulary knowledge and reading comprehension is not totally clear (Cain, Oakhill, & Lemmon, 2004). Many studies have reported positive relationships between learners' vocabulary size (i.e., how many words they know, also referred to as vocabulary breadth) and their reading comprehension (Duke & Carlisle, 2011; National Institute of Health, 2000; Tannenbaum, Torgesen, & Wagner, 2006), but vocabulary size is only one aspect of learners' vocabulary knowledge (Nation, 2013). To better understand the relationship between young readers' vocabulary knowledge and reading comprehension, their depth of vocabulary knowledge, or "richness of word understandings" (Proctor, Silverman, Harring, & Montecillo, 2012, p. 1636), and their ability to construct word meanings in a given context (Duke & Carlisle, 2011; Walters, 2004) must be examined as well. Unfortunately, however, there is a paucity of research looking into young readers' depth of vocabulary, their ability to infer meanings in context, and how such lexical abilities relate to reading comprehension. Research in this area is particularly limited when it comes to young learners of a second language (L2). This study, therefore, focuses on young English readers (both monolingual students and L2 learners at the upper-elementary school level) and aims to provide insights into: (a) their ability to infer meaning and define words in context, (b) their use of metacognitive knowledge for constructing word meanings, and, finally, (c) how such ability and knowledge relate to their reading performance. Information gained from this study is likely to be useful for developing effective strategy instruction for young L2 readers as well as for emergent readers.
In this paper, "young learners" and "young readers" are used synonymously and are defined as school-age children (both monolingual and L2-learning children) up to the age of 12.

Knowledge and strategies to infer and construct word meaning in context
Vocabulary size, receptive vocabulary size in particular, is often reported to be highly correlated with reading comprehension among both L1 and L2 readers. For example, it is well known that adult English L2 learners need to be familiar with at least 95% to 98% of the words in a text in order to comprehend the text independently (without getting help from teachers, dictionaries, and other means; e.g., Laufer, 1997; Schmitt, Jiang, & Grabe, 2011). Recent studies on adult L2 readers, however, have uncovered complex relationships between vocabulary size and reading comprehension. For example, Masrai (2019) found that among highly proficient adult English L2 readers, vocabulary size of mid-frequency words contributed most to their reading comprehension, rather than vocabulary size of the most frequent or low-frequency words. Simply counting the number of known words, therefore, tells only part of the story of the mechanism of reading comprehension. Vocabulary depth is a complex notion, and researchers have proposed numerous ways to conceptualize it (Schmitt, 2014). Some researchers have looked at word depth on a continuum from partial to precise knowledge, proceeding from never having seen a word, through the middle stages of having a general sense of the meaning, to knowing it well (Dale, 1965). Cronbach (1942) discussed vocabulary depth in relation to ability levels, ranging from generalization (the ability to define a word) to availability (the ability to use a word in production). Other researchers have proposed a receptive-productive knowledge continuum, although it is not easy to determine where on the continuum a word becomes available for productive use (Read, 2000).
Still others have conceptualized vocabulary depth as a composition of multiple elements, such as knowledge of spoken form (e.g., pronunciation), written form (e.g., spelling), meaning, grammar, collocation, register, frequency, and associations (for detailed discussions, refer to Nation, 2013, and Schmitt, 2014).
Because young readers are in the midst of rapid development of word knowledge, the ability to infer and construct word meaning in context appears to be critical for their reading comprehension. Indeed, average young readers have to deal with many unfamiliar words when they read. Nagy and Herman (1987) estimated that the average English-speaking fifth grader encounters at least 10,000 different unknown words in a year. Because words "often have more than one meaning," even if young readers "know" the word, they still need to identify the most suitable meaning for a given context (Nagy & Scott, 2000, p. 72). In reality, while young readers may have clearer representations of certain words, they may only have partial knowledge of others, and as such they must constantly revise or add meanings to their vocabulary knowledge to make sense of texts.
To successfully infer word meanings in context, what kind of knowledge is necessary? Nagy (1995) proposed that three types of knowledge influence readers' contextual inferencing: linguistic knowledge, world knowledge, and strategic knowledge. Linguistic knowledge includes readers' vocabulary size, knowledge about possible word meanings (what Nagy called "word schemas"), and syntactic knowledge. World knowledge refers to readers' concepts about the world. Strategic knowledge is readers' metacognitive control over their cognitive resources when reading.
With respect to linguistic knowledge, L2 researchers have found that internal lexical representations (i.e., readers' knowledge about phonological, orthographical and morphological information about the word in question), as well as knowledge of syntactical and semantic relations among words and discourse knowledge, play a significant role in the inference of word meaning (e.g., de Bot, Paribakht, & Wesche, 1997; Haynes, 1993; Ke & Koda, 2019; Paribakht & Wesche, 1999). Decoding accuracy appears to be associated with word meaning inference in L2 as well (Prior et al., 2014). In L1 research, it is well documented that children as young as two, if not younger, are able to rely on parts of speech and other syntactic information to arrive at the meanings of new words (Westermann & Mani, 2018).
The roles of world knowledge and strategic knowledge in inferring word meanings in context are much less understood. Nassaji (2003) found that adult L2 readers relied most heavily on world knowledge when making lexical inferences in reading. In an L1 context, Diakidoy (1998) found that US sixth graders' prior knowledge about the content of passages predicted their comprehension, which in turn influenced their learning of unknown words in the passages. The value of strategic knowledge is thought to be important from an instructional point of view. Nassaji also reported that, among adult English learners, repeating the word or sections of the text was the most popular strategy for lexical inference, followed by analogy (i.e., guessing based on other phonologically or orthographically similar words), and verifying (i.e., checking appropriateness against the wider textual context). Huckin and Block (1993) proposed a cognitive processing model of L2 lexical inference that involves cognitive and metalinguistic control processing. Cognitive processing is a quick and dynamic process that takes place in various components, or what they referred to as modules, including vocabulary knowledge, morphosyntax knowledge, world knowledge, knowledge about textual discourse patterns, and so forth. Metalinguistic processing is both a linear and parallel process in which hypotheses about word meanings are generated and tested in context. The model predicts that, with these two processes activating together, a reader can infer meaning by strategically deciding how to use various knowledge types while interacting with contextual sources.

The role of context in constructing word meaning
Constructing meaning in context may not always happen effortlessly, but conscious and strategic use of textual context facilitates students' ability to infer the meaning of unknown words. Researchers have conceptualized context differently (Walters, 2004). Some have classified it based on the explicitness or implicitness of cues, others have paid attention to the location of cues for inference (i.e., local vs. global cues), and still others have focused on the types of knowledge required for making inferences. For example, Carton (1971) argued that context is composed of intralingual context (arising from knowledge about the target language), interlingual context (cues from other languages, including loan words and cognates), and extralingual context (arising from world knowledge and the target culture). Similarly, but somewhat more simply, Pressley, Levin, and McDaniel (1987) distinguished external context from internal context, defining external context as "surrounding sentence and paragraph cues" and internal context as "inspection of word parts" (p. 121). This distinction is worth noting because the influence of internal context has often been neglected or underestimated in studies due to the use of pseudowords as test stimuli. Even in the absence of many external cues, lexical/morphological knowledge (internal context) could provide enough cues to help readers guess the meaning of some words. Beck, McKeown, and McCaslin (1983) distinguished pedagogical contexts from natural contexts. Pedagogical contexts are specially designed for instructional purposes, and they contain explicit information to help students guess the meaning of unknown words. In contrast, natural contexts do not contain such explicit cues. Importantly, the majority of words in authentic texts appear without explicit contextual information.
There are very few empirical studies investigating the role of context in inferring word meaning among young learners. In Cain, Oakhill, and Lemmon (2004), L1 readers aged 9-10 with less developed reading comprehension skills found it more challenging than their more skilled counterparts to infer the meaning of unknown words (pseudowords were used) in contexts that required more processing demands (i.e., the cues for the word meaning did not appear in the immediate context). From a different perspective, Cain and Oakhill (2014) examined the impact of young L1 readers' word knowledge (both size and depth) on two types of inferences: local cohesion inferences (i.e., making connections between propositions by relying on synonyms or mapping related lexical items) and global coherence inferences (i.e., using vocabulary knowledge or general world knowledge to fill in gaps). Both types of inferences were associated with reading comprehension, but the students' vocabulary knowledge, their depth of knowledge in particular (measured by defining word meaning and judging similarities in the meaning of pairs of words), was more important for global coherence inferences than for local cohesion inferences.

Variables associated with young learners' reading comprehension
In addition to word-meaning inference abilities, many other possible variables influence reading comprehension. Concerning young L1 readers, research consistently reports strong relationships among word recognition (including phonological awareness), reading fluency, and reading comprehension (e.g., National Institute of Health, 2000). Interestingly, various oral language abilities (e.g., listening comprehension, syntactic complexity in spoken language, etc.) show increasingly stronger associations with reading comprehension from the early to later elementary school years, perhaps at least in part because the growing complexity of texts for upper-grade students requires more sophisticated language knowledge in general (Duke & Carlisle, 2011). Toward the middle and upper elementary school years, some children start exhibiting difficulties in reading, a phenomenon known as the fourth-grade slump (Chall, 1987). Importantly, the major challenges experienced by students going through the fourth-grade slump appear to differ from the major challenges experienced by students who start having reading difficulties at earlier grades. Rather than struggling with word recognition, students who face challenges at upper primary grades seem to have difficulties with texts that contain increasingly more unfamiliar and abstract words and that pose heavier conceptual demands (Kucan & Palincsar, 2011). Working memory also influences their performance (Cain et al., 2004). The timing of the onset of these challenges coincides with the drastic development of their metacognitive abilities, including metalinguistic knowledge such as knowledge of morphosyntax (Anglin, Miller, & Wakefield, 1993). In addition, various home factors, such as the frequency of storybook reading at home, are found to influence reading comprehension throughout kindergarten to upper elementary school (Sénéchal, 2006). Importantly, such home factors interact with school factors. 
Various instructional approaches and techniques (e.g., types of questions students are asked, teachers' interactive styles of engagement, the amount of time spent on reading tasks, etc.) all appear to influence reading comprehension (Duke & Carlisle, 2011).

Reading among young L2 readers
Compared with the large body of research on young L1 readers, empirical information on young L2 readers is relatively limited. The variables addressed in the previous section, such as oral language proficiency, reading fluency, decoding skills, word recognition in the target language, as well as working memory, socioeconomic status (SES), and literacy environment at home, also by and large apply to young L2 readers (e.g., August & Shanahan, 2006). However, research has also shown that, compared with monolingual children, young L2 readers have different types of linguistic knowledge (including vocabulary knowledge), background knowledge, and metacognitive reading strategies, depending on their unique bilingual experiences (e.g., García, 1991; Jimenez, García, & Pearson, 1996; Peregoy & Boyle, 2000; Verhoeven, 2011). Young L2 readers' L1 decoding skills and vocabulary knowledge can have some positive effects on their L2 reading, but the effects appear to depend on their L1-L2 typological combination and their chance of receiving high-quality L1 instruction (e.g., Proctor et al., 2012). While the importance of bilingualism is increasingly recognized in education, language-minority children worldwide are often denied access to reading instruction (and schooling) in their home language.
Identifying unique characteristics of young L2 readers can offer useful information for improving instruction, but comparative studies of L1 and L2 readers need to be conducted with careful consideration since researchers can easily overlook L2 readers' home language resources and unique bilingual experiences. For example, it is frequently reported that young L2 learners have smaller vocabulary sizes both in their L1 and L2 compared with their respective monolingual counterparts (e.g., Bialystok, Luk, Peets, & Yang, 2010; Carlo et al., 2004; Mancilla-Martinez & Lesaux, 2011). However, L2 learners are usually exposed to L1 and L2 words in different contexts and for different purposes, and their L1 and L2 vocabularies largely do not overlap (e.g., Peña, Bedore, & Zlatic-Giunta, 2002). In studies where L2 readers' vocabulary in their home language was taken into account, their overall vocabulary size (L1 and L2 combined) was comparable to that of their monolingual peers (e.g., Butler, 2019; De Houwer, 2009; Goodrich & Lonigan, 2018).

Research questions
Unfortunately, many young L2 learners around the world receive school instruction only in their target language. Improving our understanding of the relationship between young readers' (particularly young L2 readers') ability to construct word meaning in context and their reading comprehension has great potential to inform the development of useful strategy instruction for young readers in need. Therefore, the present study, as part of a larger study investigating the role of lexical abilities in young learners' reading comprehension, explores the following questions related to English reading:
· RQ1: Among fourth-grade students (both monolingual and bilingual) who have received academic instruction only in English, are there any differences in their performance, in relation to their reading comprehension levels (reading: strong readers vs. emergent readers) and language backgrounds (language: monolingual English-speaking students vs. L2 students), in the following areas?
  - vocabulary size (receptive domain);
  - oral reading fluency;
  - inference of meaning of unfamiliar words;
  - metacognitive reasoning to arrive at meaning.
· RQ2: Does context (explicit context vs. implicit context) influence the students' performance, in relation to their reading levels and language backgrounds, when inferring/constructing the meaning of unfamiliar words?
· RQ3: How does context influence the kinds of knowledge and strategies the students use when inferring/constructing the meaning of unfamiliar words?
The study focuses on fourth graders because this grade is a critical time for children's reading development (recall the fourth-grade slump; Chall, 1987) and because it is when children start drastically improving their metacognitive skills (Anglin et al., 1993).
With respect to context, this study adopts the conceptualizations of Pressley et al. (1987) and Beck et al. (1983) mentioned above, and compares two context conditions: explicit and implicit contexts. The former contains more external information "surrounding sentence and paragraph cues" (Pressley et al., 1987, p. 121), while the latter has very little such external information. Thus, learners must rely more on "inspection of word parts" (p. 121). The explicit context is designed to be more pedagogically friendly because it provides learners with more information that they can use for making lexical inferences.
It should be emphasized that this study focuses on how students infer and explain word meanings for reading comprehension rather than for vocabulary learning (a distinction suggested by Nation, 2013). Even if a reader can successfully infer word meanings in a given context, it does not necessarily mean that the same individual could learn such words from the context. Moreover, in examining learners' abilities to infer word meaning, some researchers make a distinction between source of knowledge (e.g., linguistic knowledge, world knowledge, etc.) and strategies (e.g., analyzing, monitoring, etc.; e.g., Nassaji, 2003). However, our data among children did not always allow us to make such a distinction reliably; consequently, the present study combines these elements and refers to them as inference knowledge and strategies hereafter.

Participants
The participants were 61 fourth-grade students (aged 9-10) in the United States from a single school district. Twenty-four of them were monolingual English-speaking students (ME students), and 37 were L2 learners of English from either Spanish- or Vietnamese-speaking homes. 1 In an effort to minimize variations in the amount and type of formal English instruction participants had previously received, we included only participants (both ME and L2 students) who had been enrolled in the same schools since kindergarten. 2 The L2 students all received English language development (ELD) instruction (i.e., they were pulled out of regular classes and received individual or small group ELD instruction) but had received no special support in their home language at school. All participants came from Title I schools, meaning that they came from middle to lower socioeconomic status (SES) backgrounds. Responses to questionnaires distributed to the participants' parents as part of this study also indicated that there were no major differences in SES (measured by parental educational levels and occupation, the number of books at home, habits of reading to children, etc.) across groups. 3 Both ME and L2 students were further categorized as either strong or emergent readers based on their reading levels in English. The reading level for L2 students was gauged by the following measurements: (a) a standardized reading test (reading performance on the Stanford Achievement Test; using normal curve equivalent [NCE] scores, students scoring 40 and lower were grouped as emergent readers and students scoring 60 and higher were grouped as strong readers); (b) San Diego Quick (a reading diagnostic test); (c) a district-administered running record (a teacher-based assessment in which miscues and self-corrections during oral reading were examined by teachers); and (d) a recommendation from district ELD teachers.
The Stanford Achievement Test was the only measurement used to determine reading levels for ME students. The same NCE-based criterion described above was used to group ME students into strong and emergent readers. The average NCEs for Stanford Achievement Test reading scores were 70.5 for strong ME readers (ME+), 25.3 for emergent ME readers (ME-), 70.2 for strong L2 readers (L2+), and 30.7 for emergent L2 readers (L2-). 4 All the L2 participants were originally identified as English-language learners (ELLs) who needed special ELD assistance when they entered kindergarten. At the time of their participation in this study (while they were enrolled in the fourth grade), all L2+ readers were classified as fluent English proficient (FEP) students and were no longer classified as ELLs by the district, while all L2- readers were still classified as ELLs. However, all L2 students from both reading groups had acquired sufficient oral skills in English based on a standardized test (the IDEA Oral Language Proficiency Test, IPT), which was also used by the district as a redesignation criterion. In other words, L2- students in the study were still classified as ELLs because they had not yet met the district's criteria in reading and writing for redesignation, even though they had already acquired sufficient oral skills in English.
Participants were randomly selected from the students at the participating schools who met the preceding criteria.
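The NCE-based grouping rule described above can be written out as a short sketch. The function name `classify_reader` and the handling of scores between 40 and 60 (returned as `None`, i.e., assigned to neither group) are illustrative assumptions on our part; for L2 students the study also drew on additional measures, which this sketch does not model.

```python
from typing import Optional


def classify_reader(nce: float) -> Optional[str]:
    """Assign a reading group from a Stanford Achievement Test NCE score.

    Thresholds follow the text: 40 and lower -> emergent, 60 and higher -> strong.
    Scores strictly between the thresholds are left unassigned (our assumption).
    """
    if nce <= 40:
        return "emergent"
    if nce >= 60:
        return "strong"
    return None


# Group mean NCEs reported in the text, used here only as sample inputs.
group_means = {"ME+": 70.5, "ME-": 25.3, "L2+": 70.2, "L2-": 30.7}
labels = {group: classify_reader(score) for group, score in group_means.items()}
```

Applied to the reported group means, the rule places ME+ and L2+ in the strong group and ME- and L2- in the emergent group, as expected.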

Instruments and procedures
Two vocabulary assessments were employed in this study. First, to assess the size of students' English receptive vocabularies, the Peabody Picture Vocabulary Test-Revised (PPVT-R; Dunn & Dunn, 1981; referred to as PPVT hereafter) was individually administered. Second, in order to assess students' ability to infer and explain word meanings in context, a vocabulary assessment (referred to as the inference assessment hereafter) was developed and administered. This assessment had two components. First, students were asked to read aloud short paragraphs that contained words that were unfamiliar to them, and then they were asked to guess the meanings of the unfamiliar words in question. This was designed to assess their reading fluency and ability to infer and explain the meaning of specified words in context. The students were allowed to read the paragraphs silently after reading them aloud if they wished to do so. They were also allowed to ask the administrator the meanings of any other unfamiliar words besides the words in question. Second, immediately after inferring the meaning of each word, students were asked to explain how they arrived at the meanings of the words as they defined them. This question was designed to elicit their use of metacognitive reasoning for determining the meanings of the words in question.
The inference assessment consisted of 20 items that were assumed to be unfamiliar 5 to fourth graders, and two versions of this test (each version with the same set of 20 items) were prepared. Half of the words in each version were presented with explicit external contextual information and the other half were presented with implicit contextual information. As is evident from the following example, the paragraphs were short, and every effort was made to keep the syntax simple:

Explicit context
The teacher left Ricardo to watch the class while she went to make copies. She told him to be responsible and make sure we kept working on our project. She wanted him to oversee us.

Implicit context
The teacher left Ricardo with the class while she went to make copies. She wanted him to oversee us.

The items were counterbalanced, with half of the students in each group taking Version 1 and the other half taking Version 2 of the test. The items were selected from a larger pool of items for which four research assistants with ELD teaching experience had graded the explicitness of contextual information on a scale from 1 to 5, with 1 being least explicit and 5 being most explicit. Only pairs whose average ratings differed by more than 2.5 points were selected as final test items. The average score for items with explicit context was 4.3; for items with implicit context the average score was 1.3. Note that we controlled only for external contextual information; the degree of explicitness was judged based on the availability of external contextual cues (e.g., restatements of the word meaning, examples, and synonyms) but not on the availability of internal contextual cues (e.g., affixes and compounds). All the words in question were actual nouns, verbs, and adjectives; we did not use pseudowords (see the appendix for the list of words used in the inference assessment). The assessment was conducted individually by a district ELD coordinator with more than 10 years of teaching experience. She was blind to the backgrounds of the participating students.
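The item-selection rule (keep a word only if the mean explicitness ratings of its two context versions differ by more than 2.5 points) can be illustrated with a minimal sketch. The function name, the data layout, the second word, and all ratings shown are hypothetical; only the four-rater design, the 1-5 scale, and the 2.5-point threshold come from the text.

```python
from statistics import mean


def select_item_pairs(ratings, min_gap=2.5):
    """Return the words whose explicit/implicit mean ratings differ by more than min_gap.

    ratings maps each word to {"explicit": [...], "implicit": [...]}, where each
    list holds one 1-5 explicitness rating per rater.
    """
    selected = []
    for word, by_condition in ratings.items():
        gap = mean(by_condition["explicit"]) - mean(by_condition["implicit"])
        if gap > min_gap:
            selected.append(word)
    return selected


# Hypothetical ratings from four raters (not the study's data).
hypothetical_ratings = {
    "oversee": {"explicit": [5, 4, 4, 5], "implicit": [1, 2, 1, 1]},       # gap 3.25
    "sample_word": {"explicit": [4, 3, 4, 3], "implicit": [2, 2, 3, 2]},   # gap 1.25
}
kept = select_item_pairs(hypothetical_ratings)
```

With these made-up ratings only the first pair clears the threshold; the second would be dropped from the pool.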

Data analyses
All of the students' responses as well as their interactions with the teacher during the inference assessment were audio-recorded and transcribed. Several different coding systems were employed to analyze the data, as explained below. For each coding system, 12 students (three students from each group) were randomly chosen, and their responses were independently coded by two researchers. Disagreements, which accounted for approximately 15% to 20% of the items for each coding system, were discussed, and the rest of the data were then independently coded. Intraclass correlation coefficients between the two raters for the entire data set were calculated for each coding system, as reported below. The transcribed data were also analyzed qualitatively in order to understand the students' use of metacognitive knowledge.
First, overall fluency was judged based on students' performance when they read the item paragraphs aloud, and it was coded once for each student. Overall fluency was intended to holistically capture the level of speed and accuracy in oral text reading and was judged using a scale from 0 to 3 (where 0 indicated "not fluent" and 3 indicated "very fluent"). In other words, it was a gross measurement of students' ease in oral reading processing. The intraclass correlation coefficient was .91.
Second, students' ability to infer and define words in context (defining words, considered a kind of depth of word knowledge in previous research) was coded for each of the students' responses. A 4-point scale was employed (where 0 was "mentioned completely irrelevant meaning in the given context or no response" and 3 was "could clearly define a relevant meaning in context"). Even if an answer did not exactly match the conventional meaning but the definition perfectly made sense in the given context, students could score a 3 (although such cases were rare). Admittedly, this coding could not separate students' ability to construct word meaning from their ability to articulate ideas. What this coding captured, therefore, was their ability to articulate constructed meanings. The intraclass correlation coefficient was .92. In addition, word category identification was also coded for each response. This coding aimed to capture to what extent students' explanations of word meanings matched the lexical categories of the target words, such as nouns, verbs, and so forth. This is considered a kind of vocabulary knowledge (depth) that gives us additional information on the accuracy of their inference. A 3-point scale (where 0 was "did not match the target lexical category" and 2 was "perfectly matches") was employed. The intraclass correlation coefficient was .88.
Finally, students' use of metacognitive knowledge was analyzed both quantitatively and qualitatively. For the quantitative analysis, first, the degree of metacognitive reasoning was judged holistically using a 4-point scale (0 was "no sign of metacognitive reasoning" and 3 indicated "extensive metacognitive reasoning") for each response. This measure captures how well students could articulate their reasoning in meaning making. To receive a 3 for this measure, for example, students needed to provide an involved explanation or theory for why they gave a particular definition, such as identifying one or more specific sources of cues in context. The intraclass correlation coefficient was .85. To further examine the source of inferencing knowledge and strategies, the following coding scheme was developed inductively while consulting previous studies of strategies employed by adult learners (e.g., Nassaji, 2003; Paribakht & Wesche, 1999): (a) phonological cues; (b) lexical cues (e.g., use of knowledge of compound and root words, morphological knowledge such as prefixes and suffixes, and use of first language lexical knowledge or cognates); (c) world knowledge; (d) external contextual information (use of contextual cues available in the text); (e) partial memory/knowledge (use of partial knowledge of the word in question); and (f) unspecified or no response. 6 The first two categories (phonological and lexical cues) concern internal information residing in the target words, whereas the next three (world knowledge, external contextual information, and partial memory/knowledge) concern information external to the target words. Thus, for example, identifying a synonym of the target word in a given paragraph is categorized as the use of external contextual information. Multiple entries were possible for this coding, although this rarely happened. 7 The intraclass correlation coefficient was .91.

Results
Before answering the research questions, it was necessary to confirm that the two versions of the inference assessment were indeed comparable. First, the reliabilities of the two versions of the assessment were checked. The reliability coefficients (Cronbach's alpha) were .85 for Version 1 and .76 for Version 2. Second, the mean scores of defining words of the two versions were compared. The means for Versions 1 and 2 were 1.92 and 1.76, respectively, and a one-way ANOVA failed to find a significant difference (F(1, 38) = .59, p > .1, ηp² = .03). Thus, the scores from both versions were combined for the analyses described below.
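For readers less familiar with the reliability coefficient reported above, the following is a minimal sketch of the standard Cronbach's alpha formula: alpha = k / (k - 1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The function name and the toy scores are our own; the text does not say which software was used, so this is not presented as the procedure actually followed.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Compute Cronbach's alpha.

    scores is a list of rows, one row per student and one column per item.
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(scores[0])
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Toy data: three students, two items (hypothetical, not from the study).
hypothetical_scores = [[1, 2], [2, 1], [3, 3]]
alpha = cronbach_alpha(hypothetical_scores)
```

Alpha rises toward 1 as the items covary more strongly; items that rank students identically yield exactly 1.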

Students' ability to infer and define words in context and other related abilities by group
First, the students' performance on the PPVT (receptive vocabulary size) and the inference assessment (overall reading fluency, defining words, word category identification, and metacognitive reasoning) was examined. The descriptive results (means and standard deviations) are indicated in Table 1. With respect to the PPVT, students' performance was normalized with a mean (M) of 50 and a standard deviation (SD) of 21.06. 8 A two-way ANOVA indicated significant differences between strong and emergent readers (F(1, 57) = 41.22, p < .001, ηp² = .42) and between ME and L2 readers (F(1, 57) = 19.25, p < .001, ηp² = .25). It failed to find a significant interaction effect, however (F(1, 57) = .83, p = .37, ηp² = .01). Note that for L2 students, only their L2 (English) vocabulary size was assessed, without taking their L1 vocabulary knowledge into account; as discussed in the literature review section, because this could be a potential bias towards L2 students, PPVT is used as a covariate in the analyses below.
In comparing the mean scores across groups for the rest of the measures, a series of two-factor ANOVAs were employed, first without controlling for PPVT and then controlling for PPVT (PPVT was used as a covariate). As shown in Table 2, when no covariate was used, significant differences between strong and emergent readers were found in all these measures, and differences between ME and L2 readers were found only for defining words and metacognitive reasoning. None of the variables showed interaction effects. After controlling for PPVT, a significant main effect for reading level was found for all the measures (overall fluency, defining words, word category identification, and metacognitive reasoning) while the main effect for language in defining words and metacognitive reasoning disappeared. In other words, when we controlled for students' receptive vocabulary size, all the other measures examined in the inference assessment showed differences in performance between strong and emergent readers (strong readers having higher scores) but failed to show any difference in performance between ME and L2 students.
Notes (Table 1). Word category identification scores were based on a 3-point scale (from 0 to 2), and the others were based on a 4-point scale (from 0 to 3). Standard deviations are indicated in parentheses. a Normalized scores. b N = 17 (one student could not offer any reasoning and was thus excluded from the metacognitive reasoning analysis).
Notes (Table 2). ** p < .001, * p < .05. Effect sizes (partial eta-squared, ηp²) are indicated in parentheses.
6 One can argue that reading texts silently after the read-aloud (an option given to the students) can be a strategy. In the current study, it was not coded as a strategy. This is because, when the students paused before responding, it was not clear whether they were reading the text silently or thinking about the meaning of the target word. A systematic analysis of the potential impact of this strategy can be a topic of future investigation.
7 Out of 10 items for each contextual condition, the average numbers of codings per child were 10.20 (for the explicit context condition) and 10.24 (for the implicit context condition). It was very unlikely that the multiple coding significantly inflated the frequencies of each category of metacognitive knowledge and strategies for inferring word meaning.
8 The normed scores were used for the PPVT because the normalization was based on a large, nationwide representative sample in the USA, and all the participating students in this study received schooling only in English. Although the PPVT has been used extensively among L2 learners as well, problems with using it with L2 learners have been raised, and potential influence in performance from the students' L1 was reported (e.g., Goriot et al., 2018).

Knowledge and strategies for inferring word meaning by context
Students' sources of knowledge and strategies are shown in Figures 3 (explicit context) and 4 (implicit context). The figures indicate the average frequency with which the students used each category, along with the standard deviations. In order to get a general picture of their use of knowledge and strategies, a series of two-way ANOVAs with repeated measures were conducted while context was included as the between-subject factor. Because multiple comparisons were made, the significance level was adjusted to .008. The phonological category was excluded from the ANOVA analyses because the distributions of the residuals were heavily skewed. It turned out that the overwhelming majority of the students did not use the phonological strategy; only a handful of students repeatedly used it. For the rest of the categories, the results indicated that there were significant differences in the mean frequencies between the explicit and implicit contexts in external information (F(1, 56) = 10.56, p = .002, ηp² = .16), partial memory/knowledge (F(1, 56) = 8.77, p = .004, ηp² = .14), and unspecified/no-response categories (F(1, 56) = 21.37, p < .001, ηp² = .28). 13 For lexical information, a significant difference was found in reading (F(1, 56) = 11.96, p < .001, ηp² = .18). For partial memory/knowledge, in addition to context, main effects were found both in reading (F(1, 56) = 11.30, p < .001, ηp² = .17) and language (F(1, 56) = 10.80, p = .002, ηp² = .16). Namely, strong readers on average used the internal lexical information more than emergent readers. Strong readers and ME readers made more use of partial memory and knowledge. Although it did not reach the adjusted significance level, the main effect of reading for unspecified/no response approached significance (emergent readers being higher) (F(1, 56) = 6.30, p = .01, ηp² = .10). No interaction effect was found in any of the categories.
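The two quantities reported in this paragraph can be illustrated with a short sketch. It assumes, and the text does not state, that the adjusted threshold of .008 corresponds to a Bonferroni division of .05 by the six coding categories. It also uses the standard identity that recovers partial eta-squared from an F statistic and its degrees of freedom. Both function names are hypothetical.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni adjustment (assumed method)."""
    return alpha / n_tests


def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta-squared from F and its degrees of freedom.

    eta_p^2 = (F * df_effect) / (F * df_effect + df_error)
    """
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Assumption: six categories tested at an overall alpha of .05
print(round(bonferroni_alpha(0.05, 6), 3))

# Context effect on external information, as reported: F(1, 56) = 10.56
print(round(partial_eta_squared(10.56, 1, 56), 2))
```

Applying the same identity to the other F values reported for the coding categories reproduces the stated effect sizes to two decimal places.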
Note, however, that the frequencies for each category were small and the standard deviations were relatively large, suggesting that there were substantial individual differences. Thus, we interpreted the statistical results only as indicating general tendencies and then examined the data qualitatively for more detail. In the explicit context, as expected, students frequently relied on external contextual information as a cue, but the sources used often appeared to differ between students who could and could not successfully construct meaning. A close look at the transcripts showed that the students who succeeded in constructing meaning in context were usually better at using multiple sources of contextual information available in the entire paragraph. They also tended to monitor and clearly articulate their meaning-construction process, as exemplified by the following case of a strong reader:
By contrast, when the students could not infer the word meaning successfully, they tended to pay attention only to the immediate context of the word in question and guess the meaning by relying on their own knowledge associated with the limited information that they focused on. They also often missed syntactic or cohesion cues such as conjunctions, determiners, and pronouns that would provide readers with relational information among propositions and ultimately help them make suitable inferences in context. Below is an example from an emergent reader:
"A jacuzzi" was a creative guess, and one can imagine why this student associated this idea with "relaxing." However, the definite article "the" in front of "getaway" indicates that "the getaway" is supposed to be mutual knowledge between the reader and the writer. Thus, introducing a new piece of information (in this case, "a jacuzzi") is not the most natural reading in this particular context.
Another notable characteristic of the students' inference knowledge/strategies in the explicit context condition was the relatively high frequency of the partial memory/knowledge category. The contextual information provided in the explicit condition seemed to help students improve or modify their partially acquired knowledge of the word in question. Some students recalled instances of encountering the target word somewhere else and used contextual information associated with the previous encounters as well as contextual information in the task at hand to construct the word meaning. For example, when asked what "pending" means after reading "the school board decided to build a new gym at the school last year. But this plan is pending because they don't have enough money. We don't have a gym yet," a student remembered her mother, who was trying to sell a house, saying to her neighbor that "the sale is pending." Another student, after being asked the meaning of "aviator," realized that he had heard of "aviation" on TV recently and guessed the meaning of "aviator" to be a pilot. Students with larger vocabularies seemed to have a greater advantage in using partial word knowledge/memory (the frequency was the highest among the ME+ group, as shown in Figure 3), and extra contextual information given in the paragraph helped them refine the meaning of the word in context. This mechanism may in turn assist them in further developing their vocabulary.
In the implicit context condition, we can expect that students need to rely more on internal word knowledge and their own world knowledge, given that there is relatively less contextual information available in the paragraphs themselves. This assumption was not necessarily supported by our data. However, strong readers, regardless of ME and L2 backgrounds, used lexical knowledge more frequently than emergent readers, irrespective of the contexts. For example:
Other students seemed to use internal word knowledge to ensure that the meaning they constructed based on (limited) external contextual information was appropriate. As regards lexical knowledge, students, strong readers in particular, could use knowledge of suffixes and prefixes appropriately in context, when such information was available. In contrast, students' efforts to use compound word knowledge did not work well on occasion despite their rather creative attempts. Students' difficulty constructing the meaning of compound words reflects the complexity of compound word meanings in English. For instance, one student reasoned that "outgoing girl" means "she goes out a lot," while another reasoned that "to oversee" means "to double-check" because "over means to do it over, to do it again." The next example, from an emergent reader, illustrates that segmenting a word is not always easy for students.
Across different groups, students did not frequently rely on phonological sources of knowledge, but when students used this source, they usually arrived at inappropriate or confusing meanings. Those students tended to simply guess the meaning of words based on phonologically similar words, which were completely irrelevant to the context. Examples include "bending" and "depending" for the meaning of "pending," "nice" for the meaning of "durable" (based on the association that this was related to the word "adorable"), and "to show" for the meaning of "apparel" because of its phonological similarity to "appear." Relying on phonologically similar words often resulted in the misidentification of lexical categories as well (e.g., "apparel" is a noun but "appear" is a verb).
Finally, it is important to note that the frequency of "unspecified or no response" was higher among participants in the implicit contexts (more authentic contexts) than in the explicit contexts (more pedagogically oriented contexts). This finding has implications for the use of explicit pedagogical texts in vocabulary learning for young readers, emergent readers in particular.

Discussion
This study investigated fourth-grade students' abilities to infer the meaning of words in context as well as their abilities to employ metacognitive knowledge/strategies to do so. It examined how such abilities differ between strong and emergent readers and between ME and L2 readers. It also explored how the explicitness of contextual information influenced the ability to construct meaning among students with different reading proficiency and language backgrounds. How students employ metacognitive sources in two contextual conditions was also investigated. Quantitative analyses indicate significant differences between strong and emergent readers in overall fluency in reading aloud, defining words, identifying word categories, and metacognitive reasoning. Although performance in defining words and metacognitive reasoning was initially significantly different between ME and L2 readers, such differences disappeared after controlling for their receptive vocabulary size. As discussed above, reporting on L2 learners' vocabulary in the target language only can seriously misrepresent their "true" lexical knowledge, which is assumed to be spread across languages (e.g., De Houwer, 2009). In the present study, after controlling for students' vocabulary size in English, in addition to other potentially confounding variables (e.g., basic oral proficiency levels in English, SES backgrounds, and the amount and types of instruction previously received at school), the performances of ME and L2 students were found to be comparable. In contrast, differences between strong and emergent readers remained even after controlling for their vocabulary size.
The challenges that emergent readers faced, regardless of ME and L2 backgrounds, went beyond the need to increase their vocabulary size, including abilities for allocating sufficient mental resources (e.g., memory), making good use of external information, and enhancing various types of lexical knowledge (i.e., vocabulary depth) in order to process texts for comprehension and articulate responses based on higher-order reasoning.
All students, irrespective of their backgrounds, were better at defining words and using metacognitive reasoning to construct word meanings in explicit contexts (more pedagogically oriented texts) than in implicit contexts (more authentic texts). Providing more explicit contextual information, such as restating the meaning of the target words and offering synonyms and concrete examples, certainly helped all participants construct word meaning; the pedagogical benefits of having explicit information were evident. Although all students relied heavily on such external contextual information, readers who could successfully infer the word meaning seemed to be better at taking full advantage of multiple sources of contextual information in the entire paragraph. By contrast, unsuccessful inference often came from paying insufficient attention to cohesive devices (e.g., conjunctions and determiners) across sentences and focusing on a certain word or phrase that appeared immediately before or after the target word within the same sentence. This finding is consistent with the study conducted by Can (2016) among L1 secondary school students, which showed that their understanding of cohesive relations, conjunctions in particular, was associated with their reading comprehension levels. The finding in the present study suggests that young readers would also benefit from receiving instructional assistance on how to effectively use more global contextual cues and cohesive devices in meaning construction when they read.
Explicit contextual information also helped students evoke their partially acquired word knowledge and refine the word meaning in context. As noted previously, word knowledge is multifaceted and gradual (Schmitt, 2014). As children are exposed to new words, they refine and modify the meanings every time they encounter them in different contexts. Offering explicit contextual information seems to facilitate such a process. It is well known that over time, children with larger vocabulary size tend to develop their vocabulary size more rapidly than their peers with smaller vocabulary size. This phenomenon of cumulative advantage is called the Matthew effect (Duff, Tomblin, & Catts, 2015). Indeed, students with larger vocabularies in this study appeared to benefit more from the above-mentioned mechanism of refining word meaning, and this mechanism may partially explain why gaps in vocabulary size among students increase over time.
With respect to internal word information, strong readers, regardless of ME and L2 backgrounds, used such sources more frequently than emergent readers, irrespective of the textual context conditions. Students used knowledge of prefixes and suffixes strategically, when they used such knowledge at all. A study by Nagy, Diakidoy, and Anderson (1993) among L1-learning students in Grades 4 and 7 and in high school indicated that the participants' knowledge of suffixes was so influential over their reading comprehension that it can be used as a diagnostic tool. The researchers also found that knowledge of suffixes developed substantially between the fourth- and seventh-grade levels. Moreover, individual differences in their morphological knowledge grew during the same time period. Considering that fourth graders who were strong readers in our study already seemed to benefit from using morphological knowledge to construct meaning in context, it may be a good idea to introduce explicit instruction on morphology to emergent readers at this grade level, if not earlier, before gaps in knowledge between strong and emergent readers become even more substantial.
Unlike participants' experiences with morphology, as exemplified in the "outgoing" and "oversee" cases reported above, the students appeared to find it challenging to identify the meaning of compound words regardless of their background. Indeed, syntactic and semantic relations within compounds in English are not straightforward. For example, "a magnifying glass" is a glass that magnifies, but "a looking glass" is not a glass that looks. Moreover, the meaning of a compound is not necessarily a combination of the meanings of each root. Complicated internal structures of English compounds are language specific. In English, there is no limit to the number of words that are allowed to be put together. Various types of combinations of lexical categories are possible, and determining the lexical category of compounds can be confusing for young learners. The lexical category of compounds usually follows the lexical category of the final root. For example, headstrong (noun + adjective) is an adjective and cellphone (noun + noun) is a noun. There are exceptions, however. Hands-on (noun + preposition) is not a preposition but an adjective, whereas higher-up (adjective + preposition) is a noun. Children need to understand such complexity associated with compounds. Again, it may be beneficial to have occasional explicit instruction on word formation in English as part of reading instruction in order to raise students' awareness of complex internal structures of compounds, as addressed above.
Curiously, there was hardly any instance of L2 students relying on cognates or their L1 knowledge in this study. This may be related to the fact that the L2 students in this study did not receive formal academic instruction in their L1 at school; they were not taught how to make use of their L1 resources at school. The lack of explicit reliance on L1 knowledge may also be due to the fact that Vietnamese was the L1 of the majority of the L2 students in this study, and Vietnamese is not related to English and thus has no cognates. Small sample sizes and an unbalanced number of students with Vietnamese and Spanish backgrounds did not allow us to conduct any systematic analyses of the influence of their L1 over their performance in the inference assessment. For a future study, it would be worthwhile to systematically investigate the role of L1 in students' meaning-making processes and strategies during their L2 reading (and ideally in their L1 reading as well).
In implicit contexts, namely more authentic texts, young readers had greater frequencies of unspecified and no responses than in the explicit context (a more pedagogical condition). It seems that less authentic texts more effectively support young readers (emergent readers in particular) in constructing word meaning in context with their more explicit cues. Until young readers have developed a certain level of ability to use both external and internal information to make sense of word meaning in context, providing them with authentic texts without any assistance (independent reading) may have a limited effect on improving their reading comprehension.
There are a few limitations in this study. First, because paragraphs in the inference assessment were rather short, we were able to examine students' ability to use relatively local cohesion relationships but had limited capacity to thoroughly examine the role of global contextual information, that is, information that goes beyond a single lexical or phrasal cue that connects propositions. Global coherence inference requires readers to construct relevant lexical and mental networks in order to fill gaps in concepts that are not directly stated in the text. It would be interesting to investigate the role of global contexts in children's meaning-making processes. Second, only the students' receptive vocabulary size in English was obtained in this study. For L2 readers, it would have been better if their vocabulary knowledge in their L1 had been taken into account (even though the participants in this study had academic instruction exclusively in English). Moreover, information on participants' expressive vocabulary would have been useful, even though all the participants were identified as orally proficient by a standardized measure (i.e., IPT). This is because one could argue that expressive vocabulary might be a better indicator of children's ability to define word meanings and express their metacognitive reasoning. Having information on students' expressive vocabulary size would be particularly helpful for L2 readers because larger gaps between receptive and productive vocabulary are often found among bilingual children (e.g., Gibson, Jarmulowicz, & Oller, 2018). Furthermore, in this study we focused on understanding how young students construct the meaning of unknown words but not on their actual learning of such words. Research examining how the meaning-making process relates to the learning of words would be of great value.

Conclusion
Although there is substantial research on young learners' vocabulary size and its relation to reading comprehension, students' ability to construct meaning in context is not well understood. This study investigated fourth-grade students' ability to infer and define meanings of unknown words in context and how they use metacognitive knowledge and strategies to arrive at meanings in two different contexts (i.e., explicit and implicit contexts). The study found that such abilities differed between strong and emergent readers as well as between ME and L2 readers. Importantly, however, after controlling for the students' English receptive vocabulary size, ME and L2 readers were comparable in their performance. The study also found that the students used more strategies in the explicit context and used different strategies depending on context.
Given these results, the present study can provide a few practical implications for pedagogy. The first suggestion is to recognize the potential merit of using texts with explicit contexts to provide instruction on strategies for inferring word meaning. Second, it would be pedagogically useful to provide emergent readers with explicit instruction on how to use external information. Metacognitive knowledge used by strong readers can be viewed as an important source of information when designing instruction to assist emergent readers. Useful strategies can include paying attention to syntactic or cohesive information and using wider contexts rather than focusing on the immediate local contexts of the words in question. Similarly, it would be helpful to give students explicit instruction on how to use internal lexical information by relying on morphological knowledge, compounds, and so forth. Fostering such knowledge in emergent readers would strengthen their reading skills.