Relative complexity in a model of word difficulty: The role of loanwords in vocabulary size tests

Keywords

lexical sophistication
loanwords
cognates
vocabulary size test (VST)
frequency effect
L2 word difficulty

How to Cite

Canning, D., McLean, S., & Vitta, J. (2024). Relative complexity in a model of word difficulty: The role of loanwords in vocabulary size tests. Studies in Second Language Learning and Teaching. https://doi.org/10.14746/ssllt.38492

Abstract

Recent studies have shown that the frequency effect, although long used as a guide to word difficulty, fails to explain all variance in learner word knowledge. As such, a “more than frequency” conclusion has been offered to explain how lexical sophistication accounts for word difficulty. This study presents a multiple regression model of word-learning difficulty from a data set of monolingual Japanese first language (L1) learners. Vocabulary Size Test (VST) scores of 2,999 L1 Japanese university students were converted to logit scores to determine the word-learning difficulty of 80 target words. Five lexical sophistication variables were found to correlate with word-learning difficulty (frequency, cognate status, age of acquisition, prevalence, and polysemy) above a practical significance threshold. These were subsequently entered into a regression model with the logit scores as the dependent variable. The model (R² = .55) indicates that three lexical sophistication variables significantly predicted VST scores: frequency (β = -.28, p = .029), cognateness (β = -.24, p = .005), and prevalence (β = .22, p = .040). Despite suggestions that complexity studies be interpreted considering what is understood about the construct of linguistic complexity, researchers have rarely made explicit the differences between absolute and relative complexity variables. As some variables can be shown to vary in complexity according to the L1 population, these must be considered in discussions of test generalizability. Although frequency will continue to be the primary criterion for the selection of lexical items for teaching and testing, the cognate status of words can be used to predict the potential learning burden of the word more precisely for learners of different L1 backgrounds.
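
The two analytic steps named in the abstract, converting test responses to item-level logit scores and regressing those scores on lexical variables, can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a simple log-odds transform of each item's proportion correct in place of whatever calibration the study used, the function names `item_difficulty_logits` and `standardized_betas` are hypothetical, and the data are simulated rather than drawn from the study.

```python
import numpy as np


def item_difficulty_logits(responses):
    """Convert a persons x items matrix of 0/1 answers to per-item logits.

    Difficulty is log((1 - p) / p), where p is the proportion correct,
    so items answered correctly less often receive higher values.
    """
    responses = np.asarray(responses, dtype=float)
    n_persons = responses.shape[0]
    p = responses.mean(axis=0)
    # Clip so items that everyone (or no one) answered correctly stay finite.
    eps = 0.5 / n_persons
    p = np.clip(p, eps, 1 - eps)
    return np.log((1 - p) / p)


def standardized_betas(X, y):
    """Ordinary least squares on z-scored predictors and outcome.

    Returns the standardized coefficients and R squared. No intercept is
    needed because every z-scored variable has mean zero.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yz = (y - y.mean()) / y.std(ddof=1)
    betas, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    residuals = yz - Xz @ betas
    r_squared = 1 - (residuals @ residuals) / (yz @ yz)
    return betas, r_squared


# Simulated illustration only: 80 items, 500 test takers, and three
# made-up predictors standing in for frequency, cognateness, and prevalence.
rng = np.random.default_rng(0)
n_items, n_persons = 80, 500
predictors = rng.normal(size=(n_items, 3))
true_difficulty = predictors @ np.array([-0.5, -0.4, 0.3]) + rng.normal(
    scale=0.5, size=n_items
)
ability = rng.normal(size=n_persons)
prob_correct = 1 / (1 + np.exp(-(ability[:, None] - true_difficulty[None, :])))
responses = rng.random((n_persons, n_items)) < prob_correct

logits = item_difficulty_logits(responses)
betas, r2 = standardized_betas(predictors, logits)
print("standardized betas:", np.round(betas, 2), "R squared:", round(float(r2), 2))
```

Under this sign convention a negative coefficient means that higher values of the predictor go with easier items, which matches the direction the abstract reports for frequency and cognateness.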

https://doi.org/10.14746/ssllt.38492
