Trends in language assessment and testing: A bibliometric study
Studies in Second Language Learning and Teaching, volume 15, no. 1, 2025
Keywords

bibliometrics
language testing
language assessment
citation analysis
co-citation analysis
keyword analysis

How to Cite

Zhang, X. (2025). Trends in language assessment and testing: A bibliometric study. Studies in Second Language Learning and Teaching, 15(1), 171–198. https://doi.org/10.14746/ssllt.25141

Abstract

This bibliometric study used citation analysis and keyword analysis to review research on language assessment and testing. Based on citation counts and keywords, it identified recent trends and changes in the field, along with its most influential regions, institutions, scholars, and publications. In addition, network maps of the most influential documents and scholars revealed the intellectual structure of the field, showing how these documents and authors are related to one another. The field was found to have changed significantly with the emergence of new scholars, research themes, and topics. This study is also a tribute to the hundreds of scholarly documents that keep the field moving forward.

https://doi.org/10.14746/ssllt.25141
