Recurrent strings in corpus-based pedagogical research: A reappraisal of the field


Romuald Gozdawa-Gołębiowski
Marcin Opacki


Formulaic competence is a hotly debated issue in teaching circles, not only because of its role in L2 communication but also because of the inherent complexity of the criteria for identifying formulaic strings. While the mixed approach, which combines meaning-based and corpus-based identification measures, remains a natural solution, the subjective character of the criteria, together with the required involvement of native experts, diminishes its attractiveness for everyday pedagogical purposes. We explore the potential of “corpus-only” identification tools. Specifically, our objective is to show that meaningless n-grams (of the, in a, etc.) generated by frequency searches contain useful pedagogical data, and that, coupled with MI (mutual information) scores, frequency-based measures accurately characterize learners’ formulaic competence. Because of the relative simplicity of the identification procedure and the free availability of corpus tools, frequency-based and distribution-based measures may become an important new pedagogical tool at the disposal of language teachers.
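
As a rough illustration of the kind of corpus-only measure described above, the following minimal sketch counts recurrent bigrams in a tokenized text and attaches an MI score to each. It assumes the commonly used pointwise formulation, MI = log2(observed / expected), which may differ from the exact computation used in the article; the function name `bigram_mi`, the `min_freq` threshold, and the sample sentence are hypothetical and serve only as an example.

```python
import math
from collections import Counter


def bigram_mi(tokens, min_freq=2):
    """Return {bigram: (frequency, MI)} for bigrams seen at least min_freq times.

    Assumption: MI is the pointwise score log2(f(xy) * N / (f(x) * f(y))),
    where N is the number of tokens. This is an illustrative choice only.
    """
    n = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))  # adjacent word pairs
    scores = {}
    for (w1, w2), freq in bigrams.items():
        if freq < min_freq:
            continue  # keep only recurrent strings
        mi = math.log2(freq * n / (unigrams[w1] * unigrams[w2]))
        scores[(w1, w2)] = (freq, mi)
    return scores


# Hypothetical sample text; a real study would load a learner corpus instead.
sample = "the role of the teacher in a class is part of the job in a school"
scores = bigram_mi(sample.lower().split())

# List recurrent bigrams by descending frequency, with their MI scores.
for bigram, (freq, mi) in sorted(scores.items(), key=lambda kv: -kv[1][0]):
    print(" ".join(bigram), freq, round(mi, 2))
```

In this toy sample only the function-word strings "of the" and "in a" recur, which mirrors the abstract's point that frequency searches surface such n-grams; the MI score then indicates how strongly the two words attract each other relative to chance.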




How to cite
Gozdawa-Gołębiowski, R., & Opacki, M. (2018). Recurrent strings in corpus-based pedagogical research: A reappraisal of the field. Glottodidactica. An International Journal of Applied Linguistics, 45(2), 134-149.

