Using corpora for language teaching and assessment in L2 writing: A narrative review

Ömer Faruk Kaya
https://orcid.org/0000-0001-7329-5557
Kutay Uzun
https://orcid.org/0000-0002-8434-0832
Hakan Cangır

Abstract

Corpora have been used primarily in linguistic research and have not yet become a pedagogical mainstay of language teaching and assessment. This narrative review therefore aims to inform practitioners and researchers by examining the advantages and disadvantages of data-driven learning and by exploring the use of corpora in foreign language teaching, particularly in writing. Specifically, the paper sets out to (1) elucidate what data-driven learning is and how it can shape the learning experience, (2) explain and exemplify how learner corpora can guide EFL learners, with particular attention to academic writing, and (3) provide insights into the indirect uses of corpora in teaching and assessing academic writing in L2. The review addresses these goals by presenting evidence compiled from the results of corpus-related studies and references to the use of corpora in language instruction.

Article Details

How to Cite
Kaya, Ö. F., Uzun, K., & Cangır, H. (2022). Using corpora for language teaching and assessment in L2 writing: A narrative review. Focus on ELT Journal, 4(3), 46–62. https://doi.org/10.14744/felt.2022.4.3.4
Section
Articles
