Human-AI Collaborative Feedback in Translator Training: A Mixed-Methods Study of Translation Quality, Revision Behavior, and Learner Perceptions
DOI: https://doi.org/10.69760/aghel.026002009

Keywords: human-AI collaborative feedback, translation pedagogy, translator training, feedback literacy, translation quality assessment

Abstract
The integration of large language models (LLMs) into language education has prompted renewed interest in AI-assisted feedback, yet purely automated feedback remains vulnerable to contextual misalignment, cultural misreading, and reliability concerns that are particularly consequential in translation training. A human-AI collaborative feedback model, in which an instructor curates, corrects, and supplements LLM-generated commentary before students revise, offers a theoretically motivated alternative, but its pedagogical effects in translator education remain empirically underexplored. This mixed-methods study examines the impact of such a hybrid feedback approach on undergraduate Chinese-to-English student translators. Forty senior undergraduates translated a 1,500-word cultural heritage text and received ChatGPT-4o-generated feedback that was subsequently reviewed and annotated by an experienced instructor using a color-coded transparency system. Quantitative analysis using a Multidimensional Quality Metrics (MQM) rubric revealed significant pre-to-post gains across all measured dimensions (overall MQM composite: Δ +1.20 on a 5-point scale, p < .001), with the largest improvements in terminology (Δ +1.47) and accuracy (Δ +1.32) and meaningful gains in cohesion, cultural adaptation, register, language conventions, and format (all p < .001). Think-aloud protocols revealed a consistent two-stage revision pattern and active source-evaluation behavior: students were more decisive when AI and instructor annotations converged and deliberated more deeply when they diverged. Student perception surveys indicated high ratings for clarity, trustworthiness, usefulness, and pedagogical value, with no significant differences between high- and low-performing students. Instructors reported meaningful workload relief on routine corrections while retaining pedagogical authority over higher-order feedback. These findings suggest the potential of a human-in-the-loop feedback framework for translator training in which AI handles systematic error detection while instructors validate, contextualize, and model evaluative judgment.
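To make the shape of the quantitative analysis concrete, the sketch below illustrates a paired pre/post comparison of MQM dimension scores of the kind the abstract reports. It is a minimal illustration, not the authors' analysis code: the dimension names are taken from the abstract, while the equal-weight composite, the paired t-test, and the score arrays are assumptions made for the example.

```python
# Minimal sketch (not the authors' code): paired pre/post comparison of
# MQM dimension scores on a 5-point scale, mirroring the kind of results
# reported in the abstract. Dimension names follow the abstract; the
# equal-weight composite and paired t-test are illustrative assumptions.
import numpy as np
from scipy.stats import ttest_rel

DIMENSIONS = [
    "accuracy", "terminology", "cohesion", "cultural_adaptation",
    "register", "language_conventions", "format",
]

def mqm_composite(scores: dict[str, np.ndarray]) -> np.ndarray:
    """Equal-weight mean across dimensions for each student (one plausible
    composite; the paper's exact weighting is not specified here)."""
    return np.mean([scores[d] for d in DIMENSIONS], axis=0)

def report_paired_gains(pre: dict[str, np.ndarray],
                        post: dict[str, np.ndarray]) -> None:
    """Print the mean gain (Δ) and a paired t-test per dimension, then for
    the composite. `pre` and `post` map each dimension to an array of
    per-student scores in the same student order."""
    for d in DIMENSIONS:
        t, p = ttest_rel(post[d], pre[d])  # paired: same students twice
        print(f"{d:22s} Δ = {np.mean(post[d] - pre[d]):+.2f}  "
              f"t = {t:.2f}  p = {p:.4f}")
    comp_pre, comp_post = mqm_composite(pre), mqm_composite(post)
    t, p = ttest_rel(comp_post, comp_pre)
    print(f"{'MQM composite':22s} Δ = {np.mean(comp_post - comp_pre):+.2f}  "
          f"t = {t:.2f}  p = {p:.4f}")

# Hypothetical usage with 40 students (random data, illustration only):
# rng = np.random.default_rng(0)
# pre  = {d: rng.uniform(2.0, 3.5, size=40) for d in DIMENSIONS}
# post = {d: rng.uniform(3.0, 4.8, size=40) for d in DIMENSIONS}
# report_paired_gains(pre, post)
```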