Primary supervisor
Trang Vu

Large language models (LLMs) have recently made significant progress in machine translation quality [1], but they still struggle to maintain consistency and accuracy across entire documents. Professional translators commonly use translation memory (TM) tools to reuse past translations, ensuring consistent terminology and phrasing throughout a document. Inspired by recent work such as DelTA [2], a document-level translation agent with a multi-level memory architecture, and HiMATE [3], a multi-agent evaluation framework leveraging the fine-grained MQM error typology, this project seeks to bridge the gap between LLMs and traditional TM systems. The goal is to enhance domain-specific accuracy and document-level coherence in LLM-based translation by intelligently incorporating a translation memory mechanism.
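To make the idea concrete, the sketch below shows one possible way a TM mechanism could be combined with LLM prompting: fuzzy-match retrieval over past translation pairs, whose results are prepended as in-context examples. This is illustrative only; the toy in-memory TM, the function names, the language pair, and the similarity threshold are all assumptions, and a real system would use a proper TM store and the project's chosen LLM.

```python
# Minimal sketch of TM-augmented prompting (all names and data hypothetical).
from difflib import SequenceMatcher

# Toy translation memory: (source, target) pairs from past translations.
TM = [
    ("The patient reported mild headaches.",
     "Le patient a signalé de légers maux de tête."),
    ("Take one tablet twice daily.",
     "Prenez un comprimé deux fois par jour."),
]

def retrieve_fuzzy_matches(source, tm, k=2, threshold=0.4):
    """Return up to k TM entries whose source side is most similar to `source`."""
    scored = [(SequenceMatcher(None, source, s).ratio(), s, t) for s, t in tm]
    scored.sort(key=lambda x: x[0], reverse=True)
    return [(s, t) for score, s, t in scored if score >= threshold][:k]

def build_prompt(source, matches):
    """Prepend retrieved TM pairs as in-context examples for the LLM."""
    lines = ["Translate English to French, staying consistent "
             "with these past translations:"]
    for s, t in matches:
        lines.append(f"EN: {s}\nFR: {t}")
    lines.append(f"EN: {source}\nFR:")
    return "\n\n".join(lines)

source = "The patient should take two tablets daily."
prompt = build_prompt(source, retrieve_fuzzy_matches(source, TM))
print(prompt)  # send `prompt` to any instruction-tuned LLM
```

A practical design question the project would need to answer is how retrieval is scored (surface similarity, as above, versus embedding-based matching) and how retrieved pairs are weighted against the document-level context.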
Student cohort
Aim/outline
- An open-source codebase
- A publication at NLP venues such as ACL/EMNLP/NAACL/EACL
URLs/references
[1] Wu et al. 2024. (Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts. TACL.
[2] Wang et al. 2025. DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory. ICLR 2025.
[3] Zhang et al. 2025. HiMATE: A Hierarchical Multi-Agent Framework for Machine Translation Evaluation. arXiv:2505.16281.
Required knowledge
- Must: fluency in Python and PyTorch
- Must: academic or working knowledge of large language models and machine translation
- Must: solid grasp of basic machine learning concepts (both theory and hands-on)
- Preferred: experience fine-tuning a small language model (e.g., LLaMA)
- Preferred: interest in pursuing a PhD