Enhancing Grammatical Correctness: The Efficacy of Large Language Models in Error Correction Task


dc.contributor.author Korniienko, Oleksandr
dc.date.accessioned 2024-08-23T08:01:43Z
dc.date.available 2024-08-23T08:01:43Z
dc.date.issued 2024
dc.identifier.citation Korniienko, Oleksandr. Enhancing Grammatical Correctness: The Efficacy of Large Language Models in Error Correction Task. Ukrainian Catholic University, Faculty of Applied Sciences, Department of Computer Sciences. Lviv, 2024, 45 p.
dc.identifier.uri https://er.ucu.edu.ua/handle/1/4669
dc.language.iso uk
dc.subject Enhancing Grammatical Correctness
dc.subject Large Language Models
dc.subject Error Correction Task
dc.title Enhancing Grammatical Correctness: The Efficacy of Large Language Models in Error Correction Task
dc.type Preprint
dc.status Published for the first time
dc.description.abstracten Recent studies have highlighted the exceptional capabilities of open-sourced foundational models like LLaMA, Mistral, and Gemma, particularly in scenarios requiring writing assistance. These models demonstrate proficiency in various tasks, both in zero-shot settings and when fine-tuned with task-specific, instruction-driven data. Despite their adaptability, the application of these models to the Grammatical Error Correction (GEC) task, critical for producing grammatically accurate text in writing assistants, remains underexplored. This thesis explores the performance of open-sourced Large Language Models (LLMs) on the GEC task across multiple setups: zero-shot, supervised fine-tuning, and Reinforcement Learning from Human Feedback (RLHF). Our research shows that task-specific fine-tuning significantly enhances LLM performance on GEC tasks. We also highlight the importance of precise prompt configuration in zero-shot settings to align models with the specific requirements of the CoNLL-2014 and BEA-2019 benchmarks, which call for minimal necessary edits. Further, our experiments with RLHF, particularly Direct Preference Optimization, provide insights into aligning LLMs for specific applications, showing an improvement of 0.3% in scores and indicating a further path for improvement. The best-performing model, Chat-LLaMA-2-13B-FT, matched the performance of state-of-the-art models with considerably less data, achieving an F0.5 score of 67.87% on the CoNLL-2014-test and 73.11% on the BEA-2019-test benchmarks. This thesis expands our understanding of the capabilities of open-sourced LLMs in GEC and sets the stage for future enhancements in this area. The code and trained model are publicly available.
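For context, the F0.5 scores reported in the abstract follow the standard convention of the CoNLL-2014 and BEA-2019 GEC benchmarks, where precision is weighted twice as heavily as recall. The sketch below is a minimal illustration, not code from the thesis: the function name and the edit-level counts (true positives, false positives, false negatives, as produced by scorers such as the M2 scorer or ERRANT) are hypothetical.

def f_beta(tp: int, fp: int, fn: int, beta: float = 0.5) -> float:
    """F-beta score from edit-level counts; beta=0.5 favors precision."""
    precision = tp / (tp + fp) if tp + fp > 0 else 0.0
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0
    if precision + recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Illustrative counts only, not results from the thesis:
print(f"F0.5 = {f_beta(tp=700, fp=250, fn=500):.4f}")  # ~0.70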

