dc.description.abstracten |
Recent studies have highlighted the exceptional capabilities of open-source
foundational models such as LLaMA, Mistral, and Gemma, particularly in scenarios
requiring writing assistance. These models demonstrate proficiency across a variety
of tasks, both in zero-shot settings and when fine-tuned on task-specific,
instruction-driven data. Despite this adaptability, their application to Grammatical
Error Correction (GEC), a task critical for producing grammatically accurate text in
writing assistants, remains underexplored. This thesis examines the performance of
open-source Large Language Models (LLMs) on the GEC task across multiple setups:
zero-shot, supervised fine-tuning, and Reinforcement Learning from Human Feedback
(RLHF). Our research shows that task-specific fine-tuning significantly enhances
LLM performance on GEC. We also highlight the importance of precise prompt
configuration in zero-shot settings to align models with the specific requirements
of the CoNLL-2014 and BEA-2019 benchmarks, which target minimal necessary edits.
Further, our experiments with RLHF, particularly Direct Preference Optimization,
provide insights into aligning LLMs for specific applications, yielding a 0.3%
score improvement and indicating a path for further gains. The best-performing
model, Chat-LLaMA-2-13B-FT, matched the performance of state-of-the-art models with
considerably less data, achieving an F0.5 score of 67.87% on the CoNLL-2014 test
set and 73.11% on the BEA-2019 test set. This thesis expands our understanding of
the capabilities of open-source LLMs in GEC and sets the stage for future
enhancements in this area. The code and trained model are publicly available. |