dc.contributor.author | Korniienko, Oleksandr | |
dc.date.accessioned | 2024-08-23T08:01:43Z | |
dc.date.available | 2024-08-23T08:01:43Z | |
dc.date.issued | 2024 | |
dc.identifier.citation | Korniienko Oleksandr. Enhancing Grammatical Correctness: The Efficacy of Large Language Models in Error Correction Task. Ukrainian Catholic University, Faculty of Applied Sciences, Department of Computer Sciences. Lviv 2024, 45 p. | uk |
dc.identifier.uri | https://er.ucu.edu.ua/handle/1/4669 | |
dc.language.iso | uk | uk |
dc.subject | Enhancing Grammatical Correctness | uk |
dc.subject | Large Language Models | uk |
dc.subject | Error Correction Task | uk |
dc.title | Enhancing Grammatical Correctness: The Efficacy of Large Language Models in Error Correction Task | uk |
dc.type | Preprint | uk |
dc.status | Published for the first time | uk |
dc.description.abstracten | Recent studies have highlighted the exceptional capabilities of open-sourced foundational models like LLaMA, Mistral, and Gemma, particularly in scenarios requiring writing assistance. These models demonstrate proficiency in various tasks both in zero-shot settings and when fine-tuned with task-specific, instruction-driven data. Despite their adaptability, the application of these models to Grammatical Error Correction (GEC) tasks, critical for producing grammatically accurate text in writing assistants, remains underexplored. This thesis explores the performance of open-sourced Large Language Models (LLMs) on the GEC task across multiple setups: zero-shot, supervised fine-tuning, and Reinforcement Learning from Human Feedback (RLHF). Our research shows that task-specific fine-tuning significantly enhances LLM performance on GEC tasks. We also highlight the importance of precise prompt configuration in zero-shot settings to align models with the specific requirements of the CoNLL-2014 and BEA-2019 benchmarks, aiming for minimal necessary edits. Further, our experiments with RLHF, particularly Direct Preference Optimization, provide insights into aligning LLMs for specific applications, showing an improvement of 0.3% in scores and indicating a further path for improvement. The best-performing model, Chat-LLaMA-2-13B-FT, matched the performance of state-of-the-art models with considerably less data, achieving an F0.5 score of 67.87% on the CoNLL-2014-test and 73.11% on the BEA-2019-test benchmarks. This thesis expands our understanding of the capabilities of open-sourced LLMs in GEC and sets the stage for future enhancements in this area. The code and trained model are publicly available. | uk |
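The F0.5 metric reported in the abstract weights precision twice as heavily as recall, the standard scoring choice for the CoNLL-2014 and BEA-2019 GEC benchmarks. A minimal sketch of how the score is computed from edit counts (the counts below are illustrative, not taken from the thesis):

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Weighted harmonic mean of precision and recall.

    beta < 1 emphasizes precision; beta = 0.5 gives the F0.5 used in GEC.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Illustrative edit counts: tp = correct edits proposed,
# fp = spurious edits, fn = gold edits the system missed.
tp, fp, fn = 60, 20, 30
precision = tp / (tp + fp)   # 0.75
recall = tp / (tp + fn)      # ~0.667
print(round(f_beta(precision, recall), 4))  # → 0.7317
```

Because beta = 0.5 penalizes false positives more than false negatives, systems tuned for this metric prefer making only the "minimal necessary edits" mentioned in the abstract over aggressive rewriting.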