dc.description.abstracten |
Thousands of sporting events take place every day. Most sports news (reports on the results of competitions) is still written by hand, despite its formulaic structure. In this work, we investigate whether such news can be generated from a broadcast, a set of comments that describe the game in real time. We address the problem for the Russian language and treat it as a summarization task, using extractive and abstractive approaches. The extractive models do not yield significant results; an Oracle model that we build, which shows the best result achievable by extraction, reaches 0.21 F1 for ROUGE-1. With the abstractive approach, we obtain 0.26 F1 for ROUGE-1 using an NMT framework with Bidirectional Encoder Representations from Transformers (BERT) as the encoder and thesaurus-based text augmentation. Other types of encoders do not show significant improvements. |
uk |
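
The scores in the abstract are ROUGE-1 F1 values, that is, the harmonic mean of unigram precision and recall between a generated text and a reference text. The following is a minimal sketch of that metric, not the evaluation code used in the work: the function name `rouge1_f1` and the lowercase whitespace tokenization are assumptions made for illustration, and a real evaluation for Russian would likely use a proper tokenizer and possibly stemming.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a candidate and a reference text.

    Assumption: tokens are obtained by lowercasing and splitting on
    whitespace; this is a simplification for illustration only.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each unigram counts at most as often as it
    # appears in both texts.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    generated = "the home team won the match"
    written = "the team won"
    print(round(rouge1_f1(generated, written), 4))
```

In this toy pair, all three reference unigrams appear in the six-token candidate, so recall is 1.0 and precision is 0.5. An extractive Oracle of the kind mentioned in the abstract would typically select the broadcast comments that maximize such a score against the human-written news item, which is why its result bounds what extraction can achieve.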