BelLitGPT – language model technologies for the Belarusian language
https://doi.org/10.37661/1816-0301-2026-23-1-26-38
Abstract
Objectives. The research is conducted in the field of specialized generative neural networks for the Belarusian language. The authors aim to take the first step towards building a national generative language model.
Methods. The paper describes the development of the BelLitGPT model (700 million parameters). Development follows a transfer learning strategy based on the Russian-language model ruGPT-3 and consists of three stages: corpus preparation, tokenizer adaptation, and model training. The training corpus is compiled from the golden fund of classic Belarusian prose and prepared Wikipedia articles. The paper details the tokenizer adaptation method, which expands the vocabulary with specifically Belarusian lexemes, as well as the model training and testing process.
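The vocabulary expansion mentioned in the Methods can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a toy vocabulary with contiguous token ids, a greedy longest-match segmenter in place of a real BPE tokenizer, and a common heuristic in which each new token's embedding starts as the mean of the embeddings of the pieces it replaces. The names `segment` and `extend_vocab` are hypothetical.

```python
from typing import Dict, List, Tuple


def segment(word: str, vocab: Dict[str, int]) -> List[str]:
    """Greedy longest-match split of `word` into pieces present in `vocab`.

    Assumption: stands in for the base model's real subword tokenizer.
    """
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            raise ValueError(f"cannot segment {word!r} at position {i}")
    return pieces


def extend_vocab(
    vocab: Dict[str, int],
    embeddings: List[List[float]],
    new_tokens: List[str],
) -> Tuple[Dict[str, int], List[List[float]]]:
    """Return extended copies of `vocab` and `embeddings`.

    Assumption: ids are contiguous (0..len(embeddings)-1), so each new
    token takes the next free row. Its row is initialised as the mean of
    the rows of the pieces it was previously split into.
    """
    vocab = dict(vocab)
    embeddings = [row[:] for row in embeddings]
    for token in new_tokens:
        if token in vocab:
            continue  # already a single token, nothing to add
        rows = [embeddings[vocab[p]] for p in segment(token, vocab)]
        mean = [sum(col) / len(rows) for col in zip(*rows)]
        vocab[token] = len(embeddings)
        embeddings.append(mean)
    return vocab, embeddings


# Toy data: a two-piece base vocabulary and 2-dimensional embeddings.
base_vocab = {"во": 0, "сень": 1}
base_emb = [[1.0, 0.0], [0.0, 1.0]]
new_vocab, new_emb = extend_vocab(base_vocab, base_emb, ["восень"])
```

In this toy case the Belarusian word "восень" stops being split into two pieces and receives its own id and embedding row, which shortens the token sequences the model sees during fine-tuning.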
Results. The research results confirm that BelLitGPT can generate coherent, grammatically and stylistically correct texts. Special attention is given to the creation of a hybrid neuro-symbolic approach for generating quatrains that adhere to rhythm and rhyme.
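One way to picture the symbolic half of such a hybrid approach is a rule-based filter applied to candidate quatrains produced by the neural model. The sketch below is an assumption, not the published method: it approximates rhythm by counting vowel letters as syllables, approximates rhyme by comparing the last two letters of each line, and checks only an ABAB scheme. Word stress is ignored, and the function names are hypothetical.

```python
from typing import List

# Belarusian vowel letters; "ў" is non-syllabic and therefore excluded.
BE_VOWELS = set("аеёіоуыэюя")


def syllables(line: str) -> int:
    """Approximate the syllable count as the number of vowel letters."""
    return sum(ch in BE_VOWELS for ch in line.lower())


def ending(line: str, n: int = 2) -> str:
    """Last `n` letters of the line, ignoring punctuation and spaces."""
    letters = [ch for ch in line.lower() if ch.isalpha()]
    return "".join(letters[-n:])


def fits_abab(lines: List[str], max_syllable_gap: int = 1) -> bool:
    """Check a four-line candidate against a crude ABAB rhyme and rhythm rule."""
    if len(lines) != 4:
        return False
    counts = [syllables(line) for line in lines]
    rhythm_ok = (
        abs(counts[0] - counts[2]) <= max_syllable_gap
        and abs(counts[1] - counts[3]) <= max_syllable_gap
    )
    rhyme_ok = (
        ending(lines[0]) == ending(lines[2])
        and ending(lines[1]) == ending(lines[3])
    )
    return rhythm_ok and rhyme_ok


# Placeholder syllable strings, not real verse.
candidate = ["ла ла ла гара", "та та зіма,", "на на на пара", "да да зіма."]
accepted = fits_abab(candidate)
```

A generator loop would sample several candidates from the language model and keep only those for which `fits_abab` returns `True`. A realistic filter would also need a stress dictionary to check meter.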
Conclusion. The experiment on scaling the architecture revealed difficulties in training a large model (13 billion parameters) under conditions of data scarcity.
About the Authors
Dmitry A. Lyakhov
Belarus
Dmitry A. Lyakhov, Cand. Sci. (Phys.-Math.), Senior Researcher
6 Surganova St., Minsk, 220012
Andrei M. Bandalouski
Belarus
Andrei M. Bandalouski, Cand. Sci. (Econ.), Head of Laboratory of Speech Synthesis and Recognition
6 Surganova St., Minsk, 220012
Sergey V. Kruglikov
Belarus
Sergey V. Kruglikov, Dr. Sci. (Milit.), Cand. Sci. (Eng.), Assoc. Prof., Principal Researcher
6 Surganova St., Minsk, 220012
Konstantin K. Shulgan
Belarus
Konstantin K. Shulgan, Deputy General Director for Digital Development
6 Surganova St., Minsk, 220012
For citations:
Lyakhov D.A., Bandalouski A.M., Kruglikov S.V., Shulgan K.K. BelLitGPT – language model technologies for the Belarusian language. Informatics. 2026;23(1):26-38. (In Russ.) https://doi.org/10.37661/1816-0301-2026-23-1-26-38