Informatics

ALGORITHMS FOR IDENTIFICATION OF CUES WITH AUTHORS’ TEXT INSERTIONS IN BELARUSIAN ELECTRONIC BOOKS

Abstract

The main stages of algorithms for identifying the gender of characters in Belarusian electronic texts are described. The algorithms rely on punctuation marking and on the detection of gender indicators, such as past tense verbs and nouns with gender attributes. The indicators are stored in dedicated dictionaries, which makes the algorithms less language-dependent and allows them to be adapted to cognate languages by compiling new dictionaries. In testing, the harmonic mean score was 92.2 % for masculine gender detection and 90.4 % for feminine gender detection.
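The approach summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a cue starts with a dash, that dashes also separate the speech from the author's insertion, and that gender indicators are looked up in small word lists. The dictionaries `MASCULINE` and `FEMININE`, the helper names `split_cue` and `detect_gender`, and the sample sentences are all hypothetical and were chosen only for illustration.

```python
import re

# Hypothetical toy dictionaries of gender indicators (past tense verbs and
# gendered nouns). The dictionaries used in the paper are not reproduced here.
MASCULINE = {"сказаў", "спытаў", "адказаў", "хлопец", "дзед"}
FEMININE = {"сказала", "спытала", "адказала", "дзяўчына", "бабуля"}

# Assumption: an en dash or em dash followed by whitespace separates
# direct speech from the author's insertion.
DASH = re.compile(r"\s*[\u2013\u2014]\s+")
# A word is a run of Unicode letters (no digits, no underscores).
WORD = re.compile(r"[^\W\d_]+")


def split_cue(line):
    """Return (speech_parts, author_parts) for a dash-marked cue, else None."""
    stripped = line.strip()
    if not stripped or stripped[0] not in "\u2013\u2014":
        return None  # not a cue under the assumed punctuation convention
    segments = [s for s in DASH.split(stripped) if s]
    # Segments alternate: speech, author's insertion, speech, ...
    return segments[0::2], segments[1::2]


def detect_gender(line):
    """Guess the speaker's gender from indicators in the author's insertion."""
    parts = split_cue(line)
    if parts is None:
        return None
    _, author_parts = parts
    masculine = feminine = 0
    for segment in author_parts:
        for word in WORD.findall(segment.lower()):
            if word in MASCULINE:
                masculine += 1
            elif word in FEMININE:
                feminine += 1
    if masculine > feminine:
        return "masculine"
    if feminine > masculine:
        return "feminine"
    return None  # no indicators found, or a tie


if __name__ == "__main__":
    print(detect_gender("– Добры дзень, – сказала дзяўчына. – Як справы?"))
    print(detect_gender("– Хадзем, – адказаў дзед."))
```

Keeping the indicators in plain word sets mirrors the dictionary-driven design the abstract describes: adapting the sketch to a cognate language would mean swapping the sets rather than changing the logic. A dash used as ordinary punctuation inside the speech would confuse this simple splitter, so a real system would need more careful punctuation analysis.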

About the Authors

Y. S. Hetsevich
United Institute of Informatics Problems of the National Academy of Sciences of Belarus
Belarus


T. I. Okrut
United Institute of Informatics Problems of the National Academy of Sciences of Belarus
Belarus


B. M. Lobanov
United Institute of Informatics Problems of the National Academy of Sciences of Belarus
Belarus



For citations:


Hetsevich Y.S., Okrut T.I., Lobanov B.M. ALGORITHMS FOR IDENTIFICATION OF CUES WITH AUTHORS’ TEXT INSERTIONS IN BELARUSIAN ELECTRONIC BOOKS. Informatics. 2014;(1):68-76. (In Russ.)


This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1816-0301 (Print)
ISSN 2617-6963 (Online)