System of complex data analysis of thematic sites ISCAD IS
https://doi.org/10.37661/1816-0301-2024-21-1-105-120
Abstract
Objectives. The Internet is now the main source of information, and the sheer volume of information it holds makes comprehensive analysis of data from open Internet sources a pressing task.
The goal of this work is to create a multi-purpose, modifiable cluster for in-depth analysis of data from Internet sources. Its main tasks are to identify the most important publications in a given subject area, analyze these publications thematically, identify the leader of a scientific direction, and determine trends in the development of research areas and in the interaction of groups of people.
Methods. To solve this problem, a methodology was developed for building a multi-purpose cluster that combines technologies for rapidly constructing a thematic graph database, a knowledge graph, and machine learning methods and models for in-depth data analysis.
Results. ISCAD IS, a system for comprehensive analysis of data from thematic sites, has been developed. A methodology for rapidly constructing a thematic graph database and a comprehensive technology for in-depth analysis of data from Internet sources have been tested, as has the analysis of data from the most important well-known international sites.
Conclusion. An IT environment has been created for the rapid construction of thematic graph databases. The results of applying this technology are illustrated with examples from the operation of ISCAD IS.
About the Authors
I. I. Piletski
Belarus
Ivan I. Piletski, Ph. D. (Phys.-Math.), Assoc. Prof. of the Department of Informatics
st. P. Brovki, 6, Minsk, 220013
M. P. Batura
Belarus
Michal P. Batura, D. Sc. (Eng.), Prof., Head of the Laboratory "New Educational Technologies"
st. P. Brovki, 6, Minsk, 220013
N. A. Volorova
Belarus
Natalia A. Volorova, Ph. D. (Eng.), Assoc. Prof., Senior Researcher of the Laboratory "New Educational Technologies"
st. P. Brovki, 6, Minsk, 220013
P. A. Zorko
Belarus
Polina A. Zorko, Master's Student of the Department of Informatics
st. P. Brovki, 6, Minsk, 220013
A. O. Kulevich
Belarus
Alexei O. Kulevich, Master's Student of the Department of Informatics
st. P. Brovki, 6, Minsk, 220013
For citations:
Piletski I.I., Batura M.P., Volorova N.A., Zorko P.A., Kulevich A.O. System of complex data analysis of thematic sites ISCAD IS. Informatics. 2024;21(1):105-120. (In Russ.) https://doi.org/10.37661/1816-0301-2024-21-1-105-120


















