Informatics

System of complex data analysis of thematic sites ISCAD IS

https://doi.org/10.37661/1816-0301-2024-21-1-105-120

Abstract

Objectives. The Internet is now the main source of information, and the sheer volume of material it holds makes comprehensive analysis of data from open Internet sources an urgent task.

The goal of this work is to create a multi-purpose, modifiable cluster for in-depth analysis of data from Internet sources. Its main tasks are to identify the most important publications in a given subject area, analyze these publications thematically, identify the leader of a scientific direction, and determine trends in the development of research areas and in the interaction of groups of people.

Methods. To solve this problem, a methodology was developed for constructing a multi-purpose cluster. It relies on technologies for rapidly constructing a thematic graph database and a knowledge graph, together with machine learning methods and models for in-depth data analysis.
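The abstract does not describe the implementation, so the following is only a minimal sketch of the general idea: publication records are turned into a small labeled graph, and publications and authors are then ranked on it. The record fields (`id`, `authors`, `topics`, `cites`), the node labels, the relation names, and the use of PageRank as the importance measure are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict

# Hypothetical sample records; the field names are assumptions for illustration.
RECORDS = [
    {"id": "p1", "authors": ["A"], "topics": ["graphs"], "cites": []},
    {"id": "p2", "authors": ["A", "B"], "topics": ["graphs"], "cites": ["p1"]},
    {"id": "p3", "authors": ["C"], "topics": ["ml"], "cites": ["p1", "p2"]},
    {"id": "p4", "authors": ["B"], "topics": ["graphs", "ml"], "cites": ["p1"]},
]


def build_graph(records):
    """Return (nodes, edges): nodes maps node id -> label,
    edges is a set of (source, relation, target) triples."""
    nodes, edges = {}, set()
    for rec in records:
        pub = "pub:" + rec["id"]
        nodes[pub] = "Publication"
        for name in rec["authors"]:
            nodes["author:" + name] = "Author"
            edges.add(("author:" + name, "WROTE", pub))
        for topic in rec["topics"]:
            nodes["topic:" + topic] = "Topic"
            edges.add((pub, "HAS_TOPIC", "topic:" + topic))
        # Assumes every cited id also appears as a record of its own.
        for cited in rec["cites"]:
            edges.add((pub, "CITES", "pub:" + cited))
    return nodes, edges


def pagerank(vertices, links, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over directed (source, target) links."""
    vertices = list(vertices)
    n = len(vertices)
    out = defaultdict(list)
    for src, dst in links:
        out[src].append(dst)
    rank = dict.fromkeys(vertices, 1.0 / n)
    for _ in range(iterations):
        # Rank held by vertices without outgoing links is spread evenly.
        dangling = sum(rank[v] for v in vertices if not out[v])
        new = dict.fromkeys(vertices, (1 - damping) / n + damping * dangling / n)
        for src in vertices:
            for dst in out[src]:
                new[dst] += damping * rank[src] / len(out[src])
        rank = new
    return rank


def rank_publications_and_authors(records):
    """Rank publications by citation PageRank; score each author by the
    summed rank of the publications they wrote."""
    nodes, edges = build_graph(records)
    pubs = [v for v, label in nodes.items() if label == "Publication"]
    cites = [(s, d) for s, rel, d in edges if rel == "CITES"]
    pub_rank = pagerank(pubs, cites)
    author_score = defaultdict(float)
    for s, rel, d in edges:
        if rel == "WROTE":
            author_score[s] += pub_rank[d]
    return pub_rank, dict(author_score)


if __name__ == "__main__":
    pub_rank, author_score = rank_publications_and_authors(RECORDS)
    print("Top publication:", max(pub_rank, key=pub_rank.get))
    print("Leading author:", max(author_score, key=author_score.get))
```

In a production setting the triples would more plausibly be loaded into a graph database and queried there; the in-memory sets above only keep the sketch self-contained.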

Results. ISCAD IS, a system for comprehensive analysis of data from thematic sites, has been developed. A methodology for rapidly constructing a thematic graph database, a comprehensive technology for in-depth analysis of data from Internet sources, and the analysis of data from the most important well-known international sites have been tested.

Conclusion. An IT environment has been created for the rapid construction of thematic graph databases. The results of applying this rapid-construction technology are illustrated with examples of ISCAD IS in operation.

About the Authors

I. I. Piletski
Belarusian State University of Informatics and Radioelectronics
Belarus

Ivan I. Piletski, Ph. D. (Phys.-Math.), Assoc. Prof. of the Department of Informatics

st. P. Brovki, 6, Minsk, 220013



M. P. Batura
Belarusian State University of Informatics and Radioelectronics
Belarus

Michal P. Batura, D. Sc. (Eng.), Prof., Head of the Laboratory "New Educational Technologies"

st. P. Brovki, 6, Minsk, 220013



N. A. Volorova
Belarusian State University of Informatics and Radioelectronics
Belarus

Natalia A. Volorova, Ph. D. (Eng.), Assoc. Prof., Senior Researcher of the Laboratory "New Educational Technologies"

st. P. Brovki, 6, Minsk, 220013



P. A. Zorko
Belarusian State University of Informatics and Radioelectronics
Belarus

Polina A. Zorko, Master's Student of the Department of Informatics

st. P. Brovki, 6, Minsk, 220013



A. O. Kulevich
Belarusian State University of Informatics and Radioelectronics
Belarus

Alexei O. Kulevich, Master's Student of the Department of Informatics

st. P. Brovki, 6, Minsk, 220013



For citations:


Piletski I.I., Batura M.P., Volorova N.A., Zorko P.A., Kulevich A.O. System of complex data analysis of thematic sites ISCAD IS. Informatics. 2024;21(1):105-120. (In Russ.) https://doi.org/10.37661/1816-0301-2024-21-1-105-120


Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1816-0301 (Print)
ISSN 2617-6963 (Online)