INFORMATION TECHNOLOGY
Objectives. The possibility of using semantic technologies to develop and improve the content management system of the scientific and educational portal eLab-Science and of the Belarusian nuclear knowledge portal BelNET (Belarusian Nuclear Education and Training Portal, https://belnet.by/) created on its basis is considered.
Methods. Original algorithms for automatic systematization of content have been developed: placement of content records in the portal taxonomy based on semantic technologies and generation of a list of keywords. The following concepts of semantic technologies are used: taxonomy (the hierarchical structure of the portal), thesaurus, and glossary.
Results. The developed algorithms were implemented and tested using a full-text search tool and the original Belarusian glossary on nuclear and radiation safety.
Conclusion. The described basic principles of organization and algorithms based on semantic technologies, which underlie the functioning of the content management system of the scientific and educational portal eLab-Science and of the Belarusian nuclear knowledge portal BelNET created on its basis, make it possible to effectively place content records in the portal taxonomy and to automatically generate a set of keywords for the resource being created.
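As an illustration of the kind of glossary-driven keyword generation and taxonomy placement described above, a minimal Python sketch is given below. The glossary contents, the taxonomy node names, and the whole-phrase matching rule are hypothetical assumptions for demonstration and do not reproduce the actual eLab-Science/BelNET implementation.

```python
import re

# Illustrative glossary: term -> taxonomy node (assumed structure,
# not the actual BelNET/eLab-Science data model).
GLOSSARY = {
    "radiation safety": "Nuclear and Radiation Safety",
    "spent nuclear fuel": "Fuel Cycle",
    "dosimetry": "Radiation Protection",
}

def extract_keywords(text: str, glossary: dict) -> list[str]:
    """Return glossary terms found in the text (case-insensitive whole-phrase match)."""
    lowered = text.lower()
    return [term for term in glossary
            if re.search(r"\b" + re.escape(term) + r"\b", lowered)]

def place_in_taxonomy(keywords: list[str], glossary: dict) -> set[str]:
    """Map matched terms to taxonomy nodes where the content record could be placed."""
    return {glossary[k] for k in keywords}

record = "The course covers dosimetry methods and radiation safety requirements."
keywords = extract_keywords(record, GLOSSARY)
print(keywords)                          # ['radiation safety', 'dosimetry']
print(place_in_taxonomy(keywords, GLOSSARY))
```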
MATHEMATICAL MODELING
Objectives. The objective of the study is to use copula models to analyze shares of the Russian stock market and to describe changes in the relationship between the shares before and during the COVID-19 pandemic.
Methods. An algorithm for applying copulas is presented, together with the functions of the R programming language used in its implementation. To model the dynamics of financial series, the ARMA-GJR-GARCH process (autoregressive moving average Glosten-Jagannathan-Runkle model with generalized autoregressive conditional heteroskedasticity) is used. The selection of optimal families and parameters of copula models is carried out. The adequacy of the obtained models is checked, and the results of the study of the relationship between the series are analyzed.
Results. An algorithm has been developed for a relatively new approach that uses copulas in conjunction with the ARMA-GJR-GARCH model. The approach was applied to study the impact of the coronavirus pandemic in the context of the Russian economy. It is revealed that during the COVID-19 period, the dependence between different stocks increases. It is also shown that the effect of volatility in financial series increases after the outbreak of the pandemic.
Conclusion. The research algorithm using copula models in conjunction with the ARMA-GJR-GARCH process has demonstrated its feasibility. This approach can be applied with other GARCH-type models to study financial series and other areas.
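The abstract refers to an R implementation; the sketch below is a simplified Python analogue of the same pipeline, assuming the arch package for the AR-GJR-GARCH step and a Gaussian copula estimated from pseudo-observations of the standardized residuals. The model orders, the copula family, and the split of the data into pre-COVID and COVID sub-periods are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm, rankdata
from arch import arch_model  # GJR-GARCH is obtained via the asymmetric term o=1

def standardized_residuals(returns):
    """Fit an AR(1)-GJR-GARCH(1,1) model to a pandas Series of returns
    and return its standardized residuals."""
    model = arch_model(returns, mean="AR", lags=1,
                       vol="GARCH", p=1, o=1, q=1, dist="t")
    res = model.fit(disp="off")
    return (res.resid / res.conditional_volatility).dropna()

def gaussian_copula_rho(z1, z2):
    """Estimate the Gaussian copula correlation from two residual series."""
    # Pseudo-observations: ranks scaled to (0, 1), then mapped through the normal quantile.
    u1 = rankdata(z1) / (len(z1) + 1)
    u2 = rankdata(z2) / (len(z2) + 1)
    return np.corrcoef(norm.ppf(u1), norm.ppf(u2))[0, 1]

# a_pre, b_pre, a_covid, b_covid are assumed pandas Series of log returns
# for two stocks over the pre-COVID and COVID sub-periods:
# rho_pre   = gaussian_copula_rho(standardized_residuals(a_pre),
#                                 standardized_residuals(b_pre))
# rho_covid = gaussian_copula_rho(standardized_residuals(a_covid),
#                                 standardized_residuals(b_covid))
```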
BIOINFORMATICS
Objectives. To study ordinal regressions represented as a set of binary logistic regressions and their application in clinical practice to T-staging of gastric cancer.
Methods. Methods of ordinal regression statistical models, model performance assessment, and survival analysis were used.
Results. Basic ordinal regression models have been studied and applied to clinical data on gastric cancer. Several clinical predictors have been added to the well-known prognostic criteria of the TNM classification in the multifactor regression model; the results appear suitable for a personalized approach to planning the treatment volume in order to improve efficacy.
Conclusion. The study showed that the analysis of ordinal models, along with multinomial ones, provides additional information that helps to understand the behavior of the latent variable in the complex cancer processes. The clinical part of the study facilitates a differentiated approach to preoperative planning of the treatment volume for patients with the same T-stage, based on modeling results.
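A minimal sketch of representing an ordinal regression as a set of binary logistic regressions P(Y > k) is given below (the generic decomposition, not the authors' clinical model); the estimator, class labels, and data are placeholders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

class OrdinalAsBinary:
    """Ordinal regression represented as K-1 binary logistic regressions P(Y > k)."""

    def fit(self, X, y):
        self.classes_ = np.sort(np.unique(y))          # e.g. T-stages 1..4
        self.models_ = []
        for k in self.classes_[:-1]:
            clf = LogisticRegression(max_iter=1000)
            clf.fit(X, (y > k).astype(int))            # binary target "stage above k"
            self.models_.append(clf)
        return self

    def predict_proba(self, X):
        # P(Y > k) for each threshold; successive differences give class probabilities.
        # (Separate fits may yield slightly non-monotone thresholds; a known caveat.)
        gt = np.column_stack([m.predict_proba(X)[:, 1] for m in self.models_])
        cum = np.hstack([np.ones((len(X), 1)), gt, np.zeros((len(X), 1))])
        return cum[:, :-1] - cum[:, 1:]

    def predict(self, X):
        return self.classes_[np.argmax(self.predict_proba(X), axis=1)]
```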
LOGICAL DESIGN
Objectives. The problem of constructing dissimilarity measures based on the Hamming distance for generating controlled random binary test sets is solved. The main goal of this article is to develop Hamming-distance-based methods capable of distinguishing test sets that coincide according to estimates of other dissimilarity measures.
Methods. Based on the Hamming distance used in the theory and practice of generating controlled random tests, new dissimilarity measures are proposed for two binary n-bit test patterns. The proposed measures are based on forming sets of Hamming distances for the initial sets, represented as sequences of characters from different alphabets.
Results. The indistinguishability of pairs of binary test sets Ti and Tk when using a dissimilarity measure based on the Hamming distance is shown: different pairs of sets may have identical Hamming distance values. To construct new measures of difference, the original binary test sequences are represented as sequences of characters belonging to different alphabets. Various strategies are proposed for applying the new dissimilarity measures based on one of three rules for generating controlled random tests. It is shown that in all three cases only the first few components of the dissimilarity measures are informative, as a rule no more than two or three. Accordingly, the computational complexity of all three options is comparable and does not exceed 3n comparison operations. The experimental studies carried out confirm the effectiveness of the proposed dissimilarity measures and their low computational complexity.
Conclusion. The proposed dissimilarity measures expand the possibilities of generating test sets when forming controlled random tests. It is shown that test sets that are indistinguishable when the Hamming distance is used as a dissimilarity measure have different values of the proposed measures, which makes it possible to classify randomly generated sets that are candidate test cases more accurately.
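The exact construction of the proposed measures is given in the article; the sketch below only illustrates the general idea of re-reading the same n-bit patterns as sequences of symbols from wider alphabets and collecting the resulting Hamming distances into a profile. The symbol widths and the comparison rule are assumptions for demonstration.

```python
def hamming(a, b) -> int:
    """Hamming distance between two equal-length symbol sequences."""
    return sum(x != y for x, y in zip(a, b))

def symbols(bits: str, width: int) -> list:
    """Re-read an n-bit pattern as symbols of a 2^width-character alphabet."""
    return [bits[i:i + width] for i in range(0, len(bits), width)]

def dissimilarity_profile(ti: str, tk: str, widths=(1, 2, 4)) -> list:
    """Set of Hamming distances obtained for several alphabets (illustrative rule)."""
    return [hamming(symbols(ti, w), symbols(tk, w)) for w in widths]

# Two pairs of 8-bit test patterns with the same classical Hamming distance
# (width 1) can still differ at coarser alphabets, which is what the extra
# components of the profile capture.
print(dissimilarity_profile("11001010", "11000101"))  # [4, 2, 1]
print(dissimilarity_profile("01101010", "10101001"))  # [4, 2, 2]
```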
SIGNAL, IMAGE, SPEECH, TEXT PROCESSING AND PATTERN RECOGNITION
Objectives. The purpose of the work is to select the basic computing microplatform of the onboard microarchitectural computing complex for detecting anomalous situations on the territory of the Republic of Belarus from space using artificial intelligence methods.
Methods. The method of comparative analysis is used to select the computing platform. A series of performance tests and a comparative analysis (benchmarking) are performed on the selected equipment. The comparative and benchmarking analyses are carried out in accordance with the terms of reference for the current project.
Results. A comparative analysis and performance testing of the Raspberry Pi 4 Model B and Cool Pi 4 Model B single-board computers, as well as of the Google Coral USB Accelerator AI accelerator with the Google Edge TPU, have been performed. The comparative analysis showed that the Raspberry Pi 4 Model B and Cool Pi 4 Model B fully meet the terms of reference for the current project. At the same time, the Cool Pi 4 Model B handles neural network calculations well, but four times slower than similar calculations on the Google Coral USB Accelerator. Neural network computations on the Raspberry Pi 4 Model B are 22 times slower than similar computations on the Google Coral USB Accelerator. The Cool Pi 4 Model B outperforms the Raspberry Pi 4 Model B by a factor of two to three for data copying and compression and is almost six times faster for neural network computations.
Conclusion. Although the Raspberry Pi 4 Model B meets the terms of reference for the project as a computational basis, when developing an onboard microarchitectural computing system for detecting anomalous situations it is worth using more powerful alternatives with built-in AI accelerators (e.g., Radxa Rock 5 Model A) or with an additional external AI accelerator (e.g., a combination of the Cool Pi 4 Model B and the Google Coral USB Accelerator). Using a Raspberry Pi 4 Model B with an additional AI accelerator is also acceptable and will speed up computations by several dozen times. AI accelerators provide the fastest neural network computations, but there are features related to the novelty of the technology that will be explored in further development.
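The benchmark suite used in the project is not reproduced here; as an illustration, a generic timing sketch for TensorFlow Lite inference with an optional Edge TPU delegate is given below. The model paths, iteration count, and delegate library name follow standard tflite_runtime conventions and are assumptions rather than the project's actual test harness.

```python
import time
import numpy as np
import tflite_runtime.interpreter as tflite

def benchmark(model_path: str, runs: int = 100, use_edgetpu: bool = False) -> float:
    """Return the mean inference time (ms) of a TFLite model, optionally on the Edge TPU."""
    delegates = [tflite.load_delegate("libedgetpu.so.1")] if use_edgetpu else []
    interpreter = tflite.Interpreter(model_path=model_path,
                                     experimental_delegates=delegates)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    dummy = np.zeros(inp["shape"], dtype=inp["dtype"])

    interpreter.set_tensor(inp["index"], dummy)
    interpreter.invoke()                      # warm-up run
    start = time.perf_counter()
    for _ in range(runs):
        interpreter.set_tensor(inp["index"], dummy)
        interpreter.invoke()
    return (time.perf_counter() - start) / runs * 1000

# Hypothetical usage on the boards under test:
# print(benchmark("model_cpu.tflite"))                        # CPU only
# print(benchmark("model_edgetpu.tflite", use_edgetpu=True))  # with Google Coral USB Accelerator
```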
Objectives. The task of color image segmentation without preliminary training is considered. It arises, for example, when image segmentation must be performed immediately after acquisition on images whose semantic and color properties are not known in advance, when the set of images intended for segmentation is too small, or when a preliminary "exploratory" analysis of images is performed. In such cases, powerful neural network and other segmentation tools that require deep learning cannot be used.
Methods. An algorithm for interactive image segmentation is proposed, based on the analysis of the colors of interactively selected areas. First, in interactive mode, image areas belonging to the objects are selected very approximately, and then regions belonging to the background are chosen. In the next step, the set of colors of the selected object areas and the set of colors of the selected background areas are clustered separately by one of the clustering algorithms, for example, k-means, fuzzy c-means, or the multi-level clustering algorithm proposed by the author. After this, non-informative elements are removed from the set of cluster centers describing the objects and from the set of cluster centers representing the background. The modified sets of object and background cluster centers are used for image segmentation.
Results. The constructed algorithm allows selection of the required objects in color images if the colors of the objects and the background differ. Interactive selection of object and background areas does not require accuracy or much effort and usually takes several tens of seconds. For selection, rectangular areas lying entirely inside the object images and rectangular areas belonging completely to the background can be used. An example of interactive region selection and color image segmentation is presented.
Conclusion. The experiments performed showed the effectiveness of the proposed approach to segmenting color images. It can be used in cases where the semantic and color properties of images are not known in advance, and in cases where the use of more powerful deep learning methods, including neural networks, is too expensive or impossible.
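A minimal sketch of the overall segmentation flow is given below, assuming k-means from scikit-learn for the color clustering step (the author's multi-level clustering algorithm and the removal of non-informative cluster centers are omitted); the rectangle coordinates and number of clusters are placeholders.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_colors(image, rects, n_clusters=5):
    """Cluster the colors of pixels inside the given (y0, y1, x0, x1) rectangles."""
    pixels = np.vstack([image[y0:y1, x0:x1].reshape(-1, 3) for y0, y1, x0, x1 in rects])
    return KMeans(n_clusters=n_clusters, n_init=10).fit(pixels).cluster_centers_

def segment(image, object_rects, background_rects):
    """Label a pixel as object (1) if its nearest cluster center is an object center."""
    obj_centers = cluster_colors(image, object_rects)
    bg_centers = cluster_colors(image, background_rects)
    centers = np.vstack([obj_centers, bg_centers])
    flat = image.reshape(-1, 3).astype(float)
    # Distance from every pixel to every cluster center, then nearest-center labeling.
    d = np.linalg.norm(flat[:, None, :] - centers[None, :, :], axis=2)
    labels = (np.argmin(d, axis=1) < len(obj_centers)).astype(np.uint8)
    return labels.reshape(image.shape[:2])

# image: H x W x 3 RGB array; rectangles are rough interactive selections.
# mask = segment(image, object_rects=[(10, 60, 20, 80)], background_rects=[(0, 10, 0, 100)])
```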
Objectives. The goal of the research is to develop a new person-dependent method for verifying a person's signature made on a tablet with a stylus, given a limited number of signature samples of this person.
Methods. The paper shows how to construct an individual pattern of the dynamic signatures of any person, which is described by points in a multidimensional feature space and is intended for subsequent verification of the authenticity of the signatures of that person. It is constructed using 5 < N < 20 samples of genuine human signatures. The pattern forms a convex object in the multidimensional feature space and describes the peculiar properties of a signature performed by a specific person.
Results. The dynamics of signature execution is represented by three discrete parametric functions: the stylus coordinates X, Y and its pressure on the tablet P, recorded at fixed time intervals. In the course of the research, a number of secondary feature functions were selected and computed from them. Since these data sets have different lengths, the dynamic time warping algorithm is used to compare them. The results of this transformation are distances between the dynamic features of two signatures, which serve as the coordinates of a point in the feature space describing the similarity of these signatures. The set of such points describes the similarity of all pairs of genuine human signatures presented for verification in the multidimensional feature space. The convex hull of the cloud of these points is used as the pattern of a particular person's signature. The genuine signatures of any person always differ from each other; significant differences between them can distort the verification result.
Conclusion. Experimental studies performed on genuine and fake signatures of 498 people from the largest available database of dynamic signatures, DeepSignDB, showed a verification accuracy of about 98 % when analyzing 24,900 signatures, half of which are genuine and half are fake.
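A simplified sketch of the general scheme is given below: pairwise DTW distances between the X, Y and P functions of two signatures form a point in the feature space, the convex hull of all pairwise points over the genuine samples serves as the pattern, and a candidate signature is accepted if its points fall inside the hull (tested here via scipy's Delaunay triangulation). The choice of feature functions and the acceptance rule are simplified assumptions, not the authors' exact method.

```python
import numpy as np
from scipy.spatial import Delaunay

def dtw(a, b):
    """Dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    d = np.full((n + 1, m + 1), np.inf)
    d[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i, j] = cost + min(d[i - 1, j], d[i, j - 1], d[i - 1, j - 1])
    return d[n, m]

def pair_point(sig_a, sig_b):
    """Point in the feature space: DTW distances between the X, Y and pressure P functions."""
    return [dtw(sig_a[k], sig_b[k]) for k in ("x", "y", "p")]

def build_pattern(genuine):
    """Convex hull of all pairwise points for N genuine signatures (5 < N < 20)."""
    pts = [pair_point(genuine[i], genuine[j])
           for i in range(len(genuine)) for j in range(i + 1, len(genuine))]
    return Delaunay(np.array(pts))

def verify(pattern, genuine, candidate):
    """Accept if every point (candidate vs. genuine sample) lies inside the hull."""
    pts = np.array([pair_point(candidate, g) for g in genuine])
    return bool(np.all(pattern.find_simplex(pts) >= 0))

# Each signature is assumed to be a dict {"x": ..., "y": ..., "p": ...} of 1-D arrays.
```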