
Informatics

Vol 17, No 1 (2020)

BIOINFORMATICS 

7-17 2387
Abstract
A generative adversarial autoencoder was developed using deep learning methods for the rational design of potential HIV-1 entry inhibitors able to block the region of the viral envelope protein gp120 that is critical for binding of the virus to the cellular receptor CD4. The research involved designing the architecture of the neural network, forming a virtual compound library of potential anti-HIV-1 agents for training the network, performing molecular docking of all compounds from this library with gp120, calculating the values of binding free energy, and generating molecular fingerprints for the chemical compounds in the training dataset. The neural network was then trained, and the learning outcomes and the performance of the autoencoder were evaluated. The neural network was validated on a wide range of compounds from the ZINC database. The use of the neural network in combination with virtual screening of chemical databases was shown to form a productive platform for identifying basic structures that are promising for the design of novel antiviral drugs inhibiting the early stages of HIV infection.

MATHEMATICAL MODELING 

18-28 902
Abstract
Three-dimensional reconstruction based on the results of video endoscopic examination is a promising area for supporting medical diagnostics and treatment planning for a wide range of pathologies. Nevertheless, assessing the results of such reconstruction and verifying that the obtained three-dimensional model corresponds to the original scene is a significant challenge. As a solution to this problem, the paper suggests using a modelling environment to emulate the process of obtaining source video endoscopic data from a generated scene. The problem of three-dimensional modelling of the esophagus using the Autodesk 3ds Max environment and the Arnold visualization engine is considered. The paper describes the procedural generation of textures for the model and proposes the use of Periodic Spatial Generative Adversarial Network models based on convolutional neural networks. To compare the result of reconstruction with a scene generated in the proposed modelling environment, an optimality criterion is introduced; it is used to compare the individual stages of the three-dimensional reconstruction algorithm when the model is optimized by the bundle adjustment method.
29-38 663
Abstract

A multi-server retrial queueing system with heterogeneous servers is analyzed. Requests arrive at the system according to a Markovian arrival process. Arriving primary requests and requests retrying from the orbit occupy the available server with the highest service rate, if any server is available. Otherwise, the requests move to the orbit, which has infinite capacity. The total retrial rate increases without bound as the number of requests in the orbit increases. Service times are exponentially distributed. The behavior of the system is described by a multi-dimensional continuous-time Markov chain that belongs to the class of asymptotically quasi-Toeplitz Markov chains. This makes it possible to derive a simple and transparent ergodicity condition and to compute the stationary probability distribution of the chain states. The numerical results presented illustrate the dynamics of some system performance indicators and the importance of accounting for correlation in the request arrival process.

39-46 695
Abstract

The paper considers a local wavelet transform with a singular basis wavelet. The problem of nonparametric approximation of a function is solved using a sequence of local wavelet transforms. It is traditionally assumed that a wavelet should have zero mean. Earlier, the author considered singular wavelets, whose mean value is not equal to zero. As an example, the delta-shaped functions that appear in the Parzen-Rosenblatt and Nadaraya-Watson estimates were used as wavelets. Previously, a sequence of wavelet transforms for the entire real axis and for a finite interval was constructed for singular wavelets.

The paper proposes a sequence of local wavelet transforms, defines the local wavelet transform, and proves theorems that establish its properties. To confirm the effectiveness of the algorithm, an example is given in which a function is approximated by a sum of discrete local wavelet transforms.

INFORMATION PROTECTION AND SYSTEM RELIABILITY 

102-108 810
Abstract
The main options for forming a shared secret using synchronized artificial neural networks and the possible patterns of behavior of a cryptanalyst are considered. To increase the confidentiality of the generated shared secret when it is used as a cryptographic key, it is proposed to mix a certain number of results of individual synchronizations (convolution). As a mixing function, we consider the convolution of the vectors of network weights by bitwise addition modulo 2 of all the results of individual synchronizations. It is shown that the probability of success of a cryptanalyst decreases exponentially as the number of terms in the convolution increases and can be made arbitrarily small. Moreover, the distribution of the generated key after convolution is close to uniform, and the uniformity increases with the number of terms in the convolution.
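The mixing step described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: each synchronization result is represented here as a byte string (a hypothetical encoding of the weight vector), and the success estimate assumes that the cryptanalyst must recover every individual result and does so independently with the same probability `p_single`.

```python
import secrets
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise addition modulo 2 of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("partial results must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def mix_keys(partial_results):
    """Fold all individual synchronization results into one key by XOR."""
    results = list(partial_results)
    if not results:
        raise ValueError("at least one partial result is required")
    return reduce(xor_bytes, results)


def attacker_success_probability(p_single: float, k: int) -> float:
    """Assumed model: the attacker needs all k results, each recovered
    independently with probability p_single, so success decays as p**k."""
    return p_single ** k


# Hypothetical usage: random bytes stand in for four synchronization results.
partials = [secrets.token_bytes(16) for _ in range(4)]
key = mix_keys(partials)
```

Under this assumed model, a per-synchronization success probability of 0.5 drops to 0.5 ** k after k terms, which matches the exponential decrease stated in the abstract.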
109-118 1040
Abstract
A new hashing technique based on SHA-3 (Secure Hash Algorithm-3) is introduced. Chaotic maps are used in this technique to improve performance without sacrificing security. The introduced algorithm was tested for collision resistance, the output sequences were analyzed statistically, and the hashing performance was evaluated. The testing showed a low collision probability. The statistical testing followed the standards of the National Institute of Standards and Technology and showed that the output sequences are close to random. Performance testing showed a 60 % improvement over plain SHA-3.

LOGICAL DESIGN 

47-62 951
Abstract
The relevance of testing modern computing systems and, first of all, their storage devices is shown. The studies are based on a universal method for generating address sequences with desired properties for multiple March tests of random access memory devices. A modification of the economical method of Antonov and Saleev is used as the mathematical model for forming Sobol sequences. For this model, a structural diagram of its hardware implementation is presented, in which a storage device holding the direction numbers is used as the basis. The set of direction numbers makes up the generating matrix. It is noted that the form of the generating matrix determines the basic properties of the generated sequences. Mathematical expressions are obtained that make it possible to estimate the limiting values of the switching activity both of the sequence itself and of its individual bits. A technique is proposed for synthesizing generators of address sequences with a given switching activity both of individual bits and of the sequence as a whole. Examples of the application of the proposed methods are considered. The applicability of the presented results to the synthesis of test sequence generators with a given switching activity, for the purpose of testing storage devices and forming controlled random test sequences, is substantiated. The results of the practical implementation of address sequence generators are presented, and their main characteristics are evaluated.
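The generation scheme mentioned in this abstract can be illustrated with a short Python sketch. It uses the standard Antonov-Saleev recurrence, in which each address is the previous one XORed with the direction number selected by the position of the rightmost zero bit of the step counter. The integer encoding of direction numbers, the function names, and the choice of an identity generating matrix (which yields a Gray code) are assumptions made for illustration, not details taken from the paper.

```python
def rightmost_zero_bit(n: int) -> int:
    """Index of the least significant zero bit of n."""
    index = 0
    while n & 1:
        n >>= 1
        index += 1
    return index


def address_sequence(direction_numbers, count):
    """Generate `count` addresses, starting from 0, by the XOR recurrence."""
    if count > 2 ** len(direction_numbers):
        raise ValueError("not enough direction numbers for this many addresses")
    address = 0
    sequence = [address]
    for step in range(count - 1):
        address ^= direction_numbers[rightmost_zero_bit(step)]
        sequence.append(address)
    return sequence


def switching_activity(sequence):
    """Total number of bit flips between consecutive addresses."""
    return sum(bin(a ^ b).count("1") for a, b in zip(sequence, sequence[1:]))


def bit_switching(sequence, width):
    """Number of flips of each individual address bit (bit 0 first)."""
    counts = [0] * width
    for a, b in zip(sequence, sequence[1:]):
        diff = a ^ b
        for i in range(width):
            counts[i] += (diff >> i) & 1
    return counts


# Identity generating matrix (assumed example): rows 001, 010, 100.
gray = address_sequence([0b001, 0b010, 0b100], 8)
```

With the identity matrix, every transition flips exactly one bit, so the total switching activity over eight addresses is seven. Replacing the direction numbers changes both the total and the per-bit activity, which is the dependence on the generating matrix that the abstract refers to.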
63-77 674
Abstract

One direction in the logical optimization of multilevel representations of systems of Boolean functions consists of methods that search for subsystems of functions whose domains share identical parts. Such subsystems are called related. A strong relationship between functions leads to a large number of identical structural parts (conjunctions, algebraic expressions, subfunctions, etc.) in the optimized forms of representation of the functions that are used in the construction of combinational logic circuits. The more closely the functions of the selected subsystem are related, the more likely it is that the representations of the functions of this subsystem will contain more identical subexpressions and that the synthesized logic circuits will be less complex.

We describe software-implemented algorithms for extracting subsystems of related functions from a BDD representation of a system of Boolean functions, based on the introduced numerical estimates of the relationship between BDD representations of functions. Boolean functions are related if they share Boolean vectors on which the functions take the value one, or if their BDD representations contain the same equations. BDD representations of Boolean functions are compact forms that define the functions and are constructed by Shannon decomposition of the functions of the original system (and of the subfunctions resulting from the decomposition) with respect to all variables on which the functions of the original system depend. The experiments show the effectiveness of the proposed algorithms and programs in the synthesis of logic circuits from a library of logic elements.
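The idea of shared subfunctions under Shannon decomposition can be made concrete with a small Python sketch. Functions are given as truth tables (tuples of 0 and 1, with the first variable as the most significant index bit), and the decomposition splits a table into its two cofactors. The relationship score used here, the fraction of distinct non-constant subfunctions that two functions have in common, is a hypothetical stand-in chosen for illustration; the paper's own numerical estimates are not reproduced in the abstract.

```python
def cofactors(table):
    """Shannon cofactors with respect to the first variable:
    the first half is f(x1=0), the second half is f(x1=1)."""
    half = len(table) // 2
    return table[:half], table[half:]


def subfunctions(table):
    """Distinct non-constant subfunctions reached by decomposing
    the function over all variables in a fixed order."""
    seen = set()

    def visit(t):
        if len(t) == 1 or t in seen:
            return
        low, high = cofactors(t)
        if low == high:
            # The function does not depend on this variable; skip the node.
            visit(low)
            return
        seen.add(t)
        visit(low)
        visit(high)

    visit(tuple(table))
    return seen


def shared_fraction(f, g):
    """Hypothetical relationship score: shared subfunctions / all subfunctions."""
    a, b = subfunctions(f), subfunctions(g)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Truth tables over (x1, x2): AND and OR share the subfunction "x2".
f_and = (0, 0, 0, 1)
f_or = (0, 1, 1, 1)
score = shared_fraction(f_and, f_or)
```

In a shared BDD each common subfunction is stored once, so a higher score of this kind suggests that more of the diagram, and hence more of the resulting circuit, can be reused across the functions.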

SIGNAL, IMAGE, SPEECH, TEXT PROCESSING AND PATTERN RECOGNITION 

78-86 1130
Abstract
A comparative study of two types of voice signal representation for larynx pathology detection is presented. Parameters obtained in the clinical system lingWaves are compared with parameters obtained by mel-frequency cepstral analysis. A classifier based on a probabilistic model (logistic regression) was designed to determine the suitability of the given parameters for the larynx pathology detection problem. To train the classifier, a database of voice samples from 60 persons was recorded; 30 of them constitute the control group, and the other 30 had various diseases of the larynx (nodules of the vocal folds, laryngeal paralysis, or functional dysphonia). The results show that the classifier based on mel-frequency cepstral parameters scores higher (83.8 %) than the classifier based on parameters obtained in lingWaves (60.4 %).
87-101 1275
Abstract

The paper describes the results of an analytical and experimental analysis of seventeen functions used to evaluate the results of binary classification of arbitrary data. The results are represented by 2×2 error matrices. The behavior and properties of the main functions calculated from the elements of such matrices are studied. Classification cases with balanced and imbalanced datasets are analyzed. It is shown that some functions are linearly dependent and that many functions are invariant to transposition of the error matrix, which makes it possible to calculate the estimate without specifying the order in which the elements were written to the matrices.

It has been proven that all classical measures such as Sensitivity, Specificity, Precision, Accuracy, F1, F2, GM, and the Jaccard index are sensitive to the imbalance of classified data and distort the estimation of classification errors for objects of the smaller class. Sensitivity to imbalance is also found in the Matthews correlation coefficient and Cohen’s kappa. It has been experimentally shown that functions such as the confusion entropy, the discriminatory power, and the diagnostic odds ratio should not be used for the analysis of binary classification of imbalanced datasets. The last two functions are invariant to the imbalance of classified data, but poorly evaluate results with an approximately equal overall percentage of classification errors in the two classes.

We proved that the area under the ROC curve (AUC) and the Youden index calculated from the binary classification confusion matrix are linearly dependent and are the best estimation functions for both balanced and imbalanced datasets.
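The linear dependence stated above can be checked numerically with a short Python sketch. It assumes the usual single-point construction, in which the ROC curve built from one confusion matrix passes through (0, 0), (FPR, TPR), and (1, 1); the area under that polyline is (TPR + TNR) / 2, while the Youden index is J = TPR + TNR - 1, so AUC = (J + 1) / 2. The function names and the sample counts are illustrative and do not come from the paper.

```python
def rates(tp: int, fn: int, fp: int, tn: int):
    """True positive rate (sensitivity) and true negative rate (specificity)."""
    tpr = tp / (tp + fn)
    tnr = tn / (tn + fp)
    return tpr, tnr


def youden_index(tp: int, fn: int, fp: int, tn: int) -> float:
    """Youden index J = sensitivity + specificity - 1."""
    tpr, tnr = rates(tp, fn, fp, tn)
    return tpr + tnr - 1.0


def single_point_auc(tp: int, fn: int, fp: int, tn: int) -> float:
    """Area under the ROC polyline (0,0) -> (FPR,TPR) -> (1,1)."""
    tpr, tnr = rates(tp, fn, fp, tn)
    fpr = 1.0 - tnr
    # Sum of the triangle under the first segment and the trapezoid under the second.
    return fpr * tpr / 2.0 + (1.0 - fpr) * (tpr + 1.0) / 2.0


# Imbalanced example (made-up counts): 100 positives, 1000 negatives.
j = youden_index(tp=80, fn=20, fp=100, tn=900)
auc = single_point_auc(tp=80, fn=20, fp=100, tn=900)
```

Both quantities depend only on the per-class rates, not on the class sizes, which is consistent with the abstract's claim that they remain usable for imbalanced datasets.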



Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1816-0301 (Print)
ISSN 2617-6963 (Online)