The study of the reliability of the hardware part of the office cluster


Reliability measures for the hardware part of an office cluster were studied using the SKIF-GEO-Office RB cluster (hereafter "the cluster"), developed within the scientific and technical program "SKIF-NEDRA" (2015-2018, a program of the Union State of Russia and Belarus). The cluster components are housed in a small rack built on a full-tower "Aerocool Expredator Black" case.

The basic architectural principles implemented in the cluster, its composition, and its structural and functional scheme are presented. The methodological basis for calculating the cluster's reliability, which builds on the authors' previous studies, and the cluster's structural reliability scheme are justified. The choice of the main reliability measures for the cluster core and for the set of computing facilities is justified, and formulas for calculating these measures are given. The consequences of failures of the cluster's component parts are analyzed.

A mathematical reliability model (state graph) of the cluster's set of computing facilities is proposed, from which formulas for the mean time to failure and the mean time to interruption of the cluster are derived. The reliability of the cluster as a whole is estimated by calculating reliability measures from reference data on component reliability and from operating experience with supercomputers of the SKIF family. The reliability measures of the cluster are calculated.
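
To illustrate how a state graph can yield a mean time to failure, the minimal Python sketch below treats a deliberately simplified case. It is not the authors' model. It assumes identical, independent compute nodes with a constant failure rate, no repair, and a cluster that counts as failed once fewer than a required number of nodes remain. The function name, parameters, and numeric values are hypothetical.

```python
def mean_time_to_failure(n_nodes, min_nodes, failure_rate):
    """Mean time to failure of a simplified, non-repairable cluster.

    Assumed model (illustration only, not taken from the article):
    the state graph is a pure-death Markov chain whose state i is the
    number of working nodes. State i is left at rate i * failure_rate,
    so the mean time spent in it is 1 / (i * failure_rate). The cluster
    fails on leaving state min_nodes, hence the mean time to failure is
    the sum of mean sojourn times over states n_nodes down to min_nodes.
    """
    if not 1 <= min_nodes <= n_nodes:
        raise ValueError("require 1 <= min_nodes <= n_nodes")
    if failure_rate <= 0:
        raise ValueError("failure_rate must be positive")
    return sum(1.0 / (i * failure_rate) for i in range(min_nodes, n_nodes + 1))


if __name__ == "__main__":
    # Hypothetical figures: 4 nodes, at least 3 required,
    # node failure rate of 1e-4 per hour.
    hours = mean_time_to_failure(n_nodes=4, min_nodes=3, failure_rate=1e-4)
    print(f"Mean time to failure: {hours:.1f} h")
```

A model with repair or with distinct component types would need the full system of state equations. The closed-form sum applies only under the assumptions stated above.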

About the Authors

T. S. Martinovich
The United Institute of Informatics Problems of the National Academy of Sciences of Belarus

Tatyana S. Martinovich - Researcher, The United Institute of Informatics Problems of the National Academy of Sciences of Belarus.

st. Surganova, 6, Minsk, 220012.

N. N. Paramonov
The United Institute of Informatics Problems of the National Academy of Sciences of Belarus

Nikolaj N. Paramonov - Cand. Sci. (Eng.), Associate Professor, Leading Researcher, The United Institute of Informatics Problems of the National Academy of Sciences of Belarus.

st. Surganova, 6, Minsk, 220012.

A. G. Rymarchuk
The United Institute of Informatics Problems of the National Academy of Sciences of Belarus

Aleksandr G. Rymarchuk - Chief Designer of the Project, The United Institute of Informatics Problems of the National Academy of Sciences of Belarus.

st. Surganova, 6, Minsk, 220012.

O. P. Tchij
The United Institute of Informatics Problems of the National Academy of Sciences of Belarus

Oleg P. Tchij - Cand. Sci. (Phys.-Math.), Head of the Laboratory of High-Performance Systems, The United Institute of Informatics Problems of the National Academy of Sciences of Belarus.

st. Surganova, 6, Minsk, 220012.


For citations:

Martinovich T.S., Paramonov N.N., Rymarchuk A.G., Tchij O.P. The study of the reliability of the hardware part of the office cluster. Informatics. 2021;18(2):48-57. (In Russ.)

This work is licensed under a Creative Commons Attribution 4.0 License.

ISSN 1816-0301 (Print)
ISSN 2617-6963 (Online)