Informatics

The study of the reliability of the hardware part of the office cluster

https://doi.org/10.37661/1816-0301-2021-18-2-48-57

Abstract

The reliability measures of the hardware of an office cluster were studied using the SKIF-GEO-Office RB cluster (hereinafter "the cluster") as an example. The cluster was developed within the scientific and technical program "SKIF-NEDRA" (2015-2018), a program of the Union State of Russia and Belarus. Its components are housed in a small rack built on a full-tower Aerocool Expredator Black case.

The paper presents the basic architectural principles implemented in the cluster, along with its composition and its structural and functional diagram. The methodological basis for calculating the cluster's reliability, which builds on the authors' previous studies, and the cluster's reliability block diagram are substantiated. The choice of the main reliability measures for the cluster core and for the set of computing facilities is justified, and formulas for calculating these measures are given. The consequences of failures of the cluster's component parts are analyzed.
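As a generic illustration of what such reliability measures look like, the sketch below computes textbook measures for a series reliability structure with constant failure rates. It is not taken from the paper: the exponential failure law, the component names, and the numeric rates are all assumptions chosen only to make the example concrete.

```python
from math import exp

def series_failure_rate(component_rates):
    """Failure rate of a series structure: the system fails when any component fails."""
    return sum(component_rates)

def mean_time_to_failure(rate):
    """Mean time to failure under a constant failure rate (exponential law)."""
    return 1.0 / rate

def survival_probability(rate, hours):
    """Probability of failure-free operation over the given number of hours."""
    return exp(-rate * hours)

def steady_state_availability(mttf_hours, mttr_hours):
    """Long-run fraction of time the system is up, given mean repair time."""
    return mttf_hours / (mttf_hours + mttr_hours)

# Hypothetical per-hour failure rates; these are NOT values from the paper.
example_rates = {"power_supply": 1e-5, "switch": 2e-5, "head_node": 3e-5}

core_rate = series_failure_rate(example_rates.values())
core_mttf = mean_time_to_failure(core_rate)
```

With real data, the rates would come from component reference data of the kind the abstract mentions, and the structure would follow the cluster's actual reliability block diagram rather than a pure series connection.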

A mathematical reliability model (a state graph) of the cluster's set of computing facilities is proposed; it makes it possible to derive formulas for the mean time to failure and the mean time to interruption of the cluster. The reliability of the cluster as a whole is estimated by calculating reliability measures from reference data on component reliability and from operating experience with supercomputers of the SKIF family. The cluster's reliability measures are calculated.
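To show how a state graph yields a mean time to failure, here is a minimal sketch of the standard Markov-chain approach: treat the failed condition as an absorbing state and solve a linear system for the expected time to reach it. The abstract does not give the authors' actual state graph, so the node-pool model below (identical nodes, one repair at a time, a hypothetical `tolerated` number of failed nodes) and all parameter names are assumptions.

```python
def mean_time_to_absorption(q_transient):
    """Solve Q_T * t = -1 by Gauss-Jordan elimination.

    q_transient[i][j] is the transition rate between transient states i and j;
    each diagonal entry is minus the total outflow rate of that state,
    including flow into the absorbing (failed) state.
    Returns the expected time to absorption from each transient state.
    """
    n = len(q_transient)
    a = [[float(v) for v in row] + [-1.0] for row in q_transient]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0.0:
                f = a[r][col]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[col])]
    return [row[n] for row in a]

def node_pool_generator(n_nodes, tolerated, lam, mu):
    """Transient part of the generator for a pool of identical nodes.

    State k = number of failed nodes, k = 0..tolerated; reaching
    tolerated + 1 failed nodes is the absorbing failure state.
    lam is the per-node failure rate, mu the repair rate (one repair at a time).
    """
    size = tolerated + 1
    q = [[0.0] * size for _ in range(size)]
    for k in range(size):
        fail = (n_nodes - k) * lam
        repair = mu if k > 0 else 0.0
        if k + 1 < size:
            q[k][k + 1] = fail
        if k > 0:
            q[k][k - 1] = repair
        q[k][k] = -(fail + repair)
    return q

# Hypothetical example: two nodes, one tolerated failure, unit rates.
times = mean_time_to_absorption(node_pool_generator(2, 1, lam=1.0, mu=1.0))
```

For two nodes with one tolerated failure, the first entry of `times` agrees with the closed-form result (3λ + μ) / (2λ²) for a repairable two-unit parallel system. Setting `tolerated=0` gives the mean time to the first node failure, which is only a rough analogue of a time-to-interruption measure; the paper's own definitions may differ.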

About the Authors

T. S. Martinovich
The United Institute of Informatics Problems of the National Academy of Sciences of Belarus
Belarus

Tatyana S. Martinovich - Researcher, The United Institute of Informatics Problems of the National Academy of Sciences of Belarus.

st. Surganova, 6, Minsk, 220012.



N. N. Paramonov
The United Institute of Informatics Problems of the National Academy of Sciences of Belarus
Belarus

Nikolaj N. Paramonov - Cand. Sci. (Eng.), Associate Professor, Leading Researcher, The United Institute of Informatics Problems of the National Academy of Sciences of Belarus.

st. Surganova, 6, Minsk, 220012.



A. G. Rymarchuk
The United Institute of Informatics Problems of the National Academy of Sciences of Belarus
Belarus

Aleksandr G. Rymarchuk - Chief Designer of the Project, The United Institute of Informatics Problems of the National Academy of Sciences of Belarus.

st. Surganova, 6, Minsk, 220012.



O. P. Tchij
The United Institute of Informatics Problems of the National Academy of Sciences of Belarus
Belarus

Oleg P. Tchij - Cand. Sci. (Phys.-Math.), Head of the Laboratory of High-Performance Systems, The United Institute of Informatics Problems of the National Academy of Sciences of Belarus.

st. Surganova, 6, Minsk, 220012.




For citations:


Martinovich T.S., Paramonov N.N., Rymarchuk A.G., Tchij O.P. The study of the reliability of the hardware part of the office cluster. Informatics. 2021;18(2):48-57. (In Russ.) https://doi.org/10.37661/1816-0301-2021-18-2-48-57


Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1816-0301 (Print)
ISSN 2617-6963 (Online)