Informatics

Selection of geometrical features of nuclei on fluorescent images of cancer cells

Abstract

Methods for selecting informative geometric features of nuclei in fluorescent images of cancer cells are considered. The study reviews existing geometric features, including shape features that are invariant to rotation and translation of the image, as well as features describing location in space. The following methods were used for feature selection: the median, correlation with the Pearson correlation coefficient, correlation with the Spearman correlation coefficient, a logistic regression model, a random forest of CART trees with the Gini criterion, and a random forest of CART trees with the error minimization criterion. As a result, 11 of the 59 features were selected, and classification quality and time costs were calculated as a function of the number of features used to describe the objects. Using 11 features is sufficient for accurate classification and reduces time costs by a factor of 2.3.
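
Two of the selection methods named above, ranking by Pearson and by Spearman correlation with the class label, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the feature names (`area`, `eccentricity`, `noise`), the synthetic data, and the helper `rank_features` are hypothetical and chosen only for illustration, and the median, logistic regression, and random forest methods are not covered.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    # A constant sequence has no defined correlation; treat it as 0.
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def rank_features(columns, labels, score=pearson):
    """Sort feature names by |correlation with the label|, descending."""
    scored = {name: abs(score(col, labels)) for name, col in columns.items()}
    return sorted(scored, key=scored.get, reverse=True)


# Synthetic two-class data with hypothetical feature names (assumption).
rng = random.Random(0)
labels = [i % 2 for i in range(200)]
columns = {
    "area": [10.0 + 5.0 * y + rng.gauss(0, 1) for y in labels],
    "eccentricity": [0.5 + 0.1 * y + rng.gauss(0, 0.2) for y in labels],
    "noise": [rng.gauss(0, 1) for _ in labels],
}

print(rank_features(columns, labels, score=pearson))
print(rank_features(columns, labels, score=spearman))
```

Keeping only the first k names of such a ranking gives a reduced feature set; how k is chosen, and how the rankings from different methods are combined, is not specified in the abstract.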

About the Authors

Ya. U. Lisitsa
Belarusian State University
Belarus
Yauheniya U. Lisitsa, Researcher, the Faculty of Radiophysics and Computer Technologies


M. M. Yatskou
Belarusian State University
Belarus
Mikalai M. Yatskou, Cand. Sci. (Phys.-Math.), Assoc. Prof., the Faculty of Radiophysics and Computer Technologies


V. V. Skakun
Belarusian State University
Belarus
Victor V. Skakun, Cand. Sci. (Phys.-Math.), Assoc. Prof., the Faculty of Radiophysics and Computer Technologies


P. D. Kryvasheyeu
Belarusian State University
Belarus
Pavel D. Kryvasheyeu, Student, the Faculty of Radiophysics and Computer Technologies


V. V. Apanasovich
Institute of IT & Business Administration
Belarus
Vladimir V. Apanasovich, Dr. Sci. (Phys.-Math.), Professor, First Vice-Rector


For citations:


Lisitsa Ya.U., Yatskou M.M., Skakun V.V., Kryvasheyeu P.D., Apanasovich V.V. Selection of geometrical features of nuclei on fluorescent images of cancer cells. Informatics. 2019;16(2):7-17. (In Russ.)


Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1816-0301 (Print)
ISSN 2617-6963 (Online)