<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.3 20210610//EN" "JATS-journalpublishing1-3.dtd">
<article article-type="research-article" dtd-version="1.3" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xml:lang="ru"><front><journal-meta><journal-id journal-id-type="publisher-id">inform</journal-id><journal-title-group><journal-title xml:lang="ru">Информатика</journal-title><trans-title-group xml:lang="en"><trans-title>Informatics</trans-title></trans-title-group></journal-title-group><issn pub-type="ppub">1816-0301</issn><issn pub-type="epub">2617-6963</issn><publisher><publisher-name>UIIP NASB</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.37661/1816-0301-2023-20-1-55-74</article-id><article-id custom-type="elpub" pub-id-type="custom">inform-1228</article-id><article-categories><subj-group subj-group-type="heading"><subject>Research Article</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="ru"><subject>ИНТЕЛЛЕКТУАЛЬНЫЕ СИСТЕМЫ</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="en"><subject>INTELLIGENT SYSTEMS</subject></subj-group></article-categories><title-group><article-title>Классификация займов c использованием логистической регрессии</article-title><trans-title-group xml:lang="en"><trans-title>Loan classification using logistic regression</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author" corresp="yes"><contrib-id contrib-id-type="orcid">https://orcid.org/0000-0003-1142-3992</contrib-id><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Бегунков</surname><given-names>В. И.</given-names></name><name name-style="western" xml:lang="en"><surname>Behunkou</surname><given-names>U. I.</given-names></name></name-alternatives><bio xml:lang="ru"><p>Бегунков Владимир Иванович, магистр технических наук</p><p>ул. Сурганова, 6, Минск, 220012</p></bio><bio xml:lang="en"><p>Uladzimir I. Behunkou, M. Sc. 
(Eng.)</p><p> </p></bio><email xlink:type="simple">vbegunkov@gmail.com</email><xref ref-type="aff" rid="aff-1"/></contrib><contrib contrib-type="author" corresp="yes"><contrib-id contrib-id-type="orcid">https://orcid.org/0000-0003-0832-0829</contrib-id><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Ковалев</surname><given-names>М. Я.</given-names></name><name name-style="western" xml:lang="en"><surname>Kovalyov</surname><given-names>M. Y.</given-names></name></name-alternatives><bio xml:lang="ru"><p>Ковалев Михаил Яковлевич, член-корреспондент НАН Беларуси, доктор физико-математических наук, профессор</p><p>ул. Сурганова, 6, Минск, 220012</p></bio><bio xml:lang="en"><p>Mikhail Y. Kovalyov, Corresponding Member of the National Academy of Sciences of Belarus, D. Sc. (Phys.-Math.), Prof.</p><p> </p></bio><email xlink:type="simple">kovalyov_my@newman.bas-net.by</email><xref ref-type="aff" rid="aff-1"/></contrib></contrib-group><aff-alternatives id="aff-1"><aff xml:lang="ru"><institution>Объединенный институт проблем информатики Национальной академии наук Беларуси</institution></aff><aff xml:lang="en"><institution>The United Institute of Informatics Problems of the National Academy of Sciences of Belarus</institution></aff></aff-alternatives><pub-date pub-type="collection"><year>2023</year></pub-date><pub-date pub-type="epub"><day>29</day><month>03</month><year>2023</year></pub-date><volume>20</volume><issue>1</issue><fpage>55</fpage><lpage>74</lpage><permissions><copyright-statement>Copyright &#x00A9; Бегунков В.И., Ковалев М.Я., 2023</copyright-statement><copyright-year>2023</copyright-year><copyright-holder xml:lang="ru">Бегунков В.И., Ковалев М.Я.</copyright-holder><copyright-holder xml:lang="en">Behunkou U.I., Kovalyov M.Y.</copyright-holder><license xml:lang="ru" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>Данная работа распространяется под лицензией Creative 
Commons Attribution 4.0.</license-p></license><license xml:lang="en" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>This work is licensed under a Creative Commons Attribution 4.0 License.</license-p></license></permissions><self-uri xlink:href="https://inf.grid.by/jour/article/view/1228">https://inf.grid.by/jour/article/view/1228</self-uri><abstract><sec><title>Цели</title><p>Цели. Решение задачи классификации займов имеет большое значение для финансовых институтов, которые должны эффективно распределять денежные активы между субъектами в рамках предоставления финансовых услуг. Поэтому финансовым организациям необходим инструмент наиболее точного определения надежных заемщиков. Одним из инструментов принятия таких решений служит машинное обучение. Целью работы является анализ возможности эффективного применения логистической регрессии для решения задачи классификации займов.</p></sec><sec><title>Метод</title><p>Метод. На основе алгоритма логистической регрессии с использованием исторических данных по выданным займам рассчитываются следующие метрики: стоимостная функция, Accuracy, Precision, Recall и F-мера. Полиномиальная регрессия и метод главных компонент применяются для определения оптимального набора входных данных для исследуемого алгоритма логистической регрессии.</p></sec><sec><title>Результаты</title><p>Результаты. Оценено влияние нормализации данных на конечный результат, дана оценка влияния сбалансированности целевых значений, рассчитано оптимальное граничное значение для алгоритма логистической регрессии, рассмотрено влияние увеличения входных показателей посредством заполнения отсутствующих значений и использования полиномов разной степени. Имеющийся набор входных показателей проанализирован на избыточность.</p></sec><sec><title>Заключение</title><p>Заключение. 
Результаты исследований подтверждают, что применение алгоритма логистической регрессии для решения задач классификации займов является целесообразным. Данный алгоритм позволяет быстро получить работающий инструмент классификации займов.</p></sec></abstract><trans-abstract xml:lang="en"><sec><title>Objectives</title><p>Objectives. Loan classification is particularly important for financial institutions, which must allocate monetary assets efficiently among entities when providing financial services. Financial institutions therefore need a tool that identifies reliable borrowers as accurately as possible. Machine learning is one of the tools for making such decisions. The aim of this work is to analyze whether logistic regression can be applied effectively to the loan classification problem.</p></sec><sec><title>Methods</title><p>Methods. Using the logistic regression algorithm and historical data on issued loans, the following metrics are calculated: cost function, Accuracy, Precision, Recall, and F-score. Polynomial regression and principal component analysis are used to determine the optimal set of input data for the logistic regression algorithm under study.</p></sec><sec><title>Results</title><p>Results. The impact of data normalization on the final result is estimated; the optimal regularization parameter for this problem is determined; the impact of the balance of target values is assessed; the optimal threshold value for the logistic regression algorithm is calculated; the effect of increasing the number of input features by filling in missing values and by using polynomials of different degrees is considered; and the existing set of input features is analyzed for redundancy.</p></sec><sec><title>Conclusion</title><p>Conclusion. 
The results confirm that logistic regression is appropriate for solving loan classification problems. The algorithm makes it possible to obtain a working loan classification tool quickly.</p></sec></trans-abstract><kwd-group xml:lang="ru"><kwd>классификация займа</kwd><kwd>скоринг</kwd><kwd>логистическая регрессия</kwd><kwd>машинное обучение</kwd><kwd>нормализация данных</kwd></kwd-group><kwd-group xml:lang="en"><kwd>loan classification</kwd><kwd>scoring</kwd><kwd>logistic regression</kwd><kwd>machine learning</kwd><kwd>data normalization</kwd></kwd-group></article-meta></front><back><ref-list><title>References</title><ref id="cit1"><label>1</label><citation-alternatives><mixed-citation xml:lang="ru">Gerhard F., Harlalka A., Suvanam R. The coming opportunity in consumer lending. McKinsey Quarterly, 2021. Available at: https://www.mckinsey.com/business-functions/risk-and-resilience/our-insights/the-comingopportunity-in-consumer-lending (accessed 01.05.2021).</mixed-citation><mixed-citation xml:lang="en">Gerhard F., Harlalka A., Suvanam R. The coming opportunity in consumer lending. McKinsey Quarterly, 2021. Available at: https://www.mckinsey.com/business-functions/risk-and-resilience/our-insights/the-comingopportunity-in-consumer-lending (accessed 01.05.2021).</mixed-citation></citation-alternatives></ref><ref id="cit2"><label>2</label><citation-alternatives><mixed-citation xml:lang="ru">Hand D. J., Henley W. E. Statistical classification methods in consumer credit scoring: a review. Journal of the Royal Statistical Society: Series A (Statistics in Society), 1997, vol. 160, no. 3, pp. 523–541.</mixed-citation><mixed-citation xml:lang="en">Hand D. J., Henley W. E. Statistical classification methods in consumer credit scoring: a review. Journal of the Royal Statistical Society: Series A (Statistics in Society), 1997, vol. 160, no. 3, pp. 
523–541.</mixed-citation></citation-alternatives></ref><ref id="cit3"><label>3</label><citation-alternatives><mixed-citation xml:lang="ru">Baesens B., Van Gestel T., Viaene S., Stepanova S., Suykens J., Vanthienen J. Benchmarking state-of-the-art classification algorithms for credit scoring. Journal of the Operational Research Society, 2003, vol. 54, no. 6, pp. 627–635.</mixed-citation><mixed-citation xml:lang="en">Baesens B., Van Gestel T., Viaene S., Stepanova S., Suykens J., Vanthienen J. Benchmarking state-of-the-art classification algorithms for credit scoring. Journal of the Operational Research Society, 2003, vol. 54, no. 6, pp. 627–635.</mixed-citation></citation-alternatives></ref><ref id="cit4"><label>4</label><citation-alternatives><mixed-citation xml:lang="ru">Lessmann S., Baesens B., Seow H.-V., Thomas L. C. Benchmarking state-of-the-art classification algorithms for credit scoring: an update of research. European Journal of Operational Research, 2015, vol. 247, no. 1, pp. 124–136.</mixed-citation><mixed-citation xml:lang="en">Lessmann S., Baesens B., Seow H.-V., Thomas L. C. Benchmarking state-of-the-art classification algorithms for credit scoring: an update of research. European Journal of Operational Research, 2015, vol. 247, no. 1, pp. 124–136.</mixed-citation></citation-alternatives></ref><ref id="cit5"><label>5</label><citation-alternatives><mixed-citation xml:lang="ru">Shalev-Shwartz S., Ben-David S. Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, 2014, pp. 125, 126–127.</mixed-citation><mixed-citation xml:lang="en">Shalev-Shwartz S., Ben-David S. Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, 2014, pp. 125, 126–127.</mixed-citation></citation-alternatives></ref><ref id="cit6"><label>6</label><citation-alternatives><mixed-citation xml:lang="ru">Geron A. Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd edition. O’Reilly Media, 2019, pp. 
144–149.</mixed-citation><mixed-citation xml:lang="en">Geron A. Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd edition. O’Reilly Media, 2019, pp. 144–149.</mixed-citation></citation-alternatives></ref><ref id="cit7"><label>7</label><citation-alternatives><mixed-citation xml:lang="ru">Murphy K. P. Machine Learning: A Probabilistic Perspective (Adaptive Computation and Machine Learning Series). The MIT Press, 2012, pp. 225–227, 387–407.</mixed-citation><mixed-citation xml:lang="en">Murphy K. P. Machine Learning: A Probabilistic Perspective (Adaptive Computation and Machine Learning Series). The MIT Press, 2012, pp. 225–227, 387–407.</mixed-citation></citation-alternatives></ref><ref id="cit8"><label>8</label><citation-alternatives><mixed-citation xml:lang="ru">Harrington P. Machine Learning in Action, 1st edition. Manning Publication Co, 2012, pp. 86–91, 148, 269–279.</mixed-citation><mixed-citation xml:lang="en">Harrington P. Machine Learning in Action, 1st edition. Manning Publication Co, 2012, pp. 86–91, 148, 269–279.</mixed-citation></citation-alternatives></ref><ref id="cit9"><label>9</label><citation-alternatives><mixed-citation xml:lang="ru">Fawcett T. An introduction to ROC analysis. Pattern Recognition Letters, 2006, vol. 27, no. 8, pp. 861–874.</mixed-citation><mixed-citation xml:lang="en">Fawcett T. An introduction to ROC analysis. Pattern Recognition Letters, 2006, vol. 27, no. 8, pp. 861–874.</mixed-citation></citation-alternatives></ref><ref id="cit10"><label>10</label><citation-alternatives><mixed-citation xml:lang="ru">Metz C. E. Basic principles of ROC analysis. Seminars in Nuclear Medicine, 1978, vol. 8, no. 4, pp. 283–298.</mixed-citation><mixed-citation xml:lang="en">Metz C. E. Basic principles of ROC analysis. Seminars in Nuclear Medicine, 1978, vol. 8, no. 4, pp. 283–298.</mixed-citation></citation-alternatives></ref><ref id="cit11"><label>11</label><citation-alternatives><mixed-citation xml:lang="ru">Kelleher J. 
D., Namee B. M., D’Arcy A. Fundamentals of Machine Learning for Predictive Data Analytics, 1st edition. The MIT Press, 2015, pp. 142–143, 539.</mixed-citation><mixed-citation xml:lang="en">Kelleher J. D., Namee B. M., D’Arcy A. Fundamentals of Machine Learning for Predictive Data Analytics, 1st edition. The MIT Press, 2015, pp. 142–143, 539.</mixed-citation></citation-alternatives></ref></ref-list><fn-group><fn fn-type="conflict"><p>The authors declare that there are no conflicts of interest present.</p></fn></fn-group></back></article>
