A Semi-deterministic Ensemble Strategy For Imbalanced Datasets (SDEID) Applied To Bankruptcy Prediction
Free (open access)
205 - 213
R. A. Mathiasi Horta, B. S. L. Pires de Lima & C. C. H. Borges
Over the last decade, the availability and use of credit for Brazilian companies has grown rapidly. Until recently, the decision to grant credit relied on human judgment to evaluate the risk of insolvency. Increased demand for credit from companies has led to the use of more accurate models for bankruptcy prediction. In recent years, model development has advanced considerably, fostered by increased competition among financial institutions, changes in the economic environment for businesses, and advances in computational techniques. This article discusses some of the main problems in building bankruptcy prediction models with data mining techniques and presents alternatives for addressing them. The first problem is class imbalance, which may cause poor classification performance; it is treated jointly with an ensemble strategy. The second is the selection of the most significant combination of attributes, namely the financial variables that have been widely studied in insolvency prediction. Finally, a case study on a real-world database of Brazilian companies is presented.
Keywords: data mining, bankruptcy prediction, ensemble, attribute selection, data pre-processing.
1 Introduction
The problem of efficient bankruptcy prognosis is of great interest both to scientists and practitioners. Owners, managers, investors, creditors and business