Credit Card Approval

Goal

Construct a credit scoring model that accurately predicts the creditworthiness of clients for a credit card company.

Result

The development of an efficient credit card issuance strategy, driven by a predictive model, aimed at optimizing profitability for the company and reducing financial risks.

Case

The financial success of a credit card company heavily relies on accurate customer selection. Thus, it is imperative to possess reliable and powerful tools capable of effectively utilizing available data to identify and acquire the most profitable clients.

The following findings come from building such a model.

Which Variables Show the Strongest Correlations?

Three pairs of variables show meaningful correlations:

a. Annual Income and Type of Education.

b. Age and Days Employed.

c. Family Members and Number of Children.
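
As a rough illustration of how such pairs can be found, the sketch below ranks variable pairs by absolute Pearson correlation. The data is synthetic and the column names (children, family_members, days_birth, annual_income) are assumptions, not the real dataset's schema. A categorical variable such as type of education would need to be encoded numerically first, which this sketch does not cover.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data; column names are assumed for illustration only.
rng = np.random.default_rng(0)
n = 500
children = rng.integers(0, 4, size=n)
df = pd.DataFrame({
    "children": children,
    "family_members": children + rng.integers(1, 3, size=n),  # adults + children
    "days_birth": -rng.integers(7000, 25000, size=n),
    "annual_income": rng.normal(150_000, 40_000, size=n),
})

def top_correlations(frame, k=3):
    """Return the k variable pairs with the largest absolute Pearson correlation."""
    corr = frame.corr().abs()
    # Keep only the upper triangle so each pair appears once and the diagonal is dropped.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack().dropna()
    return pairs.sort_values(ascending=False).head(k)

top = top_correlations(df)
print(top)
```

In this toy data the family_members / children pair ranks first by construction, since one column is derived from the other.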

Which Variables Display Outliers?

A robust model requires that outliers in the data be handled properly. Three variables were identified as needing this kind of cleaning: annual income, number of family members and number of children.
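
The text does not state which outlier rule was applied. As one common option, the sketch below uses the interquartile range (IQR) rule and drops rows that fall outside 1.5 IQRs of the quartiles in any of the three columns. The function name, column names and sample values are hypothetical.

```python
import pandas as pd

def remove_iqr_outliers(frame, columns, k=1.5):
    """Drop rows where any listed column lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    keep = pd.Series(True, index=frame.index)
    for col in columns:
        q1, q3 = frame[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        keep &= frame[col].between(q1 - k * iqr, q3 + k * iqr)
    return frame[keep]

# Made-up sample: the last applicant has an extreme income.
df = pd.DataFrame({
    "annual_income": [90_000, 120_000, 135_000, 150_000, 180_000, 200_000, 5_000_000],
    "family_members": [1, 2, 2, 3, 3, 4, 2],
    "children": [0, 0, 1, 1, 2, 2, 0],
})
cleaned = remove_iqr_outliers(df, ["annual_income", "family_members", "children"])
print(len(df), "->", len(cleaned))
```

Whether to drop, cap or transform such values is a modeling choice. Dropping is shown here only because it is the simplest to read.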

Which Algorithm is Best Suited?

After testing several classifier types, and given that this is a binary classification problem, a decision tree classifier or a random forest classifier proved to be the most suitable choice.

Because the two achieve similar accuracy (0.80 vs 0.82, respectively), accuracy alone does not settle the choice. It's necessary to analyze the feature importance of each model and check that it makes theoretical sense.
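
A comparison of this kind can be sketched with scikit-learn as shown below. The data is synthetic, and the 5-fold cross-validation and hyperparameters are assumptions rather than the setup that produced the figures above, so the printed scores will differ from them.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary-classification data standing in for the real applicant table.
X, y = make_classification(n_samples=600, n_features=6, n_informative=3, random_state=0)

models = {
    "decision_tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Mean 5-fold cross-validated accuracy for each candidate.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```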

What are the Most Important Features?

The most important features for the decision tree classifier are age (days_birth), days employed and annual income. In contrast, the most important features for the random forest classifier are age, number of family members and days employed.
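
Rankings like these can be read from the feature_importances_ attribute of a fitted scikit-learn tree or forest. The sketch below uses synthetic data with a toy target that depends only on income, so the expected ranking is known in advance. Apart from days_birth, the column names are assumptions, and ranked_importances is a hypothetical helper.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 1000
X = pd.DataFrame({
    "days_birth": -rng.integers(7000, 25000, size=n),
    "days_employed": -rng.integers(0, 10000, size=n),
    "annual_income": rng.normal(150_000, 40_000, size=n),
    "family_members": rng.integers(1, 6, size=n),
})
# Toy target driven by income alone (an assumption made for the demo).
y = (X["annual_income"] > 150_000).astype(int)

def ranked_importances(model, features, target):
    """Fit the model and return its impurity-based importances, largest first."""
    model.fit(features, target)
    importances = pd.Series(model.feature_importances_, index=features.columns)
    return importances.sort_values(ascending=False)

tree_rank = ranked_importances(DecisionTreeClassifier(random_state=0), X, y)
forest_rank = ranked_importances(
    RandomForestClassifier(n_estimators=100, random_state=0), X, y
)
print(tree_rank.head(3))
print(forest_rank.head(3))
```

Impurity-based importances sum to 1 per model, so the two rankings can be compared side by side, though they only describe what each fitted model relied on, not causal relevance.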

Comparing the two rankings, we conclude that the decision tree classifier has the more solid theoretical grounding and should be used as the baseline model.