Bank Churn Analysis using Machine Learning

Author: Anmol Gaba


Objective:

To predict customer churn on a quarterly basis using machine learning models and take the necessary measures for customer retention. The analysis is implemented in Dataiku, which provides features such as data preparation, visualization, MLOps, and analytics.

Dataiku integrates easily with Snowflake through Partner Connect.


Prerequisites:

A Dataiku account
A Snowflake account

Steps:

1) Open Snowflake, click on Partner Connect, search for Dataiku, and click on its tile.



Once it opens, click Launch; this opens the Dataiku DSS Launchpad, from which we can start the services and manage users, roles, groups, plugins, subscriptions, and so on.



After the services are turned on, DSS takes some time to load; once it is ready, you can create new projects, import existing ones, and browse earlier projects and tutorials.


2) For our use case we created a new project; once the project is created, the next step is to create the datasets and recipes.



Importing the datasets and applying recipes builds the data flow, on which we can later perform operations such as visualization and AutoML.
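
The same project setup can also be scripted. Below is a minimal sketch using Dataiku's public API client (dataikuapi); the host URL, API key, project key, and owner name are illustrative assumptions, not values from this use case.

    import dataikuapi

    # Connect to the DSS node (hypothetical URL and API key)
    client = dataikuapi.DSSClient("https://my-dss-node:11200", "MY_API_KEY")

    # Create the churn-analysis project programmatically (equivalent to the UI step above)
    client.create_project(project_key="BANK_CHURN", name="Bank Churn Analysis", owner="admin")

    # Confirm the project now exists on the node
    print(client.list_project_keys())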


3) For the customer churn analysis, we created the data flow shown below, which consists of datasets, recipes, SQL scripts, an ML model, etc., and is used to generate churn predictions from the multiple features present in our dataset.



In the above diagram, the data comes from three sources: the Customer table, which contains customer-specific data; the Services table, which contains service-related data such as credit card, UPI, and Demat account usage; and the Transactions table, which contains transaction-related data such as amount, mode, transaction ID, and account ID.
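
Inside a Dataiku Python recipe, these three sources can be read as pandas DataFrames. A minimal sketch is shown below; the dataset names CUSTOMER, SERVICES, and TRANSACTIONS are assumptions mirroring the tables described above.

    import dataiku

    customers = dataiku.Dataset("CUSTOMER").get_dataframe()         # customer-specific data
    services = dataiku.Dataset("SERVICES").get_dataframe()          # credit card, UPI, Demat account usage
    transactions = dataiku.Dataset("TRANSACTIONS").get_dataframe()  # amount, mode, transaction ID, account ID

    # Quick sanity check on the three sources
    print(customers.shape, services.shape, transactions.shape)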



Apart from custom recipes, Dataiku also provides built-in visual recipes that can simply be dragged and dropped into the data flow, such as Group, Distinct, Join, Stack (merge), and Split, along with code recipes in SQL and Python.
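
For readers more familiar with code than with visual recipes, the sketch below shows rough pandas equivalents of a few of these recipes; the toy column names (accountID, amount, mode) are assumptions used only for illustration.

    import pandas as pd

    # Toy transactions data standing in for the real table
    transactions = pd.DataFrame({
        "accountID": ["A1", "A1", "A2"],
        "amount": [1200.0, 300.0, 750.0],
        "mode": ["UPI", "CARD", "UPI"],
    })

    # Group recipe: aggregate per account
    per_account = transactions.groupby("accountID")["amount"].agg(["sum", "count"]).reset_index()

    # Distinct recipe: unique payment modes
    distinct_modes = transactions["mode"].drop_duplicates()

    # Split recipe: route rows by a condition
    upi_txns = transactions[transactions["mode"] == "UPI"]
    other_txns = transactions[transactions["mode"] != "UPI"]

    print(per_account)
    print(distinct_modes.tolist(), len(upi_txns), len(other_txns))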




4) After importing the datasets, operations such as data cleansing and preprocessing are performed on them.


Once all three data sources are ready, we join them with an inner join to create a new dataset that is ready for the machine learning algorithms.



The resulting dataset contains around 17 features, such as the services each customer uses, age, balance, gender, qualification, and inactivity, which serve as independent features, while CHURN acts as the dependent feature (target).
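
A minimal sketch of this join and the basic preprocessing is shown below; the join keys (customerID, accountID), the column names (balance, gender), and the CHURN encoding are assumptions for illustration.

    import dataiku

    customers = dataiku.Dataset("CUSTOMER").get_dataframe()
    services = dataiku.Dataset("SERVICES").get_dataframe()
    transactions = dataiku.Dataset("TRANSACTIONS").get_dataframe()

    # Inner join of the three sources (key columns are assumed)
    joined = (
        customers
        .merge(services, on="customerID", how="inner")
        .merge(transactions, on="accountID", how="inner")
    )

    # Basic cleansing: drop duplicates, fill missing balances, encode gender
    joined = joined.drop_duplicates()
    joined["balance"] = joined["balance"].fillna(joined["balance"].median())
    joined["gender"] = joined["gender"].map({"M": 0, "F": 1})

    # Independent features vs. the CHURN target
    X = joined.drop(columns=["CHURN"])
    y = joined["CHURN"]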


5) After feature engineering, the next steps are model training, hyperparameter tuning, and optimization.


In our case we trained classification algorithms such as KNN, Random Forest, Decision Tree, and Logistic Regression, and then compared metrics such as accuracy and overall performance to select the best model.
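
Dataiku's visual AutoML performs this comparison internally; the sketch below shows a roughly equivalent comparison in scikit-learn, using a synthetic 17-feature dataset as a stand-in for the real churn data.

    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.linear_model import LogisticRegression

    # Synthetic stand-in for the ~17-feature churn dataset
    X, y = make_classification(n_samples=1000, n_features=17, random_state=42)

    models = {
        "KNN": KNeighborsClassifier(),
        "Random Forest": RandomForestClassifier(random_state=42),
        "Decision Tree": DecisionTreeClassifier(random_state=42),
        "Logistic Regression": LogisticRegression(max_iter=1000),
    }

    # Compare cross-validated accuracy across the candidate algorithms
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
        print(f"{name}: mean accuracy = {scores.mean():.3f}")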



Each algorithm has several hyperparameters that must be set before training, so during hyperparameter tuning we can supply multiple values or a range of values for each of them. In the figure below, multiple values of k (the number of nearest neighbors) are given for KNN; during training and evaluation, the parameter values that give the highest accuracy are selected automatically. Once this is done, we publish the model to the data flow for further analysis.
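
The sketch below illustrates the same idea with scikit-learn's GridSearchCV standing in for Dataiku's built-in grid search: several candidate values of k are tried and the best-scoring setting is kept automatically. The candidate values and the synthetic data are assumptions.

    from sklearn.datasets import make_classification
    from sklearn.model_selection import GridSearchCV
    from sklearn.neighbors import KNeighborsClassifier

    X, y = make_classification(n_samples=1000, n_features=17, random_state=42)

    # Multiple candidate values of k (number of nearest neighbors)
    param_grid = {"n_neighbors": [3, 5, 7, 9, 11]}
    search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5, scoring="accuracy")
    search.fit(X, y)

    print("best k:", search.best_params_["n_neighbors"])
    print("best cross-validated accuracy:", round(search.best_score_, 3))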



6) After publishing the model, we predict churn for the customers who are still using our services.


In the image below, the predicted churn for this customer is NO, which signifies that the customer is satisfied with the services the bank is offering.
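
A minimal sketch of this scoring step is shown below: the selected model is fitted on historical customers and then predicts churn for currently active ones. The synthetic data, the train/score split, and the 0/1 label encoding (0 = NO churn, 1 = YES churn) are assumptions.

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier

    # Synthetic stand-in: train on historical customers, score the currently active ones
    X, y = make_classification(n_samples=1000, n_features=17, random_state=42)
    X_train, X_active, y_train, _ = train_test_split(X, y, test_size=0.2, random_state=42)

    model = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
    predictions = model.predict(X_active)   # 0 = churn NO, 1 = churn YES (assumed encoding)

    print("predicted churners:", int(predictions.sum()), "out of", len(predictions))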



Conclusion:

The analysis of the customer churn data is complete. The bank's representative team will review the results and work on customer retention by providing additional offers, services, benefits, and discount vouchers, since acquiring new customers is more expensive than retaining existing ones.


References:

  1. https://www.javatpoint.com/k-nearest-neighbor-algorithm-for-machine-learning

  2. https://blog.ineuron.ai/Random-forest-r7gFle7V8L

  3. https://blog.ineuron.ai/All-About-Decision-Tree-from-Scratch-with-Python-Implementation-JDh9qypLPl

  4. https://www.analyticsvidhya.com/blog/2021/10/building-an-end-to-end-logistic-regression-model/

  5. https://medium.datadriveninvestor.com/bank-churn-prediction-using-popular-classification-algorithms-143d72dfc70b

  6. https://www.kaggle.com/code/kmalit/bank-customer-churn-prediction/data
