Predicting Customer Churn in the USA: A Performance Assessment of Machine Learning Techniques

Authors

  • Jordan Smith, Amelia Lucas, Department of Engineering, Arizona State University

Keywords:

Customer Churn Prediction, Machine Learning, Gradient Boosting, Random Forest, Neural Networks, Support Vector Machine, Logistic Regression, Predictive Modeling, Customer Retention, AUC-ROC, Model Performance

Abstract

Customer churn prediction is a critical challenge for businesses, particularly in competitive industries such as telecommunications, subscription services, and e-commerce. This study evaluates and compares five machine learning algorithms, namely Logistic Regression (LR), Random Forest (RF), Support Vector Machine (SVM), Gradient Boosting (GB), and Neural Networks (NN), for predicting customer churn in a U.S.-based subscription service. The aim is to identify the most effective technique for churn prediction and to assess its practical applicability to customer retention strategies. Using a dataset of customer behaviors and demographics, the models were evaluated on accuracy, precision, recall, F1-score, and AUC-ROC. Gradient Boosting outperformed the other models in overall accuracy and AUC-ROC, followed closely by Random Forest. Neural Networks performed solidly but faced challenges with interpretability and consistency. Logistic Regression and Support Vector Machine showed moderate performance, making them most relevant in scenarios where computational efficiency matters more than raw accuracy. The study also identified the key features influencing churn, including customer tenure, payment history, and service usage, providing actionable insights for businesses. The study concludes that while Gradient Boosting offers the highest predictive performance, Random Forest and Logistic Regression are valuable alternatives, depending on a business's specific needs.
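As an illustration of the five evaluation metrics named in the abstract, the following minimal sketch computes accuracy, precision, recall, F1-score, and AUC-ROC for a binary churn label using only the Python standard library. It is not the authors' implementation: the function names, the 0.5 decision threshold, and the toy labels and scores are hypothetical and are not taken from the study's dataset or results.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = churned)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # churners correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # stayers correctly passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # stayers wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # churners missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def auc_roc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC-ROC as the probability that a randomly chosen churner receives a
    higher score than a randomly chosen non-churner (ties count as half)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true churn labels and one model's predicted churn scores.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # assumed 0.5 decision threshold

metrics = classification_metrics(y_true, y_pred)
metrics["auc_roc"] = auc_roc(y_true, y_score)
print(metrics)
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas AUC-ROC is computed from the raw scores and summarizes ranking quality across all thresholds, which is one reason the two kinds of metric are typically reported together when comparing models.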

Published

2024-03-20