Customer Category Classification
ML pipeline predicting customer market segments from demographics
A machine learning project that predicts which market segment (A, B, C, or D) a customer belongs to, based on demographic and behavioral attributes such as age, profession, and spending habits. The goal is to help a retail company automate customer classification for targeted marketing.
Key Features:
- Complete ML pipeline: exploration, imputation, encoding, clustering, and supervised classification on 8,068 customer records
- Multiple algorithms compared: Naive Bayes, Logistic Regression, K-Nearest Neighbors, Decision Trees, and Neural Networks
- Robust evaluation with classification reports, confusion matrices, and multi-class ROC curves
- 70/15/15 train/validation/test split; Logistic Regression and Naive Bayes performed best, each reaching a weighted AUC of about 0.745 or higher
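The split and the weighted multi-class AUC mentioned above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it runs on synthetic data from `make_classification` rather than the customer records, and the model settings, random seeds, and variable names are assumptions. The scores it prints will therefore not match the figures reported above.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in for the encoded customer table (assumption: the real
# features are already numeric after imputation and encoding).
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=6,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

# 70/15/15 split: hold out 30%, then halve the holdout into validation and test.
X_train, X_hold, y_train, y_hold = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)
X_val, X_test, y_val, y_test = train_test_split(
    X_hold, y_hold, test_size=0.50, stratify=y_hold, random_state=0
)

# Two of the compared algorithms, with illustrative (assumed) settings.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": GaussianNB(),
}

val_auc = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    proba = model.predict_proba(X_val)  # one probability column per segment
    # One-vs-rest AUC per class, averaged with weights equal to class support.
    val_auc[name] = roc_auc_score(
        y_val, proba, multi_class="ovr", average="weighted"
    )
    print(f"{name}: weighted one-vs-rest AUC on validation = {val_auc[name]:.3f}")
```

Keeping the test portion untouched until a model has been chosen on the validation portion is the usual reason for a three-way split like this one.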
Tech Stack: Python · scikit-learn · pandas · NumPy · Matplotlib · Seaborn
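As a rough sketch of how the imputation and encoding steps might look with this stack, the example below combines pandas with a scikit-learn `ColumnTransformer`. The column names (`Age`, `Profession`, `Spending_Score`), the toy rows, and the choice of median and most-frequent imputation with one-hot encoding are all assumptions made for illustration; the project's real schema and preprocessing choices may differ.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy rows with hypothetical column names; not taken from the real dataset.
# dtype=object keeps missing categorical values as np.nan across pandas versions.
df = pd.DataFrame({
    "Age": [22, 38, np.nan, 67, 40],
    "Profession": pd.Series(
        ["Healthcare", "Engineer", np.nan, "Lawyer", "Engineer"], dtype=object
    ),
    "Spending_Score": pd.Series(
        ["Low", "Average", "Low", "High", np.nan], dtype=object
    ),
})

numeric_cols = ["Age"]
categorical_cols = ["Profession", "Spending_Score"]

# Numeric columns: fill gaps with the median, then standardize.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical columns: fill gaps with the most frequent value, then one-hot encode.
categorical_steps = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])

preprocess = ColumnTransformer([
    ("num", numeric_steps, numeric_cols),
    ("cat", categorical_steps, categorical_cols),
])

# Fit on the data and produce a purely numeric feature matrix.
X_encoded = preprocess.fit_transform(df)
print(X_encoded.shape)
```

In a real run, the transformer would be fitted on the training portion only and then applied to the validation and test portions, so that imputation statistics do not leak across the split.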