Customer Category Classification

ML pipeline predicting customer market segments from demographics

This machine learning project predicts which market segment (A, B, C, or D) a customer belongs to, based on demographic and behavioral attributes such as age, profession, and spending habits. The goal is to help a retail company automate customer classification for targeted marketing.

Key Features:

  • Complete ML pipeline: exploration, imputation, encoding, clustering, and supervised classification on 8,068 customer records
  • Multiple algorithms compared: Naive Bayes, Logistic Regression, K-Nearest Neighbors, Decision Trees, and Neural Networks
  • Robust evaluation with classification reports, confusion matrices, and multi-class ROC curves
  • 70/15/15 train/validation/test split; Logistic Regression and Naive Bayes performed best (weighted AUC of roughly 0.745 or higher)
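
The split and evaluation steps listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes a synthetic four-class dataset from `make_classification` in place of the real customer records, uses `GaussianNB` as one possible Naive Bayes variant, and computes the weighted AUC in a one-vs-rest fashion. The variable names, random seeds, and scaling step are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the imputed and encoded customer table
# (assumption: the real features and labels are not reproduced here).
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=6,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

# 70/15/15: hold out 30%, then halve the holdout into validation and test.
X_train, X_hold, y_train, y_hold = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)
X_val, X_test, y_val, y_test = train_test_split(
    X_hold, y_hold, test_size=0.50, stratify=y_hold, random_state=0
)

# Two of the compared model families; hyperparameters are illustrative only.
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "naive_bayes": GaussianNB(),
}

# Weighted one-vs-rest ROC AUC on the validation set for each model.
val_auc = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    proba = model.predict_proba(X_val)
    val_auc[name] = roc_auc_score(
        y_val, proba, multi_class="ovr", average="weighted"
    )

for name, score in val_auc.items():
    print(f"{name}: weighted one-vs-rest AUC on validation = {score:.3f}")
```

In a setup like this, the validation set is used to compare models, and the test set is kept aside until a final model has been chosen. Scores on the synthetic data are not comparable to the figures reported for the customer dataset.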

Tech Stack: Python · scikit-learn · pandas · NumPy · Matplotlib · Seaborn