Comparative Analysis of Machine Learning Models for Vintage-Based Credit Scoring

Authors

  • Tan Yong Seng, Heriot-Watt University Malaysia
  • Soo Huei Ching, Heriot-Watt University Malaysia

DOI:

https://doi.org/10.22452/josma.vol7no2.3

Keywords:

Binary Classification, Credit Risk Scoring, Ensemble Learning, Machine Learning, Vintage Analysis

Abstract

Accurate credit risk assessment is crucial for financial institutions to minimise loan defaults. This study proposes a vintage-based credit scoring framework that integrates individual repayment behaviour with vintage analysis and evaluates five machine learning models for binary credit risk classification: logistic regression, random forest, XGBoost, a stacking ensemble, and a multilayer perceptron (MLP). Results show that ensemble methods, particularly random forest, achieve superior predictive performance, with the highest F1-score (0.81), precision (0.87) and accuracy (0.96), while logistic regression exhibits high recall but low precision. The MLP shows good recall (0.79) and a competitive F1-score (0.77), making it suitable when detecting high-risk borrowers is the priority, although it lacks interpretability. Overall, the study highlights the trade-offs between predictive performance and interpretability, emphasising the potential of vintage-based approaches and ensemble learning for practical credit scoring applications.
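
The metrics quoted in the abstract (precision, recall, F1-score and accuracy) all derive from the confusion matrix of a binary classifier. The sketch below shows those standard definitions in plain Python. The class name ConfusionCounts, the helper functions and the counts used are hypothetical and chosen only for illustration; they are not taken from the study's data or code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (positive class = high-risk borrower)."""
    tp: int  # high-risk borrowers correctly flagged
    fp: int  # low-risk borrowers wrongly flagged
    fn: int  # high-risk borrowers missed
    tn: int  # low-risk borrowers correctly passed


def precision(c: ConfusionCounts) -> float:
    """Share of flagged borrowers who really are high-risk."""
    flagged = c.tp + c.fp
    return c.tp / flagged if flagged else 0.0


def recall(c: ConfusionCounts) -> float:
    """Share of high-risk borrowers that the model flags."""
    actual_pos = c.tp + c.fn
    return c.tp / actual_pos if actual_pos else 0.0


def f1_score(c: ConfusionCounts) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def accuracy(c: ConfusionCounts) -> float:
    """Share of all borrowers classified correctly."""
    total = c.tp + c.fp + c.fn + c.tn
    return (c.tp + c.tn) / total if total else 0.0


# Illustrative, made-up counts for an imbalanced portfolio
# (100 high-risk borrowers out of 1000); NOT the paper's results.
counts = ConfusionCounts(tp=80, fp=10, fn=20, tn=890)

print(f"precision = {precision(counts):.2f}")
print(f"recall    = {recall(counts):.2f}")
print(f"F1-score  = {f1_score(counts):.2f}")
print(f"accuracy  = {accuracy(counts):.2f}")
```

With imbalanced classes such as these, accuracy is dominated by the large low-risk group, which is one reason an F1-score can sit well below accuracy for the same model.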

Author Biography

Soo Huei Ching, Heriot-Watt University Malaysia

Malaysia Mathematical and Computer Sciences • Associate Professor

Published

2025-12-26

How to Cite

Tan, Y. S., & Soo, H. C. (2025). Comparative Analysis of Machine Learning Models for Vintage-Based Credit Scoring. Journal of Statistical Modeling and Analytics (JOSMA), 7(2). https://doi.org/10.22452/josma.vol7no2.3