Ensemble-Based Fraud Detection: A Robust Approach Evaluated on IEEE-CIS

Tarif, Mehran
2025-01-01

Abstract

Credit card fraud has grown with the rapid expansion of online financial transactions, creating a need for advanced detection systems. This paper presents an extensive empirical assessment of ensemble learning methods for class-imbalanced fraud detection on the IEEE-CIS dataset. By systematically evaluating ensemble techniques such as Random Forest, XGBoost, LightGBM, and stacking approaches, we address the critical issues of extreme class imbalance, concept drift, and real-time detection requirements. Our solution combines feature engineering strategies tailored to the IEEE-CIS dataset, which consists of 590,540 transactions with a fraud rate of 3.5%, with data balancing techniques (SMOTE, ADASYN, and Borderline-SMOTE). In our experiments, the stacking ensemble achieves high fraud detection performance (0.918 AUC-ROC, 0.891 AUC-PR) while maintaining low false positive rates, and it outperforms the other approaches evaluated. The study provides empirical evidence of the effectiveness of ensemble approaches on highly imbalanced financial fraud datasets and offers practical implications for real-world deployment.
2025
Credit Card Fraud Detection, Ensemble Learning, Imbalanced Classification, IEEE-CIS Dataset, Machine Learning, Financial Security

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11562/1178447