Ensemble-Based Fraud Detection: A Robust Approach Evaluated on IEEE-CIS
Tarif, Mehran;
2025-01-01
Abstract
Credit card fraud has grown with the rapid expansion of online financial transactions, making advanced detection systems necessary. This paper presents an extensive empirical assessment of ensemble learning methods for class-imbalanced fraud detection on the IEEE-CIS dataset. By systematically evaluating ensemble techniques such as Random Forest, XGBoost, LightGBM, and stacking approaches, we address the critical issues of extreme class imbalance, concept drift, and real-time detection requirements. Our solution combines feature engineering strategies tailored to the IEEE-CIS dataset, which consists of 590,540 transactions with a fraud rate of 3.5%, with data balancing techniques (SMOTE, ADASYN, and Borderline-SMOTE). In our experiments, the stacking ensemble detects fraud at high rates (0.918 AUC-ROC, 0.891 AUC-PR) while maintaining low false positive rates, and outperforms the other methods evaluated. The study offers practical implications for real-world deployment and empirical evidence of the effectiveness of ensemble approaches on highly imbalanced financial fraud datasets.
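To make the data balancing step more concrete, the following is a minimal sketch of the core SMOTE idea: each synthetic minority sample is drawn on the line segment between a minority point and one of its nearest minority neighbours. It is an illustration under stated assumptions, not the implementation used in the paper. The function name `smote_oversample`, its parameters, and the toy data are all hypothetical, and a real experiment would more likely rely on an established library.

```python
import math
import random


def smote_oversample(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority neighbours.

    Hypothetical helper for illustration only; not the paper's code.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the base point.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy data (assumed values): 4 "fraud" points against 20 "legitimate" points.
fraud = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
legit = [[float(x), float(y)] for x in range(5, 10) for y in range(5, 9)]

# Oversample the minority class until both classes have the same size.
new_fraud = smote_oversample(fraud, n_synthetic=len(legit) - len(fraud))
balanced_fraud = fraud + new_fraud
print(len(balanced_fraud), len(legit))  # prints: 20 20
```

Because every synthetic point is an interpolation between two existing fraud points, it stays inside the region already occupied by the minority class. Oversampling of this kind is normally applied to the training split only, so that evaluation metrics such as AUC-PR are computed on the original class distribution.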



