Semi-Supervised Supply Chain Fraud Detection with Unsupervised Pre-Filtering
Tarif, Mehran
2025-01-01
Abstract
Detecting fraud in modern supply chains is difficult because the chains are globally complex and labeled data are scarce. Traditional methods often struggle with class imbalance and weak supervision. This paper proposes a two-phase framework to address these issues. First, an Isolation Forest performs unsupervised anomaly detection to flag potentially fraudulent records and reduce the volume of data passed on. Second, a self-training SVM refines the predictions through semi-supervised learning, using both labeled samples and high-confidence pseudo-labeled samples. We evaluate the method on the DataCo Smart Supply Chain Dataset, which includes fraud indicators. It achieves an F1-score of 0.817 and a false positive rate below 3.0%. These results show the value of combining unsupervised pre-filtering with semi-supervised refinement for fraud detection, though concept drift and the lack of a comparison with deep learning methods remain limitations.
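
The two phases described in the abstract can be sketched with scikit-learn as follows. This is a minimal illustration and not the paper's implementation: the abstract does not give hyperparameters or preprocessing, so the contamination rate, the confidence threshold, the feature scaling step, the synthetic data, and the function names make_synthetic_orders and two_phase_detect are all assumptions made for the example. Records that the Isolation Forest does not flag are simply treated as non-fraud here, which is one possible reading of "cut data volume".

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.svm import SVC


def make_synthetic_orders(n_normal=950, n_fraud=50, seed=0):
    """Hypothetical stand-in for real order features (not the DataCo data)."""
    rng = np.random.default_rng(seed)
    normal = rng.normal(0.0, 1.0, size=(n_normal, 4))
    fraud = rng.normal(4.0, 1.0, size=(n_fraud, 4))
    X = np.vstack([normal, fraud])
    y = np.concatenate([np.zeros(n_normal, dtype=int),
                        np.ones(n_fraud, dtype=int)])
    return X, y


def two_phase_detect(X, y_partial, contamination=0.15, threshold=0.9, seed=0):
    """y_partial holds 0/1 for labeled rows and -1 for unlabeled rows.

    contamination and threshold are assumed values, not taken from the paper.
    """
    # Phase 1: unsupervised pre-filter. fit_predict returns -1 for anomalies.
    iso = IsolationForest(contamination=contamination, random_state=seed)
    flagged = iso.fit_predict(X) == -1

    # Phase 2: self-training SVM on the flagged subset only.
    # probability=True is needed so confident pseudo-labels can be selected.
    base = make_pipeline(StandardScaler(),
                         SVC(probability=True, random_state=seed))
    clf = SelfTrainingClassifier(base, threshold=threshold)
    clf.fit(X[flagged], y_partial[flagged])

    # Unflagged rows default to non-fraud (an assumption of this sketch).
    pred = np.zeros(len(X), dtype=int)
    pred[flagged] = clf.predict(X[flagged])
    return pred, flagged


X, y_true = make_synthetic_orders()
# Keep the label of every 5th row; mark the rest as unlabeled (-1).
y_partial = np.full(len(y_true), -1)
y_partial[::5] = y_true[::5]

pred, flagged = two_phase_detect(X, y_partial)
print("flagged by Isolation Forest:", int(flagged.sum()))
print("predicted fraud:", int(pred.sum()))
```

In this sketch the pre-filter trades recall for volume: any fraudulent record that the Isolation Forest misses can never be recovered by the second phase, so the contamination rate would need to be tuned with that in mind.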



