Machine Learning Models Trained on Synthetic Transaction Data: Enhancing Anti-Money Laundering (AML) Efforts in the Financial Services Industry

Authors

  • Gunaseelan Namperumal, ERP Analysts Inc, USA
  • Akila Selvaraj, iQi Inc, USA
  • Deepak Venkatachalam, CVS Health, USA

Keywords:

synthetic transaction data, anti-money laundering (AML)

Abstract

Financial crimes, notably money laundering, have grown more complex, and financial services firms need new anti-money laundering (AML) methods to keep pace. Traditional AML systems built on rules and preset heuristics often fail to detect complicated money laundering patterns. Financial data is sensitive and subject to privacy and regulatory constraints, which restricts its use for building and training machine learning models. This paper examines whether synthetic transaction data generated by machine learning can strengthen banks' AML efforts. Synthetic data offers a way to train machine learning models to detect money laundering anomalies without disclosing personal data.

The paper evaluates the limitations of traditional AML systems and the difficulties of obtaining and using transaction data that arise from privacy, regulatory, and data ownership concerns. It analyzes synthetic data generation techniques, including generative adversarial networks (GANs), variational autoencoders (VAEs), and differential privacy. These approaches can anonymize sensitive data and produce high-fidelity synthetic transaction data. Machine learning models trained on synthetic datasets are tested for their ability to identify complex money laundering tactics that conventional models miss. The paper also addresses the technical and ethical concerns of producing and using synthetic data in finance, with attention to compliance with GDPR and CCPA.
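To make one of the techniques named above concrete, the following is a minimal sketch of differentially private synthetic data generation using a noisy histogram. It is a simple stand-in chosen for illustration, not the method evaluated in the paper. The function names, bin edges, and epsilon value are hypothetical, and the "real" amounts in the usage lines are randomly generated placeholders rather than actual transaction data.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and scale `scale`.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_histogram(amounts, bin_edges, epsilon, rng):
    """Return Laplace-noised counts of amounts per bin.

    Assumption: neighbouring datasets differ by adding or removing one
    transaction, so the histogram has L1 sensitivity 1 and noise of
    scale 1/epsilon gives epsilon-differential privacy per record.
    Bin edges must be fixed in advance, not derived from the data.
    """
    counts = [0] * (len(bin_edges) - 1)
    for amount in amounts:
        for i in range(len(counts)):
            if bin_edges[i] <= amount < bin_edges[i + 1]:
                counts[i] += 1
                break  # amounts outside all bins are dropped
    # Clamping at zero is post-processing and does not weaken the guarantee.
    return [max(0.0, c + laplace_noise(1.0 / epsilon, rng)) for c in counts]


def sample_synthetic(noisy_counts, bin_edges, n, rng):
    """Draw n synthetic amounts: pick a bin in proportion to its noisy
    count, then draw uniformly within that bin."""
    bins = rng.choices(range(len(noisy_counts)), weights=noisy_counts, k=n)
    return [rng.uniform(bin_edges[b], bin_edges[b + 1]) for b in bins]


# Usage with placeholder data (hypothetical values throughout).
rng = random.Random(42)
placeholder_amounts = [rng.lognormvariate(6.0, 1.5) for _ in range(5000)]
edges = [0, 100, 500, 1000, 5000, 10000, 50000]
noisy_counts = dp_histogram(placeholder_amounts, edges, epsilon=1.0, rng=rng)
synthetic_amounts = sample_synthetic(noisy_counts, edges, n=1000, rng=rng)
print(len(synthetic_amounts), "synthetic amounts generated")
```

This sketch protects individual transactions, not customers: a customer with many transactions would need a stricter neighbouring definition and correspondingly more noise. It also models only a single field, whereas GAN- or VAE-based generators aim to capture joint structure across fields and over time.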

Published

06-12-2022

How to Cite

[1]
Gunaseelan Namperumal, Akila Selvaraj, and Deepak Venkatachalam, “Machine Learning Models Trained on Synthetic Transaction Data: Enhancing Anti-Money Laundering (AML) Efforts in the Financial Services Industry”, J. of Art. Int. Research, vol. 2, no. 2, pp. 183–218, Dec. 2022, Accessed: Jun. 09, 2025. [Online]. Available: https://tsbpublisher.org/jair/article/view/39