Abstract

Today, companies keep large amounts of important data electronically, including plans for future work. In many cases, financial information is stolen, which can harm an entire company or an individual. One such type of fraud occurs in bank payments. Graph data science augments existing analytics and machine learning pipelines, increasing the accuracy and applicability of existing fraud detection methods. This study uses the BankSim dataset, which was created from a simulation of bank payment information in Spain. The aim is to detect fraud by classifying the normal payments and the injected fraud signatures in BankSim. Random Forest (RF), Support Vector Machine (SVM), XGBoost (XGB), and k-Nearest Neighbors (k-NN) classifiers, implemented in Python, were used for classification, and their performance was compared using k-fold cross-validation. The Neo4j database was used for graph analysis, with Cypher as the query language. In the graph mining phase, the results obtained with the standard machine learning methods were improved by combining them with graph algorithms such as PageRank, community detection, and degree. Implementing this kind of fraud detection results in fewer fraudulent transactions and a more reliable revenue stream. The results show that using graph mining together with machine learning algorithms achieves higher accuracy than the other methods considered and requires less computation time.
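As a rough illustration of how graph measures such as degree and PageRank can be turned into extra features for a classifier, the sketch below computes both on a tiny payment graph. It is plain Python and not the study's pipeline, which ran its graph algorithms in Neo4j. The customer and merchant identifiers, the amounts, and the function names are all hypothetical.

```python
# Minimal sketch: degree and PageRank as per-node features on a toy payment graph.
# All identifiers and amounts below are made up for illustration (assumption).

def degree_features(edges):
    """Number of distinct neighbours per node, ignoring edge direction."""
    neighbours = {}
    for src, dst in edges:
        neighbours.setdefault(src, set()).add(dst)
        neighbours.setdefault(dst, set()).add(src)
    return {node: len(adj) for node, adj in neighbours.items()}


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over directed edges (customer -> merchant)."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: set() for node in nodes}
    for src, dst in edges:
        out_links[src].add(dst)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for src in nodes:
            targets = out_links[src]
            if not targets:
                continue
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical payments: (customer, merchant, amount).
payments = [
    ("C1", "M1", 20.0),
    ("C2", "M1", 35.5),
    ("C3", "M1", 12.0),
    ("C3", "M2", 900.0),
]
edges = [(customer, merchant) for customer, merchant, _ in payments]

degrees = degree_features(edges)
ranks = pagerank(edges)

# One feature row per payment: amount plus graph features of the merchant.
feature_rows = [
    [amount, degrees[merchant], ranks[merchant]]
    for _, merchant, amount in payments
]
```

Rows like `feature_rows` could then be passed, alongside the original payment attributes, to any of the classifiers named above and scored with k-fold cross-validation.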

DOI

10.24012/dumf.1002110