Validated Patterns

MLOps Fraud Detection

Validation status:
Sandbox

About the MLOps Fraud Detection pattern

The MLOps Credit Card Fraud Detection use case covers the following steps (a minimal training-and-tracking sketch follows this list):
  • Build and train models in RHODS to detect credit card fraud

  • Track and store those models with MLFlow

  • Serve a model stored in MLFlow using RHODS Model Serving (or MLFlow serving)

  • Deploy a model application in OpenShift that sends data to the served model and displays the prediction
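
The sketch below illustrates the first two steps under stated assumptions: it is not taken from the pattern's notebooks, and the tracking URI, data file name, column names, and model choice are placeholders you would replace with the values used in your deployment.

```python
import mlflow
import mlflow.sklearn
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Assumed values for illustration; the pattern wires the tracking service up for you.
mlflow.set_tracking_uri("http://mlflow-server.mlflow.svc.cluster.local:8080")
mlflow.set_experiment("credit-card-fraud")

# Hypothetical export of the fraud data set with a "fraud" label column.
data = pd.read_csv("card_transdata.csv")
X = data.drop(columns=["fraud"])
y = data["fraud"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

with mlflow.start_run(run_name="rf-baseline"):
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)

    # Track parameters and metrics for this run in MLflow.
    mlflow.log_param("n_estimators", 100)
    mlflow.log_metric("accuracy", accuracy_score(y_test, model.predict(X_test)))

    # Store the trained model so it can be versioned in the MLflow model registry.
    mlflow.sklearn.log_model(model, artifact_path="model",
                             registered_model_name="fraud-detection")
```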

Background

AI technology is already transforming the financial services industry. AI models can be used to make rapid inferences that benefit the financial services institution and its customers. This pattern deploys an AI model that detects fraud in credit card transactions.

About the solution

The solution is built around a Credit Card Fraud Detection model, which predicts whether a credit card transaction is fraudulent based on a few parameters, such as: the distance from home and from the last transaction, the purchase price compared to the median purchase price, whether the purchase is from a retailer the cardholder has bought from before, whether the PIN was used, and whether it is an online order.

Technology Highlights:

  • Event-Driven Architecture

  • Data Science on OpenShift

  • Model registry using MLFlow

Solution Discussion

This architecture pattern demonstrates four strengths:

  • Real-Time Processing: Analyze transactions in real-time, quickly identifying and flagging potentially fraudulent activities. This speed is crucial in preventing unauthorized transactions before they are completed.

  • Pattern Recognition: Detect patterns and anomalies in data and learn from historical transaction data to identify typical spending patterns of a cardholder and flag transactions that deviate from these patterns.

  • Cost Efficiency: By automating the detection process, AI reduces the need for extensive manual review of transactions, which can be time-consuming and costly.

  • Flexibility and Agility: A cloud-native architecture supports the use of microservices, containers, and serverless computing, allowing for more flexible and agile development and deployment of AI models. This means faster iteration and deployment of new fraud detection algorithms.

Demo Video

Overview of the solution for credit card fraud detection

Overview of the Architecture

Description of each component:

  • Data Set: The data set contains the data used for training and evaluating the model we will build in this demo.

  • RHODS Notebook: We will build and train the model using a Jupyter Notebook running in RHODS.

  • MLFlow Experiment tracking: We use MLFlow to track the parameters and metrics (such as accuracy and loss) of a model training run. These runs can be grouped under different "experiments", making it easy to keep track of them.

  • MLFlow Model registry: As we track the experiment we also store the trained model through MLFlow so we can easily version it and assign a stage to it (for example Staging, Production, or Archived).

  • S3 (ODF): This is where the models are stored and what the MLFlow model registry interfaces with. We use ODF (OpenShift Data Foundation) according to the MLFlow guide, but it can be replaced with another solution.

  • RHODS Model Serving: We recommend using RHODS Model Serving for serving the model. It’s based on ModelMesh and allows us to easily send requests to an endpoint to get predictions (see the request sketch after this list).

  • Application interface: This is the interface used to run predictions with the model. In our case, we will build a visual interface (interactive app) using Gradio and let it load the model from the MLFlow model registry (a Gradio sketch also follows this list).
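
To make the serving step concrete, here is a minimal sketch of a prediction request against a model served by RHODS Model Serving, which exposes the KServe V2 REST inference protocol through ModelMesh. The route URL, model name, input tensor name, and feature values below are assumptions for illustration, not values shipped with this pattern.

```python
import requests

# Assumed values: take the real route, model name, and input tensor name from
# your RHODS Model Serving deployment.
INFER_URL = (
    "https://fraud-detection-route.apps.example.com"
    "/v2/models/fraud-detection/infer"
)

# One transaction, in the same feature order the model was trained with.
transaction = [57.8, 0.3, 1.9, 1.0, 1.0, 0.0, 0.0]

payload = {
    "inputs": [
        {
            "name": "dense_input",            # input tensor name expected by the model
            "shape": [1, len(transaction)],
            "datatype": "FP32",
            "data": transaction,
        }
    ]
}

response = requests.post(INFER_URL, json=payload, timeout=10)
response.raise_for_status()
prediction = response.json()["outputs"][0]["data"][0]
print("fraud score:", prediction)
```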
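
For the application interface, the following is a rough Gradio sketch that loads the registered model directly from the MLflow model registry and exposes a simple prediction form. The model URI, feature names, and decision threshold are assumptions carried over from the training sketch above.

```python
import gradio as gr
import mlflow.pyfunc
import pandas as pd

# Assumed registry URI; "fraud-detection" matches the name used in the training sketch.
model = mlflow.pyfunc.load_model("models:/fraud-detection/Production")

FEATURES = [
    "distance_from_home",
    "distance_from_last_transaction",
    "ratio_to_median_purchase_price",
    "repeat_retailer",
    "used_chip",
    "used_pin_number",
    "online_order",
]

def predict(*values):
    # Build a single-row frame in the order the model expects.
    row = pd.DataFrame([dict(zip(FEATURES, values))])
    score = float(model.predict(row)[0])  # class label or probability, depending on the model
    return "fraud" if score >= 0.5 else "not fraud"

gr.Interface(
    fn=predict,
    inputs=[gr.Number(label=name) for name in FEATURES],
    outputs=gr.Label(label="prediction"),
    title="Credit card fraud detection",
).launch(server_name="0.0.0.0", server_port=8080)
```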

Figure 1. Overview of the solution reference architecture

About the technology

The following technologies are used in this solution:

Red Hat OpenShift Container Platform

An enterprise-ready Kubernetes container platform built for an open hybrid cloud strategy. It provides a consistent application platform to manage hybrid cloud, public cloud, and edge deployments. It delivers a complete application platform for both traditional and cloud-native applications, allowing them to run anywhere. OpenShift has a pre-configured, pre-installed, and self-updating monitoring stack that provides monitoring for core platform components. It also enables the use of external secret management systems, for example, HashiCorp Vault in this case, to securely add secrets into the OpenShift platform.

Red Hat OpenShift AI

Red Hat® OpenShift® AI is an AI-focused portfolio that provides tools to train, tune, serve, monitor, and manage AI/ML experiments and models on Red Hat OpenShift. Bring data scientists, developers, and IT together on a unified platform to deliver AI-enabled applications faster.

Red Hat OpenShift GitOps

A declarative application continuous delivery tool for Kubernetes based on the ArgoCD project. Application definitions, configurations, and environments are declarative and version controlled in Git. It can automatically push the desired application state into a cluster, quickly find out if the application state is in sync with the desired state, and manage applications in multi-cluster environments.

Red Hat AMQ Streams

Red Hat AMQ Streams is a massively scalable, distributed, and high-performance data streaming platform based on the Apache Kafka project. It offers a distributed backbone that allows microservices and other applications to share data with high throughput and low latency. Red Hat AMQ Streams is available in the Red Hat AMQ product.

Hashicorp Vault (community)

Provides a secure centralized store for dynamic infrastructure and applications across clusters, including over low-trust networks between clouds and data centers.

MLFlow Model Registry (community)

A centralized model store, set of APIs, and UI, to collaboratively manage the full lifecycle of an MLflow Model. It provides model lineage (which MLflow experiment and run produced the model), model versioning, model aliasing, model tagging, and annotations.
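
As a small, hedged illustration of how a registered model version can be inspected and promoted, the snippet below uses standard MLflow registry client calls; the model name and version number are assumptions matching the earlier sketches.

```python
from mlflow.tracking import MlflowClient

client = MlflowClient()  # uses the configured MLflow tracking/registry URI

# List the versions of the registered model (name assumed for this example).
for mv in client.search_model_versions("name='fraud-detection'"):
    print(mv.version, mv.current_stage, mv.run_id)

# Promote a version to a stage such as Staging, Production, or Archived.
client.transition_model_version_stage(
    name="fraud-detection", version=1, stage="Production"
)

# Newer MLflow releases also support aliases and tags on model versions.
client.set_registered_model_alias("fraud-detection", "champion", 1)
client.set_model_version_tag("fraud-detection", "1", "validated", "true")
```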

Other

This solution also uses a variety of observability tools, including Prometheus monitoring and Grafana dashboards that are integrated with OpenShift, as well as components of the Observatorium meta-project, which includes Thanos and the Loki API.

Next steps