Validated Patterns

Deploying the OPEA QnA chat accelerated with Intel Gaudi pattern

Prerequisites
  • An OpenShift Container Platform cluster

    • To create an OpenShift Container Platform cluster, go to the Red Hat Hybrid Cloud console and select Services -> Containers -> Create cluster.

    • The cluster must have a dynamic StorageClass to provision PersistentVolumes. This pattern was tested with the ODF (OpenShift Data Foundation) and LVM Storage solutions. CephFS should be set as the default StorageClass - Setup Guide

    • Required hardware.

  • The OpenShift Container Platform cluster must have a configured Image Registry - Setup Guide

  • A GitHub account and a token with repository permissions, to read from and write to your forks.

  • A HuggingFace account and a User Access Token, which allows you to download AI models. More about User Access Tokens can be found on the official HuggingFace website.

  • Install the tooling dependencies.

  • Install the AWS CLI tool to check the status of the S3 bucket (RGW storage).

If you do not have a running Red Hat OpenShift cluster, you can start one on a public or private cloud by using Red Hat Hybrid Cloud Console.
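The prerequisites above can be condensed into a quick pre-flight check. This is a minimal sketch: `check_tools` is a helper name introduced here for illustration, and the `oc` queries in the comments assume you are already logged in to the cluster.

```shell
# Sketch of a pre-flight check before installing the pattern.
check_tools() {
  # Verify that each named CLI tool is on the PATH.
  for t in "$@"; do
    command -v "$t" >/dev/null 2>&1 || { echo "missing: $t"; return 1; }
  done
  echo "all tools present"
}

# Against a live cluster you would also confirm the storage and registry
# prerequisites, for example:
#   check_tools oc git aws
#   oc get storageclass                          # a default StorageClass must exist
#   oc get configs.imageregistry.operator.openshift.io/cluster \
#     -o jsonpath='{.spec.managementState}'      # should print Managed
check_tools sh
```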

Procedure
  1. Fork the qna-chat-gaudi repository on GitHub.

  2. Clone the forked copy of this repository.

    git clone git@github.com:your-username/qna-chat-gaudi.git
  3. Create a local copy of the secret values file that can safely include credentials. Run the following commands:

    cp values-secret.yaml.template ~/values-gaudi-rag-chat-qna.yaml
    vi ~/values-gaudi-rag-chat-qna.yaml

    Do not commit this file; you do not want to push personal credentials to GitHub. These steps are optional: if you do not copy and customize the secrets file, the installation prompts you for the required HuggingFace User Access Token at the beginning of the process.

  4. (Optional) If the cluster is behind a proxy, update values-global.yaml so that it is similar to the following:

    Remember to change the proxy values of the gaudillm.build_envs and gaudillm.runtime_envs fields in the values-global.yaml file to values appropriate for your environment.

      gaudillm:
        namespace: gaudi-llm
        build_envs:
        - name: http_proxy
          value: http://proxy-internal.cluster1.gaudi.internal:912
        - name: https_proxy
          value: http://proxy-internal.cluster1.gaudi.internal:912
        - name: HTTP_PROXY
          value: http://proxy-internal.cluster1.gaudi.internal:912
        - name: HTTPS_PROXY
          value: http://proxy-internal.cluster1.gaudi.internal:912
        - name: no_proxy
          value: .cluster.local,.gaudi.internal,.cluster1.gaudi.internal,.svc,192.168.122.0/24,10.128.0.0/14,127.0.0.1,172.30.0.0/16,api-int.cluster1.gaudi.internal,localhost
        - name: NO_PROXY
          value: .cluster.local,.gaudi.internal,.cluster1.gaudi.internal,.svc,192.168.122.0/24,10.128.0.0/14,127.0.0.1,172.30.0.0/16,api-int.cluster1.gaudi.internal,localhost
        runtime_envs:
        - name: http_proxy
          value: http://proxy-internal.cluster1.gaudi.internal:912
        - name: https_proxy
          value: http://proxy-internal.cluster1.gaudi.internal:912
        - name: HTTP_PROXY
          value: http://proxy-internal.cluster1.gaudi.internal:912
        - name: HTTPS_PROXY
          value: http://proxy-internal.cluster1.gaudi.internal:912
        - name: no_proxy
          value: .cluster.local,.gaudi.internal,.cluster1.gaudi.internal,.svc,192.168.122.0/24,10.128.0.0/14,127.0.0.1,172.30.0.0/16,api-int.cluster1.gaudi.internal,localhost
        - name: NO_PROXY
          value: .cluster.local,.gaudi.internal,.cluster1.gaudi.internal,.svc,192.168.122.0/24,10.128.0.0/14,127.0.0.1,172.30.0.0/16,api-int.cluster1.gaudi.internal,localhost
  5. Customize the deployment for your cluster. Run the following command:

    git checkout -b my-branch
    vi values-global.yaml
    git add values-global.yaml
    git commit -m "Customize deployment for my cluster"
    git push origin my-branch
  6. Deploy the pattern by running ./pattern.sh make install or by using the Validated Patterns Operator.

    If you have not set the HuggingFace token in the secrets file, you are prompted to enter the token.

Deploying the cluster by using the pattern.sh file

To deploy the cluster by using the pattern.sh file, complete the following steps:

  1. Log in to your cluster by running the following command:

     oc login

    Optional: Set the KUBECONFIG variable for the kubeconfig file path:

     export KUBECONFIG=~/<path_to_kubeconfig>
  2. Deploy the pattern to your cluster. Run the following command:

     ./pattern.sh make install
  3. At the beginning of the installation, a prompt appears asking for the HuggingFace token, looking like this:

    Insert HuggingFace Token:

    The Validated Pattern can take a while to install fully because it requires a couple of reboots to apply MachineConfigs to the worker nodes.

    As part of this pattern, HashiCorp Vault has been installed. Refer to the section on Vault.

Verification

  1. Verify that the Operators have been installed.

    1. To verify, in the OpenShift Container Platform web console, navigate to Operators → Installed Operators page.

    2. Check that the Operators are installed in the openshift-operators namespace and that their status is Succeeded.
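The same check can be done from the CLI. This is a sketch; `list_operator_phases` is a helper name introduced here for illustration.

```shell
# List installed operator CSVs and their install phase in the
# openshift-operators namespace; every PHASE should read Succeeded.
list_operator_phases() {
  oc get csv -n openshift-operators \
    -o custom-columns=NAME:.metadata.name,PHASE:.status.phase
}

# Usage: list_operator_phases
```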

  2. Verify that the S3 bucket was created successfully. Run the following commands:

    export AWS_ACCESS_KEY_ID=$(oc -n openshift-storage get secret s3-secret-bck -ojsonpath='{.data.AWS_ACCESS_KEY_ID}' | base64 -d)
    export AWS_SECRET_ACCESS_KEY=$(oc -n openshift-storage get secret s3-secret-bck -ojsonpath='{.data.AWS_SECRET_ACCESS_KEY}' | base64 -d)
    export CEPH_RGW_ENDPOINT=http://$(oc -n openshift-storage get route s3-rgw -ojsonpath='{.spec.host}')
    aws --endpoint ${CEPH_RGW_ENDPOINT} s3api list-buckets

    The expected response shows that a bucket named model-bucket exists:

    {
        "Buckets": [
            {
                "Name": "model-bucket",
                "CreationDate": "2024-06-27T08:44:09.451000+00:00"
            }
        ],
        "Owner": {
            "DisplayName": "ocs-storagecluster",
            "ID": "ocs-storagecluster-cephobjectstoreuser"
        }
    }
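To perform the same check from a script rather than by reading the JSON by eye, a hedged sketch (it reuses the AWS_* variables and CEPH_RGW_ENDPOINT exported above; `bucket_exists` is a helper name introduced here):

```shell
# Print 1 if the named bucket exists in the RGW store, 0 otherwise.
bucket_exists() {
  aws --endpoint "${CEPH_RGW_ENDPOINT}" s3api list-buckets \
    --query "length(Buckets[?Name=='$1'])" --output text
}

# Usage: [ "$(bucket_exists model-bucket)" = "1" ] && echo "bucket present"
```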
  3. Verify that the MachineConfigPool is ready. To check the status, run the following command:

    oc get mcp
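Instead of re-running `oc get mcp` by hand, you can block until the pools finish updating. This is a sketch; the 60m default timeout is an assumption sized for the worker-node reboots.

```shell
# Wait until every MachineConfigPool reports the Updated condition.
wait_for_mcp() {
  oc wait mcp --all --for=condition=Updated --timeout="${1:-60m}"
}

# Usage: wait_for_mcp        # or, for slower clusters: wait_for_mcp 90m
```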
  4. Verify that all applications are healthy and synchronized. Under the project gaudi-rag-chat-qna, click the URL for the hub gitops server.

    Gaudi RAG Chat QnA GitOps Hub

Setup Red Hat OpenShift AI

After all components are properly deployed, you can proceed to set up Red Hat OpenShift AI. The procedure consists of two manual steps:

  1. Uploading the AI model to the S3 bucket (RGW storage) on the cluster

  2. Deploying TGI

Upload AI model

  1. Go to the RHOAI dashboard, open the Data Science Projects tab, and select the gaudi-llm project. (If the gaudi-llm project is missing, check whether the app is ready in the ArgoCD dashboard.)

    Select RHOAI project
  2. Go to the Workbenches tab and click Create workbench:

    Create RHOAI workbench
  3. Fill in the form by following the images below:

    Deploy Jupyter notebook 1
    Deploy Jupyter notebook 2
  4. After the workbench is created, go to its Jupyter notebook dashboard and upload the Jupyter notebook file /download-model.ipynb to the file explorer so it looks like this:

    Jupyter notebook view
  5. When download-model is uploaded, run all of the notebook's cells. After the notebook is executed, the model should be uploaded to the S3 bucket. To check whether the model is present, run the following commands:

    export AWS_ACCESS_KEY_ID=$(oc -n openshift-storage get secret s3-secret-bck -ojsonpath='{.data.AWS_ACCESS_KEY_ID}' | base64 -d)
    export AWS_SECRET_ACCESS_KEY=$(oc -n openshift-storage get secret s3-secret-bck -ojsonpath='{.data.AWS_SECRET_ACCESS_KEY}' | base64 -d)
    export CEPH_RGW_ENDPOINT=http://$(oc -n openshift-storage get route s3-rgw -ojsonpath='{.spec.host}')
    aws --endpoint ${CEPH_RGW_ENDPOINT} s3 ls model-bucket/models/

    The response should show that the model directory is present.
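A scripted variant of the same check, as a sketch (it reuses the AWS_* variables and CEPH_RGW_ENDPOINT exported above; `model_object_count` is a helper name introduced here):

```shell
# Count the entries listed under the models/ prefix; zero means the
# notebook upload did not succeed.
model_object_count() {
  aws --endpoint "${CEPH_RGW_ENDPOINT}" s3 ls "model-bucket/models/" | wc -l
}

# Usage: [ "$(model_object_count)" -gt 0 ] && echo "model uploaded"
```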

Deploy TGI

  1. Go to the RHOAI dashboard, open the Data Science Projects tab, and select the gaudi-llm project:

    Select RHOAI project
  2. Select the Data connections tab and click Add data connection:

    Add Data connection
  3. The Add data connection form appears, with several inputs. The completed form should look similar to this:

    Complete Data connection details
    1. To get the value for Access key, run the following command:

      oc -n openshift-storage get secret s3-secret-bck -ojsonpath='{.data.AWS_ACCESS_KEY_ID}' | base64 -d
    2. To get the value for Secret key, run the following command:

      oc -n openshift-storage get secret s3-secret-bck -ojsonpath='{.data.AWS_SECRET_ACCESS_KEY}' | base64 -d
    3. To get the value for Endpoint, run the following command:

      echo "http://$(oc -n openshift-storage get route s3-rgw -ojsonpath='{.spec.host}')"
  4. Go to the Models tab and click Deploy model in the Single-model serving platform section:

    Create Single Model Serving
  5. The Deploy model form appears. Fill in all the inputs as in the following images, then click the Deploy button:

    Deploy model 1
    Deploy model 2
  6. If everything is set up correctly, go to the Models tab again to check the status of TGI. It should look like this:

    Status of TGI

RAG Chat demo

After the whole setup is complete, the demo application is ready to use. The Chat QnA UI address can be obtained by running the following command:

echo "http://$(oc -n gaudi-llm get route chatqna-gaudi-ui-server -ojsonpath='{.spec.host}')"
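To sanity-check the endpoint from the command line, a small sketch (`check_http` is a helper name introduced here; it expects the UI route obtained with the command above):

```shell
# Print the HTTP status code for a URL (000 when unreachable).
check_http() {
  curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Usage:
#   UI_URL="http://$(oc -n gaudi-llm get route chatqna-gaudi-ui-server -ojsonpath='{.spec.host}')"
#   check_http "$UI_URL"   # a 200 means the chat UI is serving
```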