# Deploying the OPEA QnA chat accelerated with Intel Gaudi pattern
## Prerequisites

- An OpenShift Container Platform cluster. To create one, go to the Red Hat Hybrid Cloud Console and select **Services -> Containers -> Create cluster**.
- The cluster must have a dynamic `StorageClass` to provision `PersistentVolumes`. The pattern was tested with the ODF (OpenShift Data Foundation) and LVM Storage solutions. CephFS must be set as the default storage class - Setup Guide
- The OpenShift Container Platform cluster must have a configured image registry - Setup Guide
- A GitHub account and a token for it with repository permissions, to read from and write to your forks.
- A HuggingFace account and a User Access Token, which allows you to download AI models. More information about User Access Tokens can be found on the official HuggingFace website.
- The AWS CLI tool installed, to check the status of the S3 bucket (RGW storage).

If you do not have a running Red Hat OpenShift cluster, you can start one on a public or private cloud by using the Red Hat Hybrid Cloud Console.
## Procedure

Fork the qna-chat-gaudi repository on GitHub.

Clone the forked copy of this repository:

```shell
git clone git@github.com:your-username/qna-chat-gaudi.git
```

Create a local copy of the secret values file that can safely include credentials:

```shell
cp values-secret.yaml.template ~/values-gaudi-rag-chat-qna.yaml
vi ~/values-gaudi-rag-chat-qna.yaml
```

Do not commit this file; you do not want to push personal credentials to GitHub. If you do not want to customize the secrets by copying the template, you can skip this step: when the HuggingFace User Access Token is not set in the secrets file, a prompt asks for it at the beginning of the installation.
### (Optional) If the cluster is behind a proxy

If the cluster is behind a proxy, remember to change the proxy values of the `gaudillm.build_envs` and `gaudillm.runtime_envs` fields in the `values-global.yaml` file to appropriate ones. The file should be similar to the following:

```yaml
gaudillm:
  namespace: gaudi-llm
  build_envs:
    - name: http_proxy
      value: http://proxy-internal.cluster1.gaudi.internal:912
    - name: https_proxy
      value: http://proxy-internal.cluster1.gaudi.internal:912
    - name: HTTP_PROXY
      value: http://proxy-internal.cluster1.gaudi.internal:912
    - name: HTTPS_PROXY
      value: http://proxy-internal.cluster1.gaudi.internal:912
    - name: no_proxy
      value: .cluster.local,.gaudi.internal,.cluster1.gaudi.internal,.svc,192.168.122.0/24,10.128.0.0/14,127.0.0.1,172.30.0.0/16,api-int.cluster1.gaudi.internal,localhost
    - name: NO_PROXY
      value: .cluster.local,.gaudi.internal,.cluster1.gaudi.internal,.svc,192.168.122.0/24,10.128.0.0/14,127.0.0.1,172.30.0.0/16,api-int.cluster1.gaudi.internal,localhost
  runtime_envs:
    - name: http_proxy
      value: http://proxy-internal.cluster1.gaudi.internal:912
    - name: https_proxy
      value: http://proxy-internal.cluster1.gaudi.internal:912
    - name: HTTP_PROXY
      value: http://proxy-internal.cluster1.gaudi.internal:912
    - name: HTTPS_PROXY
      value: http://proxy-internal.cluster1.gaudi.internal:912
    - name: no_proxy
      value: .cluster.local,.gaudi.internal,.cluster1.gaudi.internal,.svc,192.168.122.0/24,10.128.0.0/14,127.0.0.1,172.30.0.0/16,api-int.cluster1.gaudi.internal,localhost
    - name: NO_PROXY
      value: .cluster.local,.gaudi.internal,.cluster1.gaudi.internal,.svc,192.168.122.0/24,10.128.0.0/14,127.0.0.1,172.30.0.0/16,api-int.cluster1.gaudi.internal,localhost
```
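After editing, a quick sanity check is to grep `values-global.yaml` for your proxy host to confirm the placeholder values were actually replaced. The sketch below is illustrative: it writes an example fragment to a temporary file so the check can run anywhere, and the proxy hostname is the example value from above, not one from your environment.

```shell
# Write an illustrative fragment of values-global.yaml to a temp file.
# In practice you would grep your real values-global.yaml instead.
cat > /tmp/values-global-demo.yaml <<'EOF'
gaudillm:
  build_envs:
    - name: http_proxy
      value: http://proxy-internal.cluster1.gaudi.internal:912
EOF

# Confirm the expected proxy host appears in the file.
if grep -q 'proxy-internal.cluster1.gaudi.internal' /tmp/values-global-demo.yaml; then
  echo "proxy configured"
fi
```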
Customize the deployment for your cluster:

```shell
git checkout -b my-branch
vi values-global.yaml
git add values-global.yaml
git commit values-global.yaml
git push origin my-branch
```

Deploy the pattern by running `./pattern.sh make install` or by using the Validated Patterns Operator. If you have not set the HuggingFace token in the secrets file, a prompt asks you to enter it.
## Deploying the cluster by using the pattern.sh file

To deploy the cluster by using the `pattern.sh` file, complete the following steps:

Log in to your cluster by running the following command:

```shell
oc login
```

Optional: Set the `KUBECONFIG` variable to the path of your kubeconfig file:

```shell
export KUBECONFIG=~/<path_to_kubeconfig>
```

Deploy the pattern to your cluster:

```shell
./pattern.sh make install
```

At the beginning of the installation, a prompt asks you to enter the HuggingFace token:

```
Insert HuggingFace Token:
```

The Validated Pattern can take a while to install fully because it requires a couple of reboots to apply MachineConfigs to the worker nodes.
As part of this pattern, HashiCorp Vault has been installed. Refer to the section on Vault.
## Verification

Verify that the Operators have been installed: in the OpenShift Container Platform web console, navigate to the **Operators → Installed Operators** page. Check that the Operators are installed in the `openshift-operators` namespace and that their status is `Succeeded`.
Verify that the S3 bucket was created successfully by running the following commands:

```shell
export AWS_ACCESS_KEY_ID=$(oc -n openshift-storage get secret s3-secret-bck -ojsonpath='{.data.AWS_ACCESS_KEY_ID}' | base64 -d)
export AWS_SECRET_ACCESS_KEY=$(oc -n openshift-storage get secret s3-secret-bck -ojsonpath='{.data.AWS_SECRET_ACCESS_KEY}' | base64 -d)
export CEPH_RGW_ENDPOINT=http://$(oc -n openshift-storage get route s3-rgw -ojsonpath='{.spec.host}')
aws --endpoint ${CEPH_RGW_ENDPOINT} s3api list-buckets
```

The expected response shows that a bucket named `model-bucket` exists:

```json
{
    "Buckets": [
        {
            "Name": "model-bucket",
            "CreationDate": "2024-06-27T08:44:09.451000+00:00"
        }
    ],
    "Owner": {
        "DisplayName": "ocs-storagecluster",
        "ID": "ocs-storagecluster-cephobjectstoreuser"
    }
}
```
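This presence check can also be scripted. A minimal sketch, using grep on the JSON response; the `$response` value below is a canned copy of the expected output so the snippet is self-contained, whereas on a live cluster you would capture the real output from the `aws s3api list-buckets` call shown above.

```shell
# Canned stand-in for the list-buckets response; on a live cluster use:
#   response=$(aws --endpoint ${CEPH_RGW_ENDPOINT} s3api list-buckets)
response='{"Buckets": [{"Name": "model-bucket"}]}'

# Fail loudly if the expected bucket name is missing from the response.
if echo "$response" | grep -q '"Name": "model-bucket"'; then
  echo "bucket present"
else
  echo "bucket missing" >&2
  exit 1
fi
```

A `jq` query such as `.Buckets[].Name` would be more robust than grep if `jq` is available.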
Verify that the MachineConfigPools are ready. To check their status, run the following command:

```shell
oc get mcp
```
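Each pool should report `UPDATED` as `True` and `UPDATING` as `False` before you continue. A minimal sketch of that check, run here against a canned two-line excerpt of `oc get mcp` output (the pool and config names are made up) rather than a live cluster:

```shell
# Canned excerpt of `oc get mcp` columns (NAME, CONFIG, UPDATED, UPDATING, DEGRADED);
# on a live cluster, capture it with: mcp_output=$(oc get mcp --no-headers)
mcp_output="master   rendered-master-abc   True    False   False
worker   rendered-worker-def   True    False   False"

# Succeed only if every pool is fully updated and not mid-rollout.
if echo "$mcp_output" | awk '$3 != "True" || $4 != "False" {bad=1} END {exit bad}'; then
  echo "all pools ready"
fi
```

On a live cluster, `oc wait mcp --all --for=condition=Updated --timeout=30m` can block until the pools settle instead of polling by hand.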
Verify that all applications are healthy and synchronized: under the project `gaudi-rag-chat-qna`, click the URL for the `hub` gitops server.
## Setup Red Hat OpenShift AI

After all components are properly deployed, you can proceed to set up Red Hat OpenShift AI. The procedure consists of two manual steps:

- Uploading the AI model to the S3 bucket (RGW storage) on the cluster
- Deploying TGI
### Upload AI model

First, go to the **Data Science Projects** tab in the RHOAI dashboard and select the `gaudi-llm` project. If the `gaudi-llm` project is missing, check whether the app is ready in the ArgoCD dashboard.

Go to the **Workbenches** tab and click **Create workbench**. Fill in the input form following the images below.

After the workbench is created, go to its Jupyter notebook dashboard and upload the Jupyter notebook file `/download-model.ipynb` to the file explorer.

When `download-model` is uploaded, run all of the notebook's cells. After the notebook has executed, the model should be uploaded to the S3 bucket. To check that the model is present, run the following commands:

```shell
export AWS_ACCESS_KEY_ID=$(oc -n openshift-storage get secret s3-secret-bck -ojsonpath='{.data.AWS_ACCESS_KEY_ID}' | base64 -d)
export AWS_SECRET_ACCESS_KEY=$(oc -n openshift-storage get secret s3-secret-bck -ojsonpath='{.data.AWS_SECRET_ACCESS_KEY}' | base64 -d)
export CEPH_RGW_ENDPOINT=http://$(oc -n openshift-storage get route s3-rgw -ojsonpath='{.spec.host}')
aws --endpoint ${CEPH_RGW_ENDPOINT} s3 ls model-bucket/models/
```

The response should show that the model directory is present.
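This check can be scripted as well. A minimal sketch: `$listing` below is a canned stand-in for the `aws s3 ls` output, with a hypothetical model directory name, so the snippet runs anywhere; on a live cluster you would capture the real listing from the command above.

```shell
# Canned stand-in for the bucket listing; on a live cluster use:
#   listing=$(aws --endpoint ${CEPH_RGW_ENDPOINT} s3 ls model-bucket/models/)
# The model directory name here is hypothetical.
listing="                           PRE some-model-dir/"

# An empty listing means the notebook did not upload the model.
if [ -n "$listing" ] && echo "$listing" | grep -q "PRE"; then
  echo "model directory present"
else
  echo "model directory missing" >&2
  exit 1
fi
```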
### Deploy TGI

First, go to the **Data Science Projects** tab in the RHOAI dashboard and select the `gaudi-llm` project.

Select the **Data connections** tab and click **Add data connection**. An **Add data connection** form with several inputs appears; it should look similar to the images below.

To get the value for **Access key**, run:

```shell
oc -n openshift-storage get secret s3-secret-bck -ojsonpath='{.data.AWS_ACCESS_KEY_ID}' | base64 -d
```

To get the value for **Secret key**, run:

```shell
oc -n openshift-storage get secret s3-secret-bck -ojsonpath='{.data.AWS_SECRET_ACCESS_KEY}' | base64 -d
```

To get the value for **Endpoint**, run:

```shell
echo "http://$(oc -n openshift-storage get route s3-rgw -ojsonpath='{.spec.host}')"
```
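The two key lookups follow the same pattern: read a field from the `s3-secret-bck` secret, then pipe it through `base64 -d`, because Kubernetes stores secret data base64-encoded. A self-contained round-trip illustration of that decoding step, using a made-up key value:

```shell
# Kubernetes secret data is base64-encoded, hence the `| base64 -d` above.
# Round-trip demo with a hypothetical access key value:
encoded=$(printf 'AKIAEXAMPLEKEY' | base64)
decoded=$(printf '%s' "$encoded" | base64 -d)
echo "$decoded"   # prints AKIAEXAMPLEKEY
```

The **Endpoint** value is taken from a route host, not a secret, so it needs no decoding.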
Next, go to the **Models** tab and click **Deploy model** in the **Single-model serving platform** section. A **Deploy model** form appears. Fill in all the inputs as shown in the following images, and then click the **Deploy** button.

If everything is set up correctly, go to the **Models** tab again to check the status of TGI. It should look like the following image: