Troubleshooting the Lemonade Stand AI Quickstart pattern
Prerequisite and tooling issues
Podman version not supported
The pattern.sh script requires Podman 4.3.0 or later. Earlier versions do not support the --userns=keep-id flag required for correct UID/GID mapping inside the container.
The script exits with an error referencing the Podman version or keep-id.
Check your Podman version:
$ podman --version
If the version is earlier than 4.3.0, upgrade Podman. For instructions, see the Podman installation documentation.
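The version comparison can also be scripted. The following is a minimal sketch, assuming that podman --version prints a line of the form "podman version X.Y.Z" and that sort supports the -V (version sort) option. The version_ge helper is a hypothetical name and is not part of pattern.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) to order the two strings.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="4.3.0"

# Only query Podman when it is installed, so the script is safe to run anywhere.
if command -v podman >/dev/null 2>&1; then
  # Assumes output like "podman version 4.9.4"; the third field is the version.
  current="$(podman --version | awk '{print $3}')"
  if version_ge "$current" "$required"; then
    echo "Podman $current meets the $required minimum."
  else
    echo "Podman $current is older than $required; upgrade before running pattern.sh." >&2
  fi
else
  echo "podman not found in PATH." >&2
fi
```

Version sort is used instead of a plain string comparison so that a release such as 4.10.1 is correctly treated as newer than 4.3.0.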
KUBECONFIG path is outside the HOME directory
The pattern.sh script runs inside a container and mounts your $HOME directory. If your KUBECONFIG file is located outside $HOME, the container cannot access it.
The script fails to connect to the cluster or reports that the kubeconfig file cannot be found.
Move your kubeconfig file to a path inside your home directory and export the updated path:
$ cp <current-kubeconfig-path> ~/kubeconfig
$ export KUBECONFIG=~/kubeconfig
Deployment issues
ArgoCD applications are not syncing or are unhealthy
After running ./pattern.sh make install, ArgoCD applications can take 15–30 minutes to reach a healthy state. Model downloads and GPU operator initialization take additional time.
Running ./pattern.sh make argo-healthcheck reports applications in Progressing or Degraded state.
Check which applications are not healthy:
$ oc get applications -n openshift-gitops
Inspect the failing application for error details:
$ oc describe application <application-name> -n openshift-gitops
Check the logs of the ArgoCD application controller:
$ oc logs -n openshift-gitops deployment/openshift-gitops-application-controller
If applications are stuck in Progressing, wait an additional 10 minutes and re-run the health check. Detector model downloads from Hugging Face through MinIO and GPU operator initialization can take significant time.
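The wait-and-recheck step can be automated with a small polling loop. The following is a sketch, not part of the pattern: the function names are hypothetical, the one-minute interval and ten attempts are illustrative values, and it assumes the Argo CD Application resources expose their state in .status.sync.status and .status.health.status. The fully qualified resource name applications.argoproj.io is used to avoid ambiguity with other resources named "applications".

```shell
#!/usr/bin/env bash
# Hypothetical filter: reads "NAME SYNC HEALTH" rows on stdin and prints the
# rows that are not both Synced and Healthy. Blank lines are ignored.
unhealthy_apps() {
  awk 'NF && ($2 != "Synced" || $3 != "Healthy")'
}

# Hypothetical wrapper: lists Argo CD Applications with sync and health status.
list_apps() {
  oc get applications.argoproj.io -n openshift-gitops --no-headers \
    -o custom-columns=NAME:.metadata.name,SYNC:.status.sync.status,HEALTH:.status.health.status
}

# Polls once a minute, up to 10 times (illustrative values).
# Returns 0 when every listed application is Synced and Healthy,
# 1 on timeout, and 2 if the oc query itself fails.
wait_for_apps() {
  local attempts=10 apps pending
  while [ "$attempts" -gt 0 ]; do
    apps="$(list_apps)" || return 2
    pending="$(printf '%s\n' "$apps" | unhealthy_apps)"
    if [ -z "$pending" ]; then
      echo "All listed applications are Synced and Healthy."
      return 0
    fi
    printf 'Still waiting on:\n%s\n' "$pending"
    attempts=$((attempts - 1))
    sleep 60
  done
  return 1
}
```

Call wait_for_apps after logging in to the cluster. An empty application list is also reported as healthy, so confirm that the namespace actually contains applications before trusting a successful result.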
GPU and inference issues
GPU nodes are not ready
The NVIDIA GPU Operator must successfully initialize on the GPU node before model serving can start.
The vLLM inference service pod remains in Pending state, or oc get inferenceservice -A shows the service not ready.
Check the status of GPU nodes:
$ oc get nodes -l nvidia.com/gpu.present=true
Check the NVIDIA GPU Operator pods:
$ oc get pods -n nvidia-gpu-operator
Check for driver initialization errors:
$ oc logs -n nvidia-gpu-operator -l app=nvidia-driver-daemonset
If you are using a provider other than AWS, confirm that a GPU node was present in the cluster before you deployed the pattern. The pattern does not provision GPU nodes on providers other than AWS.
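A node can carry the nvidia.com/gpu.present label and still be unusable for model serving until it advertises the nvidia.com/gpu resource as allocatable. The following sketch lists labelled nodes that advertise no GPUs. The function names are hypothetical, and the column expression assumes that a backslash escapes the dot in the resource name for custom-columns output.

```shell
#!/usr/bin/env bash
# Hypothetical filter: reads "NAME GPUS" rows on stdin and prints the names of
# nodes that advertise no allocatable GPUs ("<none>" or 0).
nodes_without_gpus() {
  awk 'NF && ($2 == "<none>" || $2 == "0") { print $1 }'
}

# Hypothetical wrapper: lists labelled GPU nodes with their allocatable
# nvidia.com/gpu count. The backslash escapes the dot in the resource name.
list_gpu_nodes() {
  oc get nodes -l nvidia.com/gpu.present=true --no-headers \
    -o 'custom-columns=NAME:.metadata.name,GPUS:.status.allocatable.nvidia\.com/gpu'
}
```

Run list_gpu_nodes | nodes_without_gpus after logging in. Any node name printed is a candidate for the driver log check above.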
Inference endpoint is not serving
oc get inferenceservice -A shows the inference service in a non-ready state, or the chatbot returns connection errors.
Check the status of the inference service:
$ oc get inferenceservice -A
Check the vLLM model server pod logs:
$ oc logs -n lemonade-stand -l serving.kserve.io/inferenceservice=llm-service
Confirm that the GPU node has sufficient available VRAM. The Llama 3.2 3B Instruct model requires a GPU with at least 24 GB of VRAM.
Guardrails orchestrator issues
Guardrails Orchestrator pod is not ready
All detector models must be available and healthy before the Guardrails Orchestrator can serve requests.
The orchestrator pod is in CrashLoopBackOff or Error state, or the chatbot returns 503 errors.
Check the status of all pods in the lemonade-stand namespace:
$ oc get pods -n lemonade-stand
Check the orchestrator pod logs for detector connection errors:
$ oc logs -n lemonade-stand -l app=guardrails-orchestrator
Verify that all detector services are running:
$ oc get inferenceservice -n lemonade-stand
If detector models are not ready, check that MinIO has successfully downloaded the model artifacts from Hugging Face:
$ oc logs -n lemonade-stand -l app=minio
Guardrails are blocking all requests
Every user query is blocked by the guardrails, even when the content appears safe and in English.
Check the R Shiny dashboard to identify which detector is triggering. Navigate to Networking → Routes in the lemonade-stand namespace and open the dashboard route.
If the Lingua detector is blocking English text, the language confidence threshold may be too high. Review the Lingua threshold in the fms-orchestr8-config-nlp ConfigMap.
If the HAP or prompt injection detector is triggering on safe content, their detection thresholds may be too aggressive. See Configuring detector thresholds.
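To review the configured thresholds without opening the whole ConfigMap, you can filter its YAML for threshold settings. This is a sketch only: it assumes that the relevant keys contain the word "threshold", which this guide does not confirm, and the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical filter: prints every stdin line that mentions "threshold"
# (case-insensitive), prefixed with its line number. Prints nothing and still
# succeeds when there is no match.
threshold_lines() {
  grep -n -i 'threshold' || true
}

# Hypothetical wrapper: dumps the ConfigMap named in this guide and filters it.
show_thresholds() {
  oc get configmap fms-orchestr8-config-nlp -n lemonade-stand -o yaml | threshold_lines
}
```

If show_thresholds prints nothing, the assumption about the key names does not hold, and you need to inspect the full ConfigMap instead.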
Application issues
Lemonade Stand chatbot UI is not accessible
The chatbot UI route returns a 503 or connection error.
Check that the lemonade-stand pod is running:
$ oc get pods -n lemonade-stand -l app=lemonade-stand
Check the application logs for startup errors:
$ oc logs -n lemonade-stand -l app=lemonade-stand
Verify the route is correctly configured:
$ oc get routes -n lemonade-stand
R Shiny dashboard shows no data
The dashboard loads but shows zero values for all metrics, or displays errors.
Confirm that the lemonade-stand application is running and the /metrics endpoint is accessible:
$ oc exec -n lemonade-stand deployment/shiny-dashboard -- curl -s http://lemonade-stand:8080/metrics
Check the Shiny dashboard pod logs:
$ oc logs -n lemonade-stand -l app=shiny-dashboard
Verify that the shinyDashboard.metrics.url in the Helm chart values points to the correct metrics endpoint.
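To tell an empty metrics endpoint apart from a dashboard misconfiguration, you can count the samples that the endpoint returns. This sketch assumes that the endpoint serves the Prometheus text exposition format, where lines starting with # are comments; that format is an assumption, and the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical filter: counts stdin lines that are neither blank nor comments.
# In the Prometheus text format, each remaining line is one metric sample.
metric_sample_count() {
  grep -c -v -E '^[[:space:]]*(#|$)' || true
}

# Hypothetical wrapper: fetches the metrics from inside the dashboard pod,
# using the same command as in this guide.
fetch_metrics() {
  oc exec -n lemonade-stand deployment/shiny-dashboard -- \
    curl -s http://lemonade-stand:8080/metrics
}
```

Run fetch_metrics | metric_sample_count. A count of 0 suggests that the application is not exposing any metrics, or that the request failed. A nonzero count suggests that the data exists and the dashboard configuration is the more likely cause.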
Getting help
If you cannot resolve an issue using this guide:
Check the GitHub issues for known problems and workarounds.
Open a new issue with the output of the following command to help diagnose the problem:
$ oc get pods -A | grep -v Running | grep -v Completed
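To attach the output to an issue, you can save it to a file. The following sketch is a column-aware variant of the command above: it matches only the STATUS column, so a pod whose name happens to contain "Running" is not filtered out, and it keeps the header row. The function names and the default file name pod-report.txt are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical filter: keeps the header row and any pod whose STATUS column
# (fourth field of `oc get pods -A`) is neither Running nor Completed.
problem_pods() {
  awk 'NR == 1 || ($4 != "Running" && $4 != "Completed")'
}

# Hypothetical wrapper: writes the filtered pod list to a file.
# Usage: collect_report [output-file]
collect_report() {
  local out="${1:-pod-report.txt}"
  oc get pods -A | problem_pods > "$out"
  echo "Wrote $out"
}
```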
