Rancher Prime in AWS: Usage Instructions

Prerequisites

  1. A Rancher-compatible EKS cluster. Please see Rancher support matrix for more details. Please see Creating an EKS cluster for bringing up an EKS cluster to install Rancher Prime PAYG which is covered in the later section of this document.
  2. An ingress controller installed on the EKS cluster so that Rancher is accessible from outside the cluster. Please refer to Rancher documentation for instructions on how to deploy the Ingress-INGINX controller on EKS cluster.
  3. Get the Load Balancer IP. Please refer to Rancher documentation and Save the EXTERNAL-IP.
  4. The Rancher hostname must be a fully qualified domain name (FQDN) and its corresponding IP address must be resolvable from a public DNS. Please refer to Rancher documentation for instructions on how to setup DNS. This DNS is setup to point to the EXTERNAL-IP address.
  5. Installation requires you have the following tools available and properly configured to access your AWS account, and your EKS cluster:
  • aws
  • curl
  • eksctl
  • helm (v3 or greater)

Preparing your cluster

OIDC provider

Your EKS cluster is required to have an OIDC provider installed. To check for an OIDC provider first find the OIDC issuer with the following command. Substitute $CLUSTER_NAME with the Name of your EKS cluster and $REGION with region where it is running:

aws eks describe-cluster --name $CLUSTER_NAME --region $REGION --query cluster.identity.oidc.issuer --output text

A URL is returned, like https://oidc.eks.region.amazonaws.com/id/1234567890ABCDEF. The part after https:// will be referred to in later instructions as the OIDC Provider Identity (e.g. oidc.eks.region.amazonaws.com/id/1234567890ABCDEF). The final section of the URL, 1234567890ABCDEF, is the $OIDC_ID.

Using the id of the issuer found above you can check if a provider is installed with the following command:

aws iam list-open-id-connect-providers | grep $OIDC_ID

If there is no output, you will need to create an OIDC provider:

eksctl utils associate-iam-oidc-provider --cluster $CLUSTER_NAME --region $REGION --approve

IAM Role

To provide the necessary permissions, an IAM role and an attached policy are required. The role name is passed as an argument during the helm deployment.

Create the role with a role name of your choosing (for example, rancher-csp-iam-role), and the required policy attached to it:

eksctl create iamserviceaccount \
  --name rancher-csp-billing-adapter \
  --namespace cattle-csp-billing-adapter-system \
  --cluster $CLUSTER_NAME \
  --region $REGION \
  --role-name $ROLE_NAME --role-only \
  --attach-policy-arn 'arn:aws:iam::aws:policy/AWSMarketplaceMeteringFullAccess' \
  --approve

Installing Rancher

Log helm into the AWS Marketplace ECR, to fetch the application. Note that the AWS Marketplace ECR is always in the us-east-1 region:

export HELM_EXPERIMENTAL_OCI=1

aws --region us-east-1 ecr get-login-password \
  | helm registry login --username AWS \
  --password-stdin 709825985650.dkr.ecr.us-east-1.amazonaws.com

Install Rancher into your cluster using helm. Customize your helm installation values if needed:

NOTE

Rancher Prime utilizes cert-manager to issue and maintain its certificates. Rancher will generate a CA certificate of its own, and sign a cert using that CA.

The Rancher hostname must be resolvable by public DNS. Please refer to the Prerequisites section for more details. For example, if the DNS name is rancher.my.org, HOST_NAME=rancher.my.org

helm install -n cattle-rancher-csp-deployer-system rancher-cloud --create-namespace \
oci://709825985650.dkr.ecr.us-east-1.amazonaws.com/suse/{{repository}}/rancher-cloud-helm/rancher-cloud \
  --version {{chart_version}} \
  --set rancherHostname=$HOST_NAME \
  --set rancherServerURL=https://$HOST_NAME \
  --set rancherReplicas=$REPLICAS \
  --set rancherBootstrapPassword=$BOOTSTRAP_PASSWORD \
  --set rancherIngressClassName=nginx \
  --set global.aws.accountNumber=$AWS_ACCOUNT_ID \
  --set global.aws.roleName=$ROLE_NAME

NOTE: Monitor the rancher-cloud pod logs as the rancher-cloud pod is deleted in 1 min after a successful/failed installation

kubectl logs -f <pod> -n cattle-rancher-csp-deployer-system

After a successful deployment, running the following command should produce a similar output.

kubectl get deployments --all-namespaces=true
NAMESPACE                           NAME                          READY   UP-TO-DATE   AVAILABLE   AGE
cattle-csp-billing-adapter-system   csp-rancher-usage-operator    1/1     1            1           30m
cattle-csp-billing-adapter-system   rancher-csp-billing-adapter   1/1     1            1           30m
cattle-fleet-local-system           fleet-agent                   1/1     1            1           29m
cattle-fleet-system                 fleet-controller              1/1     1            1           29m
cattle-fleet-system                 gitjob                        1/1     1            1           29m
cattle-provisioning-capi-system     capi-controller-manager       1/1     1            1           28m
cattle-system                       rancher                       1/1     1            1           32m
cattle-system                       rancher-webhook               1/1     1            1           29m
cert-manager                        cert-manager                  1/1     1            1           32m
cert-manager                        cert-manager-cainjector       1/1     1            1           32m
cert-manager                        cert-manager-webhook          1/1     1            1           32m
ingress-nginx                       ingress-nginx-controller      1/1     1            1           33m
kube-system                         coredns                       2/2     2            2           38m

Please look at the Troubleshooting section of this doc for a failed installation.

To check if helm chart installation is completed, run following command:

helm ls -n cattle-rancher-csp-deployer-system

After the helm chart installation is complete, you can verify the installation. Verify the status of the helm charts installation by running the following command:

helm status rancher-cloud -n cattle-rancher-csp-deployer-system

Helm Chart Installed Successfully

After it is completed, the Rancher Prime is successfully installed.

Log into the Rancher dashboard

You may now login to Rancher dashboard by point your browser to Rancher server URL https://<Rancher hostname>, where Rancher hostname is the hostname you have chosen previously.


NOTE

The Rancher hostname must be resolvable by public DNS. Please refer to the Prerequisites section for more details.


How To Use Rancher

Please refer to the Rancher documentation on how to use Rancher.

 

Upgrading a Rancher Prime PAYG Cluster

The marketplace PAYG offer is tied to a billing adapter AND Rancher Prime version. These are updated periodically as new version of the billing adapter or Rancher are released. In that case the helm chart will be updated with new tags and digests and new version of helm chart will be uploaded. In order to upgrade the deployed helm chart with new version, execute the following helm command

helm upgrade -n cattle-rancher-csp-deployer-system rancher-cloud --create-namespace \
oci://709825985650.dkr.ecr.us-east-1.amazonaws.com/suse/{{repository}}/rancher-cloud-helm/rancher-cloud \
  --version <upgraded_chart_version> \
  --set rancherHostname=$HOST_NAME \
  --set rancherServerURL=https://$HOST_NAME \
  --set rancherReplicas=$REPLICAS \
  --set rancherIngressClassName=nginx \
  --set global.aws.accountNumber=$AWS_ACCOUNT_ID \
  --set global.aws.roleName=$ROLE_NAME

To check if upgraded helm chart installation is deployed, run following command. It should display the upgraded chart version and REVISION incremented by 1 from previous install.

helm ls -n cattle-rancher-csp-deployer-system

Troubleshooting

This section contains information to help you troubleshoot issues when installing Rancher Prime PAYG.

Jobs and Pods:

Check that pods or jobs have status Running/Completed

kubectl get pods --all-namespaces

If a pod is not in Running state, you can debug into the root cause by running:

Describe pod

kubectl describe pod <pod name> -n <namespaces>

Pod container logs

kubectl logs <pod name> -n <namespaces>

Describe job

kubectl describe job <job name> -n <namespaces>

Logs from the containers of pods of the job

kubectl logs -l job-name=<job name> -n <namespaces>

Recovery from Failed Pods

If any of the pods are not Running :

Check the rancher-cloud Pod

kubectl get pods --all-namespaces | grep rancher-cloud

If the rancher-cloud pod is in Error state, wait for the pod to be deleted which takes approx 1 min after it goes in Error State.

Fix the problem and execute

helm upgrade -n cattle-rancher-csp-deployer-system rancher-cloud --create-namespace \
oci://709825985650.dkr.ecr.us-east-1.amazonaws.com/suse/{{repository}}/rancher-cloud-helm/rancher-cloud --install \
  --version {{chart_version}} \
  --set rancherHostname=$HOST_NAME \
  --set rancherServerURL=https://$HOST_NAME \
  --set rancherReplicas=$REPLICAS \
  --set global.aws.accountNumber=$AWS_ACCOUNT_ID \
  --set global.aws.roleName=$ROLE_NAME

Rancher Usage Record Not found:

Error:

Error from server (NotFound): cspadapterusagerecords.susecloud.net "rancher-usage-record" not found"
 Check Configuration, Retrieve generated configuration csp-config

Solution:

kubectl get cm -n cattle-csp-billing-adapter-system csp-config -o yaml

if a configuration is not listed, you can debug into the root cause by checking the pod status and log (Refer Jobs and Pods section).

Support configuration does not display the rancher product information or the upgraded product version

Once the rancher-cloud components are installed, the support configuration takes approximately an hour to show the product information. In an upgrade scenario, the support config might still have base_product field reference to the pre-upgrade version but after the next usage update (approximately in an hour), the support config should reflect the upgraded version.

Known Issues

  1. Uninstalling Rancher Prime:

     helm uninstall -n cattle-rancher-csp-deployer-system rancher-cloud
    

    Uninstalling Rancher Prime may not cleanly remove all the resources that were created by Rancher. Users are encouraged to use Rancher cleanup script to perform a more comprehensive cleanup if necessary. However, it is recommended to migrate any other workloads off the cluster and prepare to destroy the cluster to complete the uninstallation since cleanup is not recoverable.

  2. When migrating Rancher to a different EKS by following the steps specified in Rancher Backups and Disaster Recovery, Rancher Prime must be reinstalled on the target EKS cluster after restoring from the backup. Furthermore, the restored Rancher version must not be newer than the version available in the AWS marketplace.