Troubleshooting
- Can Container Storage Module Operator manage existing drivers installed using Helm charts or the CSI Operator?
- Why do some of the Custom Resource fields show up as invalid or unsupported in the OperatorHub GUI?
- How can I view detailed logs for the Container Storage Module Operator?
- My Dell CSI Driver install failed. How do I fix it?
- My CSM Replication install fails to validate replication prechecks with 'no such host'.
- How to update resource limits for Container Storage Module Operator when it is deployed using OperatorHub
Can Container Storage Module Operator manage existing drivers installed using Helm charts or the CSI Operator?
The Container Storage Module Operator cannot manage any existing driver installed using Helm charts or the CSI Operator. If you have already installed one of the Dell CSI drivers in your cluster and want to use the CSM Operator based deployment, uninstall the driver and then redeploy it via the Container Storage Module Operator.
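For example, assuming the driver was previously installed as a Helm release (the release and namespace names below are placeholders), it can be removed before redeploying via the operator:
helm list -n <driver-namespace>
helm uninstall <driver-release-name> -n <driver-namespace>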
Why do some of the Custom Resource fields show up as invalid or unsupported in the OperatorHub GUI?
The Container Storage Module Operator is not fully compliant with the OperatorHub React UI elements. Due to this, some of the Custom Resource fields may show up as invalid or unsupported in the OperatorHub GUI. To work around this problem, use kubectl/oc commands to get details about the Custom Resource (CR). This issue will be fixed in upcoming releases of the Container Storage Module Operator.
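For example, the full Custom Resource can be inspected with the following command (the resource and namespace names are placeholders):
kubectl get csm <custom-resource-name> -n <namespace> -o yaml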
How can I view detailed logs for the Container Storage Module Operator?
Detailed logs of the Container Storage Module Operator can be displayed using the following command:
kubectl logs <csm-operator-controller-podname> -n <namespace>
My Dell CSI Driver install failed. How do I fix it?
Describe the current state by issuing:
kubectl describe csm <custom-resource-name> -n <namespace>
In the output, refer to the Status and Events sections. If the status shows pods in a failed state, refer to the CSI Driver Troubleshooting guide.
Example:
Status:
  Controller Status:
    Available: 0
    Desired: 2
    Failed: 2
  Node Status:
    Available: 0
    Desired: 2
    Failed: 2
  State: Failed
Events:
  Warning Updated 67s (x15 over 2m4s) csm (combined from similar events): at 1646848059520359167 Pod error details ControllerError: ErrImagePull= pull access denied for dellem/csi-isilon, repository does not exist or may require 'docker login': denied: requested access to the resource is denied, Daemonseterror: ErrImagePull= pull access denied for dellem/csi-isilon, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
The above event shows that the image dellem/csi-isilon does not exist. To resolve this, use kubectl edit on the csm custom resource and update it with the correct image.
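For example, the image can be corrected by editing the csm custom resource directly (the resource and namespace names are placeholders):
kubectl edit csm <custom-resource-name> -n <namespace>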
To get details of the driver installation, view the operator logs:
kubectl logs <dell-csm-operator-controller-manager-pod> -n dell-csm-operator
Typical reasons for errors:
- Incorrect driver version
- Incorrect driver type
- Incorrect driver spec (env, args) for containers
- Incorrect RBAC permissions
My CSM Replication install fails to validate replication prechecks with 'no such host'.
In replication environments that utilize more than one cluster, and utilize FQDNs to reference API endpoints, it is highly recommended that the DNS be configured to resolve requests involving the FQDN to the appropriate cluster.
If for some reason it is not possible to configure the DNS, the /etc/hosts file should be updated to map the FQDN to the appropriate IP. This change will need to be made to the /etc/hosts file on:
- The bastion node(s) (or wherever repctl is used).
- Either the CSM Operator Deployment or ClusterServiceVersion custom resource if using an Operator Lifecycle Manager (such as with an OperatorHub install).
- Both dell-replication-controller-manager deployments.
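For example, an /etc/hosts entry mapping the remote cluster's API endpoint takes the following form (using the same placeholders as the patch commands that follow):
<remote-IP>    <remote-FQDN>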
To update the ClusterServiceVersion, execute the command below, replacing the fields for the remote cluster’s FQDN and IP.
kubectl patch clusterserviceversions.operators.coreos.com -n <operator-namespace> dell-csm-operator-certified.v1.3.0 \
--type=json -p='[{"op": "add", "path": "/spec/install/spec/deployments/0/spec/template/spec/hostAliases", "value": [{"ip":"<remote-IP>","hostnames":["<remote-FQDN>"]}]}]'
To update the dell-replication-controller-manager deployment, execute the command below, replacing the fields for the remote cluster’s FQDN and IP. Make sure to update the deployment on both the primary and disaster recovery clusters.
kubectl patch deployment -n dell-replication-controller dell-replication-controller-manager \
-p '{"spec":{"template":{"spec":{"hostAliases":[{"hostnames":["<remote-FQDN>"],"ip":"<remote-IP>"}]}}}}'
How to update resource limits for CSM Operator when it is deployed using OperatorHub
In certain environments where users have deployed the CSM Operator using OperatorHub, the Container Storage Module Operator pods have reported 'OOMKilled' errors. This issue is attributed to the default resource requests and limits configured in the CSM Operator, which do not meet the resource requirements of those environments. In this case, users can update the resource limits from the OpenShift web console by following the steps below:
- Log in to the OpenShift web console.
- Navigate to the Operators section in the left pane, expand it, and click on 'Installed Operators'.
- Select the Dell Container Storage Modules operator.
- Click on the YAML tab under the operator; the ClusterServiceVersion (CSV) file opens in a YAML editor.
- Update the resource limits in the opened YAML under the section spec.install.spec.deployments.spec.template.spec.containers.resources (see the sketch after this list).
- Save the CSV and your changes should be applied.
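As a rough sketch, the relevant portion of the CSV might look like the following (the container name and resource values shown here are illustrative placeholders; adjust them to your environment):
spec:
  install:
    spec:
      deployments:
        - spec:
            template:
              spec:
                containers:
                  - name: manager          # placeholder container name
                    resources:
                      limits:
                        cpu: 200m
                        memory: 512Mi
                      requests:
                        cpu: 100m
                        memory: 192Mi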
| Symptoms | Prevention, Resolution or Workaround |
| --- | --- |
| When you run the command kubectl describe pods powerstore-controller-<suffix> -n csi-powerstore, the system indicates that the driver image could not be loaded. | If on Kubernetes, edit the daemon.json file found in the registry location and add { "insecure-registries": [ "hostname.cloudapp.net:5000" ] }. If on OpenShift, run the command oc edit image.config.openshift.io/cluster and add registries to the YAML file that is displayed when you run the command. |
| The kubectl logs -n csi-powerstore powerstore-node-<suffix> driver logs show that the driver can't connect to the PowerStore API. | Check if you have created a secret with the correct credentials. |
| Installation of the driver on supported Kubernetes versions fails with the following error: Error: unable to build kubernetes objects from release manifest: unable to recognize "": no matches for kind "VolumeSnapshotClass" in version "snapshot.storage.k8s.io/v1" | Kubernetes v1.21/v1.22/v1.23 requires the v1 version of the snapshot CRDs to be created in the cluster; see the Volume Snapshot Requirements. |
| A PVC is not getting created and its description shows the following error: failed to provision volume with StorageClass "powerstore-iscsi": rpc error: code = Internal desc = : Unknown error: | Check if you have created a secret with the correct credentials. |
| The NVMeFC pod is not getting created and the host loses the ssh connection, causing the driver pods to go to an error state. | Remove the nvme_tcp module from the host in case of an NVMeFC connection. |
| When a node goes down, the block volumes attached to the node cannot be attached to another node. | 1. Force delete the pod running on the node that went down. 2. Delete the volumeattachment to the node that went down. Now the volume can be attached to the new node. (See the example commands after this table.) |
| Pod creation for NVMe takes time when there are more than 2 connections between the host and the array and considerable volumes are mounted on the host. | Reduce the number of connections between the host and the array to 2. |
| Driver install or upgrade fails because of an incompatible Kubernetes version, even though the version seems to be within the range of compatibility. For example: Error: UPGRADE FAILED: chart requires kubeVersion: >= 1.22.0 < 1.25.0 which is incompatible with Kubernetes V1.22.11-mirantis-1 | If you are using an extended Kubernetes version, please see the helm Chart and use the alternate kubeVersion check that is provided in the comments. Please note that this is not meant to enable the use of pre-release alpha and beta versions, which is not supported. |
| If two separate networks are configured for iSCSI and NVMeTCP, the driver may encounter difficulty identifying the second network (e.g., NVMeTCP). | This is a known issue, and the workaround involves creating a single network on the array to serve both iSCSI and NVMeTCP purposes. |
| Unable to provision PVCs via the driver. | Ensure that the NAS name matches the one provided on the array side. |
| Unable to install or upgrade the driver. | Ensure that the firewall is configured to grant adequate permissions for downloading images from the registry. |
| Faulty paths in the multipath. | Ensure that the multipath configuration is correct and connectivity to the underlying hardware is intact. |
| Unable to install or upgrade the driver due to the minimum Kubernetes or OpenShift version. | Currently CSM only supports the n, n-1, and n-2 versions of Kubernetes and OpenShift; if you still want to continue with an existing version, update verify.sh to continue. |
| Volumes are not getting deleted on the array when PVs are deleted. | Ensure persistentVolumeReclaimPolicy is set to Delete. |
| fsGroupPolicy may not work as expected without root privileges for NFS only (https://github.com/kubernetes/examples/issues/260). | To get the desired behavior, set "RootClientEnabled" = "true" in the storage class parameters. |
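The node-down workaround above can be performed with standard kubectl commands, for example (the pod, namespace, node, and volumeattachment names are placeholders):
kubectl delete pod <pod-name> -n <namespace> --force --grace-period=0
kubectl get volumeattachment | grep <node-name>
kubectl delete volumeattachment <volumeattachment-name>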