Troubleshooting
Troubleshooting guide
| Symptoms | Prevention, Resolution or Workaround |
|---|---|
| Persistent volumes don't get created on the target cluster. | Run `kubectl describe` on one of the replication-controller pods and check whether an event says `Config update won't be applied because of invalid configmap/secrets. Please fix the invalid configuration`. If it does, ensure that you correctly populated the replication ConfigMap. You can check its current state by running `kubectl describe cm -n dell-replication-controller dell-replication-controller-config`. If the ConfigMap is empty, edit it yourself or use the `repctl cluster inject` command (see the ConfigMap check after this table). |
| Persistent volumes don't get created on the target cluster, and you don't see any events on the replication-controller pod. | Check the replication-controller logs by running `kubectl logs -n dell-replication-controller dell-replication-controller-manager-<generated-symbols>`. If you see `clusterId - <clusterID> not found` errors, verify that you specified the same clusterIDs in both your ConfigMap and your replication enabled StorageClass (see the comparison commands after this table). |
| You apply a replication action by manually editing the ReplicationGroup resource field `spec.action` and don't see any change of ReplicationGroup state after a while. | Check the events of the replication-controller pod. If they say `Cannot proceed with action <your-action>. [unsupported action]`, check the spelling of your action and consult the Replication Actions page. Alternatively, you can use `repctl` instead of manually editing ReplicationGroup resources (see the patch sketch after this table). |
| You execute a failover action using the `repctl failover` command and see `failover: error executing failover to source site`. | This means you tried to fail over to a cluster that is already marked as the source. If you still want to execute a failover for the RG, choose another cluster. |
| You've created a PersistentVolumeClaim using a replication enabled StorageClass but don't see any RGs created in the source cluster. | Check the annotations of the created PersistentVolumeClaim (see the annotation check after this table). If it doesn't have annotations that start with `replication.storage.dell.com`, wait a couple of minutes for them to be added and for the RG to be created. |
| When installing the common replication controller using Helm, you see an error that states `invalid ownership metadata` and `missing key "app.kubernetes.io/managed-by": must be set to "Helm"`. | This means the previous release wasn't fully deleted. Fix it either by deleting the entire manifest with `kubectl delete -f deploy/controller.yaml` or by manually deleting the conflicting resources (ClusterRoles, ClusterRoleBinding, etc.); see the cleanup sketch after this table. |
| PVs and/or PVCs are not being created on the source/target cluster, and the controller's logs show `no such host` errors. | Make sure cluster-1's API is reachable from cluster-2 and vice versa. If one of your clusters is an OpenShift cluster located in a private network and needs records in `/etc/hosts`, exec into the controller pod and modify `/etc/hosts` manually (see the `/etc/hosts` sketch after this table). |
| After upgrading to Replication v1.4.0, `kubectl get rg` returns the error `Unable to list "replication.storage.dell.com/v1alpha1, Resource=dellcsireplicationgroups"`. | This means `kubectl` still doesn't recognize the new version of the CRD `dellcsireplicationgroups.replication.storage.dell.com` after the upgrade. Running `kubectl get DellCSIReplicationGroup.v1.replication.storage.dell.com/<rg-id> -o yaml` will resolve the issue. |
| When adding or deleting PVs in an existing SYNC Replication Group in PowerStore, you may encounter the error `The operation is restricted as sync replication session for resource <Replication Group Name> is not paused`. | Pause the replication group, add or delete the PV, and then resume the replication group (RG). The commands for the pause and resume operations are `repctl --rg <rg-id> exec -a suspend` and `repctl --rg <rg-id> exec -a resume` (see the pause/resume sequence after this table). |
| When deleting the last volume from an existing SYNC Replication Group in PowerStore, you may encounter the error `failed to remove volume from volume group: The operation cannot be completed on metro or replicated volume group because volume group will become empty after last members are removed`. | Unassign the protection policy from the corresponding volume group in the PowerStore Manager UI. After that, you can successfully delete the last volume in that SYNC Replication Group. |
| When running CSI-PowerMax with Replication in a multi-cluster configuration, the driver on the target cluster fails, and the following error appears in the logs: `error="CSI reverseproxy service host or port not found, CSI reverseproxy not installed properly"`. | The reverseproxy service needs to be created manually on the target cluster. Follow the instructions here to create it. |
| When running CSI-PowerScale with Replication with encryption enabled, you get the error `SyncIQ policy failed to establish an encrypted connection`, and the Replication Groups and PVCs aren't created on the target cluster. | The encryption required flag in the SyncIQ settings is set to "yes" by default in OneFS 9.0+. To rectify this error, follow this article: https://www.dell.com/support/kbdoc/en-us/000215174/isilon-synciq-9-0-all-policies-fail-when-source-or-target-cluster-is-on-onefs-9-0-with-no-node-on-source-cluster-was-able-to-connect-to-target-cluster |
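
ConfigMap check: a minimal sketch for inspecting and populating the replication ConfigMap, assuming the controller's default `dell-replication-controller` namespace. The kubeconfig paths and cluster names passed to `repctl` are placeholders for your own environment.

```bash
# Show what is currently stored in the replication ConfigMap.
kubectl describe cm -n dell-replication-controller dell-replication-controller-config

# If it is empty, register both clusters and inject the configuration.
# Kubeconfig paths and cluster names below are placeholders.
repctl cluster add -f "/root/config-cluster-1","/root/config-cluster-2" -n "cluster-1","cluster-2"
repctl cluster inject
```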
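Comparison commands: a quick way to compare the clusterIDs on both sides. `<replication-enabled-sc>` is a placeholder for your StorageClass name; the IDs surfaced by both commands must match.

```bash
# clusterIDs referenced by the StorageClass parameters (e.g. remoteClusterID).
kubectl get sc <replication-enabled-sc> -o yaml | grep -i clusterid

# clusterIDs known to the replication controller.
kubectl get cm -n dell-replication-controller dell-replication-controller-config -o yaml
```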
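Patch sketch: if you do edit the RG by hand, a merge patch avoids typos from interactive editing. `FAILOVER_REMOTE` is only an example action name (consult the Replication Actions page for the exact set) and `<rg-id>` is a placeholder.

```bash
# Set spec.action on the RG in one shot; action names are case-sensitive.
kubectl patch rg <rg-id> --type merge -p '{"spec":{"action":"FAILOVER_REMOTE"}}'

# Watch the RG to confirm the action was picked up.
kubectl get rg <rg-id> -o yaml -w
```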
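Annotation check: `<pvc-name>` and `<namespace>` are placeholders.

```bash
# List the PVC's replication annotations; they appear once the claim
# has been processed, and the RG is created shortly afterwards.
kubectl get pvc <pvc-name> -n <namespace> -o yaml | grep replication.storage.dell.com
```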
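Cleanup sketch: list the cluster-scoped leftovers first, then delete only what conflicts. The grep pattern and resource names below are placeholders and depend on what your previous release created.

```bash
# Option 1: delete everything the old manifest created.
kubectl delete -f deploy/controller.yaml

# Option 2: find and delete only the conflicting cluster-scoped leftovers.
kubectl get clusterroles,clusterrolebindings | grep dell-replication
kubectl delete clusterrole <conflicting-clusterrole>
kubectl delete clusterrolebinding <conflicting-clusterrolebinding>
```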
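`/etc/hosts` sketch: the pod name, IP address, and hostname below are placeholders. Note that the change lives in the container filesystem, so it is lost when the pod restarts and must be reapplied.

```bash
# Append a host record inside the running controller pod.
kubectl exec -n dell-replication-controller \
  dell-replication-controller-manager-<generated-symbols> -- \
  sh -c 'echo "10.0.0.10 api.cluster-2.example.com" >> /etc/hosts'
```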
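Pause/resume sequence for SYNC groups, using the commands from the table; `<rg-id>` is a placeholder.

```bash
# Pause the sync replication session.
repctl --rg <rg-id> exec -a suspend

# ... add or delete the PVs while the session is paused ...

# Resume replication once the change is done.
repctl --rg <rg-id> exec -a resume
```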