Troubleshooting
Logs and Events
The first, and in most cases sufficient, step in troubleshooting a CSI driver that has Encryption enabled is to explore the logs of the Encryption driver and the related Kubernetes components. These are some useful log sources:
CSI Driver Containers Logs
The driver creates several controller and node pods. They can be listed with kubectl -n <driver namespace> get pods.
The output will look similar to:
NAME READY STATUS RESTARTS AGE
isi-controller-84f697c874-2j6d4 10/10 Running 0 16h
isi-node-4gtwf 4/4 Running 0 16h
isi-node-lnzws 4/4 Running 0 16h
List the containers in pod isi-node-4gtwf with kubectl -n <driver namespace> logs isi-node-4gtwf.
Each pod has a container called driver, which is the storage driver container, and a container called driver-sec, which is the Encryption driver container.
The logs of these two containers tend to provide the most important information, but other containers may give a hint too.
View the logs of driver-sec in isi-node-4gtwf with kubectl -n <driver namespace> logs isi-node-4gtwf driver-sec.
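When several node pods are involved, it can help to collect the driver-sec logs from all of them in one pass. The following is a minimal sketch, not part of the product: the function name encryption_logs and the default of 100 lines are illustrative assumptions, and only standard kubectl options are used.

```shell
# Print the last N log lines of the driver-sec container for every pod in the
# driver namespace. Pods that have no driver-sec container are skipped.
# Usage: encryption_logs <driver namespace> [number of lines, default 100]
encryption_logs() {
  ns="$1"
  tail_lines="${2:-100}"
  for pod in $(kubectl -n "$ns" get pods -o name); do
    # Space-separated container names of this pod
    containers=$(kubectl -n "$ns" get "$pod" -o jsonpath='{.spec.containers[*].name}')
    case " $containers " in
      *" driver-sec "*)
        echo "=== $pod ==="
        kubectl -n "$ns" logs "$pod" -c driver-sec --tail="$tail_lines"
        ;;
    esac
  done
}
```

For example, encryption_logs <driver namespace> 200 prints the last 200 lines per pod, each preceded by a header with the pod name.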
The log level of this container can be changed by setting the encryption.logLevel value and restarting the driver.
Often it is necessary to see the logs produced on a specific Kubernetes worker host.
To find which node pod is running on which worker host, use kubectl -n <driver namespace> get pods -o wide.
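If the worker host is already known, the pod list can be narrowed to that host with a field selector instead of scanning the wide output. This is a small sketch; the function name pods_on_node is an illustrative assumption, while spec.nodeName is a standard field selector for pods.

```shell
# List the driver pods scheduled on a single worker host.
# Usage: pods_on_node <driver namespace> <node name>
pods_on_node() {
  kubectl -n "$1" get pods -o wide --field-selector "spec.nodeName=$2"
}
```

The node name must match a name shown by kubectl get nodes.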
PersistentVolume, PersistentVolumeClaim and Application Pod Events
Some errors may be logged to the events of the related resource, which can be viewed with the kubectl describe command for that resource.
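The three resources can be described in one go. The sketch below assumes the claim and the application pod live in the same namespace; the function name show_events is illustrative, and the PV name is read from the claim's spec.volumeName field.

```shell
# Describe a PVC, the PV bound to it, and a pod that uses it, so that the
# Events section of each resource can be inspected.
# Usage: show_events <namespace> <pvc name> <pod name>
show_events() {
  ns="$1"; pvc="$2"; pod="$3"
  # Name of the PV bound to the claim (empty while the claim is unbound)
  pv=$(kubectl -n "$ns" get pvc "$pvc" -o jsonpath='{.spec.volumeName}')
  kubectl -n "$ns" describe pvc "$pvc"
  if [ -n "$pv" ]; then
    kubectl describe pv "$pv"   # PersistentVolumes are cluster-scoped
  fi
  kubectl -n "$ns" describe pod "$pod"
}
```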
Vault Server Logs
Some errors related to communication with the Vault server and key requests may be logged on the Vault server side.
If you run a test instance of the server in a Docker container, you can view the logs with docker logs vault-server.
Typical Failure Reasons
Incorrect Vault related configuration
- check logs
- check vault-auth secret
- check vault-cert secret
- check vault-client-conf config map
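A quick first check is whether the three objects named above exist at all. The sketch below assumes they are expected in the driver namespace; the function name check_vault_config is illustrative. It only tests for existence and does not validate the contents.

```shell
# Report whether the Vault-related secrets and config map exist.
# Returns a non-zero status if any of them is missing.
# Usage: check_vault_config <driver namespace>
check_vault_config() {
  ns="$1"
  status=0
  for res in secret/vault-auth secret/vault-cert configmap/vault-client-conf; do
    if kubectl -n "$ns" get "$res" >/dev/null 2>&1; then
      echo "found:   $res"
    else
      echo "MISSING: $res"
      status=1
    fi
  done
  return "$status"
}
```

To review the contents of an object that exists, kubectl -n <driver namespace> get configmap vault-client-conf -o yaml prints it in full.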
Incorrect Vault server-side configuration
- check logs
- check Vault server configuration
Expired AppRole secret ID
Incorrect CSI driver configuration
- check the related CSI driver troubleshooting steps
SSH server is stopped/restarted on the worker host
This may manifest in:
- failure to start the CSI driver
- failure to create a new encrypted volume
- failure to access an encrypted volume (IO errors)
Resolution:
- check that the SSH server is running on all worker hosts
- stop all workloads that use encrypted volumes on the node, then restart them
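On a worker host that uses systemd, the SSH server state can be checked as sketched below. This is an assumption about the host setup rather than something the driver prescribes: the unit is commonly named sshd or ssh depending on the distribution, and the function name ssh_server_active is illustrative.

```shell
# Succeeds if an SSH server unit is active on this host.
# Tries both common unit names: sshd and ssh.
ssh_server_active() {
  systemctl is-active --quiet sshd || systemctl is-active --quiet ssh
}
```

Run it on each worker host, for example as ssh_server_active && echo "SSH server is running".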
No license provided, or license expired
This may manifest in:
- failure to start the CSI driver
- failure to create a new encrypted volume
Resolution:
- obtain a new valid license
- check the license is for the cluster on which the encrypted volumes are created
- check encryption-license secret
Typical Rekey Failure Reasons
If all rekeys in the cluster are failing
- check the Rekey controller helm chart values.yaml provisioner name against the Dell CSI driver chart encryption.pluginName, and ensure they match.
- check the Rekey controller helm chart values.yaml port number against the Dell CSI driver chart encryption.apiPort, and ensure they match.
If Rekeys fail for a particular PV
- check that the volume is provisioned by the Encryption provisioner
- check that volume attachments exist for the said PV
- check that at least one node on which the PV is mounted is available and reachable
- check the Encryption provisioner logs for details that may indicate the failure reason
- check the Rekey controller log for the reason for failure
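The first two checks in this list can be sketched with standard kubectl queries. The function name inspect_pv is illustrative, and the sketch assumes that the Encryption provisioner name is what appears in the PV's spec.csi.driver field; compare the printed value with the provisioner name you configured.

```shell
# Print the CSI driver recorded on a PV and the volume attachments that
# reference it.
# Usage: inspect_pv <pv name>
inspect_pv() {
  pv="$1"
  echo "CSI driver: $(kubectl get pv "$pv" -o jsonpath='{.spec.csi.driver}')"
  echo "Attachments (name, PV, node, attached):"
  # VolumeAttachments are cluster-scoped; keep only the rows for this PV
  kubectl get volumeattachments --no-headers \
    -o custom-columns=NAME:.metadata.name,PV:.spec.source.persistentVolumeName,NODE:.spec.nodeName,ATTACHED:.status.attached \
    | awk -v pv="$pv" '$2 == pv'
}
```

An empty attachment list means no volume attachment exists for the PV. The node names in the output can then be checked with kubectl get nodes to confirm that at least one of them is Ready.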
If a Rekey results in a Status.Phase of unknown
- this implies that the connection failed during the rekey process, which may mean the volume was in fact rekeyed
- an additional rekey attempt should work, assuming a reliable connection to the Encryption provisioner. This may result in the volume being rekeyed twice.