Post Installation Dependencies
The following third-party components are required in the same Kubernetes cluster where Container Storage Modules Observability has been deployed:

- Prometheus
- Grafana
There are various ways to deploy these components. We recommend following the Helm deployments according to the specifications defined below.
Tip: Container Storage Modules Observability must be deployed first. Once the module has been deployed, you can proceed with deploying and configuring Prometheus and Grafana.
Prometheus
Prometheus and Container Storage Modules Observability services run on the same Kubernetes cluster, with Container Storage Modules sending metrics to the OpenTelemetry Collector, which Prometheus then scrapes for data.
| Supported Version | Image | Helm Chart |
|---|---|---|
| 2.34.0 | prom/prometheus:v2.34.0 | Prometheus Helm chart |
Note: It is the user’s responsibility to provide persistent storage for Prometheus if they want to preserve historical data.
Prometheus Helm Deployment
Here’s a minimal Prometheus configuration that uses insecure skip verify; for proper TLS, add a `ca_file` signed by the same CA as the Container Storage Modules Observability certificate. For more details, see the Prometheus configuration documentation.
Note: Replace OTEL-COLLECTOR-NAMESPACE with the namespace where otel-collector service is running in prometheus-values.yaml.
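For reference, a hardened variant of the scrape job's TLS settings could look like the sketch below. The `ca_file` path is an assumption for illustration; it must point to a CA bundle mounted into the Prometheus server pod (for example via the chart's `extraSecretMounts`) and signed by the same CA as the Container Storage Modules Observability certificate.

```yaml
# Hypothetical replacement for insecure_skip_verify in the scrape job below.
# The file path is an assumed mount location, not a chart default.
tls_config:
  ca_file: /etc/prometheus/secrets/observability-ca/ca.crt
  insecure_skip_verify: false
```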
1. Create a values file named `prometheus-values.yaml`:

   ```yaml
   # prometheus-values.yaml
   alertmanager:
     enabled: false
   nodeExporter:
     enabled: false
   pushgateway:
     enabled: false
   kube-state-metrics:
     enabled: false
   configmapReload:
     prometheus:
       enabled: false
   server:
     enabled: true
     image:
       repository: quay.io/prometheus/prometheus
       tag: v2.34.0
       pullPolicy: IfNotPresent
     persistentVolume:
       enabled: false
     service:
       type: NodePort
       servicePort: 9090
   extraScrapeConfigs: |
     - job_name: 'karavi-metrics-[CSI-DRIVER]'
       scrape_interval: 5s
       scheme: https
       tls_config:
         insecure_skip_verify: true
       metrics_path: /metrics
       kubernetes_sd_configs:
         - role: endpoints
           namespaces:
             names:
               - [OTEL-COLLECTOR-NAMESPACE]
       relabel_configs:
         - source_labels:
             - __meta_kubernetes_service_label_app_kubernetes_io_instance
             - __meta_kubernetes_service_label_app_kubernetes_io_name
           action: keep
           regex: karavi-observability;otel-collector
         - source_labels: [__meta_kubernetes_endpoint_port_name]
           action: keep
           regex: exporter-https
         - source_labels: [__address__]
           target_label: __address__
           regex: (.+):\d+
           replacement: ${1}:8443
   ```

   To enable scraping of Kubernetes object state metrics, set `kube-state-metrics.enabled` to `true` in the `prometheus-values.yaml` configuration file.

   To scrape the KubeVirt metrics, include the following scrape config in the `prometheus-values.yaml` configuration file:

   ```yaml
   - job_name: 'kubevirt-metrics'
     scrape_interval: 10s
     # Kubernetes service discovery to find the kubevirt-prometheus-metrics service
     kubernetes_sd_configs:
       - role: endpoints
         namespaces:
           # Ensure this is the correct namespace where KubeVirt is installed
           names: [KUBEVIRT-NAMESPACE]
     relabel_configs:
       - source_labels: [__meta_kubernetes_service_name]
         regex: 'kubevirt-prometheus-metrics'
         action: keep
       - source_labels: [__meta_kubernetes_pod_ip, __meta_kubernetes_service_port_name]
         regex: '(.+);https-metrics'
         target_label: '__address__'
         replacement: '$1:8443'
     scheme: https
     tls_config:
       insecure_skip_verify: true
   ```
2. If using Rancher, create a ServiceMonitor:

   ```yaml
   apiVersion: monitoring.coreos.com/v1
   kind: ServiceMonitor
   metadata:
     name: otel-collector
     namespace: powerflex
   spec:
     endpoints:
       - path: /metrics
         port: exporter-https
         scheme: https
         tlsConfig:
           insecureSkipVerify: true
     selector:
       matchLabels:
         app.kubernetes.io/instance: karavi-observability
         app.kubernetes.io/name: otel-collector
   ```
3. Add the Prometheus Helm chart repository.

   On your terminal, run each of the commands below:

   ```bash
   helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
   helm repo add stable https://charts.helm.sh/stable
   helm repo update
   ```
4. Install the Helm chart.

   On your terminal, run the command below:

   ```bash
   helm install prometheus prometheus-community/prometheus -n [CSM_NAMESPACE] -f prometheus-values.yaml
   ```
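After the install completes, it can help to verify that Prometheus is running and scraping the otel-collector target. The commands below are a generic check; the label selector may vary by chart version, and the `prometheus-server` service name follows the release name used above.

```bash
# Confirm the Prometheus server pod is running (label may vary by chart version)
kubectl get pods -n [CSM_NAMESPACE] -l app.kubernetes.io/name=prometheus

# Find the NodePort, then open the UI and check Status > Targets
kubectl get service prometheus-server -n [CSM_NAMESPACE]
```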
Grafana
The Grafana dashboards require Grafana to be deployed in the same Kubernetes cluster as Container Storage Modules Observability. Below are the configuration details required to properly set up Grafana to work with Container Storage Modules Observability.
| Supported Version | Helm Chart |
|---|---|
| 11.x | Grafana Helm chart |
Grafana must be configured with the following data sources/plugins:
| Name | Additional Information |
|---|---|
| Prometheus data source | Prometheus data source |
Settings for the Grafana Prometheus data source:
| Setting | Value | Additional Information |
|---|---|---|
| Name | Prometheus | |
| Type | prometheus | |
| URL | http://PROMETHEUS_IP:PORT | The IP/PORT of your running Prometheus instance |
| Access | Proxy | |
Grafana Helm Deployment
Below are the steps to deploy a new Grafana instance into your Kubernetes cluster:
1. Create a ConfigMap.

   When using a network that requires a decryption certificate, the Grafana server MUST be configured with the necessary certificate. If no certificate is required, skip to step 2.

   Create a config file named `grafana-configmap.yaml`. The file should look like this:

   ```yaml
   # grafana-configmap.yaml
   apiVersion: v1
   kind: ConfigMap
   metadata:
     name: certs-configmap
     namespace: [CSM_NAMESPACE]
     labels:
       certs-configmap: "1"
   data:
     ca-certificates.crt: |-
       -----BEGIN CERTIFICATE-----
       ReplaceMeWithActualCaCERT=
       -----END CERTIFICATE-----
   ```

   NOTE: You must provide an actual CA certificate for this to work.

   On your terminal, run the command below:

   ```bash
   kubectl create -f grafana-configmap.yaml
   ```
2. Create a values file named `grafana-values.yaml`. The file should look like this:

   ```yaml
   # grafana-values.yaml
   image:
     repository: grafana/grafana
     tag: 11.5.2
     sha: ""
     pullPolicy: IfNotPresent

   service:
     type: NodePort

   ## Administrator credentials when not using an existing Secret
   adminUser: admin
   adminPassword: admin

   ## Configure grafana datasources
   ## ref: http://docs.grafana.org/administration/provisioning/#datasources
   ##
   datasources:
     datasources.yaml:
       apiVersion: 1
       datasources:
         - name: Prometheus
           type: prometheus
           access: proxy
           url: 'http://prometheus-server:9090'
           isDefault: null
           version: 1
           editable: true

   testFramework:
     enabled: false

   sidecar:
     datasources:
       enabled: true
     dashboards:
       enabled: true

   ## Additional grafana server ConfigMap mounts
   ## Defines additional mounts with ConfigMap. ConfigMap must be manually created in the namespace.
   extraConfigmapMounts: []
   # If you created a ConfigMap in the previous step, delete [] and uncomment the lines below
   # - name: certs-configmap
   #   mountPath: /etc/ssl/certs/ca-certificates.crt
   #   subPath: ca-certificates.crt
   #   configMap: certs-configmap
   #   readOnly: true
   ```
3. Add the Grafana Helm chart repository.

   On your terminal, run each of the commands below:

   ```bash
   helm repo add grafana https://grafana.github.io/helm-charts
   helm repo update
   ```
4. Install the Helm chart.

   On your terminal, run the command below:

   ```bash
   helm install grafana grafana/grafana -n [CSM_NAMESPACE] -f grafana-values.yaml
   ```
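Once the chart is installed, the Grafana UI is reachable through the NodePort service, or temporarily via a port-forward as sketched below; the `grafana` service name follows the Helm release name used above, and port 80 is the chart's default service port.

```bash
# Temporary access at http://localhost:3000 while the port-forward is running
kubectl port-forward service/grafana 3000:80 -n [CSM_NAMESPACE]
```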
Importing Container Storage Modules for Observability Dashboards
Once Grafana is properly configured, you can import the pre-built observability dashboards. Log into Grafana and click the + icon in the side menu, then click Import. From there you can upload the JSON files or paste the JSON text directly into the text area. The following dashboards are available for import:
| Dashboard | Description |
|---|---|
| PowerFlex: I/O Performance by Kubernetes Node | Provides visibility into the I/O performance metrics (IOPS, bandwidth, latency) by Kubernetes node |
| PowerFlex: I/O Performance by Provisioned Volume | Provides visibility into the I/O performance metrics (IOPS, bandwidth, latency) by volume |
| PowerFlex: Storage Pool Consumption By CSI Driver | Provides visibility into the total, used and available capacity for a storage class and associated underlying storage construct |
| CSI Driver Provisioned Volume Topology | Provides visibility into Dell CSI (Container Storage Interface) driver provisioned volume characteristics in Kubernetes correlated with volumes on the storage system. |
Scalable Kubernetes Metrics for Observability and Visualization
To visualize Dell CSI driver provisioned metrics from multiple clusters in a single Grafana dashboard, users can create multiple data sources by navigating to Data Sources in the left-hand menu and selecting Add new Data Source. Provide the exposed Prometheus URL from each cluster along with the required authentication details.
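Alternatively, the same result can be provisioned declaratively in `grafana-values.yaml`. The sketch below assumes two clusters exposing Prometheus at placeholder URLs, with one data source per cluster; the names and URLs are illustrative, not defaults.

```yaml
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      # One Prometheus data source per cluster; names and URLs are placeholders
      - name: Prometheus-Cluster-A
        type: prometheus
        access: proxy
        url: 'http://[CLUSTER_A_PROMETHEUS]:9090'
      - name: Prometheus-Cluster-B
        type: prometheus
        access: proxy
        url: 'http://[CLUSTER_B_PROMETHEUS]:9090'
```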
Karavi Observability provides pre-built sample dashboards with drill-down capabilities, enabling users to quickly establish a unified observability experience across multiple Kubernetes clusters.
Sample dashboards for KubeVirt volumes are available in Karavi Observability for each supported platform. To use these dashboards, first install KubeVirt by following the official KubeVirt documentation. Once KubeVirt pods are running, install CDI using the CDI installation guide. Ensure that kube-state-metrics is enabled and scraped by Prometheus, as these metrics are required for writing join PromQL queries that power the drill-downs in the sample Grafana dashboards. Below are the locations of the dashboards that can be imported to visualize KubeVirt metrics:
| Dashboard | Description |
|---|---|
| KubeVirt VM Metrics | Provides information about KubeVirt volumes in the selected cluster |
| PowerFlex Volume IO Metrics with kubevirt details | Provides visibility into Virtual Machine status and associated disk details, while also capturing I/O performance metrics for volumes provisioned by Dell CSI PowerFlex that are used by the VMs |
To clone a dashboard, import it into Grafana, click the Edit button at the top right of the dashboard, click Save as Copy, and provide an appropriate title before saving.
To add custom panels to a sample dashboard, import the dashboard into Grafana and click Edit in the top-right corner. Then select Add and choose the visualization panel. Configure the panel by writing PromQL join queries to retrieve customized metrics. For example, to visualize PVC-to-VM mapping, use the following query:
```
kubevirt_vmi_info * on(vmi_pod, namespace) group_left(persistentvolumeclaim) label_replace(kube_pod_spec_volumes_persistentvolumeclaims_info, "vmi_pod", "$1", "pod", "(.*)")
```
This query performs a join between two metrics: kubevirt_vmi_info (providing VirtualMachineInstance details) and kube_pod_spec_volumes_persistentvolumeclaims_info (providing pod-to-PVC mapping). Using label_replace and PromQL’s on(…) group_left(…) matching, it enriches VMI data with PVC information from the pod specification.
More details on the metrics exported by KubeVirt can be found here, and details on the metrics exposed by kube-state-metrics can be found here.
The following sample dashboards enable users to visualize joined metrics from kube-state-metrics and natively exposed Dell storage array metrics:
| Dashboard | Description |
|---|---|
| Topology Dashboard with Kube State Metrics Join Queries | Provides visibility into volumes provisioned by the Dell CSI (Container Storage Interface) driver and their topology details, using join queries on kube-state-metrics |
| Kube State Metrics | Provides a comprehensive overview of the state and performance of Kubernetes resources within a cluster |
Dynamic Configuration
Some parameters can be configured or updated at runtime without restarting the Container Storage Modules for Observability services. These parameters are stored in ConfigMaps that can be updated on the Kubernetes cluster; saving a change automatically updates the settings on the services.
| ConfigMap | Observability Service | Parameters |
|---|---|---|
| karavi-metrics-powerflex-configmap | karavi-metrics-powerflex | |
To update any of these settings, run the following command on the Kubernetes cluster then save the updated ConfigMap data.
```bash
kubectl edit configmap [CONFIG_MAP_NAME] -n [CSM_NAMESPACE]
```
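Before editing, it can be useful to review the current settings; the following read-only command prints the ConfigMap as YAML:

```bash
kubectl get configmap [CONFIG_MAP_NAME] -n [CSM_NAMESPACE] -o yaml
```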
Tracing
Container Storage Modules Observability is instrumented to report trace data to Zipkin. This helps gather timing data needed to troubleshoot latency problems with Container Storage Modules Observability. Follow the instructions below to enable the reporting of trace data:
1. Deploy a Zipkin instance in the CSM namespace and expose the service as NodePort for external access.

   ```yaml
   apiVersion: apps/v1
   kind: Deployment
   metadata:
     name: zipkin
     labels:
       app.kubernetes.io/name: zipkin
       app.kubernetes.io/instance: zipkin-instance
       app.kubernetes.io/managed-by: zipkin-service
   spec:
     replicas: 1
     selector:
       matchLabels:
         app.kubernetes.io/name: zipkin
         app.kubernetes.io/instance: zipkin-instance
     template:
       metadata:
         labels:
           app.kubernetes.io/name: zipkin
           app.kubernetes.io/instance: zipkin-instance
       spec:
         containers:
           - name: zipkin
             image: "openzipkin/zipkin"
             imagePullPolicy: IfNotPresent
             env:
               - name: "STORAGE_TYPE"
                 value: "mem"
               - name: "TRANSPORT_TYPE"
                 value: "http"
   ---
   apiVersion: v1
   kind: Service
   metadata:
     name: zipkin
     labels:
       app.kubernetes.io/name: zipkin
       app.kubernetes.io/instance: zipkin-instance
       app.kubernetes.io/managed-by: zipkin-service
   spec:
     ports:
       - port: 9411
         targetPort: 9411
         protocol: TCP
     type: "NodePort"
     selector:
       app.kubernetes.io/name: zipkin
       app.kubernetes.io/instance: zipkin-instance
   ```
2. Add the Zipkin URI to the Container Storage Modules Observability ConfigMaps. Based on the manifest above, Zipkin will be running on port 9411.

   Note: Zipkin tracing is currently not supported for the collection of PowerFlex metrics.

   Update the ConfigMaps as described in Dynamic Configuration above. Here is an example updating the karavi-metrics-powerstore-configmap based on the deployment manifest above:

   ```bash
   kubectl edit configmap/karavi-metrics-powerstore-configmap -n [CSM_NAMESPACE]
   ```

   Update the ZIPKIN_URI and ZIPKIN_PROBABILITY values and save the ConfigMap:

   ```yaml
   ZIPKIN_URI: "http://zipkin:9411/api/v2/spans"
   ZIPKIN_SERVICE_NAME: "metrics-powerstore"
   ZIPKIN_PROBABILITY: "1.0"
   ```

   Once the ConfigMaps are updated, the changes are applied automatically, and traces can be viewed by accessing Zipkin on the exposed port.
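As a quick check that traces are arriving, the Zipkin UI can also be reached through a temporary port-forward; the service name and port follow the manifest above.

```bash
# Browse to http://localhost:9411 while the port-forward is running
kubectl port-forward svc/zipkin 9411:9411 -n [CSM_NAMESPACE]
```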
Updating Storage System Credentials
If storage system credentials are updated in the CSI Driver, update Container Storage Modules Observability with the new credentials.
When Container Storage Modules for Observability uses the Authorization module
All storage system requests by Container Storage Modules Observability will go through the Authorization module. Perform the following steps:
Update the Authorization Module Token
CSI Driver for PowerFlex
1. Delete the current `proxy-authz-tokens` Secret from the CSM namespace.

   ```bash
   kubectl delete secret proxy-authz-tokens -n [CSM_NAMESPACE]
   ```

2. Copy the `proxy-authz-tokens` Secret from the CSI Driver for Dell PowerFlex namespace to the CSM namespace.

   ```bash
   kubectl get secret proxy-authz-tokens -n [CSI_DRIVER_NAMESPACE] -o yaml | sed 's/namespace: [CSI_DRIVER_NAMESPACE]/namespace: [CSM_NAMESPACE]/' | kubectl create -f -
   ```
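To confirm the Secret was copied, a simple existence check such as the following can be used:

```bash
kubectl get secret proxy-authz-tokens -n [CSM_NAMESPACE]
```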
Update Storage Systems
If the list of storage systems managed by a Dell CSI Driver has changed, perform the following steps to update Container Storage Modules Observability to reference the updated systems:
CSI Driver for PowerFlex
1. Delete the current `karavi-authorization-config` Secret from the CSM namespace.

   ```bash
   kubectl delete secret karavi-authorization-config -n [CSM_NAMESPACE]
   ```

2. Copy the `karavi-authorization-config` Secret from the CSI Driver for PowerFlex namespace to the Container Storage Modules Observability namespace.

   ```bash
   kubectl get secret karavi-authorization-config -n [CSI_DRIVER_NAMESPACE] -o yaml | sed 's/namespace: [CSI_DRIVER_NAMESPACE]/namespace: [CSM_NAMESPACE]/' | kubectl create -f -
   ```
When Container Storage Modules for Observability does not use the Authorization module
In this case, storage system requests made by Container Storage Modules Observability are not routed through the Authorization module. The following steps must be performed:
CSI Driver for PowerFlex
1. Delete the current `vxflexos-config` Secret from the CSM namespace.

   ```bash
   kubectl delete secret vxflexos-config -n [CSM_NAMESPACE]
   ```

2. Copy the `vxflexos-config` Secret from the CSI Driver for PowerFlex namespace to the CSM namespace.

   ```bash
   kubectl get secret vxflexos-config -n [CSI_DRIVER_NAMESPACE] -o yaml | sed 's/namespace: [CSI_DRIVER_NAMESPACE]/namespace: [CSM_NAMESPACE]/' | kubectl create -f -
   ```

   If the CSI driver secret name is not the default `vxflexos-config`, use the following command to copy the secret:

   ```bash
   kubectl get secret [VXFLEXOS-CONFIG] -n [CSI_DRIVER_NAMESPACE] -o yaml | sed 's/name: [VXFLEXOS-CONFIG]/name: vxflexos-config/' | sed 's/namespace: [CSI_DRIVER_NAMESPACE]/namespace: [CSM_NAMESPACE]/' | kubectl create -f -
   ```