acme-fitness PVC Healthcheck

Troubleshooting Commands

Fetch Events for Unhealthy Kubernetes PersistentVolumeClaims in Namespace acme-fitness

What does it do?

This command uses kubectl to find PersistentVolumeClaims (PVCs) in a specific namespace and context whose phase is not Bound, and then retrieves the events for each of them, printing the last timestamp, object name, and message of each event.

for pvc in $(kubectl get pvc -n acme-fitness --context gke_runwhen-nonprod-sandbox_us-central1_sandbox-cluster-1-cluster -o json | jq -r '.items[] | select(.status.phase != "Bound") | .metadata.name'); do kubectl get events -n acme-fitness --context gke_runwhen-nonprod-sandbox_us-central1_sandbox-cluster-1-cluster --field-selector involvedObject.name=$pvc -o json | jq '.items[]| "Last Timestamp: " + .lastTimestamp + ", Name: " + .involvedObject.name + ", Message: " + .message'; done
# Iterate over each PVC in the specified namespace and context that is not in the Bound phase
for pvc in $(kubectl get pvc -n ${NAMESPACE} --context ${CONTEXT} -o json | jq -r '.items[] | select(.status.phase != "Bound") | .metadata.name'); do
  # Get events related to the current PVC, then extract the relevant fields and format them
  kubectl get events -n ${NAMESPACE} --context ${CONTEXT} --field-selector involvedObject.name=$pvc -o json |
  jq '.items[]| "Last Timestamp: " + .lastTimestamp + ", Name: " + .involvedObject.name + ", Message: " + .message';
done
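Selecting on involvedObject.name alone also matches events for any other object in the namespace that happens to share the PVC's name. The sketch below narrows the selector by object kind. The helper names pvc_event_selector and pvc_events are hypothetical and not part of the original commands; NAMESPACE and CONTEXT are assumed to be set in the environment.

```shell
# Hypothetical helper: build an event field selector that matches only
# PersistentVolumeClaim objects with the given name.
pvc_event_selector() {
  printf 'involvedObject.kind=PersistentVolumeClaim,involvedObject.name=%s' "$1"
}

# Hypothetical wrapper: fetch the events for one PVC as JSON.
# Assumes NAMESPACE and CONTEXT are set, as in the commands above.
pvc_events() {
  kubectl get events -n "${NAMESPACE}" --context "${CONTEXT}" \
    --field-selector "$(pvc_event_selector "$1")" -o json
}
```

Inside the loop, pvc_events "$pvc" could then stand in for the kubectl get events call, with the jq formatting step left unchanged.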
List PersistentVolumeClaims in Terminating State in Namespace acme-fitness

What does it do?

This command uses the Kubernetes command line tool (kubectl) to get information about persistent volume claims (pvc) in a specific namespace and context, displaying the ones that are in a "Terminating" state along with their deletion timestamp and finalizers. It also uses jq to format the output as human-readable text.

namespace=acme-fitness; context=gke_runwhen-nonprod-sandbox_us-central1_sandbox-cluster-1-cluster; kubectl get pvc -n $namespace --context=$context -o json | jq -r '.items[] | select(.metadata.deletionTimestamp != null) | .metadata.name as $name | .metadata.deletionTimestamp as $deletion_time | .metadata.finalizers as $finalizers | "\($name) is in Terminating state (Deletion started at: \($deletion_time)). Finalizers: \($finalizers)"'
# First, assign the values of NAMESPACE and CONTEXT to variables
namespace="${NAMESPACE}"
context="${CONTEXT}"

# Then use kubectl to get the list of PersistentVolumeClaims in the specified namespace and context in JSON format.
# Next, use jq to select only the items that have a deletionTimestamp. For each selected item, set the name,
# deletion time, and finalizers as variables and create a custom message.
kubectl get pvc -n $namespace --context=$context -o json |
jq -r '.items[] |
  select(.metadata.deletionTimestamp != null) |
  .metadata.name as $name |
  .metadata.deletionTimestamp as $deletion_time |
  .metadata.finalizers as $finalizers |
  "\($name) is in Terminating state (Deletion started at: \($deletion_time)). Finalizers: \($finalizers)"'
List PersistentVolumes in Terminating State in Namespace acme-fitness

What does it do?

This command finds PersistentVolumes that are terminating and, for each one, retrieves the related events from all namespaces, using jq to print the last timestamp, object name, and message of each event.

for pv in $(kubectl get pv --context gke_runwhen-nonprod-sandbox_us-central1_sandbox-cluster-1-cluster -o json | jq -r '.items[] | select(.metadata.deletionTimestamp != null) | .metadata.name'); do kubectl get events --all-namespaces --field-selector involvedObject.name=$pv --context gke_runwhen-nonprod-sandbox_us-central1_sandbox-cluster-1-cluster -o json | jq '.items[]| "Last Timestamp: " + .lastTimestamp + " Name: " + .involvedObject.name + " Message: " + .message'; done
# Iterate through each Persistent Volume (pv) that is terminating. kubectl displays "Terminating"
# when metadata.deletionTimestamp is set; it is not a value of status.phase, so filter on the timestamp.
for pv in $(kubectl get pv --context ${CONTEXT} -o json | jq -r '.items[] | select(.metadata.deletionTimestamp != null) | .metadata.name'); do
    # Use kubectl to get events from all namespaces associated with the current pv,
    # then use jq to format the output into a more readable and informative format
    kubectl get events --all-namespaces --field-selector involvedObject.name=$pv --context ${CONTEXT} -o json |
        jq '.items[]| "Last Timestamp: " + .lastTimestamp + " Name: " + .involvedObject.name + " Message: " + .message';
# End of the loop for each pv
done
List Pods with Attached Volumes and Related PersistentVolume Details in Namespace acme-fitness

What does it do?

This command uses kubectl to retrieve information about running pods in a specific namespace, including details about the persistent volume claims (PVC) and their associated storage volumes. It then prints out various attributes of each PVC and its related resources, such as status, node location, storage class, access mode, reclaim policy, and CSI driver.

for pod in $(kubectl get pods -n acme-fitness --field-selector=status.phase=Running --context gke_runwhen-nonprod-sandbox_us-central1_sandbox-cluster-1-cluster -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}'); do for pvc in $(kubectl get pods $pod -n acme-fitness --context gke_runwhen-nonprod-sandbox_us-central1_sandbox-cluster-1-cluster -o jsonpath='{range .spec.volumes[*]}{.persistentVolumeClaim.claimName}{"\n"}{end}'); do pv=$(kubectl get pvc $pvc -n acme-fitness --context gke_runwhen-nonprod-sandbox_us-central1_sandbox-cluster-1-cluster -o jsonpath='{.spec.volumeName}') && status=$(kubectl get pv $pv --context gke_runwhen-nonprod-sandbox_us-central1_sandbox-cluster-1-cluster -o jsonpath='{.status.phase}') && node=$(kubectl get pod $pod -n acme-fitness --context gke_runwhen-nonprod-sandbox_us-central1_sandbox-cluster-1-cluster -o jsonpath='{.spec.nodeName}') && zone=$(kubectl get nodes $node --context gke_runwhen-nonprod-sandbox_us-central1_sandbox-cluster-1-cluster -o jsonpath='{.metadata.labels.topology\.kubernetes\.io/zone}') && storageclass=$(kubectl get pvc $pvc -n acme-fitness --context gke_runwhen-nonprod-sandbox_us-central1_sandbox-cluster-1-cluster -o jsonpath='{.spec.storageClassName}') && accessmode=$(kubectl get pvc $pvc -n acme-fitness --context gke_runwhen-nonprod-sandbox_us-central1_sandbox-cluster-1-cluster -o jsonpath='{.status.accessModes[0]}') && reclaimpolicy=$(kubectl get pv $pv --context gke_runwhen-nonprod-sandbox_us-central1_sandbox-cluster-1-cluster -o jsonpath='{.spec.persistentVolumeReclaimPolicy}') && csidriver=$(kubectl get pv $pv --context gke_runwhen-nonprod-sandbox_us-central1_sandbox-cluster-1-cluster -o jsonpath='{.spec.csi.driver}') && echo -e "\n------------\nPod: $pod\nPVC: $pvc\nPV: $pv\nStatus: $status\nNode: $node\nZone: $zone\nStorageClass: $storageclass\nAccessModes: $accessmode\nReclaimPolicy: $reclaimpolicy\nCSIDriver: $csidriver\n"; done; done
# Iterate through each running pod in the specified namespace and context
for pod in $(kubectl get pods -n ${NAMESPACE} --field-selector=status.phase=Running --context ${CONTEXT} -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}'); do
  # Iterate through each persistent volume claim associated with the current pod
  for pvc in $(kubectl get pods $pod -n ${NAMESPACE} --context ${CONTEXT} -o jsonpath='{range .spec.volumes[*]}{.persistentVolumeClaim.claimName}{"\n"}{end}'); do
    # Retrieve information about the persistent volume associated with the current PVC
    pv=$(kubectl get pvc $pvc -n ${NAMESPACE} --context ${CONTEXT} -o jsonpath='{.spec.volumeName}') &&
    status=$(kubectl get pv $pv --context ${CONTEXT} -o jsonpath='{.status.phase}') &&
    node=$(kubectl get pod $pod -n ${NAMESPACE} --context ${CONTEXT} -o jsonpath='{.spec.nodeName}') &&
    zone=$(kubectl get nodes $node --context ${CONTEXT} -o jsonpath='{.metadata.labels.topology\.kubernetes\.io/zone}') &&
    storageclass=$(kubectl get pvc $pvc -n ${NAMESPACE} --context ${CONTEXT} -o jsonpath='{.spec.storageClassName}') &&
    accessmode=$(kubectl get pvc $pvc -n ${NAMESPACE} --context ${CONTEXT} -o jsonpath='{.status.accessModes[0]}') &&
    reclaimpolicy=$(kubectl get pv $pv --context ${CONTEXT} -o jsonpath='{.spec.persistentVolumeReclaimPolicy}') &&
    csidriver=$(kubectl get pv $pv --context ${CONTEXT} -o jsonpath='{.spec.csi.driver}') &&
    # Print out the gathered information in a formatted manner
    echo -e "\n------------\nPod: $pod\nPVC: $pvc\nPV: $pv\nStatus: $status\nNode: $node\nZone: $zone\nStorageClass: $storageclass\nAccessModes: $accessmode\nReclaimPolicy: $reclaimpolicy\nCSIDriver: $csidriver\n";
  done
done

This script retrieves detailed information about all running pods in a specific Kubernetes namespace, including details about their associated Persistent Volume Claims (PVCs) and Persistent Volumes (PVs). This information can be useful for troubleshooting storage-related issues and understanding the storage configuration of running pods.
Fetch the Storage Utilization for PVC Mounts in Namespace acme-fitness

What does it do?

This command is a complex series of nested loops that uses the Kubernetes command line tool (kubectl) to gather information about running pods in a specific namespace and context. It then retrieves information about the persistent volume claims, volumes, container names, and mount paths associated with these pods and prints the information along with disk usage statistics for each mount path.

for pod in $(kubectl get pods -n acme-fitness --field-selector=status.phase=Running --context gke_runwhen-nonprod-sandbox_us-central1_sandbox-cluster-1-cluster -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}'); do for pvc in $(kubectl get pods $pod -n acme-fitness --context gke_runwhen-nonprod-sandbox_us-central1_sandbox-cluster-1-cluster -o jsonpath='{range .spec.volumes[*]}{.persistentVolumeClaim.claimName}{"\n"}{end}'); do for volumeName in $(kubectl get pod $pod -n acme-fitness --context gke_runwhen-nonprod-sandbox_us-central1_sandbox-cluster-1-cluster -o json | jq -r '.spec.volumes[] | select(has("persistentVolumeClaim")) | .name'); do mountPath=$(kubectl get pod $pod -n acme-fitness --context gke_runwhen-nonprod-sandbox_us-central1_sandbox-cluster-1-cluster -o json | jq -r --arg vol "$volumeName" '.spec.containers[].volumeMounts[] | select(.name == $vol) | .mountPath'); containerName=$(kubectl get pod $pod -n acme-fitness --context gke_runwhen-nonprod-sandbox_us-central1_sandbox-cluster-1-cluster -o json | jq -r --arg vol "$volumeName" '.spec.containers[] | select(.volumeMounts[].name == $vol) | .name'); echo -e "\n------------\nPod: $pod, PVC: $pvc, volumeName: $volumeName, containerName: $containerName, mountPath: $mountPath"; kubectl exec $pod -n acme-fitness --context gke_runwhen-nonprod-sandbox_us-central1_sandbox-cluster-1-cluster -c $containerName -- df -h $mountPath; done; done; done;
# Start by iterating through each running pod in the specified namespace and context
for pod in $(kubectl get pods -n ${NAMESPACE} --field-selector=status.phase=Running --context ${CONTEXT} -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}'); do
  # For each pod, iterate through the persistent volume claims (PVCs) associated with it
  for pvc in $(kubectl get pods $pod -n ${NAMESPACE} --context ${CONTEXT} -o jsonpath='{range .spec.volumes[*]}{.persistentVolumeClaim.claimName}{"\n"}{end}'); do
    # Then, for each PVC, iterate through the PVC-backed volumes to find the mount path
    for volumeName in $(kubectl get pod $pod -n ${NAMESPACE} --context ${CONTEXT} -o json | jq -r '.spec.volumes[] | select(has("persistentVolumeClaim")) | .name'); do
      # Use jq to retrieve the mount path and container name for the specific volume
      mountPath=$(kubectl get pod $pod -n ${NAMESPACE} --context ${CONTEXT} -o json | jq -r --arg vol "$volumeName" '.spec.containers[].volumeMounts[] | select(.name == $vol) | .mountPath');
      containerName=$(kubectl get pod $pod -n ${NAMESPACE} --context ${CONTEXT} -o json | jq -r --arg vol "$volumeName" '.spec.containers[] | select(.volumeMounts[].name == $vol) | .name');

      # Print out the details of the pod, PVC, volume name, container name, and mount path
      echo -e "\n------------\nPod: $pod, PVC: $pvc, volumeName: $volumeName, containerName: $containerName, mountPath: $mountPath";

      # Finally, execute a disk usage command within the pod to check the storage consumption at that mount path
      kubectl exec $pod -n ${NAMESPACE} --context ${CONTEXT} -c $containerName -- df -h $mountPath;
    done
  done
done
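The df -h output is intended for reading. To flag nearly full mounts automatically, the POSIX output format (df -P) is easier to parse because each filesystem stays on one line with fixed columns. The sketch below is an illustration only: the helper name mounts_over_threshold, the default threshold of 80 percent, and the sample data are assumptions, and it expects mount paths without spaces.

```shell
# Hypothetical helper: read `df -P` output on stdin and print "<mount> <use%>"
# for every filesystem whose usage is at or above the threshold (default 80).
mounts_over_threshold() {
  awk -v t="${1:-80}" 'NR > 1 {
    use = $5
    sub(/%$/, "", use)                 # strip the trailing percent sign
    if (use + 0 >= t) print $6, use "%"
  }'
}

# Example with captured output instead of a live pod (made-up sample values).
sample='Filesystem 1024-blocks Used Available Capacity Mounted on
/dev/sdb 10255636 9230072 1009180 91% /data
/dev/sdc 10255636 1025563 9213689 10% /cache'
printf '%s\n' "$sample" | mounts_over_threshold 80
```

Against a cluster, the same filter could be fed from the loop above by replacing df -h with df -P and piping the kubectl exec output into mounts_over_threshold, provided the container image ships a df that supports -P.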
Check for RWO Persistent Volume Node Attachment Issues in Namespace acme-fitness

What does it do?

This command is a bash script that uses kubectl to gather information about pods, their PersistentVolumeClaims, and the nodes they run on. For each ReadWriteOnce claim, it compares the node running the pod with the node that reports the volume in use and prints an error message if the two differ. Pods owned by a Job are skipped.

NAMESPACE="acme-fitness"; CONTEXT="gke_runwhen-nonprod-sandbox_us-central1_sandbox-cluster-1-cluster"; PODS=$(kubectl get pods -n $NAMESPACE --context=$CONTEXT -o json); for pod in $(jq -r '.items[] | @base64' <<< "$PODS"); do _jq() { jq -r "${1}" <<< "$(base64 --decode <<< "${pod}")"; }; POD_NAME=$(_jq '.metadata.name'); [[ "$(_jq '.metadata.ownerReferences[0].kind')" == "Job" ]] && continue; POD_NODE_NAME=$(kubectl get pod $POD_NAME -n $NAMESPACE --context=$CONTEXT -o custom-columns=:.spec.nodeName --no-headers); PVC_NAMES=$(kubectl get pod $POD_NAME -n $NAMESPACE --context=$CONTEXT -o jsonpath='{.spec.volumes[*].persistentVolumeClaim.claimName}'); for pvc_name in $PVC_NAMES; do PVC=$(kubectl get pvc $pvc_name -n $NAMESPACE --context=$CONTEXT -o json); ACCESS_MODE=$(jq -r '.spec.accessModes[0]' <<< "$PVC"); if [[ "$ACCESS_MODE" == "ReadWriteOnce" ]]; then PV_NAME=$(jq -r '.spec.volumeName' <<< "$PVC"); STORAGE_NODE_NAME=$(jq -r --arg pv "$PV_NAME" '.items[] | select(.status.volumesAttached != null) | select(.status.volumesInUse[] | contains($pv)) | .metadata.name' <<< "$(kubectl get nodes --context=$CONTEXT -o json)"); echo "------------"; if [[ "$POD_NODE_NAME" == "$STORAGE_NODE_NAME" ]]; then echo "OK: Pod and Storage Node Matched"; else echo "Error: Pod and Storage Node Mismatched - If the issue persists, the node requires attention."; fi; echo "Pod: $POD_NAME"; echo "PVC: $pvc_name"; echo "PV: $PV_NAME"; echo "Node with Pod: $POD_NODE_NAME"; echo "Node with Storage: $STORAGE_NODE_NAME"; echo; fi; done; done
# Require the NAMESPACE and CONTEXT environment variables; abort if either is unset
: "${NAMESPACE:?NAMESPACE must be set}"
: "${CONTEXT:?CONTEXT must be set}"

# Retrieve the list of pods in the specified namespace and context as JSON
PODS=$(kubectl get pods -n $NAMESPACE --context=$CONTEXT -o json)

# Loop through each pod entry in the JSON list and perform operations
for pod in $(jq -r '.items[] | @base64' <<< "$PODS"); do
    # Define a helper that runs a jq filter against the decoded pod JSON
    _jq() {
        jq -r "${1}" <<< "$(base64 --decode <<< "${pod}")";
    }

    # Obtain the name of the current pod
    POD_NAME=$(_jq '.metadata.name')

    # Check if the pod is associated with a "Job", and if so, skip to the next iteration
    [[ "$(_jq '.metadata.ownerReferences[0].kind')" == "Job" ]] && continue

    # Retrieve the node name where the pod is running
    POD_NODE_NAME=$(kubectl get pod $POD_NAME -n $NAMESPACE --context=$CONTEXT -o custom-columns=:.spec.nodeName --no-headers)

    # Retrieve the names of persistent volume claims (PVC) associated with the pod
    PVC_NAMES=$(kubectl get pod $POD_NAME -n $NAMESPACE --context=$CONTEXT -o jsonpath='{.spec.volumes[*].persistentVolumeClaim.claimName}')

    # Loop through each PVC name to obtain more details
    for pvc_name in $PVC_NAMES; do
        PVC=$(kubectl get pvc $pvc_name -n $NAMESPACE --context=$CONTEXT -o json)
        ACCESS_MODE=$(jq -r '.spec.accessModes[0]' <<< "$PVC")

        # If the access mode of the PVC is ReadWriteOnce, proceed with further checks
        if [[ "$ACCESS_MODE" == "ReadWriteOnce" ]]; then
            PV_NAME=$(jq -r '.spec.volumeName' <<< "$PVC")

            # Retrieve the node that reports the persistent volume (PV) as in use
            STORAGE_NODE_NAME=$(jq -r --arg pv "$PV_NAME" '.items[] | select(.status.volumesAttached != null) | select(.status.volumesInUse[] | contains($pv)) | .metadata.name' <<< "$(kubectl get nodes --context=$CONTEXT -o json)")

            # Display relevant information comparing pod and storage node associations
            echo "------------"
            if [[ "$POD_NODE_NAME" == "$STORAGE_NODE_NAME" ]]; then
                echo "OK: Pod and Storage Node Matched"
            else
                echo "Error: Pod and Storage Node Mismatched - If the issue persists, the node requires attention."
            fi
            echo "Pod: $POD_NAME"
            echo "PVC: $pvc_name"
            echo "PV: $PV_NAME"
            echo "Node with Pod: $POD_NODE_NAME"
            echo "Node with Storage: $STORAGE_NODE_NAME"
            echo
        fi
    done
done
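The comparison above reports a mismatch whenever the two node names differ, which includes the case where no node reports the volume in use and the storage node name is empty. The sketch below separates that case from a true mismatch. The helper name classify_attachment and the wording of the "Unknown" message are assumptions for illustration, not part of the original script.

```shell
# Hypothetical helper: classify the relationship between the node running the
# pod ($1) and the node reporting the volume in use ($2).
classify_attachment() {
  if [ -z "$2" ]; then
    # No node lists the volume in status.volumesInUse
    echo "Unknown: no node reports the volume in use"
  elif [ "$1" = "$2" ]; then
    echo "OK: Pod and Storage Node Matched"
  else
    echo "Error: Pod and Storage Node Mismatched"
  fi
}
```

In the script, classify_attachment "$POD_NODE_NAME" "$STORAGE_NODE_NAME" could replace the if/else that prints the OK or Error line.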
