
Kubernetes

ClusterControl allows DevOps and SRE teams to declaratively deploy, scale, tune, and back up open-source databases with Kubernetes database operators and ClusterControl’s integration with Argo CD and GitHub-based GitOps workflows.

There are two primary methods for managing database clusters on Kubernetes with ClusterControl:

  1. Direct database cluster management using supported Kubernetes operators via the ClusterControl GUI.
  2. The GitOps approach. See GitOps for details.

ClusterControl Kubernetes proxy

Integration with Kubernetes operators is managed through a new process called kuber-proxy. This proxy is installed alongside ClusterControl Ops-C (the new GUI replacing ClusterControl GUI v2, starting from version 2.3.3) using the package name clustercontrol-kuber-proxy. The service is managed by systemd under the unit file kuber-proxy.

GitOps

ClusterControl serves as the operational layer that sits above Kubernetes and Argo CD, enabling traceable database operations through declarative GitOps workflows. This approach establishes a modern operational model where Git is the single source of truth for declaratively managing applications and infrastructure: configurations are version-controlled in Git and continuously reconciled with the live infrastructure.

The core principles of GitOps include:

  • Declarative infrastructure using YAML.
  • Versioned source of truth (Git).
  • Automated reconciliation (e.g., via Argo CD).
  • Continuous delivery achieved through Pull Requests.
  • Security and governance inherited from the Git workflow.

The following flowchart illustrates the workflow and how the components communicate if GitOps integration is enabled:

flowchart TB
    a[/**DBA/DevOps/SRE**/] --> |configure<br>integration|b{{ClusterControl GUI}}
    b --> |HTTP|c[ClusterControl<br>Proxy]
    c --> |REST<br>#40;JWT#41;|d[ClusterControl<br>Kubernetes Proxy]
    d <--> e([kuber-agent])
    e --> f([Argo CD]) & h
    e --> |pull request<br>only|i
    f --> |watches|g(**GitHub repository**)
    f --> |watches/<br>applies/<br>reconciles|h([Database operator])
    h --> |manages| n@{ shape: procs, label: "**Database pods**"}
    a --> i{pull request/merge/<br>sync manifests}
    i --> g
    subgraph **ClusterControl host**
        b
        c
        d
    end
    subgraph **Kubernetes**
        e
        f
        h
        n
    end

Argo CD integration

Argo CD is a GitOps-based, declarative continuous delivery tool specifically for Kubernetes. It automates the deployment of desired application states into specified target environments. Deployment updates can be tracked via branches or tags, or pinned to a specific Git commit version of the manifests. Argo CD mandates the use of Git repositories as the source of truth for defining the desired application state.

ClusterControl offers integrated support for Argo CD by providing:

  • Automatic Installation: Simple deployment and configuration of Argo CD within the target Kubernetes environment.
  • External Integration: The ability to connect to pre-existing Argo CD instances, ensuring continuity of established CI/CD and governance policies.

GitHub integration

ClusterControl integrates with GitHub using Fine-Grained Personal Access Tokens (PATs) for secure, least-privilege authentication.

These tokens restrict access to specific repositories and operations, minimizing exposure. To create a fine-grained PAT:

  1. Go to GitHub → Settings → Developer Settings → Personal Access Tokens → Fine-grained tokens.

  2. Click Generate new token.

  3. Under Resource owner, select your user or organization.

  4. Under Repository access, choose Only select repositories and pick your GitOps repository.

  5. Under Permissions, enable:

    • Contents: Read and write

    • Metadata: Read

    • Pull requests: Read and write

  6. Set an expiration (recommended: 90 days or less).

  7. Click Generate token and copy it.

  8. Paste the token in ClusterControl GUI → Kubernetes → GitOps Integration → Repository Settings to authenticate securely.

Tip

Store tokens as Kubernetes Secrets or integrate with HashiCorp Vault. Rotate regularly and assign repository-specific permissions only.
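
For example, a minimal way to keep the token out of plain-text configuration is to store it as a Kubernetes Secret (the secret name, key, and namespace below are illustrative, not names ClusterControl requires):

    # Store the fine-grained PAT in a Kubernetes Secret instead of plain text
    kubectl create secret generic github-pat \
        --from-literal=token=<YOUR_FINE_GRAINED_PAT> \
        -n severalnines-system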

Deploy your first Kubernetes cluster

To set up your first database in Kubernetes using ClusterControl, follow the steps below.

1. Install ClusterControl Ops-C

ClusterControl Ops-C is installed by default (it is now the standard ClusterControl GUI, replacing ClusterControl GUI v2), so you only need to follow the instructions in the Installation section. See Online Installation.

Check that the kuber-proxy process is reachable on port 50051 (default), using either a public or private address depending on your setup (for example, a public managed Kubernetes cluster versus an on-premise installation).
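
For example, you can check the service on the ClusterControl host and test TCP connectivity from a machine on the relevant network (the hostname below is a placeholder):

    # On the ClusterControl host: confirm the kuber-proxy unit is running
    systemctl status kuber-proxy

    # From a host that must reach the proxy: test the gRPC port
    nc -zv clustercontrol.example.com 50051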

2. Install the required tools

The kubectl tool is required to use this feature. Install it on the ClusterControl host and make sure it is available in the PATH.
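
For example, on a Linux x86-64 host you can install the latest stable release using the upstream instructions:

    # Download the latest stable kubectl binary (Linux amd64)
    curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"

    # Install it into a directory on the PATH
    sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl

    # Verify the client works
    kubectl version --client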

3. Create a Kubernetes cluster

You must create a Kubernetes cluster if you do not already have one.

The cluster can be a self-hosted on-premise installation, a public managed service, or a development environment like Docker Desktop or OrbStack. The only requirement is that the Kubernetes cluster can successfully connect to the kuber-proxy process on port 50051 of the ClusterControl host.
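
A quick way to confirm this requirement is to run a short-lived pod inside the Kubernetes cluster and test the connection from there (the hostname below is a placeholder for your ClusterControl host):

    # Launch a temporary pod and test TCP connectivity to kuber-proxy
    kubectl run conn-test --rm -it --restart=Never --image=busybox -- \
        nc -zv clustercontrol.example.com 50051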

4. Connect to a Kubernetes cluster

After creating the Kubernetes cluster (or if it was already created), you have to connect to the environment by installing the agent-operator. The Kubernetes agent-operator establishes the connection to your Kubernetes resources. It acts as a bridge between ClusterControl and the Kubernetes environment, allowing you to monitor and perform operations directly through ClusterControl.

The Kubernetes integration is accessible from a new dedicated page, Kubernetes, where you can manage everything related to Kubernetes, such as your database clusters, backups, database operators, and connections to different Kubernetes clusters.

ClusterControl & Kubernetes Environment

You can add any Kubernetes cluster, whether it is an on-premise self-hosted installation, a public managed service, or a Docker Desktop or OrbStack development environment, as long as the agent-operator is able to connect to the kuber-proxy process using the provided gRPC address of the ClusterControl host where the proxy process is running. The default port that kuber-proxy listens on is 50051.

You can check which Kubernetes cluster your current kubectl context points to:

kubectl config current-context
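
If the current context is not the cluster you intend to use, switch to the correct one first (the context name below is an example):

    # List all configured contexts
    kubectl config get-contexts

    # Switch to the desired context
    kubectl config use-context my-production-cluster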

After you have verified that this is the Kubernetes cluster you want to use, connect it to ClusterControl by clicking Connect environment.

5. Configure the agent

Begin by specifying the environment details.

ClusterControl & Kubernetes Connect

In this section, you have to complete the following information:

  • Environment name: Enter an environment name.

  • Host and port: ClusterControl Kubernetes Proxy host and gRPC port. The Kubernetes cluster must be able to reach this endpoint.

  • Agent namespace: Specify the agent's namespace (e.g., severalnines-system).

  • Agent chart version: Select the desired agent Helm chart version (e.g., 1.0.0).

  • Allow cluster-wide write permissions: Grants this agent permission to create, update, patch, and delete Kubernetes resources defined by the operator. Leave unchecked to confine write permissions to the namespaces you list below; those namespaces can then be used to deploy database operators and clusters.

  • Enable GitOps: Use a Git repository as the single source of truth for the desired cluster state.

  • Install Argo: Install Argo CD to manage the lifecycle of the agent and database operators.

6. Set up the repository

This step is for GitOps integration only; skip it if Enable GitOps was left unchecked in step 5.

Provide the GitHub repository URL that will store the YAML manifests. Specify the Base path and Base branch, and authenticate using your GitHub Personal Access Token (PAT) to allow ClusterControl to push configuration changes automatically.

ClusterControl & Kubernetes GitHub Repository

7. Install Argo CD

This step is for GitOps integration only; skip it if Enable GitOps and Install Argo were left unchecked in step 5.

The wizard generates a kubectl command to install Argo CD. Copy and paste this command into your Kubernetes management terminal. This installs the GitOps controller components needed for synchronization with your GitHub repository.

ClusterControl & Kubernetes Argo CD
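
The generated command is tailored to your environment. For reference, a typical upstream Argo CD installation looks like this:

    # Create the Argo CD namespace and apply the upstream manifests
    kubectl create namespace argocd
    kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml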

8. Install the agent

Next, you can either download and execute the YAML file or copy and run the command provided in the inline tab. This exports the necessary environment variables and installs the ClusterControl Kubernetes agent, which connects your cluster back to ClusterControl for management and monitoring.

ClusterControl & Kubernetes Agent Installation

9. Wait for environment detection

Once installation is complete, return to ClusterControl and wait for the new environment to appear in the dashboard.

ClusterControl & Kubernetes Waiting Page

10. Pull requests

This step is for GitOps integration only; skip it if Enable GitOps and Install Argo were left unchecked in step 5.

ClusterControl automatically creates a GitHub Pull Request for configuration deployment. Click View PR, review the YAML and Helm manifests, and merge it into your main branch.

GitHub Pull Request for the initial setup

11. GitOps synchronization

This step is for GitOps integration only; skip it if Enable GitOps and Install Argo were left unchecked in step 5.

Wait for Argo CD to reconcile the merged configuration with the cluster and deploy the agent components.

12. Confirm activation

This step is for GitOps integration only; skip it if Enable GitOps and Install Argo were left unchecked in step 5.

Finally, verify that the Kubernetes environment’s status changes to "Active". This indicates successful connection and readiness for ClusterControl operations such as monitoring, app deployment, and GitOps-based management.

ClusterControl & Kubernetes Active Environment

Create a Kubernetes namespace

Go to ClusterControl GUI → Kubernetes → Namespaces and click Create Namespace. Create the namespace and wait for the View PR button to appear.

ClusterControl & Kubernetes Namespace

Click View PR to open the pull request on GitHub, then review and merge it. Once merged, the new namespace is created and can be used to install a database operator.
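
After the merge has been reconciled, you can confirm the namespace exists from the command line (the name below is an example):

    # Verify that the new namespace has been created
    kubectl get namespace my-database-namespace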

Deploy a database operator

Before deploying a database cluster, a corresponding database operator for a specific database vendor or technology must be installed in the Kubernetes cluster that you want to use.

Go to ClusterControl GUI → Kubernetes → Operators and click Deploy operator, then select the namespace to use and the database operator to install.

ClusterControl & Kubernetes Operators

Supported operators

The following database operators are currently supported:

  • PostgreSQL standalone and streaming replication: CloudNativePG
  • MySQL standalone and replication: Moco

Create a new database cluster

Once the desired database operators have been deployed, the next step is to deploy a new database cluster using one of those operators.

Go to ClusterControl GUI → Kubernetes → Clusters and click on Deploy database cluster.

ClusterControl & Deploy Database Cluster

In this section, complete the required information:

  • Select the environment to deploy to.
  • Select which database operator to use.
  • Set a cluster name and the namespace to use.
  • Set the number of nodes to deploy.
  • Choose the database configuration template to use.
  • Choose the resource template to use for CPU, memory, and storage.
  • Define how to access the database cluster, i.e., via a port on the node or by launching a load balancer (which can incur cost depending on the Kubernetes cluster provider).

ClusterControl & Kubernetes Cluster Ready
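
Once deployed, you can also inspect the resulting resources directly with kubectl. As a sketch for a CloudNativePG-based cluster (the namespace and cluster name below are illustrative):

    # List CloudNativePG cluster objects in the namespace
    kubectl get clusters.postgresql.cnpg.io -n my-database-namespace

    # List the database pods belonging to a specific cluster
    kubectl get pods -n my-database-namespace -l cnpg.io/cluster=my-cluster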

Database connection details

To enable your applications to connect to the database, you need the connection details for the database cluster. These details can be retrieved by selecting the cluster and using the Details menu action.

ClusterControl & Kubernetes Details
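
The same credentials are also stored in Kubernetes Secrets created by the operator. As a hedged example, a CloudNativePG cluster typically stores the application user's credentials in a <cluster-name>-app secret (names below are illustrative):

    # Decode the application user's password from the operator-generated secret
    kubectl get secret my-cluster-app -n my-database-namespace \
        -o jsonpath='{.data.password}' | base64 -d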

Resource and database configuration templates

ClusterControl supports creating templates that specify resource limits and database configurations, which can be quickly applied at deployment time.

ClusterControl & Kubernetes Details

Retrieve and view logs

To access the logs, go to each section (Clusters, Operators, Environments) and click on the Logs option under the Actions menu.

ClusterControl & Kubernetes Logs

Purge resources

  • Clusters: Go to ClusterControl GUI → Kubernetes → Clusters → Cluster Actions → Delete.
  • Operators: Go to ClusterControl GUI → Kubernetes → Operators → Operator Actions → Delete.
  • Environments: Go to ClusterControl GUI → Kubernetes → Environments → Environment Actions → Delete.

Troubleshooting issues

Use k9s to interact with your Kubernetes cluster. It makes it easier to navigate and visualize the resources deployed.

Restart agent

Sometimes you may need to restart the agent when issues occur. Known issues that are fixed by a restart are:

  • The agent is in a disconnected state and does not attempt to reconnect to the proxy.
  • Moco operator backups succeed but do not appear in the UI.

To restart, run the following command:

kubectl rollout restart deployment/kuber-agent-controller-manager -n severalnines-system

Agent with debug log

To see debug logs when installing or upgrading the agent, add --set debug.logLevel=debug to the helm install or upgrade command.
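
For example (the release name and chart reference are placeholders; reuse the values from your original installation):

    # Upgrade the agent release with debug logging enabled
    helm upgrade kuber-agent <chart-reference> \
        -n severalnines-system \
        --reuse-values \
        --set debug.logLevel=debug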

  • Watch agent logs:

    kubectl logs -l app.kubernetes.io/instance=kuber-agent -n severalnines-system -f
    
  • Search in logs:

    kubectl logs -l app.kubernetes.io/instance=kuber-agent -n severalnines-system --tail=-1 | grep "error"
    

Basic troubleshooting commands

  • View pods in a specific namespace:

    kubectl get pods -n <namespace>
    
  • Get detailed information about a resource:

    kubectl describe <resource> <resource-name> -n <namespace>
    
  • View pod logs:

    kubectl logs <pod-name> -n <namespace>
    
  • List all pods across namespaces:

    kubectl get pods --all-namespaces
    

Example issues and solutions

Let's look at some example issues and their possible solutions.

Failed database operators

  • Moco operator failure: Check the operator status:

    kubectl get databaseoperator
    NAME        TYPE        STATUS   VERSION   AGE
    cnpg        cnpg        Ready              38m
    moco        moco        Error              37m
    stackgres   stackgres   Ready              35m
    
    kubectl get pods --all-namespaces
    NAMESPACE             NAME                                             READY   STATUS              RESTARTS      AGE
    cert-manager          cert-manager-86d7c7b689-znbz5                    1/1     Running             0             13m
    cert-manager          cert-manager-cainjector-77894f5f57-djr94         1/1     Running             0             13m
    cert-manager          cert-manager-webhook-6cf469dbd6-tkvnd            1/1     Running             0             13m
    cnpg-system           cnpg-controller-manager-8db87d769-jcvp2          1/1     Running             1 (13m ago)   13m
    default               cnpg-cluster-1                                   1/1     Running             0             11m
    default               cnpg-cluster-2                                   1/1     Running             0             10m
    default               cnpg-cluster-3                                   1/1     Running             0             10m
    severalnines-system   kuber-agent-controller-manager-c6b7d9489-259gp   1/1     Running             0             19h
    stackgres             stackgres-operator-5bfff484d8-55dmj              0/1     ContainerCreating   0             10m
    stackgres             stackgres-operator-set-crd-version-n2rm7         0/1     Completed           0             10m
    stackgres             stackgres-restapi-6554545f7b-mmpmc               0/2     ContainerCreating   0             10m
    
  • Diagnosis: The Moco operator requires cert-manager and its operator pods. While cert-manager is running, the Moco operator pods have not been deployed. The agent logs below show the underlying cause: the agent's service account is forbidden from creating the required cert-manager ClusterRoles because it lacks cluster-wide RBAC permissions (see Allow cluster-wide write permissions in step 5).

  • Check agent logs for details:

    kubectl logs kuber-agent-controller-manager-c6b7d0000-000gp -n severalnines-system -f
    
    Example
    $ kubectl logs kuber-agent-controller-manager-c6b7d0000-000gp -n severalnines-system -f
    2025-03-19T04:27:24Z    ERROR   Failed to install operator  {"controller": "databaseoperator", "controllerGroup": "agent.severalnines.com", "controllerKind": "DatabaseOperator", "DatabaseOperator": {"name":"moco","namespace":"default"}, "namespace": "default", "name": "moco", "reconcileID": "af00009c-0000-4b9a-0000-faadfbf70000", "error": "failed to install cert-manager: failed to apply manifest: [failed to apply resource moco-system/cert-manager-edit: clusterroles.rbac.authorization.k8s.io \"cert-manager-edit\" is forbidden: user \"system:serviceaccount:severalnines-system:agent-operator-controller-manager\" (groups=[\"system:serviceaccounts\" \"system:serviceaccounts:severalnines-system\" \"system:authenticated\"]) is attempting to grant RBAC permissions not currently held:\n{APIGroups:[\"acme.cert-manager.io\"], Resources:[\"challenges\"], Verbs:[\"deletecollection\"]}\n{APIGroups:[\"acme.cert-manager.io\"], Resources:[\"orders\"], Verbs:[\"deletecollection\"]}\n{APIGroups:[\"cert-manager.io\"], Resources:[\"certificaterequests\"], Verbs:[\"deletecollection\"]}\n{APIGroups:[\"cert-manager.io\"], Resources:[\"certificates\"], Verbs:[\"deletecollection\"]}\n{APIGroups:[\"cert-manager.io\"], Resources:[\"issuers\"], Verbs:[\"deletecollection\"]}, failed to apply resource moco-system/cert-manager-controller-approve:cert-manager-io: clusterroles.rbac.authorization.k8s.io \"cert-manager-controller-approve:cert-manager-io\" is forbidden: user \"system:serviceaccount:severalnines-system:agent-operator-controller-manager\" (groups=[\"system:serviceaccounts\" \"system:serviceaccounts:severalnines-system\" \"system:authenticated\"]) is attempting to grant RBAC permissions not currently held:\n{APIGroups:[\"cert-manager.io\"], Resources:[\"signers\"], ResourceNames:[\"clusterissuers.cert-manager.io/*\"], Verbs:[\"approve\"]}\n{APIGroups:[\"cert-manager.io\"], Resources:[\"signers\"], ResourceNames:[\"issuers.cert-manager.io/*\"], Verbs:[\"approve\"]}, failed to apply resource moco-system/cert-manager-controller-certificatesigningrequests: clusterroles.rbac.authorization.k8s.io \"cert-manager-controller-certificatesigningrequests\" is forbidden: user \"system:serviceaccount:severalnines-system:agent-operator-controller-manager\" (groups=[\"system:serviceaccounts\" \"system:serviceaccounts:severalnines-system\" \"system:authenticated\"]) is attempting to grant RBAC permissions not currently held:\n{APIGroups:[\"certificates.k8s.io\"], Resources:[\"signers\"], ResourceNames:[\"clusterissuers.cert-manager.io/*\"], Verbs:[\"sign\"]}\n{APIGroups:[\"certificates.k8s.io\"], Resources:[\"signers\"], ResourceNames:[\"issuers.cert-manager.io/*\"], Verbs:[\"sign\"]}, failed to apply resource moco-system/cert-manager-controller-approve:cert-manager-io: clusterroles.rbac.authorization.k8s.io \"cert-manager-controller-approve:cert-manager-io\" not found, failed to apply resource moco-system/cert-manager-controller-certificatesigningrequests: clusterroles.rbac.authorization.k8s.io \"cert-manager-controller-certificatesigningrequests\" not found]"}
    github.com/severalnines/clustercontrol-k8s/agent-operator/internal/controller.(*DatabaseOperatorReconciler).Reconcile
        /app/agent-operator/internal/controller/databaseoperator_controller.go:275
    sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Reconcile
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:116
    sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:303
    sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:263
    sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.2
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:224
    2025-03-19T04:27:24Z    ERROR   Reconciler error    {"controller": "databaseoperator", "controllerGroup": "agent.severalnines.com", "controllerKind": "DatabaseOperator", "DatabaseOperator": {"name":"moco","namespace":"default"}, "namespace": "default", "name": "moco", "reconcileID": "af00009c-0000-4b9a-0000-faadfbf70000", "error": "failed to install cert-manager: failed to apply manifest: [failed to apply resource moco-system/cert-manager-edit: clusterroles.rbac.authorization.k8s.io \"cert-manager-edit\" is forbidden: user \"system:serviceaccount:severalnines-system:agent-operator-controller-manager\" (groups=[\"system:serviceaccounts\" \"system:serviceaccounts:severalnines-system\" \"system:authenticated\"]) is attempting to grant RBAC permissions not currently held:\n{APIGroups:[\"acme.cert-manager.io\"], Resources:[\"challenges\"], Verbs:[\"deletecollection\"]}\n{APIGroups:[\"acme.cert-manager.io\"], Resources:[\"orders\"], Verbs:[\"deletecollection\"]}\n{APIGroups:[\"cert-manager.io\"], Resources:[\"certificaterequests\"], Verbs:[\"deletecollection\"]}\n{APIGroups:[\"cert-manager.io\"], Resources:[\"certificates\"], Verbs:[\"deletecollection\"]}\n{APIGroups:[\"cert-manager.io\"], Resources:[\"issuers\"], Verbs:[\"deletecollection\"]}, failed to apply resource moco-system/cert-manager-controller-approve:cert-manager-io: clusterroles.rbac.authorization.k8s.io \"cert-manager-controller-approve:cert-manager-io\" is forbidden: user \"system:serviceaccount:severalnines-system:agent-operator-controller-manager\" (groups=[\"system:serviceaccounts\" \"system:serviceaccounts:severalnines-system\" \"system:authenticated\"]) is attempting to grant RBAC permissions not currently held:\n{APIGroups:[\"cert-manager.io\"], Resources:[\"signers\"], ResourceNames:[\"clusterissuers.cert-manager.io/*\"], Verbs:[\"approve\"]}\n{APIGroups:[\"cert-manager.io\"], Resources:[\"signers\"], ResourceNames:[\"issuers.cert-manager.io/*\"], Verbs:[\"approve\"]}, failed to apply resource moco-system/cert-manager-controller-certificatesigningrequests: clusterroles.rbac.authorization.k8s.io \"cert-manager-controller-certificatesigningrequests\" is forbidden: user \"system:serviceaccount:severalnines-system:agent-operator-controller-manager\" (groups=[\"system:serviceaccounts\" \"system:serviceaccounts:severalnines-system\" \"system:authenticated\"]) is attempting to grant RBAC permissions not currently held:\n{APIGroups:[\"certificates.k8s.io\"], Resources:[\"signers\"], ResourceNames:[\"clusterissuers.cert-manager.io/*\"], Verbs:[\"sign\"]}\n{APIGroups:[\"certificates.k8s.io\"], Resources:[\"signers\"], ResourceNames:[\"issuers.cert-manager.io/*\"], Verbs:[\"sign\"]}, failed to apply resource moco-system/cert-manager-controller-approve:cert-manager-io: clusterroles.rbac.authorization.k8s.io \"cert-manager-controller-approve:cert-manager-io\" not found, failed to apply resource moco-system/cert-manager-controller-certificatesigningrequests: clusterroles.rbac.authorization.k8s.io \"cert-manager-controller-certificatesigningrequests\" not found]"}
    sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:316
    sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:263
    sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.2
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:224
    2025-03-19T04:27:24Z    INFO    status-predicate    DIAGNOSTIC: Detected status-only change, filtering out reconciliation   {"name": "moco", "kind": "", "oldGeneration": 1, "newGeneration": 1, "resourceVersion": "24302778", "oldResourceVersion": "24302764"}
    2025/03/19 04:27:25 Received message from proxy: {"group":"agent.severalnines.com","version":"v1alpha1","kind":"DatabaseOperator","namespace":"","limit":50,"continue":"","labelSelector":"","fieldSelector":"","annotationSelector":""} (list_resources)
    

Failed backups

  • List database backups:

    kubectl get databasebackup
    NAME                                                            STATUS   AGE   STARTED   COMPLETED
    cnpg-cluster-backup-cnpg-cluster-backup-backup-20250319055800   Failed   26s  
    
  • Check CNPG backup status:

    kubectl get backup
    NAME                                        AGE   CLUSTER        METHOD              PHASE    ERROR
    cnpg-cluster-backup-backup-20250319055800   94s   cnpg-cluster   barmanObjectStore   failed   can't execute backup: cmd: [/controller/manager backup cnpg-cluster-backup-backup-20250319055800]...
    
  • Get backup details:

    kubectl describe backup cnpg-cluster-backup-backup-20250319055800
    
  • Error:

    Status:
    Error:  can't execute backup: cmd: [/controller/manager backup cnpg-cluster-backup-backup-20250319055800]
    error: command terminated with exit code 1
    stdErr: {"level":"info","ts":"2025-03-19T05:58:22.131051022Z","msg":"Error while requesting backup","logging_pod":"cnpg-cluster-2","backupURL":"http://localhost:8010/pg/backup","statusCode":500,"body":"error while requesting backup: while starting backup: cannot recover backup credentials: while getting secret s3-credentials: secrets \"s3-credentials\" not found\n"}
    
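
In this example the backup fails because the s3-credentials secret referenced by the cluster's barmanObjectStore configuration does not exist. Creating the missing secret in the cluster's namespace should allow the next backup to proceed. A minimal sketch, assuming the key names referenced by your cluster manifest (adjust both keys and namespace to match your configuration):

    # Create the missing object storage credentials secret
    kubectl create secret generic s3-credentials \
        --from-literal=ACCESS_KEY_ID=<your-access-key-id> \
        --from-literal=ACCESS_SECRET_KEY=<your-secret-access-key> \
        -n default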

Node maintenance and troubleshooting

  • Check node status

    kubectl get nodes
    NAME                   STATUS   ROLES    AGE   VERSION
    pool-n97uu1q8u-edf02   Ready    <none>   57d   v1.31.1
    
  • Debug node issues

    Start a debug container on a node:

    kubectl debug node/pool-n97uu1q8u-edf02 -it --image=ubuntu:latest -- /bin/bash
    
  • Manage container runtime

    Access the host filesystem:

    chroot /host
    

    List container images:

    crictl images
    

    List all containers (including stopped):

    crictl ps -a
    
  • Disk usage management

    Check log usage:

    du -sh /var/log/*
    

    Check container runtime storage:

    du -sh /var/lib/containerd/*
    
  • Cleanup tasks

    Remove stopped containers:

    crictl ps -a -o json | jq -r '.containers[] | select(.state=="CONTAINER_EXITED") | .id' | xargs -r crictl rm
    

    Remove dangling images:

    crictl images -o json | jq -r '.images[] | select((.repoTags == null) or ((.repoTags | length)==0)) | .id' | xargs -r crictl rmi