# Access and manage embedded clusters (Beta)

This topic describes how to access and manage nodes in clusters created with Replicated Embedded Cluster.

## Access the cluster

You can use the Embedded Cluster `shell` command to access the cluster from the command line. This is useful for development or troubleshooting.

To access the cluster and use other included binaries:

1. SSH into a controller node.

     :::note
     You cannot run the `shell` command on worker nodes.
     :::

1. Use the Embedded Cluster `shell` command to start a shell with access to the cluster:

     ```bash
     sudo ./APP_SLUG shell
     ```
     Where `APP_SLUG` is the unique slug for the application.

     The output looks similar to the following:
     ```
         __4___
      _  \ \ \ \   Welcome to APP_SLUG debug shell.
     <'\ /_/_/_/   This terminal is now configured to access your cluster.
      ((____!___/) Type 'exit' (or CTRL+d) to exit.
       \0\0\0\0\/  Happy hacking.
      ~~~~~~~~~~~
     root@alex-ec-1:/home/alex# export KUBECONFIG="/var/lib/embedded-cluster/k0s/pki/admin.conf"
     root@alex-ec-1:/home/alex# export PATH="$PATH:/var/lib/embedded-cluster/bin"
     root@alex-ec-1:/home/alex# source <(k0s completion bash)
     root@alex-ec-1:/home/alex# source <(cat /var/lib/embedded-cluster/bin/kubectl_completion_bash.sh)
     root@alex-ec-1:/home/alex# source /etc/bash_completion
     ```

     The appropriate kubeconfig is exported, and the location of useful binaries, such as `kubectl` and Replicated’s `preflight` and `support-bundle` plugins, is added to the `PATH`.

1. Use the available binaries as needed.

     **Example**:

     ```bash
     kubectl version
     ```
     ```
     Client Version: v1.29.1
     Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
     Server Version: v1.29.1+k0s
     ```

1. Type `exit` or press **Ctrl + D** to exit the shell.
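
The shell also provides access to the Replicated `preflight` and `support-bundle` plugins. For example, the following is a minimal sketch of collecting a support bundle from specs stored in the cluster. It assumes that your application includes support bundle specs, and the `--load-cluster-specs` flag depends on the plugin version:

```bash
# Collect a support bundle using any specs discovered in the cluster
kubectl support-bundle --load-cluster-specs
```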

## Configure multi-node clusters

This section describes how to join nodes to a cluster with Embedded Cluster.

### Limitations

Multi-node clusters with Embedded Cluster have the following limitations:

* All nodes joined to the cluster use the same Embedded Cluster data directory as the installation node. You cannot choose a different data directory for Embedded Cluster when joining nodes.

* You should not join more than one controller node at the same time. When joining a controller node, Embedded Cluster prints a warning explaining that you should not attempt to join another node until the controller node joins successfully.

* You cannot change a node's role (`controller` or `worker`) after you join the node. If you need to change a node’s role, reset the node and add it again with the new role.

### Requirement

To deploy multi-node clusters with Embedded Cluster, you must enable the **Multi-node Cluster (Embedded Cluster only)** license field for the customer. For more information about managing customer licenses, see [Create and Manage Customers](/vendor/releases-creating-customer).

### Join nodes {#add-nodes}

To join a node:

1. SSH into a controller node.

1. Run the following command to generate the `.tar.gz` bundle for joining a node:

   ```bash
   sudo ./APP_SLUG create-join-bundle --role [controller | worker]
   ```
   Where:
   * `APP_SLUG` is the unique slug for the application.
   * `--role` is the role to assign the node (`controller` or `worker`).
     
     :::note
     You cannot change the role after you add a node. If you need to change a node’s role, reset the node and add it again with the new role.
     :::

1. Use `scp` to copy the `.tar.gz` bundle to the node that you want to join.

1. On the node, extract the `.tar.gz` bundle. For an example of copying, extracting, and joining, see the sketch after this procedure.

1. Run the join command to add the node to the cluster:

   ```bash
   sudo ./APP_SLUG node join
   ```

1. Repeat these steps for each node you want to add.
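
The following is a minimal sketch of steps 3 through 5, assuming a hypothetical bundle filename (`ec-join-bundle.tar.gz`) and node hostname (`new-node`). Use the actual filename that `create-join-bundle` generates:

```bash
# On the controller node: copy the join bundle to the node you want to join
scp ec-join-bundle.tar.gz user@new-node:~

# On the new node: extract the bundle, then run the join command
tar -xzf ec-join-bundle.tar.gz
sudo ./APP_SLUG node join
```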

## High availability for multi-node clusters {#ha}

Embedded Cluster automatically enables high availability (HA) when at least three `controller` nodes are present in the cluster.

In HA installations, Embedded Cluster deploys multiple replicas of the OpenEBS and image registry built-in extensions. Additionally, whether the Helm [extensions](embedded-config#extensions) that you include in the Embedded Cluster Config are deployed with high availability depends on the given chart and how it is configured.

### Best practices for high availability

Consider the following best practices and recommendations for HA clusters:

* HA requires at least three _controller_ nodes that run the Kubernetes control plane. This is because clusters use a quorum system, in which more than half of the controller nodes must be up and reachable. In clusters with three controller nodes, the Kubernetes control plane can continue to operate if one node fails, because the remaining two nodes can still form a quorum. For a summary of quorum sizes, see the table after this list.

* Always use an odd number of controller nodes in HA clusters. An odd number of controller nodes ensures that the cluster can always form a clear majority in quorum calculations, and avoids split-brain scenarios where the cluster runs as two independent groups of nodes, resulting in inconsistencies and conflicts.

* You can have any number of _worker_ nodes in HA clusters. Worker nodes do not run the Kubernetes control plane, but they can run other workloads, such as your application.
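
Quorum follows the standard majority rule: a cluster with `n` controller nodes requires `floor(n/2) + 1` of them to be up and reachable. The following table summarizes the resulting fault tolerance. Note that a fourth controller node tolerates no more failures than three:

| Controller nodes | Quorum size | Failures tolerated |
|------------------|-------------|--------------------|
| 1                | 1           | 0                  |
| 2                | 2           | 0                  |
| 3                | 2           | 1                  |
| 4                | 3           | 1                  |
| 5                | 3           | 2                  |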

### Create a multi-node cluster with HA

To create a multi-node cluster with HA:

* During installation with Embedded Cluster, follow the steps in the Embedded Cluster UI to join a total of three `controller` nodes to the cluster. For more information about joining nodes, see [Join nodes](#add-nodes) on this page.

   Embedded Cluster automatically converts the installation to HA when three or more controller nodes are present. 

### Enable HA for an existing cluster {#enable-ha-existing}

To enable HA for an existing Embedded Cluster installation with three or more controller nodes:

* On one of the controller nodes, run this command:

   ```bash
   sudo ./APP_SLUG enable-ha
   ```

   Where `APP_SLUG` is the unique slug for the application.

## Reset nodes and remove clusters

This section describes how to reset individual nodes and how to delete an entire multi-node cluster using the Embedded Cluster [reset](embedded-cluster-reset) command.

### About the `reset` command

Resetting a node with Embedded Cluster removes the cluster and your application from that node. This is useful during development and iteration, or when you make a mistake, because you can reuse the machine instead of having to procure a new one.

The `reset` command performs the following steps:

1. Runs safety checks. For example, `reset` does not remove a controller node when there are worker nodes available, and it does not remove a node when the etcd cluster is unhealthy.
1. Drains the node and gracefully evicts all Pods.
1. Deletes the node from the cluster.
1. Stops and resets k0s.
1. Removes all Embedded Cluster files.
1. Reboots the node.

For more information about the command, see [reset](embedded-cluster-reset).

### Limitations and best practices

Before you reset a node or remove a cluster, consider the following limitations and best practices:

* When you reset a node, Embedded Cluster deletes the OpenEBS PVCs on that node. Kubernetes automatically recreates a PVC on another node in the cluster only if the PVC was created as part of a StatefulSet. To recreate other PVCs, redeploy the application in the cluster.

* If you need to reset one controller node in a three-node cluster, first join a fourth controller node to the cluster before removing the target node. This ensures that you maintain a minimum of three nodes for the Kubernetes control plane. You can add and remove worker nodes as needed because they do not have any control plane components.

* When resetting a single node or deleting a test environment, you can include the `--force` flag with the `reset` command to ignore any errors, as shown in the example after this list.

* When removing a multi-node cluster, run `reset` on each of the worker nodes first. Then, run `reset` on the controller nodes. When a controller node is reset, it also removes itself from the etcd membership.
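
For example, the following resets a disposable test node and ignores any errors that would otherwise stop the reset. Use this only on machines that you intend to discard or rebuild:

```bash
# Reset a test node, ignoring errors; not recommended for production nodes
sudo ./APP_SLUG reset --force
```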

### Reset a node {#reset-a-node}

To reset a node:

1. SSH onto the node. Ensure that the Embedded Cluster binary is still available on the machine.

1. Run the following command to remove the node and reboot the machine:

    ```bash
    sudo ./APP_SLUG reset
    ```
    Where `APP_SLUG` is the unique slug for the application.

### Remove a multi-node cluster

To remove a multi-node cluster:

1. SSH onto a worker node.

   :::note
   The safety checks for the `reset` command prevent you from removing a controller node when there are still worker nodes available in the cluster.
   :::

1. Remove the node and reboot the machine:

   ```bash
   sudo ./APP_SLUG reset
   ```
   Where `APP_SLUG` is the unique slug for the application.

1. After removing all the worker nodes in the cluster, SSH onto a controller node and run the `reset` command to remove the node.

1. Repeat the previous step on the remaining controller nodes in the cluster.