Documentation
Troubleshoot Cluster Bootstrapping ¶
When you create a management cluster, a bootstrap cluster is created on your local client machine. This is a Kind based cluster - a cluster in a container. This bootstrap cluster then creates a cluster on your specified provider.
The bootstrap cluster is the key to being able to introspect and understand what is happening during bootstrapping. Any issues or errors that occur in the bootstrap cluster during bootstrapping will provide information about potential problems in the final cluster on the target provider.
Troubleshoot with Tanzu Diagnostics ¶
One way to debug your cluster bootstrap issues is to use the tanzu diagnostics
CLI plugin which comes with Tanzu Community Edition.
Prerequisites ¶
Prior to collecting diagnostics data, your local machine must have the following programs in its $PATH
:
- kind
- kubectl
Before you Begin ¶
The kind bootstrap cluster name and the management cluster name are not related. If there are multiple kind bootstrap clusters present, before you begin debugging, you should determine the name of the kind cluster in one or both of the following files:
~/.kube-tkg/tmp/
~/.config/tanzu/tkg/config.yaml
- find the cluster name in thename
parameter and the corresponding kind bootstrap cluster name in thecontext
parameter prefixed bykind-tkg-kind
Collecting bootstrap diagnostics ¶
If the bootstrap process is stuck or did not finish successfully, use the diagnostics plugin to collect logs and cluster information:
tanzu diagnostics collect
This command searches for Tanzu bootstrap kind clusters (with tkg-kind-* prefix) and collects logs and API objects (excluding Secrets) that can help diagnose the bootstrap issues. You should see output similar to the following:
tanzu diagnostics collect
2021/09/20 11:05:17 Collecting bootstrap cluster diagnostics
2021/09/20 11:05:17 Info: Found bootstrap cluster(s): ["tkg-kind-b4o9sn5948199qbgca8d"]
2021/09/20 11:05:17 Info: Bootstrap cluster: tkg-kind-b4o9sn5948199qbgca8d: capturing node logs
2021/09/20 11:05:22 Info: Capturing pod logs: cluster=tkg-kind-b4o9sn5948199qbgca8d
2021/09/20 11:05:22 Info: Capturing API objects: cluster=tkg-kind-b4o9sn5948199qbgca8d
2021/09/20 11:05:25 Info: Archiving: bootstrap.tkg-kind-b4o9sn5948199qbgca8d.diagnostics.tar.gz
The command automatically generates a tar file which you can unpack to analyze and troubleshoot your cluster.
What is collected? ¶
The diagnostics plugin collects a set of known resources to help troubleshoot bootstrap cluster problems, including:
- Kind cluster logs (obtained with command
kind export logs
) - Pod logs from
capi*, capv*, capa*, tkg-system
namespaces (if they exist) - Pods, services, deployments, apps (from
capi*, capv*, capa*, tkg-system
namespaces) - Any other server resources in the
cluster-api
resource category
No other cluster objects are collected.
Collecting diagnostics for specific bootstrap cluster ¶
The previous command will collect diagnostics for all bootstrap kind clusters that are found. You can use the plugin to specify a boostrap cluster name.
First, determine the name of the bootstrap cluster. This can be done using kind
itself to list its clusters:
kind get clusters
Next, run the tanzu diagnostics
command to collect diagnostics data to help troubleshoot the cluster identified in the step above:
tanzu diagnostics collect --boostrap-cluster-name=<BOOTSTRAP-CLUSTER-NAME>
Diagnostics for bootstrap clusters only ¶
If you do not have a management cluster yet (or do not need to collect management cluster diagnostics), you can modify the tanzu diagnostics
command to only collect diagnostics for the bootstrap cluster as follows:
tanzu diagnostics collect --management-cluster-skip
Troubleshooting manually ¶
If the steps above are not enough, or you want complete control over your troubleshooting steps, complete the following steps to troubleshoot a bootstrap cluster:
Run docker ps on your local Docker system to get the name of the bootstrap cluster container:
docker ps
The bootstrap cluster container name will begin with
tkg-cluster
followed by a unique ID, for example,tkg-cluster-example1234567abcdef
. Copy the CONTAINER ID of the bootstrap cluster container.Open a bash shell in the bootstrap cluster container:
docker exec -it <BOOTSTRAP-CLUSTER-ID> bash
Where
<BOOTSTRAP-CLUSTER-ID>
is the value copied in the previous step.Before you can proceed to run
kubectl
commands against the pods inside the bootstrap cluster container, copy theadmin.conf
file to the default kubeconfig location:cp -v /etc/kubernetes/admin.conf ~/.kube/config
Now you are inside the bootstrap cluster container that is going to bootstrap your cluster to the target provider, you can run
kubectl
commands against this container. By watching the status of the pods, you can understand what might go wrong in the bootstrap process. Run the following command to see the pods being created inside the container:kubectl get po -A
Copy the name of the controller manager, it is usually first in the list. It will be named similarly to the following depending on your target provider:
- capa-controller-manager-12a3456789-b1cde (AWS)
- capd-controller-manager-12a3456789-b1cde (Docker)
- capv-controller-manager-12a3456789-b1cde (vSphere)
- capz-controller-manager-12a3456789-b1cde (Azure)
Next, you can examine the logs of the controller manager that communicates with the target provider. This step is important, if you are having problems bootstrapping, the errors in the controller logs will provide the detail. Examine the logs for the controller manager:
kubectl logs -n <NAMESPACE> <CONTROLLER-MANAGER> -c manager –f
Where
<CONTROLLER-MANAGER>
is the value copied in the previous step.<NAMESPACE>
will vary based on your provider, use:capa-system
(AWS)capd-system
(Docker)capv-system
(vSphere)capz-system
(Azure)
[Optional] Events are also reported based on actions taken in the target provider. You can view the known events by running:
kubectl get events -n tkg-system