Kubernetes Bootstrap Guide
🎯 Overview
This guide explains how to bootstrap a complete Kubernetes cluster from scratch on Azure VMs using the freeleaps-ops repository. Kubernetes is NOT set up automatically; you must bootstrap the cluster and its infrastructure yourself.
📋 Prerequisites
1. Azure Infrastructure
- ✅ Azure VMs (already provisioned)
- ✅ Network connectivity between VMs
- ✅ Azure AD tenant configured
- ✅ Resource group: k8s
2. Local Environment
- ✅ freeleaps-ops repository cloned
- ✅ Ansible installed (pip install ansible)
- ✅ Azure CLI installed and configured
- ✅ SSH access to VMs
3. VM Requirements
- Master Nodes: 2+ VMs for control plane (etcd runs on these nodes; an odd count such as 3 is needed for etcd to tolerate a node failure)
- Worker Nodes: 2+ VMs for workloads
- Network: All VMs in same subnet
- OS: Ubuntu 20.04+ recommended
🚀 Step-by-Step Bootstrap Process
Step 1: Verify Azure VMs
# Check VM status (--show-details is required for powerState and privateIps)
az vm list --resource-group k8s --show-details --query "[].{name:name,powerState:powerState,privateIP:privateIps}" -o table
# Start any VM that is not running
az vm start --resource-group k8s --name <vm-name>
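To spot stopped VMs quickly, the VM listing can be filtered in the shell. The sketch below assumes tab-separated input in the form produced by `az vm list --show-details --query "[].[name,powerState]" -o tsv`; the function name `vms_not_running` is illustrative and not part of the repository.

```shell
# Hypothetical helper: reads "name<TAB>powerState" lines on stdin and prints
# the names of VMs whose state is anything other than "VM running".
vms_not_running() {
  awk -F '\t' '$2 != "VM running" { print $1 }'
}

# Assumed usage (requires a logged-in Azure CLI):
#   az vm list --resource-group k8s --show-details \
#     --query "[].[name,powerState]" -o tsv | vms_not_running
```

Each printed name can then be passed to `az vm start`.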
Step 2: Configure Inventory
Edit the Ansible inventory file:
cd freeleaps-ops
vim cluster/ansible/manifests/inventory.ini
Example inventory structure:
[all:vars]
ansible_user=wwwadmin@mathmast.com
ansible_ssh_common_args='-o StrictHostKeyChecking=no'
[kube_control_plane]
prod-usw2-k8s-freeleaps-master-01 ansible_host=10.10.0.4 etcd_member_name=freeleaps-etcd-01 host_name=prod-usw2-k8s-freeleaps-master-01
prod-usw2-k8s-freeleaps-master-02 ansible_host=10.10.0.5 etcd_member_name=freeleaps-etcd-02 host_name=prod-usw2-k8s-freeleaps-master-02
[kube_node]
prod-usw2-k8s-freeleaps-worker-nodes-01 ansible_host=10.10.0.6 host_name=prod-usw2-k8s-freeleaps-worker-nodes-01
prod-usw2-k8s-freeleaps-worker-nodes-02 ansible_host=10.10.0.7 host_name=prod-usw2-k8s-freeleaps-worker-nodes-02
[etcd]
prod-usw2-k8s-freeleaps-master-01
prod-usw2-k8s-freeleaps-master-02
[k8s_cluster:children]
kube_control_plane
kube_node
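Before running the playbook, it can help to sanity-check the inventory groups. The following is a minimal sketch, assuming a plain INI inventory like the example above; `count_group_hosts` and `check_etcd_quorum` are hypothetical helper names. Because etcd tolerates failures only with an odd member count, the check would flag an inventory that lists two etcd members, as the example does.

```shell
# Hypothetical helper: count the host lines listed under one INI group.
# Blank lines and comment lines (# or ;) are skipped.
count_group_hosts() {
  local group="$1" file="$2"
  awk -v g="[$group]" '
    /^\[/ { in_group = ($1 == g); next }
    in_group && NF && $1 !~ /^[#;]/ { n++ }
    END { print n + 0 }
  ' "$file"
}

# Warn when the etcd group is empty or has an even number of members.
check_etcd_quorum() {
  local n
  n=$(count_group_hosts etcd "$1")
  if [ "$n" -eq 0 ] || [ $((n % 2)) -eq 0 ]; then
    echo "etcd has $n member(s); an odd count (1, 3, 5) is recommended" >&2
    return 1
  fi
  echo "etcd has $n member(s)"
}

# Assumed usage:
#   check_etcd_quorum cluster/ansible/manifests/inventory.ini
```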
Step 3: Test Connectivity
cd cluster/ansible/manifests
ansible -i inventory.ini all -m ping -kK
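If the Ansible ping fails, a plain TCP probe of port 22 can separate network problems from credential problems. This is a rough sketch, assuming bash (for `/dev/tcp`) and the coreutils `timeout` command are available; `inventory_ips` and `check_ssh_ports` are illustrative names, not part of freeleaps-ops.

```shell
# Hypothetical helper: print every ansible_host value found in an inventory file.
inventory_ips() {
  grep -o 'ansible_host=[^[:space:]]*' "$1" | cut -d= -f2 | sort -u
}

# Probe TCP port 22 on each address using bash's /dev/tcp redirection.
check_ssh_ports() {
  local ip
  for ip in $(inventory_ips "$1"); do
    if timeout 3 bash -c "exec 3<>/dev/tcp/$ip/22" 2>/dev/null; then
      echo "$ip: port 22 reachable"
    else
      echo "$ip: port 22 NOT reachable"
    fi
  done
}

# Assumed usage:
#   check_ssh_ports inventory.ini
```

An unreachable port points at NSG rules, routing, or a stopped VM rather than at the SSH user or password.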
Step 4: Bootstrap Kubernetes Cluster
cd ../../../3rd/kubespray
ansible-playbook -i ../../cluster/ansible/manifests/inventory.ini ./cluster.yml -kK -b
What this does:
- Installs Docker/containerd on all nodes
- Downloads Kubernetes binaries (v1.31.4)
- Generates certificates and keys
- Bootstraps etcd cluster
- Starts Kubernetes control plane
- Joins worker nodes
- Configures Calico networking
- Sets up OIDC authentication
Step 5: Get Kubeconfig
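# Get kubeconfig from master node (overwrites ~/.kube/config; assumes sudo does not prompt for a password)
mkdir -p ~/.kube
ssh wwwadmin@mathmast.com@10.10.0.4 "sudo cat /etc/kubernetes/admin.conf" > ~/.kube/config
chmod 600 ~/.kube/config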
# Test cluster access
kubectl get nodes
kubectl get pods -n kube-system
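Because the redirect in Step 5 replaces any existing `~/.kube/config`, it may be worth saving a copy first. A minimal sketch, with `backup_file` as a hypothetical helper name:

```shell
# Hypothetical helper: copy an existing file to "<file>.bak.<timestamp>"
# and print the backup path. Does nothing if the file does not exist.
backup_file() {
  local src="$1" dest
  [ -f "$src" ] || return 0
  dest="${src}.bak.$(date +%Y%m%d%H%M%S)"
  cp -p "$src" "$dest" && echo "$dest"
}

# Assumed usage, before fetching the new kubeconfig:
#   backup_file ~/.kube/config
```

If the new cluster turns out to be unreachable, the previous configuration can be restored by copying the backup over `~/.kube/config`.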
Step 6: Deploy Infrastructure
cd ../../cluster/manifests
# Deploy in order
kubectl apply -f freeleaps-controls-system/
kubectl apply -f freeleaps-devops-system/
kubectl apply -f freeleaps-monitoring-system/
kubectl apply -f freeleaps-logging-system/
kubectl apply -f freeleaps-data-platform/
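Step 6 applies the manifest directories in a fixed order. The loop below is a sketch that stops at the first failure, so a later component is never applied on top of a broken one. `deploy_in_order` and the `APPLY_CMD` override are illustrative names, not part of freeleaps-ops.

```shell
# Hypothetical helper: apply each directory in the order given and stop
# at the first failure. APPLY_CMD can be overridden, for example for a dry run.
deploy_in_order() {
  local apply_cmd="${APPLY_CMD:-kubectl apply -f}"
  local dir
  for dir in "$@"; do
    echo "applying ${dir}/"
    # Intentionally unquoted so the command and its flags split into words.
    $apply_cmd "${dir}/" || { echo "failed at ${dir}" >&2; return 1; }
  done
}

# Assumed usage from cluster/manifests:
#   deploy_in_order freeleaps-controls-system freeleaps-devops-system \
#     freeleaps-monitoring-system freeleaps-logging-system freeleaps-data-platform
```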
Step 7: Setup Authentication
cd ../../cluster/bin
./freeleaps-cluster-authenticator auth
🤖 Automated Bootstrap Script
Use the provided bootstrap script for automated deployment:
cd freeleaps-ops/docs
./bootstrap-k8s-cluster.sh
Script Features:
- ✅ Prerequisites verification
- ✅ Azure VM status check
- ✅ Connectivity testing
- ✅ Automated cluster bootstrap
- ✅ Infrastructure deployment
- ✅ Authentication setup
- ✅ Status verification
Usage Options:
# Full bootstrap
./bootstrap-k8s-cluster.sh
# Only verify prerequisites
./bootstrap-k8s-cluster.sh --verify
# Only bootstrap cluster (skip infrastructure)
./bootstrap-k8s-cluster.sh --bootstrap
🔧 Manual Bootstrap Commands
If you prefer manual control, here are the detailed commands:
1. Install Prerequisites
# Install Ansible
pip install ansible
# Install Azure CLI
curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash
# Install kubectl
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
2. Configure Azure
# Login to Azure
az login
# Set subscription
az account set --subscription <subscription-id>
3. Bootstrap Cluster
# Navigate to kubespray
cd freeleaps-ops/3rd/kubespray
# Run cluster installation
ansible-playbook -i ../../cluster/ansible/manifests/inventory.ini ./cluster.yml -kK -b
4. Verify Installation
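# Get kubeconfig (overwrites ~/.kube/config; assumes sudo does not prompt for a password)
mkdir -p ~/.kube
ssh wwwadmin@mathmast.com@<master-ip> "sudo cat /etc/kubernetes/admin.conf" > ~/.kube/config
chmod 600 ~/.kube/config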
# Test cluster
kubectl get nodes
kubectl get pods -n kube-system
🔍 Verification Steps
1. Cluster Health
# Check nodes
kubectl get nodes -o wide
# Check system pods
kubectl get pods -n kube-system
# Check cluster info
kubectl cluster-info
2. Network Verification
# Check Calico pods
kubectl get pods -n kube-system | grep calico
# Check network policies
kubectl get networkpolicies --all-namespaces
3. Authentication Test
# Test OIDC authentication
kubectl auth whoami
# Check permissions
kubectl auth can-i --list
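The node check from the Cluster Health step can be scripted so that it yields a pass or fail signal. A minimal sketch, assuming the default column layout of `kubectl get nodes --no-headers` (name first, status second); `not_ready_nodes` is a hypothetical name.

```shell
# Hypothetical helper: reads `kubectl get nodes --no-headers` output on stdin
# and prints the names of nodes whose STATUS does not start with "Ready".
# A cordoned node ("Ready,SchedulingDisabled") is treated as ready here.
not_ready_nodes() {
  awk 'NF >= 2 && $2 !~ /^Ready/ { print $1 }'
}

# Assumed usage:
#   kubectl get nodes --no-headers | not_ready_nodes
```

Empty output means every node reports Ready.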
🚨 Troubleshooting
Common Issues
1. Ansible Connection Failed
# Check VM status (--show-details is required for powerState)
az vm show --resource-group k8s --name <vm-name> --show-details --query "powerState"
# Test SSH manually
ssh wwwadmin@mathmast.com@<vm-ip>
# Check network security groups
az network nsg rule list --resource-group k8s --nsg-name <nsg-name>
2. Cluster Bootstrap Failed
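# Re-run the playbook with verbose output (from freeleaps-ops/3rd/kubespray)
ansible-playbook -i ../../cluster/ansible/manifests/inventory.ini ./cluster.yml -kK -b -vvv
# Check node conditions and resources (works only once the API server is reachable)
kubectl describe node <node-name>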
# Check system pods
kubectl get pods -n kube-system
kubectl describe pod <pod-name> -n kube-system
3. Infrastructure Deployment Failed
# Check CRDs
kubectl get crd
# Check operator pods
kubectl get pods --all-namespaces | grep operator
# Check events
kubectl get events --all-namespaces --sort-by='.lastTimestamp'
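When many pods are listed, it helps to narrow the output to the ones that need attention. The sketch below assumes the default columns of `kubectl get pods --all-namespaces --no-headers` (namespace, name, ready, status); `unhealthy_pods` is an illustrative name. It does not catch pods that are Running but not ready.

```shell
# Hypothetical helper: reads `kubectl get pods --all-namespaces --no-headers`
# output on stdin and prints pods whose STATUS is neither Running nor Completed.
unhealthy_pods() {
  awk 'NF >= 4 && $4 != "Running" && $4 != "Completed" { print $1 "/" $2 " (" $4 ")" }'
}

# Assumed usage:
#   kubectl get pods --all-namespaces --no-headers | unhealthy_pods
```

Each reported pod can then be inspected with `kubectl describe pod` and `kubectl logs`.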
Recovery Procedures
If Bootstrap Fails
- Clean up failed installation
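# Restarting a VM does not remove a partial install; run Kubespray's reset playbook first
(cd freeleaps-ops/3rd/kubespray && ansible-playbook -i ../../cluster/ansible/manifests/inventory.ini ./reset.yml -kK -b)
# Restart a VM only if it is unresponsive
az vm restart --resource-group k8s --name <vm-name>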
- Retry bootstrap
cd freeleaps-ops/3rd/kubespray
ansible-playbook -i ../../cluster/ansible/manifests/inventory.ini ./cluster.yml -kK -b
If Infrastructure Deployment Fails
- Check prerequisites
kubectl get nodes
kubectl get pods -n kube-system
- Redeploy components
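# Caution: delete removes every resource defined in the manifests, including any namespaces and volume claims
kubectl delete -f <component-directory>/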
kubectl apply -f <component-directory>/
📊 Post-Bootstrap Verification
1. Core Components
# ArgoCD
kubectl get pods -n freeleaps-devops-system | grep argocd
# Cert-manager
kubectl get pods -n freeleaps-controls-system | grep cert-manager
# Prometheus/Grafana
kubectl get pods -n freeleaps-monitoring-system | grep prometheus
kubectl get pods -n freeleaps-monitoring-system | grep grafana
# Logging
kubectl get pods -n freeleaps-logging-system | grep loki
2. Access Points
# ArgoCD UI
kubectl port-forward svc/argocd-server -n freeleaps-devops-system 8080:80
# Grafana UI
kubectl port-forward svc/kube-prometheus-stack-grafana -n freeleaps-monitoring-system 3000:80
# Kubernetes Dashboard
kubectl port-forward svc/kubernetes-dashboard-kong-proxy -n freeleaps-infra-system 8443:443
3. Authentication Setup
# Setup user authentication
cd freeleaps-ops/cluster/bin
./freeleaps-cluster-authenticator auth
# Test authentication
kubectl auth whoami
kubectl get nodes
🔒 Security Considerations
1. Network Security
- Ensure VMs are in private subnets
- Configure network security groups properly
- Use VPN or bastion host for access
2. Access Control
- Use Azure AD OIDC for authentication
- Implement RBAC for authorization
- Regular access reviews
3. Monitoring
- Enable audit logging
- Monitor cluster health
- Set up alerts
📚 Next Steps
1. Application Deployment
- Deploy applications via ArgoCD
- Configure CI/CD pipelines
- Set up monitoring and alerting
2. Maintenance
- Regular security updates
- Backup etcd data
- Monitor resource usage
3. Scaling
- Add more worker nodes
- Configure auto-scaling
- Optimize resource allocation
🆘 Support
Emergency Contacts
- Infrastructure Team: [Contact Information]
- Azure Support: [Contact Information]
- Kubernetes Community: [Contact Information]
Useful Commands
# Cluster status
kubectl get nodes
kubectl get pods --all-namespaces
# Logs
kubectl logs -n kube-system <pod-name>
# Events
kubectl get events --all-namespaces --sort-by='.lastTimestamp'
# Resource usage
kubectl top nodes
kubectl top pods --all-namespaces
Last Updated: September 3, 2025
Version: 1.0
Maintainer: Infrastructure Team