At work, we use kube-aws to deploy our Kubernetes clusters running on top of Container Linux inside of AWS. I wanted to be able to try new things with Kubernetes in my personal lab without racking up huge AWS bills, which meant figuring out a good way to deploy Kubernetes to baremetal in the least painful way. I wanted to try Tectonic because it offers a simplified graphical installer, and that meant I also needed to install Matchbox to support the baremetal provisioning aspects.
After reading through the entire Tectonic Installation Guide, I realized that it doesn't cover some of the underlying OS provisioning components, as they vary per environment. Since I was starting from scratch and didn't have that infrastructure, I had to get it going before continuing the installation. Here's my summary of the steps I took (including a custom dnsmasq container) to round out a full working guide.
My personal lab is a mixture of PCs, 1U servers, and Mac hardware. So, for this lab, I'm going to pick one of each to ensure the entire process works on all my hardware.
As per the installation guide, Tectonic needs three systems at a minimum. Here are my systems per role:
- Deployment/Provisioning system
- Kubernetes Controller
- Kubernetes Worker
The simple network environment is as follows:
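Roughly, it looks like this (the IP addresses and the controller/worker hostnames are illustrative placeholders; only `deploy.lab` matters to the configurations below):

```
deploy.lab      192.168.1.2    DHCP / DNS / TFTP (dnsmasq) + matchbox
controller.lab  192.168.1.10   Kubernetes controller (netbooted)
worker.lab      192.168.1.11   Kubernetes worker (netbooted)
```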
Detailed PXE/Netboot to CoreOS Installation Flow
The following describes the complete baremetal provisioning process from DHCP request to CoreOS installation. Configuring the provisioning system to support this workflow is described in the subsequent section.
- Using IPMI or a keyboard during booting, tell the system to "boot from network". This causes the NIC to perform a DHCP request and look for TFTP-related settings to boot from. In my case, it was `F12` for my server and holding `N` during the boot chime for the Mac.
- The DHCP server responds with an IP and subnet along with information pointing to the TFTP server and a filename of what to download/run from that TFTP server.
- The system attempts to connect to the TFTP server and download/run that initial file. In the case of `grub2`, that initial boot file runs and then also contacts the same TFTP server looking for a grub boot configuration.
- If a grub boot configuration file is found, it follows that configuration. In the case of `matchbox`, it should be a configuration pointing to matchbox's web port and passing the NIC's MAC address:
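As a sketch, that TFTP-served grub configuration can be as small as this (the hostname and port are assumptions for this lab; matchbox's HTTP service is commonly run on 8080):

```
# grub.cfg served over TFTP - defers everything to matchbox
default=0
timeout=1
menuentry "Chainload matchbox grub config" {
  insmod http
  configfile "(http,deploy.lab:8080)/grub?mac=${net_default_mac}"
}
```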
- Since the initial grub configuration does nothing but load an HTTP module and defer to a web address for the rest of the `grub` configuration, it provides a convenient way to grab a system-specific boot configuration without having to change your TFTP-provided configuration file. In this case, `matchbox` answers web requests at the `/grub?mac=XX:XX:XX:XX:XX:XX` URL with a tailored boot configuration, like so:
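For illustration, matchbox's response to that URL looks something like the following (the kernel version, paths, and hostname are placeholders):

```
default=0
timeout=1
menuentry "CoreOS (PXE)" {
  echo "Loading kernel"
  linux "(http,deploy.lab:8080)/assets/coreos/1353.7.0/coreos_production_pxe.vmlinuz" coreos.config.url=http://deploy.lab:8080/ignition?mac=XX:XX:XX:XX:XX:XX coreos.first_boot=yes
  echo "Loading initrd"
  initrd "(http,deploy.lab:8080)/assets/coreos/1353.7.0/coreos_production_pxe_image.cpio.gz"
}
```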
- Now that grub knows what to boot, where to get it, and the extra kernel parameters pointing at the ignition configuration for what to install inside the OS and how, the system can begin and complete the installation. Here is the ignition configuration that the CoreOS kernel pulls from the `coreos.config.url` URL, which basically says to install CoreOS Container Linux and reboot:
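A minimal sketch of such an ignition configuration, assuming the install script is laid down as a file and run from a systemd unit (names, key, and the elided data URL are illustrative):

```json
{
  "ignition": { "version": "2.0.0" },
  "systemd": {
    "units": [{
      "name": "installer.service",
      "enable": true,
      "contents": "[Unit]\nRequires=network-online.target\nAfter=network-online.target\n[Service]\nType=simple\nExecStart=/opt/installer\n[Install]\nWantedBy=multi-user.target\n"
    }]
  },
  "storage": {
    "files": [{
      "filesystem": "root",
      "path": "/opt/installer",
      "mode": 320,
      "contents": { "source": "data:,%23%21%2Fbin%2Fbash%20-ex%0A..." }
    }]
  },
  "passwd": {
    "users": [{ "name": "debug", "sshAuthorizedKeys": ["ssh-rsa AAAA... (your key)"] }]
  }
}
```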
- Here are the URL-decoded contents of `/opt/installer` from inside the ignition configuration above. Notice how it uses the `os=installed` parameter to pull a "normal" boot configuration specific to this machine for future boots after it has been installed to disk.
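The script boils down to something like this sketch, following the standard `coreos-install` workflow (hostname, channel, and MAC are assumptions):

```
#!/bin/bash -ex
curl "http://deploy.lab:8080/ignition?os=installed&mac=XX:XX:XX:XX:XX:XX" -o ignition.json
coreos-install -d /dev/sda -C stable -i ignition.json
udevadm settle
systemctl reboot
```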
- Also, notice that the user available via SSH during that first boot/installation is named `debug` and uses the same SSH key as the permanent `core` user that will be available after the installation completes and the system reboots. This is really handy if you are troubleshooting why the installation is failing or want to watch the process as it goes. Note that it's only available for a few minutes on quick systems, since the installation completes so quickly.
- At this point, CoreOS Container Linux has been installed to `/dev/sda`, a user named `core` has been created with an SSH key set, and an ignition configuration has configured its systemd units. This is where Matchbox/Ignition stop and normal SSH-based administration can take over.
Provisioning Infrastructure Configuration
There are several components that run on the `deploy.lab` system that all need to work in concert for the above process to be successful:
- Matchbox - The CoreOS-provided container that handles the web serving of grub, CoreOS, and Ignition templates provided by the Tectonic installer.
- DHCP, DNS, TFTP, and grub network boot images and configuration - handled by a custom container built around `dnsmasq`.
I hand-installed CentOS 7.3 (minimal install) on the `deploy.lab` system and ensured that it had a recent version of Docker, with SSH key authentication for the `admin` user.
Installing and Configuring the Matchbox Container on deploy.lab
The matchbox documentation for running via `docker` is a bit misleading, as several things must be completed before actually running the container.
First, create the `matchbox` user and create/own the `/var/lib/matchbox` directory, where it will keep all the assets and profiles.
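Something like the following, run with root privileges, covers it (a bare system account is sufficient; exact flags are a matter of taste):

```
sudo useradd -U -M matchbox
sudo mkdir -p /var/lib/matchbox/assets
sudo chown -R matchbox:matchbox /var/lib/matchbox
```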
Next, download the `matchbox` package, verify its signature, and untar it:
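For example, for a hypothetical release version (substitute the current matchbox release tag):

```
wget https://github.com/coreos/matchbox/releases/download/v0.6.1/matchbox-v0.6.1-linux-amd64.tar.gz
wget https://github.com/coreos/matchbox/releases/download/v0.6.1/matchbox-v0.6.1-linux-amd64.tar.gz.asc
gpg --verify matchbox-v0.6.1-linux-amd64.tar.gz.asc matchbox-v0.6.1-linux-amd64.tar.gz
tar xzvf matchbox-v0.6.1-linux-amd64.tar.gz
cd matchbox-v0.6.1-linux-amd64
```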
Run a script to grab the version(s) of CoreOS/Container Linux into the current directory, then copy them into the assets directory so they are available via `matchbox` on its HTTP port.
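The matchbox release ships a `get-coreos` helper script, used roughly like this (channel and version are placeholders):

```
./scripts/get-coreos stable 1353.7.0 .
sudo cp -r coreos /var/lib/matchbox/assets/
```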
Drop out of the `matchbox` installation directory and run it via `docker`:
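A `docker run` along these lines works as a sketch (the matchbox image historically lived at quay.io/coreos/matchbox; the flags shown are assumptions):

```
docker run -d --name matchbox --net=host \
  -v /var/lib/matchbox:/var/lib/matchbox:Z \
  quay.io/coreos/matchbox:latest \
  -address=0.0.0.0:8080 -log-level=debug
```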
Finally, verify that `matchbox` is running and able to serve up your downloaded CoreOS image(s). If you see this, you should be good to go:
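A quick check from any machine on the lab network; the assets listing should include the version directory you copied in above (hostname is an assumption):

```
curl http://deploy.lab:8080/assets/coreos/
```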
Installing and Configuring the DNSMasq (DNS, DHCP, TFTP, Grub) Container on deploy.lab
It’s easiest to grab a copy of the repo and build your own docker image locally.
Edit the files in the `files` directory as needed. Most changes are IP addresses, MAC addresses, and hostnames for your environment:
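The heart of the container is a `dnsmasq.conf` along these lines (addresses, ranges, and grub image paths are placeholders for my lab, not the repo's actual contents):

```
# DNS for the lab zone
domain=lab
address=/deploy.lab/192.168.1.2

# DHCP
dhcp-range=192.168.1.100,192.168.1.200,30m
dhcp-option=option:router,192.168.1.1

# TFTP serving the grub netboot images
enable-tftp
tftp-root=/var/lib/tftpboot

# BIOS clients get grub's PXE core image, UEFI clients the EFI one
dhcp-match=set:efi64,option:client-arch,7
dhcp-boot=tag:!efi64,boot/grub/i386-pc/core.0
dhcp-boot=tag:efi64,boot/grub/x86_64-efi/core.efi
```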
Finally, build and run the image:
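Assuming the repo was cloned into the current directory (the image name is arbitrary; `--net=host` and `NET_ADMIN` let dnsmasq answer DHCP on the real LAN):

```
docker build -t lab/dnsmasq .
docker run -d --name dnsmasq --net=host --cap-add=NET_ADMIN lab/dnsmasq
```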
Running the Tectonic Installer
With the above in place and a free license from CoreOS for Tectonic, you can now follow the Tectonic Baremetal with Graphical Installer guide, having satisfied the prerequisites, with one exception.
The Tectonic installer will walk you through several steps of supplying configuration and arrive at a point where it instructs you to "power on your systems" that are to be baremetal booted from the network, but this won't work out-of-the-box. The details of what prevents `grub` from working by default are in this github issue. The good news is that there is a simple workaround. On the `deploy.lab` system, this is the default profile that the Tectonic GUI installer places; note its `args` section. When a system boots from the network and pulls the `/grub?mac=XX:XX:XX:XX:XX:XX` configuration, this `args` list is dropped directly into the kernel line. However, the `mac:hexhyp` variable is specific to the `ipxe` boot environment.
For `grub2`, the equivalent variable differs slightly. To fix this, we need to make some customizations to the `groups` configuration files. I chose to make one for each system, based on the `mac` address selector. Notice that I now reference renamed profiles:
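A per-system group file looks roughly like this (the IDs, profile name, and MAC address are examples, not the installer's actual output):

```json
{
  "id": "node1",
  "name": "worker node1",
  "profile": "tectonic-worker-grub",
  "selector": {
    "mac": "52:54:00:aa:bb:cc"
  }
}
```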
We also need to create those renamed profiles. Notice the variable `$net_efinet0_dhcp_mac` for UEFI/Mac hardware and the `$net_default_mac` variable for BIOS-booting hardware. Also notice that I made another unique ignition template to install to `/dev/sdb` instead of `/dev/sda` for the Mac.
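Such a renamed profile looks roughly like this for a BIOS-booting system, with the grub-friendly variable substituted into `args` (kernel version, paths, and hostname are placeholders):

```json
{
  "id": "tectonic-worker-grub",
  "boot": {
    "kernel": "/assets/coreos/1353.7.0/coreos_production_pxe.vmlinuz",
    "initrd": ["/assets/coreos/1353.7.0/coreos_production_pxe_image.cpio.gz"],
    "args": [
      "coreos.config.url=http://deploy.lab:8080/ignition?mac=$net_default_mac",
      "coreos.first_boot=yes",
      "console=tty0"
    ]
  }
}
```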
Once the above changes have been made, you should be able to successfully PXE/netboot the systems and continue on with the final portion of the Tectonic installer. If you run into issues, double-check the formatting of your profiles and groups, and try hitting the `/ignition` endpoint with the proper parameters to see what configurations are being provided to your systems.
Congratulations! You should now be able to hit the web UI of the Tectonic Console and `ssh` into the systems using the `core` user and the SSH key you supplied. I hope this helps you understand what's going on behind a fairly sophisticated and easily customizable baremetal Container Linux and Kubernetes installation system.