Building a Kubernetes Cluster with Pis
What happens when you buy too many Raspberry Pis? You start to run out of things to run on them – this is the problem with being a massive technology fanatic. In my home, I had 3, each running a separate service, and 2 more arriving in the mail to play around with.
Why Do This?
So, the first logical question to ask is: why do this? Well, I had three Raspberry Pis scattered around my house doing various things. One was acting as a Pi-Hole, one was self-hosting a cloud storage solution and another was hosting a simple web server.
I thought to myself that there must be a way to combine the power of multiple Raspberry Pis – that’s when I stumbled upon a project called the OctaPi, and I knew I could do this.
But, instead of doing it the way outlined in the article, I decided I wanted to do something different. For me, I am a HUGE networking nerd. I love playing around with switches, routers, firewalls and other networking devices.
Recently, I read about many companies using containerization technology such as Docker and Kubernetes and thought it would be a good idea to try and run something like this across multiple Pis.
Preparing the Pis
The first step was to install an operating system on each individual Pi. Now, because I had so many Raspberry Pis, it was going to be a pain to install Raspbian on each, wait for boot, log in, configure network info, enable SSH and so on and so forth… you get the idea.
However, I stumbled across something called a headless install which allowed me to configure some of this without any human input – amazing, right? To do this, I used the Raspberry Pi Imager and its advanced options to give each Pi a hostname and enable SSH, so that I could later SSH in from my main Windows 10 PC via the command line.
This was done for every Pi in the cluster – 4 times for me. The only thing that changed was the hostname (with one being masterpi.local and the others being child1 through to child3). I did not configure the WiFi settings because each Pi had a direct Ethernet connection to an unmanaged switch that was connected to my ISP router.
Once the microSD card was flashed, the card was plugged into the Pi, and it magically installed itself with SSH capabilities and a configured hostname. Repeat 4 times and the Pis are ready for take-off.
There is a slight caveat, however. In order to run Kubernetes on a Pi, an additional configuration change needs to be made. For this, the microSD card needed to be taken back out and plugged into another PC to edit some files. One of these files is named “cmdline.txt”. Its contents must stay on a single line, so the following parameters needed to be appended to the end of that line, separated by spaces:
cgroup_memory=1 cgroup_enable=memory
In addition, if you know anything about networking, you will know that in a typical home network the router hands out addresses via DHCP, so a device is not guaranteed the same IP every time it boots up, meaning its address could change after a restart – not good!
To fix this, cmdline.txt accepts further options. I appended one more parameter to the same line, which configured the network and set the hostname (just to make sure it was set). The parameter was:
ip=192.168.0.150::192.168.0.1:255.255.255.0:masterpi:eth0:off
This configured the current Pi – masterpi in this case – with a static IP address of 192.168.0.150 with a /24 subnet mask which makes sure it joins my current home network to communicate with my main PC. It also sets the default gateway as the ISP router so it has internet connectivity.
The last three fields set the hostname, the interface to configure (in this case eth0, since it was a wired connection) and, finally, autoconfiguration: the off option tells the Pi not to attempt any automatic configuration such as DHCP.
Once again, I repeated this for all the Pis changing the IP addresses and hostnames for each configuration.
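Repeating this by hand for four cards is easy to get wrong, so here is a minimal Python sketch of how the per-Pi parameter could be generated and appended while keeping cmdline.txt on one line. The function names are my own for illustration, and the gateway, netmask and interface defaults are simply the values used in this article.

```python
# Sketch: build the kernel "ip=" parameter and append the extra boot options.
# Field order: ip=<client-ip>:<server-ip>:<gateway>:<netmask>:<hostname>:<device>:<autoconf>

CGROUP_PARAMS = ["cgroup_memory=1", "cgroup_enable=memory"]

# Hostname -> static IP, as used in this article.
NODES = {
    "masterpi": "192.168.0.150",
    "child1": "192.168.0.151",
    "child2": "192.168.0.152",
    "child3": "192.168.0.153",
}


def build_ip_param(address, hostname, gateway="192.168.0.1",
                   netmask="255.255.255.0", device="eth0"):
    """Return the static-IP kernel parameter for one Pi."""
    # The empty second field is the unused boot-server IP; "off" disables autoconfiguration.
    return f"ip={address}::{gateway}:{netmask}:{hostname}:{device}:off"


def extend_cmdline(existing, address, hostname):
    """Append the extra options to the existing single-line cmdline.txt content."""
    parts = existing.split()  # also drops the trailing newline
    for param in CGROUP_PARAMS + [build_ip_param(address, hostname)]:
        if param not in parts:  # avoid duplicates if run twice
            parts.append(param)
    return " ".join(parts) + "\n"


for name, ip in NODES.items():
    print(name, "->", build_ip_param(ip, name))
```

Passing the current contents of cmdline.txt to extend_cmdline returns the new single line to write back to the card.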
Sadly, there was one other file that needed additional configuration, called “config.txt”. Inside this file, a line needed to be added to run the Pi in 64-bit mode instead of its default 32-bit mode (only available on Pis that are 64-bit capable).
With this set up, every Pi was now successfully running with its specified IP and hostname. For me, the following information was set:
masterpi (192.168.0.150)
child1 (192.168.0.151)
child2 (192.168.0.152)
child3 (192.168.0.153)
However, before installing and running Kubernetes on the Pis, one more configuration change was necessary. Kubernetes has trouble with the newer nftables-based firewall backend that modern Linux distros use by default. To fix this, I flushed the existing rules and switched to the older “iptables-legacy” backend via the following commands:
sudo iptables -F
sudo update-alternatives --set iptables /usr/sbin/iptables-legacy
sudo update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy
Before diving head first into the mind-blowing field of containerization, it’s vital to make sure everything is set up correctly and working. First, I added the hostnames to the hosts file on Windows (its equivalent of /etc/hosts, found at C:\Windows\System32\drivers\etc\hosts), which enabled me to refer to the devices via their hostnames. Then, just for fun, I played around with PowerShell and whipped up a simple ping loop script.
This simple script iterates through a text file that contains 4 lines – one for each hostname. Once it grabs a line, it pings that hostname once. If the host is up, it reports it as online. If it is down, it reports it as not reachable.
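For anyone who wants to try the same idea without PowerShell, here is a rough Python equivalent of the loop described above. It is a sketch rather than my original script: the function names are made up, and it assumes the system ping command exits with 0 when a reply comes back, which is not always reliable on Windows.

```python
import platform
import subprocess


def read_hosts(path):
    """Return the non-empty lines of a text file, one hostname per line."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def ping_once(host):
    """Send a single echo request; True if the ping command exits with 0."""
    # Windows uses -n for the packet count, Linux and macOS use -c.
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    result = subprocess.run(
        ["ping", count_flag, "1", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def check_hosts(hosts, ping=ping_once):
    """Ping each host once and map hostname -> reachable (True/False)."""
    return {host: ping(host) for host in hosts}


def report(results):
    """Format one status line per host."""
    return [
        f"{host} is online" if up else f"{host} is not reachable"
        for host, up in results.items()
    ]
```

With the four hostnames saved in a file (hosts.txt is a hypothetical name), calling report(check_hosts(read_hosts("hosts.txt"))) returns one status line per Pi. The ping function is passed in as a parameter so the loop can be tried out without touching the network.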
Each Pi was up and running and contactable via its hostname, meaning our configuration was working!
How about diving into Kubernetes? Let’s go!
Installing Kubernetes
The next step was installing Kubernetes. However, the normal install for Kubernetes (known as K8S) is very large and resource intensive. This means it can be very difficult to run it on a low power system such as a Raspberry Pi.
To solve this potential problem, there is a lightweight version of Kubernetes called K3S maintained by a company called Rancher which is essentially a trimmed down version but maintains much of the same functionality.
To install K3S, the cURL command is used with the following parameters:
-s for silent mode
-f for failing silently on HTTP errors
-L to redo the request if a redirect occurs
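The downloaded script is then piped into the shell, with an environment variable placed in front of sh that makes the generated kubeconfig file readable without sudo. Because no master URL is supplied, the installer configures this machine as the master node. The full one-liner looks like the following:
curl -sfL https://get.k3s.io | K3S_KUBECONFIG_MODE="644" sh -s -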
Once finished, Kubernetes is now installed on the masterpi! The system is now configured as the master node. But how do we check this? Very easily by using the “kubectl get nodes” command which shows any nodes we have in the cluster.
Before adding any other nodes to this cluster under the master’s control, we need to grab a token that will allow the child nodes to authenticate to the master node successfully – think of it as a similar mechanism to SSH.
When we installed K3S, it created a token in the /var/lib/rancher/k3s/server directory (in a file named node-token), which we need to copy.
Once we have the token, we can SSH into a child node and run the same cURL command as before but this time adding some additional parameters such as the authentication token, the URL/IP of the master node with the 6443 port specified and the name we want to give this specific node. This is done via the following command:
curl -sfL https://get.k3s.io | K3S_TOKEN="[TOKEN]" K3S_URL="https://192.168.0.150:6443" K3S_NODE_NAME="child1" sh -
After running this, the child node will be added to the cluster under the control of the master node we created earlier. Simply do this for every Raspberry Pi you want as part of the cluster.
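Since only the node name changes between Pis, the join command can be generated instead of retyped. The sketch below is one way to do that in Python. The helper name is hypothetical and “[TOKEN]” stays a placeholder, exactly as above.

```python
import shlex

INSTALL_URL = "https://get.k3s.io"


def build_join_command(token, master_ip, node_name, port=6443):
    """Return the shell one-liner that joins one child node to the cluster."""
    env = {
        "K3S_TOKEN": token,
        "K3S_URL": f"https://{master_ip}:{port}",
        "K3S_NODE_NAME": node_name,
    }
    # shlex.quote keeps values with special characters safe for the shell.
    assignments = " ".join(f"{key}={shlex.quote(value)}" for key, value in env.items())
    return f"curl -sfL {INSTALL_URL} | {assignments} sh -"


# One command per child node; paste each into an SSH session on that Pi.
for child in ("child1", "child2", "child3"):
    print(build_join_command("[TOKEN]", "192.168.0.150", child))
```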
Upon completion of this, running the “kubectl get nodes” command once again reveals that we have 4 total nodes – 1 master and 3 children.
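If you would rather check this from a script than by eye, kubectl can print the same information as JSON with “kubectl get nodes -o json”. The following Python sketch reads that output and reports which nodes are Ready. The sample data is hand-written for illustration and is not real output from my cluster.

```python
import json


def node_status(kubectl_json):
    """Map node name -> True if its 'Ready' condition has status 'True'."""
    status = {}
    for item in json.loads(kubectl_json)["items"]:
        conditions = item["status"].get("conditions", [])
        ready = any(
            cond["type"] == "Ready" and cond["status"] == "True"
            for cond in conditions
        )
        status[item["metadata"]["name"]] = ready
    return status


# Trimmed, hand-written sample in the shape kubectl returns (not real output).
SAMPLE = json.dumps({
    "items": [
        {"metadata": {"name": "masterpi"},
         "status": {"conditions": [{"type": "Ready", "status": "True"}]}},
        {"metadata": {"name": "child1"},
         "status": {"conditions": [{"type": "Ready", "status": "False"}]}},
    ]
})

print(node_status(SAMPLE))
```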
The cluster is complete! The next step is to start installing services. For this project, I also decided to add a GUI to manage Kubernetes as an added bonus (who doesn’t like a pretty GUI?). The GUI is called Rancher – provided by the same company that provides the K3S installation.
Installing Rancher
To run Rancher, we first need a Linux machine on our network to act as the server. For this, I created an Ubuntu VM running version 18.04 LTS as it was the most stable release. Before installing Rancher on the Ubuntu machine, a directory needs to exist: /etc/rancher/rke2.
Inside this directory, a YAML file must be created which specifies two things – a token which is essentially a password and the IP address of the server that will be running Rancher (i.e. the Ubuntu VM IP).
Once the file exists, cURL can once again be used to download the Rancher install script and pipe it into the shell to be run immediately:
curl -sfL https://get.rancher.io | sh -
Once finished, the systemctl utility can be used to make sure Rancher service is enabled on boot every time and to also start the service right now in case it is not running.
In order to use Rancher, we need some login credentials. The “rancherd reset-admin” command is used to generate some credentials for the first login:
Once generated, navigating to the IP of the Ubuntu machine on port 8443 in a browser reveals a simple login page – it works.
Once logged in, a new password must be set. Additionally, we can tell Rancher that we want to manage multiple clusters – this is the Pi cluster we have created.
Next, we get taken to a page that shows our clusters. Currently, there is only one cluster which is the local cluster that Rancher creates when installed – our Pis are not there. We need to manually add the Pi cluster.
To do this, we click the “Add Cluster” button. Then, Rancher prompts us for a cluster type – in this scenario, we choose “Other Cluster”.
Once selected, give it any name you like – in this case, it is called Pies.
Then, Rancher will tell us to run a command. Depending on how we have set it up, there are two options. If a trusted SSL certificate is installed on the system running Rancher, you can use the first command. However, in my case, the Ubuntu system has no such certificate installed, which means that cURL needs to be used with the --insecure option to download a YAML file despite the untrusted certificate and pass it to kubectl (i.e. the Kubernetes command-line tool).
Once we run the command, it creates everything for us – we like that!
Our Pis should be added to Rancher. Going back to the dashboard reveals the clusters once again, but this time the Pies cluster is successfully imported and showing up! But don’t celebrate just yet, it has 0 nodes and no CPU or RAM information available.
This is due to the default options that Rancher uses. When Rancher pulls down the agent image, it pulls one built for a specific architecture – in this case, 64-bit x86 (amd64). If you know about Raspberry Pis, you know that they don’t run x86 processors but instead use the ARM architecture, which means there is a mismatch.
To fix this, we need to tell Rancher to pull down the ARM64 image instead to work with our cluster. Inside the option to edit the cluster, there is a field called “agentImageOverride” where we can specify which image to use, and with it the architecture.
Inside this box, we can specify the following to use the ARM64 files for Rancher:
rancher/rancher-agent:v2.5.8-linux-arm64
Once saved, going back to the dashboard will reveal that the Pies cluster is now active and is reported as having 4 nodes – our Pis are alive!
Finally, with everything sorted, we can navigate to the dashboard and get a GUI that reports everything we ever need – from the cores and memory usage to the number of services and deployments currently active.
Creating a Deployment
In order to ensure that Kubernetes and Rancher are installed correctly, I ran a quick test before deploying a Pi-Hole which was my end goal. For the test, I decided to deploy multiple Nginx web servers across the cluster.
Some quick theory – Kubernetes uses YAML files to create what are called deployments. For the Nginx deployments, the YAML file looks like the following:
Think of a YAML file as many key:value pairs like in programming. For example, there is a metadata section which specifies that the name is nginx. Under the spec section, it defines what are known as replicas – this is simply how many copies you want running. In this case, I said 6 nginx web servers.
Finally, there is a section called containers meaning we are deploying some containers – think Docker. Under the image key, there is a value of nginx:stable which is just the Docker image name that will be pulled from the Docker Hub.
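To make that structure concrete, here is a sketch that builds an equivalent Deployment manifest in Python and prints it as JSON, which kubectl accepts just like YAML. It is reconstructed from the description above rather than copied from my file: the name, replica count and image come from the text, while the app: nginx label and the container port are assumptions.

```python
import json


def nginx_deployment(replicas=6, image="nginx:stable"):
    """Build a Deployment manifest matching the description in the text."""
    labels = {"app": "nginx"}  # assumed label, used to tie pods to the deployment
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "nginx", "labels": labels},
        "spec": {
            "replicas": replicas,  # how many copies of the pod to run
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": "nginx",
                            "image": image,  # pulled from Docker Hub
                            "ports": [{"containerPort": 80}],  # assumed port
                        }
                    ]
                },
            },
        },
    }


print(json.dumps(nginx_deployment(), indent=2))
```

Saving the printed output to a file such as nginx.json (a hypothetical name) gives something kubectl can apply directly.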
In order to deploy this, the “kubectl apply -f” command is used followed by the name of the YAML file:
kubectl apply -f nginx.yaml
Once run, the web servers will be running across the cluster on different Raspberry Pis – how cool is that? To see this in action, run the “kubectl get pods -o wide” command to get a ton of information. In this case, there is 1 server running on child1, 2 servers running on child2 and 3 servers running on child3 – I SWEAR I DID NOT PLAN THAT!
As always in cyber security, don’t celebrate too soon as there is a problem. Looking at the IP addresses that have been assigned, they are not in the IP address range of my home network (192.168.0.0/24) which means they are not reachable. The 10.42.0.0/16 network is Kubernetes specific meaning only devices inside the Kubernetes cluster have access to the web servers – kind of useless right?
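The mismatch is easy to demonstrate with Python’s ipaddress module. In this small sketch the two network ranges come from the text, while 10.42.1.5 is a made-up pod address used only for illustration.

```python
import ipaddress

HOME_LAN = ipaddress.ip_network("192.168.0.0/24")
POD_NETWORK = ipaddress.ip_network("10.42.0.0/16")


def on_home_lan(address):
    """True if the address belongs to the home network range."""
    return ipaddress.ip_address(address) in HOME_LAN


def is_pod_address(address):
    """True if the address belongs to the Kubernetes pod range."""
    return ipaddress.ip_address(address) in POD_NETWORK


# 192.168.0.151 is child1; 10.42.1.5 is a hypothetical pod IP.
for addr in ("192.168.0.151", "10.42.1.5"):
    print(addr, "| home LAN:", on_home_lan(addr), "| pod network:", is_pod_address(addr))
```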
Luckily, there is a solution – a Node Port. A Node Port is a type of service that exposes an application on a fixed port of every node. Think of it like a port forward. In this example, I will add a rule that says traffic arriving at a node on port 31111 gets translated to port 80 – the port the Nginx web server is listening on.
In action, we create another YAML file that specifies roughly the same stuff. However, in this file, the selector states what this will affect – in this case, only affect the nginx apps. Then, it simply states the port the service is running on – port 80 – and then the nodePort which is the port we type into our browser which gets translated – port 31111.
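As a sketch of what such a file contains, the Python below builds a NodePort Service manifest and prints it as JSON for kubectl. The port numbers and the nginx-nodeport name are the ones used in this article. The app: nginx selector is an assumption about how the pods are labelled, and the range check reflects the default Kubernetes NodePort range of 30000 to 32767.

```python
import json


def nginx_nodeport(port=80, node_port=31111):
    """Build a NodePort Service that forwards node_port on every node to port."""
    if not 30000 <= node_port <= 32767:
        raise ValueError("nodePort must be within the default range 30000-32767")
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": "nginx-nodeport"},
        "spec": {
            "type": "NodePort",
            "selector": {"app": "nginx"},  # assumed pod label
            "ports": [
                {
                    "protocol": "TCP",
                    "port": port,            # port of the service inside the cluster
                    "targetPort": port,      # port the Nginx container listens on
                    "nodePort": node_port,   # port typed into the browser
                }
            ],
        },
    }


print(json.dumps(nginx_nodeport(), indent=2))
```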
Finally, we deploy this configuration using the “kubectl apply -f” command once again. Once deployed, the “kubectl get services” command is used to show any services running. In this case, there is an nginx-nodeport service running which translates port 31111 to port 80, as seen under the PORT(S) column.
To test this configuration, we can simply navigate to the IP of any Pi in the cluster and specify port 31111 (the Node Port is opened on every node, not just the ones running an Nginx web server). If everything works, we should see the default Nginx webpage presented to us.
Holy ****! It works. We now have 6 nginx web servers running across 4 Raspberry Pis using Kubernetes to deploy it all with port forwarding so they are accessible on our local home LAN.
Now, what if we got some load balancing thrown in there as well? Hmm….
Load Balancing Example
By default, the previous Nginx deployment along with its Node Port was load balancing. However, as a fun side project, we can actually visualize it to prove it actually is load balancing – what if it’s lying to us?
To achieve this, we first create another YAML file that specifies what we want. In this case, we are deploying the rancher-demo docker image running on port 8080 via the TCP protocol along with some other information:
Once created, the “kubectl apply -f helloworld.yaml” command is executed. This deploys however many replicas were defined – above I used 4 but later changed it to 12 to better show off load balancing.
Viewing the deployments again, we see the hello-world deployments:
As a side-note, these deployments are also viewable in the Rancher dashboard if you prefer looking at something pretty (I know I do!):
Just to reiterate, these deployments have a Kubernetes private IP. If we wanted to access the sites running this visualization demo, what do we need to do? Yep, add another nodePort/port forwarding rule.
This time, I will use Rancher to do it. First, I create a new service and some options show up:
Remember, we want a Node Port. Clicking that option reveals a simple page where we enter some information. First, a name which I called “hello-node”. Next, the listening port which this service is running on and the target port. For this visualization, it runs off port 8080. Finally, the node port which is the number we want to add to the end of the URL to access this page – i.e. 31115.
Finally, navigating to the Selectors section, we must specify which key:value pair will receive this node port configuration. Here, the key is app, and the value is hello-world – the same value as specified in the YAML file before deployment.
Once all the configuration is finished, navigating to any IP on the Pi cluster that is running the hello-world container with port 31115 reveals a visualization of load balancing across multiple instances of a service/deployment!
And that was just a demo to get familiar. How about installing a useful application on this cluster? Maybe something that could help block annoying advertisements and trackers? Pi-Hole anyone?
Installing a Pi-Hole on Kubernetes
A Pi-Hole acts like a DNS sinkhole – it attempts to block any ads and trackers in a blacklist from entering your home network. If the name does not give it away, it is made for a Raspberry Pi meaning we can run it on a Kubernetes Pi cluster (at least in theory).
To do this, on the beautiful looking Rancher dashboard, there will be an option to deploy what is known as a “workload” which is basically another name for an application.
Once on the page, we can deploy a workload. There are a multitude of options that need to be set: first a name, then the Docker image to use, and finally a namespace, which I left as default.
The most important aspect is the Port Mapping section. In this section, there are 3 port mappings being made. The first two are for DNS traffic on port 53, one for TCP and one for UDP. The third is an entry for HTTP – this is so we can log in to the Pi-Hole itself using a web browser.
Before going further, there is a massive problem that I encountered. Because we are using Pi-Hole as the DNS, we temporarily don’t have DNS to pull down the Docker Image – a huge issue. The workload we are deploying is becoming the DNS while in the background, the Ubuntu machine does not know how to resolve any hostnames.
The first solution that came to mind was to edit the resolv.conf file and simply add a static entry pointing to our own machine. However, the file gets rewritten every single time dynamically – meaning it regenerates a new file.
To fix this, there is a package to install called resolvconf.
Then, once installed, we start it and enable it for future use using systemctl:
Next, there is a configuration file located in /etc/resolvconf/resolv.conf.d/head which we can edit and list the nameserver to use temporarily – in this case, 9.9.9.9
Finally, the service is restarted. Looking inside the /etc/resolv.conf file reveals the nameserver entry is in there permanently and will stay there upon reboot or restart meaning we have permanent DNS to download the Docker Image while Pi-Hole installs itself.
The next hurdle was to configure the Pi-Hole by adding some environment variables. There are practically limitless variables you can add, but some of the most important ones I added were the following:
TZ=Europe/London (sets the timezone)
DNS1=9.9.9.9 (primary DNS server to use, Quad9)
DNS2=149.112.112.112 (backup DNS server, Quad9 backup)
ServerIP=192.168.0.39 (sets IP of the Pi-Hole)
Next step is setting up volumes. These are important so we can persist the configuration between reboots. In order for this to work, we need to create a directory called pihole on the Ubuntu machine for the volumes to be stored (in my case, /home/james/pihole).
Then, we create two volumes – one called /etc-pihole and another called /etc-dnsmasq.d. Once created, we simply point these mount points to the path we just created (/home/james/pihole).
Almost there! Next, we choose the scaling/upgrade policy. A service like Pi-Hole won’t normally be scaled. A good option to choose is to kill all pods and start new ones – this stops outdated or broken Pi-Holes from running alongside the new one when an upgrade occurs.
Finally, we add some DNS name servers for Rancher to use. In this case, the first DNS is the Pi-Hole itself (127.0.0.1) and as a backup, the Quad9 DNS server in case the Pi-Hole fails (which it probably will at some point, this is IT after all).
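For readers who prefer manifests to clicking through forms, the sketch below shows roughly how the settings chosen so far could look as a Deployment, built in Python and printed as JSON. It is an approximation, not an export from Rancher. The environment variables, ports and name servers come from the text. The pihole/pihole image name, the app: pihole label, the use of the “Recreate” strategy for “kill all pods and start new ones” and the dnsPolicy of None are my assumptions. Volumes are left out to keep it short.

```python
import json

# Environment variables listed in the text.
ENV = {
    "TZ": "Europe/London",
    "DNS1": "9.9.9.9",
    "DNS2": "149.112.112.112",
    "ServerIP": "192.168.0.39",
}


def pihole_deployment(image="pihole/pihole"):
    """Build a single-replica Pi-Hole Deployment (assumed image name)."""
    labels = {"app": "pihole"}  # assumed label
    container = {
        "name": "pihole",
        "image": image,
        "env": [{"name": key, "value": value} for key, value in ENV.items()],
        "ports": [
            {"name": "dns-tcp", "containerPort": 53, "protocol": "TCP"},
            {"name": "dns-udp", "containerPort": 53, "protocol": "UDP"},
            {"name": "http", "containerPort": 80, "protocol": "TCP"},
        ],
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "pihole", "labels": labels},
        "spec": {
            "replicas": 1,
            # Recreate stops the old pod before starting the new one.
            "strategy": {"type": "Recreate"},
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Use only the listed name servers: Pi-Hole first, Quad9 as backup.
                    "dnsPolicy": "None",
                    "dnsConfig": {"nameservers": ["127.0.0.1", "9.9.9.9"]},
                    "containers": [container],
                },
            },
        },
    }


print(json.dumps(pihole_deployment(), indent=2))
```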
After all of that, going back to the Dashboard should reveal the Pi-Hole as active and we should be able to click on the 8001/tcp highlighted link underneath to access the web portal:
You know where this is going – another problem! We need to log in to gain more information. By default, Pi-Hole will generate a random password to use. However, because we deployed it through Kubernetes, we don’t know the password.
To solve this we can open a shell on the Pi-Hole itself through Rancher and change the password that way using the “pihole -a -p” command and entering a new password:
Once reset, we log in and boom! We have an (almost) working Pi-Hole. Yes, there is another problem – I assure you this is normal in IT and it never gets any better.
Looking at the top left, there is a status of unknown with an amber dot – not good. This means there is some sort of problem that we need to solve. I bashed my head against the wall for hours trying to work out what it was. It’s such a simple solution, trust me.
To solve it, I looked at the logs that got generated when the Pi-Hole first started which looked like:
At first glance, it looks like nothing. However, actually reading it (yes, I’m guilty of not reading logs) it seems as if the account trying to run the Pi-Hole does not have root privileges. I did say the problem was simple!
You know the drill. To solve this, I went down to the Security tab in Rancher and gave the user running the Pi-Hole the ability to escalate privileges to root when needed but did NOT run it as root all the time – that would be bad, bad security practice.
Never run a service as root unless needed!
Looking back at the Pi-Hole dashboard, the dot turned green and is now active meaning the Pi-Hole is running perfectly like no problems ever happened.
All that is left to do now is modify the network configuration on a host machine on my network and use the IP address of the Pi-Hole as my DNS server and the Pi-Hole should begin to intercept traffic.
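Before repointing a whole machine, it can be handy to check that the Pi-Hole actually answers DNS queries. The Python sketch below hand-builds a minimal DNS query for an A record and sends it over UDP using only the standard library. The function names are my own, and example.com is just a stand-in hostname.

```python
import socket
import struct


def build_query(hostname, query_id=0x1234):
    """Build a minimal DNS query packet asking for the A record of hostname."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label is prefixed with its length, ended by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 = A record, QCLASS 1 = IN (internet).
    return header + qname + struct.pack("!HH", 1, 1)


def server_answers(server_ip, hostname="example.com", timeout=2.0):
    """Return True if the DNS server sends back a reply with a matching ID."""
    query = build_query(hostname)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server_ip, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable hosts
            return False
    # The first two bytes of a reply echo the query ID.
    return len(reply) >= 2 and reply[:2] == query[:2]
```

Calling server_answers("192.168.0.39") from a machine on the home network should return True once the Pi-Hole is listening on port 53.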
And it works! You can see how I was frantically on Reddit trying to solve my issues at 6:48pm on a Sunday night. The work never ends but damn it’s fun.
Conclusion
Part of the reason I wanted to undertake this project was to learn more about containerization and automation/orchestration as that is where the industry is going to be heading soon (at least I believe so) and so being prepared for it is a must in my book.
Anyways, the project was completed! Although this project was infuriating at times and I wanted to chuck the Pis out the window, sticking with it taught me so much about Kubernetes and it was truly a rewarding experience that will help me be prepared for the future.
Thanks for reading and have a good day/night wherever you are!