Background: With more than 5,000 developers working on the same project simultaneously, the main challenge was coping with the high rate of builds arriving every minute. We also had to manage resource consumption on the Jenkins machine and the nodes: more than 4,000 nodes run across five Jenkins controllers hosted on Kubernetes.
Goals: Making the rollout of our cellular platform software quick and easy, while automating test runs.
Solution & Results: We moved the entire node mechanism to Kubernetes. Since Jenkins was already running on Kubernetes, we came up with a solution that creates nodes on demand according to the build queue and load. As a result, builds no longer sit in the queue in a 'pending' state.
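To make the idea of sizing nodes from the build queue concrete, here is a minimal Python sketch. It is not our actual mechanism: the `ClusterState` fields, the `agents_to_provision` function, and the `max_agents` cap are all hypothetical names chosen for illustration, and the rule shown (one new node per queued build that no idle node can absorb, up to a cap) is only one plausible policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterState:
    """Hypothetical snapshot of the build farm at one point in time."""
    queued_builds: int  # builds waiting in the queue
    idle_agents: int    # nodes that are up but not running a build
    busy_agents: int    # nodes currently running a build


def agents_to_provision(state: ClusterState, max_agents: int) -> int:
    """Return how many new nodes to request for the current queue."""
    # Idle nodes can absorb part of the queue before anything new is started.
    unserved = max(0, state.queued_builds - state.idle_agents)
    # Never exceed the (assumed) cluster-wide cap on total nodes.
    headroom = max(0, max_agents - (state.idle_agents + state.busy_agents))
    return min(unserved, headroom)


if __name__ == "__main__":
    state = ClusterState(queued_builds=12, idle_agents=2, busy_agents=30)
    print(agents_to_provision(state, max_agents=40))  # prints 8
```

In this example, 10 of the 12 queued builds have no idle node to run on, but only 8 more nodes fit under the cap of 40, so 8 are requested.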
We use scalable resources for the Kubernetes clusters and for hosting Jenkins, so resources such as Persistent Volume Claims (PVCs) never run short under heavy load. Monitoring was also easy to set up, since it integrates nicely with Prometheus and Grafana, whose dashboards and alerting tools help us stay ahead of unforeseen problems.
Jenkins speeds up development, which in turn helps the software work better with the hardware. It is the secret to our successful rollouts because it gives us the ability to run extensive automated tests.
Jenkins has awesome plugins. One of the most important is the Kubernetes plugin, which lets builds run in containers in a scalable way. We solved the problem of the ever-growing number of build nodes by creating a mechanism that spins up nodes on demand on Kubernetes using custom-built images.
The Throttle Concurrent Builds plugin also came to the rescue before we implemented our recent nodes-on-Kubernetes setup, limiting the incoming builds for more efficient throughput.
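The general idea behind throttling concurrent builds can be sketched in a few lines of Python. This illustrates the concept only and is not how the plugin itself is implemented: `run_throttled`, `max_concurrent`, and the sleep that stands in for a build are all assumptions made for the example.

```python
import threading
import time


def run_throttled(build_ids, max_concurrent=3, work_seconds=0.01):
    """Run stand-in builds with at most max_concurrent in flight.

    Returns the peak number of builds observed running at the same time.
    """
    slots = threading.BoundedSemaphore(max_concurrent)
    lock = threading.Lock()
    running = 0
    peak = 0

    def build(build_id):
        nonlocal running, peak
        with slots:  # blocks while max_concurrent builds are already running
            with lock:
                running += 1
                peak = max(peak, running)
            time.sleep(work_seconds)  # placeholder for the real build work
            with lock:
                running -= 1

    threads = [threading.Thread(target=build, args=(b,)) for b in build_ids]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak


if __name__ == "__main__":
    peak = run_throttled(range(10), max_concurrent=3)
    print(f"peak concurrent builds: {peak}")
```

However many builds arrive at once, the semaphore admits only `max_concurrent` of them at a time. The rest wait their turn instead of overloading the nodes.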
The results are better than expected: