Even with the rise of cloud services, many clients still opt to deploy and maintain their own solutions. This choice is driven by factors such as security requirements and cost-saving goals.
At first glance, maintaining such solutions in-house may appear relatively straightforward. For instance, one of our clients decided to run their own cluster of NGINX nodes to load-balance traffic to their backend cluster.
The NGINX cluster seemed like a simple thing to configure. However, as the workload grew, NGINX began struggling to keep up with incoming traffic. A review of the error logs pointed to an evident cause: the pool of available worker_connections was being exhausted, and connections were being dropped.
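For context, the connection capacity in question is governed by two directives. The values below are purely illustrative, not the client's actual configuration:

```nginx
# Illustrative values only - not the client's actual configuration.
worker_processes  4;          # number of worker processes

events {
    # Maximum simultaneous connections *per worker process*,
    # counting both client-side and upstream connections.
    worker_connections  1024;
}
```

When a worker exhausts its pool, NGINX reports it in the error log with a "worker_connections are not enough" message, which is what surfaced in this case.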
What surprised everyone was that the total capacity, worker_connections multiplied by the number of worker processes, was four times the number of concurrent client requests, which should have been more than sufficient.
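As a rough sanity check, the capacity math looks like this (with illustrative numbers, since the client's real figures are not shown here):

```python
# Illustrative numbers - not the client's actual configuration.
worker_processes = 4
worker_connections = 1024

# Theoretical ceiling on simultaneous connections for the node.
capacity = worker_processes * worker_connections  # 4096

# Per the case described, peak concurrent client requests were
# a quarter of that ceiling.
peak_requests = capacity // 4  # 1024

# On paper, a 4x headroom looks more than sufficient.
headroom = capacity / peak_requests
print(capacity, peak_requests, headroom)
```

The catch, as the next section shows, is that this ceiling only holds if connections are spread evenly across workers.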
At this stage, the client turned to our company to diagnose the issue. Our investigation revealed that NGINX has an architectural peculiarity that can lead to an uneven distribution of incoming connections among worker processes. As a result, one worker ended up carrying a disproportionate share of the load and exhausted its reserve of worker_connections.
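The unevenness stems from how workers compete to accept connections on a shared listening socket. One common mitigation, sketched below rather than presented as the exact change applied for this client, is the reuseport parameter of the listen directive (available on Linux since NGINX 1.9.1), which gives each worker its own listening socket and lets the kernel spread new connections across them:

```nginx
http {
    upstream backend {
        server 10.0.0.1:8080;   # hypothetical backend addresses
        server 10.0.0.2:8080;
    }

    server {
        # reuseport creates a separate listening socket per worker,
        # so the kernel distributes new connections across workers
        # instead of letting one worker win the accept race.
        listen 80 reuseport;

        location / {
            proxy_pass http://backend;
        }
    }
}
```

With this in place, no single worker's worker_connections pool bears the brunt of a traffic spike on its own.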
The client was both surprised by and appreciative of these findings, and they valued the depth of our technical investigation.