Load balancers distribute incoming client requests to computing resources such as application servers and databases. In each case, the load balancer returns the response from the computing resource to the appropriate client. Load balancers are effective at:

  1. Preventing requests from going to unhealthy servers
  2. Preventing overloading resources
  3. Helping to eliminate a single point of failure

Load balancers can be implemented with hardware (expensive) or with software such as NGINX or HAProxy.

Additional benefits include:

  • SSL termination - Decrypt incoming requests and encrypt server responses so backend servers do not have to perform these potentially expensive operations
  • Removes the need to install X.509 certificates on each server
  • Session persistence - Issue cookies and route a specific client’s requests to same instance if the web apps do not keep track of sessions

To protect against failures, it’s common to set up multiple load balancers, either in active-passive or active-active mode.

Load Balancing Algorithms

Dynamic

Dynamic load balancing uses algorithms that take into account the current state of each server and distribute traffic accordingly.

  1. Least connection Checks which servers have the fewest connections open at the time and sends traffic to those servers. This assumes all connections require roughly equal processing power.
  2. Weighted least connection Gives administrators the ability to assign different weights to each server, assuming that some servers can handle more connections than others.
  3. Weighted response time Averages the response time of each server, and combines that with the number of connections each server has open to determine where to send traffic. By sending traffic to the servers with the quickest response time, the algorithm ensures faster service for users.
  4. Resource-Based Distributes load based on what resources each server has available at the time. Specialized software (called an “agent”) running on each server measures that server’s available CPU and memory, and the load balancer queries the agent before distributing traffic to that server.

Static

  1. Round-robin  Round robin load balancing distributes traffic to a list of servers in rotation using the Domain Name System (DNS). An authoritative nameserver will have a list of different A records for a domain and provides a different one in response to each DNS query.
  2. Weighted round-robin Allows an administrator to assign different weights to each server. Servers deemed able to handle more traffic will receive slightly more. Weighting can be configured within DNS records.
  3. IP Hash Combines incoming traffic’s source and destination IP addresses and uses a mathematical function to convert it into a hash. Based on the hash, the connection is assigned to a specific server.

Layer 4 vs. Layer 7 Load Balancers

Layer 4 load balancing

Layer 4 load balancers look at info at the transport layer to decide how to distribute requests. Generally, this involves the source, destination IP addresses, and ports in the header, but not the contents of the packet. Layer 4 load balancers forward network packets to and from the upstream server, performing Network Address Translation (NAT).

Layer 7 load balancing

Layer 7 load balancers look at the application layer to decide how to distribute requests. This can involve contents of the header, message, and cookies. Layer 7 load balancers terminate network traffic, reads the message, makes a load-balancing decision, then opens a connection to the selected server. For example, a layer 7 load balancer can direct video traffic to servers that host videos while directing more sensitive user billing traffic to security-hardened servers.

At the cost of flexibility, layer 4 load balancing requires less time and computing resources than Layer 7, although the performance impact can be minimal on modern commodity hardware.

Load Balancer Internals

Nginx Architecture

Read more here

Disadvantage(s): load balancer

  • The load balancer can become a performance bottleneck if it does not have enough resources or if it is not configured properly.
  • Introducing a load balancer to help eliminate a single point of failure results in increased complexity.
  • A single load balancer is a single point of failure, configuring multiple load balancers further increases complexity.