Load balancing is a
technique used to distribute workloads uniformly across servers or
other compute resources to optimize network efficiency, reliability and
capacity. Load balancing is performed by an appliance --
either physical or virtual -- that identifies in real time which server in a
pool can best meet a given client request, while ensuring
heavy network traffic doesn't unduly overwhelm a single server.
In addition to
maximizing network capacity and performance, load balancing provides failover.
If one server fails, a load balancer immediately redirects its workloads to a
backup server, thus mitigating the impact on end users.
Load balancers are usually categorized by the Open Systems Interconnection
(OSI) layer at which they operate: Layer 4 or Layer 7.
Layer 4 load balancers distribute traffic based on transport data, such as IP
addresses and Transmission Control Protocol (TCP) port numbers. Layer 7
load-balancing devices make routing decisions based on application-level
characteristics that include HTTP header information and the actual contents of
the message, such as URLs and cookies. Layer 7 load balancers
are more common, but Layer 4 load balancers remain popular, particularly in
edge deployments.
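
The difference between the two layers can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any particular product: the pool addresses, the `Request` class and the `/api/` routing rule are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical server pools, for illustration only.
WEB_POOL = ["10.0.0.11", "10.0.0.12"]
API_POOL = ["10.0.1.11", "10.0.1.12"]
MAIL_POOL = ["10.0.2.11"]


@dataclass
class Request:
    src_ip: str
    dst_port: int
    path: str = "/"


def layer4_pool(req: Request) -> list:
    """Pick a pool from transport-level data only (here, the TCP port)."""
    return MAIL_POOL if req.dst_port == 25 else WEB_POOL


def layer7_pool(req: Request) -> list:
    """Pick a pool by inspecting application-level data (here, the URL path)."""
    if req.path.startswith("/api/"):
        return API_POOL
    return WEB_POOL


req = Request(src_ip="203.0.113.7", dst_port=443, path="/api/orders")
print(layer4_pool(req))  # the Layer 4 view sees only addresses and ports
print(layer7_pool(req))  # the Layer 7 view can also see the URL
```

The same request lands in different pools because the Layer 7 function can read the URL, while the Layer 4 function cannot.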
How load balancing works
Load balancers handle
incoming requests from users for information and other services. They sit
between the servers that handle those requests and the internet. Once a request
is received, the load balancer first determines which server in a pool is
available and online, and then routes the request to that server. During periods
of heavy load, a load balancer can dynamically add servers in response to
spikes in traffic. Conversely, it can drop servers when demand is low.
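
The "check availability, then route" step might look roughly like the following sketch. It assumes a simple in-memory pool and a `healthy` flag that some separate health check keeps up to date; the `Server` and `LoadBalancer` classes are hypothetical names used only for illustration.

```python
class Server:
    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy  # assumed to be updated by a separate health check


class LoadBalancer:
    def __init__(self, servers):
        self.servers = list(servers)
        self._next = 0  # index of the server to try first on the next request

    def route(self):
        """Return the name of the next healthy server, skipping failed ones."""
        count = len(self.servers)
        for offset in range(count):
            index = (self._next + offset) % count
            server = self.servers[index]
            if server.healthy:
                self._next = (index + 1) % count
                return server.name
        raise RuntimeError("no healthy servers in pool")


lb = LoadBalancer([Server("app1"), Server("app2", healthy=False), Server("app3")])
print([lb.route() for _ in range(3)])  # app2 is skipped while it is marked down
```

Because unhealthy servers are skipped at routing time, the same loop also gives a basic form of failover: requests keep flowing to the remaining servers when one is marked down.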
A load balancer can be a
physical appliance, a software instance or a combination of both.
Traditionally, vendors have loaded proprietary software onto dedicated hardware
and sold the combination to users as stand-alone appliances -- usually in pairs,
to provide failover if one goes down. As networks grow, organizations must
purchase additional or larger appliances.
In contrast, software
load balancing runs on virtual machines (VMs) or white box servers, most
likely as a function of an application delivery controller (ADC). ADCs
typically offer additional features, such as caching, compression and traffic
shaping. Popular in cloud environments, virtual load balancing can offer a
high degree of flexibility -- for example, enabling users to automatically scale
up or down to mirror traffic spikes or decreased network activity.
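
As a rough sketch of that scaling decision, the function below computes how many instances a pool should run for a given request rate. The per-instance capacity and the minimum and maximum bounds are hypothetical parameters; real autoscaling policies usually consider more signals than request rate alone.

```python
import math


def desired_instances(requests_per_sec, capacity_per_instance,
                      min_instances=1, max_instances=10):
    """Return the instance count needed for the load, clamped to the bounds."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(requests_per_sec / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))


print(desired_instances(950, 100))  # traffic spike: scale up
print(desired_instances(120, 100))  # quiet period: scale down
```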
Load-balancing methods
Load-balancing algorithms determine
which servers receive specific incoming client requests. Standard methods are
as follows:
- The hash-based approach calculates a given client's preferred server based on designated keys, such as HTTP headers or IP address information. This method supports session persistence, or stickiness, which benefits applications that rely on user-specific stored state information, such as checkout carts on e-commerce sites.
- The least-connections method favors servers with the fewest ongoing transactions, i.e., the "least busy."
- The least-time algorithm considers both server response times and active connections -- sending new requests to the fastest servers with the fewest open requests.
- The round robin method -- historically, the load-balancing default -- simply cycles through a list of available servers in sequential order.
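
The four methods above can each be sketched as a small selection function. These are minimal illustrations, not production implementations: the server names are hypothetical, the hash-based version uses a simple modulo over a SHA-256 digest, and the least-time score (response time multiplied by open requests plus one) is just one plausible way to combine the two factors.

```python
import hashlib
import itertools


def hash_based(client_key, pool):
    """Map a client key (e.g., an IP address) to the same server every time."""
    digest = hashlib.sha256(client_key.encode("utf-8")).digest()
    return pool[int.from_bytes(digest[:8], "big") % len(pool)]


def least_connections(active):
    """active maps server name -> number of open connections."""
    return min(active, key=active.get)


def least_time(stats):
    """stats maps server name -> (avg_response_ms, open_requests).

    The scoring formula is an assumption made for this sketch.
    """
    return min(stats, key=lambda name: stats[name][0] * (stats[name][1] + 1))


def make_round_robin(pool):
    """Return a function that cycles through the pool in sequential order."""
    cycle = itertools.cycle(pool)
    return lambda: next(cycle)


pool = ["app1", "app2", "app3"]
next_server = make_round_robin(pool)
print([next_server() for _ in range(4)])
print(least_connections({"app1": 12, "app2": 3, "app3": 7}))
print(least_time({"app1": (120.0, 2), "app2": (40.0, 5), "app3": (45.0, 1)}))
print(hash_based("203.0.113.7", pool) == hash_based("203.0.113.7", pool))
```

Note that the modulo-based hash remaps many clients whenever the pool size changes, which is one reason more elaborate hashing schemes exist.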
These algorithms can vary
significantly in sophistication and complexity. Weighted load-balancing
algorithms, for example, also take into account server hierarchies -- with
preferred, high-capacity servers assigned higher weights and receiving more
traffic than those assigned lower weights.
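
One way to picture weighting is a weighted round robin, sketched below. This uses a "smooth" variant that interleaves servers rather than sending bursts to the heaviest one; it is one common approach chosen for illustration, and the server names and weights are hypothetical.

```python
import itertools
from collections import Counter


def weighted_round_robin(weights):
    """Yield server names in proportion to their positive integer weights."""
    current = {name: 0 for name in weights}
    total = sum(weights.values())
    while True:
        # Every server gains its weight; the current leader is chosen
        # and then pushed back by the total weight.
        for name, weight in weights.items():
            current[name] += weight
        chosen = max(current, key=current.get)
        current[chosen] -= total
        yield chosen


picks = list(itertools.islice(weighted_round_robin({"big": 3, "small": 1}), 8))
print(picks)
print(Counter(picks))  # "big" receives three requests for every one sent to "small"
```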