11.3. Server Load Balancing

11.3.1. Overview

The Server Load Balancing (SLB) feature allows the administrator to spread client application requests over a number of servers using an SLB Policy object. Alternatively, an IP Rule with an Action of SLB_SAT can be used, although using an SLB Policy is recommended. Examples of both methods can be found in Section 11.3.7, Setting Up SLB.

SLB is a powerful tool that can improve the following aspects of network applications:

  • Performance.
  • Scalability.
  • Reliability.
  • Ease of administration.

The principal benefit of SLB, sharing the load across multiple servers, improves not just the performance of applications but also their scalability, since it facilitates a cluster of servers (sometimes referred to as a server farm) that together can handle many more requests than a single server.

The illustration below shows a typical SLB scenario, in which a Clavister firewall manages access by external Internet clients to internal server applications.

A Server Load Balancing Configuration

Figure 11.9. A Server Load Balancing Configuration

Additional Benefits of SLB

Besides improving performance and scalability, SLB provides other benefits:

  • SLB increases the reliability of network applications by actively monitoring the servers sharing the load. cOS Core SLB can detect when a server fails or becomes congested and will not direct any further requests to that server until it recovers or has less load.

    In addition, cOS Core can optionally send the appropriate error message back to the initiator of a server connection when failure of the server is detected via server monitoring.

  • SLB can allow network administrators to perform maintenance tasks on servers or applications without disrupting services. Individual servers can be restarted, upgraded, removed, or replaced, and new servers and applications can be added or moved without affecting the rest of the server farm or taking down applications.

  • The combination of network monitoring and distributed load sharing also provides an extra level of protection against Denial of Service (DoS) attacks.

SLB Deployment Considerations

The following questions should be considered when deploying SLB:

  • Across which servers is the load being balanced?
  • Which SLB algorithm should be used?
  • Will "stickiness" be used?
  • Which monitoring method will be used?

Each of these topics is discussed further in the sections that follow.

Identifying the Servers

An important first step in SLB deployment is to identify the servers across which the load is to be balanced. This is often a server farm, that is, a cluster of servers set up to work as a single "virtual server". The servers that SLB is to treat as a single virtual server must be explicitly specified.

11.3.2. SLB Distribution Algorithms

There are several ways to determine how a load is shared across a set of servers. cOS Core SLB supports the following algorithms for load distribution (a conceptual sketch of these selection rules follows the list):

  • Round-robin

    This algorithm distributes new incoming connections across the server list on a rotating basis. The first connection is assigned to a randomly chosen server; subsequent connections are assigned to the servers in list order, cycling back to the start of the list when the end is reached. Each server's capability and current state, for instance its number of existing connections or its response time, are not taken into account: all available servers simply take turns being assigned the next connection.

    This algorithm ensures that all servers receive an equal number of requests; it is therefore best suited to server farms where all servers have an equal capacity and the processing loads of all requests are likely to be similar.

  • Connection-rate

    This algorithm considers the number of requests that each server has been receiving over a certain time period, known as the Window Time. SLB sends the next request to the server that has received the fewest new connections during the last Window Time seconds.

    The Window Time is a setting that the administrator can change. The default value is 10 seconds.

  • Resource-usage

    This method depends on the servers sending back their loading information to cOS Core so the connection allocation can always go to the server with the least load.

    The servers send back their loading information using the cOS Core REST API. Custom software that runs on servers must be written using this API. The API is fully described in the separate cOS Core REST API Guide.

    Using the REST API requires that an appropriate Remote Management object is configured in cOS Core. This object allows access by external software that uses the API, and setting this up is described in the first chapter of the cOS Core REST API Guide.

  • Strict

    This algorithm always sends new traffic connections to the first server in the server list. Should the first server become unavailable, new connections are allocated to the second server. If both the first and second servers become unavailable, the third server on the list is used, and so on.

    Note that this algorithm always sends new connections to the available server closest to the beginning of the list. This means that if the first server comes back online, it will once again receive all new connections.

    An example use case for this algorithm might be where all DNS traffic is SATed to a single internal DNS server. If the server becomes unavailable then it will be important to be able to direct this traffic to an alternate DNS server. Often, the server list will consist of just two servers, a principal and a backup, but it could contain more.
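
As a conceptual illustration only (this is not cOS Core code, and the Server class and function names are invented for the example), the following Python sketch shows, under simplified assumptions, how the round-robin, connection-rate and strict selection rules described above might choose a server from a list:

import random
import time
from collections import deque

class Server:
    def __init__(self, name):
        self.name = name
        self.online = True      # whether the server is currently considered available
        self.recent = deque()   # timestamps of connections assigned within the window

def round_robin(servers, state):
    # The first connection starts at a random position; later connections rotate in list order.
    if "index" not in state:
        state["index"] = random.randrange(len(servers))
    else:
        state["index"] = (state["index"] + 1) % len(servers)
    for _ in range(len(servers)):              # skip servers that are currently offline
        server = servers[state["index"]]
        if server.online:
            return server
        state["index"] = (state["index"] + 1) % len(servers)
    return None

def connection_rate(servers, window_time=10.0):
    # Choose the online server with the fewest connections during the last window_time seconds.
    now = time.monotonic()
    candidates = [s for s in servers if s.online]
    for s in candidates:
        while s.recent and now - s.recent[0] > window_time:
            s.recent.popleft()                 # forget connections older than the Window Time
    if not candidates:
        return None
    chosen = min(candidates, key=lambda s: len(s.recent))
    chosen.recent.append(now)
    return chosen

def strict(servers):
    # Always use the available server closest to the beginning of the list.
    for s in servers:
        if s.online:
            return s
    return None

# Example usage of the round-robin sketch:
servers = [Server("server1"), Server("server2")]
state = {}
print(round_robin(servers, state).name)        # a random start, then alternating thereafter

In cOS Core the selection is of course performed internally for each new connection; the sketch only mirrors the selection rules described above.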

The Fallback Server Option

The SLB feature also provides the option to specify a Server fallback address. This is the IP address of a single server that is only used if all the servers in the SLB server list become unavailable. During normal operation, this server will not be included in any connection allocations controlled by any of the distribution algorithms listed above.

A typical use case for the fallback option is where the fallback server is set up to deliver a simple webpage indicating that there is a problem with the normal processing of HTTP requests. For example, the delivered page might carry the message:

     Web services are temporarily unavailable. Please try again later.

SLB Log Messages

The SLB subsystem generates two key log messages to indicate changes in the active server list:

  • server_online - Generated when a server is added to the active list.
  • server_offline - Generated when a server is removed from the active list.

A helpful parameter that appears in both of the above log messages is servers_reachable. This indicates the current number of active servers following the log event.

All the log messages generated by the SLB system can be found in the SLB section of the separate cOS Core Log Reference Guide.

11.3.3. Selecting Stickiness

In some scenarios, such as with SSL or TLS connections, it is important that the same server is used for a series of connections from the same client. This is achieved by selecting the appropriate stickiness option, which can be used with either the round-robin or connection-rate algorithm. The options for stickiness are as follows:

  • Per-state Distribution

    This mode is the default and means that no stickiness is applied. Every new connection is considered to be independent from other connections even if they come from the same IP address or network. Consecutive connections from the same client may therefore be passed to different servers.

    This may not be acceptable if the same server must be used for a series of connections coming from the same client. If this is the case then stickiness is required.

  • IP Address Stickiness

    In this mode, a series of connections from a specific client will be handled by the same server. This is particularly important for TLS or SSL based services such as HTTPS, which require a repeated connection to the same host.

  • Network Stickiness

    This mode is similar to IP stickiness except that the stickiness can be associated with a network instead of a single IP address. The network is specified by stating its size as a parameter.

    For example, if the network size is specified as 24 (the default) then the IP address 10.1.1.2 will be assumed to belong to the network 10.1.1.0/24 and this will be the network for which stickiness is applied, as illustrated in the short sketch below.
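
As a minimal illustration of this derivation (not cOS Core code), Python's standard ipaddress module can compute the stickiness network from a source address and a configured network size:

import ipaddress

src_ip = "10.1.1.2"
net_size = 24   # the default network size
network = ipaddress.ip_network(f"{src_ip}/{net_size}", strict=False)
print(network)  # 10.1.1.0/24 - all sources in this network share the same stickiness decision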

Stickiness Properties

If either IP stickiness or network stickiness is enabled then the following stickiness properties can be adjusted (an illustrative sketch of how these properties interact follows the list):

  • Idle Timeout

    When a connection is made, the source IP address for the connection is remembered in a table. Each table entry is referred to as a slot. After it is created, an entry is only considered valid for the number of seconds specified by the Idle Timeout. When a new connection is made, the table is searched for the same source IP, provided that the table entry has not exceeded its timeout. When a match is found, stickiness ensures that the new connection goes to the same server as previous connections from the same source IP.

    The default value for this setting is 10 seconds.

  • Max Slots

    This parameter specifies how many slots exist in the stickiness table. When the table fills up, the oldest entry is discarded to make way for a new entry even though it may still be valid (the Idle Timeout has not been exceeded).

    The consequence of a full table can be that stickiness will be lost for any discarded source IP addresses. The administrator should therefore try to ensure that the Max Slots parameter is set to a value that can accommodate the expected number of connections that require stickiness.

    The default value for this setting is 2048 slots in the table.

  • Net Size

    The processing and memory resources required to match individual IP addresses when implementing stickiness can be significant. By selecting the Network Stickiness option these resource demands can be reduced.

    When the Network Stickiness option is selected, the Net Size parameter specifies the size of the network which should be associated with the source IP of new connections. A stickiness table lookup then does not compare individual IP addresses but instead checks whether the source IP address belongs to the same network as a previous connection already in the table. If it does, stickiness to the same server results.

    The default value for this setting is a network size of 24.
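
The following Python sketch is not cOS Core code; it is only an illustration, with invented class and method names, of how the Idle Timeout, Max Slots and Net Size properties described above could interact in a stickiness table:

import time
import ipaddress
from collections import OrderedDict

class StickinessTable:
    """Conceptual sketch only: maps a source IP (or its network) to a server."""

    def __init__(self, idle_timeout=10.0, max_slots=2048, net_size=None):
        self.idle_timeout = idle_timeout    # seconds an entry stays valid
        self.max_slots = max_slots          # table capacity
        self.net_size = net_size            # None = per-IP stickiness, otherwise network stickiness
        self.slots = OrderedDict()          # key -> (server, last_used)

    def _key(self, src_ip):
        if self.net_size is None:
            return src_ip
        # Network stickiness: all sources in the same network share one entry.
        return str(ipaddress.ip_network(f"{src_ip}/{self.net_size}", strict=False))

    def lookup(self, src_ip):
        key = self._key(src_ip)
        entry = self.slots.get(key)
        if entry is None:
            return None
        server, last_used = entry
        if time.monotonic() - last_used > self.idle_timeout:
            del self.slots[key]             # entry has exceeded the Idle Timeout
            return None
        self.slots[key] = (server, time.monotonic())
        self.slots.move_to_end(key)
        return server

    def remember(self, src_ip, server):
        key = self._key(src_ip)
        if len(self.slots) >= self.max_slots and key not in self.slots:
            self.slots.popitem(last=False)  # table full: discard the oldest slot
        self.slots[key] = (server, time.monotonic())
        self.slots.move_to_end(key)

A real implementation of course lives inside the firewall's connection setup path; the sketch only shows how the three properties limit the validity and memory use of the table.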

11.3.4. SLB Algorithms and Stickiness

This section discusses further how stickiness functions with the different SLB algorithms.

An example scenario is illustrated in the figure below. In this example, the Clavister firewall is responsible for balancing connections from 3 clients with different addresses to 2 servers. Stickiness is enabled.

Connections from Three Clients

Figure 11.10. Connections from Three Clients

When the round-robin algorithm is used, the first arriving requests R1 and R2 from Client 1 are both assigned to one server, say Server 1, because of stickiness. The next request, R3 from Client 2, is then routed to Server 2. When R4 from Client 3 arrives, it is Server 1's turn again, so R4 is assigned to it.

Stickiness and Round-Robin

Figure 11.11. Stickiness and Round-Robin

If the connection-rate algorithm is applied instead, R1 and R2 will be sent to the same server because of stickiness, but the subsequent requests R3 and R4 will be routed to another server, since the number of new connections on each server within the Window Time is taken into account in the distribution.

Stickiness and Connection-rate

Figure 11.12. Stickiness and Connection-rate

Regardless of which algorithm is chosen, if a server goes down, traffic will be sent to other servers. When the server comes back online, it can automatically be placed back into the server farm and start receiving requests again.

The Server Status Display and Pausing a Server

While SLB is active, its status can be viewed in the Web Interface by going to: Status > Server Load Balancing. There, each server and the number of connections allocated to it are displayed.

In the Maintenance column of the display for each server there is a Pause button and this can be pressed to temporarily remove the server from having new connections distributed to it. Existing connections to the server will not be closed by cOS Core when a server is paused, but no new connections will be allocated.

However, if stickiness is enabled, new connections can continue to be allocated to a paused server to comply with the stickiness requirement.

11.3.5. SLB Server Monitoring

SLB Server Monitoring can be used to continuously check the status of the servers in an SLB configuration. If monitoring is enabled and a server goes offline, cOS Core will not open any new connections to that server until monitoring indicates that the server is online again.

The SLB Monitoring feature is similar in concept to the host monitoring feature used by cOS Core Route Failover, which is described in Section 4.2.4, Host Monitoring for Route Failover. However, there are important differences.

Enabling Server Monitoring

Server monitoring is enabled per SLB rule, with the list of servers to be monitored and their IP addresses defined in each individual rule.

Monitoring is done by polling hosts through any one or any combination of the three methods described below. A routing table is also specified for monitoring, with main as the default, and this is the table used by polling to look up the server IP addresses. This means that the routing table chosen must contain routes for all the server IP addresses specified in the SLB rule.

Monitoring Methods

The method by which hosts are polled can be any combination of:

  • ICMP

    An ICMP "Ping" message is sent to the server.

  • TCP

    A TCP connection is established to and then disconnected from the server.

  • HTTP

    An HTTP request is sent using a specified URL. Two extra pieces of data must be specified for HTTP polling:

    1. Request URL
      The URL which is to be requested from all servers.

      This must be specified as either an FQDN or a Relative path. An example FQDN is http://www.example.com/path/file.txt and an example relative path is /path/file.txt.

      If a relative path is specified then the path is concatenated to the IP address of the server. For example, if the server IP is 10.12.14.1 then the relative path /path/file.txt would become http://10.12.14.1/path/file.txt for the polling.

      An FQDN must be used, however, when polling a server that is host to many virtual servers.

    2. Expected Response
      A text string which is the beginning (or complete) text of a valid response. If no text is specified, any response from the server will be considered valid.

      Testing for a specific response text provides the possibility of testing whether an application is offline. For example, if a web page response from a server can indicate that a specific database is operational with text such as "Database OK", then the absence of that response can indicate that the server is operational but the application is offline.

    Monitoring with HTTP assumes that the URL entered is valid for all the servers in the SLB rule so no DNS lookup needs to be done. An HTTP GET request is therefore sent straight to the IP address of the server. (This differs from route failover host monitoring, where HTTP URLs are resolved with a DNS lookup.) An illustrative sketch of this polling logic is shown directly below.
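
The following Python function is only an illustrative sketch of the HTTP polling logic described above, not the cOS Core implementation; the function name and parameters are invented, and it lets urllib resolve an FQDN via DNS, whereas cOS Core sends the request straight to the server's IP address:

from urllib.request import urlopen
from urllib.error import URLError

def http_poll(server_ip, request_url, expected_response="", timeout=2.0):
    # A relative path is concatenated to the server's IP address.
    if request_url.startswith("/"):
        url = f"http://{server_ip}{request_url}"
    else:
        # FQDN form, for example when one host serves many virtual servers.
        url = request_url
    try:
        with urlopen(url, timeout=timeout) as response:
            body = response.read().decode(errors="replace")
    except (URLError, OSError):
        return False                     # no response: the poll fails
    if not expected_response:
        return True                      # no expected text configured: any response is valid
    # Otherwise the response must begin with the expected text, e.g. "Database OK".
    return body.startswith(expected_response)

# Example usage with a relative path and an expected response prefix:
print(http_poll("10.12.14.1", "/path/file.txt", expected_response="Database OK"))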

All Server Polling Must Succeed

The polling methods configured for an SLB rule are all used on all the servers in the rule. If one configured polling method fails but another succeeds on the same server, that server is considered to be offline. In other words, all configured polling methods need to succeed on a server for that server to be considered operational.

Polling Options

For each polling method specified, there are a number of property parameters that should be set (an illustrative sketch combining some of these values follows the list):

  • Ports

    The port number for polling when using the TCP or HTTP option.

    More than one port number can be specified in which case all ports will be polled and all ports must respond for the server to be considered online. Up to 16 port numbers may be specified as a comma separated list for each polling method.

  • Interval

    The interval in milliseconds between polling attempts. The default setting is 10,000 ms (10 seconds) and the minimum value allowed is 100 ms.

  • Samples

    The number of polling attempts used as a sample size for calculating the Percentage Loss and the Average Latency. The default setting is 10 and the minimum value allowed is 1.

  • Maximum Poll Failed

    The maximum permissible number of failed polling attempts. If this number is exceeded then the server is considered offline.

  • Max Average Latency

    The maximum number of milliseconds allowable between a poll request and the response. If this threshold is exceeded then the server is considered offline.

    Average Latency is calculated by averaging the server's response times over the number of polls given by Samples. If a polling attempt receives no response, it is not included in the averaging calculation.
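
As an illustration only (this is not cOS Core code, and the function and parameter names are invented), the following Python snippet shows how the Samples, Maximum Poll Failed and Max Average Latency values described above could be combined into an online/offline decision:

def evaluate_server(samples, max_poll_failed, max_average_latency_ms):
    """Illustrative only: decide online/offline from the most recent poll samples.

    'samples' is a list of latencies in milliseconds, with None for polls that
    received no response.
    """
    failed = sum(1 for s in samples if s is None)
    answered = [s for s in samples if s is not None]
    # Unanswered polls are excluded from the latency average.
    average_latency = sum(answered) / len(answered) if answered else None

    if failed > max_poll_failed:
        return "offline"
    if average_latency is not None and average_latency > max_average_latency_ms:
        return "offline"
    return "online"

# Example: 10 samples, one lost poll, latencies in milliseconds
print(evaluate_server([12, 15, None, 11, 14, 13, 12, 16, 15, 14],
                      max_poll_failed=2, max_average_latency_ms=800))  # "online"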

11.3.6. Behavior After Server Failure

If monitoring determines that a server is unavailable then new connections are automatically sent to servers that are still available. This section describes how cOS Core behaves when it detects that a server has failed but a connected client is not yet aware of the failure.

The Active Connection Reset Setting

If monitoring is used with SLB, there is an additional property available on both the IP Rule and SLB Policy objects which is called Active Connection Reset. This determines how cOS Core will handle existing connections after server failure.

A Summary of cOS Core Behavior

The following describes the actions that cOS Core will take depending on the protocol and whether the Active Connection Reset setting is enabled:

  • Active Connection Reset Setting Disabled (the default)

    Depending on the protocol, the following will happen when a client sends a packet to a failed server:

    1. TCP

      cOS Core closes the connection and sends back TCP RST.

    2. UDP

      cOS Core closes the connection and sends back ICMP Port Unreachable (3).

    3. ICMP or other protocol

      cOS Core closes the connection and sends back ICMP Protocol Unreachable (2).

  • Active Connection Reset Setting Enabled

    Depending on the protocol, the following will happen immediately after the server is detected as failed:

    1. TCP

      cOS Core closes the connection and sends back TCP RST immediately after failure detection.

    2. UDP

      cOS Core closes the connection immediately after failure detection and sends no message.

    3. ICMP or other protocol

      cOS Core closes the connection immediately after failure detection and sends no message.

Note that if a client sends traffic after cOS Core closes the connection, that traffic will be treated as a new connection and routed to a functioning server.
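
As a conceptual summary only (this is not cOS Core code; the function is invented for illustration), the behavior described above can be expressed as the following Python decision table:

def reaction_to_failed_server(protocol, active_connection_reset):
    """Summarize the behavior described above as a (when, message) pair.

    'when' is when the existing connection is closed; 'message' is what,
    if anything, is sent back to the client.
    """
    if active_connection_reset:
        when = "immediately after failure detection"
        message = "TCP RST" if protocol == "TCP" else None
    else:
        when = "when the client next sends a packet"
        if protocol == "TCP":
            message = "TCP RST"
        elif protocol == "UDP":
            message = "ICMP Port Unreachable (3)"
        else:
            message = "ICMP Protocol Unreachable (2)"
    return when, message

print(reaction_to_failed_server("UDP", active_connection_reset=False))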

[Caution] Caution: Active Connection Reset consumes resources

If the Active Connection Reset setting is enabled, additional processing resources are required by cOS Core to track the state of every connection. With large numbers of connections, this has the potential to impact traffic flow through the firewall.

11.3.7. Setting Up SLB

This section contains examples that illustrate how SLB is set up using first an SLB Policy object and then an alternative method using an IP Rule. Using an SLB Policy is the recommended method.

The following steps should be used for setting up SLB.

  1. Define an IP address object for each server for which SLB is to be enabled.

  2. Define an IP address group object which includes all these individual objects.

  3. Define an SLB Policy object in the IP rule set that refers to this IP address group. The destination interface is always specified as core in the policy, meaning that cOS Core itself deals with the connection.

Example 11.3. Setting up SLB with an SLB Policy

In this example, server load balancing is performed between two HTTP web servers situated behind the Clavister firewall. These web servers have the private IPv4 addresses 192.168.1.10 and 192.168.1.11. Access by external clients is via the wan interface which has the IPv4 address wan_ip.

The default SLB values for monitoring, distribution method and stickiness are used.

Command-Line Interface

A. Create an address object for each of the web servers:

Device:/> add Address IP4Address server1 Address=192.168.1.10
Device:/> add Address IP4Address server2 Address=192.168.1.11

B. Specify the SLB Policy object:

Device:/> add SLBPolicy Name=my_web_slb_policy
			SourceInterface=wan
			SourceNetwork=all-nets
			DestinationInterface=core
			DestinationNetwork=wan_ip
			Service=http-all
			SLBAddresses=server1,server2

The above will only allow access by external clients on the Internet. To also allow access by internal clients on lan_net, the SLB Policy must be rewritten using an Interface Group object which combines both the wan and lan interfaces.

A-2: First, create the InterfaceGroup:

Device:/> add Interface InterfaceGroup my_if_group Members=wan,lan

B-2: Now, create an SLBPolicy object:

Device:/> add SLBPolicy Name=my_web_slb_policy
			SourceInterface=my_if_group
			SourceNetwork=all-nets
			DestinationInterface=core
			DestinationNetwork=wan_ip
			Service=http-all
			SLBAddresses=server1,server2

InControl

Follow similar steps to those used for the Web Interface below.

Web Interface

A. Create an Object for each of the web servers:

  1. Go to: Objects > Address Book > Add > IP4 Address
  2. Enter a suitable name, in this example server1
  3. Enter the IP Address as 192.168.1.10
  4. Click OK
  5. Repeat the above to create an object called server2 for the 192.168.1.11 IP address

B. Specify the SLB Policy object:

  1. Go to: Policies > Firewalling > Main IP Rules > Add > SLB Policy
  2. Now enter:
    • Name: my_web_slb_policy
    • Source Interface: wan
    • Source Network: all-nets
    • Destination Interface: core
    • Destination Network: wan_ip
    • Service: http-all
  3. Add server1 and server2 to Selected
  4. Click OK

The above will only allow access by external clients on the Internet. To also allow access by internal clients on lan_net, the SLB Policy must be rewritten using an Interface Group object which combines both the wan and lan interfaces.

A-2: First, create an InterfaceGroup:

  1. Go to: Network > Interfaces and VPN > Interface Groups > Add > Interface Group
  2. Now enter:
    • Name: my_if_group
    • Selected: wan and lan
  3. Click OK

B-2. Now, create the SLBPolicy object:

  1. Go to: Policies > Firewalling > Main IP Rules > Add > SLB Policy
  2. Now enter:
    • Name: my_web_slb_policy
    • Source Interface: my_if_group
    • Source Network: all-nets
    • Destination Interface: core
    • Destination Network: wan_ip
    • Service: http-all
    • Selected: server1 and server2
  3. Click OK

Example 11.4. Setting up SLB with IP Rules

In this example, server load balancing is performed between two HTTP web servers situated behind the Clavister firewall. These web servers have the private IPv4 addresses 192.168.1.10 and 192.168.1.11. Access by external clients is via the wan interface which has the IPv4 address wan_ip.

The default SLB values for monitoring, distribution method and stickiness are used.

A NAT rule is used in conjunction with the SLB_SAT rule so that clients behind the firewall can access the web servers. An Allow rule is used to allow access by external clients.

Command-Line Interface

A. Create an address object for each of the web servers:

Device:/> add Address IP4Address server1 Address=192.168.1.10
Device:/> add Address IP4Address server2 Address=192.168.1.11

B. Create an IP4Group which contains the 2 web server addresses:

Device:/> add Address IP4Group server_group Members=server1,server2

C. Specify the SLB_SAT IP rule:

Device:/> add IPRule Action=SLB_SAT
			SourceInterface=any
			SourceNetwork=all-nets
			DestinationInterface=core
			DestinationNetwork=wan_ip
			Service=http-all
			SLBAddresses=server_group
			Name=web_slb

D. Specify a NAT rule for internal client access to the servers:

Device:/> add IPRule Action=NAT
			SourceInterface=lan
			SourceNetwork=lan_net
			DestinationInterface=core
			DestinationNetwork=wan_ip
			Service=http-all
			NATAction=UseInterfaceAddress
			Name=web_slb_nat

E. Specify an Allow IP rule for the external clients:

Device:/> add IPRule Action=Allow
			SourceInterface=wan
			SourceNetwork=all-nets
			DestinationInterface=core
			DestinationNetwork=wan_ip
			Service=http-all
			Name=web_slb_allow

InControl

Follow similar steps to those used for the Web Interface below.

Web Interface

A. Create an Object for each of the web servers:

  1. Go to: Objects > Address Book > Add > IP4 Address
  2. Enter a suitable name, for example server1
  3. Enter the IP Address as 192.168.1.10
  4. Click OK
  5. Repeat the above to create an object called server2 for the 192.168.1.11 IP address

B. Create an IP4Group which contains the 2 web server addresses:

  1. Go to: Objects > Address Book > Add > IP4 Group
  2. Enter a suitable name, for example server_group
  3. Add server1 and server2 to the group
  4. Click OK

C. Specify the SLB_SAT IP rule:

  1. Go to: Policies > Firewalling > Main IP Rules > Add > IP Rule
  2. Now enter:
    • Name: web_slb
    • Action: SLB_SAT
    • Service: http-all
    • Source Interface: wan
    • Source Network: all-nets
    • Destination Interface: core
    • Destination Network: wan_ip
  3. Select SAT SLB
  4. Under Server Addresses add server_group to Selected
  5. Click OK

D. Specify a NAT rule for internal client access to the servers:

  1. Go to: Policies > Firewalling > Main IP Rules > Add > IP Rule
  2. Now enter:
    • Name: web_slb_nat
    • Action: NAT
    • Service: http-all
    • Source Interface: lan
    • Source Network: lan_net
    • Destination Interface: core
    • Destination Network: wan_ip
  3. Click OK

E. Specify an Allow IP rule for the external clients:

  1. Go to: Policies > Firewalling > Main IP Rules > Add > IP Rule
  2. Now enter:
    • Name: web_slb_allow
    • Action: Allow
    • Service: http-all
    • Source Interface: wan
    • Source Network: all-nets
    • Destination Interface: core
    • Destination Network: wan_ip
  3. Click OK

[Note] Note: FwdFast IP rules should not be used with SLB

In order to function, SLB requires that the cOS Core state engine keeps track of connections. If using IP rules, FwdFast rules should not be used with SLB since packets that are forwarded by these rules are not under state engine control.