HA Clusters
cOS Core High Availability (HA) provides a fault tolerant capability for Clavister firewall installations. HA works by adding a back-up slave Clavister firewall to an existing master firewall. The master and slave are connected together by a synchronization link and make up a logical HA Cluster. One of the units in a cluster will be active while the other unit will be inactive and on standby.
Initially, the cluster slave will be inactive and will only monitor the activity of the master. If the slave detects that the master has become inoperative, an HA failover takes place and the slave becomes active, assuming processing responsibility for all traffic. If the master later becomes operative again, the slave will continue to be active but the master will now monitor the slave with failover only taking place if the slave fails. This is sometimes known as an active-passive implementation of fault tolerance.
HA Requires Similar Systems
The master and slave in an HA cluster will normally be on identical hardware platforms. Clavister does not support HA clusters that use dissimilar Clavister hardware. In addition, the cOS Core licenses for the master and slave should have identical capabilities and each must also allow high availability. However, when cOS Core runs in a virtual environment, the requirement for identical hardware platforms may not be applicable but the licenses should still have identical capabilities.
The Master and Active Units
When reading this section on HA, it should be kept in mind that the master unit in a cluster is not always the same as the active unit in a cluster.
The active unit is the firewall that is actually processing all traffic at a given point in time. This could be the slave unit if a failover has occurred because the master is no longer operational.
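The distinction between the fixed master/slave roles and the changing active/inactive state can be illustrated with a small model. This is only a sketch of the behavior described above and not cOS Core code. The `Cluster` class and its `fail` and `recover` methods are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model: 'master' and 'slave' are fixed roles, 'active' changes."""
    # Which units are currently operational (hypothetical bookkeeping).
    operational: dict = field(
        default_factory=lambda: {"master": True, "slave": True})
    active: str = "master"   # initially the master is the active unit
    failovers: int = 0

    @staticmethod
    def peer_of(unit: str) -> str:
        return "slave" if unit == "master" else "master"

    def fail(self, unit: str) -> None:
        """Mark a unit inoperative; fail over only if it was the active one."""
        self.operational[unit] = False
        peer = self.peer_of(unit)
        if unit == self.active and self.operational[peer]:
            self.active = peer
            self.failovers += 1

    def recover(self, unit: str) -> None:
        """A recovered unit returns as the inactive one; there is no fail-back."""
        self.operational[unit] = True


cluster = Cluster()
cluster.fail("master")      # slave takes over
cluster.recover("master")   # master is back, but the slave stays active
print(cluster.active, cluster.failovers)
```

After the master recovers, the model keeps the slave as the active unit, which matches the statement that the master is not always the active unit.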
Interconnection of the Sync Interfaces
In a cluster, the master and slave units must be directly connected to each other by a synchronization connection which is known to cOS Core as the sync interface. One of the normal interfaces on each of the master and the slave is dedicated to this purpose, and the two are best connected together using a crossover cable. This is discussed further in Section 12.3, Setting Up HA.
The connection between sync interfaces could be made over a longer distance for physical separation of the HA units using an appropriate cable. However, the data latency over this connection should never be greater than 20 milliseconds. This latency restriction is also discussed in Section 12.4, HA Issues and Troubleshooting.
Special packets, known as heartbeats, are continually sent by cOS Core from one cluster unit to the other across Ethernet interfaces which have been configured as sync interfaces. These are also sent on all other Ethernet interfaces unless an interface is explicitly configured not to send them. These special packets allow the health of both units to be monitored. Heartbeat packets are sent in both directions so that the passive unit knows about the health of the active unit and the active unit knows about the health of the passive unit.
The heartbeat mechanism is discussed in more detail in Section 12.2, HA Mechanisms.
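The general idea of heartbeat monitoring can be sketched as a per-interface timeout check. This is not the cOS Core algorithm, which is described in Section 12.2. The `PeerMonitor` class, the timeout value and the rule that the peer counts as alive while at least one interface still receives heartbeats are all assumptions made for illustration.

```python
class PeerMonitor:
    """Tracks when a heartbeat was last received on each interface."""

    def __init__(self, interfaces, timeout_s):
        # timeout_s is a hypothetical value, not a cOS Core setting.
        self.timeout_s = timeout_s
        self.last_seen = {name: None for name in interfaces}

    def heartbeat(self, interface, now):
        """Record a heartbeat received on 'interface' at time 'now' (seconds)."""
        self.last_seen[interface] = now

    def silent_interfaces(self, now):
        """Interfaces with no heartbeat within the timeout window."""
        return [name for name, seen in self.last_seen.items()
                if seen is None or now - seen > self.timeout_s]

    def peer_alive(self, now):
        """Assumption: the peer is alive if any interface still hears it."""
        return len(self.silent_interfaces(now)) < len(self.last_seen)


monitor = PeerMonitor(["sync", "lan"], timeout_s=0.6)
monitor.heartbeat("sync", now=0.0)
monitor.heartbeat("lan", now=0.0)
monitor.heartbeat("sync", now=1.0)
print(monitor.silent_interfaces(now=1.2), monitor.peer_alive(now=1.2))
```

Timestamps are passed in explicitly so that the sketch is deterministic. A real monitor would read a clock and would run in both directions, one instance on each cluster unit.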
Cluster Management and Configuration Synchronization
When managing the cluster through the Web Interface or CLI, the configuration on one cluster unit can be changed and this will then be automatically copied to the other unit, provided that automatic synchronization is enabled for both cluster units (it is enabled by default).
Automatic synchronization involves a process of one peer failing over to the other when configuration changes are saved. For example, if a change is made to the inactive peer and saved, the inactive peer will become the active unit so the other cluster unit can be updated.
Configuration changes can be made to the active or inactive peer and the other peer will then be synchronized. However, it is usually recommended to change the inactive peer since this will result in a single failover. When the active unit is changed, two failovers occur. The active peer first goes inactive so it can update, then becomes active again as the other peer updates. This method leaves the originally active peer as the active unit, which might be desirable. For example, where a feature does not support HA, such as with L2TP, connections will not be lost if the active peer remains the active one.
Turning off automatic synchronization and changing the cluster units separately is not recommended but can be done if required.
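The difference between changing the inactive peer and changing the active peer, when automatic synchronization is enabled, can be made concrete with a short model. This is a sketch of the failover sequence described above, not of how cOS Core implements it. The function name `save_change` is hypothetical.

```python
PEER = {"master": "slave", "slave": "master"}


def save_change(active, edited):
    """Return (failovers, final_active) for a saved configuration change.

    'active' is the unit that is active before the change and 'edited'
    is the unit on which the change was made. Each failover is recorded
    as a (from_unit, to_unit) pair.
    """
    if active not in PEER or edited not in PEER:
        raise ValueError("units must be 'master' or 'slave'")
    failovers = []
    if edited != active:
        # The inactive peer was changed: it becomes active so that the
        # other unit can be updated. One failover in total.
        failovers.append((active, edited))
        active = edited
    else:
        # The active peer was changed: it goes inactive to update, then
        # becomes active again as the other peer updates. Two failovers.
        other = PEER[active]
        failovers.append((active, other))
        failovers.append((other, active))
    return failovers, active


print(save_change(active="master", edited="slave"))
print(save_change(active="master", edited="master"))
```

The first call yields a single failover and leaves the slave active. The second yields two failovers and leaves the master active.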
Example 12.1. Enabling Automatic Cluster Synchronization
This example enables automatic cluster synchronization on a Clavister firewall which is already part of an HA cluster. This setting should always be set to the same value on both cluster peers. Note that synchronization is enabled by default so this command is only needed if synchronization has previously been manually disabled using a similar procedure.
Command-Line Interface
Device:/> set HighAvailability Sync=Yes
InControl
Follow similar steps to those used for the Web Interface below.
Web Interface
Cluster Management with InControl
An HA Cluster can also be managed as a single unit using InControl. A cluster will appear in the InControl client interface as a single logical Clavister firewall with a unique name. Any configuration changes are then deployed automatically to the two units by InControl. However, once under InControl control, configuration changes should not be made outside of InControl because this will make the InControl copy of the configuration inconsistent. This is discussed further in the separate InControl Administration Guide.
Load Sharing
Clavister HA clusters do not provide a load sharing capability since only one unit will be active while the other is inactive and only two firewalls, the master and the slave, can exist in a single cluster. The only processing role that the inactive unit plays is to replicate the state of the active unit and to take over all traffic processing if it detects the active unit is not responding.
Extending Redundancy
Implementing an HA Cluster will eliminate one of the points of failure in a network. Routers, switches and Internet connections can remain as potential points of failure and redundancy for these should also be considered.
Protecting Against Network Failures Using HA and Link Monitor
The cOS Core Link Monitor feature can be used to check the connection to a host so that when it is no longer reachable an HA failover is initiated to a peer which has a different connection to the host. This technique is a useful extension to normal HA usage which provides protection against network failures between a single Clavister firewall and hosts. This technique is described further in Section 2.4.3, The Link Monitor.
HA requires that the cOS Core licenses in both the master and slave units have their HA parameter set to enabled. HA will not function if either or both units in a cluster are operating in the 2 hour demo mode. cOS Core enters demo mode automatically if no valid license is present.
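The requirements stated in this section (HA-enabled licenses with identical capabilities, no demo mode, and a sync link latency of at most 20 milliseconds) could be gathered into a simple pre-deployment checklist. The sketch below is not a cOS Core tool. The `UnitInfo` fields and the `ha_problems` function are hypothetical, and the values would have to be collected by hand from each unit.

```python
from dataclasses import dataclass

MAX_SYNC_LATENCY_MS = 20.0  # limit stated in this section


@dataclass(frozen=True)
class UnitInfo:
    capabilities: frozenset  # hypothetical summary of licensed capabilities
    ha_enabled: bool         # license has its HA parameter enabled
    demo_mode: bool          # unit is running without a valid license


def ha_problems(master, slave, sync_latency_ms):
    """Return a list of human-readable problems; empty means none found."""
    problems = []
    for name, unit in (("master", master), ("slave", slave)):
        if unit.demo_mode:
            problems.append(f"{name} is in demo mode")
        elif not unit.ha_enabled:
            problems.append(f"{name} license does not enable HA")
    if master.capabilities != slave.capabilities:
        problems.append("license capabilities differ")
    if sync_latency_ms > MAX_SYNC_LATENCY_MS:
        problems.append("sync link latency exceeds 20 ms")
    return problems


licensed = UnitInfo(frozenset({"ha", "vpn"}), ha_enabled=True, demo_mode=False)
unlicensed = UnitInfo(frozenset(), ha_enabled=False, demo_mode=True)
print(ha_problems(licensed, licensed, sync_latency_ms=5.0))
print(ha_problems(licensed, unlicensed, sync_latency_ms=25.0))
```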