6.2. Web Content Filtering

cOS Core 14.00.18 Administration Guide

6.2.1. Overview

As part of the HTTP ALG, cOS Core supports Web Content Filtering (WCF) of web traffic, which enables an administrator to permit or block access to web pages based on the content type of those web pages. Web content filtering requires a minimum of administration effort and has very high accuracy. WCF can be configured to work with either HTTP or HTTPS traffic or both.

Methods of Enabling WCF

WCF can be enabled using either of the following methods:

Using IP Policies

Configuring web content filtering using an IP Policy object is the simplest method and is the recommended way of doing it. WCF is first configured on a new Web Profile object and this object is then associated with an IP policy that triggers on the target traffic. The Protocol property of the Service object assigned to the IP policy must be set to HTTP.

WCF setup using IP policies is discussed further in Section 6.2.2, WCF Setup Using IP Policies.
Using IP Rules

With an IP Rule object, WCF is first enabled on an HTTP ALG object. Then, that ALG is associated with a Service object which is in turn associated with an IP rule.

WCF setup using IP rules is discussed further in Section 6.2.3, WCF Setup Using IP Rules.

WCF Databases

cOS Core WCF allows web page blocking to be automated so it is not necessary to manually specify beforehand which URLs to block or to allow. Instead, Clavister maintains a global infrastructure of databases containing huge numbers of current website URL addresses which are already classified and grouped into a variety of categories such as shopping, news, sport, adult-oriented and so on.

The scope of the URLs in the databases is global, covering websites in many different languages and hosted on servers located in many different countries.

	Note: WCF database access uses TCP port 9998
When cOS Core sends a query to the external WCF databases, it sends it as a TCP request to the destination port 9998. Therefore, any network equipment through which the request passes, including other firewalls, must not block TCP traffic with destination port 9998.
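One way to confirm that intermediate equipment does not block this traffic is to attempt an outbound TCP connection to port 9998 from a host on the same network path. The Python sketch below does only that. The server hostname must be supplied by the administrator, as this guide does not name the WCF servers. A successful TCP connect shows only that the port is not filtered, not that WCF lookups work.

```python
import socket

WCF_PORT = 9998  # destination port stated in the note above


def tcp_port_reachable(host: str, port: int = WCF_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False
```

Calling tcp_port_reachable("server-name") with a real server name returns False if any device on the path drops or rejects the connection.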

WCF Processing Flow

When a user of a web browser requests access to a website, cOS Core queries the external WCF databases in order to retrieve the category of the requested site. Access to the URL can then be allowed or denied based on the filtering policy that the administrator has put in place for that particular category.

If access is denied, a web page will be presented to the user explaining that the requested site has been blocked. To make the lookup process as fast as possible, cOS Core maintains a local cache in memory of recently accessed URLs. Caching can be highly efficient since a given user community, such as a group of university students, often connects to a limited range of websites.

Figure 6.8. Web Content Filtering Flow

If the requested web page URL is not present in the databases, then the webpage content at the URL will automatically be downloaded to Clavister's central data warehouse and automatically analyzed using a combination of software techniques. Once categorized, the URL is distributed to the global databases and cOS Core receives the category for the URL. WCF therefore requires a minimum of administration effort.

	Note: New URL submissions are anonymous
	New, uncategorized URLs sent to the Clavister network are treated as anonymous submissions and no record of the source of new submissions is kept.
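The lookup sequence described in this section can be summarized in a short Python sketch. It is a conceptual model only and not cOS Core code. The query_external_db callable, the dictionary used as a cache and the category names are all assumptions made for illustration.

```python
from typing import Callable, Dict, Set


def check_url(
    url: str,
    blocked_categories: Set[str],
    cache: Dict[str, str],
    query_external_db: Callable[[str], str],
) -> str:
    """Return "allow" or "deny" for a requested URL."""
    # 1. Consult the local cache of recently accessed URLs first.
    category = cache.get(url)
    if category is None:
        # 2. Cache miss: ask the external databases and remember the answer.
        category = query_external_db(url)
        cache[url] = category
    # 3. Apply the administrator's filtering policy for that category.
    return "deny" if category in blocked_categories else "allow"
```

In this model, a second request for the same URL is answered from the cache without another external query, which is why caching helps when a user community visits a limited range of websites.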

Categorizing Pages and Not Sites

cOS Core WCF categorizes web pages and not sites. In other words, a web site may contain particular pages that should be blocked without blocking the entire site. cOS Core provides blocking down to the page level so that users may still access those pages of a website that are not blocked by the filtering policy.

WCF and Whitelisting

If a particular URL is whitelisted then it will bypass the WCF subsystem. No classification will be done on the URL and it will always be allowed. This applies if the URL has an exact match with an entry on the whitelist or if it matches an entry that makes use of wildcarding.
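The difference between an exact match and a wildcard match can be illustrated with a small Python sketch. The exact wildcard syntax accepted by cOS Core is not described in this section, so the shell-style * matching used here (via the standard fnmatch module) is an assumption, and the URLs are made-up examples.

```python
from fnmatch import fnmatchcase
from typing import Iterable


def is_whitelisted(url: str, whitelist: Iterable[str]) -> bool:
    """True if url equals an entry or matches an entry containing wildcards."""
    return any(fnmatchcase(url, pattern) for pattern in whitelist)


whitelist = ["www.example.com/index.html", "*.example.org/*"]
print(is_whitelisted("www.example.com/index.html", whitelist))  # exact match: True
print(is_whitelisted("docs.example.org/guide", whitelist))      # wildcard match: True
print(is_whitelisted("www.example.net/", whitelist))            # no match: False
```

A URL for which this function returns True would skip classification entirely and always be allowed.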

WCF is a Subscription Based Feature

Web content filtering is a feature that is enabled by purchasing a subscription to the service. This is an addition to the normal cOS Core license. This subscription is described further in Appendix A, Subscription Based Features along with details of WCF behavior after subscription expiry.

Introducing Blocking Gradually

Blocking websites can disturb users if it is introduced suddenly. It is therefore recommended that the administrator gradually introduces the blocking of particular categories one at a time. This allows individual users time to get used to the notion that blocking exists and could avoid any adverse reaction that might occur if too much is blocked at once. Gradual introduction also makes it easier to evaluate if the goals of site blocking are being met.

6.2.2. WCF Setup Using IP Policies

WCF can be enabled on an IP Policy object instead of using the combination of an HTTP ALG object with an IP Rule object. Using an IP Policy object provides a more direct method of WCF activation which can be combined with the other options available in an IP policy, such as traffic shaping or anti-virus scanning.

To set up WCF using an IP Policy object, the following steps are required:

Create a custom Service object for the protocol targeted. Make sure the Protocol property of this object is set to HTTP.
Create a Web Profile object that has the appropriate settings for the type of web content filtering required.
Associate these Service and Web Profile objects with an IP Policy object that targets the traffic to be filtered.

Predefined HTTP Services

With cOS Core version 11.03 or later, the default configuration will contain predefined Service objects where the Protocol property will already be correctly set. For WCF, this is the http-outbound service. When an existing installation is upgraded from an earlier version to 11.03 or later, the Protocol property will need to be explicitly set on the relevant services. For clarity, the example in this section will create a custom Service object and explicitly set the Protocol property.

Fail Mode Action

The fail mode setting determines what happens when web content filtering cannot function. This is usually because cOS Core is unable to reach the external databases to perform URL lookup.

Fail mode can have one of the following settings:

Allow

This is the default value for the property. If the external WCF database is not accessible, URLs are allowed even though they might be disallowed if the WCF databases were accessible.
Deny

If WCF is unable to function because the external databases cannot be reached to verify a URL, the URL is denied. The user will see an "Access denied" web page.
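The effect of the two fail mode settings can be sketched in Python as follows. This is an illustrative model and not cOS Core code. The lookup callable and the use of ConnectionError to signal an unreachable database are assumptions.

```python
from typing import Callable, Set


def decide(
    url: str,
    blocked: Set[str],
    lookup: Callable[[str], str],
    fail_mode: str = "Allow",  # Allow is the default described above
) -> str:
    """Return "allow" or "deny", falling back to fail_mode if the lookup fails."""
    try:
        category = lookup(url)
    except ConnectionError:
        # External databases unreachable: the fail mode setting decides.
        return "allow" if fail_mode == "Allow" else "deny"
    return "deny" if category in blocked else "allow"
```

With fail_mode set to "Allow", a URL in a blocked category is let through while the databases are unreachable. With "Deny", every URL is refused until lookups succeed again.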

Example 6.41. WCF Setup Using an IP Policy

This example shows how to set up web content filtering for HTTP traffic coming from HTTP clients on a protected network which is destined for the Internet. It will be configured to block all shopping sites. It is assumed that an IP Policy object called http_nat_policy already exists and this implements NAT for the client connections to the Internet.

Command-Line Interface

Create a Service object:

Device:/> add Service ServiceTCPUDP http_wcf_service
			Type=TCP
			DestinationPorts=80 
			Protocol=HTTP

Create a Web Profile object:

Device:/> add Policy WebProfile my_wcf_profile
			WCF=Yes
			WCFCategories=SHOPPING

Modify the IP Policy to use the new service, as well as the profile:

Device:/> set IPPolicy http_nat_policy
			Service=http_wcf_service
			WebControl=Yes
			Web_Policy=my_wcf_profile

InControl

Follow similar steps to those used for the Web Interface below.

Web Interface

Create a service object for the traffic:

Go to: Local Objects > Services > Add > TCP/UDP service
Now enter:
- Name: http_wcf_service
- Type: TCP
- Destination port: 80
- Protocol: HTTP
Click OK

Create a Web Profile object:

Go to: Policies > Firewalling > Web > Add > Web Profile
Specify the Name as my_wcf_profile
Enable Web Content Filtering
Add Shopping to the Restricted list
Click OK

Modify the IP Policy to use the new service and the profile:

Go to: Policies
Select http_nat_policy
Select http_wcf_service from the Service list
Select the Web Control options
Enable Web Control
Select my_wcf_profile from the Web Profile list
Click OK

6.2.3. WCF Setup Using IP Rules

Setting up WCF with an IP rule requires the following steps:

Define an HTTP ALG object with Web Content Filtering enabled.

Alternatively, use the Light Weight HTTP ALG (LW-HTTP ALG). This is preferred as it has less system overhead and will provide higher traffic throughput. The disadvantage is that certain features, such as Anti-Virus scanning and stripping static web content, are not supported. The LW-HTTP ALG is discussed further in Section 6.1.2.5, Light Weight HTTP ALG.
The ALG object is then associated with a Service object. It is recommended to create a custom Service object for this purpose so the predefined Service objects are left unchanged.
This Service object is then associated with an IP Rule object to determine which traffic should be subject to filtering. This allows a detailed filtering policy to be defined.

Example 6.42. WCF Setup Using IP Rules

This example shows how to set up web content filtering for HTTP traffic from a protected network to all-nets. It will be configured to block all search sites, and it is assumed that there is a single NAT IP rule controlling HTTP traffic.

Note that this example configures filtering using an IP Rule object. It could also be done with an IP Policy object, as shown in Example 6.41 in Section 6.2.2, WCF Setup Using IP Policies.

Command-Line Interface

First, create an HTTP Application Layer Gateway (ALG) Object:

Device:/> add ALG ALG_HTTP content_filtering
			WebContentFilteringMode=Enabled
			FilteringCategories=SEARCH_SITES

Then, create a service object using the new HTTP ALG:

Device:/> add Service ServiceTCPUDP http_content_filtering Type=TCP
			DestinationPorts=80 
			ALG=content_filtering

Finally, modify the NAT rule to use the new service. Assume the rule is called NATHttp:

Device:/> set IPRule NATHttp Service=http_content_filtering

InControl

Follow similar steps to those used for the Web Interface below.

Web Interface

First, create an HTTP Application Layer Gateway (ALG) Object:

Go to: Objects > ALG > Add > HTTP ALG
Specify a suitable name for the ALG, for example content_filtering
Click the Web Content Filtering tab
Select Enabled in the Mode list
In the Blocked Categories list, select Search Sites and click the >> button.
Click OK

Then, create a service object using the new HTTP ALG:

Go to: Local Objects > Services > Add > TCP/UDP service
Specify a suitable name for the Service, for example http_content_filtering
Select TCP in the Type list
Enter 80 as the Destination Port
Select the HTTP ALG just created in the ALG list
Click OK

Finally, modify the NAT IP rule to use the new service:

Go to: Policies
Select the NAT rule handling the HTTP traffic
Select http_content_filtering from the Service list
Click OK

Web content filtering is now activated for all web traffic from lan_net to all-nets.

We can validate the functionality with the following steps:

On a workstation on the lan_net network, launch a standard web browser.
Try to browse to a search site. For example, www.google.com.
If everything is configured correctly, the web browser will present a web page that informs the user that the requested site has been blocked.

Example 6.43. Enabling Audit Mode

This example is based on the same scenario as the previous example, but now with audit mode enabled.

Command-Line Interface

First, create an HTTP Application Layer Gateway (ALG) Object:

Device:/> add ALG ALG_HTTP content_filtering
			WebContentFilteringMode=Audit
			FilteringCategories=SEARCH_SITES

InControl

Follow similar steps to those used for the Web Interface below.

Web Interface

First, create an HTTP Application Layer Gateway (ALG) Object:

Go to: Objects > ALG > Add > HTTP ALG
Specify a suitable name for the ALG, for example content_filtering
Click the Web Content Filtering tab
Select Audit in the Mode list
In the Blocked Categories list, select Search Sites and click the >> button
Click OK

The steps to then create a service object using the new HTTP ALG and modify the NAT IP rule to use the new service are described in the previous example.

Web Content Filtering with HTTPS

The HTTP ALG can be configured to apply to HTTP traffic, HTTPS traffic or both. If filtering of HTTPS traffic is to work then the Service object associated with the ALG should be one that allows the appropriate port numbers.

For example, the predefined service http-all could be used when both HTTP (port 80) and HTTPS (port 443) traffic are allowed. A custom service may need to be defined and used if an existing predefined service does not meet the requirements of the traffic.

A further point to note with WCF over an HTTPS connection is that if access to a particular site is denied, the HTTPS connection is automatically dropped. This means that the browser will not be able to display the usual cOS Core generated messages to indicate that the WCF feature has intervened and why. Instead, the browser will only display its own message to indicate the connection is broken.

The Fail Mode setting can also affect HTTPS connections. If no hostname is found in either the ClientHello from the client or the ServerHello from the server during the initial HTTPS handshake, before encrypted packets are sent, then the connection is dropped if the Fail Mode action is Deny and not dropped if the action is Allow.

Audit Mode

In Audit Mode, the system will classify and log all surfing according to the content filtering policy, but restricted websites will still be accessible to the users. This means the content filtering feature of cOS Core can then be used as an analysis tool to analyze what categories of websites are being accessed by a user community and how often.

After running in Audit Mode for some period of time, the administrator will have a better understanding of the surfing behavior of different user groups and of the potential impact of turning on the WCF feature.
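As an illustration of the kind of analysis Audit Mode makes possible, the Python sketch below tallies requests per category and counts how many would have been blocked had filtering been enforced. The input format, a list of (user, category) pairs, is hypothetical and does not reflect the actual cOS Core log format.

```python
from collections import Counter
from typing import Iterable, Set, Tuple


def audit_summary(
    log: Iterable[Tuple[str, str]], restricted: Set[str]
) -> Tuple[Counter, int]:
    """Return (requests per category, number of requests that would be blocked)."""
    per_category = Counter(category for _user, category in log)
    would_block = sum(
        count for category, count in per_category.items() if category in restricted
    )
    return per_category, would_block
```

Comparing the would-be-blocked count against the total number of requests gives a rough measure of how disruptive enabling a given category would be.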

Allowing Override

On some occasions, web content filtering may prevent users carrying out legitimate tasks. Consider a stock analyst who deals with online gaming companies. In his daily work, he might need to browse gambling websites to conduct company assessments. If the corporate policy blocks gambling websites, he will not be able to do his job.

For this reason, cOS Core supports a feature called Allow Override. With this feature enabled, the content filtering component will present a warning to the user that he is about to enter a website that is restricted according to the corporate policy, and that his visit to the web site will be logged. This page is known as the restricted site notice. The user is then free to continue to the URL, or abort the request to prevent being logged.

By enabling this functionality, only users that have a valid reason to visit inappropriate sites will normally do so. Others will avoid those sites due to the obvious risk of exposing their surfing habits.

	Caution: Overriding the restriction of a site
	If a user overrides the restricted site notice page, they are allowed to surf to all pages without any new restricted site message appearing again. However, the user is still being logged. When the user has been inactive for 5 minutes, the restricted site page will reappear if they then try to access a restricted site.
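The inactivity behavior described in the caution above can be modeled with a small Python sketch. It is a simplified illustration and not cOS Core code. In particular, it assumes that only requests to restricted sites count as activity, which may differ from how cOS Core measures inactivity.

```python
OVERRIDE_IDLE_SECONDS = 5 * 60  # inactivity period stated in the caution above


class OverrideSession:
    """Tracks whether a user's override of the restricted site notice is active."""

    def __init__(self) -> None:
        self._last_activity = None  # time of last request made under the override

    def accept_notice(self, now: float) -> None:
        """The user chose to continue past the restricted site notice."""
        self._last_activity = now

    def must_show_notice(self, now: float) -> bool:
        """Call for each restricted request; True means show the notice again."""
        if (
            self._last_activity is None
            or now - self._last_activity >= OVERRIDE_IDLE_SECONDS
        ):
            self._last_activity = None  # override has lapsed
            return True
        self._last_activity = now  # activity keeps the override alive
        return False
```

In this model, a user who keeps browsing never sees the notice again, while one who pauses for five minutes or more is shown it on the next restricted request.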

Reclassification of Blocked Sites

As the process of classifying unknown websites is automated, there is always a small risk that some sites are given an incorrect classification. cOS Core provides a mechanism for allowing users to manually submit a blocked URL for reclassification.

This mechanism can be enabled on a per-HTTP ALG level, which means that the administrator can choose to enable this functionality for regular users or for a selected user group only.

If reclassification is enabled and a user requests a website which is disallowed, the block page will include a Reclassify link. The link will take the user to a special reclassification web page where the blocked URL can be manually entered and a request submitted for it to be reclassified. The processing of these submissions is not immediate and may take some time.

Example 6.44. Reclassifying URLs Blocked by WCF

This example shows how a user may propose a reclassification of a website if he believes it is wrongly classified. This mechanism is enabled on a per-HTTP ALG level basis.

Command-Line Interface

First, create an HTTP Application Layer Gateway (ALG) Object:

Device:/> add ALG ALG_HTTP content_filtering
			WebContentFilteringMode=Enabled
			FilteringCategories=SEARCH_SITES
			AllowReclassification=Yes

Then, continue setting up the service object and modifying the NAT rule as we have done in the previous examples.

InControl

Follow similar steps to those used for the Web Interface below.

Web Interface

First, create an HTTP Application Layer Gateway (ALG) Object:

Go to: Objects > ALG > Add > HTTP ALG
Specify a suitable name for the ALG, for example content_filtering
Click the Web Content Filtering tab
Select Enabled in the Mode list
In the Blocked Categories list, select Search Sites and click the >> button
Check the Allow Reclassification control
Click OK

Then, continue setting up the service object and modifying the NAT rule as we have done in the previous examples.

Web content filtering is now activated for all web traffic from lan_net to all-nets and the user is able to propose reclassification of blocked sites. Validate the functionality by following these steps:

On a workstation on the lan_net network, launch a standard web browser.
Try to browse to a search site, for example www.google.com.
If everything is configured correctly, the web browser will present a block page with a browser link to the reclassification web page.
Click the reclassification link. The user is now able to submit the URL for reclassification.

Event Messages

WCF utilizes the request_url log event message to log its activities. The parameters of this event message will contain information on whether the request was allowed or blocked, and what categories the website was classified as. Also, the message includes parameters specifying if audit mode or a restricted site notice were in effect.

	Note: Enabling request_url message generation
	The request_url event message will only be generated if event message generation has been enabled in the "parent" IP rule set entry.

6.2.4. WCF Categories

WCF Advisories

A listing of all the current categories available with the Web Content Filtering subsystem can be found online at the following link:

https://www.clavister.com/advisories/wcf

Determining the Category of a Given Webpage

Clavister also provides a tool on its website to determine what the category classification of a given webpage would be in the WCF subsystem. The tool can be found at the following link:

https://www.clavister.com/web-content-filtering/

This tool allows the administrator to more accurately configure what categories to allow or block. It is also discussed in a Clavister Knowledge Base article at the following link:

https://kb.clavister.com/354846751

Common WCF Categories

Below is a summary of the most common web content filtering categories:

Academic Fraud

A website may be categorized under the Academic Fraud category if the site offers or appears to offer services related to Academic Fraud. This includes 3rd party assignment and/or essay writing or any other services aimed at assisting students or researchers to obtain academic qualifications fraudulently.
Adult Content

A website may be categorized under this category if the site primarily contains adult content.
Advertising

A website may be categorized under this category if the site primarily contains information relating to advertising.
Animals:Pets

A website may be categorized under the Animals/Pets category if its content includes information pertaining to Animals, or images relating to animals and/or pets.
Arts:Culture

A website may be categorized under the Arts/Culture category if its content includes information pertaining to the Arts/Culture, or images relating to the Arts/Culture.
Auctions

A website may be categorized under the Auctions category if its content includes information pertaining to, or images relating to, auctions and/or the site involves the auctioning of goods and/or services.
Audio Streaming Services

A website may be categorized under the Audio Streaming Services category if the site hosts or provides an audio streaming service and/or software that facilitates audio streaming and/or information pertaining to online audio streaming. It does not include streamed radio or TV.
Backups:Storage

A website may be categorized under the Backups/Storage category if its content includes information pertaining to, or images relating to backup/storage and/or if the site is providing an online backup/storage service.
Botnets

A website may be categorized under the Botnets category if the site is currently participating in a botnet and/or contains botnet malware. Computer security sites that have information relating to botnets will be classified under IT Security.
Business Oriented

A website may be categorized under this category if the site primarily contains business related content.
Charities

A website may be categorized under the Charities category if its content includes information pertaining to, or images relating to charities.
Chat Rooms

A website may be categorized under the Chat Rooms category if the site primarily hosts or provides chat room services and/or its content includes information pertaining to, or images relating to chat rooms.
Child Abuse Material

This content is illegal in most jurisdictions in most countries.
Child Entertainment

A website may be categorized under the Child Entertainment category if its content includes information pertaining to, or images relating to child entertainment.
Clubs and Societies

A website may be categorized under this category if the site primarily contains club or society related content.
Computing/IT

A website may be categorized under this category if the site primarily contains computer or IT related content.
Crime/Terrorism

A website may be categorized under this category if the site primarily contains crime or terrorism related content.
Dating Sites

A website may be categorized under this category if the site primarily contains personal dating related content.
Dictionary

A website may be categorized under the Dictionary category if the site is a dictionary site or a site offering similar services, or has information and/or images about dictionaries and/or dictionary sites.
Drugs/Alcohol

A website may be categorized under this category if the site primarily contains information relating to alcohol and/or drugs.
Drugs:illicit

A website may be categorized under the Drugs:illicit category if its content includes information/material/products and/or images that promote or facilitate the use of drugs that are illegal in most jurisdictions. This does not include health information relating to medical use or general use of otherwise illegal drugs.
Drugs:Pharmaceuticals

A website may be categorized under the Drugs/Pharmaceuticals category if its content includes information pertaining to, or images relating to legal drugs/pharmaceuticals.
Dynamic DNS

A website may be categorized under the Dynamic DNS category if its content includes information pertaining to, or images relating to and/or is providing a dynamic DNS service.
E-Banking

A website may be categorized under this category if the site primarily relates to e-banking.
Educational

A website may be categorized under this category if the site primarily contains information related to education.
Educational Games

A website may be categorized under the Educational Games category if its content includes software and/or information pertaining to, or images relating to, or is an actual gaming site that is intended for an educational audience.
Embedded Threats

A website may be categorized under the Embedded Threats category if its content includes embedded malware. This includes sites that may otherwise be legitimate sites that are currently infected with some form of malware.
Entertainment

A website may be categorized under this category if the site primarily contains information relating to entertainment.
Fashion

A website may be categorized under the Fashion category if its content includes information pertaining to, or images relating to fashion.
Gambling

A website may be categorized under this category if the site relates to gambling.
Games sites

A website may be categorized under this category if the site primarily contains information relating to online games.
Government

A website may be categorized under the Government category if the site is run by any government authority (federal, state, local or national). It includes most .gov sites.
Government Blocking List

A website may be categorized under this category if the site relates to government blocking lists.
Guns:Weapons

A website may be categorized under the Guns/Weapons category if its content includes information pertaining to, or images relating to Weapons.
Hacking

A website may be categorized under the Hacking category if its content includes information pertaining to, or images relating to and/or tools to assist with illegal Hacking. Information relating to malware protection and/or defence against hacking and/or defence against malware is contained within the IT Security category.
Health Sites

A website may be categorized under the Health Sites category if its content includes information pertaining to health.
Hobbies

A website may be categorized under the Hobbies category if its content includes information pertaining to, or images relating to hobbies.
Hosted Services

A website may be categorized under the Hosted Services category if the site provides hosted services or hosting services and/or its content includes information pertaining to, or images relating to, hosted services.
Humor

A website may be categorized under the Humor category if its content includes information pertaining to, or images relating to Humor. E.g. joke sites.
Investment Sites

A website may be categorized under the Investment Sites category if its content includes information related to investment.
ISPs

A website may be categorized under the ISPs category if its content includes information pertaining to, or images relating to an Internet Service Provider (ISP).
IT Forums:Blogs

A website may be categorized under the IT Forums/Blogs category if the site hosts or provides an Internet Forum or Internet Log (BLOG) containing articles, images and/or information pertaining to Information Technology.
IT Security

A website may be categorized under the IT Security category if its content includes information/software pertaining to, or images relating to information technology security and/or Internet security.
Job Search

A website may be categorized under this category if the site primarily contains information relating to employment.
Keyloggers

A website may be categorized under the Keyloggers category if the site contains or may contain keylogging software and/or information to facilitate keylogging activities, designed for malicious purposes.
Malicious

A website may be categorized under this category if the site contains malicious code.
Media:File Sharing

A website may be categorized under the Media/File Sharing category if its content includes information pertaining to, or images relating to or the provision of file and/or other media sharing services.
Military

A website may be categorized under the Military category if its content includes information pertaining to, or images relating to military services and/or equipment designed for military use.
Music Download

A website may be categorized under this category if the site relates to downloading music.
News

A website may be categorized under this category if the site relates to news.
Nudity

A website may be categorized under the Nudity category if its content includes non-pornographic information pertaining to, or images relating to or depicting nudity.
Online Meeting & Collaboration

A website may be categorized under the Online Meeting & Collaboration category if it provides Online Meeting & Collaboration services or information pertaining to online meeting & collaboration, for example GoToMeeting or WebEx.
Pay to Surf

A website may be categorized under the Pay to Surf category if its content includes information pertaining to, or images relating to, or the facilitation of any pay to surf activity.
Peer To Peer

A website may be categorized under the Peer To Peer category if its content includes information pertaining to, images relating to, and/or software designed to facilitate peer to peer activity, for example Limewire.
Personal

A website may be categorized under the Personal category if its content includes information and/or images relating to a particular individual. This does not include social networking sites.
Personal Beliefs/Cults

A website may be categorized under this category if the site primarily contains information relating to beliefs and cults.
Phishing:Frauds

A website may be categorized under the Phishing/Frauds category if it contains or may contain phishing and/or malware code designed to trick the victim for the purpose of fraud.
Politics

A website may be categorized under this category if the site primarily contains information relating to politics.
Proxies:Filtering Bypass

A website may be categorized under the Proxies and Filtering Bypass category if the site is a bypass proxy server or provides software designed to hide the user's identity or the source of any browsing traffic.
Radio

A website may be categorized under the Radio category if the site provides access to online streaming radio and/or provides software designed to facilitate the streaming of online radio.
Real Estate

A website may be categorized under the Real Estate category if its content includes information pertaining to, or images relating to real estate.
Remote Control/Desktop

A website may be categorized under this category if the site relates to software for remote control of computers.
Restaurants:Dining Food

A website may be categorized under the Restaurants/Dining Food category if its content includes information pertaining to, or images relating to eating out.
Search Sites

A website may be categorized under this category if the site relates to Internet search engines.
ShareWare:FreeWare

A website may be categorized under the ShareWare/FreeWare category if the site provides access to Shareware or Freeware. Sites that do provide free/share software commonly used by professional IT staff are excluded and come under Computing/IT.
Shopping

A website may be categorized under this category if the site relates to online shopping.
Social Networking

A website may be categorized under the Social Networking category if it provides an online social networking service.
Software Downloading:Sharing

A website may be categorized under the Software Downloading/Sharing category if the site provides a hosting and downloading service for multiple software authors.
Spam URLs

A website may be categorized under the Spam URLs category if its content includes information pertaining to, or images relating to spamming.
Special Events

A website may be categorized under the Special Events category if its content includes information pertaining to, or images relating to specific special events.
Sports

A website may be categorized under this category if the site relates to sports.
Spyware

A website may be categorized under the Spyware category if its content includes spyware.
Stock Trading

A website may be categorized under the Stock Trading category if it provides the ability to trade stocks online.
Surveillance Monitoring Site Cams

A website may be categorized under the Surveillance Monitoring Site Cams category if the site provides access to video cams.
Suspicious

A website may be categorized under the Suspicious category if it hosts or appears to host suspicious content or suspicious software.
Swimsuit/Lingerie Models

A website may be categorized under this category if the site relates to swimsuit and/or lingerie models.
Text Messaging

A website may be categorized under the Text Messaging category if its purpose is to send and/or receive text or SMS messages to mobile and/or other devices.
Tobacco

A website may be categorized under the Tobacco category if its content includes information pertaining to, or images relating to tobacco.
Travel/Tourism

A website may be categorized under this category if the site primarily contains information relating to travel and tourism.
TV

A website may be categorized under the TV category if the site provides online streaming of generic TV services and/or software applications designed to facilitate online streaming of television programs.
Unions:Professional Organizations

A website may be categorized under the Unions & Professional Organizations category if its content includes information pertaining to, or images relating to unions and/or professional organizations.
Vehicles

A website may be categorized under the Vehicles category if its content includes information pertaining to, or images relating to motor vehicles, motor bikes and other motorised vehicles. This includes buying & selling motor vehicles, motoring magazines, motor manufacturing sites and similar.
Violence/Undesirable

A website may be categorized under this category if the site primarily contains information relating to violence and other undesirable behavior.
Viral Video

A website may be categorized under the Viral Video category if its content includes information pertaining to, or images relating to online videos that have gone viral.
Weather

A website may be categorized under the Weather category if its content includes information pertaining to, or images relating to the weather.
WWW email sites

A website may be categorized under this category if the site relates to webmail.

6.2.5. Customizing WCF HTML Pages

The Web Content Filtering (WCF) feature of the HTTP ALG makes use of a set of HTML files to present information to the user when certain conditions occur, such as trying to access a blocked site.

These HTML web pages are stored as files in cOS Core and these files are known as HTTP Banner Files. The administrator can customize the appearance of the HTML in these files to suit a particular installation's needs. The cOS Core management interface provides a simple way to download, edit and re-upload the edited files.

	Note
	The banner files related to authentication rules and web authentication are a separate subject and are discussed in Section 9.3, Customizing Authentication HTML.

Available Banner Files

The predefined HTTP ALG banner files for WCF are:

CompressionForbidden
ContentForbidden
URLForbidden
RestrictedSiteNotice
ReclassifyURL

HTML Page Parameters

The HTML pages contain a number of parameters that can be used as needed. The parameters available are:

%URL% - The URL which was requested.
%IPADDR% - The IP address of the client.
%REASON% - The reason that access was denied.
%RESTRICTED_SITE_NOTICE_BEGIN_SECTION% - This begins the restricted site section.
%RESTRICTED_SITE_NOTICE_FORM% - Allows the restricted notice to be ignored.
%RESTRICTED_SITE_NOTICE_END_SECTION% - This ends the restricted site section.
%RECLASSIFICATION_FORM% - Allows site reclassification to be flagged.

By not including the section between %RESTRICTED_SITE_NOTICE_BEGIN_SECTION% and %RESTRICTED_SITE_NOTICE_END_SECTION%, the ability to ignore the restricted site warning can be removed.
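The parameters above can be tried out on a local copy of a banner file before it is uploaded. The following Python sketch is only an illustration and is not part of cOS Core: the function names and the sample values are hypothetical, and it assumes that removing the restricted site section means deleting both markers and everything between them. On the firewall, cOS Core itself supplies the real parameter values.

```python
import re

# Hypothetical sample values for a local preview only.
# On the firewall, cOS Core fills in the real values.
SAMPLE_VALUES = {
    "URL": "http://www.example.com/",
    "IPADDR": "192.168.1.10",
    "REASON": "Category blocked by policy",
}

SECTION_BEGIN = "%RESTRICTED_SITE_NOTICE_BEGIN_SECTION%"
SECTION_END = "%RESTRICTED_SITE_NOTICE_END_SECTION%"


def strip_restricted_section(html: str) -> str:
    """Delete both section markers and everything between them (assumed behavior)."""
    pattern = re.escape(SECTION_BEGIN) + r".*?" + re.escape(SECTION_END)
    return re.sub(pattern, "", html, flags=re.DOTALL)


def preview(html: str, values: dict) -> str:
    """Replace each %NAME% that has a sample value; leave other parameters as-is."""
    def substitute(match):
        return values.get(match.group(1), match.group(0))
    return re.sub(r"%([A-Z_]+)%", substitute, html)


page = (
    "<p>%URL% was blocked for %IPADDR%: %REASON%</p>"
    + SECTION_BEGIN
    + "<p>%RESTRICTED_SITE_NOTICE_FORM%</p>"
    + SECTION_END
)
print(preview(strip_restricted_section(page), SAMPLE_VALUES))
```

Parameters without a sample value, such as %RECLASSIFICATION_FORM%, are left untouched so that the edited file can still be uploaded with them in place.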

Customizing Banner Files

To perform customization it is necessary to first create a new, named ALG Banner Files object. This new object automatically contains a copy of all the files in the Default ALG Banner Files object. These new files can then be edited and uploaded back to cOS Core. The original Default object cannot be edited. The following example goes through the necessary steps.

Example 6.45. Editing Content Filtering HTTP Banner Files

This example shows how to modify the contents of the URL forbidden HTML page.

InControl

Follow similar steps to those used for the Web Interface below.

Web Interface

Go to: System > Advanced Settings > HTTP Banner files > Add > ALG Banner Files
Enter a name such as new_forbidden and press OK
The dialog for the new set of ALG banner files will appear
Click the Edit & Preview tab
Select URLForbidden from the Page list
Now edit the HTML source that appears in the text box for the Forbidden URL page
Use Preview to check the layout if required
Press Save to save the changes
Click OK to exit editing
Go to: Policies > User Authentication > User Authentication Rules
Select the relevant HTTP ALG and click the Agent Options tab
Set the HTTP Banners option to be new_forbidden
Click OK
Go to: Configuration > Save & Activate to activate the new file
Press Save and then click OK

The new file will be uploaded to cOS Core

	Tip: Saving changes
	In the above example, more than one HTML file can be edited in a session but the Save button should be pressed to save all edits before beginning to edit another file.

Uploading with SCP

It is possible to upload new HTTP Banner files using SCP. The steps to do this are:

Since SCP cannot be used to download the original default HTML, the source code must be first copied from the Web Interface and pasted into a local text file which is then edited using an appropriate editor.
A new ALG Banner Files object must exist which the edited file(s) is uploaded to. If the object is called mytxt, the CLI command to create this object is:
```
Device:/> add HTTPALGBanners mytxt
```
This creates an object which contains a copy of all the Default content filtering banner files.
The modified file is then uploaded using SCP. It is uploaded to the object type HTTPALGBanners and the object mytxt with the property name URLForbidden.

If the edited URLForbidden local file is called my.html then, using the OpenSSH SCP client, the upload command would be:
```
scp my.html admin@10.5.62.11:HTTPALGBanners/mytxt/URLForbidden
```
The usage of SCP clients is explained further in Section 2.1.8, Using SCP.
Using the CLI, the relevant HTTP ALG should now be set to use the mytxt banner files. If the ALG is called my_http_alg, the command would be:
```
set ALG_HTTP my_http_alg HTTPBanners=mytxt
```
As usual, the activate followed by the commit CLI commands must be used to activate the changes on the firewall.

6.2.6. HTTPS Setup with WCF

When a client is trying to connect to a server using HTTPS, the default behavior when WCF blocks a URL is that cOS Core drops the connection without presenting an explanation to the client. To allow WCF block pages to be sent back to the client, the following configuration steps are required:

Enable the HTTPS option in the Web Profile object associated with the IP Policy for the traffic.
Set the Root Certificate field for the HTTPS option to a self-signed root certificate that has been generated by the administrator. This certificate object must have both the public and private key files.
Install the public key of the self-signed root certificate on all connecting clients as a trusted CA certificate. This is not mandatory but will avoid the client user having to create a security exception when they receive a reply page from cOS Core.

After the above has been configured, cOS Core will be able to send back WCF pages to HTTPS clients. The processing sequence for doing this is as follows:

The client attempts to open an HTTPS connection via the firewall to a remote web server.
The cOS Core WCF subsystem looks up the URL to see if the connection is permitted. If it is, the traffic can flow over HTTPS between client and server and no further WCF action is required.
If the connection is not allowed by WCF, cOS Core generates a host certificate signed by the configured root certificate. This host certificate is a wildcard certificate for the domain that the client tried to reach.
cOS Core sends back the WCF blocking response over HTTPS using the generated host certificate.
The client sees that the host certificate of the response is signed by a trusted root certificate and displays it. If the root certificate was not installed on the client then the browser will ask if the user wants to continue before displaying the response.

Host Certificates are Cached

Processing overhead is required to generate host certificates in the steps described above. To improve performance, cOS Core maintains a cache of generated certificates so that a new certificate does not have to be generated for repeatedly accessed domains.

Only TLS 1.0 and 1.2 are Supported

WCF block pages will only be sent over TLS 1.0 or TLS 1.2 connections. In normal circumstances, a TLS 1.3-capable browser will still allow TLS 1.2 to be negotiated and everything will function as expected.

However, if a browser is configured to only allow TLS 1.3 (or 1.1), and WCF blocks a page, it will give the user an error message, possibly mentioning mismatching protocol versions, and no further HTML content.

Non-blocked connections are not affected by this limitation. The firewall will not, for example, downgrade TLS 1.3 connections to 1.2.
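The TLS version limitation can be summarized as a simple check. The Python sketch below is only an illustration of the rule described above and is not cOS Core code. The function name and the representation of TLS versions as strings are assumptions made for the example.

```python
# TLS versions over which a WCF block page can be sent, as stated above.
BLOCK_PAGE_TLS_VERSIONS = {"1.0", "1.2"}


def block_page_can_be_shown(client_versions) -> bool:
    """Return True if the client allows at least one version usable for block pages."""
    return bool(BLOCK_PAGE_TLS_VERSIONS & set(client_versions))


# A typical browser that allows both TLS 1.2 and TLS 1.3.
print(block_page_can_be_shown({"1.2", "1.3"}))
# A browser restricted to TLS 1.3 only.
print(block_page_can_be_shown({"1.3"}))
```

In the second case the user would see a browser error instead of the block page, although the site is still blocked.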

The Generation Limit

The HTTPS option in the Web Profile object also has a numeric field called Generation Limit. This is the maximum number of new host certificates that cOS Core can generate per second and is designed to prevent overloading of the hardware resources.

While the limit is exceeded, new HTTPS client connections will simply be dropped, as though the feature was not enabled. Note that if the limit is set to a value of zero then no limit is applied.
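The interaction between the host certificate cache and the Generation Limit can be pictured with a small model. The Python sketch below is not how cOS Core is implemented. The class name, the use of fixed one-second windows for counting, the cache keyed by domain name and the placeholder certificate strings are all assumptions made for illustration.

```python
import time


class HostCertificateSource:
    """Toy model of cached host certificate generation with a per-second limit."""

    def __init__(self, generation_limit, clock=time.monotonic):
        self.generation_limit = generation_limit  # 0 means no limit is applied
        self.clock = clock
        self.cache = {}            # domain -> certificate placeholder
        self.window_start = None   # start time of the current one-second window
        self.generated = 0         # certificates generated in that window

    def get(self, domain):
        """Return a certificate for the domain, or None if the connection is dropped."""
        if domain in self.cache:
            return self.cache[domain]   # cached, so nothing new is generated
        now = self.clock()
        if self.window_start is None or now - self.window_start >= 1.0:
            self.window_start, self.generated = now, 0
        if self.generation_limit and self.generated >= self.generation_limit:
            return None                 # limit exceeded, the connection is dropped
        self.generated += 1
        certificate = f"wildcard certificate for {domain}"  # placeholder only
        self.cache[domain] = certificate
        return certificate


source = HostCertificateSource(generation_limit=2)
for name in ("a.example", "b.example", "c.example", "a.example"):
    print(name, "->", source.get(name))
```

In this model, a request for an already cached domain never counts against the limit, which is why repeatedly accessed domains are unaffected when the limit is reached.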

6.2.7. Examining WCF Performance

cOS Core provides an option for looking more closely at what the web content filtering subsystem is doing and this is called the WCF Performance Log. It is intended to be used by qualified support technicians but it is useful to know that it exists and how to enable it.

When enabled, this feature takes a snapshot of the status of the WCF subsystem and outputs a wcf_performance_notice log event message to all configured log receivers, including Memlog. An example of this log message is shown below:

```
2014-10-04 08:47:25 Info ALG 200142 wcf_performance_notice
algmod=http cache_size=88 cache_repl_per_sec=3 trans_per_sec=8 queue_len=7
in_transit=2 rtt=1 queue_delta_per_sec=1 server=10.1.0.10 srv_prec=primary
```

When enabling the performance log, one of the following frequencies can be chosen for how often log generation occurs:

Every 5 seconds.
Every 10 seconds.
Every 30 seconds.
Every 60 seconds.
Every 300 seconds.
Every 600 seconds.

Each wcf_performance_notice log message contains the following fields:

cache_size

The size of the WCF cache which contains the most recent URLs looked up against the external WCF database server.
trans_per_sec

The number of database queries being sent per second. URLs are not always sent singly. cOS Core will send batches when there is more than one waiting for processing against the database.
queue_len

The length of the queue of URLs awaiting processing by the external WCF database server.
in_transit

The number of URLs in the queue where a request has been sent to the WCF database server but a reply has not yet been received.
rtt

The round-trip time for the last WCF database server lookup.
queue_delta_per_sec

This is the amount the queue of waiting requests increases or decreases per second.
server

The IPv4 address of the server performing the database lookup.
srv_prec

The precedence of the responding server. A server with a higher precedence may not have responded to the request and the next lower precedence has been selected.
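When these log messages are collected by an external log receiver, the fields can be extracted for further analysis. The Python sketch below is a minimal example and not a Clavister tool. The function name is hypothetical, and it assumes that fields are separated by whitespace, use the name=value form shown in the example message, and contain no spaces in their values.

```python
def parse_wcf_performance_notice(message: str) -> dict:
    """Collect the name=value fields of a log message into a dictionary."""
    fields = {}
    for token in message.split():
        key, separator, value = token.partition("=")
        if not separator:
            continue  # skip tokens such as the date, time and event name
        try:
            fields[key] = int(value)   # numeric counters
        except ValueError:
            fields[key] = value        # text values such as the server address
    return fields


# The example message shown earlier, joined into one string.
sample = (
    "2014-10-04 08:47:25 Info ALG 200142 wcf_performance_notice "
    "algmod=http cache_size=88 cache_repl_per_sec=3 trans_per_sec=8 queue_len=7 "
    "in_transit=2 rtt=1 queue_delta_per_sec=1 server=10.1.0.10 srv_prec=primary"
)
fields = parse_wcf_performance_notice(sample)
print(fields["queue_len"], fields["in_transit"], fields["server"])
```

A steadily positive queue_delta_per_sec across consecutive messages would indicate that the queue of waiting URLs is growing.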

The last snapshot sent as a log message can also be viewed on a management console using the cOS Core CLI command httpalg -wcf. Below is an example of the output from the command and as shown there is additional information compared with the wcf_performance_notice log event message.

```
Device:/> httpalg -wcf

Dynamic Web Content Filter Statistics

Counter                Value
---------------------- ---------------
Cache Size:            62   URLs
Cache Hit Rate:        0    per second.
Cache Miss Rate:       0    per second.
Request Lookups:       0    per second.
Request Queue Length:  0    URLs.
Requests In Transit:   0    URLs.
RTT per transaction:   40   milliseconds.
Request Queue Delta:   0    URLs per second.
Cache Replacements:    0    URLs per second.
Last Cache Repl. - Hit Rate Idle:     N/A.
Last Cache Repl. - Idle TTL Left:     N/A.
Last Cache Repl. - Session TTL Left:  N/A.

Server:                192.168.1.18
Connection Lifetime:   574  seconds.
```

	Note: This is in techsupport command output
	The output from the CLI command httpalg -wcf is included in the output from the techsupport command. See Section 2.6.10, The techsupport Command.

A further discussion of the output from the httpalg -wcf command can be found in an article in the Clavister Knowledge Base at the following link:

https://kb.clavister.com/324735708

Enabling the WCF Performance Log

The example below shows how the WCF performance log feature is enabled.

Example 6.46. Enabling the WCF Performance Log

This example enables the WCF performance log feature so that a wcf_performance_notice log event message is generated every 5 seconds. This log message provides a snapshot of the WCF subsystem.

Command-Line Interface

```
Device:/> set Settings MiscSettings WCFPerfLog=5
```

InControl

Follow similar steps to those used for the Web Interface below.

Web Interface

Go to: System > Advanced Settings > Misc. Settings
Set WCF Performance Log to Every 5 Seconds
Click OK