Service high availability using open-source tools

People want their servers, and of course the projects running on them, to be available all the time, while bringing in maximum revenue.
In practice, it is clear that pure 100% availability is impossible to achieve, especially when a single hardware device is involved. Unpredictable factors, at either the software or the hardware level, can affect availability.
Most common examples:

  • power failures
  • network device failures
  • sysadmin’s mistakes

Most of these factors can be excluded, or their impact reduced to a minimum, through careful initial design and planning, and by eliminating every SPOF (single point of failure) wherever possible.

We also have to define the notion of downtime: the amount of time during which the service is unavailable, or the system fails to provide the services it should be providing.

Downtime can be planned or unplanned.
In an HA (high-availability) environment, we need to eliminate unplanned downtime.
Planned downtime is the result of maintenance procedures executed on the system. This may include:

  • hardware components replacement
  • applying security patches or OS updates that require a system reboot
  • performing hardware upgrades
  • system redesign
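For perspective, an availability target translates directly into a downtime budget. A rough sketch (plain shell with awk, assuming a 365-day year):

```shell
#!/bin/sh
# Allowed downtime per year for a given availability percentage:
# minutes/year = (100 - availability) / 100 * 365 * 24 * 60
for a in 99 99.9 99.99; do
    awk -v a="$a" 'BEGIN {
        printf "%s%% availability -> %.1f minutes of downtime per year\n",
               a, (100 - a) / 100 * 365 * 24 * 60
    }'
done
```

Even "three nines" (99.9%) still allows almost nine hours of downtime per year, so the maintenance procedures above have to fit into that budget.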

In this article we are designing a high-availability solution for a web service, using open-source software.

There are multiple ways of achieving an HA web service. The most common include:
1. DNS load balancing – using two or more A records in the DNS zone of a web site, with a low TTL; requests are then distributed round-robin across all the servers (or “nodes”) that the A records point to.
Example: google.com (though they are not using DNS load balancing alone; they also appear to use GeoDNS, anycast and other related technologies)

$ dig google.com A

;; QUESTION SECTION:
;google.com.            IN    A

;; ANSWER SECTION:
google.com.        300    IN    A    209.85.135.99
google.com.        300    IN    A    209.85.135.103
google.com.        300    IN    A    209.85.135.104
google.com.        300    IN    A    209.85.135.105
google.com.        300    IN    A    209.85.135.106
google.com.        300    IN    A    209.85.135.147

This method has drawbacks in a simple 2-3 node cluster:
if one node goes down, manual (or automated) modification of the domain's DNS records is required, and for at least a few more minutes (an interval equal to the DNS TTL) part of the requests will still be sent to the failed node.

We will adopt the following scheme (it can of course be extended, but we are using a minimum of resources):

  • two servers, each acting as web server, file server and load balancer, running the latest CentOS 5.4
  • two networks: a public one linked to the ISP/datacenter network, and an OOB (out-of-band) network local to these servers

The following open-source software will be used: heartbeat, nginx, apache (httpd), mod_rpaf, unison and mon.

All of it is installed from the standard repositories, except nginx and mon, which come from the EPEL repository (http://fedoraproject.org/wiki/EPEL).

Both servers should have a similar configuration.

Typical network wiring:
[diagram: OSS cluster]

We will use the following IP addresses.
1. Public network:
2.2.2.1 – gateway
2.2.2.2 – srv1
2.2.2.3 – srv2
2.2.2.4, 2.2.2.5 – used for the sites; these will be migrated from one server to the other automatically by heartbeat.

2. Private network:
10.0.0.2 – srv1
10.0.0.3 – srv2
10.0.0.4, 10.0.0.5 – used for the OOB VIPs.

First of all, we will add the EPEL repository:
rpm -Uvh http://download.fedora.redhat.com/pub/epel/5/i386/epel-release-5-3.noarch.rpm

Install all required software:
yum install nginx heartbeat mon httpd php php-mysql unison httpd-devel

mod_rpaf will be installed manually:
wget http://stderr.net/apache/rpaf/download/mod_rpaf-0.6.tar.gz
tar zxf mod_rpaf-0.6.tar.gz
cd mod_rpaf-0.6
apxs -cia mod_rpaf-2.0.c

Adjust the Apache config by adding:
RPAFenable On
RPAFsethostname On
RPAFproxy_ips 127.0.0.1 10.0.0.2 10.0.0.3 10.0.0.4 10.0.0.5
RPAFheader X-Forwarded-For

right after the
LoadModule rpaf_module directive that apxs added to httpd.conf.

Also make sure to install any other php modules needed.

Heartbeat configuration (config files are similar on both servers):
Please refer to the documentation in case some parameter is not completely clear.
File /etc/ha.d/ha.cf:
—— cut ——-
node srv1.example.com
node srv2.example.com
bcast eth0
bcast eth1
udpport 694
ucast eth0 2.2.2.2
ucast eth0 2.2.2.3
ucast eth1 10.0.0.2
ucast eth1 10.0.0.3
ping 2.2.2.1
baud 19200
crm off
use_logd on
keepalive 1
deadtime 10
initdead 30

—— cut ——-

File /etc/ha.d/haresources:
—— cut ——-
srv1.example.com 2.2.2.4/24 10.0.0.4/24
srv2.example.com 2.2.2.5/24 10.0.0.5/24

—— cut ——-

In case you are using some iptables-based firewall, make sure to allow all traffic between the two servers, on both public and private networks.

Start the heartbeat service on both servers:
service heartbeat start

You should notice that 2.2.2.4, 10.0.0.4 IPs will be brought up automatically on srv1, and 2.2.2.5, 10.0.0.5 on srv2.
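You can verify this on each node with iproute2 (a quick check on srv1; output will vary with your interface names):

```shell
# The public and OOB VIPs should appear as secondary addresses:
ip addr show eth0 | grep 2.2.2.4
ip addr show eth1 | grep 10.0.0.4
```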

Next, the file synchronization setup.
Assuming all virtual hosts are stored in /home/websites, we will set up the following script on srv1 and add it to the root user's crontab, to run once per minute (or less often, if the content does not change frequently):
/root/bin/usync.sh:

#!/bin/bash
# Note: the lock file must not be /root/.unison - that is the unison
# configuration directory (see default.prf below) - so use a separate file.
LOCK="/root/.unison.lock"
if [ -f "$LOCK" ]; then
    echo "unison already running"
    exit 1
fi
touch "$LOCK"
/usr/bin/unison -batch
rm -f "$LOCK"
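The corresponding crontab entry (added via crontab -e as root on srv1) could look like this, discarding output so cron does not mail it every minute:

```
* * * * * /root/bin/usync.sh >/dev/null 2>&1
```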

Also, we must create the unison config file /root/.unison/default.prf:

# Unison preferences file
root = /home/websites/
root = ssh://10.0.0.3//home/websites/
ignore = Path {logs/*}
log = true

Please note above that you can exclude specific paths that do not need to be synchronized, such as logs or temporary files, via the ignore directive.
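Since unison connects to srv2 over SSH, the cron job can only run unattended if root on srv1 has passwordless, key-based SSH access to srv2. A typical setup (command sketch, run once on srv1):

```shell
ssh-keygen -t rsa          # accept the defaults, empty passphrase
ssh-copy-id root@10.0.0.3  # install the public key on srv2
unison -batch              # first run: must complete with no password prompt
```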

Configure a sample vhost in Apache (we will set up Apache to listen only on port 8000; adjust the Listen directive accordingly):

NameVirtualHost *:8000
<VirtualHost *:8000>
DocumentRoot "/home/websites/example.com"
ServerName example.com
ServerAlias www.example.com
<Directory "/home/websites/example.com">
Allow from all
Options -Indexes +FollowSymLinks
</Directory>
ErrorLog /home/websites/logs/example.com-error.log
CustomLog /home/websites/logs/example.com-access.log combined
ErrorDocument 404 /error/index.php
</VirtualHost>

Configure the vhost in nginx:
upstream backends {
# add the internal VIPs here, the requests will be forwarded to all ip:port combinations listed here.
# in case hardware configuration is different on these servers - you can adjust the 'weight' parameter
server 10.0.0.4:8000 weight=50;
server 10.0.0.5:8000 weight=50;
}
server
{
listen  80;
server_name example.com www.example.com;
location /
{
proxy_pass     http://backends;
proxy_redirect  off;
log_not_found   off;

proxy_set_header   Host             $host;
proxy_set_header   X-Real-IP        $remote_addr;
proxy_set_header   X-Forwarded-For  $proxy_add_x_forwarded_for;

client_max_body_size       30m;
client_body_buffer_size    128k;

proxy_connect_timeout      30;
proxy_send_timeout         30;
proxy_read_timeout         60;

proxy_buffer_size          4k;
proxy_buffers              4 32k;
proxy_busy_buffers_size    64k;
proxy_temp_file_write_size 64k;
}
location ~* ^.+\.(jpg|jpeg|gif|png|css|js|ico|zip|rar|swf)$ {
root  /home/websites/example.com;
access_log /home/websites/logs/example.com-access.log main;
error_page 404 = @fallback;
}
location @fallback {
proxy_pass http://backends;
proxy_set_header Host $host;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Real-IP $remote_addr;
}
}

Now start apache and nginx on both servers.
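On CentOS 5 that means (on both servers), also enabling the services at boot:

```shell
service httpd start
service nginx start
chkconfig httpd on
chkconfig nginx on
```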

You should use the public VIPs (2.2.2.4, 2.2.2.5) to access the web sites.
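To actually spread traffic across both servers, you can combine this with the DNS round-robin technique from method 1. A sketch of the relevant zone records for example.com (a low TTL such as 300 is assumed):

```
example.com.      300  IN  A  2.2.2.4
example.com.      300  IN  A  2.2.2.5
www.example.com.  300  IN  A  2.2.2.4
www.example.com.  300  IN  A  2.2.2.5
```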
If one of the servers fails, heartbeat will ensure its IPs are brought up automatically on the second server.
Now we have to configure automatic failover for the case where a server itself does not fail, but the web service becomes inaccessible on it. We are going to use ‘mon’ for this.

File: /etc/mon/mon.cf
### global options
cfbasedir   = /etc/mon
pidfile     = /var/run/mon.pid
statedir    = /var/lib/mon/state.d
logdir      = /var/lib/mon/log.d
dtlogfile   = /var/lib/mon/log.d/downtime.log
alertdir    = /usr/lib64/mon/alert.d
mondir      = /usr/lib64/mon/mon.d
maxprocs    = 20
histlength  = 100
randstart   = 60s
authtype    = pam
userfile    = /etc/mon/userfile

### group definitions (hostnames or IP addresses)
hostgroup www 2.2.2.4
watch www
service httpd
interval 30s
monitor http.monitor -p 8000 -u /index.php 10.0.0.4
period wd {Mon-Sun}
alert stop-heartbeat.alert
alert mail.alert -S "Apache on Node 1 Down" monitoring@example2.com
upalert start-heartbeat.alert
alertevery 1h

service nginx
interval 30s
monitor http.monitor -p 80 -u /nginx-status 2.2.2.4
period wd {Mon-Sun}
alert stop-heartbeat.alert
alert mail.alert -S "Nginx on Node 1 Down" monitoring@example2.com
upalert start-heartbeat.alert
alertevery 1h
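The nginx check above requests /nginx-status, which nginx does not serve by default. It assumes the stub_status module (included in the EPEL nginx package) with a location such as this added to the nginx server block, restricted to our own networks:

```
location /nginx-status {
    stub_status on;
    access_log  off;
    allow 10.0.0.0/24;
    allow 2.2.2.0/24;
    allow 127.0.0.1;
    deny  all;
}
```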

We will need to create two simple scripts that stop/start heartbeat on demand (and make sure they are executable):
/usr/lib64/mon/alert.d/stop-heartbeat.alert:
#!/bin/bash
/etc/init.d/heartbeat stop

and /usr/lib64/mon/alert.d/start-heartbeat.alert:
#!/bin/bash
/etc/init.d/heartbeat start
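Finally, make both alert scripts executable and start mon, enabling it at boot as well:

```shell
chmod 755 /usr/lib64/mon/alert.d/stop-heartbeat.alert
chmod 755 /usr/lib64/mon/alert.d/start-heartbeat.alert
service mon start
chkconfig mon on
```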

As you may have noticed, we did not mention MySQL in our setup.
This is left as an exercise for the reader, as are the initial server and network setup and any FTP/SFTP, DNS or firewall configuration.
A HA MySQL cluster will be discussed in a later article.

References:

http://en.wikipedia.org/wiki/Downtime
http://en.wikipedia.org/wiki/High_availability
http://linux-ha.org/
http://www.cis.upenn.edu/~bcpierce/unison/
http://sysoev.ru/nginx/
http://httpd.apache.org/
http://mon.wiki.kernel.org/index.php/Main_Page
http://stderr.net/apache/rpaf/
