
5 Minute Guide to Clustering - Java Web Apps in Tomcat

Published Sat, 21 Aug 2010 • 55 comments

I've been taking a break from posting for the last couple of weeks. I was starting to get a bit run down, and felt like burnout was about to set in. The kind of blog posts I do take quite a bit of time, both in terms of the technical background work and the time to write and proofread the posts. Balancing that with work, personal projects and family life, something had to take a break, and it's not going to be work or family life :)

Anyhow, I'm back with another 5 minute guide. This time, how to set up clustering with Apache Web Server and Apache Tomcat.

For the purposes of the rest of this article, when I say "Apache" I mean the web server, and when I say "Tomcat" I mean Tomcat.

There are pretty much two ways to set up basic clustering, which use two different Apache modules. The architecture for both is the same: Apache sits in front of the Tomcat nodes and acts as a load balancer.

Figure: Architecture of Apache and Tomcat cluster, protocols and connectivity

Traffic is passed between Apache and Tomcat(s) using the binary AJP 1.3 protocol. The two modules are mod_jk and mod_proxy.

mod_jk stands for "jakarta", the original project under which Tomcat was developed. It is the older way of setting this up, but still has some advantages.

mod_proxy is a newer and more generic way of setting this up. The rest of this guide will focus on mod_proxy, since it ships "out of the box" with newer versions of Apache.

You should be able to follow this guide by downloading Apache and Tomcat default distributions and following the steps. No funny business required.

Clustering Background

You can cluster at the request or session level. Request level means that each request may go to a different node. This is the ideal, since the traffic is balanced across all nodes and, if a node goes down, the user has no idea. Unfortunately this requires session replication between all nodes: not just of HttpSession, but of ANY session state. For the purposes of this article I'm going to describe session level clustering, since it is simpler to set up and works regardless of the dynamics of your application. After all, we only have 5 minutes! :)

Session level clustering means that all requests in a given session go to the same node. If your application requires a login or other forms of session state, and one or more of your Tomcat nodes goes down, the affected users will be asked to log in again on their next request, since they will hit a different node which does not have any stored session data for them.

This is still an improvement on a non-clustered environment where, if your node goes down, you have no application at all!

And we still get the benefits of load balancing across nodes, which allows us to scale our application out horizontally across many machines.

Anyhow without further ado, let's get into the how-to.

Setting Up The Nodes

In most situations you would deploy the nodes on physically separate machines, but in this example we will set them up on a single machine, using different ports. This allows us to easily test the configuration.

Nothing much changes for the physically separate set up - just the hostnames of the nodes, as you would expect.

Oh and I'm working on Windows - but aside from the installation of Apache and Tomcat nothing is different between platforms since the configuration files are standard on all platforms.

  1. Download the Tomcat .zip distribution.
  2. We'll use a folder to install all this stuff in. Let's say it's "C:\cluster" for the purposes of the article.
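  3. Unzip the Tomcat distro twice, into two folders - C:\cluster\tomcat-node-1 and C:\cluster\tomcat-node-2
  4. Start up each of the nodes in turn, using the bin/startup.bat (Windows) or bin/startup.sh (other platforms) script, and ensure each one starts. At this point both nodes still use the same default ports, so run only one at a time. If a node doesn't start you may need to point Tomcat to the JDK installation on your machine, e.g. via the JAVA_HOME environment variable.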
  5. Open up the server.xml configuration on c:\cluster\tomcat-node-1\conf\server.xml
  6. There are two places we need to (potentially) configure. The first is the connector for the AJP protocol (the Connector element with protocol="AJP/1.3", which listens on port 8009 by default). The "port" attribute is the important part here. We will leave this one as is, but for our second (or subsequent) Tomcat nodes, we will need to change it to a different value. The second is the "Engine" element. The "jvmRoute" attribute has to be added - this configures the name of this node in the cluster. The "jvmRoute" must be unique across all your nodes. For our purposes we will use "node1" and "node2" for our two node cluster.
  7. This step is optional, but for production configs, you may want to remove the HTTP connector for Tomcat - that's one less port to secure, and you don't need it for the cluster to operate. To do so, comment out the HTTP Connector element in server.xml (the one with protocol="HTTP/1.1", on port 8080 by default).
  8. Now repeat this for C:\cluster\tomcat-node-2\conf\server.xml. Change the jvmRoute to "node2" and the AJP connector port to "8019". Because both nodes share one machine in this example, node 2 also needs its own shutdown port (the "port" attribute on the top-level Server element, 8005 by default) and, if you kept the HTTP connector, its own HTTP port; otherwise the two nodes cannot run at the same time.

We're done with Tomcat. Start each node up, and ensure it still works.

Setting Up The Apache Cluster

Okay, this is the important part.

  1. Download and install Apache HTTP Server. Use the custom option to install it into C:\cluster\apache2.2
  2. Now open up c:\cluster\apache2.2\conf\httpd.conf in your favourite text editor.
  3. Firstly, we need to uncomment the LoadModule lines for mod_proxy, mod_proxy_ajp and mod_proxy_balancer (delete the leading '#'). These enable the necessary mod_proxy modules in Apache.
  4. Finally, go to the end of the file, and add the following:
    <Proxy balancer://testcluster stickysession=JSESSIONID>
    BalancerMember ajp://127.0.0.1:8009 min=10 max=100 route=node1 loadfactor=1
    BalancerMember ajp://127.0.0.1:8019 min=20 max=200 route=node2 loadfactor=1
    </Proxy>
    
    ProxyPass /examples balancer://testcluster/examples
    The above is the actual clustering configuration. The first section configures a load balancer across our two nodes. The loadfactor says how much load a member can handle compared to the others, so it can be modified to send more traffic to one node or the other. This allows you to balance effectively if you have multiple servers with different hardware profiles.

    Note also the "route" setting, which must match the "jvmRoute" name in the Tomcat server.xml for each node. This, in conjunction with the "stickysession" setting, is key for a Tomcat cluster, as it configures the session management. Tomcat appends its jvmRoute to the session ID, and mod_proxy looks for that route in the given session cookie to determine which node the session is using. This allows all requests from a given client to go to the node which is holding the session state for that client.

    The ProxyPass line maps the URL on Apache to the load balanced cluster. You may want this to be "/", e.g. "ProxyPass / balancer://testcluster/". In our case we're just configuring the Tomcat /examples application for our test.
  5. Save it, and restart your Apache server.
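
For reference, the module lines from step 3 look roughly like this once uncommented. This is a sketch that assumes a stock Apache 2.2 httpd.conf; the module paths may differ in your install, and later Apache versions may need further modules enabled for the balancer to work.

```apache
# Core proxy support, required by the two modules below
LoadModule proxy_module modules/mod_proxy.so
# AJP 1.3 protocol support, used by the ajp:// BalancerMember URLs
LoadModule proxy_ajp_module modules/mod_proxy_ajp.so
# Load balancing support, used by the balancer:// URL
LoadModule proxy_balancer_module modules/mod_proxy_balancer.so
```

If Apache refuses to start after the change, its error log should name the directive it does not recognise, which usually points to a module that is still commented out.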

Test It Out

Shut down both Tomcat nodes, then, with your Apache server running, go to http://localhost/examples

You should get a 503 (Service Temporarily Unavailable) error page.

This is because both Tomcat nodes are down.

Start up node1 (c:\cluster\tomcat-node-1\bin\startup) and reload http://localhost/examples

You should see the examples application from the default Tomcat installation.

Shut down node1, and then start up node2. Repeat the test. You should see the same page as above. We have transparently moved from node1 to node2 since node1 went down.
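
If the 503 page lingers for a while after you start a node, that is probably mod_proxy holding a failed member in its error state: it will not send requests to that member again until the member's "retry" timeout expires (60 seconds by default). As a sketch for local testing only, you could shorten it on each member. The value of 10 seconds here is my own assumption, not a recommendation.

```apache
<Proxy balancer://testcluster stickysession=JSESSIONID>
# retry=10: try a failed member again after 10 seconds instead of the default 60
BalancerMember ajp://127.0.0.1:8009 min=10 max=100 route=node1 loadfactor=1 retry=10
BalancerMember ajp://127.0.0.1:8019 min=20 max=200 route=node2 loadfactor=1 retry=10
</Proxy>
```

This would replace the Proxy block added earlier, and needs an Apache restart to take effect.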

Start both nodes up and your cluster is now working.

You're done!

Optional: Set Up Apache Balancer Manager

mod_proxy has an additional "balancer manager" component which provides a nice web interface to the load balanced cluster. It's worthwhile setting this up if you want to remotely administer / monitor the cluster.

To do so is easy -

  1. Add the following to the bottom of your C:\cluster\apache2.2\conf\httpd.conf
    <Location /balancer-manager>
    SetHandler balancer-manager
    AuthType Basic
    AuthName "Balancer Manager"
    AuthUserFile "C:/cluster/apache2.2/conf/.htpasswd"
    Require valid-user
    </Location>
    This configures the balancer manager at http://localhost/balancer-manager
  2. We need to create a password file to secure it. At the command prompt you can use -
    c:\cluster\apache2.2\bin\htpasswd -c c:\cluster\apache2.2\conf\.htpasswd admin
    Then set a password when prompted. You will use the "admin" username and this password to log in at the balancer-manager URL.

Restart your Apache web server, and go to http://localhost/balancer-manager

You should be prompted for the username/password you set before, and then see the balancer manager tool.
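
One thing to watch if you mapped the whole site to the cluster with "ProxyPass /" rather than just /examples: requests for /balancer-manager would then be forwarded to Tomcat as well. A minimal sketch of one way around this, assuming the root mapping, is to exclude that path before the catch-all rule:

```apache
# The exclusion must come before the catch-all rule, since rules are checked in order
ProxyPass /balancer-manager !
ProxyPass / balancer://testcluster/
```

With the /examples mapping used in this article, no exclusion is needed.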


About the Author

Richard Nichols is an Australian software engineer with a passion for making things.
