Tuesday 17 September 2013

Apache Web Server, Tomcat AJP: ajp_read_header: ajp_ilink_receive failed

Problem:

Apache Web Server was configured to proxy requests to a web application running on Tomcat (7.0.39) over AJP. The applications were installed on virtual machines in a cloud environment. On completion of a load test lasting more than 24 hours, the system became unresponsive: whenever a request was made to Apache, an HTTP 503 status code was returned. The Tomcat logs showed no errors, requests could still be sent directly to Tomcat over the HTTP channel, and CPU and memory consumption were very low. The Apache log files, however, showed errors of the following nature:


[error] ajp_read_header: ajp_ilink_receive failed
[error] (70007)The timeout specified has expired: proxy: read response failed from 10.1.3.3:8009 (10.1.3.3)



Tomcat could no longer handle any more requests from Apache over AJP and required a restart.

Analysis:

Running 'netstat' for port 8009 on the app server VM showed that there were 200 connections still in an ESTABLISHED state, with a further 100 connections in a CLOSE_WAIT state.
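One way to summarise the connection states (the exact flags vary by platform; this is the GNU/Linux form):

netstat -an | grep 8009 | awk '{print $6}' | sort | uniq -c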
From this initial analysis, a number of questions arose:

  • How did the number of AJP connections grow so large?
  • Why did the number of connections not close after a period of inactivity?
  • What was preventing Tomcat from accepting any more requests?
Reading the AJP documentation confirmed that, by default, the AJP connection pool is configured with a size of 200 and an accept count (the request queue used when all connections are busy) of 100. To confirm the findings, Tomcat was configured with a smaller AJP connection pool (20) and, as expected, the errors in Apache occurred sooner and more frequently.
To address the issue, Apache ('MaxClients') and Tomcat ('maxConnections') were both configured to support 25 concurrent requests. This worked perfectly: Apache no longer returned 503 responses and the log files no longer showed the ajp_ilink errors. The test was then repeated after increasing the connection pool to 50. Running a load test for an hour showed the servers working well, with improved response times and no 503 responses. However, after the test completed, the Tomcat VM still showed 50 connections in an ESTABLISHED state.

A further read of the Tomcat AJP documentation revealed that connections remain open indefinitely until the client closes them. The next thing to try was the 'keepAliveTimeout' attribute on the AJP connector. This had the effect of closing connections after a period of inactivity and therefore seems to have resolved the issue. Ideally, the AJP connections should grow as load increases and then reduce back to an optimal number when load decreases; 'keepAliveTimeout' achieves this by closing any connection that has been inactive for the configured period.

Solution:

Configure Apache 'MaxClients' to be equal to the Tomcat AJP 'maxConnections' configuration.
Configure Tomcat AJP 'keepAliveTimeout' to close connections after a period of inactivity.
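As an illustration, the matching configuration could look like the following. The value 50 mirrors the final test above, while the keepAliveTimeout of 60000 ms (one minute) is an assumed value that should be tuned per environment.

In Apache's httpd.conf (2.2, worker MPM):

<IfModule mpm_worker_module>
    # Cap concurrent workers at the Tomcat AJP connection limit
    MaxClients 50
</IfModule>

In Tomcat's server.xml:

<!-- keepAliveTimeout is in milliseconds; 60000 is an illustrative value -->
<Connector port="8009" protocol="AJP/1.3" redirectPort="8443"
           maxConnections="50"
           keepAliveTimeout="60000" />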

References:
Tomcat AJP: http://tomcat.apache.org/tomcat-7.0-doc/config/ajp.html
Apache MPM Worker: http://httpd.apache.org/docs/2.2/mod/worker.html

Tuesday 3 September 2013

Runtime Configuration of log4j properties

log4j is widely used to provide logging in applications. There are numerous articles about how to configure log4j.properties or log4j.xml to enable or disable logging at various levels and for various categories. Often it is necessary to enable the debug log level for a short duration to identify the cause of a production issue. Without runtime configuration, this would require stopping the application, changing the log configuration (after extracting the properties file from the war) and then redeploying.

This post describes how log4j can be configured outside of the application and also changed at runtime.

File Watchdog

Log4j provides runtime configuration through DOMConfigurator.configureAndWatch for XML files, or PropertyConfigurator.configureAndWatch for properties files. Both methods take the absolute path of the configuration file and a refresh interval. This allows the configuration file to be located outside of the web application's war file and allows administrators to change the logging levels at runtime.

The configureAndWatch API can be invoked from a custom servlet listener, as sketched below. For users of the Spring framework, there is already a Log4jConfigListener that wraps the Log4j configureAndWatch API. This listener is configured in the web application's web.xml.
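A minimal sketch of such a listener, assuming log4j 1.x and a hypothetical context-param named 'log4jExternalConfig' holding the absolute file path:

import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;

import org.apache.log4j.PropertyConfigurator;

// Hypothetical listener; registered in web.xml alongside the application.
public class Log4jWatchdogListener implements ServletContextListener {

    public void contextInitialized(ServletContextEvent event) {
        String configPath = event.getServletContext().getInitParameter("log4jExternalConfig");
        if (configPath != null) {
            // Starts a watchdog thread that re-reads the file every 30 seconds
            PropertyConfigurator.configureAndWatch(configPath, 30000L);
        }
    }

    public void contextDestroyed(ServletContextEvent event) {
        // log4j 1.x provides no clean way to stop the watchdog thread
    }
}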

Spring Log4jConfigListener
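The relevant web.xml entries would look something like this (the file path and refresh interval are illustrative):

<context-param>
    <param-name>log4jConfigLocation</param-name>
    <param-value>file:/opt/app/config/log4j.properties</param-value>
</context-param>
<context-param>
    <param-name>log4jRefreshInterval</param-name>
    <param-value>30000</param-value>
</context-param>
<listener>
    <listener-class>org.springframework.web.util.Log4jConfigListener</listener-class>
</listener>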

By pointing log4jConfigLocation at a file outside the web application, different environments (dev, test, prod) can have different levels of logging enabled. The log4jRefreshInterval specifies how often, in milliseconds, the log4j.properties file should be checked for changes and reloaded, which allows runtime changes to the log4j configuration.
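For example, an administrator could edit a fragment such as the following in the external file (the 'com.example.orders' category is hypothetical), and changing its level from INFO to DEBUG would take effect within the refresh interval:

log4j.rootLogger=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d %-5p [%c] %m%n
# Raise this to DEBUG at runtime to capture more detail
log4j.logger.com.example.orders=INFO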

JMX

Log4j also has built-in support for JMX: the classes org.apache.log4j.jmx.LoggerDynamicMBean and org.apache.log4j.jmx.HierarchyDynamicMBean expose and register loggers as MBeans. However, wrapper code is still required, for example to add loggers that are not defined at start-up.
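A rough sketch of such wrapper code, registering the log4j hierarchy with the platform MBean server; the ObjectName 'log4j:hierarchy=default' is a conventional choice rather than anything mandated by log4j:

import java.lang.management.ManagementFactory;

import javax.management.MBeanServer;
import javax.management.ObjectName;

import org.apache.log4j.Logger;
import org.apache.log4j.jmx.HierarchyDynamicMBean;

public class Log4jJmxRegistrar {

    public static void register() throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        HierarchyDynamicMBean hierarchy = new HierarchyDynamicMBean();
        server.registerMBean(hierarchy, new ObjectName("log4j:hierarchy=default"));
        // Expose the root logger; loggers created after start-up must be
        // added explicitly, which is why wrapper code is required
        hierarchy.addLoggerMBean(Logger.getRootLogger().getName());
    }
}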


Tomcat JMX Connectivity through a firewall

JMX is great for monitoring the JVM to identify potential problems with memory and concurrency. Usually this is quite simple, as a locally running JVM can be connected to using tools such as JConsole. When connecting remotely, however, some additional configuration is required.

During a recent project, the team faced this with a web application deployed on Tomcat 7 in a cloud environment, accessible only over a VPN.

To enable JMX monitoring for Tomcat 7, the following system properties were configured via setenv.sh:
Tomcat JMX configuration
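A sketch of the setenv.sh entries described below; disabling SSL and the conf/ file locations are assumptions for illustration:

CATALINA_OPTS="$CATALINA_OPTS \
  -Dcom.sun.management.jmxremote \
  -Dcom.sun.management.jmxremote.port=7099 \
  -Dcom.sun.management.jmxremote.ssl=false \
  -Dcom.sun.management.jmxremote.authenticate=true \
  -Dcom.sun.management.jmxremote.password.file=$CATALINA_BASE/conf/jmxremote.password \
  -Dcom.sun.management.jmxremote.access.file=$CATALINA_BASE/conf/jmxremote.access"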
The above configuration properties specify that JMX remote access should be enabled on port 7099 and authenticated using the credentials found in jmxremote.access and jmxremote.password.

For more information about configuring JMX see the Oracle documentation in the references section.

The firewall on the VPN between the cloud environment and the local network was configured to allow connections on port 7099. However, this still didn't prove successful.

After reading numerous online resources, the problem was identified: when the JMX server starts up, it opens two ports, one for the JMX registry and another, dynamically allocated, for the RMI server. The configuration above only specifies the JMX registry port, and there isn't a system property to configure the RMI server port. Both ports need to be opened in the firewall, but as the RMI server port is chosen dynamically, how can the firewall be configured?

One solution is to develop a custom JMX agent and configure the Java runtime to use it; this approach is outlined in the references below.

For Tomcat there is a more elegant solution for out-of-the-box JMX monitoring: Tomcat provides a JMXRemoteLifecycleListener that allows both the JMX registry port and the RMI server port to be specified.

Tomcat JMXRemoteLifecycleListener
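Declared in server.xml, it might look like this; 7100 for the RMI server port is an illustrative choice:

<!-- Both ports are now fixed, so both can be opened in the firewall -->
<Listener className="org.apache.catalina.mbeans.JmxRemoteLifecycleListener"
          rmiRegistryPortPlatform="7099"
          rmiServerPortPlatform="7100" />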

The rmiRegistryPortPlatform replaces the use of the com.sun.management.jmxremote.port system property.

The JMXRemoteLifecycleListener requires the deployment of catalina-jmx-remote.jar (available from the 'extras' area of the Tomcat download site) in the ${CATALINA_HOME}/lib directory.

Tomcat can now be remotely monitored and managed using the following JMX service URL:
service:jmx:rmi://<host>:<rmiServerPort>/jndi/rmi://<host>:<rmiRegistryPort>/jmxrmi
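With the illustrative ports above and a hypothetical host name, JConsole would be pointed at:

service:jmx:rmi://app.example.com:7100/jndi/rmi://app.example.com:7099/jmxrmi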

References: