Problem:
Apache Web Server configured to proxy requests to web application running on Tomcat (7.0.39) over AJP. The applications were installed on virtual machines in a cloud environment. On the completion of load tests (more than 24 hours), the system became unresponsive. Whenever a request was made to Apache an HTTP 503 status code was returned. Looking at the Tomcat logs showed no errors and requests could still be sent over the HTTP channel directly to Tomcat. CPU and memory consumption was also very low. Looking at the Apache log files showed errors of the following nature:[error] ajp_read_header: ajp_ilink_receive failed
[error] (70007)The timeout specified has expired: proxy: read response failed from 10.1.3.3:8009 (10.1.3.3)
Tomcat could no longer handle any more requests from Apache over AJP and required a restart.
Analysis:
Doing a 'netstat' for port 8009 on the app server VM showed that there were 200 connections still in an ESTABLISHED state, with 100 connections in a CLOSE_WAIT state.From this initial analysis, a number of questions arose:
- How did the number of AJP connections grow so large?
- Why did the number of connections not close after a period of inactivity?
- What was causing Tomcat from accepting any more requests?
Reading the AJP documentation confirmed that by default the AJP connection pool is configured with a size of 200 and an accept count (request queue when all connections are busy) of 100. To confirm the findings, Tomcat was configured with a smaller AJP connection pool (20) and as expected the errors in Apache occurred sooner and more frequently.
To address the issue, Apache (MaxClients) and Tomcat (maxConnections) were both configured to support 25 concurrent requests. This worked perfectly (Apache no longer returned 503 responses and the log files no longer showed the ajp_link errors). The test was then repeated after increasing the connection pool to 50. Running a load test for an hour showed the servers working well, response times improved and no 503 responses. However, after the test completed, the Tomcat VM still showed 50 connections in an ESTABLISHED state. A further read of the Tomcat AJP documentation revealed that the connections remain open indefinitely until the client closes them. The next thing to try was the 'keepAliveTimeout' on the AJP connection pool. This had the effect of closing the connections after a period of inactivity and therefore seems to have resolved the issue. Ideally, the AJP connections should grow as load increases and then reduce back to an optimal number when load decreases. The 'keepAliveTimeout' has the effect of closing all connections that are inactive.
Solution:
Configure Apache 'MaxClients' to be equal to the Tomcat AJP 'maxConnections' configuration.
Configure Tomcat AJP 'keepAliveTimeout' to close connections after a period of inactivity.
References:
Tomcat AJP: http://tomcat.apache.org/tomcat-7.0-doc/config/ajp.html
Apache MPM Worker: http://httpd.apache.org/docs/2.2/mod/worker.html
References:
Tomcat AJP: http://tomcat.apache.org/tomcat-7.0-doc/config/ajp.html
Apache MPM Worker: http://httpd.apache.org/docs/2.2/mod/worker.html