Tuesday, October 9, 2007

Broken pipes with Tomcat and DBCP connection pooling

Getting "broken pipe" errors showing up in your tomcat logs? Happening every 8 hours or so?I had this problem with MySQL 4.1.21 and I've managed to fix it.

(NB: if you are running MySQL 5.0.45 then you'll get a "com.mysql.jdbc.CommunicationsException: Communications link failure" error instead, which is a symptom of the same problem.)


A quick search of the web revealed that lots of people are getting these, and there are LOTS of suggested solutions. The most common explanation is that your MySQL connections are being terminated by MySQL before they have been released by the connection pool. The next time you try to get a connection from the pool, you get a dead connection, which gives the "broken pipe" error.

The first step is to try and reproduce it, without waiting around for 8 hours!

I checked the MySQL timeout setting via "mysql> select @@global.wait_timeout". Mine was set to 28,800 seconds which is, not surprisingly, 8 hours. I then reduced this to one minute by:
  • stopping MySQL ("/usr/local/bin/mysqladmin shutdown"),
  • modifying the /etc/my.cnf file, adding "wait_timeout=60" under the "[mysqld]" heading and saving,
  • restarting MySQL ("/usr/local/mysql/mysqld_safe")
Don't worry if your session timeout hasn't changed ("mysql> select @@session.wait_timeout") - the change doesn't affect your existing terminal session. The global setting will have changed, and new connections will pick it up.
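
To recap, the relevant my.cnf fragment and the checks look something like this (values as per my test setup above - yours will differ):

    [mysqld]
    wait_timeout=60

    mysql> select @@global.wait_timeout;
    mysql> select @@session.wait_timeout;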

Next I started my application and saw my connections appear (via "mysql> show processlist;") and then disappear at the 60 second mark. Trying to use the app again resulted in the broken pipe error. HOORAY! Retrying the app produced one broken pipe error per connection in the pool, after which it all worked OK again.

Now the fix.

Appending "?autoReconnect=true" to the JDBC connection string didn't work.
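
(By "connection string" I mean the JDBC URL on the data source. For illustration - host and database name here are just placeholders - the URL I was altering looked something like:

    jdbc:mysql://localhost:3306/mydb?autoReconnect=true

)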

Killing the idle sessions before MySQL did turned out to be the solution. With the MySQL timeout still set at 60 seconds, I decided to evict idle connections after 30 seconds by adding the following to the data source:

minEvictableIdleTimeMillis=30000 (evict after 30 seconds of inactivity)
timeBetweenEvictionRunsMillis=10000 (run the evictor every 10 seconds)
numTestsPerEvictionRun=-1 (check every idle connection)
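
If you're configuring the pool through Tomcat's standard DBCP Resource, these end up as attributes on the Resource element (typically in context.xml). Here's a minimal sketch - the JNDI name, URL, credentials and pool size are placeholders, not my real settings:

    <Resource name="jdbc/mydb"
              auth="Container"
              type="javax.sql.DataSource"
              driverClassName="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost:3306/mydb"
              username="dbuser"
              password="dbpass"
              maxActive="10"
              minEvictableIdleTimeMillis="30000"
              timeBetweenEvictionRunsMillis="10000"
              numTestsPerEvictionRun="-1"/>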

Now, running the same test as above, the connections disappear from the list after about 40 seconds, and retrying the app just creates a new connection. HOORAY! Fixed!! If you are not happy with your connection pool getting wiped out, you can set "minIdle=n" on the data source - when items are evicted from the pool, new ones are created to keep the pool at that minimum.

In the end I went for a 6 hour idle eviction, checking every 30 mins.
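
In context.xml terms that works out to (values in milliseconds):

    minEvictableIdleTimeMillis="21600000"    (6 hours)
    timeBetweenEvictionRunsMillis="1800000"  (30 minutes)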

9 comments:

Unknown said...

Thanks so much for your post. I had tried several things to solve this "Broken Pipes" problem. Your suggestion fixed the problem.

Unknown said...

Thanks for sharing that! Don't know yet if it actually helps in my case, but at least you provided some reasoning instead of just using trial-and-error.

Cheers,
Chris

Unknown said...

Great article!

I was struggling with this for some days, but now it is solved!

One question: so there is no need to use the validationQuery attribute?

Nima said...

Thanks a lot, buddy. Your post made my day. I was struggling with it for a couple of days! Seems to be fixed now!

Sully said...

Excellent, thanks a lot!

dubchristian said...

I am new to this...
Where do you change this configuration?
In context.xml?
Can you put up an example?

mubes said...

Hi there,

Great article. Unfortunately for me, I still have this problem.

I wondered if someone might be kind enough to publish a working and complete set of parameters for the Tomcat/MySQL connection, minus sensitive details of course.

I've been over so many forum articles now, trying this and trying that and have ended up with about 1000 parameters.

Many thanks.

dimm said...

Thanks a lot! It solved the problem.

Nitish Vashishtha said...

This configuration helped in resolving my database outage problem, wherein my Hive DB server was getting restarted and all my existing connections in the pool were getting a broken pipe on the Thrift API: TTransportException: java.net.SocketException: Broken pipe.


I did not wish to use a validation query since that is an unnecessary performance hit on the Hive server; moreover, unlike typical SQL, I couldn't come up with a dummy validation query like SELECT 1.

This solution helped me!
Thanks a lot!