I recently ran into this issue. One of our applications connects to an Apache Cassandra NoSQL database using the DataStax Java driver, which in turn depends on the netty library. Specifically, the application uses the following JARs:
- cassandra-driver-core-2.0.1.jar
- netty-3.9.0.Final.jar
This application suddenly ran into ‘java.lang.OutOfMemoryError: unable to create new native thread’. A thread dump taken from the application showed around 1650 threads in the ‘runnable’ state with the following stack trace:
"New I/O worker #211" prio=10 tid=0x00007fa06424d000 nid=0x1a58 runnable [0x00007f9f832f6000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
    - locked (a sun.nio.ch.Util$2)
    - locked (a java.util.Collections$UnmodifiableSet)
    - locked (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
    at org.jboss.netty.channel.socket.nio.SelectorUtil.select(SelectorUtil.java:68)
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.select(AbstractNioSelector.java:415)
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:212)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
    at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
    at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
    at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)
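For anyone who wants to check a count like this from inside the JVM instead of reading a full thread dump, here is a minimal sketch. It relies only on the standard Thread.getAllStackTraces() call. The class name ThreadCensus and the idea of grouping threads by name with the trailing counter (such as " #211") stripped are my own assumptions for illustration; they are not part of the DataStax driver or netty.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper (not part of any library): counts live threads per name group.
final class ThreadCensus {

    // Strip a trailing counter so "New I/O worker #211" is grouped as "New I/O worker".
    static String groupName(String threadName) {
        return threadName.replaceAll("\\s*#?\\d+$", "");
    }

    // Count how many of the given thread names fall into each group.
    static Map<String, Integer> countByGroup(Iterable<String> threadNames) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (String name : threadNames) {
            String group = groupName(name);
            Integer old = counts.get(group);
            counts.put(group, old == null ? 1 : old + 1);
        }
        return counts;
    }

    // Snapshot the threads currently alive in this JVM and count them per group.
    static Map<String, Integer> liveThreadCounts() {
        List<String> names = new ArrayList<String>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName());
        }
        return countByGroup(names);
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Integer> e : liveThreadCounts().entrySet()) {
            System.out.println(e.getValue() + "\t" + e.getKey());
        }
    }
}
```

Logging this output periodically would show a group such as "New I/O worker" growing long before the JVM hits the operating system's thread limit.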
Oh boy! That is far too many threads, and all of them are netty threads. The underlying problem turned out to be that the Cassandra database had run out of space, which cascaded into the application as an OutOfMemoryError. Once more space was allocated to Cassandra, the problem went away.
However, the point is this: even if Cassandra runs out of space, client applications should be resilient to it. It should not result in an OutOfMemoryError; that is unacceptable behavior.
Can you explain the reason for this?