
Using VisualVM to fix live Tomcat and JVM problems

You have finished your Java implementation, unit testing and perhaps integration testing. You met all the specs and passed the acceptance phase, so you deploy your .war file to the live environment and install it on Tomcat. All goes well and you move on to other work… A few hours later the system administrator calls and asks why the quad-core processor has reached 400% CPU (where normally it hovers around 100% spread over four cores). You tested as well as you could, but a problem like this can still slip through. Now the challenge starts: how to find what causes this problem!

CPU load is far higher than normal. Note the plateaus in the graph!


You could think back on what changes you made since the previous release and pinpoint possible problem areas, then revert that part of the code, redeploy and see what happens. This approach can be tedious, and you can end up quite frustrated when, after several tries, it still hasn't fixed the problem. It would be nice if you could pinpoint exactly where things go wrong, wouldn't it?

VisualVM to the rescue

Since version 5.0, Java has a great feature called JMX (Java Management Extensions). This technology lets you get all sorts of information from running JVMs. VisualVM is a tool that uses JMX and gives you detailed information about JVM memory, CPU usage and garbage collection, and it can also profile your objects for CPU and memory usage (local JVMs only; otherwise all your live classes would be instrumented, which would hurt performance even more). The nice thing is that you can monitor your remote running JVMs too! In our case this is exactly what we need, since we can't reproduce the problem on our development or acceptance machines. Users could have hit some location in your web application in just the kind of way you never tested and therefore can't reproduce.

Enable JMX via Tomcat

In order to use JMX you should add some variables to the JAVA_OPTS in your Tomcat startup script:

-Dcom.sun.management.jmxremote=true
-Dcom.sun.management.jmxremote.port=9090
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false

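Where exactly these flags go depends on your setup. One common place (an assumption on my part, not something from this article) is a bin/setenv.sh next to catalina.sh, which Tomcat sources automatically on startup:

```shell
# bin/setenv.sh -- sourced by catalina.sh on startup (the path and the
# port number are assumptions; adjust them to your own Tomcat installation)
CATALINA_OPTS="$CATALINA_OPTS -Dcom.sun.management.jmxremote=true"
CATALINA_OPTS="$CATALINA_OPTS -Dcom.sun.management.jmxremote.port=9090"
CATALINA_OPTS="$CATALINA_OPTS -Dcom.sun.management.jmxremote.ssl=false"
CATALINA_OPTS="$CATALINA_OPTS -Dcom.sun.management.jmxremote.authenticate=false"
export CATALINA_OPTS
```

Using CATALINA_OPTS instead of JAVA_OPTS has the advantage that catalina.sh only passes it to the main server process, so the short-lived JVM that handles the stop command won't try to bind the same fixed JMX port.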
You can choose your own JMX remote port, and you can choose to use SSL and authentication (you should if you need to open up the JMX port to the entire world through your firewall). When we tried it, we first opened only port 9090 on our firewall. This didn't work: the JMX connector is built on RMI, which by default also uses a second, dynamically chosen port for the actual connection (disclaimer: we have not tested this any further and are not sure exactly which other ports are needed). We fixed the problem by opening the firewall entirely for just the IP address of the machine running VisualVM.

Connecting VisualVM to the JMX enabled remote JVM

Now we are ready to connect to the server:

Connecting your JMX enabled remote JVM to your local VisualVM

Start VisualVM on your machine and create a remote connection by right-clicking 'Remote' and clicking 'Add remote host…'. Then right-click the remote host you just created and click 'Add JMX connection…'. Type your remote server's domain name or IP address, followed by the port number you configured remotely. Now open this connection. You will see a lot of information about your live JVM environment, spread over several tabs. In this article we will need the 'Threads' tab!

Pinpointing the problem

When you open the Threads tab you should see something like the following example:

Normal expected view of Tomcat HTTP connection threads

As you can see, at any given time some threads are running, handling HTTP requests. Handling a request should not take too much time (otherwise the HTTP responses take too long and users experience your web app as slow), and it should certainly not take forever! Guess what… when we clicked our Threads tab we saw something like the example I created below:

Problematic view; some threads are running continuously

Spot the differences. As you can see, two threads are running forever: http-8005-191 and http-8005-188. If you scroll up and down you may well spot some others. Just to be sure that this is your CPU problem, check at what time these threads entered the running state and compare that with your CPU graph. We saw that the CPU plateaus started at exactly the same moment a thread became permanently running.
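The thread data VisualVM displays comes from the JVM's platform ThreadMXBean, so you can pull the same information yourself. A minimal sketch, run inside the JVM you want to inspect (the class name is made up):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class RunnableThreadLister {
    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Snapshot every live thread with its full stack trace -- the same
        // data VisualVM's Threads tab and 'Thread Dump' button show.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info.getThreadState() == Thread.State.RUNNABLE) {
                System.out.println("\"" + info.getThreadName()
                        + "\" (id " + info.getThreadId() + ") is RUNNABLE");
                for (StackTraceElement frame : info.getStackTrace()) {
                    System.out.println("    at " + frame);
                }
            }
        }
    }
}
```

The same MBean is exposed over your JMX connection, which is exactly how VisualVM reads it remotely.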

Fixing the problem

So what causes these threads to run forever? The answer lies under the 'Thread Dump' button at the upper right of the Threads tab. Click it to let VisualVM create a dump of all thread stacks. Now remember the IDs of the problematic threads and search for them in the thread dump output. The stack trace you find there should lead you directly to the problem you have to fix! It will probably be some never-ending while loop; never-ending recursive calls would likely have alarmed you sooner with a StackOverflowError. In our case it was code like the snippet below that caused our pain:

// Simplified the code a lot ;) -- somewhere a node was
// accidentally registered as its own parent...
Node node = new Node();
node.addParent(node);

// ... a lot of other code ...

// ... so this walk up the parent chain never terminates:
while (node.hasParent()) {
    node = node.getParent();
}
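Our actual fix is not shown in this article, but one defensive way to turn such a hang into an immediate, debuggable failure is to track the nodes already visited while walking up the chain. A sketch, where Node is a minimal stand-in for the real class:

```java
import java.util.HashSet;
import java.util.Set;

public class CycleSafeWalk {

    // Minimal stand-in for the real Node class from the web application.
    static class Node {
        private Node parent;
        void addParent(Node p) { parent = p; }
        boolean hasParent() { return parent != null; }
        Node getParent() { return parent; }
    }

    /** Walks up the parent chain, failing fast if the links form a cycle. */
    static Node findRoot(Node node) {
        Set<Node> visited = new HashSet<Node>();
        while (node.hasParent()) {
            if (!visited.add(node)) {
                // Seen this node before: instead of spinning a CPU core
                // forever, surface the broken data structure right away.
                throw new IllegalStateException("cycle in parent chain");
            }
            node = node.getParent();
        }
        return node;
    }

    public static void main(String[] args) {
        Node node = new Node();
        node.addParent(node); // the self-reference that caused our 400% CPU
        try {
            findRoot(node);
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

The extra HashSet costs a little memory per walk, but it converts a silent infinite loop into a stack trace that points straight at the bad data.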

You can imagine we extended our test cases for this particular part of the web application. We could then reproduce the problem on our development machines and solve it.
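A simple way to make such a hang show up in a test run, sketched here without any test framework (a real suite would use its own timeout support instead), is to run the suspect code on a watchdog-timed thread:

```java
public class HangDetector {
    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(new Runnable() {
            public void run() {
                // Stand-in for the traversal under test; a genuine endless
                // loop here keeps the thread alive past the timeout below.
                while (!Thread.currentThread().isInterrupted()) {
                    // spin, like our runaway http threads did
                }
            }
        });
        worker.setDaemon(true);  // a hung worker must not keep the JVM alive
        worker.start();
        worker.join(1000);       // give the code one second to finish
        if (worker.isAlive()) {
            worker.interrupt();
            System.out.println("FAIL: still running after the timeout");
        } else {
            System.out.println("PASS: finished in time");
        }
    }
}
```

Marking the worker as a daemon and interrupting it on failure keeps a hanging test from blocking the rest of the build.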

Fixing your problem?

It could very well be that your problem has a different origin, but hopefully this article helps you solve it more easily in that case. If you have other cases which you solved (partly) with what you read here, please let us know in a comment, and tell us how you solved your problem. That way you can help out others with problems like these too! Now go enjoy your speedy, bug-free web app ;)
