
Monday, October 04, 2010

Top 10 Java Issues in Production

10 Heap/network issues

Tools: dtrace, hprof, introscope, jconsole, visualvm, yourkit, azul zvision
Invasive tools: bci, jvmti, jvmdi

OS tools: dtrace, osprofile, vtune

Network/disk tools: ganglia, iostat, lsof, nagios, netstat, snoop

Gotcha:
  • instrumentation is not cheap
  • avoid expensive heap walks
  • thread dumps
  • asynchronous logging
  • finish task and then increment performance counter
  • jconsole is cheap
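
As a minimal sketch of the "finish task and then increment performance counter" point, the hypothetical helper below (the class and method names are illustrative, not from the talk) runs the work first and touches the shared counters only once, after the task is done. It assumes Java 8 or later.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper: counts finished tasks and their total elapsed time.
final class TaskCounters {
    private final AtomicLong completed = new AtomicLong();
    private final AtomicLong totalNanos = new AtomicLong();

    // Run the task first; update the shared counters only after it has finished
    // (normally or by throwing), so the hot path itself stays uninstrumented.
    void timed(Runnable task) {
        long start = System.nanoTime();
        try {
            task.run();
        } finally {
            totalNanos.addAndGet(System.nanoTime() - start);
            completed.incrementAndGet();
        }
    }

    long completed() {
        return completed.get();
    }

    long totalNanos() {
        return totalNanos.get();
    }
}
```

Two atomic updates per task are usually far cheaper than logging or sampling inside the task, but under heavy contention even these can show up in a profile.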

9 Leaks

Symptoms:
  • app consumes all memory
  • heap usage trends upward in a ramping sawtooth
  • eventually fails with OutOfMemoryError
Tools: yourkit, hprof, eclipse mat, jconsole, jhat, jps, visualvm, azul zvision

Theory:
  • Allocated vs. live objects
  • Finalizers, Classloaders
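
One rough way to see the allocated-vs-live distinction is to sample used heap right after a collection: garbage disappears, retained objects do not. The sketch below uses the standard `java.lang.management` API; the class name `HeapFloor` and the 1 MB allocations are illustrative assumptions. `System.gc()` is only a request, so the numbers are estimates, and this belongs on a test bench rather than in a production hot path.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.util.ArrayList;
import java.util.List;

final class HeapFloor {
    // Approximate live data: heap in use right after asking for a collection.
    static long liveBytesEstimate() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        System.gc(); // a hint only; the JVM may ignore it
        return memory.getHeapMemoryUsage().getUsed();
    }

    public static void main(String[] args) {
        List<byte[]> retained = new ArrayList<>(); // stands in for a leak
        for (int i = 0; i < 5; i++) {
            retained.add(new byte[1 << 20]);    // stays reachable: live
            byte[] garbage = new byte[1 << 20]; // unreachable after this iteration
            System.out.println("after-GC heap used: " + liveBytesEstimate() + " bytes");
        }
    }
}
```

If the after-GC figure keeps climbing across samples, live data is growing, which is the ramping floor of the sawtooth described in the symptoms.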

8 I/O

I/O Serialization
Symptoms:
  • Multi-node scale-out does not scale linearly
  • Time is spent in both CPU and I/O
Tools:
  • cpu profiling
  • io profiling
Solution:
  • pick a high performance serialization library
  • e.g. avro, kryo

I/O Limits/Tuning

Symptoms:
  • Too many open file descriptors, cursors, etc.
  • inconsistent response times
Tools:
  • Nagios
  • pkg
  • rpm
  • info
  • ulimit
  • yum
Solution:
  • check OS patches
  • check user and process limits
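
Besides `ulimit`, the process limit can be checked from inside the JVM. This is a sketch that assumes a HotSpot/OpenJDK-style JVM on a Unix-like OS, where the platform bean implements `com.sun.management.UnixOperatingSystemMXBean`; the class name `FdLimits` is illustrative. On other JVMs or platforms the method returns null.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

final class FdLimits {
    // Returns {open, max} file descriptor counts, or null when this JVM
    // does not expose them (for example on non-Unix platforms).
    static long[] descriptorCounts() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] counts = descriptorCounts();
        if (counts == null) {
            System.out.println("descriptor counts not available on this JVM");
        } else {
            System.out.println("open=" + counts[0] + " max=" + counts[1]);
        }
    }
}
```

Logging these two numbers periodically makes it easier to tell a descriptor leak from a limit that was simply set too low.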

I/O Sockets, Files, DB

Symptoms:
  • socket open/close takes a long time
  • JRMP timeouts, long JDBC calls
  • running out of files, cursors
Tools:
  • DBMS tools, du, iostat, gmon, lsof, netstat
Workaround:
  • ping/telnet tests

7 Locks and Synchronization

Symptoms:
  • Adding users/threads/CPUs causes a slowdown
  • High lock acquire times and contention
  • race conditions, deadlocks
  • I/O under load
Tools:
  • dtrace, lockstat, azul zvision
  • thread dumps
  • IBM visual analyzer (j.u.c in eclipse)
Solution:
  • Use non-blocking collections
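
As one illustration, the sketch below has several producer threads share a `java.util.concurrent.ConcurrentLinkedQueue`, a lock-free queue from the standard library, instead of a synchronized list. The choice of this particular collection, the class name `LockFreeQueueDemo` and the thread counts are assumptions made for the example; it needs Java 8 or later.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

final class LockFreeQueueDemo {
    // Starts `threads` producers that each offer `perThread` items,
    // waits for all of them, and returns the final queue size.
    static int fill(Queue<Integer> queue, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    queue.offer(i); // no lock is taken here
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for producers", e);
            }
        }
        return queue.size();
    }

    public static void main(String[] args) {
        System.out.println(fill(new ConcurrentLinkedQueue<>(), 4, 10_000)); // prints 40000
    }
}
```

No offer is lost and no thread blocks another on a monitor, so adding producers does not add lock acquire time.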

6 Endless Compilation

Symptoms:
  • Time spent compiling
  • Time spent in the interpreter
Tools:
  • -XX:+PrintCompilation
  • CPU profiler
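
Alongside `-XX:+PrintCompilation`, cumulative JIT time can be read in-process through the standard `CompilationMXBean`. This is a small sketch; the class name `JitTime` is illustrative, and the value is only available on JVMs that support compilation time monitoring.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

final class JitTime {
    // Approximate cumulative JIT compilation time in milliseconds,
    // or -1 when the JVM has no compiler or does not track this.
    static long totalCompileMillis() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null || !jit.isCompilationTimeMonitoringSupported()) {
            return -1;
        }
        return jit.getTotalCompilationTime();
    }

    public static void main(String[] args) {
        System.out.println("JIT compile time so far: " + totalCompileMillis() + " ms");
    }
}
```

Sampling this at intervals shows whether compilation settles after warm-up or keeps growing, which is the symptom this item describes.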

5 Endless Exceptions

Symptoms:
  • Application spends its time filling in stack traces
Tools:
  • CPU profiler, zvision
  • thread dumps
  • Track caller/callee
  • repeated kill -3
Solution:
  • don't throw, return
  • JVMs don't optimize exception paths
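
A sketch of "don't throw, return" for a frequently hit failure case: malformed input is reported through the return value instead of a `NumberFormatException`, so no stack trace is filled in. The port-parsing task and the name `ParsePort` are made up for the example; `OptionalInt` needs Java 8 or later.

```java
import java.util.OptionalInt;

final class ParsePort {
    // Parses a TCP port (0..65535). Malformed or out-of-range input yields
    // an empty result rather than an exception.
    static OptionalInt parse(String s) {
        if (s == null || s.isEmpty() || s.length() > 5) {
            return OptionalInt.empty();
        }
        int value = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return OptionalInt.empty();
            }
            value = value * 10 + (c - '0'); // at most 5 digits, so no overflow
        }
        return value <= 65535 ? OptionalInt.of(value) : OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(parse("8080").isPresent()); // prints true
        System.out.println(parse("80a").isPresent());  // prints false
    }
}
```

Exceptions remain appropriate for truly exceptional conditions; the point is to keep them off paths that run thousands of times per second.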

4 Fragmentation

Symptoms:
  • performance degrades over time
  • "Full GC" makes problem go away
  • Lot of free memory, but in tiny fragments
Tools:
  • GC logging flags, for CMS -XX:PrintFLSStatistics=2 -XX:+PrintCMSInitializationStatistics
  • Fragger
Solution:
  • Upgrade to latest CMS
  • azul zing and its generational pauseless GC (GPGC)
  • pooling similar sized and similar aged objects together

3 GC Tuning

Symptoms:
  • Entropy(GC) = number of GC flags
  • Too many free parameters
  • 64-bit / large heap is no solution
  • Constant 40%-60% CPU utilization by GC
  • Scheduled reboot before full GC
  • Full-time engineer working on GC flags
Workarounds:
  • Ask JVM vendor for one flag solution
  • G1 GC, Zing GPGC
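
Before touching any flags, it helps to know how much time GC actually takes. The sketch below sums the collection time reported by the standard `GarbageCollectorMXBean`s and divides by JVM uptime. The class name `GcShare` is illustrative, and the figure is elapsed collection time rather than true CPU utilization, so treat it as a rough proxy for the 40%-60% symptom.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcShare {
    // Fraction of JVM uptime reported as collection time, summed over collectors.
    static double gcTimeFraction() {
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                gcMillis += t;
            }
        }
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis > 0 ? (double) gcMillis / uptimeMillis : 0.0;
    }

    public static void main(String[] args) {
        System.out.printf("GC share of uptime: %.1f%%%n", gcTimeFraction() * 100.0);
    }
}
```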

2 Spikes in Load

Symptoms:
  • Rush hour traffic, tax day, black friday
  • outage under spikes, power law of networks
Solution:
  • Measure
  • Test with realistic load and realistic multi-node setup
  • build redundancy

1 Versionitis

Symptoms:
  • different nodes have different configuration, stack components, versions
  • classpath dist/*, -verbose
  • hard to reproduce
Solutions:
  • Method
  • Version control
  • rigor
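
One cheap aid to that rigor is to have every node log the same fingerprint at startup so that differences can be diffed across the cluster. The sketch below uses only standard system properties; the class name `NodeFingerprint` and the choice of properties are assumptions for the example.

```java
final class NodeFingerprint {
    // Builds a single line describing the JVM and classpath this node runs with.
    static String describe() {
        return "java.version=" + System.getProperty("java.version")
                + " java.vm.name=" + System.getProperty("java.vm.name")
                + " java.class.path=" + System.getProperty("java.class.path");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```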

0 Collapsing under load

juc profiling

