At work we have a lot of issues with monitoring and logging. We run many different services as EJBs in WebLogic clusters. They are used so heavily that logging itself can become a real performance problem. The main issues are:
- The logs are not aggregated in any one place.
- There are a lot of ad-hoc scripts to do remote greps via ssh.
- Any type of central aggregation will cause increased network load and increased CPU load on the servers.
I'd like something like this for Java applications: something that could listen for known protocols like RMI, SOAP-RMI, etc. It could do performance monitoring (avg., min, max, std. dev.) and also track error/exception rates. In my head this seems very possible, but maybe I'm missing something.
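To make the statistics part concrete, here is a minimal sketch of what I have in mind, assuming some listener (not shown) hands over one duration and a success/failure flag per observed call. The class name `CallStats` and its methods are hypothetical, made up for illustration and not taken from any existing product. It keeps running values using Welford's method, so memory use stays constant and nothing has to be shipped to a central server per call.

```java
/**
 * Hypothetical helper: running latency and error statistics for one
 * monitored operation. Uses Welford's online algorithm, so no individual
 * samples are stored.
 */
final class CallStats {
    private long count;    // total calls observed
    private long errors;   // calls that ended in an error/exception
    private double mean;   // running mean of the durations
    private double m2;     // running sum of squared deviations from the mean
    private double min = Double.POSITIVE_INFINITY;
    private double max = Double.NEGATIVE_INFINITY;

    /** Record one observed call: its duration in milliseconds and whether it failed. */
    synchronized void record(double millis, boolean failed) {
        count++;
        if (failed) {
            errors++;
        }
        double delta = millis - mean;
        mean += delta / count;
        m2 += delta * (millis - mean);
        min = Math.min(min, millis);
        max = Math.max(max, millis);
    }

    synchronized long count() { return count; }

    synchronized double mean() { return count == 0 ? 0.0 : mean; }

    synchronized double min() { return count == 0 ? 0.0 : min; }

    synchronized double max() { return count == 0 ? 0.0 : max; }

    /** Sample standard deviation; 0.0 until at least two calls are recorded. */
    synchronized double stdDev() {
        return count < 2 ? 0.0 : Math.sqrt(m2 / (count - 1));
    }

    /** Fraction of recorded calls that failed, between 0.0 and 1.0. */
    synchronized double errorRate() {
        return count == 0 ? 0.0 : (double) errors / count;
    }
}
```

One instance per service method would probably be enough, with a periodic job reading the summary values instead of every log line crossing the network.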
There are several products that use byte-code modification to instrument code, but this adds unnecessary overhead. I was quoted something like a 7% increase, and it's still not doing the other half I need: error analysis.