Julien's tech blog

talking about tech stuff: stuff I'm interested in, intriguing stuff, random stuff.


Exception handling, Checked vs Unchecked exceptions, …

Some thoughts in random order about exceptions in Java, as they often get overlooked. The following points are often related. Mostly I’m talking to myself here, don’t take it too personally when I say “you should do this” :) just imagine I’m the guy from Memento and that I tattoo myself with my blog posts. Feel free to disagree/add your own views in the comments (though I may not get your opinion tattooed on my body). Here I’m assuming we are building a library that others will use.

1. Exceptions are part of the contract

When you declare a method, it defines a contract for the level of abstraction of the class. The exceptions are part of this contract; exceptions thrown by a method should be relevant to the level of abstraction. Just throwing the exceptions from the underlying layer is letting the implementation details slip through the interface. (See the point about chaining exceptions)

Example:
You write a login method implemented by querying an underlying DB. It should throw a LoginFailedException (identifier or password incorrect); it should not throw a SQLException. LoginFailedException extends a base class that you define (itself extending java.lang.Exception) and that is common to all your checked exceptions. (See the following point.)
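A minimal sketch of the login example above. UserStore, LoginService and LibraryException are hypothetical names chosen for illustration, and the plain-text password comparison is only there to keep the sketch short. The SQLException is translated and chained instead of leaking through the interface; using an unchecked exception for that case is my assumption here (see the checked vs unchecked point).

```java
import java.sql.SQLException;

// Hypothetical base class shared by all checked exceptions of the library.
class LibraryException extends Exception {
    LibraryException(String message) { super(message); }
}

// Part of the contract of login(): identifier or password incorrect.
class LoginFailedException extends LibraryException {
    LoginFailedException(String message) { super(message); }
}

// Hypothetical lower layer; its SQLException is an implementation detail.
interface UserStore {
    String passwordFor(String login) throws SQLException;
}

class LoginService {
    private final UserStore store;

    LoginService(UserStore store) { this.store = store; }

    void login(String login, String password) throws LoginFailedException {
        String expected;
        try {
            expected = store.passwordFor(login);
        } catch (SQLException e) {
            // Translate to this level of abstraction and keep the cause chained.
            throw new IllegalStateException("User store unavailable while logging in '" + login + "'", e);
        }
        // Plain-text comparison for brevity only.
        if (expected == null || !expected.equals(password)) {
            throw new LoginFailedException("Identifier or password incorrect for '" + login + "'");
        }
    }
}
```

The caller only ever sees LoginFailedException in the signature, so the storage layer could be swapped without changing the contract.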

2. Have base classes for the checked and unchecked exceptions

Sometimes it is useful to be able to catch the exceptions of one specific library at a higher level, for example all the runtime exceptions that have not been handled by the library. With a common base class they can be caught in a single catch block. In Java 7 you will be able to catch multiple exceptions in a single block. Inheritance for exceptions is just a category tree, even if that’s usually not how you want to think of inheritance.
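As a rough sketch of what such a hierarchy could look like (all names, including the "Acme" prefix, are made up for illustration):

```java
// Hypothetical base class for every checked exception of the library.
class AcmeException extends Exception {
    AcmeException(String message, Throwable cause) { super(message, cause); }
}

// Hypothetical base class for every unchecked exception of the library.
class AcmeRuntimeException extends RuntimeException {
    AcmeRuntimeException(String message, Throwable cause) { super(message, cause); }
}

class LoginFailedException extends AcmeException {
    LoginFailedException(String message, Throwable cause) { super(message, cause); }
}

class DatabaseUnavailableException extends AcmeRuntimeException {
    DatabaseUnavailableException(String message, Throwable cause) { super(message, cause); }
}

class Caller {
    // One catch block handles every unchecked failure coming from the library;
    // runtime exceptions from other sources still propagate.
    static String describe(Runnable libraryCall) {
        try {
            libraryCall.run();
            return "ok";
        } catch (AcmeRuntimeException e) {
            return "library failure: " + e.getMessage();
        }
    }
}
```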

3. Do not throw java.lang.Exception and make sure that declared exceptions are actually thrown

Throwing java.lang.Exception hides the different cases that you may have to handle depending on what type of exception it is. Checked exceptions are a handy way of making sure you don’t forget to handle error cases, as long as you declare them correctly. If you declare exceptions that are never thrown, people using your library will swear (or maybe it’s just me. Either way, if I’m going to use your code, please do it for my co-workers). In general, design your API in a way that does not force people to write catch blocks for exceptions that are never thrown. For example, new String(bytes, "UTF-8") throws UnsupportedEncodingException even though UTF-8 is always supported. In Java 6, new String(byte[], Charset) was added to avoid this.
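To make the difference concrete, here is a small sketch comparing the two String constructors. The class and method names are mine; only the two constructors and Charset.forName come from the JDK.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

class Utf8Decoding {
    static final Charset UTF_8 = Charset.forName("UTF-8");

    // Older overload: the caller is forced to handle a case that never happens.
    static String decodeWithCharsetName(byte[] bytes) {
        try {
            return new String(bytes, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 should always be supported", e);
        }
    }

    // Overload added in Java 6: no checked exception, no catch block.
    static String decode(byte[] bytes) {
        return new String(bytes, UTF_8);
    }
}
```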

4. Checked vs Unchecked exceptions

Checked exceptions are for cases where the caller should do something about the error. This is for exceptional cases that should be handled. For example, if the login failed, you should probably display an error and ask to retry. Users will sometimes mistype their password and you should always handle it.
Unchecked exceptions are for runtime errors caused by bugs or unexpected failures, when the caller could not possibly do something about it and the default expected behavior is just to fail. Most of the time you want to handle those centrally at the top of the stack to display an error message or send an alert to the monitoring system. The caller can still catch them if it wants to, but it does not have to.
For example, if the database refuses the connection you may throw a DatabaseUnavailableException that extends a base class (extending java.lang.RuntimeException) common to all your unchecked exceptions. You could have an intermediary layer that will retry the transaction, or a top-level apologetic error message asking to come back later. The main point is that the exception is not dealt with where it is thrown.
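The intermediary retry layer mentioned above could look roughly like this. It is only a sketch: the Retry helper is hypothetical, it retries immediately without any back-off, and it rethrows the last failure once the attempts are exhausted so that a top-level handler can deal with it.

```java
import java.util.function.Supplier;

// Unchecked: most callers cannot do anything about it.
class DatabaseUnavailableException extends RuntimeException {
    DatabaseUnavailableException(String message, Throwable cause) { super(message, cause); }
}

class Retry {
    // Runs the action up to maxAttempts times, retrying only on DatabaseUnavailableException.
    static <T> T withRetries(int maxAttempts, Supplier<T> action) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1, got " + maxAttempts);
        }
        DatabaseUnavailableException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (DatabaseUnavailableException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last; // still failing: let the top of the stack handle it
    }
}
```

The code between the retry layer and the place where the exception is thrown does not need a single catch block or throws clause.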

5. Chaining Exceptions

Since Java 1.4 all exceptions can be chained. When you catch an Exception and throw a new one related to your level of abstraction, you should chain the original one to make sure you have all available information to fix a problem. A good stack trace tells you exactly where the bug is (see the “fail early” point). It is not fun when the production issue you have to fix urgently reports itself without providing the root cause. You usually end up patching the exception chaining first, then reproducing the error, and only then finding out what happened.

Sometimes people ask how to display the lines hidden behind the “… 2 more” at the end of a chained stack trace. The display does not truncate any information: those lines are already there in the parent stack trace. Exceptions that have a cause share the end of their stack trace with that cause, and printStackTrace() simply does not print those duplicate lines again.
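A small sketch to try this out. The class and method names are invented; the only JDK features used are the cause-taking constructor, getCause() and printStackTrace(). Running main prints the wrapper's stack trace followed by a "Caused by:" section for the original IOException.

```java
import java.io.IOException;

class ChainDemo {
    // Lower layer: fails with a low-level exception.
    static void readFile() throws IOException {
        throw new IOException("disk unreadable");
    }

    // Upper layer: translates the exception and chains the original as the cause.
    static void loadSettings() {
        try {
            readFile();
        } catch (IOException e) {
            throw new IllegalStateException("Could not load settings", e);
        }
    }

    // Walks the chain down to the first exception that has no cause.
    static Throwable rootCause(Throwable t) {
        Throwable root = t;
        while (root.getCause() != null) {
            root = root.getCause();
        }
        return root;
    }

    public static void main(String[] args) {
        try {
            loadSettings();
        } catch (IllegalStateException e) {
            e.printStackTrace();
            System.err.println("Root cause: " + rootCause(e));
        }
    }
}
```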

6. Add more information

When you catch an Exception and throw a new one related to your level of abstraction, you should add information related to this upper level.
example:

void readConfiguration() throws InvalidConfigurationException {
  File confFile = new File("conf/conf.properties");
  try {
    ...
  } catch (IOException e) {
    throw new InvalidConfigurationException("Error while reading the configuration at " + confFile.getAbsolutePath(), e);
  }
}

In general put as much information as you can (id of the object for which it failed …), but keep it to one line.

7. Do not have empty catch blocks

If the exception cannot possibly be thrown (new InputStreamReader(stream, "UTF-8") declares UnsupportedEncodingException), just throw an exception saying so from the catch block. That way, if you’re wrong, you don’t hide the problem.
If an empty catch block is really what you want (the API you call probably needs refactoring), at least put a comment explaining why.
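For instance, a sketch of a catch block that "cannot happen" but still reports itself if it ever does (the Readers helper is a made-up name):

```java
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UnsupportedEncodingException;

class Readers {
    static Reader utf8Reader(InputStream stream) {
        try {
            return new InputStreamReader(stream, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Should never happen, but if it does we want to know about it.
            throw new IllegalStateException("UTF-8 should always be supported", e);
        }
    }
}
```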

8. Fail early

Prefer throwing an exception to falling back on a default value when something fails. You want your code to tell you what’s wrong instead of doing something you didn’t ask for.
If something is not what’s expected, throw an exception: it is better to fail early on the real cause than later on the consequences, which will be harder to debug.
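As an illustration, here are two ways of reading a setting. The "port" property and the 8080 default are invented for the example.

```java
import java.util.Properties;

class Settings {
    // Lenient version: silently falls back to a default and hides the real problem.
    static int lenientPort(Properties props) {
        try {
            return Integer.parseInt(props.getProperty("port"));
        } catch (NumberFormatException e) {
            return 8080;
        }
    }

    // Fail-early version: reports exactly what is wrong, as soon as it is known.
    static int strictPort(Properties props) {
        String raw = props.getProperty("port");
        if (raw == null) {
            throw new IllegalArgumentException("Missing required property 'port'");
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Property 'port' is not a number: '" + raw + "'", e);
        }
    }
}
```

With the lenient version a typo in the configuration shows up much later, as a server listening on a port nobody asked for.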

9. Read the error message

When there’s an exception, if you did a good job you should be able to know quickly what the problem is by reading the messages and the stack traces of the chain of exceptions. I know this one sounds silly, but how many times have you had a bug report mentioning that “it fails” without a stack trace?

(this is an open list, I may come back later to add some)

Detecting low memory in Java Part 2

This is a follow-up to my previous post; the rationale is explained there.

I ended up spending a little more time on the low memory detection issue and played with the MemoryPoolMXBean. I had found some posts about it but none were satisfying. This post (thanks @techmilind for pointing it out) led me in the right direction even though it is partially incorrect. Experimenting with a monotonically increasing memory usage is a very special (and invalid) case.
In particular the setUsageThreshold() is not very useful in my case as it is triggered the very first time we use that much memory, regardless of imminent garbage collection. As discussed in my previous post it is useful only if the available memory is measured right after garbage collection.
However setCollectionUsageThreshold() is exactly what I need. This is setting a threshold for notification when the memory is low right after a GC.

We need to do this in two steps, first set a collectionUsageThreshold on the tenured partition.

// heuristic to find the tenured pool (largest heap) as seen on http://www.javaspecialists.eu/archive/Issue092.html
MemoryPoolMXBean tenuredGenPool = null;
for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
  if (pool.getType() == MemoryType.HEAP && pool.isUsageThresholdSupported()) {
    tenuredGenPool = pool;
  }
}
if (tenuredGenPool == null) {
  throw new IllegalStateException("Could not find the tenured generation memory pool");
}
// we do something when we reached 80% of memory usage
// (the threshold is a long; casting to int would cap it at about 2 GB)
tenuredGenPool.setCollectionUsageThreshold((long) Math.floor(tenuredGenPool.getUsage().getMax() * 0.8));

Here tenuredGenPool.getCollectionUsage() is the memory usage as measured right after the last garbage collection. This is the value that we can rely on to detect low memory.
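To see the difference between the two values, a sketch like the following prints the current usage (u) next to the usage after the last collection (cu) for every pool. PoolReport is a made-up helper and not the test code linked below; it prints "n/a" when a pool does not report a value or has no defined maximum.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

class PoolReport {
    // Used memory as a percentage of the pool maximum, or "n/a" when unknown.
    static String percent(MemoryUsage usage) {
        if (usage == null || usage.getMax() <= 0) {
            return "n/a";
        }
        return (100 * usage.getUsed() / usage.getMax()) + "%";
    }

    static String describe(MemoryPoolMXBean pool) {
        return pool.getName()
            + ": u=" + percent(pool.getUsage())             // usage right now
            + " cu=" + percent(pool.getCollectionUsage());  // usage after the last GC of this pool
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.println(describe(pool));
        }
    }
}
```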

Then we set up a listener to get notified. Only MEMORY_COLLECTION_THRESHOLD_EXCEEDED is interesting, as explained before.

// set a listener
MemoryMXBean mbean = ManagementFactory.getMemoryMXBean();
NotificationEmitter emitter = (NotificationEmitter) mbean;
emitter.addNotificationListener(new NotificationListener() {
  public void handleNotification(Notification n, Object hb) {
    if (n.getType().equals(MemoryNotificationInfo.MEMORY_COLLECTION_THRESHOLD_EXCEEDED)) {
      // this is the signal => end the application early to avoid OOME
    }
  }
}, null, null);

My experiment slowly increases the memory usage while also freeing up a bunch of existing objects. It shows that MEMORY_THRESHOLD_EXCEEDED is reached pretty quickly, even though most of the memory gets cleaned up afterwards. MEMORY_COLLECTION_THRESHOLD_EXCEEDED is only triggered when we actually fill up the memory.

edit: actual code explains it better than words.

Here is some test code to show how that works:

https://github.com/julienledem/blog/blob/master/2011/07/21/detecting-low-memory-in-java-part-2/MemoryTest.java

sample output: (only the last line is the warning we want)

it=   0 - Par Eden Space: u= 49% cu=  0% th=0% - Par Survivor Space: u=  0% cu=  0% th=0% - CMS Old Gen: u=  0% cu=  0% th=79%
it= 100 - Par Eden Space: u= 16% cu=  0% th=0% - Par Survivor Space: u= 99% cu= 99% th=0% - CMS Old Gen: u= 35% cu=  0% th=79%
memory threshold exceeded !!! : 
        - Par Eden Space: u= 92% cu=  0% th=0% - Par Survivor Space: u= 96% cu= 96% th=0% - CMS Old Gen: u= 86% cu=  0% th=79%
it= 200 - Par Eden Space: u= 84% cu= 45% th=0% - Par Survivor Space: u= 99% cu= 99% th=0% - CMS Old Gen: u= 19% cu= 19% th=79%
memory threshold exceeded !!! : 
        - Par Eden Space: u=  1% cu=  0% th=0% - Par Survivor Space: u= 99% cu= 99% th=0% - CMS Old Gen: u= 83% cu= 19% th=79%
it= 300 - Par Eden Space: u= 49% cu=  0% th=0% - Par Survivor Space: u= 99% cu= 99% th=0% - CMS Old Gen: u= 79% cu= 74% th=79%
memory threshold exceeded !!! : 
        - Par Eden Space: u=  4% cu=  0% th=0% - Par Survivor Space: u= 99% cu= 99% th=0% - CMS Old Gen: u= 84% cu= 74% th=79%
memory threshold exceeded !!! : 
        - Par Eden Space: u=  0% cu=  0% th=0% - Par Survivor Space: u= 99% cu= 99% th=0% - CMS Old Gen: u= 83% cu= 33% th=79%
it= 400 - Par Eden Space: u= 15% cu=  0% th=0% - Par Survivor Space: u= 99% cu= 99% th=0% - CMS Old Gen: u= 51% cu= 38% th=79%
memory threshold exceeded !!! : 
        - Par Eden Space: u=  6% cu=  0% th=0% - Par Survivor Space: u= 99% cu= 99% th=0% - CMS Old Gen: u= 83% cu= 38% th=79%
it= 500 - Par Eden Space: u= 79% cu=  0% th=0% - Par Survivor Space: u= 99% cu= 99% th=0% - CMS Old Gen: u= 59% cu= 46% th=79%
memory threshold exceeded !!! : 
        - Par Eden Space: u=  0% cu=  0% th=0% - Par Survivor Space: u= 99% cu= 99% th=0% - CMS Old Gen: u= 84% cu= 51% th=79%
it= 600 - Par Eden Space: u= 43% cu=  0% th=0% - Par Survivor Space: u= 99% cu= 99% th=0% - CMS Old Gen: u= 64% cu= 56% th=79%
memory threshold exceeded !!! : 
        - Par Eden Space: u=  0% cu=  0% th=0% - Par Survivor Space: u= 99% cu= 99% th=0% - CMS Old Gen: u= 84% cu= 56% th=79%
memory collection threshold exceeded !!! : 
        - Par Eden Space: u= 19% cu=  9% th=0% - Par Survivor Space: u= 99% cu= 99% th=0% - CMS Old Gen: u= 85% cu= 85% th=79%

Detecting low memory in Java

This seems to be a common need and difficult thing to do in Java. Here is my solution, let me know what you think.

1. The problem

I am focusing on only one aspect of the problem as I have a very specific use case. I have a piece of code that accumulates information about a (possibly big) data set. Let’s assume I have a very big file that contains a list of records encoded in JSON and that I iterate on the records to accumulate statistics about the data. This is close enough to what I’m doing. The result of the operation is a report containing the schema inferred from the data and statistics about the fields (max/min/avg size, list of unique values and count if bounded, type, …). This is implemented as an Algebraic UDF (so that it can use the Combiner) in a Pig script, so there’s a fair amount of the code that I don’t control. If the data is homogeneous everything is fine and it generates a rather small report with a lot of interesting information about the data. Now if the data is random enough (dynamic keys, …) it will eventually eat up all the memory until it fails. The first thing to do is to make it configurable to limit the number of fields/values/… it will accumulate, but getting this configuration right is still black magic and unsatisfying. What I really want is to keep accumulating until I run out of memory and output what I can. Of course, this has to happen without hitting the dreaded OutOfMemoryError, as it can be thrown anywhere and most often outside of my code where I can’t catch it. One important point here: I don’t want false positives.

2. Trying out things

First I looked into java.lang.Runtime.{free|max|total}Memory() and the more precise MemoryPoolMXBean. This tells me how much memory I am using and I can even set a usage threshold to be notified. Solved? Not really. The trouble is that reaching a level of usage does not mean you’re going to run out of memory. The unreachable objects do not get garbage collected until it is needed. To simplify: the application first eats up all the memory, then the garbage collector frees old objects, and this repeats again and again. Graphing memory usage will show an upward trend with a drop every time GC kicks in. One way to make the value returned by freeMemory() more accurate is to force a garbage collection right before, but of course this slows down the application drastically; there’s a reason the garbage collector runs only when needed. Also there’s a special type of OutOfMemoryError (“GC overhead limit exceeded”) that gets thrown when you spend over a certain percentage of time in GC, and you could artificially trigger it this way. Another way to look at this is to read the value of freeMemory() right after a (full) GC, but there’s no way to get notified of GC from the Java side. It seems you need to write an agent in C, which far exceeds the complexity threshold I had set for the solution. You could also imagine polling GarbageCollectorMXBean, which knows how many GCs happened (current>previous ? get freeMemory() ), but I did not try that (there was also a threshold on the time I spent on this :) ) and I’m not sure how reliable it would be.
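For the record, the untried polling idea could be sketched as below. This is an untested assumption about how it might work, not something the experiments above validated: GcPoller is a made-up class, the counts include minor collections, and a collection may have happened well before the poll, so the reading is not guaranteed to be taken right after a full GC.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcPoller {
    private long lastCount = totalCollections();

    // Sum of the collection counts of all collectors (a collector reports -1 when undefined).
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Memory the JVM could still hand out: max heap minus what is currently in use.
    static long availableMemory() {
        Runtime rt = Runtime.getRuntime();
        return rt.maxMemory() - (rt.totalMemory() - rt.freeMemory());
    }

    // Returns the available memory if at least one GC ran since the last call, -1 otherwise.
    long availableIfCollected() {
        long current = totalCollections();
        if (current == lastCount) {
            return -1;
        }
        lastCount = current;
        return availableMemory();
    }
}
```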

3. My solution

I settled for something very different, which happens to be triggered by the garbage collector when you are close to getting an OutOfMemoryError. I initialize a byte array and I set it in a SoftReference. The byte array is big enough so that freeing it will give me enough memory to finish what I’m doing gracefully. This acts like canaries in coal mines: the SoftReference “dies” as a warning that we are out of memory.

canary = new SoftReference<Object>(new byte[bufferSize]);

Now in each iteration I can check canary.get() == null as a signal that I’m running low on memory before actually getting an OutOfMemoryError (remember that it can be thrown somewhere I can not catch it). You could also use a ReferenceQueue to get notified of this. A SoftReference is how you tell the JVM that you’d rather keep the object but that it can be freed when there’s a need.
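Put together, the check in the accumulation loop could look roughly like this. MemoryCanary and accumulate are illustrative names, and the set of unique values stands in for the real statistics.

```java
import java.lang.ref.SoftReference;
import java.util.Iterator;
import java.util.Set;

class MemoryCanary {
    private final SoftReference<byte[]> canary;

    // bufferSize should be large enough to finish gracefully once the buffer is freed.
    MemoryCanary(int bufferSize) {
        canary = new SoftReference<byte[]>(new byte[bufferSize]);
    }

    // False once the garbage collector had to clear the buffer to make room.
    boolean isAlive() {
        return canary.get() != null;
    }

    // Accumulates records until the input is exhausted or the canary dies; returns the number processed.
    static int accumulate(Iterator<String> records, MemoryCanary canary, Set<String> uniqueValues) {
        int processed = 0;
        while (records.hasNext() && canary.isAlive()) {
            uniqueValues.add(records.next());
            processed++;
        }
        return processed;
    }
}
```

When the loop stops early, the caller can output what has been accumulated so far instead of failing.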

Experimentation monitoring the memory and GCs shows that the SoftReference actually gets freed as a last resort and not before, so it fits the bill.

One big drawback of this is that the buffer is actually using memory that cannot be used for something else. In particular, the less memory is left, the more time the GC takes. The consequence is that the application will slow down a lot right before freeing the buffer, delaying the detection. My experiments showed that it was acceptable for my use case. The big advantage: this is a very simple solution.

If you have an opinion about this, please comment.

edit: I have posted an update regarding the MemoryPoolMXBean.
