Julien's tech blog

talking about tech stuff: stuff I'm interested in, intriguing stuff, random stuff.

Java JIT compiler inlining

As you know, the Java Virtual Machine (JVM) optimizes Java bytecode at runtime using a just-in-time (JIT) compiler. However, the exact behavior of the JIT is hard to predict and documentation is scarce. You probably know that the JIT will try to inline frequently called methods in order to avoid the overhead of method invocation. But you may not realize that the heuristic it uses depends on both how often a method is invoked and how big it is. Methods that are too big cannot be inlined without bloating the call sites.

Keeping this heuristic in mind and enabling a few flags on the java command line, we can find places where we can help the JIT optimize our code better by breaking large methods into smaller hot methods that can usefully be inlined.

The JIT aggressively inlines methods, removing the overhead of method calls. Methods that can be inlined include static, private or final methods, but also public methods if it can be determined that they are not overridden. Because that determination only holds for the classes loaded so far, subsequent class loading can invalidate the previously generated code. Because inlining every method everywhere would take time and would generate an unreasonably big binary, the JIT compiler inlines the hot methods first until it reaches a threshold. To determine which methods are hot, the JVM keeps counters to see how many times a method is called and how many loop iterations it has executed. This means that inlining happens only after a steady state has been reached, so you need to repeat the operations a certain number of times before there is enough profiling information available for the JIT compiler to do its job.
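To make the class loading point concrete, here is a small sketch. Greeter, LoudGreeter and DevirtualizationDemo are hypothetical names made up for this example, and whether the JVM really compiles, inlines and later discards code at these exact points is an assumption that depends on the JVM version and its thresholds.

```java
// Hypothetical classes, for illustration only.
class Greeter {
    // Public and non-final: a candidate for inlining only as long as
    // the JIT can assume that nothing overrides it.
    public int weight() {
        return 1;
    }
}

class LoudGreeter extends Greeter {
    @Override
    public int weight() {
        return 2;
    }
}

class DevirtualizationDemo {

    // Hot loop with a virtual call to weight().
    static long total(Greeter greeter, int calls) {
        long sum = 0;
        for (int i = 0; i < calls; i++) {
            sum += greeter.weight();
        }
        return sum;
    }

    public static void main(String[] args) {
        // Phase 1: only the base implementation of weight() is exercised.
        long first = total(new Greeter(), 1_000_000);

        // Phase 2: an overriding subclass comes into play. If the JIT had
        // assumed weight() was never overridden, that compiled code can no
        // longer be used and has to be regenerated.
        long second = total(new LoudGreeter(), 1_000_000);

        // Prints "1000000 2000000".
        System.out.println(first + " " + second);
    }
}
```

Running this with -XX:+PrintCompilation may show compiled code for total being marked "made not entrant" around the second phase, but I would not rely on seeing that on every JVM.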

Rather than trying to guess what the JIT is doing, you can take a peek at what’s happening by turning on java command line flags: -XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining

Here is what they do:

  • -XX:+PrintCompilation: logs when JIT compilation happens
  • -XX:+UnlockDiagnosticVMOptions: enables other flags like -XX:+PrintInlining
  • -XX:+PrintInlining: prints what methods get inlined and where

Turning those flags on, you will see the JVM print out compilation information to the standard output.
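If you want something small to try these flags on, here is a minimal sketch. The InliningDemo class and its method names are made up for illustration (they are not from Parquet), and the loop counts are arbitrary values that I assume are large enough for the JIT to kick in on a typical HotSpot JVM.

```java
// Hypothetical demo class, not part of Parquet.
class InliningDemo {

    // Small method called from the hot loop: a typical inlining candidate.
    static int square(int x) {
        return x * x;
    }

    // Calls square() once per iteration so the JVM's counters see it often.
    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += square(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        long total = 0;
        // Repeat the work so the JIT gathers enough profiling information.
        for (int round = 0; round < 20_000; round++) {
            total += sumOfSquares(1_000);
        }
        // Print the result so the computation cannot be discarded as dead code.
        System.out.println(total);
    }
}
```

Compile it with javac InliningDemo.java, then run it with java -XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining InliningDemo. The exact output varies between JVM versions, but I would expect square to show up somewhere in the inlining tree.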

Inlined methods are displayed as a tree whose leaves are annotated:

  • inline (hot): the method was determined hot and inlined
  • too big: the method was not inlined as the generated code was getting too big (but the method was not hot)
  • hot method too big: the method was determined hot but not inlined because the resulting code was getting too big.

In the output, you want to make sure hot methods get inlined. Seeing “too big” on its own is nothing to worry about, as the cost of a method call that does not happen often is negligible. It is when you see “hot method too big” that you want to take a closer look and find out how you can make the JIT compiler’s life easier. The compiler works at the granularity of the method and inlining is an all-or-nothing operation: big methods reduce opportunities for inlining.

To illustrate the theory let’s take a look at the decoding implementation for Parquet, a columnar file format for Hadoop. I used the -XX:+PrintInlining flag to look at how methods get inlined and saw an instance of “hot method too big”.

! @ 1 parquet.column.impl.ColumnReaderImpl::checkRead (492 bytes)   hot method too big

For every value read in a column it checks if the current page is fully consumed and reads a page if it needs more data to process.

private void checkRead() {
  if (isPageFullyConsumed()) {
    // read page
    // … code for reading a page
    // …
    // …
  }
  read();
}

The code reading the content of the page is right there in the method making it too big to be inlined. The method is called for every value we read (which is very frequent) but the test is true only when we are done reading the page (which is rare).

We can improve it by modifying the method as follows:

private void checkRead() {
  if (isPageFullyConsumed()) {
    readPage();
  }
  read();
}

private void readPage() {
  // read page
  // … code for reading a page
  // …
  // …
}

Now the checkRead(), isPageFullyConsumed() and read() methods get inlined, removing those method calls from the hot loop. The readPage() call does not get inlined, but as it does not happen often, the cost is negligible. The change is on GitHub.

To generalize the principle: when a hot method contains tests that rarely evaluate to true, it can be a good idea to move the body of the if statement (or switch or for) into a separate method, giving the JIT compiler a finer granularity at which to optimize the code. This is of course after you have used -XX:+PrintInlining to determine that there is something to improve in the first place.
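Here is the same pattern as a self-contained sketch you can compile and run. PagedReader and everything in it is hypothetical (this is not the Parquet code); it assumes that every page is non-empty and that the caller never reads more values than exist.

```java
// Hypothetical paged reader showing the hot path / rare path split.
class PagedReader {
    private final int[][] pages;      // all pages, kept in memory for the sketch
    private int[] page = new int[0];  // current page, empty until the first read
    private int pageIndex = -1;
    private int position = 0;

    PagedReader(int[][] pages) {
        this.pages = pages;
    }

    // Hot path: called for every value, kept small so it stays an
    // inlining candidate.
    int next() {
        if (position == page.length) {  // rarely true: once per page
            loadNextPage();
        }
        return page[position++];
    }

    // Rare path: moved out of next() so that its size does not count
    // against the hot method.
    private void loadNextPage() {
        pageIndex++;
        page = pages[pageIndex];
        position = 0;
    }

    // Reads valueCount values and adds them up.
    static long sum(int[][] pages, int valueCount) {
        PagedReader reader = new PagedReader(pages);
        long sum = 0;
        for (int i = 0; i < valueCount; i++) {
            sum += reader.next();
        }
        return sum;
    }

    public static void main(String[] args) {
        int[][] pages = { {1, 2, 3}, {4, 5}, {6} };
        System.out.println(sum(pages, 6)); // prints 21
    }
}
```

In a real reader the page loading code would be much longer, which is exactly why it pays to keep it out of next().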

The moral of this story is that I made my code faster by adding a method call.

Thanks @peterseibel for the feedback.


2 responses to “Java JIT compiler inlining”

  1. Pingback: Parquet – columnar storage for Hadoop | Hadoopified

  2. joe October 25, 2013 at 4:41 am

    really cool, informative post. and ironic that adding theoretical ‘overhead’ in a method call leads to actual gain
