J2ME Game Optimization Secrets (part 3)

Out of the loop?


Code inside a for() loop will be executed as many times as the loop iterates. To improve performance, therefore, we want to leave as much code as possible outside of our loops. We can see from the profiler that our paint() method is being called 101 times, and the loop inside iterates 16 times. What can we leave out of those two loops? Let’s start with all those declarations. We are declaring a Font, a String, an Image and a Graphics object every time paint() is called. We’ll move these outside of the method and to the top of our class.


public static final Font font =
  Font.getFont( Font.FACE_PROPORTIONAL,
                Font.STYLE_BOLD | Font.STYLE_ITALIC,
                Font.SIZE_SMALL);
public static final int graphicAnchor =
                   Graphics.VCENTER | Graphics.HCENTER;
public static final int textAnchor =
      Graphics.TOP | Graphics.LEFT;
private static final String MESSAGE = " ms per frame";
private String msMessage = "000" + MESSAGE;
private Image stringImage;
private Graphics imageGraphics;
private long oldFrameTime;


You’ll notice I made the Font object a public constant. This is often a useful thing to do in your apps, as you can gather the font declarations you use most together in one place. I have found the same goes for anchors, so I’ve done the same with the text and graphic anchors. Pre-calculating these things keeps those calculations, however insignificant, out of our loop.


I’ve also made the MESSAGE a constant. That’s because Java loves to create String objects all over the place. Strings can be a huge memory drain if they’re not controlled. Don’t take them for granted, or you will probably bleed memory, which can in turn affect performance, especially if the garbage collector is being called too often. Strings create garbage, and garbage is bad. Using a String constant reduces the problem. Later we’ll see how to use a StringBuffer to completely stop memory loss from String abuse.
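To make the garbage concrete, here is a minimal sketch in plain Java (nothing MIDP-specific). The class StringGarbageDemo and its methods label() and placeholder() are made up for illustration. It relies on a rule from the Java Language Specification: a String built by concatenation at run time is a newly created object, while a concatenation of compile-time constants is folded into a single shared literal.

```java
// Sketch: run-time concatenation allocates, constant expressions do not.
// StringGarbageDemo, label() and placeholder() are hypothetical names.
class StringGarbageDemo {
    static final String MESSAGE = " ms per frame";

    // Built at run time: every call creates a fresh String object.
    static String label(long frameTime) {
        return frameTime + MESSAGE;
    }

    // Constant expression: the compiler folds this into one shared literal.
    static String placeholder() {
        return "000" + MESSAGE;
    }

    public static void main(String[] args) {
        // Same text, but two separate objects for the garbage collector.
        System.out.println(label(120) == label(120));           // false
        // Same text and the same object every time.
        System.out.println(placeholder() == placeholder());     // true
    }
}
```

Called once per frame, label() leaves one discarded String behind on every call, which is the kind of slow bleed described above.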


Now that we’ve made these things instance variables, we need to add this code to the constructor:


stringImage = Image.createImage( font.stringWidth( msMessage ),
                                 font.getBaselinePosition() );
imageGraphics = stringImage.getGraphics();
imageGraphics.setFont( font );


Another cool thing about getting our Graphics object up-front is that we can set the font once and then forget about it instead of setting it every time we iterate through the loop. We still have to wipe the Image object each time using fillRect(). Gung-ho coders might see an opportunity there to create two Graphics objects from the same Image, and pre-set the color of one to COLOR_BG for the call to fillRect() and COLOR_FG on the other for the calls to drawString(). Unfortunately, the behavior of getGraphics() when called multiple times on the same Image is ill-defined in J2ME and differs across platforms, so your optimization tweak might work on Motorola but not on Nokia. If in doubt, assume nothing.


There is another way to improve on our paint() method. Using our brain again we realize that we only need to re-draw the string if the frameTime value has changed since the method was last called. That’s where our new variable oldFrameTime comes in. Here’s the new method:


public void paint(Graphics g) {
  g.setColor( COLOR_BG );
  g.fillRect( 0, 0, getWidth(), getHeight() );
  if ( frameTime != oldFrameTime ) {
    msMessage = frameTime + MESSAGE;
    imageGraphics.setColor( COLOR_BG );
    imageGraphics.fillRect( 0, 0, stringImage.getWidth(),
                            stringImage.getHeight() );
    imageGraphics.setColor( COLOR_FG );
    imageGraphics.drawString( msMessage, 0, 0, textAnchor );
  }
  for ( int i  = 0 ; i < DRAW_COUNT ; i ++ ) {
    g.drawImage( stringImage, getRandom( getWidth() ),
                 getRandom( getHeight() ), graphicAnchor );
  }
  oldFrameTime = frameTime;
}


The Profiler now shows that the time spent in OCanvas paint is down to 42.01% of the total. Comparing the frameTime across calls to paint() resulted in drawString() and fillRect() being called 69 times instead of 101. That’s a decent savings, and there’s not much more to go, but now it’s time to get serious. The more you optimize, the harder it gets. Now we are down to scraping out the last pieces of superfluous, cycle-eating code. We’re dealing with shaving off very small percentages, or even fractions of percentages now, but if we’re lucky, they’ll add up to something significant.
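The frameTime comparison is a plain dirty check, and the pattern is easy to try outside a MIDlet. The sketch below is a stand-in only: DirtyCheckDemo is a hypothetical class, a String assignment takes the place of the fillRect() and drawString() calls, and the frame times are invented. It also starts the remembered value at -1, a value no real frame time can equal, so the very first call always renders.

```java
// Sketch of a dirty check: repeat the expensive work only when the input changes.
// DirtyCheckDemo and its sample data are hypothetical.
class DirtyCheckDemo {
    private long lastValue = -1;   // sentinel that no real frame time can equal
    private int renderCount = 0;
    private String cached = "";

    // Stands in for paint(): re-renders the label only when frameTime changed.
    String paint(long frameTime) {
        if (frameTime != lastValue) {
            cached = frameTime + " ms per frame"; // placeholder for fillRect() + drawString()
            renderCount++;
            lastValue = frameTime;
        }
        return cached;
    }

    int getRenderCount() {
        return renderCount;
    }

    public static void main(String[] args) {
        DirtyCheckDemo demo = new DirtyCheckDemo();
        long[] frameTimes = { 30, 30, 31, 31, 31, 30 };
        for (int i = 0; i < frameTimes.length; i++) {
            demo.paint(frameTimes[i]);
        }
        // prints "3 renders for 6 paints"
        System.out.println(demo.getRenderCount() + " renders for "
                           + frameTimes.length + " paints");
    }
}
```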


Let’s start with something easy. Instead of calling getHeight() and getWidth(), let’s call those methods once and cache the results outside our loop. Next, we’re going to stop using Strings and do everything manually with a StringBuffer. We’re then going to shave a little off the calls to drawImage() by restricting the drawing area through calls to Graphics.setClip(). Finally, we’re going to avoid making calls to java.util.Random.nextInt() inside our loop.


Here are our new variables…


  private static final String MESSAGE = "ms per frame:";
  private int iw, ih, dw, dh;
  private StringBuffer stringBuffer;
  private int messageLength;
  private int stringLength;
  private char[] stringChars;
  private static final int RANDOMCOUNT = 256;
  private int[] randomNumbersX = new int[RANDOMCOUNT];
  private int[] randomNumbersY = new int[RANDOMCOUNT];
  private int ri;


…and here is the new code for our constructor:


iw = stringImage.getWidth();
ih = stringImage.getHeight();
dw = getWidth();
dh = getHeight();
for ( int i = 0 ; i < RANDOMCOUNT ; i++ ) {
  randomNumbersX[i] = getRandom( dw );
  randomNumbersY[i] = getRandom( dh );
}
ri = 0;
stringBuffer = new StringBuffer( MESSAGE + "000" );
messageLength = MESSAGE.length();
stringLength = stringBuffer.length();
stringChars = new char[stringLength];
stringBuffer.getChars( 0, stringLength, stringChars, 0 );


You can see we’re pre-calculating Display and Image dimensions. We’re also caching the results of 512 calls to getRandom(), and we’ve dispensed with the msMessage String in favor of a StringBuffer. The meat is still in the paint() method, of course:


  public void paint(Graphics g) {
    g.setColor( COLOR_BG );
    g.fillRect( 0, 0, dw, dh );
    if ( frameTime != oldFrameTime ) {
      stringBuffer.delete( messageLength, stringLength );
      stringBuffer.append( (int)frameTime );
      stringLength = stringBuffer.length();
      stringBuffer.getChars( messageLength,
                             stringLength,
                             stringChars,
                             messageLength );
      iw = font.charsWidth( stringChars, 0, stringLength );
      imageGraphics.setColor( COLOR_BG );
      imageGraphics.fillRect( 0, 0, iw, ih );
      imageGraphics.setColor( COLOR_FG );
      imageGraphics.drawChars( stringChars, 0,
                               stringLength, 0, 0, textAnchor );
    }
    for ( int i  = 0 ; i < DRAW_COUNT ; i ++ ) {
      g.setClip( randomNumbersX[ri], randomNumbersY[ri], iw, ih );
      g.drawImage( stringImage, randomNumbersX[ri],
                   randomNumbersY[ri], textAnchor );
      ri = (ri+1) % RANDOMCOUNT;
    }
    oldFrameTime = frameTime;
  }


We’re using a StringBuffer now to draw the characters of our message. It’s easier to append characters to the end of a StringBuffer than to insert them at the beginning, so I’ve switched our display text around and the frameTime is now at the end of the message, e.g. “ms per frame:120”. We’re just writing over the last few frameTime characters each time, and leaving the message part intact. Using a StringBuffer explicitly like this saves the system from creating and destroying Strings and StringBuffers each time through our paint() method. It’s extra work, but it’s worth it. Note that I’m casting frameTime to an int. I found that using append(long) caused a memory leak. I don’t know why, but it’s a good example of why you should use the utilities to keep an eye on things.
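Here is the buffer-reuse idea on its own, as a rough sketch in plain Java. ReusableLabel and its method names are hypothetical, and text() exists only so the result can be printed (it allocates a String, so it would not belong in a real paint() method). One deliberate difference from the listing above: the scratch array there is sized for three digits, so this sketch grows the array if a value needs more room.

```java
// Sketch: keep one StringBuffer and one char[] alive and rewrite only the digits.
// ReusableLabel and its method names are hypothetical.
class ReusableLabel {
    private static final String PREFIX = "ms per frame:";
    private final StringBuffer buffer = new StringBuffer(PREFIX + "000");
    private final int prefixLength = PREFIX.length();
    private char[] chars;
    private int length;

    ReusableLabel() {
        length = buffer.length();
        chars = new char[length];
        buffer.getChars(0, length, chars, 0);   // copy prefix and placeholder once
    }

    void update(int value) {
        buffer.delete(prefixLength, buffer.length()); // drop the old digits, keep the prefix
        buffer.append(value);
        length = buffer.length();
        if (length > chars.length) {
            // More digits than the scratch array can hold: grow it and recopy everything.
            chars = new char[length];
            buffer.getChars(0, length, chars, 0);
        } else {
            // Common case: overwrite only the digits after the prefix.
            buffer.getChars(prefixLength, length, chars, prefixLength);
        }
    }

    char[] getChars() { return chars; }
    int getLength() { return length; }

    // For demonstration only: this allocates a new String.
    String text() { return new String(chars, 0, length); }

    public static void main(String[] args) {
        ReusableLabel label = new ReusableLabel();
        label.update(7);
        System.out.println(label.text());   // ms per frame:7
        label.update(1234);
        System.out.println(label.text());   // ms per frame:1234
    }
}
```

In a MIDlet, getChars() and getLength() would feed drawChars() directly, so no String is needed at all.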


We’re also using font.charsWidth() to calculate the width of the message image so that we have to do the minimum of drawing. We’ve been using a proportional font, so the image for “ms per frame:1” will be smaller than the image for “ms per frame:888”, and we’re using Graphics.setClip() so we don’t have to draw the extra. This also means we only have to fill a rectangle big enough to blank out the area we need. We’re hoping that the drawing time we save makes up for the extra time spent calling font.charsWidth().


It may not make much of a difference here, but this is a great technique to use in a game for drawing a player’s score on the screen. In that case, there’s a big difference between drawing a score of 0 and a score of 150,000,000. This is hampered somewhat by the implementation’s incorrect return values for font.getBaselinePosition(), which seems to return the same value as font.getHeight(). Sigh.


Finally, we’re just looking up the pre-calculated “random” co-ordinates in our two arrays, which saves us making those calls. Note the use of the modulo operator to implement a circular array. Note also that we’re using the textAnchor for drawing both the image and the string now so that setClip() works correctly.
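The lookup table can be sketched by itself like this. RandomTable is a hypothetical class, and the seed parameter is there only to make the demo repeatable. Because the table size is a power of two, the wrap-around can also be written as a bitwise AND, which gives the same index as the modulo; whether that is any faster on a particular handset is something to measure, not assume.

```java
// Sketch: a circular table of pre-computed "random" values.
// RandomTable is a hypothetical class; COUNT mirrors RANDOMCOUNT above.
class RandomTable {
    static final int COUNT = 256;            // must be a power of two for the mask below
    private final int[] values = new int[COUNT];
    private int index = 0;

    RandomTable(int bound, long seed) {
        java.util.Random random = new java.util.Random(seed);
        for (int i = 0; i < COUNT; i++) {
            // Shift away the sign bit so the remainder is never negative.
            values[i] = (random.nextInt() >>> 1) % bound;
        }
    }

    int next() {
        int value = values[index];
        index = (index + 1) & (COUNT - 1);    // same result as (index + 1) % COUNT
        return value;
    }

    public static void main(String[] args) {
        RandomTable xs = new RandomTable(128, 1L);
        int first = xs.next();
        for (int i = 1; i < COUNT; i++) {
            xs.next();
        }
        // After COUNT lookups the table wraps back to its first entry.
        System.out.println(first == xs.next());   // true
    }
}
```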


We are now firmly in a gray area with respect to the numbers this version of the code produces. The profiler tells me that the code is spending roughly 7% more time in paint() than without these changes. The call to font.charsWidth() is probably to blame, weighing in at 4.6%. (That’s not great, but it could be reduced. Notice that we’re retrieving the width of the MESSAGE string every time. We could easily calculate that ahead of the loop body and simply add it to the frameTime width.) Also, the new call to setClip() is labelled 0.85%, and seems to significantly increase the percentage of time spent in drawImage() (to 33.94% from 27.58%).
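The prefix-width idea from the parenthesis could look roughly like this. Everything here is a stand-in: WidthCache is a hypothetical class, and its charWidth() table is invented so the sketch runs without a MIDP Font. It also assumes that widths simply add up, character by character, which is worth checking against the real font before relying on it.

```java
// Sketch: measure the fixed prefix once, then measure only the digits per frame.
// WidthCache and its made-up metrics are hypothetical, not a real font.
class WidthCache {
    // Invented per-character widths standing in for a proportional font.
    static int charWidth(char c) {
        if (c == 'm') return 8;
        if (c >= '0' && c <= '9') return 6;
        if (c == ' ' || c == ':') return 3;
        return 5;
    }

    // Stand-in for Font.charsWidth( chars, offset, length ).
    static int charsWidth(char[] chars, int offset, int length) {
        int width = 0;
        for (int i = offset; i < offset + length; i++) {
            width += charWidth(chars[i]);
        }
        return width;
    }

    private final int prefixLength;
    private final int prefixWidth;

    WidthCache(char[] chars, int prefixLength) {
        this.prefixLength = prefixLength;
        this.prefixWidth = charsWidth(chars, 0, prefixLength); // measured once, up front
    }

    // Per-frame cost: only the characters after the prefix are measured.
    int totalWidth(char[] chars, int length) {
        return prefixWidth + charsWidth(chars, prefixLength, length - prefixLength);
    }

    public static void main(String[] args) {
        char[] chars = "ms per frame:120".toCharArray();
        WidthCache cache = new WidthCache(chars, "ms per frame:".length());
        System.out.println(cache.totalWidth(chars, chars.length)
                           == charsWidth(chars, 0, chars.length));   // true
    }
}
```

In the MIDlet, the per-frame call would become font.charsWidth( stringChars, messageLength, stringLength - messageLength ), added to a prefix width computed in the constructor.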


At this point, it looks like all this extra code must certainly be slowing things down, but the values generated by the application contradict this assumption. Figures on the emulator fluctuate so much as to be inconclusive without running longer tests, but my i85s reports that things are a little faster with the extra clipping code than without: 37130 ms without either call to setClip() or charsWidth(), and 36540 ms with both. I ran this test as many times as I had patience for, and the results were solid. This highlights the issue of varying execution environments. Once you get to the point where you’re not sure if you’re making headway, you might be forced to continue all your testing on the hardware, which would require a lot of installing and uninstalling of JAR files.


So it looks like we’ve squeezed a lot more performance from our graphical routines. Now it’s time to take the same high-level and low-level approaches with our work() method. Let’s review that method:


  public synchronized int work( int[] n ) {
    r = 0;
    for ( int j = 0 ; j < DIVISOR_COUNT ; j++ ) {
      for ( int i = 0 ; i < n.length ; i++ ) {
        divisor = getDivisor(j);
        r += workMore( n, i, divisor );
      }
    }
    return r;
  }


Every time through the loop in run() we’re passing in our array of numbers. The outer loop in the work() method calculates our divisor, then calls workMore() to actually perform the division. All kinds of things are wrong here, as you can probably tell. For a start, the programmer has put the call to getDivisor() inside the inner loop. Given that the value of j does not change through the inner loop, the divisor is an invariant, and really belongs outside the inner loop.
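To see what hoisting the invariant buys, here is a small stand-alone sketch that counts the calls. The bodies of getDivisor() and workMore() are assumptions made for this demo (powers of two and a single integer division); HoistDemo and the sample data are hypothetical too.

```java
// Sketch: moving a loop-invariant call out of the inner loop.
// HoistDemo is hypothetical; getDivisor() and workMore() are assumed stand-ins.
class HoistDemo {
    static int divisorCalls = 0;

    // Assumption: the divisor is a power of two. The counter is for the demo only.
    static int getDivisor(int j) {
        divisorCalls++;
        return 1 << j;
    }

    // Assumption: workMore() performs one integer division.
    static int workMore(int[] n, int i, int divisor) {
        return n[i] / divisor;
    }

    // Original shape: the invariant call sits inside the inner loop.
    static int workOriginal(int[] n, int divisorCount) {
        int r = 0;
        for (int j = 0; j < divisorCount; j++) {
            for (int i = 0; i < n.length; i++) {
                int divisor = getDivisor(j);
                r += workMore(n, i, divisor);
            }
        }
        return r;
    }

    // Hoisted shape: one call per outer iteration.
    static int workHoisted(int[] n, int divisorCount) {
        int r = 0;
        for (int j = 0; j < divisorCount; j++) {
            int divisor = getDivisor(j);   // j is fixed for the whole inner loop
            for (int i = 0; i < n.length; i++) {
                r += workMore(n, i, divisor);
            }
        }
        return r;
    }

    public static void main(String[] args) {
        int[] n = { 8, 16, 24, 32 };
        divisorCalls = 0;
        int a = workOriginal(n, 4);
        System.out.println(a + " with " + divisorCalls + " calls");   // 150 with 16 calls
        divisorCalls = 0;
        int b = workHoisted(n, 4);
        System.out.println(b + " with " + divisorCalls + " calls");   // 150 with 4 calls
    }
}
```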


But let’s think about this some more. The call itself is completely unnecessary. This code does the same thing…


  public synchronized int work( int[] n ) {
    r = 0;
    divisor = 1;
    for ( int j = 0 ; j < DIVISOR_COUNT ; j++ ) {
      for ( int i = 0 ; i < n.length ; i++ ) {
        r += workMore( n, i, divisor );
      }
      divisor *= 2;
    }
    return r;
  }


…without that call to getDivisor(). Now our profiler is telling me that we are spending 23.72% in our run() method, versus 38.78% before we made these improvements. Always optimize with your head first before messing with low-level optimization tricks. With that said, let’s take a look at some of those tricks.

