J2ME Game Optimization Secrets (part 5)

Other Techniques


One technique I was unable to include in my example code was the optimal use of a switch() statement. Switches are very commonly used to implement Finite State Machines, which are used in game Artificial Intelligence code to control the behavior of non-player actors. When you use a switch, it is good programming practice to write code like this:


  public static final int STATE_RUNNING = 1000;
  public static final int STATE_JUMPING = 2000;
  public static final int STATE_SHOOTING = 3000;

  switch ( n ) {
    case STATE_RUNNING:
      doRun();
      break;    // without break, execution falls through into the next case
    case STATE_JUMPING:
      doJump();
      break;
    case STATE_SHOOTING:
      doShoot();
      break;
  }


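There’s nothing wrong with this, and the int constants are nice and far apart, in case we might want to stick another constant in between RUNNING and JUMPING, like STATE_DUCKING = 2500. But the Java compiler can turn a switch statement into one of two bytecodes, tableswitch or lookupswitch. A tableswitch is a direct jump table indexed by the switch value, while a lookupswitch has to search a list of key/offset pairs. The compiler only emits the generally faster tableswitch when the case values are close together, so this would be better: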


  public static final int STATE_RUNNING = 1;
  public static final int STATE_JUMPING = 2;
  public static final int STATE_SHOOTING = 3;

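As a rough sketch of how this might look in a complete state machine, the class below uses consecutive constants in a switch. The class name ActorStates and the method nextState() are hypothetical and exist only for illustration. Whether the jump table is measurably faster depends on the VM in your handset, so treat this as something to profile rather than a guarantee.

```java
final class ActorStates {
    // Consecutive values give the compiler a dense range,
    // which is what it needs to emit a tableswitch.
    static final int STATE_RUNNING  = 1;
    static final int STATE_JUMPING  = 2;
    static final int STATE_SHOOTING = 3;

    // Hypothetical FSM step: returns the state that follows the given one.
    static int nextState(int state) {
        switch (state) {
            case STATE_RUNNING:
                return STATE_JUMPING;
            case STATE_JUMPING:
                return STATE_SHOOTING;
            case STATE_SHOOTING:
                return STATE_RUNNING;
            default:
                // Unknown state: fall back to running.
                return STATE_RUNNING;
        }
    }
}
```

If you want to check which bytecode was generated, running javap -c on the compiled class lists the instructions, and the switch will show up as either tableswitch or lookupswitch.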

There are also some optimizations you can perform when using a Fixed Point math library. First, if you’re doing a lot of division by a single number, you should instead work out the inverse of that number and perform a multiplication. Multiplication is slightly quicker than division. So instead of…


  int fpP = FP.Div( fpX, fpD );
  int fpQ = FP.Div( fpY, fpD );
  int fpR = FP.Div( fpZ, fpD );


…you should rewrite it like this:


  int fpID = FP.Div( FP.toInt( 1 ), fpD );  // 1.0 converted to Fixed Point, not the raw int 1
  int fpP = FP.Mul( fpX, fpID );
  int fpQ = FP.Mul( fpY, fpID );
  int fpR = FP.Mul( fpZ, fpID );

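Here is a minimal, self-contained sketch of the same idea. It assumes a 16.16 Fixed Point format (matching the shift by 16 used in this article), and the class FixedPointReciprocal and its methods mul(), div() and divideAll() are made up for the example rather than taken from any real library.

```java
final class FixedPointReciprocal {
    static final int ONE = 1 << 16; // 1.0 in 16.16 Fixed Point (assumed format)

    // Fixed Point multiply: widen to long, multiply, shift back down.
    static int mul(int x, int y) {
        return (int) (((long) x * y) >> 16);
    }

    // Fixed Point divide: pre-shift the numerator so the quotient stays in 16.16.
    static int div(int x, int y) {
        return (int) (((long) x << 16) / y);
    }

    // Divides every element by fpD using one division and one multiply per element.
    static void divideAll(int[] fpValues, int fpD) {
        final int fpInv = div(ONE, fpD); // computed once, outside the loop
        for (int i = fpValues.length - 1; i >= 0; i--) {
            fpValues[i] = mul(fpValues[i], fpInv);
        }
    }
}
```

Bear in mind that the inverse is itself rounded, so results can differ from a true division in the last few bits. For game coordinates that is usually acceptable, but it is worth checking if the values feed into comparisons that must be exact.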

If you’re performing hundreds of divisions every frame, this will help. Secondly, don’t take your FP math library for granted. If you have source for it, open it up and take a look at what’s going on in there. Make sure all the methods are declared final static and look for other opportunities to improve the code. For example, you may find that the multiplication method has to cast both ints to longs and then back to an int:


public static final int Mul (int x, int y) {
  long z = (long) x * (long) y;
  return ((int) (z >> 16));
}


Those casts take time. Collision detection using bounding circles or spheres involves adding the squares of ints together. That can generate some big numbers that might overflow the upper bound of your int Fixed Point data type. To avoid this, you could write your own square function that returns a long:


    public static final long Sqr (int x) {
      long z = (long) x;
      z *= z;
      return (z >> 16);
    }


This optimized method avoids a couple of casts. If you’re doing a great deal of Fixed Point math, you might consider replacing all of the library calls in the main game loop with the long-hand math. That will save a lot of method calls and parameter passing. You may also find that when the math is written out manually you can reduce the number of casts that are required. This is especially true if you are nesting several calls to your library, e.g.


  int fpA = FP.Mul( FP.toInt( 5 ),
                    FP.Mul( FP.Div( FP.toInt( 1 ), fpB ),
                            FP.Mul( FP.Div( fpC, fpD ),
                                    FP.toInt( 13 ) ) ) );


Take the time to unravel nested calls like this and see if you can reduce the amount of casting. You can also skip the casts to long altogether when you know that the numbers involved are small enough that the intermediate results definitely won’t overflow an int.

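To illustrate what unravelling can buy you, here is a small sketch that multiplies three Fixed Point values together, first with nested library-style calls and then long-hand. It assumes the 16.16 format used by the Mul method shown earlier. The class FixedPointChain and the methods mul3Nested() and mul3Inline() are hypothetical names.

```java
final class FixedPointChain {
    // Library-style multiply: two widening casts and one narrowing cast per call.
    static int mul(int x, int y) {
        return (int) (((long) x * (long) y) >> 16);
    }

    // Nested calls: the intermediate product is narrowed to int,
    // then widened to long again by the outer call.
    static int mul3Nested(int a, int b, int c) {
        return mul(mul(a, b), c);
    }

    // Long-hand: the intermediate product stays in a long,
    // so there is a single narrowing cast at the very end.
    static int mul3Inline(int a, int b, int c) {
        long t = ((long) a * b) >> 16;
        return (int) ((t * c) >> 16);
    }
}
```

The two versions give the same answer as long as the intermediate product fits in an int. The long-hand version also removes two method calls, which matters inside a tight loop.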

To help with high-level optimization, you should look for articles on game programming. Many of the problems that game programming presents, such as fast 3D geometry and collision detection, have already been solved very elegantly and efficiently. If you can’t find Java source, you will almost certainly find C source or pseudo-code to convert. Bounds checking, for example, is a common technique that we could have used inside our paint() method: instead of clearing the entire screen every frame, we only need to clear the section of the screen that has changed since the previous frame (this is often called dirty-rectangle tracking). Because graphics routines are relatively slow, you will find that the extra housekeeping required to keep track of which parts of the screen need to be cleared is well worth the effort.

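The housekeeping described above could be sketched roughly as follows. This is only one simple way to do it: a single bounding box that grows to cover everything that changed during the frame. The class DirtyRegion and its methods are hypothetical, and the class deliberately uses no MIDP types so it can be tried on its own.

```java
final class DirtyRegion {
    private int left, top, right, bottom; // right and bottom are exclusive
    private boolean empty = true;

    // Grows the region so that it also covers the given rectangle.
    void add(int x, int y, int w, int h) {
        if (empty) {
            left = x;
            top = y;
            right = x + w;
            bottom = y + h;
            empty = false;
            return;
        }
        if (x < left) left = x;
        if (y < top) top = y;
        if (x + w > right) right = x + w;
        if (y + h > bottom) bottom = y + h;
    }

    boolean isEmpty() { return empty; }

    // Position is only meaningful while the region is not empty.
    int getX() { return left; }
    int getY() { return top; }

    int getWidth()  { return empty ? 0 : right - left; }
    int getHeight() { return empty ? 0 : bottom - top; }

    // Call once the frame has been painted.
    void reset() { empty = true; }
}
```

Each time a sprite moves, you would add() both its old and its new rectangle. Inside paint(), if the region is not empty, you could pass its x, y, width and height to setClip() and fillRect() on the Graphics object, redraw whatever overlaps that area, and then call reset(). A single box can be wasteful when two small changes are far apart, in which case keeping a short list of rectangles may work better.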

Some phone manufacturers offer proprietary APIs that help programmers get around some of the limitations J2ME presents, such as lack of sound, lack of Image transparency, etc. Motorola, for example, offers a floating point math library that uses floating point math instructions on the chip. This library is much faster than the fastest Fixed Point math library, and a lot more accurate. Using these libraries completely destroys the portability of your code, of course, but they may be an option to consider if deployment on many different handsets is not a concern.


Conclusions

Only optimize code if you need to
Only optimize where it counts
Use the profiler to see where to optimize
The profiler won’t help you on the device, so use the System timer on the hardware
Always study your code and try to improve the algorithms before using low-level techniques
Drawing is slow, so use the Graphics calls as sparingly as possible
Use setClip() where possible to minimize the drawing area
Keep as much stuff as possible out of loops
Pre-calculate and cache like crazy
Strings create garbage and garbage is bad so use StringBuffers instead
Assume nothing
Use static final methods where possible and avoid the synchronized modifier
Pass as few parameters as possible into frequently-called methods
Where possible, remove method calls altogether
Unroll loops
Use bit shift operators instead of division or multiplication by a power of two
You can use bit operators to implement circular loops instead of modulo
Try to compare to zero instead of any other number
Array access is slower than in C, so cache array elements
Eliminate common sub-expressions
Local variables are faster than instance variables
Don’t wait() if you can callSerially()
Use small, close constants in switch() statements
Look inside your Fixed Point math library and optimize it
Unravel nested FP calls to reduce casting
Division is slower than multiplication, so multiply by the inverse instead of dividing
Use tried and tested algorithms
Use proprietary high-performance APIs with care to preserve portability
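
As a quick reminder of what the two bit-operator tips in this list look like in practice, here is a small sketch. The class BitTricks and its members are hypothetical names, and the masking trick only works when the size is a power of two.

```java
final class BitTricks {
    static final int RING_SIZE = 8;             // must be a power of two
    static final int RING_MASK = RING_SIZE - 1; // binary 0111

    // Circular increment: same result as (i + 1) % RING_SIZE for i >= 0.
    static int nextIndex(int i) {
        return (i + 1) & RING_MASK;
    }

    // Shift right by one instead of dividing by two.
    static int half(int x) {
        return x >> 1;
    }

    // Shift left by three instead of multiplying by eight.
    static int times8(int x) {
        return x << 3;
    }
}
```

One thing to watch for: shifting a negative number right rounds toward negative infinity, whereas integer division rounds toward zero, so -9 >> 1 is -5 while -9 / 2 is -4.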

Where to next?

Optimization is a black art. At the heart of any computer lies the CPU and at the heart of Java lies a virtual CPU, the JVM. To squeeze the last ounce of performance from the JVM, you need to know a lot about how it functions beneath the hood. Specifically, you need to know what things the JVM can do fast, and what it does slowly. Look for sites with solid information on the inner workings of Java. You don’t necessarily have to learn how to program in byte code, but the more you know, the easier it will be to come up with new ways to optimize your applications for performance.


There’s no substitute for experience. In time you will discover your own secrets about the performance characteristics of J2ME and of the handsets you are developing for. Even if you can’t code around certain idiosyncrasies, you could design your next game around them. While developing my game I found that calling drawImage() five times to draw five images of 25 pixels each is much slower than calling it once to draw an image five times the size. That knowledge will definitely help shape my next game.


Good luck, and have fun.


Resources:

  1. J2ME’s official web site contains the latest on what’s happening on this front.
  2. Like wireless games? Read the Wireless Gaming Review.
  3. Discuss J2ME Game Development at j2me.org
  4. A great site on many aspects of Java Optimization
  5. Another great site on Optimization
  6. Many articles on J2ME performance tuning
  7. The amazing Graphics Programming Black Book by Michael Abrash
  8. The Art of Computer Game Design by Chris Crawford
