Elementary Arithmetic Methods


Note written January 14th, 1998: This is quite dated stuff, but back then we really did sweat getting decent performance out of our 8-bit machines.

Time was when the amount of processing you had to do to support a game involved little more than some simple boolean operations to determine whether the bullet hit the target. We’ve come a long way since those days, and nowadays our games require far more complex computations. In this article, I’ll go over some of the elementary considerations surrounding the use of arithmetic in games.

The first and most important consideration is the choice of number representation for arithmetic calculations. There are four common options: 8-bit bytes, 16-bit words, 32-bit longwords, and floating-point representations.

Eight-bit representations went out with eight-bit CPUs. Those designers still stuck with eight-bit CPUs do not have many options. This article can do nothing for them other than request a moment of sympathetic silence from the reader. There is, however, one situation in which 16-bit designers might still wish to use 8-bit bytes; more on this later.

The 16-bit integer is now the primary representation for data in computer games. It is more than adequate for most purposes; under ideal circumstances, a 16-bit word offers average resolution of one part in 16,000, or better than .01%. In practice, there are quite a few gotchas to cope with.

First is the fact that the effective resolution of the 16-bit word decreases with the value used. Suppose, for example, that you want to record the wealth of a character in a game, measured in dollars. What happens if you have a rule penalizing errant players with the loss of half their wealth? If the character has 25,000 dollars, this works fine. But if the player happens to have 3 dollars, halving his wealth will give him 1 dollar; that’s quite an inaccuracy. It’s even worse if the player has one dollar.

A common solution to this problem is to constrain the dynamic range of the values accessible in the game. You come up with a brute-force rule that keeps the player’s wealth between a lower limit and an upper limit.

Underflow and Overflow
Two of the most common (and frustrating) problems are overflow and underflow. Overflow occurs when you push a number over its maximum permissible value. For example, if the player has a wealth of 32,000, and then finds a treasure trove of 769 dollars, the player’s wealth should be increased to 32,769. Unfortunately, a 16-bit signed integer cannot show a value in excess of 32,767. Therefore, when you add 769 to 32,000, you will get a result of -32767. Underflow is the same problem at the bottom end of the scale. If you take -32767 and subtract 769 from it, you will get a result of 32,000.
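In modern C terms, the wraparound behaves like this sketch (the casts through an unsigned type are just there to make the wrap well-defined; a real 16-bit CPU does this for free):

```c
#include <stdint.h>

/* Simulate 16-bit signed addition as it behaves on a 16-bit CPU:
   the result simply wraps around when it leaves the legal range. */
int16_t add16(int16_t a, int16_t b)
{
    return (int16_t)(uint16_t)((uint16_t)a + (uint16_t)b);
}
```

With this model, adding 769 to 32,000 really does produce -32,767, and subtracting 769 from -32,767 brings you right back to 32,000.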

Artificially constraining a value to stay within a particular range does work, but it has its own problems. First, your code becomes inelegant as you backstop every calculation with "IF x > MaxLimit THEN x:=MaxLimit" statements. More important, it is artificial; one way or the other, the user is going to encounter the constraint and wonder, why can’t I have more than 32,767 dollars? Imposing artificial constraints on the user is bad form.
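For the record, the brute-force backstop looks like this in C (the limit names and values are placeholders, not anything canonical):

```c
#include <stdint.h>

#define MAX_WEALTH 32767
#define MIN_WEALTH 0

/* Brute-force clamp: add, then pin the result inside the legal range.
   The wide 32-bit intermediate keeps the raw sum from wrapping first. */
int16_t add_clamped(int16_t wealth, int16_t delta)
{
    int32_t sum = (int32_t)wealth + delta;
    if (sum > MAX_WEALTH) sum = MAX_WEALTH;
    if (sum < MIN_WEALTH) sum = MIN_WEALTH;
    return (int16_t)sum;
}
```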

There’s a sneakier way to get around the problem. Instead of putting a hard ceiling over the head of the value, put a soft ceiling. To use the wealth example, instead of arbitrarily confiscating any wealth that might put the player over the ceiling, impose a kind of progressive income tax. That is, as the player grows wealthier and wealthier, you make it harder for him to acquire more wealth. Note that this is just as arbitrary as outright confiscation, but is not so brutal nor so obvious. The player never encounters a brick wall; instead, he encounters a progressively firmer spongy wall. It’s still a wall, but it’s a nicer wall.
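One possible sketch of such a spongy wall (the linear taper and the 32,000 ceiling are my own illustrations, not the only choices): scale each gain down by the fraction of the ceiling the player has already used up.

```c
#include <stdint.h>

#define CEILING 32000L

/* Soft ceiling: the closer the player gets to CEILING, the smaller
   the fraction of each new gain he actually keeps.  At zero wealth
   he keeps everything; near the ceiling he keeps almost nothing. */
int16_t add_soft(int16_t wealth, int16_t gain)
{
    int32_t kept = (int32_t)gain * (CEILING - wealth) / CEILING;
    return (int16_t)(wealth + kept);
}
```

A pauper who finds 100 dollars keeps all 100; a player already holding 16,000 dollars keeps only 50 of it.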

There are other solutions. The simplest is to use a larger word size. If 16 bits aren’t enough, go to 32 bits. Current CPUs can handle 32-bit integer arithmetic directly with little additional time cost. Of course, 32-bit words consume twice as much memory as 16-bit words.

Scaling Problems
Geometric manipulations (in which you adjust a number by multiplying it by a fraction rather than adding or subtracting a value) are an important component of any designer’s basic toolkit. Such techniques are useful for modeling growth rates, combat results, and a host of resource management issues. But they can raise a hornet’s nest of problems.

Suppose, for example, that you have a value that you need to increase by 1%. There are two ways to do this: you can divide it by 100 and add the result to the original value; or you can multiply by 101 and then divide by 100. (Remember, this is integer arithmetic, so you cannot simply multiply by 1.01.)

The first approach, dividing by 100, could result in underflow errors if the original value is small. For example, if the original value is less than 100, dividing by 100 will result in a value of zero. Increasing by 1% will have no effect. This can be catastrophic if you are using an algorithm that slowly increases the value of a number geometrically. Even if your value is greater than 100, the calculation will still suffer from significant round-off errors if the value is less than, say, 1000.

The obvious solution is to first multiply by 101 and then divide by 100. That way, the small number is scaled up and then scaled down. Unfortunately, this can yield overflow problems if you are using 16-bit integers and your original value is greater than 324 (since 325 × 101 = 32,825, which exceeds 32,767). So here is our quandary: to increase a value by 1%, that value must fall between 100 and 324. Even though it’s a 16-bit integer with a normal dynamic range from 0 to 32,767, we have to keep it in the range 100 to 324 if we want to be able to increase it by only 1%. That’s a severe restriction!
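A C sketch makes both failure modes concrete (the casts mimic 16-bit arithmetic on a larger machine; the wrapped cast back to a signed 16-bit value is technically implementation-defined in C, but behaves modularly on common compilers):

```c
#include <stdint.h>

/* Divide first: the 1% increase underflows to "no change" for small values. */
int16_t inc_divide_first(int16_t v)
{
    return (int16_t)(v + v / 100);
}

/* Multiply first: the intermediate product wraps once v * 101 exceeds 32,767. */
int16_t inc_multiply_first(int16_t v)
{
    int16_t scaled = (int16_t)(uint16_t)((uint16_t)v * 101u); /* wraps */
    return (int16_t)(scaled / 100);
}
```

Dividing first leaves a value of 50 unchanged; multiplying first turns 325 into a negative number.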

Fortunately, there is a solution: doubling up the word size. If you need to change a value by a small percentage, you first double its word size (from 16 bits to 32 bits), then multiply by the numerator, then divide by the denominator. Then you truncate back down to 16 bits.
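In C, the doubling-up is a one-line cast before the multiply (a sketch; the function and parameter names are my own):

```c
#include <stdint.h>

/* Widen to 32 bits, scale up, scale down, then truncate back to 16 bits.
   A 16-bit value times a 16-bit factor can never overflow 32 bits. */
int16_t scale16(int16_t v, int16_t num, int16_t den)
{
    int32_t wide = (int32_t)v * num;
    return (int16_t)(wide / den);
}
```

Now a 1% increase works across the whole 16-bit range: 20,000 becomes 20,200, and 325 becomes 328 instead of wrapping.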

Another solution is to restrict the original value to 8 bits, multiply first to scale up, then divide down. This is one good reason to use 8-bit values. In many cases, the value itself doesn’t need much dynamic range, but it can be used in calculations with lots of dynamic range. If you start small, your word size ceiling looks a lot higher.
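A sketch of the 8-bit variant (the arithmetic is the same; the point is that 255 × 101 = 25,755 fits comfortably inside 16 bits, so no widening is needed):

```c
#include <stdint.h>

/* Scale an 8-bit value in ordinary 16-bit arithmetic.  Because the
   value starts at 255 or less, multiplying by factors up to 128
   (255 * 128 = 32,640) stays inside a signed 16-bit intermediate. */
int16_t scale8(uint8_t v, int16_t num, int16_t den)
{
    return (int16_t)((int16_t)v * num / den);
}
```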

What if your original value is already a 32-bit value? You’re not quite out of luck. There’s one last option: floating-point representations. Most compilers readily support floating-point arithmetic. Many game designers, especially those of us from the bad old days of 8-bit CPUs without hardware multiply/divide, blanch at the thought of floating-point computations in a game. Not only are floating-point values RAM-guzzlers (32 bits is common), but they also guzzle CPU cycles. The last thing you need in a game, any game, no matter how relaxed in pace, is to keep the user tapping his fingers while you’re crunching floating-point numbers.

The good news is, the situation is not that bad. The modern 16-bit CPUs with hardware multiply/divide can crunch through a goodly number of floating-point operations without bringing your game to its knees. The trick is to avoid their use in fast real-time calculations. For example, flight simulators are particularly vulnerable to arithmetic problems. The dynamic range of the many variables makes integer arithmetic inappropriate, yet the time costs of floating-point calculations make them prohibitive. This is what separates the flight-simulation men from the boys: the cleverness to create adequate flight models that work with integer arithmetic.

Under what circumstances should you use floating-point values instead of integers? There are two situations that demand floating-point numbers. The first of these is an extension of the fractional adjustment problem. If you need to perform a fractional adjustment, and the value of the fraction is determined at runtime, you cannot be certain that the multiply-up/divide-down technique won’t overflow. Floating-point numbers are just about the only way to perform this calculation without fear of error.
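A minimal sketch of the runtime-fraction case (the half-adjust rounding is my own addition, and it assumes a non-negative value):

```c
#include <stdint.h>

/* When the fraction is only known at runtime, floating point sidesteps
   the overflow question entirely, at a cost in speed and memory.
   Adding 0.5 before truncation rounds to the nearest integer
   (for non-negative values). */
int16_t scale_runtime(int16_t v, float fraction)
{
    return (int16_t)((float)v * fraction + 0.5f);
}
```

The same routine handles a 1% increase, a halving, or any fraction the game dreams up during play, with no worry about which order to multiply and divide in.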

The other situation calling for floating-point numbers arises when you have a large set of coupled equations. It is usually possible, when using just a few equations, to merely scale your integer values to cover the dynamic range of the problem. But hand-scaling requires you to anticipate all possible values that might arise during play. While this is easy with 2 or 3 coupled equations, it gets tricky with 10 and impossible with 30.