Faces

The primary output device for computer games is the monitor screen, with a few hundred thousand pixels of resolution and a handful of colors. We also have some capacity to deliver sounds, but this communication channel has many technical constraints and is therefore limited to a support role rather than a primary role.

Because our primary output device is visual, most computer games rely heavily on spatial reasoning as their primary challenge. The vast majority of computer games include a verb for moving from place to place, and in many games this is the primary verb. There are some games in which motion is the only verb available to the player. Spatial reasoning has become established as the overwhelming component of computer games.

There is an alternative to spatial reasoning: verbal reasoning, the manipulation of symbols to extract meaning. This is most often done with text, but here we run afoul of the gamer’s insistence on sensational graphics. Most players reject pure text games as inferior to graphically intensive games, and the industry has accordingly abandoned them.

There are other alternatives to spatial reasoning in games, and in this article I would like to present one such alternative: use of the human face to communicate information.

Faces are important
I suggest the face because human faces occupy an exalted place in the human visual universe. A portion of the human brain appears to be hard-wired with algorithms for recognizing facial expressions, and many of the basic human facial expressions are universally recognized across all cultures. Faces are important to people.

A measure of their importance to us comes from their use in other visual media such as the cinema. The human face is the most important element of the cinema. More movie frames are expended on human faces than on any other visual element.

You don’t believe me? OK, let’s take an example: the all-time graphics and special-effects, action-packed shoot-em-up extravaganza, Star Wars. And let’s really handicap it: we’ll use the heart-pounding final space battle in which our hero Luke Skywalker destroys the Death Star. Surely one of the great special-effects graphics spectaculars of movie history, right? So here’s the experiment: get a videotape of the movie and start two stopwatches when the battle starts. Leave one running until the Death Star blows up. Use the other to time the shots that focus on human faces. Start the face-stopwatch whenever the camera focuses primarily on a person’s face; stop it whenever the camera returns to an exterior shot. It’s fairly simple, because most shots are either cockpit facial shots or exterior dogfight shots. The only difficulty you will have will come from the rapid cuts; it’s hard to keep up with some of them.

If you do this experiment, you will find that the overall battle sequence consumes just about 13 minutes. The facial shots consume about 6.5 minutes of this sequence. In other words, in this, the most graphics- and action-packed sequence in the most action-packed movie of modern memory, the human face still hogged half the air time. In the quieter sections of the movie, the ratio is even higher!

The comparison with computer games is embarrassing. Most of the games on the market do not have a single human face in them. Some have a few incidental faces showing up at rare intervals. A few have faces showing up often, but even these are fixed faces, one-time bitmaps that do not change or show expression. I can think of only a handful of games that show faces that change and show expression.

Requirements
What do we need to do to show faces? First, we need to allocate screen space to do the job. I have worked with faces for some years now, and I have developed some rules of thumb for how much screen space you need for various tasks. (These assume black-and-white displays; with color, fewer pixels are necessary, but the advantage of color is not as great as the raw data consumption would suggest, because facial recognition relies mostly on the shapes of facial features rather than their coloration.)

The absolute lower limit for face display is 32x32 pixels (1K pixels). This is enough to indicate the presence of a face, but discrimination between different faces is almost impossible, as is recognition of different facial expressions.

64x64 pixels (4K pixels) allows us to display vague likenesses. The user could readily distinguish a handful of faces and recognize the basic emotional expressions, but no more.

128x128 pixels (16K pixels) makes it possible to recognize a wider array of faces, recognize famous faces, and show a small range of emotional expressions. Emotional nuance would still be lost. This is the smallest size I consider useful for game design.

256x256 pixels (64K pixels) is about the largest size we can afford, and fortunately it gives us a lot of flexibility. A wide variety of faces can be recognized at this resolution, and much of the working range of facial expressions can be expressed.

These rules of thumb assume average-quality artwork. It is true that a brilliant artist could squeeze more expression into a smaller space, perhaps making it possible to build a game around a 64x64 face display. But we cannot simply assume brilliant artistry as part of our designs.
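To make these rules of thumb concrete, here is a minimal sketch in C of how a designer might encode them as a lookup table. The structure and function names are my own hypothetical choices; the entries simply restate the figures above for black-and-white displays.

    #include <stdio.h>

    /* Rules of thumb for square, black-and-white face displays. */
    struct FaceSizeRule {
        int side;                /* pixels on a side                      */
        const char *capability;  /* what that resolution can communicate  */
    };

    static const struct FaceSizeRule face_rules[] = {
        {  32, "indicates a face; no discrimination, no expressions" },
        {  64, "vague likenesses; a handful of faces, basic emotions" },
        { 128, "wider array of faces; small range of expressions" },
        { 256, "many recognizable faces; most of the expressive range" },
    };

    /* Return the capability of the largest rule the display satisfies. */
    static const char *face_capability(int width, int height)
    {
        int side = (width < height) ? width : height;
        const char *best = "too small for a usable face";
        for (int i = 0; i < 4; i++)
            if (side >= face_rules[i].side)
                best = face_rules[i].capability;
        return best;
    }

    int main(void)
    {
        printf("%s\n", face_capability(128, 128));
        return 0;
    }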

Wider Requirements
The use of faces also requires fundamental changes in the goals of our games. Most now focus on the analysis of spatial relationships to win conflicts. The information that a face offers, however, does not support such gameplay. A face tells us about the emotional state of a character. This type of information is vital to any game that focuses on characters. If we want to use faces in our games, we will need to shift our game focus towards character interaction.

How to do it
How do we put faces onto the screen? We certainly can’t use simple bitmaps. Suppose that we are designing a game with a dozen characters. Each character could easily use 40 different facial expressions. This implies nearly 500 distinct images. If each image is 128 pixels square, then at a byte per pixel we’re talking about 8 megabytes of bitmapped data. Even if we had the data storage for such a display, the artwork alone could well cost us $25K. This is simply too expensive, both in dollars and in disk space.
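The arithmetic behind those figures is easy to check. The short C sketch below uses the same assumptions as the paragraph above (a dozen characters, 40 expressions each, 128x128 images, one byte per pixel); the variable names are merely illustrative.

    #include <stdio.h>

    int main(void)
    {
        const long characters      = 12;   /* characters in the hypothetical game */
        const long expressions     = 40;   /* facial expressions per character    */
        const long side            = 128;  /* image width and height in pixels    */
        const long bytes_per_pixel = 1;    /* one byte per pixel assumed          */

        long images = characters * expressions;               /* 480 images      */
        long bytes  = images * side * side * bytes_per_pixel; /* ~7.9 million    */

        printf("%ld images, %.1f megabytes of bitmaps\n",
               images, bytes / 1000000.0);
        return 0;
    }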

Fortunately, the human face follows a number of regular patterns in its expression of emotion. This makes it possible for us to rely on more algorithmic methods to display faces. Such algorithmic approaches dramatically reduce the costs of face display, both in terms of the artist time required to generate a face and the storage space needed to use it. I have developed a set of techniques for drawing faces; the explication of these techniques must await a later article. Until then, readers might consult my latest game Guns & Butter (misnamed The Global Dilemma by Mindscape) for an example of the first generation of my face generation technology.
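Those techniques must wait, so purely as an illustration of what an algorithmic approach might look like (and emphatically not the method used in Guns & Butter), here is a toy sketch in C: an expression becomes a handful of numbers rather than a stored bitmap, and new expressions are produced by blending parameter sets. All names and values are hypothetical.

    #include <stdio.h>

    /* A toy parameterization: an expression is a few numbers, not a bitmap.
       Values are fractions of the face height.                              */
    struct Expression {
        const char *name;
        float brow_height;    /* 0 = lowered brows, 1 = fully raised        */
        float mouth_curve;    /* -1 = deep frown, 0 = neutral, +1 = smile   */
        float eye_openness;   /* 0 = closed, 1 = wide open                  */
    };

    static const struct Expression neutral = { "neutral", 0.5f, 0.0f, 0.6f };
    static const struct Expression joy     = { "joy",     0.7f, 0.9f, 0.8f };

    /* Blend two expressions: t = 0 gives a, t = 1 gives b.  A real face
       renderer would feed the resulting numbers to its drawing routines.   */
    static struct Expression blend(struct Expression a, struct Expression b,
                                   float t)
    {
        struct Expression out = a;
        out.name          = "blend";
        out.brow_height  += t * (b.brow_height  - a.brow_height);
        out.mouth_curve  += t * (b.mouth_curve  - a.mouth_curve);
        out.eye_openness += t * (b.eye_openness - a.eye_openness);
        return out;
    }

    int main(void)
    {
        /* A half-hearted smile: halfway between neutral and joy. */
        struct Expression half_smile = blend(neutral, joy, 0.5f);
        printf("%s: brow %.2f, mouth %.2f, eyes %.2f\n", half_smile.name,
               half_smile.brow_height, half_smile.mouth_curve,
               half_smile.eye_openness);
        return 0;
    }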