I have often railed on these pages about the obsession that designers in this industry seem to have with spatial reasoning. Now, don’t get me wrong: I think that spatial reasoning is just fine. I enjoy spatial reasoning games and I have designed a few myself. What bothers me is the narrow-minded emphasis on spatial reasoning to the exclusion of anything else. There’s a much bigger world of human mental life out there than can ever be addressed by spatial thinking alone. So long as we think exclusively in terms of spatial reasoning, we shall never expand our designs into this larger world.
Related essay: Spatial vs Verbal Reasoning
The data structure is one way that we organize our program’s view of the world. Now, there are all kinds of data structures, and one of the great discoveries of computer science is that the choice of data structure has profound implications for the computational reach of our work. A well-designed data structure makes it easy to do fabulous things; a lousy data structure makes it impossible.
The best example of this is the huge variety of data structures that we have evolved for handling spatial problems in our programs. Think of how many ways we can store spatial information in a computer. One way or another, almost every computer game has some sort of map buried deep in its computational bowels. Many games have multiple data structures for managing the different aspects of the spatial mapping. I think it safe to say that we have developed a large and sophisticated set of data structures for manipulating spatial problems in our games.
Contrast this with the situation for verbal reasoning. What kind of data structures do we have for verbal reasoning? Not bloody much, is there? In this essay, I propose to discuss possible data structures for games that rely heavily on verbal reasoning.
I begin with a warning: I am NOT going to use any of the academic research on natural language parsing. The primary reason for this is that I am ignorant of that research. The reason I am ignorant is that every time I study the material, I discover that these people are trying to solve problems that we seldom encounter in entertainment software design, so I abandon the effort. Perhaps someday... but in the meantime, I’ve got design problems to solve, and so I’d rather invent my own clumsy wheels hewn from oak trunks than contemplate their plans for high-tension spoked magnesium wheels that we’ll build just as soon as we figure out where to get some magnesium.
The basic data structure for any game built around verbal reasoning will be the sentence. If we’re going to have games with verbal reasoning, we’re going to need sentence structures. So, what should the data structure for a sentence look like?
First, there are three obvious components that any sentence structure will include. They are a subject, a verb, and a direct object.
Now, how exactly do we store a subject as a piece of data inside a computer? My answer is brutally simple, and it makes matters wonderfully easy: all subjects must be characters in the game. This bold simplification means that we need merely store the character identification number in the subject field of our sentence record. That sure makes subjects easy, doesn’t it?
Of course, a "bold simplification" inevitably generates ugly problems of its own. The most obvious of these is that it rules out a great many sentence types. Gone are all statements about anything not a character. You can’t say, "The book is on the table." You can’t say, "The train leaves at noon." There’s a lot of linguistic territory that has been denied existence by my bold simplification.
Focus versus agglutination
At this point I want to digress to a larger issue about good thinking and design. I believe that intellectual subtraction can be just as productive as intellectual addition. In this I often find myself at odds with others, who seem to think that open-mindedness requires the largest possible array of ideas. To them, rejecting any idea is "negative thinking", something to be avoided. They feel that the best design includes the largest number of elements. Faced with two competing features, each of which has redeeming virtues, don’t choose between them; include both.
I reject this on fundamental grounds. The whole thrust of human development is not one of addition, but of subtraction. We begin life with an undifferentiated nervous system, one that responds to all stimuli with a global response. If somebody touches us, we trigger every muscle in our body and wriggle randomly. As time goes on, we start to differentiate our responses. We learn to wave our arms in unison, to kick our feet simultaneously. The random gurgles, honks, and snorts that emanate from our vocal tract quickly focus into powerful screams that demand satisfaction. In childhood, we learn to focus our muscle actions with greater precision. Our gait becomes smoother, our gestures more practiced, our facial expressions more nuanced. As we mature, our emotional life shows increasing focus. Where as children we could befriend and enjoy any other child, as young adults we find that our tastes have become narrower, that we expect more of a friend than simple humanity. This process continues all through life. We refine ourselves, distill our experiences, tighten our behaviors, learn who we are. This is fundamentally a process of differentiation, of narrowing, of focus. We grow into who we are not by adding new traits but by defining our already-existing personalities more sharply. We don’t construct ourselves, we find ourselves.
Related essay: Enconium Negativism
This concept of focus through negativity is very close to the concept of meaning. Consider Claude Shannon’s ideas about information content. Suppose that I am sending you messages composed of basic pieces, such as letters of the alphabet. Suppose further that my messages are almost always composed of a’s and b’s, and almost never contain z’s. Then the information content of the a’s and b’s is very low; after all, if you were to guess what the next letter in the stream might be, and you guessed a or b, you’d probably be right. But the information content of a z would be very high; when one of those appeared, it would really mean something significant. Thus, the negative side of information is just as important as the positive side. Light isn’t light without shadow; if you enter a room with all light and no shadows, you’re just as blind as if you enter a totally dark room.
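To put rough numbers on this: in Shannon’s measure, a symbol that occurs with probability p carries -log2(p) bits of information. Here is a minimal sketch of that calculation in Python. The letter probabilities are invented purely for illustration, and the function name is my own.

```python
import math


def surprisal_bits(p):
    """Information content, in bits, of a symbol that occurs with probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("probability must be in the range (0, 1]")
    return -math.log2(p)


# Hypothetical letter frequencies (assumed for illustration, not measured).
probabilities = {"a": 0.5, "b": 0.4375, "z": 0.0625}

for letter, p in probabilities.items():
    print(f"{letter}: {surprisal_bits(p):.2f} bits")
```

With these assumed frequencies, the common letters carry about one bit each while the rare z carries four.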
Thus, exclusion is just as valuable and important a component of design as inclusion. What you leave out can be just as important as what you put in, and in some ways more so.
Such is the case with my bold simplification. Yes, it excludes a tremendous amount of material. But look how it narrows the field, how it focuses attention on the behavior of the characters. Isn’t that what storytelling is all about? By confining subjects to members of the cast of characters, I have lost a great deal of linguistic territory that isn’t terribly important to storytelling, and I have retained the most important factor: the behavior of those characters. This is an example of how negative thinking can help design.
OK, now let’s turn to the verb. This is the most important element in the sentence. We want lots of verbs so that our characters can do lots of interesting things to each other. It is tempting to try to set up some sort of organized scheme for verbs, perhaps a hierarchy that groups all the verbs in an organized way. I once spent a day studying Roget’s Thesaurus as a possible way to organize a basic set of verbs. I gave up.
I can’t see any clean simplifications we can use here. Perhaps somebody else will come up with something someday. For now, we simply have to create a dictionary of verbs and use their identification numbers in the verb field. Fortunately, this is not a technical problem, at least not in terms of data structures. A 16-bit verb field permits 65,536 verbs. That should be plenty. The larger problem is setting up the internal dictionary of verbs and defining the properties of each verb. There’s no way around this problem: it’s a lot of work.
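A verb dictionary of this kind might be sketched as follows, in Python. This is only an illustration under assumptions: the class names, the sample verbs, and the single property `takes_indirect_object` are all hypothetical, and a real dictionary would carry far more properties per verb.

```python
from dataclasses import dataclass

MAX_VERBS = 1 << 16  # a 16-bit verb field can hold IDs 0..65535


@dataclass(frozen=True)
class Verb:
    verb_id: int
    name: str
    # Hypothetical per-verb property; a real dictionary would define many more.
    takes_indirect_object: bool = False


class VerbDictionary:
    """Assigns each verb a numeric ID that fits in the 16-bit verb field."""

    def __init__(self):
        self._verbs = []        # list index doubles as the verb ID
        self._ids_by_name = {}

    def define(self, name, takes_indirect_object=False):
        if name in self._ids_by_name:
            raise ValueError(f"verb already defined: {name}")
        if len(self._verbs) >= MAX_VERBS:
            raise OverflowError("no room left in a 16-bit verb field")
        verb = Verb(len(self._verbs), name, takes_indirect_object)
        self._verbs.append(verb)
        self._ids_by_name[name] = verb.verb_id
        return verb

    def lookup(self, verb_id):
        return self._verbs[verb_id]

    def id_of(self, name):
        return self._ids_by_name[name]


verbs = VerbDictionary()
verbs.define("greet")
verbs.define("give", takes_indirect_object=True)
print(verbs.lookup(verbs.id_of("give")))
```

The code is the easy part. Deciding which properties each verb needs is where the work lies.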
With the direct object, I again invoke my bold simplification that only characters may be used as direct objects. What this means is that the only activities allowed in my system are actions that one character can take with respect to another character. But consider just how wide a range of behavior this is! Don’t worry; the number of ways that people can act on each other is large enough to keep the entertainment software designer quite busy. Besides, as I shall show later, we can extend some of these verbs with additional fields.
These are the three primary fields in any sentence data structure: subject, verb, and direct object. There are also a number of other fields that might seem beneficial to add. I shall take each up in turn.
The indirect object is an obvious addition, although one that might often be skipped. After all, not that many sentences require an indirect object. Still, it doesn’t hurt to include the field in the data structure, and it can be quite useful when we need it.
Adverbs are also obvious extensions to any sentence structure, and they certainly expand the range of possibilities. In the simplest case, we could have a set of adverbial intensifiers that extend the range of some of the verbs. For example, we could replace the verbs "walk" and "run" with a single verb "locomote" and add an adverbial intensifier. Then "locomote low" means "walk slowly", "locomote moderate" means "walk fast", and "locomote high" means "run". In the same way, "kiss low" means a peck on the cheek, "kiss moderate" means a normal affectionate kiss, and "kiss high" means a long, slobbering exercise in lip gymnastics.
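The three-field record described so far could look something like this minimal Python sketch. The cast, the verb list, and every name in the code are hypothetical, chosen only to show subjects and direct objects stored as character identification numbers and verbs stored as verb identification numbers.

```python
from dataclasses import dataclass

# Hypothetical cast and verb list, keyed by identification number.
CHARACTERS = {0: "Fred", 1: "John", 2: "Mary"}
VERBS = {0: "greet", 1: "insult", 2: "kiss"}


@dataclass(frozen=True)
class Sentence:
    subject: int        # character ID: only characters may be subjects
    verb: int           # ID into the verb dictionary
    direct_object: int  # character ID: only characters may be direct objects

    def __post_init__(self):
        # Enforce the simplification: every field must name a known entry.
        if self.subject not in CHARACTERS:
            raise ValueError(f"unknown subject ID: {self.subject}")
        if self.verb not in VERBS:
            raise ValueError(f"unknown verb ID: {self.verb}")
        if self.direct_object not in CHARACTERS:
            raise ValueError(f"unknown direct object ID: {self.direct_object}")


def gloss(sentence):
    """Crude debugging gloss of a sentence; not real text generation."""
    return (f"{CHARACTERS[sentence.subject]} {VERBS[sentence.verb]} "
            f"{CHARACTERS[sentence.direct_object]}")


print(gloss(Sentence(subject=0, verb=1, direct_object=1)))
```

Because every field is just an ID, a sentence such as "The book is on the table" cannot even be constructed, which is exactly the restriction intended.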
More adverbial options could give finer shades of meaning. Sounds like a great idea, right?
In practice, I have found that adverbs are a bad idea. The problem doesn’t arise from the internal manipulations of the sentence structure; it comes from the I/O requirements. How would one go about showing or explaining the idea of "locomote high"? Perhaps we could simply show a person walking at three different speeds. OK, fine, but what happens when we show "kiss high" as a peck on the cheek at triple speed? Even if you confine yourself to plain old text, you still can’t get around the problem. "He locomoted slowly" works, but "He kissed her slowly" is misleading. "He kissed her intensely" might make sense, but "He locomoted intensely" doesn’t. No matter what I/O system you use for your storytelling, adverbial modifiers just aren’t going to work.
The solution is to fold the adverbs into the verbs and proliferate a larger set of verbs. If you want to have three different kinds of kiss, have three different kiss verbs. If you want five different speeds of walking/running, then have five different walking/running verbs. Granted, this can become cumbersome. For example, consider the verb "declare affection". For a variety of reasons, a designer would not want to have an arbitrary set of such verbs (e.g. "declare love", "declare hatred", "declare indifference", "declare friendship as opposed to intense love") lying about. It’s much cleaner to have a contiguous set of verbs for declaring affection, with some sort of intensity gradation. I use a five-level system; that seems to provide enough emotional resolution.
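One way to keep such a graded family contiguous is to allocate its five verbs as consecutive identification numbers, so that the intensity level is simply the offset from the first verb in the family. The Python sketch below assumes that layout. The class, the method names, and the five level names are hypothetical.

```python
INTENSITY_LEVELS = 5  # the five-level gradation described above


class VerbTable:
    """Verb IDs are list indices; each family occupies five consecutive IDs."""

    def __init__(self):
        self.names = []     # index = verb ID
        self.families = {}  # family name -> ID of its lowest-intensity verb

    def define_family(self, family, level_names):
        if len(level_names) != INTENSITY_LEVELS:
            raise ValueError("a family needs exactly five graded verbs")
        base = len(self.names)
        self.names.extend(level_names)
        self.families[family] = base
        return base

    def verb_for(self, family, level):
        """Verb ID for the given family at intensity level 0 through 4."""
        if not 0 <= level < INTENSITY_LEVELS:
            raise ValueError("level must be between 0 and 4")
        return self.families[family] + level

    def level_of(self, family, verb_id):
        """Recover the intensity level from a verb ID in the family."""
        offset = verb_id - self.families[family]
        if not 0 <= offset < INTENSITY_LEVELS:
            raise ValueError("verb does not belong to this family")
        return offset


table = VerbTable()
# Hypothetical gradation, from most negative to most positive.
table.define_family("declare affection", [
    "declare hatred",
    "declare dislike",
    "declare indifference",
    "declare friendship",
    "declare love",
])
strongest = table.verb_for("declare affection", 4)
print(table.names[strongest])
```

Each kind of declaration is still its own verb, so nothing needs an adverb at output time, yet the program can compare intensities with plain arithmetic.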
The same problem applies with the use of adjectives. Moreover, adjectives have an even greater problem: they modify nouns, not verbs, and as such are useful only in statements of fact. Who cares if the box is blue, or the car is shiny, or the stick is heavy? We’re talking drama here, not physics.
Nonetheless, there is one place where adjectives are unavoidable; this caused me endless problems. More on this problem later.
Next we get into prepositional phrases. Again, I rejected such elements; they are functionally identical to either adverbs or adjectives, so there is no need to repeat the arguments against those.
After this comes a whole train of advanced linguistic concepts: conjunctions, clauses, subjunctivity, and so on. These are all easily rejected on the grounds of complexity. Yes, it would be nice to include them, and if you really want them, YOU figure out the design.
References to objects
My sentence structure so far has no provision whatever for dealing with objects. This may strike some as strange; after all, games always have objects in them; this thing has just people! Ugh!
I must concede the need for occasional references to objects. But I insist that all objects must be operationally significant in terms of the interpersonal interaction. That is, I am not countenancing a design in which the player accumulates vast armories, treasure chests bulging with gems, and other paraphernalia. Any object in an interactive story must exist for the sole purpose of generating more interesting interactions between characters. The way to integrate physical objects into the story is to make them the objects of individual greed. This allows people to engage in jealousy, trade, extortion, theft, and all the other activities that make life so interesting.
If we are to include objects in the sentence structure, though, we’ll need a place to put them. We can of course make use of the indirect object slot. I have found that, for a variety of computational reasons, it’s a little cleaner to add a new field that I call the Third Object. This is the slot into which we put objects.
You might wonder, "Wot the hay-ell is this Third Object stuff?" It really is easy to understand if you think about the function of the indirect object. Consider this sentence: "Fred gave the cup to John". Compare it with this sentence: "Fred gave John the cup". In the second sentence we have converted a prepositional phrase to an indirect object. The role of the indirect object is implicit in the verb "give". In the same fashion, a third object is really a condensation of a more complex structure, whose meaning is made implicit by the verb.
This is one place where adjectives enter the mix. Sometimes a numerical modifier must be included in the sentence. It’s really hard to kluge your way past numbers without getting into adjectives. Here’s one example: in Le Morte D’Arthur, after a battle, the computer needs to report casualties to Arthur. I can’t fall back on my trick of using five levels of casualties and using words (tiny, low, moderate, heavy, catastrophic) to describe these. I really need to show the number. The solution is to create a verb that takes a numerical value in its Third Object field.
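To make the idea concrete, here is a minimal Python sketch of a sentence record with a Third Object field whose interpretation depends on the verb. It is an illustration under assumptions, not the actual code from the game: the verb definitions, the output template, the casualty figure, and the "Messenger" character are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ThirdObjectKind(Enum):
    NONE = 0             # verb takes no Third Object
    PHYSICAL_OBJECT = 1  # Third Object holds an object ID
    NUMBER = 2           # Third Object holds a plain numerical value


@dataclass(frozen=True)
class Verb:
    name: str
    template: str                  # hypothetical output template
    third_object: ThirdObjectKind  # how this verb interprets the field


@dataclass(frozen=True)
class Sentence:
    subject: int                        # character ID
    verb: Verb
    direct_object: int                  # character ID
    third_object: Optional[int] = None  # meaning is implied by the verb

    def __post_init__(self):
        expected = self.verb.third_object is not ThirdObjectKind.NONE
        if expected and self.third_object is None:
            raise ValueError(f"'{self.verb.name}' requires a Third Object")
        if not expected and self.third_object is not None:
            raise ValueError(f"'{self.verb.name}' takes no Third Object")


# Hypothetical cast and verb, for illustration only.
CHARACTERS = {0: "Arthur", 1: "Messenger"}
REPORT_CASUALTIES = Verb(
    name="report casualties",
    template="{subject} reports {third} casualties to {object}.",
    third_object=ThirdObjectKind.NUMBER,
)


def describe(sentence):
    return sentence.verb.template.format(
        subject=CHARACTERS[sentence.subject],
        object=CHARACTERS[sentence.direct_object],
        third=sentence.third_object,
    )


print(describe(Sentence(1, REPORT_CASUALTIES, 0, third_object=212)))
```

The number rides along in the same slot that would otherwise hold an object, and only the verb knows how to read it.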
There are other fields that we will want to include in the sentence structure. The time and place of the occurrence are useful items to include. After that come a variety of fields that I use for computational purposes. They facilitate many of the computations necessary to keep the whole system running smoothly. I won’t describe them here, as they are only meaningful in the context of the program at large.
Other linguistic structures
We should not be too ethnocentric (lingocentric?) in our formulation of sentence structure. There is nothing intrinsically superior about the (subject, verb, object) sentence structure so common to Indo-European languages. There are plenty of other ways to construct a language. For example, the Semitic languages construct nouns out of sequences of consonants, and then include different vowels between the consonants to indicate the role that the noun plays in the sentence. Japanese is more casual about subjects than the Indo-European languages; only about 25% of Japanese sentences include what an Indo-European grammarian would call a subject. Instead, Japanese sentences are structured around a topic. One linguist described a typical Japanese sentence as a topic with clothing wrapped around it.
The Hopi language uses a very different approach to sentence structure. Instead of conjugating verbs by tense and mood, Hopi conjugates them by the intent of the speaker or the action. For example, the Hopi word "wari" means "running" as a statement of fact, regardless of whether that fact lies in the present or the past. But "era wari" refers to running as a statement of fact drawn from memory. "Warikni" refers to running as a statement of expectation, and so might be confused with a future tense, but the emphasis of the shift is on expectation, not temporality. Lastly, "warikngwe" refers to running as a statement of a law or principle, as a consequence of some cause.
The important idea here is not that Hopi is any better or worse than English, but that it demonstrates how easily we can break our linguistic structures up in different ways. There is no need for us to confine ourselves to the structures of our own language. Remember, we are designing a system for the internal use of the computer, and so we don’t care how understandable it is to our users. After we have carried out our internal computations, we can always translate the results into a form palatable to the user.
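The separation between internal form and presentation can be sketched in a few lines of Python. In this hypothetical example the internal sentence is nothing but a tuple of identification numbers, and English appears only in a final translation step driven by per-verb templates. The cast, the templates, and the function name are all assumptions made for illustration.

```python
# Hypothetical cast and per-verb output templates.
CHARACTERS = {0: "Fred", 1: "John"}
TEMPLATES = {
    0: "{subject} greets {object}.",
    1: "{subject} insults {object}.",
}


def render(sentence):
    """Translate an internal (subject, verb, object) ID triple into English."""
    subject_id, verb_id, object_id = sentence
    return TEMPLATES[verb_id].format(
        subject=CHARACTERS[subject_id],
        object=CHARACTERS[object_id],
    )


internal = (0, 1, 1)  # meaningless to a user, convenient for the program
print(render(internal))
```

Swapping in a different set of templates would change what the user reads without touching the internal representation at all.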
If we are to break loose from the spatial reasoning that now dominates our designs, we will need data structures that support verbal reasoning. The most intuitive starting point for English-speakers is a data structure for sentences. Someday, I am sure, the data structures we use for verbal reasoning will be complex and shifting, with a single program using a variety of different data structures for different tasks. But for now, our job is to figure out the basics.