How to Build An Inverse Parser

I developed the concept of the inverse parser as part of the Siboot project. The basic concept is not wholly unique; Texas Instruments developed a somewhat similar concept a few years earlier. I have seen a few other interface systems that utilize some of the concepts of the inverse parser.

An inverse parser is an interface system meant to overcome the deficiencies of the two most common interface systems by combining their strengths. To understand its significance, I must first explain the flaws in both parsers and menuing systems.

Parsers
We all know how a conventional parser works. A string of characters constitutes a command; this command is passed to the parser, which breaks the character string down into recognizable fragments, assigns meaning to each fragment, and executes the resultant command.

Parsers are relatively easy to program so long as the grammar they implement is not overly complex. A parser can implement a large range of commands with little resource consumption. And a parser is very convenient for those who have become fluent in its idiom. However, parsers impose serious problems in terms of the user interface. Because every parser uses some tiny subset of English grammar, and a greatly restricted dictionary, the beginning user is forced to guess the words and grammar of the parser. This is the source of the classic phrase "parser puzzles" by which text adventures were so derisively described. It is also the basis for the apparent contradictions over the merits of GUIs. Those people who have become fluent in the vocabulary and syntax of MS-DOS complain that the GUIs are slow and clumsy to use, while MS-DOS is fast and clean. This is true for those who already know the command set. For those not so experienced, a parser-driven system like MS-DOS can be a pain in the butt.

Menus
Menuing systems attack the parser puzzle problem by presenting the user with a list of all the options available. This does indeed solve the problem, but it imposes some clumsiness on the user interface. Large amounts of screen real estate must be devoted to presenting the user with options, most of which will not be used. When the number of direct options is so large that they cannot all fit in one screen, the menu system is forced to scroll or page the menu, adding even more clumsiness to the interface.

This problem is often tackled with the use of nested menus, in which one menu item calls up a new menu. While functional, this does increase the clumsiness of the system, and it introduces a new class of problem: menu navigation. It’s very easy for a player to get lost in the branches of a deeply nested menu system.

The overall conclusion is that menu systems work well with small interface problems, but they break down when attempting to address complicated situations with many menu items. My hunch is that most menuing systems start to gum up when the total number of commands in the system exceeds about 50; they collapse completely when the total number of commands exceeds 100.

The Inverse Parser
Think of an inverse parser as a parser system that works backwards, applying parser logic as the user is entering his choices, rather than afterwards. Alternatively, think of an inverse parser as a menuing system with two differences: first, the user’s choices build a sentence or command string on the screen where the user can see it. Second, the options available at any given time are intelligently pruned by the parsing logic.

An inverse parser treats each command as a sequence of words obeying a grammar -- in other words, a sentence. The sentence is composed by the user by adding words in sequence until the sentence is complete. This has an important benefit: it uses a metaphor (the sentence) that the player already grasps, thereby collapsing the menu navigation problem into a construct that the player is comfortable with.

Two display regions are required by an inverse parser. The first of these displays the sentence under construction. The second presents a list of all available words appropriate to the circumstances.

The first display is simple enough: you simply write the text that the user has created so far. It requires one textline of screen space.

The second display is more complicated. In the first place, it is interactive. The user will want to click on the word he chooses, so you must maintain a boundary rectangle for each word listed in the second display. The real problem, however, is selecting the words from the dictionary that are appropriate to the situation.

Fortunately, there’s a simple trick that makes inverse parsers easier to program and faster to execute. It’s a two-dimensional array of boolean flags that determine what words can follow any given word. If we call this array FOLLOWS then the value of FOLLOWS[i,j] tells us if word #j can meaningfully follow word #i. When you create your dictionary, you must define the legality of all word pairs. Thus, FOLLOWS[eat, apple] would be true, while FOLLOWS[eat, truck] would be false. This array consumes little space: even with a dictionary of 512 words, a packed array of boolean flags would consume only 32K of RAM.

Now, some of you text-adventure people might snort derisively at a dictionary with only 512 words, but let me point out that these are fully functional words, words that can be readily used throughout the game. Text adventures sometimes have dictionaries with several thousand words, but most of these words are synonyms or special-case words used only in particular rooms. The actual workhorse words number only a few score.

To decide, then, what words to place into the word-selection pane, you simply sweep through the dictionary, evaluating a logical expression for every word. This logical expression is a boolean AND of FOLLOWS and all of the context calculations that belong to your parser. The beauty of this system lies in the use of short-circuit boolean operators. The short-circuit AND will prune away most of the available words at a cost of only a few machine cycles per word. The words that remain can then be evaluated with lots of complicated logical factors.

An example might help. Suppose the player has already built the sentence fragment, "I eat". You must now decide what words to present for his consideration. The FOLLOWS array will eliminate all words in the dictionary but "apple", "candy", "meat", and "bread". For each of these words, you will apply some logical tests: does the player have the object in his possession? If so, has it been spoiled by rot, dirt, or poison? If the word passes all these tests, then it is placed in the word-selection pane.

Code Example
Here is a code fragment in Pascal from the Siboot program that illustrates the method used to select words from the dictionary. It is placed inside a loop whose control value is i; the loop sweeps through every word in the dictionary. "&" means "short-circuit AND"; "|" means "short-circuit OR".

IF ((Follows1[i,CurWord] & FirstSeq)
| (Follows2[i,CurWord] & NOT FirstSeq))
& (NOT SubjBFlg | (NOT Follows2[WThat,i]))
& (i<>WhSubject) {SubjectA­ObjectA1}
& (NOT SubjAFlg | NotAlone | NightVerb | NOT Follows1[1,i])
& (NOT VerbAFlg | NOT TransPVerb | SameLocn | Nighttime)
& ((SntcIndx>ObjectA1) | (i<>WWho))
& ((i<>WWho) | (MVerbA=WAsk))
& (NOT SubjAFlg | (NOT StateVerb))
& (NOT VerbAFlg | (MVerbA<>WGo)
| ((i-56)<>Abs(ThisPlace)))
& ((i<>WGo) | DayTime)
& ((i<=NumChars) | (i>ConstChars))
& (((i-56)<=NumChars) | ((i-56)>ConstChars))
& (NOT DealVerb | DealAble)
& (DealVerb | NOT SubjBFlg | StateVerb)
& (NOT SubjAFlg | (NOT Nighttime & NOT NightVerb)
| (Nighttime & NightVerb & GotAura))
& ((MVerbA<>WTell) | HistOK)
& ((MVerbA<>WAsk) | NOT ObjA2Flg | (i=WWho))
& ((MVerbA<>WAsk) | NOT VerbBFlg | (i=Player))
& (NOT SubjAFlg | NOT (i=WReveal) | ICanOffr)
& (NOT SubjAFlg | NOT AuraWord)
& (NOT RespVerb | RespAble)
& Accusable
& ((i<>WAgree) | AgreeFlag)
& ((i<>WReject) | AgreeFlag)
& ((i<>WWait) | NOT NotAlone)
& ((i<>WGo) | (NOT NotAlone))
& ((i<>WAnswer) | AnswerAble)
& (NOT AnswerFlag | AnswerOK)
& (HelloWrd | NOT NotAlone)
& GoodByeAble
& FeelFlag
& ((i<>WNBetray) | NBetrayAble)
& ((i<>WNAttack) | NAttackAble)
THEN UseIcon(i);

Note that this is one single statement! Despite its gargantuan size, it executes very fast; most instances fail on the very first evaluation and the short-circuit operator drops the program flow to the end of the statement. Note also that I use two dictionary matrices, Follows1 and Follows2. These permit me to use indirect objects as well as direct objects. For example, if I want to build the sentence "I give Joe the ball", the indirect object "ball" follows the verb "give", not "Joe". It is a permissible consequent to the verb, but not to "Joe".

Not shown here are a number of other calculations that determine intermediate boolean values such as Nighttime, DayTime, DealAble, and NotAlone. The reader can imagine that they are simple evaluations of various state variables.

Disclaimer: Icons Versus Words
Siboot used an artificial language that was represented with icons rather than text. This has led to some confusion on the part of those evaluating the inverse parser. Some people look at Siboot and see an icon-based interface. I think that the use of icons was a secondary consideration; the primary user interface innovation lay in the inverse parser, not the use of icons. There is no reason why an inverse parser must use icons. It can use either plain text or icons.

This brings us to the old text-versus-graphics hoo-hah. Most people now believe that text is dead as a medium of expression in computer games. I suggest that the death of text has been greatly exaggerated. After all, even games like
Wing Commander and Monkey Island used plenty of text. It is ironic, though, that most of the interaction in both games was spatial, and the designers couldn’t come up with an effective way to interact with the game through the use of text. That is, the text was used in a low-interactive fashion.

This is understandable. After all, the state of the art in text interaction is stinko. We have highly developed the state of the art in spatial interaction; we now have fully interactive 3D graphics technology. But we have ignored text interaction and then use the poor state of text interaction to dismiss the use of text. Talk about circular reasoning!

If we want to reach a larger portion of the population, we need to offer a more varied set of challenges than just spatial reasoning. The use of inverse parsers makes possible somewhat more variety.

Conclusions
Inverse parsers are a useful alternative to direct parsers and menuing systems. I don’t consider them to be the next great wave of interface design: they’re just not that great. Nevertheless, they offer some improvements over the older systems. My experience with Siboot was that the inverse parser used there was so good that it attracted little comment. That is, after all, the acid test of a truly clean user interface: nobody notices it.