Over and over again
So far you have learned a number of things to do with your computer: arithmetic, input and output, and branching. If you think about it, though, you might wonder what good it all is. After all, you can do arithmetic with a $10 calculator without learning all this high-faluting nonsense about variables and constants and deferred execution. The computer’s ability to make decisions is interesting, but you can manage without it. You might be tempted to dismiss computer programming as a lot of high-sounding pap. In other words: "Big, fat, hairy deal!"
You would be right to dismiss computers as a waste of time if all they did was calculate and branch. But we are at last going to learn something that makes computers really useful, and it is very important that you understand the fundamental concept behind this capability.
We humans have come a long ways since the good old days of caves and woolly mammoths. One of the key factors contributing to our progress has been the development of increasingly powerful tools. We use tools for almost everything we do. We even use "tool-tools" &emdash; tools that make other tools.
Just exactly what is a tool? It is a device that allows us to execute some special function quickly and more easily than would otherwise be possible. Of course, there is a price we pay for this benefit: the cost of the tool. Suppose, for example, that I want to dig a small hole. I could grab a nearby rock and scrape out a hole in five minutes. Or I could go build myself a shovel and then dig the hole in one minute. Now, building the shovel might take me several hours, so if all I want to do is dig a single hole, I am better off using the rock. But if I know that I will be digging quite a few holes in my time, then the savings from the shovel can really add up. This is the central concept behind any tool. You pay a steep entry price to get the tool, but each time you use it, you enjoy a savings of time. If my shovel cost me 60 minutes to build and saves me four minutes per hole, then after fifteen holes I am ahead of the game.
Play it again, Sam
Implicit in all of this is one of the most fundamental concepts behind all technology: the concept of repetition. Doing something once is slow, cumbersome, and prone to mistakes. But if we do it over and over, we develop speed and accuracy. If we can develop a system (a tool) that allows us to reduce a huge job to a sequence of repetitive operations, we suddenly become a great deal more efficient. You want a machine to take you long distances? OK, we’ll have a piston that pushes down once, turning the wheel that goes round, moving you forward. Then the piston goes back up and does it again, and again, and again. Take the ticket from the customer’s hand, tear it in half, and return the pieces to the customer. Then turn to the next person and do it again, and again, and again. Hammer the nail into the shingle; then get another nail and do it again, until the shingle is finished. Then do another shingle, and another, and another.
Again and again and again; another and another and another. This is the stuff of productivity: repetition. Repetition allows us to specialize, to learn and hone our skills. A painter can paint my house faster and better than I can, because he has painted many houses. An executive can manage a company better than I can, because he has managed many companies. Experience is nothing more than the end result of repetition. Repetition doesn’t make the wheels of civilization go round; it is the wheels of civilization going round.
The real value of the computer as a tool, then, will not lie in simple computations. The real utility of this machine is realized in repetition. If we use it over and over, we derive the true benefit of the tool. But there are two types of repetition: manual and automatic. Manual repetition is the repetition of the human; automatic repetition is the repetition of the machine. Thus, a carpenter nailing nails is engaging in manual repetition, while an automobile engine is automatically repetitive. A calculator is used in a manually repetitive manner. You, the user provide the repetition by using it over and over. The calculator itself doesn’t know or care anything about repetition. But the computer can engage in automatic repetition. The difference between a calculator and a computer is the difference between a chisel and a jackhammer, a rifle and a machine gun, a pen and a printing press.
The key to this automatic repetition with the computer is called the loop. Looping is the stuff and substance of computing. Any computer program without a loop is not worth writing or executing. Looping is truly the essence of computing.
The Infinite Loop
The simplest type of loop is trivially simple to create. You will recall from Chapter 4 that branching allows you to jump to different points in the program, to alter the program flow. The only branching cases I discussed in that chapter were cases of forward branching, in which the program always flows forward, never folding back on itself. But consider the following fragment of code:
50 PRINT "Hip, hip, hooray!"
60 GOTO 50
This is the simplest type of backward branching, in which the program flows back to itself. After it prints "Hip, hip, hooray!", the program moves to line 60, which directs it back to line 50. This simple loop is called an infinite loop, because it will continue forever unless you either a) press the BREAK key on your keyboard or b) reset or turn off the computer. An even more extreme case of an infinite loop is the following statement:
60 GOTO 60
This statement will execute forever, going to itself but never doing anything. It is more Sisyphusean than Sisyphus himself; at least he had a boulder to roll over and over. This statement does nothing over and over.
Terminating the loop
Do you remember the story of the Sorcerer’s Apprentice? The apprentice, having overheard certain magic words, orders a broom to bring some water. This the broom does, but the magic spell has apparently set the broom into an infinite loop, and the broom continues to bring more and more water, setting the room awash. The apprentice realizes to his dismay that he doesn’t know how to terminate the loop, so he is unable to stop the broom. There is a lesson here for the beginning programmer: make sure you know how to terminate a loop before you start it. The infinite loop is an academically interesting beast, but it represents a useless extreme. If a computer program with a loop is to have any practical value, it must be able to terminate the loop. Being able to start a process is only half a power; being able to stop it is the other half. How does one stop a loop? There are two fundamental methods: the conditional termination and the predetermined count.
In this form of looping, the program repeats the process until some condition is met. In effect, you are telling the computer, "Computer, do this over and over again until (blank) happens." The condition is any logical expression that can be evaluated in an IF-THEN statement. Thus, we normally use an IF-THEN statement to terminate the loop. Here is an example:
50 PRINT "Please input the amount of the next check."
60 PRINT "If no checks are left, input a zero."
70 INPUT AMOUNT
90 IF AMOUNT<>0 THEN GOTO 50
This little loop, which might come from a checkbook balancing program, repeatedly asks for the amount of the next check until the user enters a value of 0; then it exits the loop.
We use this type of loop when we don’t know how many times we want the loop to be executed. All we know is that the loop will be executed at least once and possibly many, many times. We keep executing it until the termination condition is satisfied.
The general structure of this type of loop is simple; we just put an IF-THEN statement at the end of the loop that loops us back up to the top of the loop unless the termination condition is satisfied.
This is the second major type of loop. We use this loop when we know how many times we want the loop to execute. It uses a new type of BASIC command: the FOR-NEXT loop. Here’s a simple example:
50 FOR X=1 TO 10
70 NEXT X
This bit of code works as follows: first, the computer stores a zero into FROGGY. Then, when it first encounters line 50, it stores a one into X. Then it moves on to line 60, where it adds X to FROGGY and stores the result (a one) into FROGGY. Then it goes to line 70, which tells it to use the "NEXT X". An awful lot happens in line 70. The next X after 1 is 2, so it stores a 2 into X. Then it checks to see if X has passed the upper limit of 10 that was imposed in line 50 ("FOR X=1 TO 10"). The value of X is only 2, so it automatically loops back up to line 50, does nothing there, and proceeds on to line 60. This loop continues with X taking values of 1, 2, 3, 4, and so on all the way to 10. When it reaches line 70 after the tenth pass, it goes to the next X, which is 11, and realizes that 11 is greater than the upper limit of 10 imposed in the FOR statement. It therefore decides to terminate the loop, and proceeds from line 70 to the next statement in the program.
The general form of a FOR-NEXT loop is as follows: A FOR-statement goes at the top of the loop, specifying three things: 1) the loop variable (in our example, X) that will step through all the values one at a time; 2) the initial value of the loop variable (in our example, 1) that tells the computer what value to start with for the loop variable; and 3) the final value of the loop variable (in our example, 10), that specifies the last value that the computer should use for the loop variable before terminating the loop. After the FOR-statement comes the body of the loop. We mark the bottom of the loop with a NEXT-statement.
You now know how to construct a loop. But there is a difference between knowing how to operate a tool and knowing how to utilize a tool. In order to utilize loops, you must understand a variety of other concepts. The first of these is the array. There are very few cases of loops that do not in some way use an array. An array is a group of numbers catalogued in a list. If a regular variable is like a box that holds a number, then an array is like row of boxes marked #1, #2, #3, and so on. An array allows us to readily deal with a group of numbers.
For example, hold old are you? That’s a number. How old is your brother? Your sister-in-law? Your cousin? If you were making files on all your relatives, you might want to make a list of everyone’s age. Now, the wrong way to do that would be to store each age in a separate variable, like this:
and so on. If you had 10 relatives, you’d end up with a mess. The right way to deal with this is to save all the numbers in an array. Then, if Joe is the first person, and Mary is the second person, and Grampa is the third person, then you would store Joe’s age (33) in the first box in the array, Mary’s age (24) in the second box, and Grampa’s age (62) in the third box in the array. The ages of the other relatives would go into other boxes.
Now, the problem with this system is keeping track of whose age is in which box. We normally don’t think of Grampa as person #3, but that’s what we have to do if we are going to use arrays. This problem can become severe if you have a big array with, say, 10 numbers in it. Then you get into all sorts of problems keeping track of who goes where. As it happens, there are a number of ways to solve this, but we’ll get to them later in the book. For now, just think in terms of a list like this:
# Name Age
1 Joe 33
2 Mary 24
3 Grampa 62
The array of ages is then 33, 24, 62, and so on.
Now for the technical details of implementing an array. To use an array, you must first notify the computer that you want to use an array. You must do this before you start to use the array. Normally, notifying the computer of arrays is something you do right at the beginning of the program. After all, it’s only simple courtesy to notify people up front of the expectations you’ll be placing on them, and the same thing goes for computers. You notify the computer with the command DIM, which is short for DIMENSION. Why they gave it a dumb name like DIMENSION, I’ll never know, but we’re stuck with it now. After you say DIM, you list the names and lengths of the arrays that you intend to be using in the program. A typical DIM-statement might look like this:
10 DIM AGES(10)
This statement tells the computer that you are going to be using an array that you will call AGES. You want 10 boxes in this array to hold 10 numbers. Now, if you ask for 10 boxes, you’ll get 10 boxes. This is important, because if you try to use 11 boxes, the program will not work. Don’t try to overcompensate by asking for, say, a million boxes. Those boxes come out of the memory that your computer has. Remember when you bought the computer, and the salesman yakked on and on about how it had "512K RAM", or something like that? Well, that’s how much memory your computer has, and, no matter how much you have, it never seems to be enough. So don’t waste your precious RAM by asking for more boxes than you need!
Once you have notified the computer with your DIM-statement, you are free to use the array. You treat the array in the same way that you treat a regular variable, with two exceptions. First, you can only handle a single number in the array at a time. That is, you can’t can’t tell the computer to, say, multiply the array by 5. How would you feel if somebody walked up to you, handed you a sheet of paper filled with numbers, and said, "Multiply this by 5." ? What does that mean? Multiply each one by 5? Add them all up and multiply the sum by 5? Who knows? So our program never treats the array as a lump; it must instead refer to individual numbers in the array.
The second difference between an array and a regular variable is that you must specify which element of the array you want to work with. If I have an array called AGE, I never write a command like this:
The problem with this command is that it doesn’t tell the computer which number in the array is equal to 5*FROGGY. So, to tell the computer which number you want, you use parentheses, like this:
This command makes sense; it tells the computer to put a value of 5*FROGGY into the second box in the array. Whenever you want to refer to one of the boxes in an array, you specify the name of the array, then an open parenthesis, then the number of the box, then a close parenthesis.
Of course, there is nothing that says that the number of the box has to be specified as a constant. That is, you don’t have to always say things like AGE(2), AGE(5), or AGE(9). If you want, you can use any arithmetic expression inside those parentheses. You could say this, for example:
These two commands will do exactly the same thing that the earlier example does; they will store a value of 5*FROGGY into the second box in the array.
You can even get snazzy with this, if you want:
This command will store a value of 5*FROGGY into whatever box is specified by the expression 3*BIRDIE+2. Of course, if BIRDIE is, say, 3, then 3*BIRDIE+2 will be 11, and the computer will try to store 5*FROGGY into the 11th box, which isn’t there, because we told the computer to make 10 boxes, and the computer will get confused, and that’s not good. So you have to be careful when you write snazzy commands like the one above.
Arrays and loops
The real value of arrays comes when you use them in loops; in fact, arrays and loops go hand in hand -- they were made for each other. Here is a very simple example of a loop using an array:
50 FOR X=1 TO 10
70 NEXT X
This little loop does nothing more than store a zero into each box in the array. This is a very common function, called initialization. You see, when you create an array with a DIM-statement, the computer sets aside some memory for the array. Odds are that memory was used for something else earlier, and so contains numbers produced by previous computations. Thus, when you first create an array, it already has all sorts of meaningless numbers in its boxes. You’ve got to clear out those boxes so that you don’t insult Auntie Millie by listing her age as 233. A loop like the above example will do that quite nicely.
Of course, setting all the ages to zero like the example may erase the chalkboard, but it doesn’t put any real numbers into the array. How do you do that? Here’s an example:
80 FOR X=1 TO 10
90 PRINT "Please input the next person’s age."
100 INPUT Y
120 NEXT X
This little loop will ask the user to type in the age of each person on the list. It will then store that age into the AGE array.
Now that all the ages are properly stored in the AGE array, let’s see how arrays and loops can work together. A good exercise is to find the largest number in an array. In other words, let’s have the computer find the oldest person in our array.
In order to do this, we must first design our algorithm. An algorithm is just a strategy for solving a problem. It is a set of steps that we know will lead to a solution. An algorithm is like a soft, fuzzy version of a program. The algorithm expresses the general idea of the program; the program translates the algorithm into specific commands. You use an algorithm when you give somebody directions to your home ("Go to the intersection of Elm and Hazel; turn right onto Hazel and follow it to the third stoplight. . ."). You are not specifying the actual commands necessary to get to your home, such as "Turn the steering wheel right 90 degrees, press on the accelerator. . ." You are describing the process in general terms. That’s what an algorithm is: a general, non-specific description of a procedure. Algorithms are a useful way to figure out a problem without plunging into the picky details of writing a program. Once you’ve got your general strategy figured out, it’s a lot easier to translate the algorithm into a program than to go directly from your problem to the program.
Our first task is to define what we are trying to accomplish. We want to find not one but two numbers: we want to know what the oldest age is and who has that oldest age. In programming terms, we need the highest array value and the array number of that value.
Our algorithm for doing finding these numbers will be as follows: We’ll start off by telling ourselves that the record-holding oldest age is 0 years old; that’s so young that we are guaranteed that everyone will break that record. We will also set the record-holder as 0, which is to say, nobody. Then we will sweep through the array, picking each age in turn. When we pick an age, we ask, "Is this age older than our world-record age?" If it isn’t, we should skip on to the next age, but if it is, we have a new record, so store this age into the world-record age, and store its array number ("This is the 3rd number in the array") into the record-holder. After we have done this for all the people in the array, we will print out our answer.
Now our only task is to translate the algorithm into some BASIC commands. Read this code carefully and verify for yourself that the code implements the algorithm:
150 FOR X=1 TO 10
160 IF AGE(X)
190 NEXT X
200 PRINT "The oldest person is person number "; WHOISOLDEST
210 PRINT "This person is ";OLDESTAGE;" years old."
What is the significance of this little program? If you were blessed with a skeptical attitude, you might argue that this whole thing is a waste of time, because you could search through a list of ten numbers and find the highest number in a few seconds. But consider how easily this program could be changed. What if, instead of ten numbers, there were a thousand numbers? To do that by hand would be a great deal more work, and you might make a mistake, but to change the loop to handle a thousand numbers, we need only change line 150 so that the 10 is 1000. It’s that simple. (Of course, somebody would still have to type in the thousand numbers.)
There are plenty of other things we could do. Do you want the youngest person instead of the oldest? Just change line 130 to OLDESTAGE=999 and line 160 to IF AGE(X)>OLDESTAGE THEN GOTO 190. Suppose you wanted to find anybody between 40 and 50 years old. Then you could change the loop to test for that condition ("IF (AGE(X)>=40) AND (AGE(X)<=50). . .") and print out a message if the condition was satisfied. For a real challenge, you could sort the array, listing the oldest person, then the next oldest, then the next, all the way to the youngest person. But that is too advanced a topic for this book.
The loop on the computer is closely analogous to the moveable type on the printing press. The invention of moveable type was profoundly important because it allowed each type character to be used over and over again, in different words on different places on the page. This made possible the printing press. At the same time, moveable type imposed severe constraints on the creative freedom of the author. The characters could only fit into the press in a defined fashion; no longer could the author scatter words across the page in any fashion that struck his fancy. The use of illustrations was severely curtailed, and color was impossible to print. Despite these restrictions, the printing press is ranked as one of the most important inventions of human history.
In the same way, the loop is the critical concept that makes the computer a useful, practical tool. The computer imposes constraints on our organization of data just as the early printing presses constrained the presentation of information on the printed page. And the computer may well be another landmark in the history of civilization. But there is one big difference between the printing press and the computer: the computer is accessible to everyone, not just a few experts. You can program this machine; you can guide it where you will. What will you print with your new press, Herr Gutenberg?