May 1982, Issue 24 COMPUTE!<br />
preter for ROM Applesoft which works in much<br />
the same way as an &-interpreter would, except<br />
that it does not require the &.<br />
How The Applesoft Interpreter Works<br />
Rather than storing the literal characters of BASIC<br />
commands, and of many other key words and<br />
symbols, Applesoft represents these internally in a<br />
tokenized format. Each key word is replaced by a<br />
number, between $80 and $EA (see the Applesoft<br />
Reference Manual, p. 121), and can thus be stored in<br />
a single byte of memory. This is an extremely space<br />
efficient storage system. However, Applesoft now<br />
requires an interpreter to decode these tokens, in<br />
order to act on the commands, or to evaluate the<br />
functions which they represent. All key words are<br />
first tokenized, then interpreted, whether the<br />
machine is in Immediate Mode, responding to<br />
commands as you type them in, or in Deferred<br />
Mode, running a program. Our interpreter mimics<br />
the Applesoft interpreter. For this reason, and also<br />
because we think it would be helpful for anyone<br />
writing modifications to Applesoft, we will describe<br />
the behavior of the Applesoft interpreter in some<br />
detail.<br />
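As a rough illustration of the space saving, the sketch below (in Python, not 6502 code) replaces key words with single token bytes.<br />
Only the two token values this article states, END = $80 and FOR = $81, are included. The function name and the simplified scanning rule are assumptions; the real tokenizer handles many more cases.<br />

```python
# Partial token table: only the two values given in the article.
TOKENS = {"END": 0x80, "FOR": 0x81}

def tokenize(line):
    """Replace each known key word in `line` with its one-byte token.

    Hypothetical, simplified scanner: every other character is kept
    as its ASCII code.
    """
    out = bytearray()
    i = 0
    while i < len(line):
        for word, token in TOKENS.items():
            if line.startswith(word, i):
                out.append(token)      # one byte stands in for the whole word
                i += len(word)
                break
        else:
            out.append(ord(line[i]))   # not a key word: store the character
            i += 1
    return bytes(out)
```

With this sketch, tokenize("FOR I") produces three bytes where the literal text would need five.<br />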
The Flow Of Control<br />
The Applesoft interpreter starts at location $D805.<br />
We cannot list this copyrighted routine here, but<br />
you can see it for yourself if you jump to the<br />
MONITOR (via CALL -151) and then type in<br />
"D805L". Having the routine in front of you may<br />
clarify the flow of control described below.<br />
This program begins, at $D805, by determining<br />
whether the TRACE command is in effect<br />
(flagged by the contents of $F2). If so, and if a<br />
Deferred Mode program is being run (flagged by<br />
contents of $76), then, before each command is<br />
executed, the "#" sign is printed, followed by the<br />
line number. The first location following this printout<br />
is $D81D, which is branched to directly if the<br />
printing is not to be done.<br />
At $D81D, the CHRGET routine ($B1-$C8) is<br />
called. This subroutine fetches the next character<br />
of the program and sets the Zero flag if that character<br />
signals an "end of line," that is, if the character<br />
is a carriage return or a colon, ":". The routine<br />
then calls, via a JSR, the actual interpretation subroutine,<br />
which starts at $D828. This returns immediately,<br />
via the RTS at $D857, if an end of line is<br />
encountered. Otherwise, the program will return<br />
from this subroutine later, in a more indirect way.<br />
When this subroutine is returned from, the interpreter<br />
exits by jumping to a routine called<br />
NEWSTT (NEW STaTement) at $D7D2, which<br />
will execute the next program statement, falling<br />
back into this interpreter in the process.<br />
If CHRGET did not find an end of line, the<br />
$D828 subroutine expects to find a command of<br />
some sort. If the character fetched by CHRGET is<br />
not a token, the interpreter assumes the programmer<br />
intended a LET command (e.g., X = 100) and<br />
jumps to the LET subroutine at $DA46. The interpreter<br />
determines whether it has a token by subtracting<br />
$80, the value of the smallest token, from<br />
the Accumulator (A), which holds the character. If<br />
A is still positive, we have a token. This token may<br />
represent a command ($80 through $BF), or it<br />
may be some non-command key word ($C0 through<br />
$EA). Since we have already subtracted $80 from<br />
A, we have a command only if A is less than $40<br />
($40 + $80 = $C0), which is checked by a CMP<br />
(compare) instruction. If A is not less than $40, we<br />
do not have a command, which is what should be<br />
here, so the interpreter jumps to $D846, thence to<br />
$DEC9, which produces a "? SYNTAX ERROR".<br />
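The three outcomes described above can be sketched in Python. This is a model of the decision only, not the ROM code.<br />
The function name and the returned labels are assumptions made for illustration; the constants $80 and $40 come from the text.<br />

```python
TOKEN_BASE = 0x80     # value of the smallest token
COMMAND_SPAN = 0x40   # tokens $80 through $BF are commands

def classify(char_code):
    """Decide what the $D828 subroutine would do with a fetched byte.

    Returns a (kind, index) pair, where index is the token minus $80
    for commands and None otherwise.
    """
    a = char_code - TOKEN_BASE           # the subtraction of $80
    if a < 0:
        return ("let", None)             # not a token: assume an implied LET
    if a < COMMAND_SPAN:
        return ("command", a)            # $00-$3F: index into the command table
    return ("syntax_error", None)        # a key word that is not a command
```

For example, the letter X ($58) is treated as the start of a LET, $80 (END) gives command index 0, and $C0 leads to the syntax error.<br />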
The Command Table<br />
If we are dealing with a command, the next job of<br />
the interpreter is to determine where to go to<br />
execute it. There is an address table, beginning at<br />
$D000 (Applesoft IIa: $0800), which contains this<br />
information. In this table, the starting address of<br />
every command, less 1, is stored in order of magnitude<br />
of the command's token. Thus the address of<br />
END, whose token is $80, is stored first, from<br />
$D000 to $D001. FOR's token is next, and the<br />
address of the FOR routine, less 1, is stored from<br />
$D002 to $D003. Since $80 was subtracted from A,<br />
A now stores a number between $00 and $3F.<br />
Double this number, by rotating A left, add it to<br />
$D000, and you get the location of the two-byte<br />
address of the command.<br />
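The arithmetic can be checked with a short Python sketch. The function name is an assumption; the base address $D000 and the two-bytes-per-entry layout are taken from the description above.<br />

```python
TABLE_BASE = 0xD000   # start of the command address table in ROM Applesoft

def table_entry_address(token):
    """Return the address of the two-byte table entry for a command token."""
    index = token - 0x80                  # $00-$3F after the subtraction
    if not 0 <= index <= 0x3F:
        raise ValueError("not a command token")
    return TABLE_BASE + (index << 1)      # doubling: two bytes per entry
```

END ($80) maps to $D000 and FOR ($81) maps to $D002, matching the locations given in the text.<br />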
The addition is accomplished by indexing<br />
$D001 and $D000 with register Y, after Y is loaded<br />
with the doubled contents of A. The command's<br />
address, less 1, is then pushed onto the stack. When<br />
the next RTS is encountered, the program will<br />
"return" control to the last address on the stack,<br />
after adding 1 to that address. Thus, the next RTS<br />
we encounter will force a jump to the correct<br />
starting address of the command to be executed.<br />
The actual location of the interpreter's RTS is<br />
hidden. The final instruction of the interpreter is a<br />
JMP, rather than a JSR, to CHRGET, which will<br />
fetch the first character following the command.<br />
The RTS from CHRGET is the one which takes us<br />
to the command itself.<br />
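The reason the table stores each address less 1 can be seen in a small Python model of the 6502 stack.<br />
This is a sketch under stated assumptions: the function names are invented for illustration, and COMMAND_START is a hypothetical address, not one taken from the ROM.<br />

```python
def push_return_target(stack, address_minus_one):
    """Push a 16-bit value the way the 6502 stores one: high byte, then low."""
    stack.append((address_minus_one >> 8) & 0xFF)
    stack.append(address_minus_one & 0xFF)

def rts(stack):
    """Model RTS: pull low byte, then high byte, and resume at that value plus 1."""
    low = stack.pop()
    high = stack.pop()
    return (((high << 8) | low) + 1) & 0xFFFF

stack = []
COMMAND_START = 0x1234                          # hypothetical routine address
push_return_target(stack, COMMAND_START - 1)    # the table holds start - 1
resumed_at = rts(stack)                         # execution resumes at COMMAND_START
```

Because RTS always adds 1 to the address it pulls, pushing the starting address less 1 makes the "return" land exactly on the first instruction of the command.<br />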
Note that the next address on the stack is the<br />
address of the routine which called this interpreter.<br />
We have already seen what happens when this one<br />
is returned to: the interpretation process stops and<br />
the program jumps to NEWSTT. As soon as the<br />
RTS at the end of the command we will execute is<br />
encountered, the program will effectively branch<br />