Original document: http://bitsavers.org/pdf/mit/cadr/Knight-LISP_Machine_Macro_Instruction_Set-1979.pdf -- Wednesday, February 21, 1979 01:05:39 AI: LMODC: MACRO 28 The LISP Machine Macro-instruction Set. author: Thomas Knight This document describes the LISP Machine's interpreted order code, referred to below as "macrocode." The macrocode is designed to be highly bit-efficient, and well-suited to LISP. The compilation of LISP into macrocode is very straightforward, as will be shown in examples below. Macrcocode is created by the "macrocompiler," which is called "QCMP." It is never neccesary to write out macrocode manually, because it correspondes so closely with the LISP source code that it would be easy to write the same thing in LISP. When QCMP is run on a function, it is said to "macrocompile" the function. It produces one "Function Entry Frame," referred to below as a FEF, for each function compiled. The resulting LISP pointer has datatype DTP-FEF-POINTER, (*1:missing some characters) and points to the FEF, which is a block of memory at least 8 words long. The FEF has several sections. The first section is always 7 words long. and contains various information about the format of the FEF and information about how the function should be invoked. For complete details of the bit layouts of these words, see the FORMAT document. The next section of the FEF contains pointers to the VALUE celle and FUNCTION cells of symbols. These pointers are of datatype DTP-EXTERNAL-VALUE-CELL-POINTER and are used as an "exit vector", that is, compiled code can refer to these pointers in order to access special variables. The next section contains the Argument Description List (ADL). The ADL contains one entry for each argument which the function expects to be passed, and contains all relevant information about the argument: whether it is required, optional, or rest, how to initialize it if it is not provided, whether it is local or special, datatype checking information, and so on. Sometimes the ADL can be dispensed with if the "fast argument option" can be used instead; this helps save time and memory for small, simple functions. The details can be found in the FORMAT document. The next section of the ADL contains various constants which the function might want to refer to: if the function includes (FOO '(A B)), then the list (A B) would be put in the constants area so that the microcode can refer to it. The rest of the FEF is the actual macroinstructions themselves. Each macroinstruction is 16 bits long, and so two macroinstructions are stored in each word of the IISP machine. There are four conceptual "classes" of microinstructions, each of which is broken down into fields in a different way. CLASS I: -------------------------------------------- | 3 | 4 | 3 | 6 | -------------------------------------------- DEST. OPCODE REGISTER OFFSET There are nine class I instructions, designated by 0 through 10 (octal) in the OPCODE field. Each instruction has a source, whose address is computed from the "REGISTER" and OFFSET fields, and a destination given by the DESTINATION field. The instructions are: OPCODE NAME ------ ---- 0 CALL Open a call block on the stack, to call the function specified by the address. Whatever the function returns will go to the destination. The actual transfer of control will not happen until the arguments have been stored. (See destinations NEXT and LAST.) 1 CALL0 CALL in the case of a function with no arguments. The transfer of control happens immediately. 2 MOVE Move the contents of E to the destination. 3 CAR Put the CAR of the contents of E in the destination. 4 CDR Analogous. 5 CADR Analogous. S CDDR Analogous. 7 CDAR Analogous. 8 CAAR Analogous. The effective address, E, is computed from the "register" and the offset. The instructions really use addressing relative to some convenient place specified by the "register" field. The register may be: REG FUNCTION === ======= 0 FEF This is the starting location of the currently-running FEF. This is how the macrocode addresses the pointers to value and function cells, and the constants area. 1 FEF+100 Same as 0, plus 100 octal. 2 FEF+200 Analogous. 3 FEF+300 Analogous. 4 CONSTANTS PAGE This is a page of widely used constants, such as T, NIL, small numbers,etc. There is only one constant page in the machine. They are kept all on one page so that they can be shared among all FEFs, so that they will not have to be repeated in each FEF's constant area. 5 LOCAL BLOCK This is the address of the local block on the PDL, and is used for addressing local variable. 6 ARG POINTER This is the argument pointer into the PDL, and is used for addressing arguments of the function. 7 PDL The offset must be 77. The top of the stack is popped off and used as the operand. The other possible values of offset are not currently used. (See the FORMAT file for how the PDL frame for each function is divided up into header, argument block, local block, and intermediate result stack.) Note: The first 4 addressing modes are all provided to allow an effective 8-bit offset into the FEF. Note: The same register-offset scheme is used in the class II instructions. An additional complication in computing the "effective address" comes from invisible pointers. Once the register_and offset have been used to compute an initial effective address E, the word at that location is examined (even if this is an instruction which uses E as a destination.) If the data type of that word is "Effective Address Invisible", the pointer field of that word is used as the effective address E. This is used, for example, to access value cells of special variables. The FEF "register" is used, and the location of the FEF addressed contains an effective address invisible pointer which points to the desired value cell. This scheme saves bits in the instruction, without requiring the use of extra instructions to make special value cells addressable. The destination field is somewhat more complicated. First of all. before the result is moved to the destination, two "indicators" are set. The indicators are each stored as a bit, and correspond to processor statue flags such as N and Z on the PDP-11. They are called the ATOM indicator, which is set if the result of the operation is an atom, and the NIL indicator, which is set if the result(*2) is NIL. The class III instructions (BRANCH) may look at the indicators. (*2:original was "resuly") Note: In actuality, there are not actually any physical indicators. Instead, the last result computed is saved in an internal register, and examined by the BRANCH instructions. The functional effect is the same. The destinations are: DEST FUNCTION ==== ======== 0 IGNORE This is the simplest; the result is simply discarded. It is still useful, because it sets the flags. 1 TO STACK This pushes the destination on the stack, which is useful for passing arguments to Class IV instructions, etc(*3:missing some cracters) 2 TO NEXT This is actually the same thing as TO STACK, but it is used for storing the next argument to the open function when it is computed. 3 TO LAST This is used for storing the last argument of the open function. It also pushes the result on the stack, and then it "activates" the open call block. That is. control will be passed to the function specified by the last CALL instruction, and the value returned by the function will be sent to the destination specified by the destination field of the call instruction. 4 TO RETURN Return the result of the instruction as the value of this function. (i.e. return from subroutine.) 5 TO NEXT, QUOTED This is the same as TO NEXT, except that for error checking, the USER-CONTROL bit of the word being, pushed is set, telling the called function that it is getting a quoted argument (if it cares). (This is not implemented.) 6 TO LAST,QUOTED Analogous. 7 TO NEXT LIST This one is fairly tricky. It is used in conjunction with the LIST (Class IV) instruction to efficiently perform the lisp "LIST" function. It is documented under the LIST instruction. Note: 5 and 6 (the QUOTED) destinations have not been implemented as of 11/03/76. Note: The same DESTINATION field is used by the class IV instructions. CLASS II: ------------------------------------------- | 7 | 3 | 6 | ------------------------------------------- OPCODE REGISTER OFFSET The class II instructions have no destination field; the result of the operation (if any) is either pushed on the stack (like a destination TO STACK or TO NEXT in a class one instruction) or is stored at the effective address. The "register" and offset are used in exactly the same way as in the class I instructions, except that the E calculated is sometimes used as a destination instead of a source. The instructions are broken up into three subgroups by the first three bits of the opcode [in the microcode they are referred to as Non-destination instruction groups 1, 2 and 3], and then into the separate instuctions by the next four bits as follows: (a "-" in the left hand column means that this instruction pops. the stack; a "+" means that it pushes something onto the stack.) GRP. OPCODE FUNCTION ==== ====== ======== 11 0 Not used. 11 1 + Adds C(E) to the top of the stack and replaces the result on the stack. 11 2 - Subtracts C(E) from the top of the stack. 11 3 * Analogous. 11 4 / Analogous. 11 5 AND Analogous. 11 6 XOR Analogous. 11 7 OR Analogous. 12 0 = \ These compare C(E) to the top of the stack, and if the 12 1 > |-condition being tested is true,the NIL indicator is clear 12 2 < | otherwise it is set. The stack is popped. 12 3 EQ / 12 4 SCDR Get the CDR of C(E), and store it in E. 12 5 SCDDR Analogous. 12 6 1+ Analogous. 12 7 1- Analogous. 13 0 BIND The cell at E is bound to itself. The linear binding pdl is used. 13 1 BINONIL The cell at E is bound to NIL. 13 2 BINDPOP The cell at E is bound to a value popped off the stack. 13 3 SETNIL Store NIL in E. 13 4 SETZERO Store fixnum 0 in E. 13 5 PUSH-E Push a locative pointer to E on the stack. 13 6 MOVEM Move the data on top of the stack to E. 13 7 POP Pop the top of the stack into E. CLASS III -------------------------------------------- | 3 | 4 | 9 | -------------------------------------------- BRANCH (14) OFFSET CODE The class III instruction is for branching. There is a 3 bit Branch Code field which determines whether the branch happens, and sometimes what to do if it fails. It is decoded as fallows: BRANCH CODE FUNCTION ====== ========== 0 ALWAYS Always branch. 1 NILINO Branch if the NIL indicator is set, else drop through. 2 NOT NILIND Branch if the NIL indicator is not set, else drop through. 3 NILIND ELSE POP \ These two are the same as NILlND and NOT NILIND, | except that if the condition fails, 4 NOT NILIND ELSE POP / the stack is popped. 5 ATOMIND Branch if the ATOM indicator is set, else drop through. 6 NOT ATOMIND Analogous. If the decision is made to perform the branch the offset, considered as a signed (two's complement) offset is added to the PC (i. e. a relative branch, such as is used by the PDP-11). If the offset is 777, however, it is interpreted as meaning that this is a long-distance branch, and the real offset is obtained from the next (16-bit) halfword. The PC added to is always the incremented PC; i.e. the address of the instruction +1 in the short case, and the address of the instruction +2 in the long case. CLASS IV -------------------------------------------- | 3 | 4 | 9 | -------------------------------------------- DEST. (15) OPCODE The class IV (miscellaneous) instructions take their arguments on the stack, and have a destination field which works the same way as the Class I instructions. They all have 15 (octal) in the opecode bits, and the actual opcode is in the last 9 bits. Thus there can be up to 512. of them. Most of them may be called directly from interpretive LISP, and so some of them duplicate functions available in classes I and II. [Note on implementation: since there are far too many class IV instructions to dispatch on using the dispatch memory of the CONS machine, the starting locations of the routines are kept in main memory. The location of the base of the dispatch table is kept in A-V-MISC-BASE at all times.] Since most of these functions are callable from interpretive level (the normal LISP user sees them), they form part of the nuclear system, and so are documented in the Lisp Machine Nuclear System document (LMNUC >). The first 200 (octal) Class IV operations are not in the dispatch table in main memory, but are specially checked for. These are the "LIST" instructions, which work in cooperation with the NEXT-LIST destination. Operations 0-77 are called LIST 0 through LIST 77. The list instruction allocates N Q's in the default consing area [A-CNSADF] which is initialized to a CDR-NEXT, CDR-NIL style list of NILs. Then the instruction pushes three words on the stack: 1) A pointer to the newly allocated block. 2) The destination field(*4) of the LIST instruction. (*4:original was"fiold") 3) A pointer to the newly allocated block (another copy). Note that the destination, as in the CALL instruction, is not used instantly; it is saved and used later. After the LIST instruction has been performed, further instructions can store to destination NEXT-LIST; once the macro-code computes the next arg of what was the LIST function in the source code it stores it to NEXT LIST. What destination NEXT LIST does is: the word on the top of the stack is taken to be a pointer to the next allocated cell. The pointer is popped, the result of the instruction is stored where it points, and then the CDR of the pointer is pushed back on to the stack. If the CDR was NIL however, then we must be finished with the LIST operation, so the NIL is popped off the stack and discarded and a pointer to the newly allocated area (another copy of which was thoughtfully stored on the stack) is sent to the destination of the LIST instruction (which was also stored on the stack), and the two remaining words which the LIST pushed are popped off. CLASS V Opcodes 16 and 17 (octal) are not used and reserved for future expansion. EXAMPLES: Here is a (very) typical Lisp function, for computing factorials. (defun fact (x) (cond ((zerop x) 1) (t (* x (fact (1- x)))))) This is the macrocode produced by the compiler, typed out by the DISASSEMBLE function on the Lisp Machine. 22 NOVE D-PDL ARG|0 ;X 23 MISC D-IGNORE ZEROP 24 BR-NIL 26 25 MOVE D-RETURN 'I 26 MOVE D-PDL ARG|0 ;X 27 CALL D-PDL FEF|10 ;@FUNCTION-CELL FACT 30 MOVE D-PDL ARG|0 ;X 31 MISC D-LAST 1- 32 * PDL-POP 33 MOVE D-RETURN PDL-POP The first thing (line 22) is to push argument 0 (x) onto the stack, and (line 23) check if it is equal to zero. Line 23 uses the ZEROP miscellaneous function, which sets the "indicators" to NIL if the quantity was not ZERO. Line 24 is a branch instruction which tests the "indicators"; if NIL is set, it will branch to 26. If NIL was not set (the number was zero), it falls through to line 25, which returns the value 1. If the number was not zero (the second clause of the COND in the source), then control passes to line 26, which pushes X on the PDL (first argument to the multiply on line 32). Next line 27 opens a call to FACT. Line 30 subtracts 1 from X (with the 1- miscellaneous function), and moves the result to "destination LAST". This result is thus the first and only argument to the recursive invokation of FACT, the result of which is left on the PDL because of the destination field of the CALL instrucfion on line 27. Now (FACT (1- x)) and X are on the PDL, and they are multiplied by the multiply instruction on line 32. It leaves its result on the PDL, to be found by line 33, which returns the result. -- EOF