Art of Assembly: Chapter Eight-7

8.14 - Macros
8.14.1 - Procedural Macros
8.14.2 - Macros vs. 80x86 Procedures
8.14.3 - The LOCAL Directive
8.14.4 - The EXITM Directive
8.14.5 - Macro Parameter Expansion and Macro Operators
8.14.6 - A Sample Macro to Implement For Loops

8.14 Macros

A macro is like a procedure that inserts a block of statements at various points in your program during assembly. MASM supports three general types of macros: procedural macros, functional macros, and looping macros. Along with conditional assembly, these tools provide the traditional if, loop, procedure, and function constructs found in many high level languages. Unlike the assembly instructions you write, the conditional assembly and macro language constructs execute during assembly. The conditional assembly and macro statements do not exist when your assembly language program is running. The purpose of these statements is to control which statements MASM assembles into your final ".exe" file. While the conditional assembly directives select or omit certain statements for assembly, the macro directives let you emit repetitive sequences of instructions to an assembly language file, much as high level language procedures and loops let you repetitively execute sequences of high level language statements.

8.14.1 Procedural Macros

The following sequence defines a macro:

name             macro  {parameter1 {, parameter2 {, ...}}}

              <statements>
              
                endm

Name must be a valid and unique symbol in the source file. You will use this identifier to invoke the macro. The (optional) parameter names are placeholders for values you specify when you invoke the macro. The braces above denote the optional items; they should not actually appear in your source code. These parameter names are local to the macro and may appear elsewhere in the program.

Example of a macro definition:

COPY            macro   Dest, Source
                mov     ax, Source
                mov     Dest, ax
                endm

This macro will copy the word at the source address to the word at the destination address. The symbols Dest and Source are local to the macro and may appear elsewhere in the program.

Note that MASM does not immediately assemble the instructions between the macro and endm directives when MASM encounters the macro. Instead, the assembler stores the text corresponding to the macro into a special table (called the symbol table). MASM inserts these instructions into your program when you invoke the macro.

To invoke (use) a macro, simply specify the macro name as a MASM mnemonic. When you do this, MASM will insert the statements between the macro and endm directives into your code at the point of the macro invocation. If your macro has parameters, MASM will substitute the actual parameters appearing as operands for the formal parameters appearing in the macro definition. MASM does a straight textual substitution, just as though you had created text equates for the parameters.

Consider the following code that uses the COPY macro defined above:

                call    SetUpX
                copy    Y, X
                add     Y, 5

This program segment issues a call to SetUpX (which, presumably, does something to the variable X) and then invokes the COPY macro, which copies the value in the variable X into the variable Y. Finally, it adds five to the value contained in variable Y.

Note that this instruction sequence is absolutely identical to:

                call    SetUpX
                mov     ax, X
                mov     Y, ax
                add     Y, 5

In some instances using macros can save a considerable amount of typing in your programs. For example, suppose you want to access elements of various two dimensional arrays. As you may recall, the formula to compute the row-major address for an array element is

element address = base address + (First Index * Row Size + Second Index) * element size
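
As a quick check of this formula, the following sketch computes the offset of one element at assembly time. It assumes an array of 16 rows with seven two-byte elements per row; the names Tbl, RowSize, ElemSize, and Ofs_3_5 are hypothetical and chosen only for this illustration.

```asm
Tbl             sword   16 dup (7 dup (?))      ;16 rows, 7 elements per row
                 .
                 .
                 .
RowSize         =       7                       ;Elements per row
ElemSize        =       2                       ;Bytes per element (sword)

; Offset of Tbl[3][5] from the base address:
;       (3*7 + 5) * 2 = 26 * 2 = 52 bytes

Ofs_3_5         =       (3*RowSize + 5) * ElemSize

                mov     ax, Tbl[Ofs_3_5]        ;Same as mov ax, Tbl[52]
```

Note that "Row Size" in the formula is the number of elements in one row (seven here), not the number of rows.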

Suppose you want to write some assembly code that achieves the same result as the following C code:

int a[16][7], b[16][7], x[7][16];
int i,j;

        for (i=0; i<16; i = i + 1)
                for (j=0; j < 7; j = j + 1)
                        x[j][i] = a[i][j]*b[15-i][j];

The 80x86 code for this sequence is rather complex because of the number of array accesses. The complete code is

                .386                    ;Uses some 286 & 386 instrs.
                option  segment:use16   ;Required for real mode programs
                 .
                 .
                 .
a               sword   16 dup (7 dup (?))
b               sword   16 dup (7 dup (?))
x               sword   7 dup (16 dup (?))
                 .
                 .
                 .
i               textequ <cx>            ;Hold I in CX register.
j               textequ <dx>            ;Hold J in DX register.

                mov     I, 0            ;Initialize I loop index with zero.
ForILp:         cmp     I, 16           ;Is I less than 16?
                jnl     ForIDone        ;If so, fall into body of I loop.

                mov     J, 0            ;Initialize J loop index with zero.
ForJLp:         cmp     J, 7            ;Is J less than 7?
                jnl     ForJDone        ;If so, fall into body of J loop.

                imul    bx, I, 7        ;Compute index for a[i][j].
                add     bx, J
                add     bx, bx          ;Element size is two bytes.
                mov     ax, A[bx]       ;Get a[i][j]

                mov     bx, 15          ;Compute index for b[15-I][j].
                sub     bx, I
                imul    bx, 7
                add     bx, J
                add     bx, bx          ;Element size is two bytes.
                imul    ax, b[bx]       ;Compute a[i][j] * b[15-i][j]

                imul    bx, J, 16       ;Compute index for X[J][I]
                add     bx, I
                add     bx, bx
                mov     X[bx], ax       ;Store away result.

                inc     J               ;Next loop iteration.
                jmp     ForJLp

ForJDone:       inc     I               ;Next I loop iteration.
                jmp     ForILp

ForIDone:                               ;Done with nested loop.

This is a lot of code for only five C/C++ statements! If you take a close look at this code, you'll notice that a large number of the statements simply compute the index into the three arrays. Furthermore, the code sequences that compute these array indices are very similar. If they were exactly the same, it would be obvious we could write a macro to replace the three array index computations. Since these index computations are not identical, one might wonder if it is possible to create a macro that will simplify this code. The answer is yes; by using macro parameters it is very easy to write such a macro. Consider the following code:

i               textequ <cx>            ;Hold I in CX register.
j               textequ <dx>            ;Hold J in DX register.

NDX2            macro   Index1, Index2, RowSize
                imul    bx, Index1, RowSize
                add     bx, Index2
                add     bx, bx
                endm

                mov     I, 0            ;Initialize I loop index with zero.
ForILp:         cmp     I, 16           ;Is I less than 16?
                jnl     ForIDone        ;If so, fall into body of I loop.

                mov     J, 0            ;Initialize J loop index with zero.
ForJLp:         cmp     J, 7            ;Is J less than 7?
                jnl     ForJDone        ;If so, fall into body of J loop.

                NDX2    I, J, 7
                mov     ax, A[bx]       ;Get a[i][j]

                mov     bx, 15          ;Compute index for b[15-I][j].
                sub     bx, I
                NDX2    bx, J, 7
                imul    ax, b[bx]       ;Compute a[i][j] * b[15-i][j]

                NDX2    J, I, 16
                mov     X[bx], ax       ;Store away result.

                inc     J               ;Next loop iteration.
                jmp     ForJLp

ForJDone:       inc     I               ;Next I loop iteration.
                jmp     ForILp

ForIDone:                               ;Done with nested loop.

One problem with the NDX2 macro is that you need to know the row size of an array (since it is a macro parameter). In a short example like this one, that isn't much of a problem. However, if you write a large program you can easily forget the sizes and have to look them up or, worse yet, "remember" them incorrectly and introduce a bug into your program. One reasonable question to ask is if MASM could figure out the row size of the array automatically. The answer is yes.

MASM's length operator is a holdover from the pre-6.0 days. It was supposed to return the number of elements in an array. However, all it really returns is the first value appearing in the array's operand field. For example, (length a) would return 16 given the definition for a above. MASM corrected this problem by introducing the lengthof operator that properly returns the total number of elements in an array. (Lengthof a), for example, properly returns 112 (16 * 7). Although the (length a) operator returns the wrong value for our purposes (it returns the column size rather than the row size), we can use its return value to compute the row size using the expression (lengthof a)/(length a). With this knowledge, consider the following two macros:

; LDAX- This macro loads ax with the word at address Array[Index1][Index2]
;       Assumptions:    You've declared the array using a statement like
;                       Array word Colsize dup (RowSize dup (?))
;                       and the array is stored in row major order.
;
;       If you specify the (optional) fourth parameter, it is an 80x86
;       machine instruction to substitute for the MOV instruction that
;       loads AX from Array[bx].

LDAX            macro   Array, Index1, Index2, Instr
                imul    bx, Index1, (lengthof Array) / (length Array)
                add     bx, Index2
                add     bx, bx

; See if the caller has supplied the fourth operand.

                ifb     <Instr>
                mov     ax, Array[bx]           ;If not, emit a MOV instr.
                else
                instr   ax, Array[bx]           ;If so, emit user instr.
                endif
                endm

; STAX- This macro stores ax into the word at address Array[Index1][Index2]
;       Assumptions: Same as above

STAX            macro   Array, Index1, Index2
                imul    bx, Index1, (lengthof Array) / (length Array)
                add     bx, Index2
                add     bx, bx
                mov     Array[bx], ax
                endm
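
To see how the optional fourth parameter works, here is a sketch of what an invocation such as "ldax b, bx, J, imul" should expand to. It assumes the declaration of b and the text equate for J from the earlier listings. Since Instr is not blank in this invocation, MASM assembles the else section and emits imul in place of mov:

```asm
; Expansion of:         ldax    b, bx, J, imul

                imul    bx, bx, (lengthof b) / (length b)   ;112/16 = 7
                add     bx, J                   ;J is a text equate for dx.
                add     bx, bx                  ;Element size is two bytes.
                imul    ax, b[bx]               ;Instr was "imul", not blank.
```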

With the macros above, the original program becomes:

i               textequ <cx>            ;Hold I in CX register.
j               textequ <dx>            ;Hold J in DX register.

                mov     I, 0            ;Initialize I loop index with zero.
ForILp:         cmp     I, 16           ;Is I less than 16?
                jnl     ForIDone        ;If so, fall into body of I loop.

                mov     J, 0            ;Initialize J loop index with zero.
ForJLp:         cmp     J, 7            ;Is J less than 7?
                jnl     ForJDone        ;If so, fall into body of J loop.

                ldax    A, I, J         ;Fetch A[I][J]
                mov     bx, 15          ;Compute 15-I.
                sub     bx, I
                ldax    b, bx, J, imul  ;Multiply in B[15-I][J].
                stax    x, J, I         ;Store to X[J][I]

                inc     J               ;Next loop iteration.
                jmp     ForJLp

ForJDone:       inc     I               ;Next I loop iteration.
                jmp     ForILp

ForIDone:                               ;Done with nested loop.

As you can plainly see, the code for the loops above is getting shorter and shorter by using these macros. Of course, the entire code sequence is actually longer because the macros represent more lines of code than they save in the original program. However, that is an artifact of this particular program. In general, you'd probably have more than three array accesses; furthermore, you can always put the LDAX and STAX macros in a library file and automatically include them anytime you're dealing with two dimensional arrays. Although, technically, your program might actually contain more assembly language statements if you include these macros in your code, you only had to write those macros once. After that, it takes very little effort to include the macros in any new program.

We can shorten this code sequence even more using some additional macros. However, there are a few additional topics to cover before we can do that, so keep reading.

8.14.2 Macros vs. 80x86 Procedures

Beginning assembly language programmers often confuse macros and procedures. A procedure is a single section of code that you call from various points in the program. A macro is a sequence of instructions that MASM replicates in your program each time you use the macro. Consider the following two code fragments:

Proc_1          proc    near
                mov     ax, 0
                mov     bx, ax
                mov     cx, 5
                ret
Proc_1          endp

Macro_1         macro
                mov     ax, 0
                mov     bx, ax
                mov     cx, 5
                endm

                call    Proc_1
                 .
                 .
                call    Proc_1
                 .
                 .
                Macro_1
                 .
                 .
                Macro_1

Although the macro and procedure produce the same result, they do it in different ways. The procedure definition generates code when the assembler encounters the proc directive. A call to this procedure requires only three bytes. At execution time, the 80x86:

encounters the call instruction,
pushes the return address onto the stack,
jumps to Proc_1,
executes the code therein,
pops the return address off the stack, and
returns to the calling code.

The macro, on the other hand, does not emit any code when processing the statements between the macro and endm directives. However, upon encountering Macro_1 in the mnemonic field, MASM will assemble every statement between the macro and endm directives and emit that code to the output file. At run time, the CPU executes these instructions without the call/ret overhead.

The execution of a macro expansion is usually faster than the execution of the same code implemented with a procedure. However, this is another example of the classic speed/space trade-off. Macros execute faster by eliminating the call/return sequence. However, the assembler copies the macro code into your program at each macro invocation. If you have a lot of macro invocations within your program, it will be much larger than the same program that uses procedures.
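
To put rough numbers on this trade-off, the sketch below uses the location counter ($) to measure how many bytes each kind of invocation adds. It assumes a 16-bit code segment and the Proc_1 and Macro_1 definitions above; the labels and the symbols MacroSize and CallSize are hypothetical names used only for this illustration.

```asm
MacroStart:
                Macro_1                         ;mov ax,0 / mov bx,ax / mov cx,5
MacroSize       =       $ - MacroStart          ;3 + 2 + 3 = 8 bytes

CallStart:
                call    Proc_1                  ;Near call
CallSize        =       $ - CallStart           ;3 bytes
```

Under these assumptions every expansion of Macro_1 costs eight bytes, while every call costs three bytes plus a one-time cost of nine bytes for the procedure body and its ret instruction.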

Macro invocations and procedure invocations are considerably different. To invoke a macro, you simply specify the macro name as though it were an instruction or directive. To invoke a procedure you need to use the call instruction. In many contexts it is unfortunate that you use two separate invocation mechanisms for such similar operations. The real problem occurs if you want to switch a macro to a procedure or vice versa. It might be that you've been using macro expansion for a particular operation, but now you've expanded the macro so many times it makes more sense to use a procedure. Or maybe just the opposite is true: you've been using a procedure but you want to expand the code in-line to improve its performance. The problem with either conversion is that you will have to find every invocation of the macro or procedure call and modify it. Modifying the procedure or macro is easy, but locating and changing all the invocations can be quite a bit of work. Fortunately, there is a very simple technique you can use so procedure calls share the same syntax as macro invocation. The trick is to create a macro or a text equate for each procedure you write that expands into a call to that procedure. For example, suppose you write a procedure ClearArray that zeros out arrays. When writing the code, you could do the following:

ClearArray      textequ <call $$ClearArray>
$$ClearArray    proc    near
                 .
                 .
                 .
$$ClearArray    endp

To call the ClearArray procedure, you'd simply use a statement like the following:

                 .
                 .
                 .
        <Set up parameters for ClearArray>
                ClearArray
                 .
                 .
                 .

If you ever change the $$ClearArray procedure to a macro, all you need to do is name it ClearArray and dispose of the textequ for the procedure. Conversely, if you already have a macro and you want to convert it to a procedure, simply name the procedure $$procname and create a text equate that emits a call to this procedure. This allows you to use the same invocation syntax for procedures or macros.
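
The same trick works with a one-line macro in place of the text equate. A minimal sketch of that variant, using the same ClearArray example (the procedure body is elided, as before):

```asm
ClearArray      macro
                call    $$ClearArray
                endm

$$ClearArray    proc    near
                 .
                 .
                 .
                ret
$$ClearArray    endp
```

Invocations look the same either way: a line containing just ClearArray. The macro form may be the more convenient choice if you later want the wrapper to accept parameters.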

This text won't normally use the technique described above, except for the UCR Standard Library routines. This is not because this isn't a good way to invoke procedures. Some people have trouble differentiating macros and procedures, so this text will use explicit calls to help avoid that confusion. Standard Library calls are an exception because using macro invocations is the standard way to call these routines.

8.14.3 The LOCAL Directive

Consider the following macro definition:

LJE             macro   Dest
                jne     SkipIt
                jmp     Dest
SkipIt:
                endm

This macro does a "long jump if equal". However, there is one problem with it. Since MASM copies the macro text verbatim (allowing, of course, for parameter substitution), the symbol SkipIt will be redefined each time the LJE macro appears. When this happens, the assembler will generate a multiple definition error. To overcome this problem, the local directive can be used to define a local symbol within the macro. Consider the following macro definition:

LJE             macro   Dest
                local   SkipIt
                jne     SkipIt
                jmp     Dest
SkipIt:
                endm

In this macro definition, SkipIt is a local symbol. Therefore, the assembler will generate a new copy of SkipIt each time you invoke the macro. This will prevent MASM from generating an error.
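
MASM implements local symbols by substituting a generated name of the form ??nnnn (four hexadecimal digits) for each one. The sketch below shows roughly what two invocations of LJE expand to. Target1 and Target2 are hypothetical labels, and the exact numbers depend on how many local symbols MASM has already generated in the source file.

```asm
; Expansion of:         LJE     Target1

                jne     ??0000
                jmp     Target1
??0000:

; Expansion of:         LJE     Target2

                jne     ??0001
                jmp     Target2
??0001:
```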

The local directive, if it appears within your macro definition, must appear immediately after the macro directive. If you need multiple local symbols, you can specify several of them in the local directive's operand field. Simply separate each symbol with a comma:

IFEQUAL         macro   a, b
                local   ElsePortion, Done
                mov     ax, a
                cmp     ax, b
                jne     ElsePortion
                inc     bx
                jmp     Done
ElsePortion:    dec     bx
Done:
                endm

8.14.4 The EXITM Directive

The exitm directive immediately terminates the expansion of a macro, exactly as though MASM encountered endm. MASM ignores all text from the exitm directive to the endm.

You're probably wondering why anyone would ever use the exitm directive. After all, if MASM ignores all text between exitm and endm, why bother sticking an exitm directive into your macro in the first place? The answer is conditional assembly. You can use conditional assembly to execute the exitm directive only under certain conditions, so that the rest of the macro expands in some cases but not in others. Consider the following:

Bytes           macro   Count 
                byte    Count
                if      Count eq 0
                exitm
                endif
                byte    Count dup (?)
                endm

Of course, this simple example could have been coded without using the exitm directive (the conditional assembly directive is all we require), but it does demonstrate how the exitm directive can be used within a conditional assembly sequence to control its influence.
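
As an illustration, here is what two invocations of the Bytes macro should expand to, followed by a sketch of an equivalent macro that relies on conditional assembly alone. Bytes2 is a hypothetical name used only for this comparison.

```asm
; Expansion of:         Bytes   3

                byte    3
                byte    3 dup (?)

; Expansion of:         Bytes   0

                byte    0                       ;exitm skips the dup line.

; The same behavior without exitm:

Bytes2          macro   Count
                byte    Count
                if      Count ne 0
                byte    Count dup (?)
                endif
                endm
```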

8.14.5 Macro Parameter Expansion and Macro Operators

Since MASM does a textual substitution for macro parameters when you invoke a macro, there are times when a macro invocation might not produce the results you expect. For example, consider the following (admittedly dumb) macro definition:

Index           =       8

; Problem-      This macro attempts to load AX with the element of a word
;               array specified by the macro's parameter. This parameter
;               must be an assembly-time constant.

Problem         macro   Parameter
                mov     ax, Array[Parameter*2]
                endm
                 .
                 .
                 .
                Problem 2
                 .
                 .
                 .
                Problem Index+2

When MASM expands the first invocation of Problem above, it produces the instruction:

                mov     ax, Array[2*2]

Okay, so far so good. This code loads element two of Array into ax. However, consider the expansion of the second invocation to Problem, above:

                mov     ax, Array[Index+2*2]

Because MASM's address expressions support operator precedence (see "Operator Precedence" on page 396), this macro expansion will not produce the correct result. Since Index is eight, the expression Index+2*2 evaluates to 12 rather than (Index+2)*2 = 20. The instruction therefore accesses element six of Array (at byte offset 12) rather than element ten (at byte offset 20).

The problem above occurs because MASM simply replaces a formal parameter by the actual parameter's text, not the actual parameter's value. This pass by name parameter passing mechanism should be familiar to long-time C and C++ programmers who use the #define statement. If you think that macro (pass by name) parameters work just like Pascal and C's pass by value parameters, you are setting yourself up for eventual disaster.

One possible solution, that works well for macros like the above, is to put parentheses around macro parameters that occur within expressions inside the macro. Consider the following code:

Problem         macro   Parameter
                mov     ax, Array[(Parameter)*2]
                endm
                 .
                 .
                 .
                Problem Index+2

This macro invocation expands to

                mov     ax, Array[(Index+2)*2]

This produces the expected result.

Textual parameter substitution is but one problem you'll run into when using macros. Another problem occurs because MASM has two types of assembly time values: numeric and text. Unfortunately, MASM expects numeric values in some contexts and text values in others. They are not fully interchangeable. Fortunately, MASM provides a set of operators that let you convert between one form and the other (if it is possible to do so). To understand the subtle differences between these two types of values, look at the following statements:

Numeric         =       10+2
Textual         textequ <10+2>

MASM evaluates the numeric expression "10+2" and associates the value twelve with the symbol Numeric. For the symbol Textual, MASM simply stores away the string "10+2" and substitutes it for Textual anywhere you use it in an expression.

In many contexts, you could use either symbol. For example, the following two statements both load ax with twelve:

                mov     ax, Numeric     ;Same as mov ax, 12
                mov     ax, Textual     ;Same as mov ax, 10+2

However, consider the following two statements:

                mov     ax, Numeric*2   ;Same as mov ax, 12*2
                mov     ax, Textual*2   ;Same as mov ax, 10+2*2

As you can see, the textual substitution that occurs with text equates can lead to the same problems you encountered with textual substitution of macro parameters.
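
The parentheses trick shown for macro parameters applies here as well. In the statements above, Textual*2 becomes 10+2*2, which is 14 rather than 24. A minimal sketch of one way around this is to place the parentheses inside the text equate itself:

```asm
Textual         textequ <(10+2)>

                mov     ax, Textual*2   ;Same as mov ax, (10+2)*2 (loads 24)
```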

MASM will automatically convert a text object to a numeric value, if the conversion is necessary. Other than the textual substitution problem described above, you can use a text value (whose string represents a numeric quantity) anywhere MASM requires a numeric value.

Going the other direction, numeric value to text value, is not automatic. Therefore, MASM provides an operator you can use to convert numeric data to textual data: the "%" operator. This expansion operator forces an immediate evaluation of the following expression and then it converts the result of the expression into a string of digits. Look at these invocations of the Problem macro:

                Problem 10+2    ;Parameter is "10+2"
                Problem %10+2   ;Parameter is "12"

In the second example above, the text expansion operator instructs MASM to evaluate the expression "10+2" and convert the resulting numeric value to a text value consisting of the digits that represent the value twelve. Therefore, these two macro invocations expand into the following statements (respectively):

                mov     ax, Array[10+2*2]       ;Problem 10+2 expansion
                mov     ax, Array[12*2]         ;Problem %10+2 expansion
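
The expansion operator also gives you a second way to repair the "Problem Index+2" invocation from earlier, without touching the macro. The following sketch assumes Index still has the value eight and that Problem is the original definition (the one without parentheses around Parameter):

```asm
                Problem %Index+2        ;Parameter is "10"

; Expands to:

                mov     ax, Array[10*2] ;Byte offset 20, as intended.
```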

MASM provides a second operator, the substitution operator, that lets you expand macro parameter names where MASM does not normally expect a symbol. The substitution operator is the ampersand ("&") character. If you surround a macro parameter name with ampersands inside a macro, MASM will substitute the parameter's text regardless of the location of the symbol. This lets you expand macro parameters whose names appear inside other identifiers or inside literal strings. The following macro demonstrates the use of this operator:

DebugMsg        macro   Point, String
Msg&Point&      byte    "At point &Point&: &String&"
                endm
                 .
                 .
                 .
                DebugMsg 5, <Assertion fails>

The macro invocation immediately above produces the statement:

Msg5            byte    "At point 5: Assertion fails"

Note how the substitution operator allowed this macro to concatenate "Msg" and "5" to produce the label on the byte directive. Also note that the substitution operator lets you expand macro parameters even if they appear in a literal string constant. Without the ampersands in the string, MASM would have emitted the statement:

Msg5            byte    "At point Point: String"

Another important operator active within macros is the literal character operator, the exclamation mark ("!"). This symbol instructs MASM to pass the following character through without any modification. You would normally use this symbol if you need to include one of the following symbols as a character within a macro:

! & > %

For example, had you really wanted the string in the DebugMsg macro to display the ampersands, you would use the definition:

DebugMsg        macro   Point, String
Msg&Point&      byte    "At point !&Point!&: !&String!&"
                endm

"DebugMsg 5, <Assertion fails>" would produce the following statement:

Msg5            byte    "At point &Point&: &String&"

Use the "<" and ">" symbols to delimit text data inside MASM. The following two invocations of the PutData macro show how you can use these delimiters in a macro:

PutData         macro   TheName, TheData
PD_&TheName&    byte    TheData
                endm
                 .
                 .
                 .
                PutData MyData, 5, 4, 3         ;Emits "PD_MyData byte 5"
                PutData MyData, <5, 4, 3>       ;Emits "PD_MyData byte 5, 4, 3"

You can use the text delimiters to surround objects that you wish to treat as a single parameter rather than as a list of multiple parameters. In the PutData example above, the first invocation passes four parameters to PutData (PutData ignores the last two). In the second invocation, there are two parameters, the second consisting of the text 5, 4, 3.
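
The text delimiters and the literal character operator often appear together. The sketch below passes a single parameter that contains both a comma and a ">" character. ShowMsg is a hypothetical macro written only for this illustration, and the expansion shown in the comment is what the rules described above should produce.

```asm
; ShowMsg-      Emits a zero terminated string.

ShowMsg         macro   Text
                byte    "&Text&", 0
                endm
                 .
                 .
                 .
                ShowMsg <X !> Y, always>    ;Emits byte "X > Y, always", 0
```

Without the exclamation mark, MASM would treat the first ">" as the end of the parameter text.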

The last macro operator of interest is the ";;" operator. This operator begins a macro comment. MASM normally copies all text from the macro into the body of the program during assembly, including all comments. However, if you begin a comment with ";;" rather than a single semicolon, MASM will not expand the comment as part of the code during macro expansion. This increases the speed of assembly by a tiny amount and, more importantly, it does not clutter a program listing with copies of the same comment (see "Controlling the Listing" on page 424 to learn about program listings).
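
For example, here is the COPY macro from earlier in this chapter with one comment of each kind (the comments themselves are additions for this illustration):

```asm
COPY            macro   Dest, Source
                ;; Macro comment: stays in the definition only.
                mov     ax, Source      ; Ordinary comment: copied on expansion.
                mov     Dest, ax
                endm
```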

Macro Operators
Operator        Description
&               Text substitution operator
< >             Literal text operator
!               Literal character operator
%               Expression operator
;;              Macro comment

8.14.6 A Sample Macro to Implement For Loops

Remember the for loops and matrix operations used in a previous example? At the conclusion of that section there was a brief comment that we could "improve" that code even more using macros, but the example had to wait. With the description of macro operators out of the way, we can now finish that discussion. The macros that implement the for loop are

; First, three macros that let us construct symbols by concatenating others.
; This is necessary because this code needs to expand several components in
; text equates multiple times to arrive at the proper symbol. 
;
; MakeLbl-      Emits a label created by concatenating the two parameters
;               passed to this macro.

MakeLbl         macro   FirstHalf, SecondHalf
&FirstHalf&&SecondHalf&:
                endm

jgDone          macro   FirstHalf, SecondHalf
                jg      &FirstHalf&&SecondHalf&
                endm

jmpLoop         macro   FirstHalf, SecondHalf
                jmp     &FirstHalf&&SecondHalf&
                endm

; ForLp-        This macro appears at the beginning of the for loop. To invoke
;               this macro, use a statement of the form:
;
;               ForLp   LoopCtrlVar, StartVal, StopVal
;
; Note: "FOR" is a MASM reserved word, which is why this macro doesn't
; use that name.

ForLp           macro   LCV, Start, Stop

; We need to generate a unique, global symbol for each for loop we create.
; This symbol needs to be global because we will need to reference it at the
; bottom of the loop. To generate a unique symbol, this macro concatenates
; "$$For" with the name of the loop control variable and a unique numeric value
; that this macro increments each time the user constructs a for loop with the
; same loop control variable.

                ifndef  $$For&LCV&      ;;Symbol = $$FOR concatenated with LCV
$$For&LCV&      =       0               ;;If this is the first loop w/LCV, use
                else                    ;; zero, otherwise increment the value.
$$For&LCV&      =       $$For&LCV& + 1
                endif

; Emit the instructions to initialize the loop control variable:

                mov     ax, Start
                mov     LCV, ax

; Output the label at the top of the for loop. This label takes the form
;               $$FOR LCV x
; where LCV is the name of the loop control variable and X is a unique number
; that this macro increments for each for loop that uses the same loop control
; variable.

                MakeLbl $$For&LCV&, %$$For&LCV&

; Okay, output the code to see if this for loop is complete.
; The jgDone macro generates a jump (if greater) to the label the
; Next macro emits below the bottom of the for loop.

                mov     ax, LCV
                cmp     ax, Stop
                jgDone  $$Next&LCV&, %$$For&LCV&
                endm

; The Next macro terminates the for loop. This macro increments the loop
; control variable and then transfers control back to the label at the top of
; the for loop.

Next            macro   LCV
                inc     LCV
                jmpLoop $$For&LCV&, %$$For&LCV&
                MakeLbl $$Next&LCV&, %$$For&LCV&
                endm

With these macros and the LDAX/STAX macros, the code from the array manipulation example presented earlier becomes very simple:

                ForLp   I, 0, 15
                ForLp   J, 0, 6

                ldax    A, I, J         ;Fetch A[I][J]
                mov     bx, 15          ;Compute 15-I.
                sub     bx, I
                ldax    b, bx, J, imul  ;Multiply in B[15-I][J].
                stax    x, J, I         ;Store to X[J][I]

                Next    J
                Next    I

Although this code isn't quite as short as the original C/C++ example, it's getting pretty close!

While the main program became much simpler, there is the question of the macros themselves. The ForLp and Next macros are extremely complex! If you had to go through this much effort every time you wanted to create a macro, using macros would make assembly language programs far harder to write, not easier. Fortunately, you only have to write (and debug) a macro like this once. Then you can use it as many times as you like, in many different programs, without having to worry much about its implementation.

Given the complexity of the ForLp and Next macros, it is probably a good idea to carefully describe what each statement in these macros is doing. However, before discussing the macros themselves, we should discuss exactly how one might implement a for/next loop in assembly language. This text fully explores the for loop a little later, but we can certainly go over the basics here. Consider the following Pascal for loop:

        for variable := StartExpression to EndExpression do
                Some_Statement;

Pascal begins by computing the value of StartExpression and assigning this value to the loop control variable (variable). It then evaluates EndExpression and saves this value in a temporary location. Then the Pascal for statement enters the loop's body. The first thing the loop does is compare the value of variable against the value it computed for EndExpression. If the value of variable is greater than the value of EndExpression, Pascal transfers control to the first statement after the for loop; otherwise it executes Some_Statement. After the Pascal for loop executes Some_Statement, it adds one to variable and jumps back to the point where it compares the value of variable against the computed value for EndExpression. Converting this code directly into assembly language yields the following code:

;Note: This code assumes StartExpression and EndExpression are simple variables.
;If this is not the case, compute the values for these expressions and place
;them in these variables.

                mov     ax, StartExpression
                mov     Variable, ax
ForLoop:        mov     ax, Variable
                cmp     ax, EndExpression
                jg      ForDone

        <Code for Some_Statement>

                inc     Variable
                jmp     ForLoop
ForDone:
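
As a concrete illustration of the note above, here is a minimal sketch of the same pattern for a loop whose bounds are expressions, such as the Pascal statement "for I := J+1 to K*2". The variables I, J, K, and EndTemp and the labels ForLoop2 and ForDone2 are hypothetical names chosen for this example; the sketch assumes all four variables are signed 16-bit words declared elsewhere and that the arithmetic does not overflow.

                mov     ax, J           ;Compute StartExpression (J+1)
                inc     ax              ; and store it into the loop
                mov     I, ax           ; control variable.
                mov     ax, K           ;Compute EndExpression (K*2) once
                shl     ax, 1           ; and save it in a temporary.
                mov     EndTemp, ax
ForLoop2:       mov     ax, I
                cmp     ax, EndTemp     ;Compare against the saved value.
                jg      ForDone2

        <Code for Some_Statement>

                inc     I
                jmp     ForLoop2
ForDone2:

Because the end value is computed once, before the loop begins, changing K inside the loop body does not change the number of iterations. This matches the Pascal behavior described above.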

To implement this as a set of macros, we need to be able to write a short piece of code that will write the above assembly language statements for us. At first blush, this seems easy. Why not use the following code?

ForLp           macro   Variable, Start, Stop
                mov     ax, Start
                mov     Variable, ax
ForLoop:        mov     ax, Variable
                cmp     ax, Stop
                jg      ForDone
                endm

Next            macro   Variable
                inc     Variable
                jmp     ForLoop
ForDone:
                endm

These two macros would produce correct code - exactly once. However, a problem develops if you try to use these macros a second time. This is particularly evident when using nested loops:

                ForLp   I, 1, 10
                ForLp   J, 1, 10
                 .
                 .
                 .
                Next    J
                Next    I

The macros above emit the following 80x86 code:

                mov     ax, 1           ;The ForLp I, 1, 10
                mov     I, ax           ; macro emits these
ForLoop:        mov     ax, I           ; statements.
                cmp     ax, 10          ;       .
                jg      ForDone         ;       .

                mov     ax, 1           ;The ForLp J, 1, 10
                mov     J, ax           ; macro emits these
ForLoop:        mov     ax, J           ; statements.
                cmp     ax, 10          ;        .
                jg      ForDone         ;        .
                 .
                 .
                 .
                inc     J               ;The Next J macro emits these
                jmp     ForLoop         ; statements.
ForDone:
                inc     I               ;The Next I macro emits these
                jmp     ForLoop         ; statements.
ForDone:

The problem, evident in the code above, is that each time you use the ForLp macro you emit the label "ForLoop" to the code. Likewise, each time you use the Next macro, you emit the label "ForDone" to the code stream. Therefore, if you use these macros more than once (within the same procedure), you will get a duplicate symbol error. To prevent this error, the macros must generate unique labels each time you use them. Unfortunately, the local directive will not work here. The local directive defines a unique symbol within a single macro invocation. If you look carefully at the code above, you'll see that the ForLp macro emits a symbol that the code in the Next macro references. Likewise, the Next macro emits a label that the ForLp macro references. Therefore, the label names must be global since the two macros can reference each other's labels.
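
To see why local does not help, consider this sketch of the simple ForLp macro with its labels declared local. This is a hypothetical variant for illustration only, not the macro this section develops. The names ??0000 and ??0001 in the expansion assume that MASM has not generated any other local symbols earlier in the assembly.

ForLp           macro   Variable, Start, Stop
                local   ForLoop, ForDone
                mov     ax, Start
                mov     Variable, ax
ForLoop:        mov     ax, Variable
                cmp     ax, Stop
                jg      ForDone
                endm

; ForLp I, 1, 10 would then expand to something like:

                mov     ax, 1
                mov     I, ax
??0000:         mov     ax, I
                cmp     ax, 10
                jg      ??0001          ;Nothing ever defines ??0001.

The Next macro has no way to learn the name ??0000, so its jmp instruction cannot get back to the top of the loop. Likewise, whatever label Next emits cannot match the ??0001 that the jg instruction expects.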

The solution the actual ForLp and Next macros use is to generate globally known labels of the form "$$For" + "variable name" + "some unique number" and "$$Next" + "variable name" + "some unique number". For the example given above, the real ForLp and Next macros would generate the following code:

                mov     ax, 1           ;The ForLp I, 1, 10
                mov     I, ax           ; macro emits these
$$ForI0:        mov     ax, I           ; statements.
                cmp     ax, 10          ;       .
                jg      $$NextI0        ;       .

                mov     ax, 1           ;The ForLp J, 1, 10
                mov     J, ax           ; macro emits these
$$ForJ0:        mov     ax, J           ; statements.
                cmp     ax, 10          ;        .
                jg      $$NextJ0        ;        .
                 .
                 .
                 .
                inc     J               ;The Next J macro emits these
                jmp     $$ForJ0         ; statements.
$$NextJ0:
                inc     I               ;The Next I macro emits these
                jmp     $$ForI0         ; statements.
$$NextI0:

The real question is, "How does one generate such labels?"

Constructing a symbol of the form "$$ForI" or "$$NextJ" is pretty easy. Just create a symbol by concatenating the string "$$For" or "$$Next" with the loop control variable's name. The problem occurs when you try to append a numeric value to the end of that string. The actual ForLp and Next code accomplishes this by creating an assembly-time variable with a name of the form "$$Forvariable_name" and incrementing this variable for each loop that uses the given loop control variable name. By calling the macros MakeLbl, jgDone, and jmpLoop, ForLp and Next output the appropriate labels and ancillary instructions.
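
For example, here is a sketch of what two consecutive loops that share the control variable I should expand to, following the description above. The loop bounds are arbitrary values picked for illustration.

                ForLp   I, 1, 10        ;First loop: $$ForI is set to 0.
                 .
                Next    I
                ForLp   I, 1, 5         ;Second loop: $$ForI becomes 1.
                 .
                Next    I

Given the macro definitions above, these statements should produce code along these lines:

                mov     ax, 1           ;The ForLp I, 1, 10
                mov     I, ax           ; macro emits these
$$ForI0:        mov     ax, I           ; statements.
                cmp     ax, 10
                jg      $$NextI0
                 .
                inc     I               ;The first Next I emits these
                jmp     $$ForI0         ; statements.
$$NextI0:
                mov     ax, 1           ;The ForLp I, 1, 5
                mov     I, ax           ; macro emits these
$$ForI1:        mov     ax, I           ; statements.
                cmp     ax, 5
                jg      $$NextI1
                 .
                inc     I               ;The second Next I emits these
                jmp     $$ForI1         ; statements.
$$NextI1:

Note that Next builds its labels from the current value of $$ForI. The scheme therefore relies on each Next appearing before the next ForLp that uses the same loop control variable.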

The ForLp and Next macros are very complex, far more complex than the macros you would typically find in a program. They do, however, demonstrate the power of MASM's macro facilities. By the way, there are much better ways to create these symbols using macro functions. We'll discuss macro functions next.

Art of Assembly: Chapter Eight - 26 SEP 1996
