Acorn User

ARM CODE TUTORIAL

by Martin Penny

Part 7 - Pitfalls And Problems

In previous parts of this series, I've gone over the ARM instruction set and covered a number of points where assembly language may be useful. In this part, I'll take a slightly different line - describing a couple of quirks with assembly language, some "points-to-remember", and general error-handling.

The first step is to mention a couple of these "quirks". Like most modern CPUs, the various ARM processors have fairly complex designs, and this means that certain instruction sequences can have some unexpected side-effects; these side-effects can easily catch out the unwary. Admittedly, many of these sequences are logical, and most can be trapped by assemblers, but a "belt-and-braces" approach is not only the "safest" method of writing code, but probably also good practice.

The first "quirk" has already been mentioned - it is the "NV" ("never") condition code; it was originally included in the ARM instruction set for both symmetry and completeness. It never really did have any great use, except as a way of temporarily "masking out" code and turning it into sequences of "NOP" instructions, but even that has been superseded - ARM Ltd. recommends the use of the general "NOP" instruction "MOV R0,R0" instead of using "NV".

On to the next point, and something that most people are only going to encounter when either writing modules or using the "OS_EnterOS" software interrupt routine. The problem relates to the ARM's use of processor modes, and the private registers available in each mode. The registers "R0" to "R7" and "R15" are common to each processor mode, but each mode has its own copy of both "R13" and "R14"; the ARM's "FIQ" ("fast interrupt") mode also has private copies of "R8" to "R12", whilst all other modes share the same, standard, copies of these latter registers. As I mentioned in my coverage of the ARM instruction set, the "LDM" and "STM" instructions have forms that can be used in privileged modes to load or store the user-mode registers, rather than the privileged versions; similarly, the "P" option available with the compare and test instructions can be used to change processor mode, although this only works as expected in "26-bit" privileged modes.

The problem surfaces when you try to access one of these banked registers in the clock-cycle immediately after such an action; during this period of time, the register bank is settling, and it cannot be guaranteed which version of the banked register - user or privileged - is visible. To this end, any instruction which switches register bank in this way should be followed by a "NOP" ("MOV R0,R0") instruction to ensure "correct" access to banked registers in later instructions.
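The banking arrangement described above can be pictured as a small lookup table. What follows is a Python sketch of that table rather than ARM code; the function name "physical_register" and the "Rn_mode" naming of the results are my own illustrative choices, and the sketch assumes the four processor modes of the "26-bit" ARMs (user, FIQ, IRQ and supervisor).

```python
# Toy model of register banking on a "26-bit" ARM (names are illustrative only).

# Registers that have a private copy in each privileged mode.
BANKED = {
    "usr": set(),
    "fiq": set(range(8, 15)),   # R8 to R14
    "irq": {13, 14},
    "svc": {13, 14},
}

def physical_register(mode, n):
    """Return a name for the physical register seen as Rn in the given mode."""
    if mode not in BANKED:
        raise ValueError(f"unknown mode: {mode}")
    if not 0 <= n <= 15:
        raise ValueError(f"no such register: R{n}")
    if n < 8 or n == 15:
        return f"R{n}"              # R0-R7 and R15 are common to every mode
    if n in BANKED[mode]:
        return f"R{n}_{mode}"       # this mode's private copy
    return f"R{n}_usr"              # otherwise the standard (user) copy

print(physical_register("svc", 14))   # the supervisor mode's own R14
print(physical_register("irq", 8))    # IRQ mode shares the standard R8
```

Looking up the same register number in two different modes shows at a glance which copy an instruction would touch once the register bank has settled.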

Just to illustrate the point, figure 1 is some "pseudo-code" to show how "OS_EnterOS" and its corresponding "TEQP" would be used, though there are a couple of points to remember here. Firstly, while the ARM processor is in "SVC" ("supervisor") mode, you are using the system (RISC OS) stack via "R13", so be extremely careful about handling stack-based data. Secondly, if you need to use a software interrupt routine from within "SVC" mode, you must preserve the contents of "R14".

-- Figure 1 --

The reasoning behind this last point lies with the way that the ARM processors handle software interrupts. When a "SWI" instruction is executed, the ARM processor switches to supervisor mode, using "R13_svc" as the stack pointer and storing the return address in "R14_svc", the supervisor mode's private versions of "R13" and "R14"; when the software interrupt returns control to the calling program, the contents of "R14_svc" are copied back into "R15". Due to the way flags are handled, the precise method used depends on whether the code is running in "26-bit" or "32-bit" address space, but the net result is that the ARM processor switches back to user mode, also switching back to using "R13_usr" and "R14_usr", the user mode's "R13" and "R14".

The corollary of this is that if the ARM processor is already in supervisor mode when the "SWI" instruction is executed, the old contents of "R14_svc" are overwritten and lost, making it effectively impossible to restore control properly after returning from the software interrupt routine. To overcome this problem, "R14_svc" must be pushed onto the stack before a "SWI" instruction is used, and pulled back off the stack afterwards; this has the effect of preserving the register's contents properly. Figure 2 is a code fragment for a module's "*"-command, showing this procedure in use.

-- Figure 2 --
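Separately from the figure, the effect can be shown with a toy model. This is a Python sketch, not the module code from figure 2; the class "ToyCPU", the function names and all of the addresses are invented for illustration, and the model tracks nothing beyond the current mode and the two copies of "R14".

```python
class ToyCPU:
    """Tracks only the processor mode and the user/supervisor copies of R14."""

    def __init__(self):
        self.mode = "usr"
        self.r14 = {"usr": 0, "svc": 0}
        self.stack = []   # stands in for the stack addressed by R13_svc

    def swi(self, return_address):
        """Model a SWI: the return address always lands in R14_svc."""
        caller_mode = self.mode
        self.mode = "svc"
        self.r14["svc"] = return_address
        # ... the software interrupt routine would do its work here ...
        self.mode = caller_mode   # returning restores the caller's mode


def call_swi_from_usr():
    cpu = ToyCPU()
    cpu.r14["usr"] = 0x8100           # hypothetical subroutine return address
    cpu.swi(return_address=0x8004)
    return cpu.r14["usr"]             # untouched: the SWI used R14_svc


def call_swi_from_svc(preserve_r14):
    cpu = ToyCPU()
    cpu.mode = "svc"
    cpu.r14["svc"] = 0x9000           # hypothetical return address of our own code
    if preserve_r14:
        cpu.stack.append(cpu.r14["svc"])   # push R14_svc before the SWI
    cpu.swi(return_address=0x9104)
    if preserve_r14:
        cpu.r14["svc"] = cpu.stack.pop()   # pull it back afterwards
    return cpu.r14["svc"]


print(hex(call_swi_from_usr()))
print(hex(call_swi_from_svc(preserve_r14=False)))
print(hex(call_swi_from_svc(preserve_r14=True)))
```

Only the unprotected call from supervisor mode comes back with the wrong value in "R14", which is exactly the case the push and pull are there to guard against.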

This brings me nicely to my next "quirk" of the ARM processors - nested subroutine calls. Each use of the "BL" instruction copies the return address into "R14", so using a "BL" within a subroutine without saving the existing return address in "R14" leads to this existing return address being lost. This means that the contents of "R14" have to be pushed onto the stack before the nested "BL", and restored afterwards. Figure 3 is some "pseudo-code" that outlines how this is done.

-- Figure 3 --
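Independently of the figure, the fate of the return address can be traced in a few lines of Python. This is only a sketch of the bookkeeping, not ARM code: the function name "run_call_chain" and the labels "main+4" and "outer+8" are made up, standing for the addresses that "BL" would place in "R14".

```python
def run_call_chain(save_lr):
    """Trace where each return lands when main calls outer and outer calls inner."""
    stack = []      # stands in for the full, descending stack addressed by R13
    returns = []    # where each subroutine return actually ends up

    lr = "main+4"               # main: BL outer
    if save_lr:
        stack.append(lr)        # outer pushes R14 on entry
    lr = "outer+8"              # outer: BL inner overwrites R14
    returns.append(lr)          # inner returns by copying R14 into R15
    if save_lr:
        lr = stack.pop()        # outer restores R14 before returning
    returns.append(lr)          # outer returns by copying R14 into R15
    return returns


print(run_call_chain(save_lr=True))
print(run_call_chain(save_lr=False))
```

With the push and pull in place the second return goes back to the main program; without them, the outer subroutine "returns" into its own body, because "R14" still holds the address left there by the nested "BL".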

The last major "quirk" has to do with register usage; on this note, I'll go into more detail on the subject of register usage later on, but, for now, I'll stick to some "dos" and "don'ts". "R13", "R14" and "R15" all have specific uses, as already mentioned elsewhere; this asymmetric use of the register set - in that not all the registers are perfectly equivalent - leads to certain restrictions. Both "R13" and "R14" could be used as general-purpose registers, but only under certain conditions. If you are not using a stack, "R13" is freed up; however, most code assumes "R13" to be a valid pointer to a full, descending stack, so re-using "R13" can lead to problems. "R14" can be used if its contents have been preserved on the stack, but as subroutine return addresses are copied into "R14" by "BL" instructions, this is not a long-term option. "R15" is always the program counter, so cannot be used for other purposes.

Indeed, it is best to avoid using "R15" in any instructions, as far as is reasonably possible, as physical changes to the design of ARM processors can lead to code having to be rewritten - for example, the differences between "26-bit" and "32-bit" address spaces. The "safest" way of using "R15" directly is in "LDR" and "MOV" instructions with "R15" as the destination, and indirectly in the program counter-relative instructions similar to "ADR". In each of these cases, care has to be taken in how the other operands are put together, to avoid potential, subtle, problems. The appendices of the PRMs give a lot more detail on this topic, with many examples of known "oddities" in the instruction sets of various ARM processors.
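As background to the "26-bit" versus "32-bit" point, it may help to see what "R15" actually holds in a "26-bit" mode: the flags, the program counter and the mode bits all share the one register. The Python sketch below unpacks such a value using the layout as I understand it from the ARM documentation (flags in bits 31 to 26, program counter in bits 25 to 2, mode in bits 1 and 0); the function name "decode_r15_26bit" and the sample value are purely illustrative.

```python
# Bit positions within a "26-bit" R15.
FLAG_BITS = {"N": 31, "Z": 30, "C": 29, "V": 28, "I": 27, "F": 26}
MODE_NAMES = {0: "usr", 1: "fiq", 2: "irq", 3: "svc"}
PC_MASK = 0x03FFFFFC   # bits 25 to 2 hold the word-aligned program counter

def decode_r15_26bit(r15):
    """Split a 26-bit R15 value into program counter, mode and flags."""
    return {
        "pc": r15 & PC_MASK,
        "mode": MODE_NAMES[r15 & 3],
        "flags": {name: bool((r15 >> bit) & 1) for name, bit in FLAG_BITS.items()},
    }

fields = decode_r15_26bit(0x10008003)   # made-up value for illustration
print(hex(fields["pc"]), fields["mode"], fields["flags"])
```

In a "32-bit" mode the flags and mode bits live in a separate status register instead, so any code that relies on finding them inside "R15" has to be rewritten.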

Another point of "good practice" crops up within subroutines. I've already made a couple of comments regarding saving the contents of "R14" on the stack during a subroutine, with the easiest way of doing this being given in figure 3. This particular point is that other registers used in the subroutine are also best saved on the stack at the same time as "R14", and restored at the end of the subroutine. "R1" is one register you'll probably want to save this way, especially if you are writing a Desktop application, when "R1" is usually used to point to parameter blocks.

On to another topic now - error handling. Trapping errors within BASIC is fairly straightforward; as BASIC programs are interpreted, the program's code isn't being executed directly by the ARM processor. This means that errors passed back to the program from RISC OS routines are rarely completely fatal, even without a user-written error-handling routine. If the program in question is compiled from "C" or "C++" source code, the compiler will tend to add code to allow errors internal to the program to be trapped and reported more easily. This, however, doesn't guarantee reliability, as many of you will agree! What goes for compiled code goes for assembler, but in spades - this error-handling code has to be added manually to the program to ensure reliability; more than once, I've had "development" versions of my programs lock up on me, even going as far as completely freezing RISC OS.

Back when I covered the "SWI" instruction, I described how the 24-bit "parameter" was decoded by RISC OS, with one bit of the parameter - bit 17 - being used to control how errors are reported. If bit 17 is clear, RISC OS handles reporting any errors and - if and as necessary - terminating relevant applications. On the other hand, if bit 17 is set, RISC OS does not report any errors, but just returns control back to the application using the "SWI" routine. If control is returned back to the application in this way, the "V" flag is also set to indicate an error; the corollary is that "SWI" routines normally return with the "V" flag clear. The whole point of this option is to allow the programmer a certain degree of control over how and when errors are reported. For example, when writing a Desktop application, you will need to use various "OS_name" routines; if these are allowed to report errors freely, then any errors that crop up will, as already mentioned, usually cause the application to be terminated by RISC OS.
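The arithmetic behind bit 17 is simple enough to show directly. The following Python sketch just sets and tests that bit; the helper names "x_form" and "is_x_form" are my own, and the two "SWI" numbers used (&08 for "OS_File" and &400DF for "Wimp_ReportError") are quoted as I understand them from the PRMs.

```python
X_BIT = 1 << 17   # bit 17 of the SWI number, i.e. &20000

def x_form(swi_number):
    """Return the error-returning ("X") form of a SWI number."""
    return swi_number | X_BIT

def is_x_form(swi_number):
    """True if errors are passed back to the caller rather than reported."""
    return bool(swi_number & X_BIT)

OS_FILE = 0x08
WIMP_REPORTERROR = 0x400DF

print(hex(x_form(OS_FILE)))            # the number used for "XOS_File"
print(hex(x_form(WIMP_REPORTERROR)))
print(is_x_form(OS_FILE), is_x_form(x_form(OS_FILE)))
```

In assembler you would normally just write the "X" name and let the assembler look up the number, but seeing the bit set explicitly makes it clear that the two forms are the same routine with one flag changed.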

To illustrate this point, figure 4 gives a code fragment from my application, "!68Host". This particular subroutine is used to load a pair of files into memory from disc. Each of the calls to the "OS_File" routine is via the "X" version - "XOS_File" - and so errors are passed back to "!68Host" rather than being reported. This prevents any such errors from "leaking out" and causing the application to crash uncontrollably if, for example, a file doesn't exist. The first call to "OS_File" for each file to be loaded checks that the object exists, and that it is a file; the size of the file is then checked, before the file is finally loaded. If, however, an error is encountered at any point, the code passes control to the "ROMImages_Error%" error-handling routine. This particular piece of code is used to report the error itself, before the application tidies up and quits.

-- Figure 4 --

The "OS_GenerateError" routine can be used by CLI programs to generate errors; when used, "R0" is used as a pointer to an error block, which contains the error number and message text. This call is similar to "OS_Exit" in one major respect - it doesn't return control back to the program. If such a program wishes to just display an error message on the screen, any error block returned to the program by a software interrupt routine would have to be printed by the program itself.
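To make the shape of an error block concrete, here is a Python sketch that builds and unpacks one. It assumes the usual RISC OS layout - a four-byte, little-endian error number followed by the message text and a zero terminator; the function names and the error number used are invented for illustration and are not taken from any real program.

```python
import struct

def pack_error_block(number, message):
    """Build an error block: 4-byte error number, then zero-terminated text."""
    return struct.pack("<I", number) + message.encode("latin-1") + b"\0"

def unpack_error_block(block):
    """Split an error block back into its error number and message text."""
    (number,) = struct.unpack_from("<I", block, 0)
    end = block.index(b"\0", 4)          # the message starts at offset 4
    return number, block[4:end].decode("latin-1")

block = pack_error_block(0x1234, "File not found")   # hypothetical error number
print(block)
print(unpack_error_block(block))
```

A CLI program that wanted to print an error itself would do the equivalent of the unpacking step: skip the first word of the block and print the text up to the terminating zero.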

A Desktop application that wishes to report an error has the "Wimp_ReportError" routine; a couple of examples are given in figures 5 and 6. Figure 5 generates an error-box with just an "OK" button, while figure 6's box has both "OK" and "Cancel" buttons; because of this, the code in figure 6 uses the "V" flag to indicate which of the two buttons has been selected. The "Wimp_ReportError" routine always returns control back to the application that called it, allowing error boxes to be either just informative, or, at worst, giving the application a chance to tidy up after itself before quitting.

-- Figure 5 --

-- Figure 6 --

The last subject for this part of the series goes back to what I was saying earlier - register usage. There is a formal standard used by compilers and linkers to allow code written in a variety of programming languages to be combined together in the same program. The "Acorn Procedure Call Standard" - "APCS" - has a number of slight variants, the most relevant of which is "APCS-R", used within RISC OS. Part of the standard defines aliases for the registers, as given in figure 7; the aliases are used to organise the usage of the registers. "pc", "lr" and "sp" are all familiar and have their obvious uses, while "sl" is used to define the lower limit of the stack. "ip" is a workspace pointer, and "fp" points to parameters stored on the stack. "v1" to "v6" are used as general-purpose variables, but must be preserved across the subroutine or function, while "a1" to "a4" are used to pass the first four (integer) parameters to the subroutine or function, but need not be preserved by it. There is also a similar set of aliases - not given here - for the floating-point registers.

-- Figure 7 --
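For reference, the alias-to-register mapping can also be written out as a simple table. The Python sketch below is not a reproduction of figure 7; the register numbers are those of "APCS-R" as I understand it from the PRMs, and the names "APCS_R_ALIASES", "ARGUMENT_REGISTERS", "CALLEE_PRESERVED" and "alias_of" are my own.

```python
# Register aliases under APCS-R (register numbers as documented in the PRMs).
APCS_R_ALIASES = {
    "a1": 0, "a2": 1, "a3": 2, "a4": 3,                    # arguments
    "v1": 4, "v2": 5, "v3": 6, "v4": 7, "v5": 8, "v6": 9,  # variables
    "sl": 10,   # stack limit
    "fp": 11,   # frame pointer
    "ip": 12,   # workspace pointer
    "sp": 13,   # stack pointer
    "lr": 14,   # link register
    "pc": 15,   # program counter
}

# The two groups whose treatment is spelled out in the text above.
ARGUMENT_REGISTERS = ["a1", "a2", "a3", "a4"]            # need not be preserved
CALLEE_PRESERVED = ["v1", "v2", "v3", "v4", "v5", "v6"]  # must be preserved

def alias_of(register_number):
    """Return the APCS-R alias for a register number from 0 to 15."""
    for alias, number in APCS_R_ALIASES.items():
        if number == register_number:
            return alias
    raise ValueError(f"no such register: R{register_number}")

print(alias_of(13), alias_of(10), [alias_of(n) for n in range(4)])
```

Keeping the two groups as separate lists mirrors the split made in the standard: a subroutine may freely corrupt the argument registers, but anything it does to the variable registers has to be undone before it returns.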

Another part of the "APCS" standard sets down which piece of code - the caller or the callee - is responsible for preserving registers that should be preserved; it is the callee - the subroutine or function - that is responsible for this. The rest of the standard covers in more detail how parameters are passed to subroutines and functions, and also how results are returned to the calling code. All this ensures that - as said earlier - it is reasonably straightforward to combine assembler, "C" and "C++" code, as well as code from other languages - for example, "FORTRAN" maths libraries. The PRMs contain a great deal more detail on this subject, including how parameters are stored on the stack and which registers should be preserved by a subroutine.

That wraps it up for the seventh part of my series on assembly language; in the eighth and final part of this series, I will cover some aspects of programming within RISC OS.


This CD and its design is Copyright © 2000 Tau Press Limited. It may not be copied or distributed without the prior consent of Tau Press. Failure to abide by this may result in prosecution. (That doesn't mean the contents are our copyright, just the linking pages that we created and the CD itself.)