home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
Fresh Fish 8
/
FreshFishVol8-CD2.bin
/
bbs
/
dev
/
oberon-a-1.4ß.lha
/
Oberon-A
/
texts
/
OP2.Paper.Text
< prev
next >
Wrap
Text File
|
1994-03-10
|
26KB
|
569 lines
OP2: A PORTABLE OBERON-2 COMPILER
Presented at the 2nd International Modula-2 Conference, Loughborough,
Sept 91
Regis CRELIER
Institut für Computersysteme, ETH Zürich
ABSTRACT
A portable compiler for the language Oberon-2 is presented. Most related
works pay for portability with low compilation speed or poor code
quality. Portability and efficiency have been given the same importance
in our approach. Hence, an automated retargetable code generation has
not been considered. The compiler consists of a front-end and a
back-end. The front-end does the lexical and syntactic analysis,
including type checking. It builds a machine-independent structure
representing the program. This structure is made up of a symbol table
and an abstract syntax tree, rather than a stream of pseudo-instructions
coded in an "intermediate language". If no errors are found, control is
passed to the back-end which generates code from this intermediate
structure. This structure clearly separates the front-end which is
machine-independent from the back-end which is machine-dependent. While
the front-end can remain unchanged, the back-end has to be reprogrammed,
when the compiler is retargeted to a new machine.
This compiler has been successfully used to port the Oberon System onto
different computers. Code generators have been implemented both for CISC
and RISC processors. Differences in processor architectures are
reflected in the complexity of the back-end, the generated code density
and performance. The compiler is written in Oberon. New compilers have
therefore to be first compiled on an already working Oberon System. If
such a system is not available, a version of the compiler whose back-end
produces C code may be used for the bootstrap.
The compilation techniques presented here are not restricted to Oberon
compilers, but could be used for other programming languages too.
Nevertheless, Oberon and OP2 tend to the same ideal: simplicity,
flexibility, efficiency and elegance.
INTRODUCTION
Portability is an important criterion for program quality. A compiler is
a program as well, and it may be ported. If it should produce the same
code as before on the new machine (cross compiler), then it is not more
difficult to port it than any other program also written in a higher
programming language. But if the produced code must run on the new
machine, the compiler has to be rewritten and it is not the same program
any more. In that sense, the compiler is not, and cannot be, portable.
By the term portable compiler , we refer here to a compiler that needs
reasonably small effort to be adapted to a new machine and/or to be
modified to produce new code. Most related work attempt to reduce this
adaptation cost to a minimum, compromising the compilation speed and the
code quality. A classification of such automated retargetable code
generation techniques and a survey of the works on those techniques are
presented in [1]. The basic idea is to produce code for a virtual
machine. This code is then expanded into real machine instructions. The
expansion can be done by hand-written translators [2] or by a
machine-independent code generation algorithm, in which case each
intermediate language instruction [3] or each recognized pattern of
these [4] is expanded into a fixed pattern of target machine
instructions recorded in tables. Trees may replace linear code to feed
the pattern matching algorithm [5, 6, 7]. These techniques usually yield
poor code quality, making a peephole optimization phase necessary, which
further increases the compilation time.
In our approach, we tried to find the right balance between code
quality, compilation speed and portability. We think it is worth-while
investing, say three man-months, for a port, if the resulting compiler
is very fast and produces efficient code. Thus, a pattern matching or
table-driven code generation has not been considered. Instead, we looked
at more conventional and faster techniques, such as single-pass
compilation [8, 9]. In a single-pass compiler, the compilation phases
are executed simultaneously. No intermediate representation of the
source text is needed between the different phases, making the compiler
compact and efficient, but not very portable. Indeed, since
machine-dependent and machine-independent phases are mixed up, it is
very difficult to modify the compiler for a new machine.
One solution to the problem is to clearly separate the compilation
phases into two groups: the front-end consisting of the
machine-independent phases (lexical and syntactic analysis, type
checking) and the back-end consisting of the machine-dependent phases
(storage allocation, code generation). Only the back-end must be
modified when the compiler is ported. The front-end enters declarations
in a symbol table and builds an intermediate representation of the
program statements, an abstract syntax tree. If no errors were found,
control is passed to the back-end, which generates code from the syntax
tree. Since this structure is guaranteed to be free of errors, type
checking or error recovery are not part of the back-end, which is a
noteworthy advantage. Only implementation restrictions must be checked.
Another advantage of the intermediate structure is that optional passes
may be inserted to optimize the code. Such an optimization phase cannot
be easily embedded in a conventional single-pass compiler. The front-end
and the back-end are implemented separately as a set of modules.
MODULE STRUCTURE
Originally, OP2 has been designed to compile Oberon programs [10] and
has been slightly modified later to compile Oberon-2 programs [11, 12].
It consists of nine modules (see figure 1) all written in Oberon.
The lowest module of the hierarchy is OPM , where M stands for machine.
We must distinguish between the host machine on which the compiler is
running, and the target machine for which the compiler is generating
code. Most of the time, the two machines are the same, except during a
bootstrap or in case of a cross-compiler. The module OPM defines and
exports several constants used to parametrize the front-end. Some of
these constants reflect target machine characteristics or implementation
restrictions. For example, these values are used in the front-end to
check the evaluation of constant expressions on overflow. But OPM has a
second function. It works as interface between the compiler and the host
machine. This interface includes procedures to read the text to be
compiled, to read and write data in symbol files [13], and to display
text (error messages e.g.) onto the screen. All these input and output
operations are strongly dependent on the operating system. If the
compiler resides in the Oberon System environment [14, 15], the
host-dependent part of OPM remains unchanged.
The topmost module (OP2) is very short. It is the interface to the user,
and therefore host machine-dependent. It first calls the front-end with
the source text to be compiled as parameter. If no error is detected, it
then calls the back-end with the root of the tree that was returned by
the front-end as parameter.
Figure 1. Module import graph (an arrow from A to B means B imports A)
Between the highest and the lowest module, one finds the front-end and
the back-end, which consist of four, respectively three modules. There
is no interaction during compilation between these two sets of modules.
The symbol table and the syntax tree are defined in module OPT and are
used by both the front-end and the back-end. This explains the presence
of import arrows from OPT to back-end modules visible in the import
graph (figure 1). But there is no transfer of control, such as procedure
calls.
The front-end is controlled by the module OPP, a recursive-descent
parser. Its main task is to check syntax and to call procedures to
construct the symbol table and the syntax tree. The parser requests
lexical symbols from the scanner (OPS) and calls procedures of OPT,
the symbol table handler, and of OPB, the syntax tree builder. OPB
also checks type compatibility.
The back-end is controlled by OPV, the tree traverser. It first
traverses the symbol table to enter machine-dependent data (usi