Text File | 1989-02-25 | 49KB | 1,167 lines
PCQ version 1.0
A very simple Pascal compiler for the Amiga
by Patrick Quaid
PCQ (which stands for Pascal Compiler, um, Q ... look, I
couldn't come up with a name so I used my initials, OK?) is a modest
Pascal sub-set compiler that produces assembly code. It is not in
the Public Domain (I retain the copyright to the source code, the
compiler, the run time library source code, the run time library,
and this documentation), but it can be freely distributed as long as
all the files in the archive are included (with the possible
exception of the assembler and linker) and unchanged. The compiler
is slow, and it can't handle a couple of things, but all in all it's
worth the price. To summarize:
The bad:
The compiler is awfully slow.
It doesn't allow range types.
It doesn't support the 'with' statement or sets.
Multiplication and division are done the easy way, which
results in an odd mixture of 16 and 32 bit math. This
will be fixed before the next release.
The code is not optimized at all. It is, therefore, slow,
fat and generally silly looking.
Programs produced by PCQ can be run only from the CLI.
This will be fixed fairly soon.
The compiler gets knocked for a loop by most errors.
The good:
It works, for the most part.
The compiler supports include files.
It allows for external references, although you have to
do the checking (this isn't Modula-2, after all).
It supports records, enumerated types, pointers, arrays,
and strings.
Type conversion as found in Modula-2 is supported. In
other words, something like "integer('d')" is
legal.
You can have as many const, var, type, procedure and
function blocks as you want, in any order.
It's free.
Table of Contents
This manual is intended to be read with a file reader or
text editor, so this table of contents is based on line numbers
rather than page numbers.
Section                                 Line number
How To Use PCQ ........................ 89
An Examination of Its Ills ............ 179
Predefined Stuff ...................... 276
    Constants ......................... 303
    Types ............................. 340
    Variables ......................... 380
    Functions ......................... 396
    Procedures ........................ 436
    Extra Statements .................. 479
    The extra libraries ............... 516
Reserved Words ........................ 529
Floating Point Math ................... 555
The Limits of PCQ ..................... 601
Strings ............................... 626
Compiler Directives ................... 676
Type Conversions ...................... 718
External References ................... 761
Input/Output .......................... 842
Errors ................................ 996
Run Time Errors ....................... 1022
Sources ............................... 1039
Notes to Assembly Programmers ......... 1083
Improvements On The Burner ............ 1098
Other Notes, Copyright & My Address ... 1123
How To Use PCQ
There are several files in this archive you will need to
copy over to your work disk. The compiler (Pascal) is one, of
course, as well as the run time library (called PCQ.lib- there's a
readme file in the archive that explains all the file names, by the
way). If you do not have the assembler (A68k) and linker (Blink),
you'll have to copy them as well (they might not have been included
in this archive, but should be available on a local bulletin board
or on Fred Fish disks). These files are necessary for even the
simplest compilations.
The files with the suffix .p are example Pascal programs,
which you can copy over if you want. I spent a lot more time
working on the compiler than on these examples, but a couple of them
are interesting if you haven't seen programs like them before. They
demonstrate just about every aspect of the compiler that I could
think of, so you should probably take a look at them and then get rid
of them.
The files that end with .i are include files for a few of the
system libraries. They define the records, types, constants,
procedures, functions and variables needed to access the system.
These you probably should keep around. There is also an include file
for a few string routines. The code related to all these routines is
in the run time library.
In order to compile a program, first write one. Or use one
of the example programs. Then type:
1> Pascal prog.p prog.asm
'Pascal' is, of course, the name of the compiler. You can
change it if you want. 'prog.p' is the Pascal source file, which
can also be called whatever you want. The last word is the name of
the assembly file produced. At the moment these are the only
command line arguments allowed. By the way, the example programs
assume that the include files are in a directory called "Include",
which is actually a subdirectory of the current directory (in other
words, the programs will try to include "Include/exec.i" instead of
just "exec.i"). If this conflicts with your setup, just edit the
include statements at the start of the file. Assuming the
compilation completes without any errors, you then type:
1> A68k prog.asm prog.o
This invokes the assembler to produce object code. If the
archive included A68k it probably also included the documentation for
it, so read that for information about the assembler. If the
assembler was not included, get and use A68k by Charlie Gibbs,
version 1.2. A68k does lots of small scale optimization that the
code from PCQ might very well depend upon, so I don't claim that the
compiler works with any other assembler. Finally, you want to link
the program, so you type:
1> Blink prog.o small.lib to prog library PCQ.lib
This will produce a finished executable program called 'prog'.
All of the Pascal run time routines, Amiga system routines, and my
tiny little string library are contained in PCQ.lib. If any of the
routine names clash with ones you are working with, just be sure to
put your library or object file in front of PCQ.lib on the Blink
command line. If Blink was included in the archive its documentation
probably was as well, so read that to answer any questions you may
have about the link process.
I use Blink version 6.7, and again I assume that PCQ won't
work with any other linker or version. Small.lib is a library of
addresses written by Matt Dillon. Because of an apparent bug
in Blink, it has to be included in the object files, rather
than the libraries where it belongs. It won't increase the
size of your executable files, though.
Instead of all this business you could just use the 'make'
script that's included in the archive. You may have to change it
around a bit so that it looks in the proper directories and whatnot,
then through the magic of AmigaDOS 1.3 you should make it a script
file. Then you can invoke it like:
1> make prog
It will take the file 'prog.p' and produce the finished file
'prog'. If your program has separately compiled units, you'll need
to modify the batch file or write another. I recommend writing a
script file for any program you'll need to compile a few times. If
none of this makes any sense, write or call me and I'll try to give
you more coherent instructions.
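To try the three steps above on something small, a program along these lines should do. It is a minimal sketch written against the dialect this manual describes, not one of the archive's examples, and the file name hello.p is an assumption.

```pascal
Program Hello;

{ A minimal test program.  Note that there is no (input, output)
  list after the program name, and that strings use double quotes. }

begin
    writeln("Hello from PCQ");
end.
```

Saved as hello.p, it would be built with the same three commands shown above, substituting 'hello' for 'prog'.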
An Examination of Its Ills
I might as well get this over with right away. As was
mentioned earlier, sets and the 'with' statement do not work at all.
Another thing that's not accepted is syntax like:
    type
        smallnumber = 1..20;
PCQ doesn't do any overflow checking of this sort during
runtime, so this wouldn't mean much at the moment anyway. The
exception to this is in the declaration of arrays, where this syntax
is accepted and in fact there is some range checking available.
Read on for details. Something else that won't work is this:
    type
        WindowPtr = ^Window;
        Window = record
            NextWindow : WindowPtr;
            ...
It will fail on the first line with an 'Unknown ID' error.
Instead, use something like:
    type
        Window = record
            NextWindow : ^Window;
            ....
        end;
        WindowPtr = ^Window;
This is something I should get around to fixing, but it isn't
strictly necessary, so there you go....
Also note that PCQ does not require, and in fact
cannot accept, file variables in the Program statement at the
beginning of a program. In other words something like...
Program Tester(input, output);
...is not allowed. Just leave out everything in the
parentheses, then leave out the parentheses. PCQ assumes, at this
point, that you will need both Input and Output. Another feature of
standard Pascal that I'm not too hot on including is the "goto"
statement. Although both "label" and "goto" are reserved words, I
have not yet made them part of the language.
Literal real numbers (like "10.0") are not yet allowed in
PCQ. In the next version real numbers will be more fully supported,
but for now you must use the techniques described in the section
"Floating Point Math".
The compiler will not yet allow variant records. This I will
fix pretty soon, since the next version of the compiler will probably
require them. In character array constants, PCQ does not accept the
two single quotes in a row that are supposed to signify one quote
within the array. Instead it offers you a different type that I'll
get to in a moment.
The compiler is written in PCQ Pascal, of course, and
therefore exhibits some of its problems. One of these is that,
although integers are 32 bits long, the compiler will
misunderstand any literal integer in the text of your program
that is greater than about 100,000 (actually it's much more than this,
but I figure this is easier to remember). At this point it cannot
properly read or write them either. This will be fixed when I
add full 32 bit math support, but the temporary fix is to use
hexadecimal numbers. With these you can specify any 32 bit number,
using the normal dollar sign followed by 0..9 or a..f or A..F syntax.
If the compiler has to write a large number, it will use hexadecimal
in order to avoid these errors.
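As an illustration of the hexadecimal workaround, the sketch below gives one million as a hex constant rather than as a decimal literal. The program and identifier names are made up for the example, and it has not been run through PCQ.

```pascal
Program HexDemo;

const
    Million = $F4240;       { 1,000,000 written in hexadecimal }

var
    big : integer;

begin
    big := Million;         { a value too large to trust as a decimal literal }
    big := big + $10000;    { add 65,536, again given in hex }
end.
```

The value is deliberately not written out, since large integers cannot yet be read or written properly.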
Finally we get to nested procedures. Because of two
problems, I had long ago decided to leave them out. Although I
wrote most of the compiler with that in mind, it turns out that
they almost work, so I guess I'll have to address them. The
first problem is with their names. Using nested procedures in
Pascal it is possible to have two procedures with the same name,
but under different scopes. This is fine as far as the compiler
is concerned, but the assembler that takes over has only one
scope. Thus I should have made sure that the compiler produced
unique names for each procedure and function. This would have
been easy enough to take care of, but like I said I didn't even
consider the possibility. Next time, definitely.
The other problem is more complex than I really want to get
into, but it boils down to this: from a nested procedure, you
cannot access the local variables of parent procedures. You can
access the procedure's own local variables, its parameters, and the
variables global to the program. Again the compiler will not
complain (it is, after all, legal Pascal), but the program won't run
right. I know now how I'm going to take care of this, and it's
fairly simple, but fixing it means a whole new round of testing so
this release goes without it.
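The restriction on nested procedures can be pictured with a sketch like the following. All names are hypothetical, and it assumes nested procedures are otherwise accepted as described above; the commented-out line is the kind of access that would compile but misbehave.

```pascal
Program NestDemo;

var
    total : integer;        { global: safe to use from anywhere }

procedure Outer;
var
    count : integer;        { local to Outer }

    procedure Inner;
    begin
        total := total + 1; { fine: a global variable }
        { count := count + 1;   legal Pascal, but wrong at run time }
    end;

begin
    count := 0;
    Inner;
end;

begin
    total := 0;
    Outer;
    writeln(total);
end.
```

Until the problem is fixed, the practical choices are to move shared values to the global level, as here, or to pass them to the nested procedure as parameters.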
Predefined Stuff
I've arranged the predefined identifiers as they are supposed
to appear in Pascal. In PCQ, however, you can have these blocks
in any order, and you can have more than one of each. In other words,
your program could look like:
    Program name;
    var
        variable declarations
    type
        types
    var
        more variables
    procedure
        a procedure
    var
        still more variables....
And so on. An identifier must still be declared before it
is used, of course. I allowed this because it is a real pain to
arrange a bunch of different include files (each of the system
include files would have had to be split into four sections : the
constants, the types, the variables, and the procedures and
functions).
CONST
True and False are defined as -1 and 0, respectively.
Nil is defined as a pointer with the constant value zero,
but is not a reserved word as it is in standard Pascal.
Most places the compiler requires a constant, it will take a
constant expression (one that can be evaluated during the compile).
For example, the following will work:
    const
        first = 234;
        second = first * 2;
    type
        thetype = array [first .. first + 7] of char;
Unfortunately you cannot yet use standard functions, type
conversions, or other nifty things that you can do with expressions
in the program body. Just the five basic math functions (+, -, *,
div, mod), for now. Also note that 'first + 7' up there would be
evaluated during the compile, but the same text in the body of the
program would be evaluated during run time. In other words, there
is no such thing as constant folding yet.
When you are using integer constants, you can separate the
digits with an underscore, similar to Ada. In other words you
could have:
    const
        thousand = 1_000;
        tenthousand = 1_0_0_0_0;
MaxInt is defined as $7FFFFFFF, which comes out to something
over two billion. Don't try to write it. MaxShort is 32767, or
$7FFF in hex.
TYPE
There are several predefined types. They include:
    Integer     4 bytes, but only 16 bits of reliable range
                when doing multiplication and division. This
                will be fixed.
    Short       2 bytes. Literals within the program text are
                assumed to be short values unless they are greater
                than 32767 or less than -32767.
    Byte        1 byte. These three types are all numeric types, so
                you can use them in normal expressions without
                worrying about type conversions. The compiler
                automatically 'promotes' the small values to
                whatever size is required. Remember that there is
                currently no overflow checking.
    Char        1 byte.
    Boolean     1 byte. False is 0 and true is -1.
    String      4 bytes. Really just defined as '^char'. I will
                explain further in the section 'Strings'.
    Address     4 bytes. This is a pointer to no particular type.
                It is type compatible with any other pointer- in fact
                the constant nil is of type Address.
    Text        18 bytes. This is not the same as a 'file of
                char'. The standard input and output are Text
                files. You can read and write integers,
                characters, arrays of characters, and strings
                to Text files. You can also write Boolean values.
    Enumerated  2 bytes.
As was mentioned above, you can have arrays, pointers,
records, and files based on the above types. You can also have
synonym types (like 'type int = integer;'), but they don't work very
consistently.
Also note that almost anywhere you need a type, you can use
a full type description. Some compilers have a problem with
this, and I'm not sure what Standard Pascal says about it, but
then again I really don't care much.
VAR
The only standard variable included in PCQ is :
CommandLine : array [1..128] of char;
As its name would indicate, this variable is initialized
during the startup routine to whatever the CLI command line held.
It is an extra copy, so you can alter it as you wish. The
significant characters are terminated by a zero byte, after which
it's anybody's guess as to what it contains. After you have used
the information from this array, or if you didn't need it in the
first place, feel free to use the array for whatever you might need.
It's going to be there regardless (sorry about that), so you
might as well get some use out of it.
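A program might walk CommandLine up to its terminating zero byte along these lines. This is only a sketch of the idea; the program name and loop variable are invented for the example.

```pascal
Program ShowArgs;

var
    i : integer;

begin
    i := 1;
    { Stop at the zero byte; the i < 128 test keeps the index in bounds. }
    while (i < 128) and (CommandLine[i] <> chr(0)) do begin
        write(CommandLine[i]);
        i := i + 1;
    end;
    write("\n");
end.
```

Everything after the zero byte is ignored, since its contents are unspecified.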
FUNCTION
The standard functions that do not concern real numbers are
provided. They include:
    function ord(x : any ordinal type): integer;
        returns the ordinal position of the argument.
    function chr(x : numeric type) : char;
        returns the indicated character.
    function abs(x : numeric type) : the same type;
        returns the absolute value.
    function succ(x : ordinal type) : the same type;
        returns x + 1, of the same type
    function pred(x : ordinal type) : the same type;
        returns x - 1, in that type
    function odd(x : numeric type) : boolean;
        returns true if the number is odd
    function eof(x : any file): boolean;
        returns true if you are at the end of an input file.
In addition to these standard functions, there is
another standard function for this compiler in hopes of making it
somewhat useful. It is
function adr(var x : any variable): Address;
returns the address of the variable in question.
All the routines up to this point are handled in line. The
other two standard functions are for opening files. They will be
more fully explained when I get around to writing about
input/output. There is also a syntax like 'typename(expression)'
supported by the language which looks like a function. This will
be explained in a later section called Type Conversions.
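A few of the in-line functions can be seen together in a short sketch such as this one. The variable names are arbitrary, and the program is untested against PCQ itself.

```pascal
Program FuncDemo;

var
    c     : char;
    where : Address;

begin
    c := 'a';
    writeln(ord(c));            { ordinal position of the character }
    writeln(succ(c));           { the next character }
    writeln(chr(ord(c) + 1));   { the same thing, done by hand }
    if odd(ord(c)) then
        writeln("odd character code");
    where := adr(c);            { address of the variable c }
end.
```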
PROCEDURE
The standard procedures are write, writeln, read, readln,
get, new, dispose, exit, and trap. The first five will be covered in
the IO section. The other four are:
procedure new(var x : pointer variable);
This allocates public memory the size of whatever type is
pointed to, then puts the address into x. PCQ allocates memory using
Intuition's AllocRemember() routine, so that at the end of execution
all the memory allocated through new() is returned to the system.
This means that you don't absolutely have to call dispose() for every
new(), although you should. By the way, if the allocation fails, the
program aborts (Sorry about that. I'll change it eventually).
procedure dispose(var x : pointer variable);
This returns the allocated memory to the system. If
something got confused, and you try to dispose of memory you
never allocated, this will just return. Unfortunately that means
you may never diagnose a problem in your program, but at least it
won't be calling the Guru all the time.
procedure exit(error : integer);
Exit() aborts a program early. It is the acceptable method
of escaping a program. It does the same stuff that the program
normally does when it quits, then returns the error number you give
it to AmigaDOS. This routine will free all the memory and close the
open files. By the way, the error number should be zero if the
program terminated correctly, 5 for a warning, 10 for an error, and
20 for a catastrophic error.
procedure trap(num : integer);
The argument for this procedure must be a constant
expression, although the type doesn't matter. All it does is insert
a 68000 trap instruction into the code at the point of the statement.
This is useful for the debugger I use, and for nothing else I can
imagine. It effectively inserts a break point in the program.
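The sketch below puts new(), dispose() and exit() together. The record type follows the pointer declaration pattern recommended earlier in this manual; the type and variable names are made up for illustration.

```pascal
Program HeapDemo;

type
    Node = record
        Value : integer;
        Next  : ^Node;
    end;
    NodePtr = ^Node;

var
    n : NodePtr;

begin
    new(n);                 { the program aborts if this allocation fails }
    n^.Value := 42;
    n^.Next := nil;
    writeln(n^.Value);
    dispose(n);             { optional at exit, but good manners }
    exit(0);                { 0 tells AmigaDOS that all went well }
end.
```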
Extra Statements
First of all, PCQ supports if, while, repeat, for and case
statements. It does not yet support 'with' statements, but it will
soon enough.
The if, while and repeat statements work pretty much like
they should. The case statement is a bit weak. First of all, the
individual cases must be constants. Unfortunately there can
currently only be single cases- in normal Pascal you can list
several cases separated by commas and use ranges. Soon you'll be
able to do both, but not yet. The syntax for the case statement
looks informally like:
    case <ordinal expression> of
        <constant expression> : <statement>;
        <constant expression> : <statement>;
        ...
    end;
...where each <constant expression> is of the same type as
the <ordinal expression>.
The for statement supports 'downto', which changes the
increment from 1 to -1. It also supports 'by', which allows you to
set the increment. The argument for the 'by' part can be any
ordinary expression, but for any negative increment you must use
'downto' rather than 'to', or the loop will only run once. By the
way, for loops always run at least one time. Anyway the syntax
looks something like:
for <variable> := <expression> to|downto <expression>
[by <expression>] do <statement>;
The other statement included is 'return', which simply
aborts a PROCEDURE early. You can abort a FUNCTION early by
assigning some value to the function name, so 'return' works only in
procedures.
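The following sketch combines 'case', 'for ... by', 'downto' and 'return' as they are described in this section. The names are hypothetical, Short is used throughout so that the case constants and the selector have the same type, and the program has not been compiled with PCQ.

```pascal
Program LoopDemo;

var
    i : short;

procedure Describe(n : short);
begin
    if n < 0 then
        return;                 { leave the procedure early }
    case n of                   { single constants only, no lists or ranges }
        0 : writeln("zero");
        2 : writeln("two");
        4 : writeln("four");
    end;
end;

begin
    for i := 0 to 4 by 2 do     { counts 0, 2, 4 }
        Describe(i);
    for i := 4 downto 0 by -2 do    { negative steps need 'downto' }
        Describe(i);
end.
```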
The extra libraries
There should be some extra libraries included in the archive
(the code for the libraries is in PCQ.lib, but there should be
include files describing them). Most of these libraries are
interfaces to the system, and all of them are individually documented
in their .i files. Note that to use Intuition, Exec, AmigaDOS or
basic floating point math functions you will NOT need to open the
associated libraries. All these libraries are opened during the
start sequence, and they are in fact required by all PCQ programs.
Reserved Words
The reserved words of PCQ are as follows:
    and         for         procedure
    array       forward     program
    begin       function    record
    by          goto        repeat
    case        if          return
    const       in          set
    div         label       then
    do          mod         to
    downto      not         type
    else        of          until
    end         or          var
    external    packed      while
    file        private     with
As you can see, even the unimplemented stuff is reserved. The
only one that is not explained somewhere in this document is
"private", which is one of the things that will help make external
references and modularity more flexible in version 1.1.
Floating Point Math
First of all, real numbers are not fully integrated into the
language. They can be used, but you have to do some extra work.
Real math is based on the MathFFP.library, which is one of the
libraries that is in memory. The main reason I haven't fully
included real numbers, by the way, is because I am looking for some
feedback concerning the awkwardness of this approach.
The way you carry out floating point math is to make calls
to the library. At the top of your program you must include
"Math.i", which will declare all the functions from mathffp.library.
You will not have to open the library, however. In any case, in
order to do "f1 := f1 + f2", you use:
f1 := spadd(f1, f2);
Read "Math.i" for a list of the functions. In the example
programs there is a file called RealIO.p which has routines to read
and write real values to and from files and standard IO.
Incidentally, the way to specify a literal real value (since you
can't write something like "10.0") is to use spfloat(). For example
to specify 4.546, you would write:
spdiv(spfloat(4546),spfloat(1000))
This is slow, and involves no less than three calls
to the real numbers library, but that is so far the only way to do
it.
The next version of the compiler will have fully integrated
real numbers. In other words you'll be able to specify literal
values, do simple math, and carry out IO on them.
Functions like sin(), cos(), and sqrt() are not handled by
mathffp.library. They are located in mathtrans.library, which is
disk based. Thus whenever you write a program that needs these
functions, the system disk will have to be inserted in order to get
to LIBS:. Read MathTrans.i for further information about all this.
By the way, I have no plans to implement these functions in any
other fashion. In the foreseeable future you'll need
MathTrans.library every time you need trigonometric or exponential
functions.
The Limits of PCQ
The compiler can accept lines of any length, although it
will display at most the previous 128 characters read in if an error
occurs. As far as the size of the file is concerned, it can be any
length (the only part of the file that is in memory at any time is
the current character), with, of course, a few caveats. The first
is that, since this version of the compiler still uses a big array
to hold identifiers, there is a limit to the total number you can
have. Don't worry about that though: all of the include files
combined only take up about half the room. This will be fixed in
the next version. There are other fixed limits in the compiler, but
I never got anywhere near them in compiling the compiler, so I can't
imagine you'll hit them.
The other limit is that, since the compiler produces lots of
assembly code output, there must be room on the disk for the whole
file. The assembly output is, as a rule of thumb, as much as five
times as large as the Pascal source.
One dubious advantage of using mostly fixed amounts of memory
is that I can tell you that the compiler takes up just under 150k in
memory, so with its stack and the rest of incidental memory it should
require about 160 or 170k to run.
Strings
As was mentioned above, strings should be thought of like
'^char'. They are defined that way, but also are given special
properties. They can be dynamically created, sized, and disposed
of. A string is supposed to be terminated by a zero byte, so if
you write any string handling routines be sure you follow that
convention. Otherwise you'll confuse all the other string
routines. In the text of a program, you delineate strings with
double quotes, instead of the single quotes found around normal
arrays of char. Thus:
"A string" is indeed a string, while
'not one ' is considered an array [1..8] of char.
The other interesting thing about strings is that they can
have C-like escape sequences. What happens is that you type a
backslash (looks like this: \), and the very next character is
specially handled. C has a bunch of these things, but I have, so
far, included only the ones I use, which are:
\n which stands for a line feed, chr(10)
\t which stands for a tab, chr(9)
Everything else passes through unchanged, so that you can
also use this mechanism to include double quotes in your strings.
And you have to use it to include backslashes. What this all
boils down to is that the string "A\tboy\nand\\his \"dog.\""
prints out like:
|A       boy
|and\his "dog."
There is something called StringLib.i in this archive that
declares a few string handling routines - the ones I needed for
the compiler. Read that file for more information. And if you
get confused about strings, just remember that they're pretty much
like C strings, and can be used in most of the same situations.
Remember that if you declare a string you don't get any space for
the characters. All you get is space to hold the address of where
the characters are, so you have to call AllocString() in StringLib
or something like it to get some room to work. If you are a BASIC
programmer you might run into some difficulty on this subject, and
I would suggest reading up on C strings in hopes that whatever you
read can explain the situation better than I.
By the way, note that 'stringvar^' is valid, and is of type
'char'.
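Because a string is just a '^char' that ends in a zero byte, a routine can step through one character at a time. The sketch below counts the characters in a string using 'stringvar^' and the type conversion syntax covered in the section "Type Conversions". CountChars is a made-up name, not a StringLib routine, and the code is untested.

```pascal
Program StrDemo;

function CountChars(s : String) : integer;
var
    n : integer;
begin
    n := 0;
    while s^ <> chr(0) do begin         { stop at the terminating zero }
        n := n + 1;
        s := String(integer(s) + 1);    { advance to the next character }
    end;
    CountChars := n;
end;

begin
    writeln(CountChars("A\tboy"));
end.
```

The function name is assigned only as the last statement, since assigning to it ends the function.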
Compiler Directives
Eventually there will be billions of compiler directives,
but for now there are just three. Compiler directives work like
this: if the first character in a comment is the dollar sign ($),
the compiler looks to the next character for a command. No spaces
are allowed between the bracket, dollar sign, and command character.
The compiler directives are:
    {$I "fname"}    This will insert the file "fname" into the
                    stream at this point. When it has finished,
                    it will end the comment (no more directives
                    allowed in this comment) and continue on.
                    There can be any amount of white space in
                    front of the filename and anything you want,
                    such as the rest of a comment, after it. The
                    filename is a string, so it must be in
                    quotes. Several of the example programs should
                    demonstrate the include syntax.
    {$A             This directive inserts assembly instructions
      Instructions  into the assembly file produced by the compiler.
    }               Look at the assembly language produced by the
                    compiler to figure out how to reference variables
                    and subroutines. This directive simply passes
                    everything from after the A until, but not including,
                    the closing bracket. You should therefore include
                    comments in assembly fashion.
    {$R+} or        The '+' directive instructs the compiler to produce
    {$R-}           range-checking code for arrays. From this point
                    until the compiler reaches a {$R-} directive, each
                    array access will check that the index value is
                    within the bounds of the array. This expands and
                    slows the code, so I recommend only doing this
                    during testing. If the index is out of bounds, the
                    program will abort with an error code (look at the
                    section "Run Time Errors" for more information).
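As a small illustration of the range-checking directive, the sketch below turns checking on around one loop only. The program and variable names are invented for the example.

```pascal
Program RangeDemo;

var
    table : array [1..10] of integer;
    i     : integer;

begin
{$R+}
    { Each table[i] below is checked against the bounds 1..10. }
    for i := 1 to 10 do
        table[i] := i * 2;
{$R-}
    { From here on, array accesses are unchecked again. }
    table[1] := 0;
end.
```

Keeping the {$R+} region small limits the extra code to the part under test.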
Type Conversions
If you have used Modula-2, you can skip this section. In
writing the compiler I found the need to cheat a bit on type
checking, so I decided to use Modula-2's syntax for changing the
type of an expression. What you do is use the name of the type
as if it were a function. The expression in the parentheses is
evaluated, and the result is considered to be of the type named.
It goes like this:
IntegerVariable := integer(any ordinal expression);
CharVar := char(456 - 450);
if boolean('r') then ....
This works not only for the included standard types, but
also for any type you create. Thus this is also legal:
    type
        charptr = ^char;
    var
        charvar : charptr;
    ....
        charvar := charptr(0);
        charvar := charptr(integer(charvar) + 1);
Note that the type must be named in order for this to work.
Something like...
variable := array [1..4] of char(expression())
...will not work. This, then, is the only place where a type
is expected but you can't use a complete type definition. I'm
pretty sure you can in all other cases, but what do I know....
Note further that not all type conversions are valid.
Converting a type to one of a different size is often a bad idea, as
is converting a structured type (array or record) to a simple type.
I should probably warn you against the indiscriminate use of
these, but what the heck. Have a ball.
External References
First a little background. The source code for this
compiler is, in total, about 70k. The assembly listings produced by
the compiler generally expand the Pascal source about five times, so
you can see that if I decided to write the compiler as one big
program, it would be way too unwieldy. What I needed was a facility
for separate compilation. What I came up with was this: if you
have a previously compiled procedure somewhere that you want to call
from the Pascal program, just make it a forward declaration
somewhere before you use it. If the compiler gets to the end of
your program and has not yet run across the full definition of a
forward declared procedure, it assumes it's an external reference
and makes the appropriate statements in the assembly file. So it
looks like this:
procedure DrawMap;
forward;
As long as you don't have something else defined as DrawMap,
the compiler will produce an external reference to _DrawMap (note
the underscore prepended to the name).
Now for something somewhat less kosher. I needed some
syntax to allow the external routines to access the same global
variables as the main file. What I came up with is a different
file format. Whereas the normal Pascal file looks like:
    program Name;
    declarations
    procedures and functions
    begin
        main program
    end.
The external file looks like this:
    external;
    declarations (like normal)
    procedure and functions (like normal)
There are three things to note. The first is that there is
no main program, the second that there is no special ending
syntax. It is just a bunch of procedures and functions in a row
until the end of the file. The other thing is that any variables
declared at the global, or outermost, level are considered
external references. In the source for the compiler there is a
file that has just the global variable declarations. This file
is included by all ten of the source files, but only the main
file produces storage space for them. The other nine just
produce external references.
I guess this is a good time to discuss a couple of issues
related to using an assembler with the separate compilation deal.
First, note that all procedure, function and variable names are
offered as external references by the module in which they are
defined. If an outside routine wants to use any of these values,
it should be looking for something starting with an underscore
and spelled the same as the first time the word is encountered in
the program. Pascal is case insensitive, of course, but I can't
help the assembler and linker. Also remember that there is no
type checking across files (again, get Modula-2 if you want that
sort of stuff). This means that a procedure that expects a
string might be sent a Boolean value, which would probably
conjure the Guru.
The other thing to note is that this compiler pushes
procedure and function arguments on the stack from left to right.
Most C compilers (including Lattice and PDC) do it the opposite way.
Draco also does it left to right. This doesn't mean that you can't
use code and libraries from them - it simply means that you should
reverse the order of the arguments.
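To see why reversing the argument list is enough, here is a rough
sketch in C that models the stack as a plain array. It is only an
illustration: the function names are made up, every argument is
assumed to take exactly one slot, and the return address and frame
setup are ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, not PCQ code.  slots[0] stands for the argument
   nearest the top of the stack once all the pushes are done; every
   argument is assumed to occupy exactly one slot. */

/* Push in source order, first argument first, as PCQ is described as
   doing.  The last argument ends up on top. */
void push_left_to_right(const long *args, size_t n, long *slots)
{
    size_t i;

    assert(args != NULL && slots != NULL);
    for (i = 0; i < n; i++)
        slots[n - 1 - i] = args[i];
}

/* Push the last argument first, as most C compilers do.  The first
   argument ends up on top. */
void push_right_to_left(const long *args, size_t n, long *slots)
{
    size_t i;

    assert(args != NULL && slots != NULL);
    for (i = 0; i < n; i++)
        slots[i] = args[i];
}
```

For a Pascal call with arguments (1, 2, 3), the left to right order
leaves 3 in slot 0, which is where a C routine looks for its first
parameter. A C routine meant to be called that way would therefore
declare its parameters as (third, second, first).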
Just two more notes on this subject: first, the compiler
considers registers d0, d1, d2, a0, and a1 fair game, and will
destroy them at will. d2 might be a problem, but the others
shouldn't. For further information, just look at the assembly code
produced. The second note is just a reminder to anyone who might
want to link Pascal programs to other languages: remember what
'var' does before a variable, and be sure to use it correctly.

Input/Output

There are several routines for handling IO in PCQ. Before I
get to them, however, let me discuss what happens when you open a
file. The actual file variable you declare in the program, as in:
    var
        filevar : file of integer;
is actually something like a record, which would look like
this:
    file = record
        FileHandle : a DOS file handle
        Buffer     : a pointer to the input buffer
        Size       : the size of the elements of the file
        EOF        : a Boolean value
        IN/OUT     : (input, output)
        NextFile   : a pointer to the next file record
    end;
Now you can't actually access these fields, but nonetheless
18 bytes of memory are reserved. When you open a file, all of the
fields are initialized as necessary, and the first element is
read into the buffer. The buffer is accessed by the filevar^
syntax. Also note that if the size of the elements of the file
is greater than 4, or if it's 3 (don't ask), the program will
allocate memory for a buffer. This will be pointed to by the
variable Buffer in this record. If the size is 1, 2 or 4 (as is
the case with chars, shorts and integers, respectively), the
program will instead use the variable Buffer as the buffer, thus
saving a little memory and time. Filevar^ will always properly
access the buffer whatever it is.
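The buffer rule above boils down to one test on the element size.
Here is a minimal sketch of it in C. The function name is invented
for the illustration and is not part of the run time library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the PCQ run time library.
   Returns 1 when a file with elements of elem_size bytes would get a
   separately allocated buffer, and 0 when the 4-byte Buffer field of
   the file record would itself hold the current element. */
int needs_allocated_buffer(size_t elem_size)
{
    assert(elem_size >= 1);
    return !(elem_size == 1 || elem_size == 2 || elem_size == 4);
}
```

So a file of char, short or integer costs no extra memory, while a
file of 3-byte or larger records pays for one buffer allocation when
it is opened.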
If at the end of execution there remain some open files, the
shut-down code will close them for you. This is only true for
files opened through Pascal, using one of the open() routines
explained below. Anything you open directly through AmigaDOS is
your own responsibility.
The routines that handle file IO are these:
    function open(filename : string;
                  filevar : file of something, or Text) : boolean;

        This opens a file for writing. If the file was there
        before, this routine will erase it. If everything worked
        OK, it will return true. If not, of course, it's false.

    function reopen(filename : string;
                    filevar : file of something, or Text) : boolean;

        This is analogous to open() except it opens an existing
        file for reading.
The rest of the routines are the same as most Pascals. Just
for the sake of completeness, however, they are:
    write()     Write the stuff to a file or to standard out
    writeln()   Do the same as write, then output a line
                feed. This only makes sense for Text files.
    read()      Read some stuff from a file or standard in.
                read(filevar, x) mimics...
                    x := filevar^;
                    get(filevar);
                ...just like most Pascals. In this case, it
                mimics it very closely.
    readln()    Do read then keep reading until you hit a
                line feed. This too only makes sense for
                Text files.
    get()       Reads the next file element from the file
                into the buffer.
If the first argument of a read or write is a file variable,
the input comes from, or the output goes to, that file rather than
the console or whatever. That, of course, is normal Pascal, and
looks like:
    writeln(outfile, 'The result is ', 56 div 4);
Field widths are supported, but each one must be a constant
expression. What this means is that something like...
    writeln((67 * 32) + 5:10);
... will print the result right justified in a field of ten
characters, with spaces padding out the area to the left. If you
specify a field width lower than the width of the number, the number
is printed in as few characters as possible. Valid values for the
field width are greater than or equal to one and less than MaxShort.
You can specify a field width for any type in a write statement,
although only when writing to a text file.
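The field width rule is the same one C's printf family follows, so
it can be sketched with snprintf. This is a model of the behaviour
described above, not PCQ code, and the helper name is made up.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not PCQ code.  Formats value right justified
   in a field of `width` characters, padding with spaces on the left.
   A value wider than the field is written in full, never truncated.
   Returns the number of characters the formatted value needs, not
   counting the terminating zero, exactly as snprintf reports it. */
int format_int_field(char *out, size_t out_size, long value, int width)
{
    assert(out != NULL && out_size > 0);
    assert(width >= 1);   /* the manual's lower bound on field widths */
    return snprintf(out, out_size, "%*ld", width, value);
}
```

With the value from the writeln example, (67 * 32) + 5 = 2149 and a
width of 10, the buffer holds six spaces followed by 2149. With a
width of 3 and the value 123456, all six digits are written.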
Just for the sake of precision, I'll go over the delimiters
for IO on Text files with various types:
    Write Char
        Writes one character.
    Write Boolean
        Writes TRUE or FALSE, with no extra spaces.
    Write Integer
        Writes the number with no extra spaces, but
        possibly a negative sign.
    Write Array of Char
        Writes the entire array, from first element to last.
    Write String
        Writes from the first character up to but not
        including the zero byte.
    Writeln
        Writes a single EOLN (chr(10)) to the file.
    Read Char
        Reads the next char.
    Read Boolean
        Can't do it.
    Read Integer
        This eats spaces and tabs until it meets up with
        something else, then eats digits until it comes
        upon a non-digit. It does not eat that last
        non-digit. If the routine runs across an EOLN before
        it gets to the first digit, it returns zero. If
        it finds letters before it finds digits, it returns
        zero also.
    Read Array of Char
        Reads characters into the array until either the
        array is full or the routine finds an EOLN. If it
        finds an EOLN it will not eat it, so you'll have to
        do that with a readln if you want. If it returns
        because of an EOLN it will also pad the rest of the
        array with spaces.
    Read String
        Reads characters until it gets an EOLN. The EOLN
        is left in the input stream, and a zero is put in
        its place in the string. Note that this routine
        does not check for length, so you must be sure that
        your string can handle the longest line it might
        encounter.
    Readln
        Reads characters up to and including the next EOLN.
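The rules for Read Integer and Read Array of Char are easy to get
slightly wrong, so here is a rough model of both in C. It is a
sketch of the behaviour described above, not the run time library's
code. The function names are invented, the input is a zero-terminated
string with a position index, and two things the list does not spell
out are assumptions: letters that stop Read Integer are left in the
stream, and end of input is treated like an EOLN.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define EOLN '\n'   /* chr(10) */

/* Hypothetical model of Read Integer.  `in` is a zero-terminated input
   stream and *pos is the current read position.  Signs and overflow
   are not handled because the manual does not describe them. */
long read_integer_model(const char *in, size_t *pos)
{
    long value = 0;
    size_t i;

    assert(in != NULL && pos != NULL);
    i = *pos;

    /* Eat spaces and tabs. */
    while (in[i] == ' ' || in[i] == '\t')
        i++;

    /* Eat digits.  If the next character is an EOLN or a letter, this
       loop never runs and the result stays zero. */
    while (isdigit((unsigned char)in[i])) {
        value = value * 10 + (in[i] - '0');
        i++;
    }

    *pos = i;   /* the first non-digit is left in the stream */
    return value;
}

/* Hypothetical model of Read Array of Char for an array of `len`
   characters.  End of input is treated like an EOLN (assumption). */
void read_char_array_model(const char *in, size_t *pos,
                           char *arr, size_t len)
{
    size_t i, n = 0;

    assert(in != NULL && pos != NULL && arr != NULL);
    i = *pos;

    /* Copy until the array is full or the line ends. */
    while (n < len && in[i] != EOLN && in[i] != '\0')
        arr[n++] = in[i++];

    /* Stopped early: pad the rest of the array with spaces. */
    while (n < len)
        arr[n++] = ' ';

    *pos = i;   /* an EOLN, if any, is still waiting in the stream */
}
```

Reading an integer from "  42x" gives 42 and leaves the position on
the x. Reading a six character array from "hi" followed by an EOLN
gives "hi" plus four spaces, with the EOLN still unread, which is
why a readln is needed afterwards.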
Also remember eof(filevar), from the functions, and note
that there is no put() analogous to the get() routine. For
examples of all of these, look at the example programs. Also
note that the filevar^ sort of syntax is present. Look at a
Pascal text to understand it (I don't think Turbo Pascal uses
this, so it might be Greek to a lot of Pascal programmers).

Errors

As I mentioned somewhere above, most errors will completely
confuse the poor compiler, which will then start spewing out
errors that don't really exist. It can get by a couple of
errors; for example, if you leave out a semicolon somewhere, you
should get an error message but everything else should compile.
Very few other errors will work that well. I hope to make the
compiler a bit friendlier, but in the meantime the compiler will
abort the compile if it gets more than 5 errors. I put this in
because the compiler will sometimes get one error, then start
producing errors on every symbol, and even get hung up on a
symbol. Really ugly.
If an error occurs, the compiler will write out at most the
two lines leading up to the error, and highlight the part that
it's currently working on. The error probably occurred either at
the highlighted symbol or just before it. Also note that the
highlighted symbol is always the last symbol written (when the
symbol is just some punctuation, it can be difficult to see that
it is highlighted). On the next line is the line number of the
error and the explanation of the error. Currently I'm using text
descriptions of the errors, so there are no error numbers.

Run Time Errors

A couple of things cause run time errors. The few that are
handled at the moment are:
    Error   Explanation

     50     No memory for new()
     51     Divide by zero with Floating point numbers.
     52     Array access out of range.
The error number is returned (through the exit() function) to
AmigaDOS. If the program is running in a batch file you'll get to
see the return code. I hope to have the run time system better
thought out in the next version of the compiler, so these might go.

Sources

Like I said, I wrote this for the learning experience. Some
of the places I went for information are:
1. PDC, a freely distributable C compiler supported by Jeff
Lydiatt. This is a very good program, and one of the best
freely available compilers for the Amiga (the other really
good one is Draco by Chris Gray). I learned (and used) a
lot about activation frames from the listings produced by
this compiler. Looking at the assembly code produced by
this compiler was also my inspiration for starting to
write a compiler.
2. Pascal-S, the Pascal compiler produced out of ETH Zurich.
I got some ideas about the structure of a compiler from
this, but not too many.
3. Small-C, another freely distributable C compiler. This one
is not nearly as powerful as PDC, but its simplicity helped
me understand a thing or two. Probably the best compiler
source code that I found to learn from. This and PDC were
the compilers I used before this compiler was able to
compile itself. Many aspects of the design of PCQ come
from Small-C.
4. Brinch Hansen on Pascal Compilers, by Per Brinch Hansen.
This book was of some use, which is more than I can say
about the other half dozen I read while writing this. From
this book I mainly learned about all the things I was doing
wrong. Great.
If you like the idea of freely distributable compilers, please
be sure to check out Draco from Chris Gray (on Fred Fish 76 & 77) and
PDC from Jeff Lydiatt (an old version is on Fred Fish 110). Both are
much better products than PCQ and even rival the commercial compilers.
I'm not sure what a good source for the newer version of PDC would be -
perhaps you could write to Jeff (it's certainly worth it. PDC has a
full preprocessor, a 'cc' front end, very fast optimized code ...
the works). The syntax of Draco, by the way, is fairly similar to
Pascal.

Notes to Assembly Programmers

During the course of a program PCQ uses registers d0, d1, a0
and a1 as scratch. It also uses d2 and d3 during IO calls and d2
when comparing or assigning large data structures. a7 is, of
course, the stack pointer, and I use a5 as the frame pointer. a6 is
used to hold the library base during any call to the system, and a4
is reserved for future use (for accessing local variables of a
parent procedure). The other registers are free, and in fact the
scratch registers should be free for you to use between statements.
After all, the compiler does no optimizing.

Improvements On The Burner

Version 1.1 of this compiler will definitely have:
    Full 32 bit math.
    Fully integrated floating point math.
    Properly implemented nested procedures.
    No more fixed arrays in the compiler itself.
    The ability to work with the Workbench.
As far as the various other problems go, my main concern is
fixing bugs. Rather far down on the list is adding every last detail
of Pascal. Way down at the bottom of the list is code optimization.
As far as gimmicks go, I'd like to integrate the compiler with
CygnusEd Professional (the editor I use) through the editor's ARexx
port.
Version 1.1, with the improvements listed above and possibly
others, will be released during the summer of '89 at the latest.
Any lettered version, e.g. version 1.0b, will be a bug fix.
I hope I don't run out of letters. Increments in the tenths place
will indicate added functionality. If I come out with 2.0 it will
be Modula-2. 3.0 will be Ada.

Other Notes, Copyright & My Address

As I mentioned above, this documentation, the source code
for the compiler, the compiler itself, the source code for the run
time library, and the run time library itself, are all (ahem):
Copyright (c) 1989 Patrick Quaid.
I will allow the package to be freely distributed, as long
as all the files in the archive, with the possible exception of
the assembler and linker (please include them if at all possible),
are included and unchanged. Of course no one can make any real
money for distributing this program. It may only be distributed
on disk collections where a reasonable fee is charged for the disk
itself. A reasonable fee is defined here as the greater of $10
per disk or whatever Fred Fish is currently charging. Sorry about
being repetitive, but I imagine it's best to state these things
clearly.
Feel free to mess around with the compiler source code. If you
make any substantial improvements, I would appreciate a copy of
them so that they can be incorporated into the next version if
appropriate. If you make improvements that are not along the lines
of standard Pascal or the path indicated above, please don't
distribute your program under the name PCQ. That would only
confuse things.
This is not a shareware package. Feel no guilt about using
it without paying for it. The one payment I would really appreciate
is if you could let me know about bugs you discover (not
unimplemented features; I know about them. I'm not trying to write
the end-all greatest compiler, but I do want it to be correct). If
you have an overwhelming urge to give money away, please send a
donation to Charlie Gibbs, who wrote the assembler, and the Software
Distillery, who wrote the linker.
Any questions, comments, or whatever can be addressed to:
Pat Quaid
8320 E. Redwing
Scottsdale, AZ 85253
(602) 948-8325
Enjoy the compiler. If you have any complaints, remember
what you paid for it.