The Fred Fish Collection 1.5

Text File | 1990-03-19 | 66KB | 1,562 lines

PCQ version 1.1
A very simple Pascal compiler for the Amiga
by Patrick Quaid

PCQ (which stands for Pascal Compiler, um, Q ... look, I couldn't come up with a name so I used my initials, OK?) is a modest Pascal sub-set compiler that produces assembly code. It is not in the Public Domain (I retain the copyright to the source code, the compiler, the run time library source code, the run time library, and this documentation), but it can be freely distributed as long as all the files in the archive are included (with the possible exception of the assembler and linker) and unchanged. The compiler is slow, and it can't handle a couple of things, but all in all it's worth the price. To summarize:

The bad:

    The compiler is awfully slow.
    It doesn't support sets.
    The code is not optimized at all. It is, therefore, slow, fat and generally silly looking.
    The compiler gets knocked for a loop by most errors.

The good:

    It works, for the most part.
    The compiler supports include files.
    It allows for external references, although you have to do the checking (this isn't Modula-2, after all).
    It supports records, enumerated types, pointers, arrays, and strings.
    Type conversion as found in Modula-2 is supported. In other words, something like "integer('d')" is legal.
    Several features from Turbo and Quick Pascal, such as Exit procedures, operators such as Shl and Shr, and typed constants, have been added.
    You can have as many const, var, type, procedure and function blocks as you want, in any order.
    It's free.

Table of Contents

This manual is intended to be read with a file reader or text editor, so this table of contents lists the sections in the order they appear rather than by page number.

    How To Use PCQ
    An Examination of Its Ills
    Predefined Stuff
        Constants
        Types
        Variables
        Functions
        Procedures
        Extra Statements
    Reserved Words
    Expressions
    Floating Point Math
    The Limits of PCQ
    Typed Constants
    Strings
    Exit Procedures
    Compiler Directives
    Type Conversions
    External References
    Input/Output
    Standard IO
    Errors
    Run Time Errors
    Sources
    Notes to Assembly Programmers
    Improvements On The Burner
    Update History
    Other Notes, Copyright & My Address

How To Use PCQ

There are several files in this archive you will need to copy over to your work disk. The compiler (Pascal) is one, of course, as well as the run time library (called PCQ.lib; there's a readme file in the archive that explains all the file names, by the way). If you do not have the assembler (A68k) and linker (Blink), you'll have to copy them as well (they might not have been included in this archive, but should be available on a local bulletin board or on Fred Fish disks). These files are necessary for even the simplest compilations.

The files with the suffix .p are example Pascal programs, which you can copy over if you want. I spent a lot more time working on the compiler than on these examples, but a couple of them are interesting if you haven't seen programs like them before. They demonstrate just about every aspect of the compiler that I could think of, so you should probably take a look at them and then get rid of them.
If you got the source code with the compiler, there will be a bunch of those files lying around with the suffix ".p" also.

The files that end with .i are include files for a few of the system libraries. They define the records, types, constants, procedures, functions and variables needed to access the system. These you probably should keep around. There are also a few include files for routines I've supplied in PCQ.lib. Take a look through these files to see what's available; it changes frequently. The code related to all these routines is in the run time library.

In order to compile a program, first write one. Or use one of the example programs. Then type:

    1> Pascal prog.p prog.asm {-q}

'Pascal' is, of course, the name of the compiler. You can change it if you want. 'Prog.p' is the Pascal source file, which can also be called whatever you want. The second parameter is the name of the assembly file produced. If you include the "-q" directive (which can be anywhere in the command line), the compiler will suppress all output except error reports. Furthermore, the error reports will be shortened to a more regular form. At the moment these are the only command line arguments allowed.

If you try to compile the example programs, you might run into some problems with the organization of the disk. The examples all refer to the include files they require as ":Include/Something.i". Therefore the Include directory is expected to be on the root of the current disk. If this conflicts with your setup, just edit the include statements at the start of the file.

Assuming the compiler finishes without any errors, you then type:

    1> A68k prog.asm prog.o

This invokes the assembler to produce object code. If the archive included A68k it probably also included the documentation for it, so read that for information about the assembler. If the assembler was not included, get and use A68k by Charlie Gibbs, version 2.6 if possible.
A68k does lots of small scale optimization that the code from PCQ might very well depend upon, so I don't claim that the compiler works with any other assembler.

Finally, you want to link the program, so you type:

    1> Blink prog.o to prog library PCQ.lib

This will produce a finished executable program called 'prog'. All of the Pascal run time routines, Amiga system routines, and my tiny little string library are contained in PCQ.lib. If any of the routine names clash with ones you are working with, just be sure to put your library or object file in front of PCQ.lib on the Blink command line. If Blink was included in the archive its documentation probably was as well, so read that to answer any questions you may have about the link process. I use Blink version 6.7, and again I assume that PCQ won't work with any other linker or version.

Note that in previous versions you had to include Small.lib on the Blink line. I switched versions of Small.lib, however, so I was able to just include it with PCQ.lib.

Instead of all this business you could just use the 'make' script that's included in the archive. You may have to change it around a bit so that it looks in the proper directories and whatnot, then through the magic of AmigaDOS 1.3 you should make it a script file. Then you can invoke it like:

    1> make prog

It will take the file 'prog.p' and produce the finished file 'prog'. If your program has separately compiled units, you'll need to modify the batch file or write another. I recommend writing a script file for any program you'll need to compile a few times.

If none of this makes any sense, write or call me and I'll try to give you more coherent instructions.

If you have the full distribution disk, there is a simple way to give the compiler a workout. Just cd to the "Examples" directory, then type "MakeExample Moire". Note that there's no ".p" on the program name.
This script looks for the compiler, assembler, linker and runtime library on the root of the disk, uses the T: directory extensively, and leaves the completed program in a file called "Moire" (or whatever you chose) in the Examples directory.

An Examination of Its Ills

I might as well get this over with right away. As was mentioned earlier, sets do not work at all.

Another thing that's not accepted is syntax like:

    type
        WindowPtr = ^Window;
        Window = record
            NextWindow : WindowPtr;
            ...

It will fail on the first line with an 'Unknown ID' error. Instead, use something like:

    type
        Window = record
            NextWindow : ^Window;
            ....
        end;
        WindowPtr = ^Window;

This is something I should get around to fixing, but it isn't strictly necessary, so there you go....

The compiler still will not allow variant records. I suppose I'll get around to fixing this eventually.

The familiar syntax for specifying a single quote character, which looks like: '''', is not accepted. Instead, PCQ Pascal uses the C escape convention. Thus the single quote character would look like: '\''. See the section called Strings for more information.

Predefined Stuff

I've arranged the predefined identifiers as they are supposed to appear in Pascal. In PCQ, however, you can have these blocks in any order, and you can have more than one of each. In other words, your program could look like:

    Program name;
    var
        variable declarations
    type
        types
    var
        more variables
    procedure
        a procedure
    var
        still more variables....

And so on. An identifier must still be declared before it is used, of course. I allowed this because it is a real pain to arrange a bunch of different include files (each of the system include files would have had to be split into four sections: the constants, the types, the variables, and the procedures and functions).

CONST

True and False are defined as -1 and 0, respectively. Nil is defined as a pointer with the constant value zero, but is not a reserved word as it is in standard Pascal.
Most places the compiler requires a constant, it will take a constant expression (one that can be evaluated during the compile). For example, the following will work:

    const
        first = 234;
        second = first * 2;
    type
        thetype = array [first .. first + 7] of char;

Unfortunately you cannot yet use standard functions, type conversions, floating point numbers, or other nifty things that you can do with expressions in the program body. Just the five basic math functions (+, -, *, div, mod), for now. Also note that 'first + 7' up there would be evaluated during the compile, but the same text in the body of the program would be evaluated during run time. In other words, there is no such thing as constant folding yet.

When you are using integer constants, you can separate the digits with an underscore, similar to Ada. In other words you could have:

    const
        thousand = 1_000;
        tenthousand = 1_0_0_0_0;

MaxInt is defined as $7FFFFFFF, which comes out to something over two billion. MaxShort is 32767, or $7FFF in hex.

Another form of constant is the 'Typed Constant'. In this case the syntax looks like:

    CONST
        Identifier : Type Description = Constant Expression;

Typed constants are initialized at the beginning of the program to the Constant Expression, and thereafter can be used in exactly the same way as variables. These values are explained in depth in the section called Typed Constants.

TYPE

There are several predefined types. They include:

    Integer
        4 bytes, so the range is plus or minus MaxInt.

    Short
        2 bytes. Literals within the program text are assumed to be short values unless they are greater than 32767 or less than -32767.

    Byte
        1 byte. These three types are all numeric types, so you can use them in normal expressions without worrying about type conversions. The compiler automatically 'promotes' the small values to whatever size is required. Remember that there is currently no overflow checking.
        As of version 1.1, the Byte type has the range 0..255 rather than -128..127, its range in previous versions.

    Real
        4 bytes. This is in FFP format.

    Char
        1 byte.

    Boolean
        1 byte. False is 0 and true is -1.

    String
        4 bytes. Really just defined as '^char'. I will explain further in the section 'Strings'.

    Address
        4 bytes. This is a pointer to no particular type. It is type compatible with any other pointer; in fact the constant Nil is of type Address.

    Text
        32 bytes. This is not the same as a 'file of char'. Input and Output are Text files. You can read and write integers, characters, arrays of characters, and strings to Text files. You can also write Boolean values.

    Enumerated
        1 or 2 bytes, depending on the number of enumerations.

As was mentioned above, you can have arrays, pointers, records, and files based on the above types. You can also have synonym types, like 'type int = integer;'.

Also note that almost anywhere you need a type, you can use a full type description. Some compilers have a problem with this, and I'm not sure what Standard Pascal says about it, but then again I really don't care much.

In version 1.0, you were forced to write out a multi-dimensional array fully. In other words you couldn't just write:

    Array [0..5, 0..11] of Integer;

Instead you needed to expand it to:

    Array [0..5] of Array [0..11] of Integer;

....for the definition, and ArrayName[x][y] for the actual use in a program. Most Pascal compilers allow the comma-delimited shorthand, however, so now I do too.

VAR

Version 1.1 of PCQ Pascal has several new variables. They are:

    CommandLine : String;
        In previous versions this was an Array of Char, and also a copy. It is neither now: it is a pointer to the actual stack space on which the command line is stored. You can use routines such as GetParam to get copies of the individual parameters.

    ExitProc : String;
        This variable points to the first in a chain of procedures to be executed when the program is shutting down.
        See the section called Exit Procedures for more information.

    ExitCode : Integer;
        If the program exited normally, this will be 0. If the program called the Exit() procedure, this will be the argument of that call. Otherwise this is a run-time error code. Again, see Exit Procedures for more information.

    ExitAddr : Address;
        If the program died due to a run-time error, this value will hold the address of the statement after the error.

FUNCTION

With the exception of a few exponential functions, most of the standard functions are provided. They include:

    function ord(x : any ordinal type) : integer;
        returns the ordinal position of the argument.

    function chr(x : numeric type) : char;
        returns the indicated character.

    function abs(x : real, integer, short, or byte) : the same type;
        returns the absolute value.

    function succ(x : ordinal type) : the same type;
        returns x + 1, of the same type.

    function pred(x : ordinal type) : the same type;
        returns x - 1, in that type.

    function odd(x : numeric type) : boolean;
        returns true if the number is odd.

    function trunc(x : real) : integer;
        returns the integer part of a real number.

    function float(x : integer, short or byte) : real;
        converts these types to FFP format.

    function floor(x : real) : real;
        returns the greatest 'integer' value less than x.

    function ceil(x : real) : real;
        returns the least 'integer' value greater than x.

    function sqr(x : real) : real;
        returns x * x, but is slightly faster and smaller.

    function sqrt(x : real) : real;
        returns the approximate square root of x.

    function sin(x : real radians) : real;
        returns the approximate sine.

    function cos(x : real radians) : real;
        returns an approximation of the cosine.

    function tan(x : real radians) : real;
        returns the approximate tangent of x. If x is a multiple of Pi/2, this will blow up.

    function arctan(x : real) : real;
        returns the approximate arctangent (in radians) of x.

    function eof(x : any file) : boolean;
        returns true if you are at the end of an input file.
    function adr(var x : any variable) : Address;
        returns the address of the variable in question.

    function SizeOf(t : name of a type) : Integer;
        returns the size of the specified type, which must be a single identifier.

    function Bit(t : Integer) : Integer;
        returns the number corresponding to the bit position specified. In other words it returns an integer with just the one bit set.

    function IOResult : Integer;
        returns the result of the last IO statement. If it is non-zero, it's probably an AmigaDOS error code. This call clears IOResult. If IO checking is off and there is an IO error, IOResult will become non-zero and no subsequent IO statements will have effect.

There are two other standard functions (open and reopen), but since they are IO functions I'll describe them in the Input/Output section.

There is also a syntax like 'typename(expression)' supported by the language. It looks like a function, but isn't, and will be explained in a later section called Type Conversions.

PROCEDURE

The standard procedures are Write, Writeln, Read, Readln, Get, Put, New, Dispose, Exit, Trap, Inc and Dec. The first six will be covered in the IO section. The other six are:

    Procedure New(var x : any pointer variable);
        This allocates public memory the size of whatever type is pointed to, then puts the address into x. PCQ allocates memory using Intuition's AllocRemember() routine, so that at the end of execution all the memory allocated through new() is returned to the system. This means that you don't absolutely have to call dispose() for every new(), although you should. If the allocation fails, the program aborts with a run-time error.

    Procedure Dispose(var x : pointer variable);
        This returns the allocated memory to the system. If something got confused, and you try to dispose of memory you never allocated, this will just return. Unfortunately that means you may never diagnose a problem in your program, but at least it won't be calling the Guru all the time.
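As an illustration of New() and Dispose() as described above, here is a minimal sketch. The program name, the Point record and its fields are made up for the example, and it has not been run through PCQ itself, so treat it as a guide to the shape of the calls rather than a tested program.

```pascal
Program NewDemo;

{ Point and PointPtr are hypothetical types invented for this sketch }
type
    Point = record
        x : Integer;
        y : Integer;
    end;
    PointPtr = ^Point;

var
    p : PointPtr;

begin
    { New allocates room for one Point and stores its address in p }
    New(p);
    p^.x := 3;
    p^.y := 4;
    writeln(p^.x + p^.y);
    { Not strictly required, since memory from New is freed at exit,
      but the manual recommends pairing every New with a Dispose }
    Dispose(p);
end.
```

Note that the record is declared before the pointer type that refers to it, which sidesteps the forward-reference limitation mentioned in An Examination of Its Ills.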
    Procedure Exit(error : integer);
        Exit() aborts a program early. It is the acceptable method of escaping a program. It does the same stuff that the program normally does when it quits, then returns the error number you give it to AmigaDOS. Any exit procedures you have installed can recognize a program that terminated due to the Exit() procedure because ExitAddr will always be Nil. According to convention, the error number should be zero if the program terminated correctly, 5 for a warning, 10 for an error, and 20 for a catastrophic error.

    Procedure Trap(num : integer);
        The argument for this procedure must be a constant expression, although the type doesn't matter. All it does is insert a 68000 trap instruction into the code at the point of the statement. This is useful for the debugger I use, and for nothing else I can imagine. It effectively inserts a break point in the program.

    Procedure Inc(x : any ordinal or pointer type);
        If x is an ordinal type, Inc() just adds one to it. If it is a pointer type, Inc() adds the size of whatever x points to. If x is a string, for example, Inc() just adds one. If x is an Address type, it adds four (no particular reason for that, by the way).

    Procedure Dec(x : any ordinal or pointer type);
        Dec() is exactly analogous to Inc(), in that it subtracts either one or the size of whatever the pointer points to.

Extra Statements

First of all, PCQ supports if, while, repeat, for, case, goto and with statements. The if, while, repeat, goto and with statements work just like the Standard Pascal report says they should.

The case statement is now much more like normal Pascal than it was in version 1.0. Each case can have any number of constants or constant ranges. At the end of the case construct, as the final case, you can have an ELSE statement. This will execute, not surprisingly, if none of the cases is true.
Thus a couple of example case statements are:

    case Letter of
        'a' : statement1;
        'b'..'g' : statement2;
        'j',
        'm'..'o',
        'h' : statement3;
    else
        statement4;
    end;

    case Number * 5 of
        -MaxInt..0 : statement1;
        1..MaxInt : statement2;
    end;

The for statement supports 'downto', which changes the increment from 1 to -1. It also supports 'by', which allows you to set the increment. The argument for the 'by' part can be any normal expression, but for any negative increment you must use 'downto' rather than 'to', or the loop will only run once. For that matter all 'for' loops run at least one time. Anyway the syntax looks something like:

    for <variable> := <expression> to|downto <expression>
        [by <expression>] do
        <statement>;

The other statement included is 'return', which simply aborts a PROCEDURE early. You can abort a FUNCTION early by assigning a value to the function name, so 'return' works only in procedures.

Reserved Words

The reserved words of PCQ are as follows:

    and         for         procedure
    array       forward     program
    begin       function    record
    by          goto        repeat
    case        if          return
    const       in          set
    div         label       then
    do          mod         to
    downto      not         type
    else        of          until
    end         or          var
    external    packed      while
    file        private     with

As you can see, even the unimplemented stuff is reserved.

Expressions

The compiler will accept the normal expression syntax, like most programming languages. It will also accept several new operators similar to ones in Turbo Pascal and C. These are:

    Xor
        This operator returns the exclusive-or result of the two operands. For example, 3 xor 5 returns 6. This is the same as the Turbo XOR operator, or the ^ operator in C. This operator has the same precedence as the +, -, and OR operators.

    Shl
        This operator shifts left the value on the left the number of bit positions specified on the right. Thus 1 shl 5 = 32. This again is the same as the Turbo operator and the C << operator. It has the same precedence as the *, /, div, and AND operators.

    Shr
        This is the same as Shl, but shifts the value to the right.
        It uses logical rather than arithmetic shifts, so negative values will provide positive results.

Hexadecimal representation (a dollar sign followed by hex digits, as in $7FFF) can be used anywhere an integer is expected.

Floating Point Math

As of version 1.0c, real numbers are integrated into the language. In the program text they can be specified using the normal syntax of a series of digits, followed by a period, followed by any number of digits. The syntax that looks like 1.0876E-4 is NOT supported.

The only math operators supported are +, -, /, and *. The rest of the MathFFP.library is also accessible. The standard functions pertaining to real math are Abs(), floor(), ceil(), trunc() and float(). I have included some reasonable sin() and cos() functions, which are accurate to about four digits, and reasonably fast. I also added a sqrt (square root) function based on Newton's method. It is accurate enough that sqr(sqrt(x)) is less than x/10000 off.

Functions like exp() and ln() are not handled by the MathFFP.library. They are located in MathTrans.library, which is disk based. Thus whenever you write a program that needs these functions, the system disk will have to be inserted in order to get to LIBS:. Read MathTrans.i for further information about all this.

The Limits of PCQ

The compiler can accept lines of any length, although it will display at most the previous 128 characters read in if an error occurs. As far as the size of the file is concerned, it can be any length (the only part of the file that is in memory at any time is the current character), with, of course, a few caveats.

The main limit is that, since the compiler produces lots of assembly code output, there must be room on the disk for the whole file. The assembly output is, as a rule of thumb, as much as five times as large as the Pascal source.

Typed Constants

Turbo and Quick Pascal in the MS-DOS world have introduced typed constants to the Pascal world.
These objects serve the same purpose as initialized variables in C, and why they are not defined as variables I don't know. In the interest of molding the syntax of PCQ Pascal after that of Turbo (the working standard), I maintain their odd Constant idea.

As I mentioned above, the syntax of typed constants looks like:

    CONST
        Identifier : TypeDefinition = Constant Expression;

The identifier is a normal Pascal identifier, followed by a colon, followed by any full type expression, an equal sign, and a modified constant expression. These expressions are the normal constant expressions (just like normal expressions without standard functions), augmented by a syntax for referring to arrays and records.

Specifying types like Integer, Real, Char, Byte, Boolean, etc. is done in the same way as you would expect.

Specifying arrays is done by starting off with a left parenthesis, followed by a number of elements separated by commas, and ended by a right parenthesis. The elements themselves would normally be integers or characters, but could also be arrays or records themselves. In an array definition there is always the same number of elements as there are elements in the array; any difference is an error.

The exception to the normal array format is in arrays of characters. These are specified in the same way as most character arrays: an apostrophe followed by characters and ended with another apostrophe.

Records are defined in the same way. A left parenthesis, followed by the definition of each of the elements, followed by a right parenthesis.

Pointers have a special syntax. They will mostly be defined as Nil, but can also take the value of the address of previously defined global variables and typed constants. This is done by using the '@' operator, also borrowed from Turbo Pascal. The '@' operator returns the address of the following identifier, and is only used in this context.

Some typed constants can even be used in subsequent constant expressions.
In this case the initial value of the constant is always used. This value is only meaningful for simple types like integers, reals, characters, and strings. For arrays and records you will get a nonsense result.

Typed constants declared in procedures and functions can't be referenced outside of the function, of course, but they do have the interesting property that they retain their value across calls to the routine. This will screw up recursive routines, so be careful.

Examples of the typed constant definitions include:

    TYPE
        Array1 = Array [-4..-1] of String;
        Array2 = ^Array1;
        Array3 = record
            Name : String;
            Letter : char;
            Value : Real;
        end;

    CONST
        Pi : Real = 3.1415;
        Val1 : Array1 = ("Message 1", "Second Message", Nil, "Last");
        Val2 : Array2 = Nil;
        Val3 : Array2 = @Val1;
        Val4 : Array3 = ("Ziasus Pomouk", '\n', Pi);

Strings

As was mentioned above, strings should be thought of like '^char'. They are defined that way, but also are given special properties. They can be dynamically created, sized, and disposed of.

A string is supposed to be terminated by a zero byte, so if you write any string handling routines be sure you follow that convention. Otherwise you'll confuse all the other string routines.

In the text of a program, you delineate strings with double quotes, instead of the single quotes found around normal arrays of char. Thus:

    "A string"

is indeed a string, while

    'not one '

is considered an array [1..8] of char.

The other interesting thing about strings is that they can have C-like escape sequences. What happens is that you type a backslash (looks like this: \), and the very next character is specially handled.
C has a bunch of these things, and I've included most of them, including:

    \n    Line Feed, chr(10)
    \t    Tab, chr(9)
    \0    Null, chr(0)
    \b    Backspace, chr(8)
    \e    ESC, chr(27)
    \c    CSI (Control Sequence Introducer), chr($9B)
    \a    Attention, chr(7)
    \f    Form Feed, chr(12)
    \r    Carriage Return, chr(13)
    \v    Vertical Tab, chr(11)

Everything else passes through unchanged, so that you can also use this mechanism to include double quotes in your strings. And you have to use it to include backslashes. What this all boils down to is that the string

    "A\tboy\nand\\his \"dog.\""

prints out like:

    |A       boy
    |and\his "dog."

There is something called StringLib.i in this archive that declares a few string handling routines, mostly the ones I needed for the compiler. Read that file for more information. And if you get confused about strings, just remember that they're pretty much like C strings, and can be used in most of the same situations.

Remember that if you declare a string you don't get any space for the characters. All you get is space to hold the address of where the characters are, so you have to call AllocString() in StringLib or something like it to get some room to work. If you are a BASIC programmer you might run into some difficulty on this subject, and I would suggest reading up on C strings in hopes that whatever you read can explain the situation better than I.

By the way, note that 'stringvar^' is valid, and is of type 'char'. The other way to examine characters in a string is with the index notation. For example 's[3]' returns the fourth character in the string 's', since all string indexes begin at zero.

Exit Procedures

Yet another feature imported from Turbo Pascal. When your program ends, the exit routine will look at the value of ExitProc, a standard global variable. If it is not Nil, the exit routine calls the routine pointed to by ExitProc. Just before doing so, it puts the value Nil in ExitProc.
When that routine returns, the exit procedure again looks at ExitProc, and until it becomes Nil it keeps calling the routines.

If you want to install an exit procedure, you first save the address of the previous exit procedure, then set ExitProc to the address of yours. Part of your procedure should be to set ExitProc to its previous value. In this way, exit procedures form a nested chain, and each procedure is called in the reverse order that it is installed.

From within your exit procedure, you can examine the variables ExitAddr and ExitCode. ExitAddr holds the location at which a run-time error occurred, so if you want to return to the program it is, theoretically, possible. More frequently you'll use this value to determine the general area in which the error occurred. If you have entered the exit procedure by way of the standard Exit() function (not the DOS function), this value will be Nil.

The ExitCode just holds the return value the program will return to DOS. If you called the exit() procedure, ExitCode just holds the value you passed to that routine. Otherwise it's a runtime error code, or zero if there was no error.

By default, ExitProc holds the address of a routine that closes all open PCQ Pascal files and frees all memory acquired by New(). If you get rid of this routine in the chain, you might consider replacing the parts you want.

Compiler Directives

Eventually there will be billions of compiler directives, but for now there are just a few. Compiler directives work like this: if the first character in a comment is the dollar sign ($), the compiler looks to the next character for a command. No spaces are allowed between the bracket, dollar sign, and command character. Some directives can be followed by others: if a comma is the first character after a directive, the next character is considered the beginning of another directive. Thus the following is legal:

    {$O-,R+}

The I and A directives can't be followed by other directives, although they can be preceded by them.
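To show where a combined directive might sit in a source file, here is a small sketch. The program name and variables are invented for the example, the placement of the directive after the Program line is an assumption, and the code has not been compiled with PCQ; the R and O directives themselves are described in the list that follows.

```pascal
Program DirectiveDemo;

{$O-,R+}
{ The comment above holds two directives separated by a comma:
  O- turns IO checking off, R+ turns array range checking on.
  There are no spaces between the bracket, the dollar sign and
  the first command character. }

var
    Table : Array [1..4] of Integer;
    i : Integer;

begin
    { Each Table[i] access is range checked because of R+ }
    for i := 1 to 4 do
        Table[i] := i * i;
    writeln(Table[4]);
    { With O- in effect the program must check IOResult itself }
    if IOResult <> 0 then
        Exit(10);
end.
```

The ordinary comment is kept separate from the directive comment so that its text cannot be mistaken for part of a directive.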
The compiler directives are:

    {$I "fname"}
        This will insert the file "fname" into the stream at this point. When it has finished, it will end the comment (no more directives allowed in this comment) and continue on. There can be any amount of white space in front of the filename and anything you want, such as the rest of a comment, after it. The filename is a string, so it must be in quotes. Several of the example programs should demonstrate the include syntax.

        As of version 1.0c include files have two new properties. The first is that they can now be nested. That is, an include file can include another file. The second feature is that PCQ now keeps a list of included files, and will not include a particular file name twice. It only considers the actual file name for this, not the directories or drives.

    {$A
        Instructions
    }
        This directive inserts assembly instructions into the assembly file produced by the compiler. Look at the assembly language produced by the compiler to figure out how to reference variables and subroutines. This directive simply passes everything from after the A until, but not including, the closing bracket. You should therefore include comments in assembly fashion.

    {$R+} or {$R-}
        The '+' directive instructs the compiler to produce range-checking code for arrays. From this point until the compiler reaches a {$R-} directive, each array access will check that the index value is within the bounds of the array. This expands and slows the code, so I recommend only doing this during testing. If the index is out of bounds, the program will abort with an error code (look at the section "Run Time Errors" for more information). The default for this directive is {$R-}.

    {$O+} or {$O-}
        This directive controls IO checking. A test is inserted after every IO operation (writeln, readln, etc.), and if there was an error, the program aborts with an AmigaDOS error code.
If this feature is turned off, you will have to call IOResult after every questionable IO operation. This defaults to {$O+}, just like Turbo Pascal. {$SN} or This directive controls object declarations and {$SX} or storage. SN (which stands for Normal Storage) {$SP} allocates space for all the global variables and typed constants it runs across, and makes the identifiers available to other units to import. SX (External Storage) assumes that all subsequent variables and typed constants were defined and exported by some other unit, so the current unit just imports the name. SP (Private Storage) allocates space for all variables and typed constants it runs across, but does not export their names. It does not export the names of procedures or functions, either. The default for normal files is SN, and the default for External units is SX. Type Conversions If you have used Modula-2, you can skip this section. In writing the compiler I found the need to cheat a bit on type checking, so I decided to use Modula-2's syntax for changing the type of an expression. What you do is use the name of the type as if it were a function. The expression in the parentheses is evaluated, and the result is considered to be of the type named. It goes like this: IntegerVariable := integer(any ordinal expression); CharVar := char(456 - 450); if boolean('r') then .... This works not only for the included standard types, but also for any type you create. Thus this is also legal: type charptr = ^char; var charvar : charptr; .... charvar := charptr(0); charvar := charptr(integer(charvar) + 1); Note that the type must be named in order for this to work. Something like... variable := array [1..4] of char(expression()) ...will not work. Note further that not all type conversions are valid. Converting a type to one of a different size is often a bad idea, as is converting a structured type (array or record) to a simple type. 
I should probably warn you against the indiscriminate use of these, but what the heck. Have a ball.

External References

In version 1.0 of the compiler, procedure and function references were made external by a failure to define a forward-declared procedure or function. Version 1.1 changes this arrangement to be more consistent with other Pascal compilers. Now in order to declare an external procedure or function, you simply use the External key word. Therefore:

    Procedure DefinedElsewhere;
        External;

....would simply generate an external reference.

Now for something somewhat less kosher. I needed some syntax to allow the external routines to access the same global variables as the main file. What I came up with is a different file format. Where the normal Pascal file looks like:

    program Name;
        declarations
        procedures and functions
    begin
        main program
    end.

The external file looks like this:

    external;
        declarations (like normal)
        procedures and functions (like normal)

There are three things to note. The first is that there is no main program, the second that there is no special ending syntax. It is just a bunch of procedures and functions in a row until the end of the file. The other thing is that any variables declared at the global, or outermost, level are considered, by default, external references. In the source for the compiler there is a file that has just the global variable declarations. This file is included by all ten of the source files, but only the main file produces storage space for them. The other nine just produce external references. This can be changed by using the $S compiler directive explained above.

I guess this is a good time to discuss a couple of issues related to using an assembler with the separate compilation deal. First, note that all procedure, function and variable names are offered as external references by the module in which they are defined, unless the storage mode has been set to Private by the $SP directive.
If an outside routine wants to use any of these values, it should be looking for something starting with an underscore and spelled the same as the first time the word is encountered in the program. Pascal is case insensitive, of course, but I can't help the assembler and linker. Also remember that there is no type checking across files (again, get Modula-2 if you want that sort of stuff). This means that a procedure that expects a string might be sent a Boolean value, which would probably conjure the Guru.

The other thing to note is that this compiler pushes procedure and function arguments on the stack from left to right. Most C compilers (including Lattice and PDC) do it the opposite way, so they can have variable numbers of parameters. Draco also does it left to right. This doesn't mean that you can't use code and libraries from them - it simply means that you should reverse the order of the arguments.

Just two more notes on this subject: first, the compiler considers registers d0, d1, d2, a0, and a1 fair game, and will destroy them at will. d2 might be a problem, but the others shouldn't. For further information, just look at the assembly code produced. The second note is just a reminder to anyone who might want to link Pascal programs to other languages: remember what 'var' does before a variable, and be sure to use it correctly.

Input/Output

There are several routines for handling IO in PCQ. Before I get to them, however, let me discuss what happens when you open a file. The actual file variable you declare in the program, as in:

    var
        filevar : file of integer;

is actually something like a record, which would look like this:

    file = record
        HANDLE      : A DOS file handle
        NEXT        : A pointer to the next file in the system list
        BUFFER      : The address of the file's buffer
        CURRENT     : The current position within the buffer
        LAST        : The last position of a read
        MAX         : One byte past the last byte of the buffer
        RECSIZE     : The size of the file elements
        INTERACTIVE : A boolean value
        EOF         : Another boolean value
        ACCESS      : Either ModeNewFile or ModeOldFile
    end;

Now you can't actually access these fields, but nonetheless 32 bytes of memory are reserved. When you open a file, all of the fields are initialized as necessary, and if the file is an input file and it's not interactive, the buffer is filled. The buffer can be accessed by the filevar^ syntax, which in version 1.1 is considered an IO statement (therefore it might be followed by an IO check).

If at the end of execution there remain some open files, the shut-down code will close them for you. This is only true for files opened through Pascal, using one of the open() routines explained below. Anything you open directly through AmigaDOS is your own responsibility.

The routines that handle file IO are these:

    Function Open(filename : string;
                  filevar : file of something, or Text
                  {; BufferSize : Integer}) : Boolean;

This opens a file for writing. If the file was there before, this routine will erase it. If everything worked OK, it will return true. If not, of course, it's false. The last parameter, the buffer size, is optional. If you specify a value, the Open routine will attempt to allocate a buffer of approximately that size. If you don't specify a value, 128 will be used. Also note that the actual buffer size allocated will be:

    (RequestedSize div RecSize) * RecSize

If that value is zero, RecSize is used.

    Function ReOpen(filename : string;
                    filevar : file of something, or Text
                    {; BufferSize : Integer}) : Boolean;

This is analogous to open() except it opens an existing file for reading. You can also specify a value for the buffer size. If the file turns out to be interactive (connected to a console), the actual buffer size allocated will be RecSize.

The rest of the routines are the same as in most Pascals. Just for the sake of completeness, however, they are:

write()
    Write the stuff to a file or to standard out.
    This mimics the sequence:
        FileVar^ := x;
        Put(FileVar);

writeln()
    Do the same as write, then output a line feed. This only makes sense for Text files.

read()
    Read some stuff from a file or standard in. read(filevar, x) mimics...
        x := filevar^;
        get(filevar);

readln()
    Do read then keep reading until you hit a line feed. This too only makes sense for Text files.

get()
    Reads the next file element from the file into the buffer.

put()
    Advances the file pointer past the current file element, flushing the buffer to disk if necessary.

If the first argument of a read or write is a file variable, the input or output goes through that file rather than through Input or Output, as the case may be. That, of course, is normal Pascal, and looks like:

    writeln(outfile, 'The result is ', 56 div 4);

Field widths are supported, and can be any normal expression. What this means is that something like...

    writeln((67 * 32) + 5:10);

... will print the result right justified in a field of ten characters, with spaces padding out the area to the left. If you specify a field width lower than the width of the number, the number is printed in as few characters as possible. Valid values for the field width are greater than or equal to one and less than MaxShort. You can specify a field width for any type in a write statement, although only when writing to a text file.

Real numbers take two field widths. The first is used just like the one for integers. The second one is not required, and specifies the number of places after the decimal point to print. If it is zero, no numbers and no period are printed. The maximum for this is about 30 digits, which is well beyond the accuracy limits of FFP anyway. The defaults for this are 1:2.

Just for the sake of precision, I'll go over the delimiters for IO on Text files with various types:

Write Char
    Writes one character.

Write Boolean
    Writes TRUE or FALSE, with no extra spaces.

Write Integer
    Writes the number with no extra spaces, but possibly a negative sign.

Write Real
    Writes the integer part of the number just like an integer, then if the second field width is > 0 or absent it prints a period followed by the number of characters in the second field width.

Write Array of Char
    Writes the entire array, from first element to last.

Write String
    Writes from the first character up to but not including the zero byte.

Writeln
    Writes a single EOLN (chr(10)) to the file.

Read Char
    Reads the next char.

Read Boolean
    Can't do it.

Read Integer
    This eats spaces and tabs until it meets up with something else, then eats digits until it comes upon a non-digit. It does not eat that last non-digit. If the routine runs across an EOLN before it gets to the first digit, it returns zero. If it finds letters before it finds digits, it returns zero also.

Read Real
    Reads an integer just like the above. If the next character is a period, it reads it then reads digits until something other than a digit is found.

Read Array of Char
    Reads characters into the array until either the array is full or the routine finds an EOLN. If it finds an EOLN it will not eat it, so you'll have to do that with a readln if you want. If it returns because of an EOLN it will also pad the rest of the array with spaces.

Read String
    Reads characters until it gets an EOLN. The EOLN is left in the input stream, and a zero is put in its place in the string. Note that this routine does not check for length, so you must be sure that your string can handle the longest line it might encounter.

Readln
    Reads characters up to and including the next EOLN.

Also remember EOF(filevar) and IOResult, from the functions. For examples of all of these, look at the example programs. Also note that the filevar^ sort of syntax is present. Look at a Pascal text to understand it (Turbo Pascal doesn't use it, so it might be Greek to a lot of Pascal programmers).
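The routines in this section might be combined along the lines of the following sketch, which writes ten integers to a 'file of Integer' and then reads them back. The file name, program name and variable names are invented for illustration. Close() is not described in this section; the sketch assumes it takes the file variable, as in other Pascals. The listing has not been compiled with PCQ, so the example programs remain the better reference.

```pascal
Program SquaresDemo;

{ Hypothetical example.  Assumes Close(filevar) is available. }

var
    Squares : file of Integer;
    i, n    : Integer;
begin
    { Open() creates the file for writing, erasing any old copy. }
    if Open("RAM:squares.dat", Squares) then begin
        for i := 1 to 10 do
            write(Squares, i * i);
        Close(Squares);
    end else
        writeln('Could not create the file');

    { ReOpen() opens the existing file for reading. }
    if ReOpen("RAM:squares.dat", Squares) then begin
        while not EOF(Squares) do begin
            read(Squares, n);
            writeln(n:6);      { field widths apply to Text output only }
        end;
        Close(Squares);
    end else
        writeln('Could not reopen the file');
end.
```

The file name is given in double quotes because Open() and ReOpen() take a string, while the messages passed to writeln use single quotes like the other examples in this section.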
Standard IO

One of the tricky parts about programming on the Amiga is that there are two distinct environments. The CLI invokes a program in much the same way as traditional computers, whereas the Workbench sets the program up with, basically, nothing. In particular, the Workbench does not set up standard IO channels, which are always provided by the CLI. Version 1.0 of PCQ Pascal handled this by automatically opening a console window if a) the program was invoked by the Workbench and b) it tried to do a Read or Write to the standard IO channels, which are now named files (Input and Output) but were not then.

That has changed somewhat in version 1.1. If your program is invoked by the Workbench, the startup code looks for a string variable called StdInName. If you don't declare a string by this name, the program will use a default value included in PCQ.lib. If this string is Nil, the program will not open a standard IO channel, and will go on to try to open an output channel. If StdInName is not Nil, the startup tries to open a file by the name specified. If it can't open it, the program dies with a runtime error. If it opens OK, and it's an interactive file (attached to a window), and if StdInName and StdOutName point to the same string, the same file is used as Output. Otherwise the code goes through much the same process for StdOutName.

Somewhere deep inside PCQ.lib is the equivalent of the following fragment:

    CONST
        StdInName : String = "CON:0/0/640/200/";
        StdOutName: String = StdInName;

Thus if you run a program from the Workbench, the startup code will by default open a full screen, unnamed window. If you don't want the window, include the following fragment in your code:

    CONST
        StdInName : String = Nil;
        StdOutName: String = Nil;

In this case, you'd better not use Write or Read without specifying a file, or you could cause a Guru.
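By the same mechanism a program can presumably ask for a smaller, titled window instead of the default one. The sketch below is an untested illustration: the program name, window position, size and title are all invented, and the declaration of StdOutName simply mirrors the PCQ.lib fragment shown above so that both names point to the same string.

```pascal
Program Hello;

{ Hypothetical example: replaces the default console window that the
  startup code opens when the program is started from the Workbench. }

CONST
    StdInName  : String = "CON:20/20/400/100/Hello from PCQ";
    StdOutName : String = StdInName;   { same string, so one shared window }

begin
    writeln('Press RETURN to close this window.');
    readln;
end.
```

The window specification follows the usual AmigaDOS form CON:x/y/width/height/title. When the program is run from the CLI these constants should make no difference, since the CLI supplies the standard IO channels itself.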
The reason I changed this, by the way, is because the new IO system uses buffered IO, so the program doesn't know the IO channels aren't open until after it's already tried to write to them.

One last thing about the standard IO files. You can access them by name, as Input and Output. If the program was run from the CLI, you can't close the files - the CLI opened them for you, and will close them. If the program was run from the Workbench, then you are allowed at least to close Input. If Input and Output refer to different files (according to the rules above), then you can close them both. In fact if you get rid of the final exit procedure (the one that closes all open files and frees all the memory), you will have to close the files opened by the startup code. The point of all this is that if the startup code opens standard IO files, they can be considered normal PCQ Pascal files. At least one of them.

Errors

As I mentioned somewhere above, most errors will completely confuse the poor compiler, which will then start spewing out errors that don't really exist. It can get by a couple of errors - for example if you leave out a semicolon somewhere, you should get an error message but the assembly file should be valid. Very few other errors will work that well. I hope to make the compiler a bit friendlier, but in the meantime it will abort the compile if it gets 5 errors. I put this in because the compiler will sometimes get one error, then start producing errors on every symbol, and even get hung up on a symbol. Really ugly.

If an error occurs, the compiler will write out at most the two lines leading up to the error, and highlight the part that it's currently working on. The error probably occurred either at the highlighted symbol or just before it. Also note that the highlighted symbol is always the last symbol written (when the symbol is just some punctuation, it can be difficult to see that it is highlighted).
On the next line is the line number of the error and the explanation of the error. Currently I'm using text descriptions of the errors, so there are no error numbers.

If you specified the "-q" command line directive, the error reports will print something like "Line ### : Error Msg". This is so that automatic routines, in particular AREXX, will have an easier time parsing the error reports.

Run Time Errors

Several things can cause run time errors. The few that are handled at the moment are:

    Error   Explanation

     50     No memory for IO buffer
     51     Read past EOF
     52     Input file not open
     53     Could not open StdInName
     54     New() failed
     55     Integer divide by zero
     56     Output file not open
     57     Could not open StdOutName
     58     Found EOF before first digit in reading an integer
     59     No digits found in reading an integer
     60     Range error

The error number is returned (through the exit() function) to AmigaDOS.

If any of these errors occur, ExitCode will be set to the appropriate number, and ExitAddr will have the address where it occurred (actually the instruction after the error). You might be able to install exit procedures to gracefully handle these errors: see the section called Exit Procedures for more information.

Sources

Like I said, I wrote this for the learning experience. Some of the places I went for information are:

1. PDC, a freely distributable C compiler supported by Jeff Lydiatt. This is a very good program, and one of the best freely available compilers for the Amiga (the other really good one is Draco by Chris Gray). I learned (and used) a lot about activation frames from the listings produced by this compiler. Looking at the assembly code produced by this compiler was also my inspiration for starting to write a compiler.

2. Pascal-S, the Pascal compiler produced out of ETH Zurich. I got some ideas about the structure of a compiler from this, but not too many.

3. Small-C, another freely distributable C compiler.
This one is not nearly as powerful as PDC, but its simplicity helped me understand a thing or two. Probably the best compiler source code that I found to learn from. This and PDC were the compilers I used before this compiler was able to compile itself. Many aspects of the design of PCQ come from Small-C. 4. Brinch Hansen on Pascal Compilers, by Per Brinch Hansen. This book was of some use, which is more than I can say about the other half dozen I read while writing this. From this book I mainly learned about all the things I was doing wrong. Great. 5. Sozobon-C. This is a freeware C compiler for the Atari ST that was recently partially ported to the Amiga. I got my 32 bit math routines from this project already, and I might lift some floating point math as well. 6. The Toy Compiler series in Amiga Transactor, written by Chris Gray. This series is very informative, and is written by the author of Draco. Gray also writes compilers for a living. If you like the idea of freely distributable compilers, please be sure to check out Draco from Chris Gray (a new version is on Fred Fish disk 201, I think), PDC from Jeff Lydiatt (an old version is on Fred Fish 110) and Sozobon-C. All three are much better products than PCQ and even rival the commercial compilers. I'm not sure what a good source for the newer version of PDC would be - perhaps you could write to Jeff (it's certainly worth it. PDC has a full preprocessor, a 'cc' front end, very fast optimized code ... the works). The syntax of Draco, by the way, is fairly similar to Pascal. Notes to Assembly Programmers During the course of a program PCQ uses registers d0, d1, a0 and a1 as scratch. It also uses d2 and d3 during IO calls and d2 when comparing or assigning large data structures. D2, D3, and A2 are all blown away by the 32 bit math routines. a7 is, of course, the stack pointer, and I use a5 as the frame pointer. 
a6 is used to hold the library base during any call to the system, and a4 is used to access local variables of a parent procedure. The other registers are free, and in fact the scratch registers should be free for you to use between statements. After all, the compiler does no optimizing. If you make a call to a 'glue' routine, you should expect all registers used in passing parameters to be scratch. Improvements On The Burner Version 1.1 has all the features I predicted it would have in the documentation for version 1.0, and many more. In general future enhancements will incorporate more and more of the features of Turbo and Quick Pascal. The feature that will motivate version 1.2 will be better code generation, through the simple device of creating expression trees before generating the code for them. That will provide dramatically smarter, and smaller, code. Otherwise, it's fixing bugs. Update History Version 1.1c, March 3, 1990: The only changes to the compiler are the new standard functions. The more significant changes were in the runtime library. First, I replaced the sin() and cos() functions based on suggestions by Martin Combs - the result is that the results are accurate to about 3 digits, and only slightly slower. Martin was kind enough to send along a very useful set of routines, which also included the tan() and arctan() functions. I also fixed the routine that writes real numbers, so values between -1.0 and 0.0 now include the minus sign. Version 1.1b, February 6, 1990: This program is over a year old. Added the Sqr() function. Sqr(n) is the same as n * n, but marginally faster and smaller. Also, the compiler used to generate lots of errors when an include file was missing. Now it skips the rest of the comment, like it should. Apparently floating point constants didn't used to work. Why am I always the last to know? I also added the Sin() and Cos() functions, based on an aside during a lecture on an entirely different topic. 
Later I added the sqrt() function, using Newton's method.

Version 1.1a, January 20, 1990:

Fixed a bug in the WriteArb routine that manifested itself whenever you wrote to a 'File of Something'.

Fixed a bug left in the floating point math library. It seems that it had not been updated for all the 1.1 changes, so during linking it required objects that aren't around anymore. Since floating point math is now handled by the compiler, I hadn't noticed it before.

Version 1.1, December 1, 1989:

This version is completely re-written, and has far too many changes to list them individually here. The main changes are the with statement, the new IO system, a completely redesigned symbol table, nested procedures, and several new arithmetic operators. In order to help port programs from Turbo Pascal and C, I added typed constants, the Goto statement, and the normal syntax for multi-dimensional arrays.

Version 1.0c, May 21, 1989:

I changed the input routines around a bit, using DOS files rather than PCQ files. I buffered the input, and made the structure more flexible so I could nest includes. Rather than make up some IfNDef directive, I decided to keep track of the file names included and skip the ones already done. Buffering the input cut compile times in half. I would not have guessed buffering would be that significant, and I suppose I should rethink PCQ input/output in light of this.

I added code to check for the CTRL-C, so you can break out early but cleanly. The Ports.i include file had a couple of errors, which I fixed, and I also fixed the routine that opens a console for programs that need one. It used to have problems when there were several arguments in the first write().

I added the SizeOf() function, floating point math, and the standard functions related to floating point math. There were several minor problems in the include files which I found when I got the 1.3 includes, the first official set I've had since 1.0.
I relaxed the AND, OR and NOT syntax to allow any ordinal type. This allows you to get bitwise operations on integers and whatever. I also added a standard function called Bit(), described above. These are all temporary until I can get sets into the language. I finally added string indexing. In doing so I found a bug in the addressing routine selector(), so I rewrote it to be more sensible. I think it also produces larger code, but I'm not too worried because I'm going to add expression trees soon anyway. Version 1.0b, April 17, 1989: I fixed a bug in the way complex structures were compared. It seems that one too many bytes were considered, so quite often the comparison would fail. Version 1.0a, April 8, 1989: This version added 32 bit math, and fixed the case statement. The math part was just a matter of getting the proper assembly source, but I changed the case statement completely. Version 1.0 of the compiler produced a table that was searched sequentially for the appropriate value, which if found was matched up with an address. I thought all compilers did this, but when debugging a Turbo Pascal program at work I found that it just did a bunch of comparisons before each statement, as if it were doing a series of optimized if statements. I had thought of this and rejected it as being too simplistic, but if it's good enough for Turbo it's good enough for me. The next thing I changed in this release was the startup code. You can now run PCQ Pascal programs from the Workbench. This was just a matter of taking care of the Workbench message, but I also fooled around with standard input and output. If you try to read or write to standard in or out from a program launched from the Workbench, the run time code will open a window for you. I also fixed one bug that I found: an array index that was not a numeric type had its type confused. Nevermore. Version 1.0, February 1, 1989 Original release. 
Other Notes, Copyright & My Address As I mentioned above, this documentation, the source code for the compiler, the compiler itself, the source code for the run time library, and the run time library itself, are all (ahem): Copyright (c) 1989 Patrick Quaid. I will allow the package to be freely distributed, as long as all the files in the archive, with the possible exception of the assembler and linker (please include them if at all possible), are included and unchanged. Of course no one can make any real money for distributing this program. It may only be distributed on disk collections where a reasonable fee is charged for the disk itself. A reasonable fee is defined here as the greater of $10 per disk, or whatever Fred Fish is currently charging (about six dollars as I write this). Only one distributor is specifically prohibited from distributing this package: Stefan Ossowski, who evidently lives in Essen, West Germany. He charges far too much for my disk, and action has been taken by German Amiga programmers against him. Feel free to mess around with the compiler source code. If you make any substantial improvements, I would appreciate a copy of them so that they can be incorporated into the next version if appropriate. If you make improvements that are not along the lines of standard Pascal or the path indicated above, please don't distribute your program under the name PCQ. That would only confuse things. This is not a shareware package. Feel no guilt about using it without paying for it. The one payment I would really appreciate is if you could let me know about bugs you discover (not unimplemented features- I know about them. I'm not trying to write the end-all greatest compiler, but I do want it to be correct). If you have an overwhelming urge to give money away, please send a donation to Charlie Gibbs, who wrote the assembler, and the Software Distillery, who wrote the linker. 
If you would like me to send you the latest version of the compiler, keep the following in mind. Disk mailers cost me about 50 cents, postage costs me 75 cents, and disks cost about a buck. Therefore I would consider anything over $2.25 to be adequate to cover my costs, and I don't want anything more.

Any questions, comments, or whatever can be addressed to:

    Pat Quaid
    8320 E. Redwing
    Scottsdale, AZ 85250
    (602) 967-3356

They changed our ZIP code (the one listed is the new one), but the old one should work for quite a while. You are much more likely to be able to contact me by mail than by phone, but I certainly don't mind if you try.

Enjoy the compiler. If you have any complaints, remember what you paid for it.