Text File | 1989-02-25 | 49KB | 1,167 lines

PCQ version 1.0
A very simple Pascal compiler for the Amiga
by Patrick Quaid

PCQ (which stands for Pascal Compiler, um, Q ... look, I couldn't come up with a name so I used my initials, OK?) is a modest Pascal subset compiler that produces assembly code. It is not in the Public Domain (I retain the copyright to the source code, the compiler, the run time library source code, the run time library, and this documentation), but it can be freely distributed as long as all the files in the archive are included (with the possible exception of the assembler and linker) and unchanged. The compiler is slow, and it can't handle a couple of things, but all in all it's worth the price. To summarize:

The bad:

    The compiler is awfully slow.
    It doesn't allow range types.
    It doesn't support the 'with' statement or sets.
    Multiplication and division are done the easy way, which results in an odd mixture of 16 and 32 bit math. This will be fixed before the next release.
    The code is not optimized at all. It is, therefore, slow, fat and generally silly looking.
    Programs produced by PCQ can be run only from the CLI. This will be fixed fairly soon.
    The compiler gets knocked for a loop by most errors.

The good:

    It works, for the most part.
    The compiler supports include files.
    It allows for external references, although you have to do the checking (this isn't Modula-2, after all).
    It supports records, enumerated types, pointers, arrays, and strings.
    Type conversion as found in Modula-2 is supported. In other words, something like "integer('d')" is legal.
    You can have as many const, var, type, procedure and function blocks as you want, in any order.
    It's free.

Table of Contents

This manual is intended to be read with a file reader or text editor, so this table of contents is based on line numbers rather than page numbers.

    Section                                  Line number

    How To Use PCQ ........................   89
    An Examination of Its Ills ............  179
    Predefined Stuff ......................  276
        Constants .........................  303
        Types .............................  340
        Variables .........................  380
        Functions .........................  396
        Procedures ........................  436
        Extra Statements ..................  479
        The extra libraries ...............  516
    Reserved Words ........................  529
    Floating Point Math ...................  555
    The Limits of PCQ .....................  601
    Strings ...............................  626
    Compiler Directives ...................  676
    Type Conversions ......................  718
    External References ...................  761
    Input/Output ..........................  842
    Errors ................................  996
    Run Time Errors ....................... 1022
    Sources ............................... 1039
    Notes to Assembly Programmers ......... 1083
    Improvements On The Burner ............ 1098
    Other Notes, Copyright & My Address ... 1123

How To Use PCQ

There are several files in this archive you will need to copy over to your work disk. The compiler (Pascal) is one, of course, as well as the run time library (called PCQ.lib; there's a readme file in the archive that explains all the file names, by the way). If you do not have the assembler (A68k) and linker (Blink), you'll have to copy them as well (they might not have been included in this archive, but should be available on a local bulletin board or on Fred Fish disks). These files are necessary for even the simplest compilations.

The files with the suffix .p are example Pascal programs, which you can copy over if you want. I spent a lot more time working on the compiler than on these examples, but a couple of them are interesting if you haven't seen programs like them before. They demonstrate just about every aspect of the compiler that I could think of, so you should probably take a look at them and then get rid of them.

The files that end with .i are include files for a few of the system libraries.
They define the records, types, constants, procedures, functions and variables needed to access the system. These you probably should keep around. There is also an include file for a few string routines. The code related to all these routines is in the run time library.

In order to compile a program, first write one. Or use one of the example programs. Then type:

    1> Pascal prog.p prog.asm

'Pascal' is, of course, the name of the compiler. You can change it if you want. 'Prog.p' is the Pascal source file, which can also be called whatever you want. The last word is the name of the assembly file produced. At the moment these are the only command line arguments allowed.

By the way, the example programs assume that the include files are in a directory called "Include", which is actually a subdirectory of the current directory (in other words, the programs will try to include "Include/exec.i" instead of just "exec.i"). If this conflicts with your setup, just edit the include statements at the start of the file.

Assuming the compilation completes without any errors, you then type:

    1> A68k prog.asm prog.o

This invokes the assembler to produce object code. If the archive included A68k it probably also included the documentation for it, so read that for information about the assembler. If the assembler was not included, get and use A68k by Charlie Gibbs, version 1.2. A68k does lots of small scale optimization that the code from PCQ might very well depend upon, so I don't claim that the compiler works with any other assembler.

Finally, you want to link the program, so you type:

    1> Blink prog.o small.lib to prog library PCQ.lib

This will produce a finished executable program called 'prog'. All of the Pascal run time routines, Amiga system routines, and my tiny little string library are contained in PCQ.lib. If any of the routine names clash with ones you are working with, just be sure to put your library or object file in front of PCQ.lib on the Blink command line.
If Blink was included in the archive its documentation probably was as well, so read that to answer any questions you may have about the link process. I use Blink version 6.7, and again I assume that PCQ won't work with any other linker or version.

Small.lib is a library of addresses written by Matt Dillon. Because of an apparent bug in Blink, it has to be included in the object files, rather than the libraries where it belongs. It won't increase the size of your executable files, though.

Instead of all this business you could just use the 'make' script that's included in the archive. You may have to change it around a bit so that it looks in the proper directories and whatnot, then through the magic of AmigaDOS 1.3 you should make it a script file. Then you can invoke it like:

    1> make prog

It will take the file 'prog.p' and produce the finished file 'prog'. If your program has separately compiled units, you'll need to modify the batch file or write another. I recommend writing a script file for any program you'll need to compile a few times.

If none of this makes any sense, write or call me and I'll try to give you more coherent instructions.

An Examination of Its Ills

I might as well get this over with right away. As was mentioned earlier, sets and the 'with' statement do not work at all. Another thing that's not accepted is syntax like:

    type
        smallnumber = 1..20;

PCQ doesn't do any overflow checking of this sort during runtime, so this wouldn't mean much at the moment anyway. The exception to this is in the declaration of arrays, where this syntax is accepted and in fact there is some range checking available. Read on for details.

Something else that won't work is this:

    type
        WindowPtr = ^Window;
        Window = record
            NextWindow : WindowPtr;
            ...

It will fail on the first line with an 'Unknown ID' error. Instead, use something like:

    type
        Window = record
            NextWindow : ^Window;
            ....
        end;
        WindowPtr = ^Window;

This is something I should get around to fixing, but it isn't strictly necessary, so there you go....

Also note that PCQ does not require, and in fact cannot accept, file variables in the Program statement at the beginning of a program. In other words something like...

    Program Tester(input, output);

...is not allowed. Just leave out everything in the parentheses, then leave out the parentheses. PCQ assumes, at this point, that you will need both Input and Output.

Another feature of standard Pascal that I'm not too hot on including is the "goto" statement. Although both "label" and "goto" are reserved words, I have not yet made them part of the language.

Literal real numbers (like "10.0") are not yet allowed in PCQ. In the next version real numbers will be more fully supported, but for now you must use the techniques described in the section "Floating Point Math".

The compiler will not yet allow variant records. This I will fix pretty soon, since the next version of the compiler will probably require them.

In character array constants, PCQ does not accept the two single quotes in a row that are supposed to signify one quote within the array. Instead it offers you a different type that I'll get to in a moment.

The compiler is written in PCQ Pascal, of course, and therefore exhibits some of its problems. One of these is that, although integers are 32 bits long, the compiler will misunderstand any literal integer in the text of your program that is greater than about 100,000 (actually it's much more than this, but I figure this is easier to remember). At this point it cannot properly read or write them either. This will be fixed when I add full 32 bit math support, but the temporary fix is to use hexadecimal numbers. With these you can specify any 32 bit number, using the normal syntax of a dollar sign followed by digits 0..9, a..f or A..F. If the compiler has to write a large number, it will use hexadecimal in order to avoid these errors.
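A minimal sketch of the hexadecimal workaround described above. It has not been tested with PCQ 1.0 itself, and the names BigNum, Million and n are made up for illustration; only the dollar-sign syntax and MaxShort come from this manual.

```pascal
program BigNum;

const
    { One million ($F4240). Written in hexadecimal because this
      version of the compiler may misread large decimal literals. }
    Million = $F4240;

var
    n : integer;

begin
    n := Million;
    { Compare rather than print, since large values cannot be
      written reliably yet. }
    if n > MaxShort then
        writeln('n is larger than MaxShort')
    else
        writeln('n fits in a short')
end.
```

The same approach should apply anywhere a large literal is needed, such as a bit mask or an address constant.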
Finally we get to nested procedures. Because of two problems, I had long ago decided to leave them out. Although I wrote most of the compiler with that in mind, it turns out that they almost work, so I guess I'll have to address them.

The first problem is with their names. Using nested procedures in Pascal it is possible to have two procedures with the same name, but under different scopes. This is fine as far as the compiler is concerned, but the assembler that takes over has only one scope. Thus I should have made sure that the compiler produced unique names for each procedure and function. This would have been easy enough to take care of, but like I said I didn't even consider the possibility. Next time, definitely.

The other problem is more complex than I really want to get into, but it boils down to this: from a nested procedure, you cannot access the local variables of parent procedures. You can access the procedure's own local variables, its parameters, and the variables global to the program. Again the compiler will not complain (it is, after all, legal Pascal), but the program won't run right. I know now how I'm going to take care of this, and it's fairly simple, but fixing it means a whole new round of testing so this release goes without it.

Predefined Stuff

I've arranged the predefined identifiers as they are supposed to appear in Pascal. In PCQ, however, you can have these blocks in any order, and you can have more than one of each. In other words, your program could look like:

    Program name;
    var
        variable declarations
    type
        types
    var
        more variables
    procedure
        a procedure
    var
        still more variables....

And so on. An identifier must still be declared before it is used, of course. I allowed this because it is a real pain to arrange a bunch of different include files (each of the system include files would have had to be split into four sections: the constants, the types, the variables, and the procedures and functions).
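For a concrete version of the outline above, here is a small sketch with the declaration blocks interleaved. All the names (Mixed, Point, Reset and so on) are invented for the example, and it has not been tested with PCQ 1.0.

```pascal
program Mixed;

var
    count : integer;

type
    Point = record
        x : short;
        y : short;
    end;

var
    origin : Point;

procedure Reset;
begin
    count := 0;
    origin.x := 0;
    origin.y := 0
end;

var
    total : integer;    { declared after Reset, so Reset cannot use it }

begin
    Reset;
    total := count + 1;
    writeln('total = ', total)
end.
```

Note that the declare-before-use rule still applies: Reset can refer to count and origin, but not to total.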
CONST

True and False are defined as -1 and 0, respectively. Nil is defined as a pointer with the constant value zero, but is not a reserved word as it is in standard Pascal.

Most places the compiler requires a constant, it will take a constant expression (one that can be evaluated during the compile). For example, the following will work:

    const
        first = 234;
        second = first * 2;
    type
        thetype = array [first .. first + 7] of char;

Unfortunately you cannot yet use standard functions, type conversions, or other nifty things that you can do with expressions in the program body. Just the five basic math functions (+, -, *, div, mod), for now. Also note that 'first + 7' up there would be evaluated during the compile, but the same text in the body of the program would be evaluated during run time. In other words, there is no such thing as constant folding yet.

When you are using integer constants, you can separate the digits with an underscore, similar to Ada. In other words you could have:

    const
        thousand = 1_000;
        tenthousand = 1_0_0_0_0;

MaxInt is defined as $7FFFFFFF, which comes out to something over two billion. Don't try to write it. MaxShort is 32767, or $7FFF in hex.

TYPE

There are several predefined types. They include:

    Integer     4 bytes, but only 16 bits of reliable range when doing multiplication and division. This will be fixed.

    Short       2 bytes. Literals within the program text are assumed to be short values unless they are greater than 32767 or less than -32767.

    Byte        1 byte.

These three types are all numeric types, so you can use them in normal expressions without worrying about type conversions. The compiler automatically 'promotes' the small values to whatever size is required. Remember that there is currently no overflow checking.

    Char        1 byte.

    Boolean     1 byte. False is 0 and true is -1.

    String      4 bytes. Really just defined as '^char'. I will explain further in the section 'Strings'.

    Address     4 bytes. This is a pointer to no particular type.
                It is type compatible with any other pointer; in fact the constant nil is of type Address.

    Text        18 bytes. This is not the same as a 'file of char'. The standard input and output are Text files. You can read and write integers, characters, arrays of characters, and strings to Text files. You can also write Boolean values.

    Enumerated  2 bytes.

As was mentioned above, you can have arrays, pointers, records, and files based on the above types. You can also have synonym types (like 'type int = integer;'), but they don't work very consistently.

Also note that almost anywhere you need a type, you can use a full type description. Some compilers have a problem with this, and I'm not sure what Standard Pascal says about it, but then again I really don't care much.

VAR

The only standard variable included in PCQ is:

    CommandLine : array [1..128] of char;

As its name would indicate, this variable is initialized during the startup routine to whatever the CLI command line held. It is an extra copy, so you can alter it as you wish. The significant characters are terminated by a zero byte, after which it's anybody's guess as to what it contains. After you have used the information from this array, or if you didn't need it in the first place, feel free to use the array for whatever you might need. It's going to be there regardless (sorry about that), so you might as well get some use out of it.

FUNCTION

The standard functions that do not concern real numbers are provided. They include:

    function ord(x : any ordinal type) : integer;
        returns the ordinal position of the argument.

    function chr(x : numeric type) : char;
        returns the indicated character.

    function abs(x : numeric type) : the same type;
        returns the absolute value.
    function succ(x : ordinal type) : the same type;
        returns x + 1, of the same type.

    function pred(x : ordinal type) : the same type;
        returns x - 1, in that type.

    function odd(x : numeric type) : boolean;
        returns true if the number is odd.

    function eof(x : any file) : boolean;
        returns true if you are at the end of an input file.

In addition to these standard functions, there is another built-in function for this compiler, in hopes of making it somewhat useful. It is:

    function adr(var x : any variable) : Address;
        returns the address of the variable in question.

All the routines up to this point are handled in line.

The other two standard functions are for opening files. They will be more fully explained when I get around to writing about input/output.

There is also a syntax like 'typename(expression)' supported by the language which looks like a function. This will be explained in a later section called Type Conversions.

PROCEDURE

The standard procedures are write, writeln, read, readln, get, new, dispose, exit, and trap. The first five will be covered in the IO section. The other four are:

    procedure new(var x : pointer variable);

This allocates public memory the size of whatever type is pointed to, then puts the address into x. PCQ allocates memory using Intuition's AllocRemember() routine, so that at the end of execution all the memory allocated through new() is returned to the system. This means that you don't absolutely have to call dispose() for every new(), although you should. By the way, if the allocation fails, the program aborts (Sorry about that. I'll change it eventually).

    procedure dispose(var x : pointer variable);

This returns the allocated memory to the system. If something got confused, and you try to dispose of memory you never allocated, this will just return. Unfortunately that means you may never diagnose a problem in your program, but at least it won't be calling the Guru all the time.
    procedure exit(error : integer);

Exit() aborts a program early. It is the acceptable method of escaping a program. It does the same stuff that the program normally does when it quits, then returns the error number you give it to AmigaDOS. This routine will free all the memory and close the open files. By the way, the error number should be zero if the program terminated correctly, 5 for a warning, 10 for an error, and 20 for a catastrophic error.

    procedure trap(num : integer);

The argument for this procedure must be a constant expression, although the type doesn't matter. All it does is insert a 68000 trap instruction into the code at the point of the statement. This is useful for the debugger I use, and for nothing else I can imagine. It effectively inserts a break point in the program.

Extra Statements

First of all, PCQ supports if, while, repeat, for and case statements. It does not yet support 'with' statements, but it will soon enough. The if, while and repeat statements work pretty much like they should.

The case statement is a bit weak. First of all, the individual cases must be constants. Unfortunately there can currently only be single cases; in normal Pascal you can list several cases separated by commas and use ranges. Soon you'll be able to do both, but not yet. The syntax for the case statement looks informally like:

    case <ordinal expression> of
        <constant expression> : <statement>;
        <constant expression> : <statement>;
        ...
    end;

...where each <constant expression> is of the same type as the <ordinal expression>.

The for statement supports 'downto', which changes the increment from 1 to -1. It also supports 'by', which allows you to set the increment. The argument for the 'by' part can be any ordinary expression, but for any negative increment you must use 'downto' rather than 'to', or the loop will only run once. By the way, for loops always run at least one time.
Anyway the syntax looks something like:

    for <variable> := <expression> to|downto <expression>
        [by <expression>] do
        <statement>;

The other statement included is 'return', which simply aborts a PROCEDURE early. You can abort a FUNCTION early by assigning some value to the function name, so 'return' works only in procedures.

The extra libraries

There should be some extra libraries included in the archive (the code for the libraries is in PCQ.lib, but there should be include files describing them). Most of these libraries are interfaces to the system, and all of them are individually documented in their .i files. Note that to use Intuition, Exec, AmigaDOS or basic floating point math functions you will NOT need to open the associated libraries. All these libraries are opened during the start sequence, and they are in fact required by all PCQ programs.

Reserved Words

The reserved words of PCQ are as follows:

    and         for         procedure
    array       forward     program
    begin       function    record
    by          goto        repeat
    case        if          return
    const       in          set
    div         label       then
    do          mod         to
    downto      not         type
    else        of          until
    end         or          var
    external    packed      while
    file        private     with

As you can see, even the unimplemented stuff is reserved. The only one that is not explained somewhere in this document is "private", which is one of the things that will help make external references and modularity more flexible in version 1.1.

Floating Point Math

First of all, real numbers are not fully integrated into the language. They can be used, but you have to do some extra work. Real math is based on the MathFFP.library, which is one of the libraries that is in memory. The main reason I haven't fully included real numbers, by the way, is because I am looking for some feedback concerning the awkwardness of this approach.

The way you carry out floating point math is to make calls to the library. At the top of your program you must include "Math.i", which will declare all the functions from mathffp.library. You will not have to open the library, however.
In any case, in order to do "f1 := f1 + f2", you use:

    f1 := spadd(f1, f2);

Read "Math.i" for a list of the functions. In the example programs there is a file called RealIO.p which has routines to read and write real values to and from files and standard IO.

Incidentally, the way to specify a literal real value (since you can't write something like "10.0") is to use spfloat(). For example to specify 4.546, you would write:

    spdiv(spfloat(4546),spfloat(1000))

This is slow, and involves no less than three calls to the real numbers library, but that is so far the only way to do it.

The next version of the compiler will have fully integrated real numbers. In other words you'll be able to specify literal values, do simple math, and carry out IO on them.

Functions like sin(), cos(), and sqrt() are not handled by mathffp.library. They are located in mathtrans.library, which is disk based. Thus whenever you write a program that needs these functions, the system disk will have to be inserted in order to get to LIBS:. Read MathTrans.i for further information about all this.

By the way, I have no plans to implement these functions in any other fashion. In the foreseeable future you'll need MathTrans.library every time you need trigonometric or exponential functions.

The Limits of PCQ

The compiler can accept lines of any length, although it will display at most the previous 128 characters read in if an error occurs. As far as the size of the file is concerned, it can be any length (the only part of the file that is in memory at any time is the current character), with, of course, a few caveats. The first is that, since this version of the compiler still uses a big array to hold identifiers, there is a limit to the total number you can have. Don't worry about that though: all of the include files combined only take up about half the room. This will be fixed in the next version.
There are other fixed limits in the compiler, but I never got anywhere near them in compiling the compiler, so I can't imagine you'll hit them.

The other limit is that, since the compiler produces lots of assembly code output, there must be room on the disk for the whole file. The assembly output is, as a rule of thumb, as much as five times as large as the Pascal source.

One dubious advantage of using mostly fixed amounts of memory is that I can tell you that the compiler takes up just under 150k in memory, so with its stack and the rest of incidental memory it should require about 160 or 170k to run.

Strings

As was mentioned above, strings should be thought of like '^char'. They are defined that way, but also are given special properties. They can be dynamically created, sized, and disposed of.

A string is supposed to be terminated by a zero byte, so if you write any string handling routines be sure you follow that convention. Otherwise you'll confuse all the other string routines.

In the text of a program, you delimit strings with double quotes, instead of the single quotes found around normal arrays of char. Thus:

    "A string"

is indeed a string, while

    'not one '

is considered an array [1..8] of char. The other interesting thing about strings is that they can have C-like escape sequences. What happens is that you type a backslash (looks like this: \), and the very next character is specially handled. C has a bunch of these things, but I have, so far, included only the ones I use, which are:

    \n    which stands for a line feed, chr(10)
    \t    which stands for a tab, chr(9)

Everything else passes through unchanged, so that you can also use this mechanism to include double quotes in your strings. And you have to use it to include backslashes.
What this all boils down to is that the string

    "A\tboy\nand\\his \"dog.\""

prints out like:

    |A       boy
    |and\his "dog."

There is something called StringLib.i in this archive that declares a few string handling routines - the ones I needed for the compiler. Read that file for more information. And if you get confused about strings, just remember that they're pretty much like C strings, and can be used in most of the same situations.

Remember that if you declare a string you don't get any space for the characters. All you get is space to hold the address of where the characters are, so you have to call AllocString() in StringLib or something like it to get some room to work. If you are a BASIC programmer you might run into some difficulty on this subject, and I would suggest reading up on C strings in hopes that whatever you read can explain the situation better than I.

By the way, note that 'stringvar^' is valid, and is of type 'char'.

Compiler Directives

Eventually there will be billions of compiler directives, but for now there are just three. Compiler directives work like this: if the first character in a comment is the dollar sign ($), the compiler looks to the next character for a command. No spaces are allowed between the bracket, dollar sign, and command character. The compiler directives are:

    {$I "fname"}

This will insert the file "fname" into the stream at this point. When it has finished, it will end the comment (no more directives allowed in this comment) and continue on. There can be any amount of white space in front of the filename and anything you want, such as the rest of a comment, after it. The filename is a string, so it must be in quotes. Several of the example programs should demonstrate the include syntax.

    {$A
        Instructions
    }

This directive inserts assembly instructions into the assembly file produced by the compiler. Look at the assembly language produced by the compiler to figure out how to reference variables and subroutines.
This directive simply passes everything from after the A until, but not including, the closing bracket. You should therefore include comments in assembly fashion.

    {$R+} or {$R-}

The '+' directive instructs the compiler to produce range-checking code for arrays. From this point until the compiler reaches a {$R-} directive, each array access will check that the index value is within the bounds of the array. This expands and slows the code, so I recommend only doing this during testing. If the index is out of bounds, the program will abort with an error code (look at the section "Run Time Errors" for more information).

Type Conversions

If you have used Modula-2, you can skip this section. In writing the compiler I found the need to cheat a bit on type checking, so I decided to use Modula-2's syntax for changing the type of an expression. What you do is use the name of the type as if it were a function. The expression in the parentheses is evaluated, and the result is considered to be of the type named. It goes like this:

    IntegerVariable := integer(any ordinal expression);
    CharVar := char(456 - 450);
    if boolean('r') then ....

This works not only for the included standard types, but also for any type you create. Thus this is also legal:

    type
        charptr = ^char;
    var
        charvar : charptr;
    ....
        charvar := charptr(0);
        charvar := charptr(integer(charvar) + 1);

Note that the type must be named in order for this to work. Something like...

    variable := array [1..4] of char(expression())

...will not work. This then is the only case where a type is possible, but you can't use a complete type definition. I'm pretty sure you can in all other cases, but what do I know....

Note further that not all type conversions are valid. Converting a type to one of a different size is often a bad idea, as is converting a structured type (array or record) to a simple type. I should probably warn you against the indiscriminate use of these, but what the heck. Have a ball.
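The conversions described in this section can be tried out with a short program along these lines. This is a sketch only and has not been tested with PCQ 1.0; the program, type and variable names are invented, and only the typename(expression) form itself comes from the text above.

```pascal
program Convert;

type
    CharPtr = ^char;

var
    code   : integer;
    letter : char;
    p      : CharPtr;

begin
    { Treat a character as a number: 'd' is 100 in ASCII. }
    code := integer('d');

    { Treat a number as a character: 101 is 'e' in ASCII. }
    letter := char(code + 1);
    writeln(code, ' ', letter);

    { User-defined types work the same way, as long as they are named. }
    p := CharPtr(0);
    if integer(p) = 0 then
        writeln('p holds the zero address')
end.
```

Where a plain ordinal conversion is all that is needed, ord() and chr() do the same job and read more like standard Pascal.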
External References

First a little background. The source code for this compiler is, in total, about 70k. The assembly listings produced by the compiler generally expand the Pascal source about five times, so you can see that if I decided to write the compiler as one big program, it would be way too unwieldy. What I needed was a facility for separate compilation.

What I came up with was this: if you have a previously compiled procedure somewhere that you want to call from the Pascal program, just make it a forward declaration somewhere before you use it. If the compiler gets to the end of your program and has not yet run across the full definition of a forward declared procedure, it assumes it's an external reference and makes the appropriate statements in the assembly file. So it looks like this:

    procedure DrawMap;
        forward;

As long as you don't have something else defined as DrawMap, the compiler will produce an external reference to _DrawMap (note the underscore prepended to the name).

Now for something somewhat less kosher. I needed some syntax to allow the external routines to access the same global variables as the main file. What I came up with is a different file format. Whereas the normal Pascal file looks like:

    program Name;
    declarations
    procedures and functions
    begin
        main program
    end.

The external file looks like this:

    external;
    declarations (like normal)
    procedure and functions (like normal)

There are three things to note. The first is that there is no main program, the second that there is no special ending syntax. It is just a bunch of procedures and functions in a row until the end of the file. The other thing is that any variables declared at the global, or outermost, level are considered external references. In the source for the compiler there is a file that has just the global variable declarations. This file is included by all ten of the source files, but only the main file produces storage space for them.
The other nine just produce external references.

I guess this is a good time to discuss a couple of issues related to using an assembler with the separate compilation deal. First, note that all procedure, function and variable names are offered as external references by the module in which they are defined. If an outside routine wants to use any of these values, it should be looking for something starting with an underscore and spelled the same as the first time the word is encountered in the program. Pascal is case insensitive, of course, but I can't help the assembler and linker.

Also remember that there is no type checking across files (again, get Modula-2 if you want that sort of stuff). This means that a procedure that expects a string might be sent a Boolean value, which would probably conjure the Guru.

The other thing to note is that this compiler pushes procedure and function arguments on the stack from left to right. Most C compilers (including Lattice and PDC) do it the opposite way. Draco also does it left to right. This doesn't mean that you can't use code and libraries from them - it simply means that you should reverse the order of the arguments.

Just two more notes on this subject: first, the compiler considers registers d0, d1, d2, a0, and a1 fair game, and will destroy them at will. d2 might be a problem, but the others shouldn't. For further information, just look at the assembly code produced. The second note is just a reminder to anyone who might want to link Pascal programs to other languages: remember what 'var' does before a variable, and be sure to use it correctly.

Input/Output

There are several routines for handling IO in PCQ. Before I get to them, however, let me discuss what happens when you open a file.
The actual file variable you declare in the program, as in:

    var
        filevar : file of integer;

is actually something like a record, which would look like this:

    file = record
        FileHandle : a DOS file handle
        Buffer     : a pointer to the input buffer
        Size       : the size of the elements of the file
        EOF        : a Boolean value
        IN/OUT     : (input, output)
        NextFile   : a pointer to the next file record
    end;

Now you can't actually access these fields, but nonetheless 18 bytes of memory are reserved. When you open a file, all of the fields are initialized as necessary, and the first element is read into the buffer. The buffer is accessed by the filevar^ syntax.

Also note that if the size of the elements of the file is greater than 4, or if it's 3 (don't ask), the program will allocate memory for a buffer. This will be pointed to by the variable Buffer in this record. If the size is 1, 2 or 4 (as is the case with chars, shorts and integers, respectively), the program will instead use the variable Buffer as the buffer, thus saving a little memory and time. Filevar^ will always properly access the buffer, whatever it is.

If at the end of execution there remain some open files, the shut-down code will close them for you. This is only true for files opened through Pascal, using one of the open() routines explained below. Anything you open directly through AmigaDOS is your own responsibility.

The routines that handle file IO are these:

    function open(filename : string;
                  filevar  : file of something, or Text) : boolean;

        This opens a file for writing. If the file was there before, this routine will erase it. If everything worked OK, it will return true. If not, of course, it's false.

    function reopen(filename : string;
                    filevar  : file of something, or Text) : boolean;

        This is analogous to open() except it opens an existing file for reading.

The rest of the routines are the same as most Pascals.
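To make the buffer behaviour concrete, here is a minimal sketch that adds up the contents of an existing file of integers. It assumes the file was written earlier as a "file of integer", and that string constants are written in double quotes. The file name and identifiers are hypothetical. It relies on reopen() having already read the first element into the buffer, and uses the standard eof() and get() routines.

```pascal
program Total;

var
    infile : file of integer;
    sum    : integer;

begin
    sum := 0;
    { reopen() opens an existing file for reading and returns
      false if that fails. The file name is hypothetical. }
    if reopen("ram:numbers", infile) then begin
        while not eof(infile) do begin
            sum := sum + infile^;   { the buffer holds the current element }
            get(infile);            { load the next element into the buffer }
        end;
        writeln('Total: ', sum);
        { infile is left open here; the shut-down code closes files
          that were opened through the Pascal routines }
    end else
        writeln('Could not open the file.');
end.
```

The two statements inside the loop could equally be written as a single read(infile, x) into an integer variable x, since read() is defined in terms of the same buffer access followed by get().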
Just for the sake of completeness, however, they are:

    write()     Write the stuff to a file or to standard out.

    writeln()   Do the same as write, then output a line feed. This only makes sense for Text files.

    read()      Read some stuff from a file or standard in. read(filevar, x) mimics...

                    x := filevar^;
                    get(filevar);

                ...just like most Pascals. In this case, it mimics it very closely.

    readln()    Do read, then keep reading until you hit a line feed. This too only makes sense for Text files.

    get()       Reads the next file element from the file into the buffer.

If the first argument of a read or write is a file variable, the input or output is from a file rather than the console or whatever. That, of course, is normal Pascal, and looks like:

    writeln(outfile, 'The result is ', 56 div 4);

Field widths are supported, but each must be a constant expression. What this means is that something like...

    writeln((67 * 32) + 5:10);

... will print the result right justified in a field of ten characters, with spaces padding out the area to the left. If you specify a field width lower than the width of the number, the number is printed in as few characters as possible. Valid values for the field width are greater than or equal to one and less than MaxShort. You can specify a field width for any type in a write statement, although only when writing to a text file.

Just for the sake of precision, I'll go over the delimiters for IO on Text files with various types:

    Write Char            Writes one character.

    Write Boolean         Writes TRUE or FALSE, with no extra spaces.

    Write Integer         Writes the number with no extra spaces, but possibly a negative sign.

    Write Array of Char   Writes the entire array, from first element to last.

    Write String          Writes from the first character up to but not including the zero byte.

    Writeln               Writes a single EOLN (chr(10)) to the file.

    Read Char             Reads the next char.

    Read Boolean          Can't do it.
    Read Integer          This eats spaces and tabs until it meets up with something else, then eats digits until it comes upon a non-digit. It does not eat that last non-digit. If the routine runs across an EOLN before it gets to the first digit, it returns zero. If it finds letters before it finds digits, it returns zero also.

    Read Array of Char    Reads characters into the array until either the array is full or the routine finds an EOLN. If it finds an EOLN it will not eat it, so you'll have to do that with a readln if you want. If it returns because of an EOLN it will also pad the rest of the array with spaces.

    Read String           Reads characters until it gets an EOLN. The EOLN is left in the input stream, and a zero is put in its place in the string. Note that this routine does not check for length, so you must be sure that your string can handle the longest line it might encounter.

    Readln                Reads characters up to and including the next EOLN.

Also remember eof(filevar), from the functions, and note that there is no put() analogous to the get() routine. For examples of all of these, look at the example programs. Also note that the filevar^ sort of syntax is supported. Look at a Pascal text to understand it (I don't think Turbo Pascal uses this, so it might be Greek to a lot of Pascal programmers).

Errors

As I mentioned somewhere above, most errors will completely confuse the poor compiler, which will then start spewing out errors that don't really exist. It can get by a couple of errors. For example, if you leave out a semicolon somewhere, you should get an error message but everything else should compile. Very few other errors will work that well.

I hope to make the compiler a bit friendlier, but in the meantime the compiler will abort the compile if it gets more than 5 errors. I put this in because the compiler will sometimes get one error, then start producing errors on every symbol, and even get hung up on a symbol. Really ugly.
If an error occurs, the compiler will write out at most the two lines leading up to the error, and highlight the part that it's currently working on. The error probably occurred either at the highlighted symbol or just before it. Also note that the highlighted symbol is always the last symbol written (when the symbol is just some punctuation, it can be difficult to see that it is highlighted). On the next line is the line number of the error and the explanation of the error. Currently I'm using text descriptions of the errors, so there are no error numbers.

Run Time Errors

A couple of things cause run time errors. The few that are handled at the moment are:

    Error   Explanation

    50      No memory for new()
    51      Divide by zero with floating point numbers.
    52      Array access out of range.

The error number is returned (through the exit() function) to AmigaDOS. If the program is running in a batch file you'll get to see the return code. I hope to have the run time system better thought out in the next version of the compiler, so these might go.

Sources

Like I said, I wrote this for the learning experience. Some of the places I went for information are:

    1. PDC, a freely distributable C compiler supported by Jeff Lydiatt. This is a very good program, and one of the best freely available compilers for the Amiga (the other really good one is Draco by Chris Gray). I learned (and used) a lot about activation frames from the listings produced by this compiler. Looking at the assembly code produced by this compiler was also my inspiration for starting to write a compiler.

    2. Pascal-S, the Pascal compiler produced out of ETH Zurich. I got some ideas about the structure of a compiler from this, but not too many.

    3. Small-C, another freely distributable C compiler. This one is not nearly as powerful as PDC, but its simplicity helped me understand a thing or two. Probably the best compiler source code that I found to learn from.
       This and PDC were the compilers I used before this compiler was able to compile itself. Many aspects of the design of PCQ come from Small-C.

    4. Brinch Hansen on Pascal Compilers, by Per Brinch Hansen. This book was of some use, which is more than I can say about the other half dozen I read while writing this. From this book I mainly learned about all the things I was doing wrong. Great.

If you like the idea of freely distributable compilers, please be sure to check out Draco from Chris Gray (on Fred Fish 76 & 77) and PDC from Jeff Lydiatt (an old version is on Fred Fish 110). Both are much better products than PCQ and even rival the commercial compilers. I'm not sure what a good source for the newer version of PDC would be - perhaps you could write to Jeff (it's certainly worth it: PDC has a full preprocessor, a 'cc' front end, very fast optimized code ... the works). The syntax of Draco, by the way, is fairly similar to Pascal.

Notes to Assembly Programmers

During the course of a program PCQ uses registers d0, d1, a0 and a1 as scratch. It also uses d2 and d3 during IO calls and d2 when comparing or assigning large data structures. a7 is, of course, the stack pointer, and I use a5 as the frame pointer. a6 is used to hold the library base during any call to the system, and a4 is reserved for future use (for accessing local variables of a parent procedure). The other registers are free, and in fact the scratch registers should be free for you to use between statements. After all, the compiler does no optimizing.

Improvements On The Burner

Version 1.1 of this compiler will definitely have:

    Full 32 bit math.
    Fully integrated floating point math.
    Properly implemented nested procedures.
    No more fixed arrays in the compiler itself.
    The ability to work with the Workbench.

As far as the various other problems go, my main concern is fixing bugs. Rather far down on the list is adding every last detail of Pascal. Way down at the bottom of the list is code optimization.
As far as gimmicks go, I'd like to integrate the compiler with CygnusEd Professional (the editor I use) through the editor's ARexx port.

Version 1.1, with the improvements listed above and possibly others, will be released during the summer of '89 at the latest. Any lettered version, e.g. version 1.0b, will be a bug fix. I hope I don't run out of letters. Increments in the tenths place will indicate added functionality. If I come out with 2.0 it will be Modula-2. 3.0 will be Ada.

Other Notes, Copyright & My Address

As I mentioned above, this documentation, the source code for the compiler, the compiler itself, the source code for the run time library, and the run time library itself, are all (ahem):

    Copyright (c) 1989 Patrick Quaid.

I will allow the package to be freely distributed, as long as all the files in the archive, with the possible exception of the assembler and linker (please include them if at all possible), are included and unchanged. Of course no one can make any real money for distributing this program. It may only be distributed on disk collections where a reasonable fee is charged for the disk itself. A reasonable fee is defined here as the greater of $10 per disk or whatever Fred Fish is currently charging. Sorry about being repetitive, but I imagine it's best to state these things clearly.

Feel free to mess around with the compiler source code. If you make any substantial improvements, I would appreciate a copy of them so that they can be incorporated into the next version if appropriate. If you make improvements that are not along the lines of standard Pascal or the path indicated above, please don't distribute your program under the name PCQ. That would only confuse things.

This is not a shareware package. Feel no guilt about using it without paying for it. The one payment I would really appreciate is if you could let me know about bugs you discover (not unimplemented features; I know about them.
I'm not trying to write the end-all greatest compiler, but I do want it to be correct). If you have an overwhelming urge to give money away, please send a donation to Charlie Gibbs, who wrote the assembler, and the Software Distillery, who wrote the linker.

Any questions, comments, or whatever can be addressed to:

    Pat Quaid
    8320 E. Redwing
    Scottsdale, AZ 85253
    (602) 948-8325

Enjoy the compiler. If you have any complaints, remember what you paid for it.