local P2C(1)
NAME
p2c - Pascal to C translator, version 1.20
SYNOPSIS
p2c [ options ] [ file [ module ] ]
DESCRIPTION
P2c is a tool for translating Pascal programs into C. The input consists
of a set of source files in any of the following Pascal dialects:
HP Pascal, Turbo/UCSD Pascal, DEC VAX Pascal, Oregon Software Pascal/2,
Macintosh Programmer's Workshop Pascal, Sun/Berkeley Pascal. Modula-2
syntax is also supported. Output is a set of .c and .h files that
comprise an equivalent program in any of several dialects of C. Output
code may be kept machine- and dialect-independent, or it may be targeted
to a specific machine and compiler. Most reasonable Pascal programs are
converted into fully functional C which will compile and run with no
further modifications, although p2c sometimes chooses to generate readable
code at the expense of absolute generality. P2c endeavors to insert
notes and warning messages into the output code to point out areas which
may require human intervention. Output code is arranged to be readable
and efficient, and to make use of C idioms wherever possible. The main
goal of the translation is to produce C files which are pleasant and
"natural" enough to be acceptable as the new source files for a program.
In a pinch, p2c will also serve as an ad hoc Pascal compiler.
Code generated by p2c normally does not assume characters are signed or
unsigned. Also, it assumes int is the same as either short or long but
does not depend on which. However, if int is not the same as long it is
best to use a modern C compiler which supports prototypes. Generated
code does not require an ANSI-compatible compiler (unless ANSI-style
code is requested), but it does use various ANSI-standard library routines.
All generated code includes the file <p2c/p2c.h> which in turn includes
<stdio.h> and various other common resources. Also, many translated
programs will need to be linked with the run-time library, typically
-lp2c.
Given a file name, p2c reads from the specified file and outputs to a
file with a .c suffix added or substituted. For example,
p2c myfile.pas
reads from myfile.pas to produce the file myfile.c. The input file may
contain a Pascal main program or a single Pascal module (or "unit" in
Turbo and UCSD Pascal nomenclature), or it may just contain a number of
procedures and declarations. P2c is designed to work for correct input
programs. That is, it will accept partial programs but may occasionally
core dump if the input refers to undefined symbols.
If the input is a module, the translator will also produce a file
module.h containing a translation of the module's interface section.
The implementation section may be omitted in which case only the .h file
will be interesting. If the program or module has include files, these
may cause additional .c files to be generated depending on the value of
the ExpandIncludes option (see below).
If no file name is given, p2c reads Pascal from the standard input and
writes the resulting C to standard output (though a .h file may still be
produced). If a file name and module name are given, the file may
include several modules (or units). The specified module is translated;
any others are skipped. The output files will be named module.c and
module.h. P2c never translates more than one module per run.
Before starting, p2c reads the file /tmp/qq/home/p2crc for a number of
configuration parameters. (The actual path used on your system may
vary. The -i option is a handy way to examine this file.) If the P2CRC
environment variable is set, it gives the name of a file to read instead
of the system file; this file can start with Include %H/p2crc to include
the system file. Next, p2c attempts to read the file p2crc in your
directory for further configuration. If this file does not exist, p2c
looks for .p2crc instead.
OPTIONS
-o cfile
Use cfile in place of file.c or module.c as the primary output
file. A single dash (`-o -') says to write the C code to the standard
output.
-h hfile
Use hfile in place of module.h as the output file for interface
text. This only has effect if the input is an HP Pascal module or
a Turbo Pascal unit.
-s sfile
Read interface text from sfile before beginning the translation.
This file typically contains one or more modules, often with
implementation sections omitted for speed, which the program or module being
translated will use. (Typically the ImportFrom and ImportDir
parameters in p2crc are set up to allow p2c to locate interface
text without needing any -s options.) If there are several -s
options in the command, the sfiles are read from left to right.
-pn Display progress of translation in the form of a line number/file
name display. This is refreshed every n lines, 25 by default.
-c rcfile
Read local configuration commands from rcfile instead of p2crc or
.p2crc. A dash (`-c -') in place of rcfile causes no local configuration
file to be used.
-v ("Vanilla.") Do not read from the system configuration file
/tmp/qq/home/p2crc. Since some of the parameters in this file are
required, your local configuration file must include those parameters
instead. This also suppresses the file named by the P2CRC
environment variable.
-H homedir
Use homedir instead of /tmp/qq/home as the p2c home directory. The
system p2crc file will be searched for in this directory.
-Ipattern
Add pattern to the ImportDir search list of places to find modules
which are imported. The pattern should include a %s to represent
the module name, and should evaluate to a potential file name for
that module's source code. For example, ../%s.pas looks for
modulename.pas in the parent of the current directory.
-i This special option (which must be the only argument on the command
line if used) simply copies the system configuration file
/tmp/qq/home/p2crc to the standard output in its entirety. (It may
be used with -H, but -i is most useful precisely when you don't
know the location of the home directory.)
-q Quiet mode. Suppresses output of status messages during translation.
-En Abort translation after n errors. If n is omitted it defaults to
zero, which means unlimited errors are allowed. Use -E1 to make
p2c halt after the first error.
-e Echo the Pascal source into the output file, surrounded by #ifdefs.
This is the same as the CopySource parameter in the p2crc file.
-a Produce modern ANSI C. This is a convenient override for the AnsiC
parameter in the p2crc file.
-L language
Select input language name, such as VAX or TURBO. This is a convenient
override for the Language parameter.
-V Verbose mode. This causes p2c to generate an additional ".log"
file with further details of the translation, such as a list of
warnings and notes including those which are suppressed in the regular
output.
-M0 Disable memory conservation. This prevents p2c from freeing various
data structures after translating each function, in case this
new conservation feature causes unforeseen problems.
-R Regression testing mode. Formats notes and warning messages in a
way that makes it easier to run diff(1) on the output of p2c.
P2c also understands a few debugging options which may occasionally be
useful when tracking down translation problems. The -dn option sets the
"debug level" to n, a small integer which is normally zero. Debugging
output is written into the regular output file along with the C code;
the higher your n, the more "wallpaper" you get. Also, -t prints debugging
information at every Pascal token, -Bn enables line-breaker debugging,
and -Cn enables comment placement debugging.
CHOICE OF SOURCE LANGUAGE
The Language configuration parameter or -L command-line option tells p2c
which Pascal dialect to expect in the input file. Any language features
which do not overlap between dialects are supported all of the time.
The Language parameter is consulted when a syntax or usage is detected
that has different meanings in two different dialects, and also to
determine default values for various other translation parameters as
described below.
The following language words are supported by p2c. Names are
case-insensitive.
HP HP Pascal. This is the default language. All features of HP
Standard Pascal, the Pascal Workstation version, are supported
except as noted in BUGS below. Some features of MODCAL, HP's
extended Pascal, are also supported. This is a superset of ISO
standard Pascal, including conformant arrays and procedural
parameters.
HP-UX HP Pascal, HP-UX version. Almost identical to the "HP" dialect.
Turbo Turbo Pascal 5.0 for the IBM PC. Few conflicts with HP Pascal,
so the Language parameter is not often needed for Turbo. (Most
important is that the Turbo and HP dialects use 16 and 32 bit
integers, respectively.)
UCSD UCSD Pascal. Similar to Turbo in many ways.
MPW Macintosh Programmer's Workshop Pascal 2.0. Should also do a
pretty good job for Lightspeed Pascal. Object Pascal features
are not supported, nor is the fact that char variables are some-
times stored in 16 bits.
VAX VAX/VMS Pascal version 3.5. Most but not all language features
supported. This has not yet been tested on large programs.
Oregon Oregon Software Pascal/2. All features implemented.
Berk Berkeley Pascal with Sun extensions.
Modula Modula-2. Based on Wirth's Programming in Modula-2, 3rd edition.
Proper setting of the Language parameter is not optional.
Translation will be incomplete in most cases, but should be good
enough to work with. Structure of local sub-modules is essentially
ignored; like-named identifiers may be confused. Type
WORD is translated as an integer, but type ADDRESS is translated
as char * or void *; this may cause inconsistencies in the output
code.
Modula-2 modules have two parts in separate files. Suppose
these are called foo.def (definition part) and foo.mod (implementation
part) for module foo. Then a pattern like %s.def must
be included in the ImportDir list, and LibraryFile must be
changed to refer to system.m2 instead of system.imp. Give the
command
p2c foo.def
to translate the definition part into files foo.h and foo.c; the
latter will usually be empty. The command
p2c -s foo.def foo.mod
will translate the implementation part into file foo.c.
Even if all language features are supported for a dialect, some prede-
fined functions may be omitted. In these cases, the function call will
be translated literally into C with a warning. Some hand modification
may be required.
CONFIGURATION PARAMETERS
P2c is highly configurable. The defaults are suitable for most applications,
but customizing these parameters will help you get the best possible
translation. Since the output of p2c is intended to be used as
human-maintainable source code, there are many parameters for describing
the coding style and conventions you prefer. Others give hints about
your program that help p2c to generate more correct, efficient, or readable
code.
The p2crc files contain a list of parameters, one per line. The system
configuration file, which may be viewed using the -i option to p2c,
serves as an example of the proper format. Parameter names are
case-insensitive. If a parameter name occurs exactly once in the system
p2crc, this indicates that it must have a unique value and the last
value given to it by the configuration files is used. Other parameters
are written several times in a row; these are lists to which each
configuration line adds an entry.
Many p2crc options take a numeric value of 0 or 1, roughly corresponding
to "no" or "yes." Sometimes a blank value or the value "def"
corresponds to an intermediate "maybe" state. For example, the stylis-
tic option ExtraParens switches between copious or minimal parentheses
in expressions, with the default being a nice compromise intended to be
best for readers with an average knowledge of C operator precedences.
Configuration options may also be embedded in the source file in the
form of Pascal comments:
{ShortOpt=0} {AvoidName=fred}
{FuncMacro slope(x,y)=atan2(y,x)*RadDeg}
disables automatic short-circuiting of `and' and `or' expressions, adds
"fred" to the list of names to avoid using in generated C code, and
defines a special translation for the Pascal program's slope function
using the standard C atan2 function and a constant RadDeg presumably
defined in the program. Whitespace is generally not allowed in embedded
parameters. The `=' sign is required for embedded parameters, though it
is optional in p2crc files. Comments within embedded parameters are
delimited by `##'. Numeric parameters may replace `=' with `+' or `-'
to increase or decrease the parameter; list-based parameters may use `-'
to remove a name from a list rather than adding it. Also, the parameter
name by itself in comment braces means to restore the parameter's value
that was current before the last change:
{VarFiles=0 ## Pass FILE *'s params by value even if VAR}
some declarations
{VarFiles ## Back to original FILE * passing}
causes the parameter VarFiles to have the value 0 for those few declara-
tions, without affecting the parameter's value elsewhere in the file.
If an embedded parameter appears in an include file or in interface text
for a module, the effect of the assignment normally carries over to any
programs that included that file. If the parameter name is preceded by
a `*', then the assignment is automatically undone after the source file
that contains it ends:
{IncludeFrom strings=<p2c/strings.h>}
{*ExportSymbol=pascal_%s}
module strings;
will record the location of the strings module's include file for the
rest of the translation, but the assignment of ExportSymbol pertains
only to the module itself.
For the complete list of p2crc parameters, run p2c with the -i option.
Here are some additional comments on selected parameters:
ImportAll Because Turbo Pascal only allows one unit per source
file, p2c normally stops reading past the word implementation
in a file being scanned for interface text. But
HP Pascal allows several modules per file and so this
would not be safe to do. The ImportAll option lets you
override the default behavior for your Pascal dialect.
AnsiC This parameter selects which dialect of C to use. If 1,
all conventions of ANSI C such as prototypes, void *
pointers, etc. are used. If 0, only strict K&R (first
edition) C is used. The default is to use "traditional
UNIX C," which includes enum and void but not void * or
prototypes. Once again there are a number of other
parameters which may be used to control the individual
features if just setting AnsiC is not enough.
C++ At present p2c does not use much of C++ at all. The
default action is to generate code that will compile in
either language.
UseVExtern Many non-UNIX linkers prohibit variables from being
defined (not declared) by more than one source file. One
module must declare, e.g., "int foo;", and all others
must declare "extern int foo;". P2c accomplishes this by
declaring public variables "vextern" in header files, and
arranging for the macro vextern to expand to extern or to
nothing when appropriate. If you set UseVExtern=0 p2c
will instead declare variables in a simpler way that
works only on UNIX-style linkers.
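The vextern arrangement can be sketched in a single file as follows. This
is only an illustration of the idea: the guard macro DEFINE_GLOBALS and the
function bump_foo are hypothetical names, and the mechanism p2c actually
emits may differ.

```c
/* Sketch of the "vextern" idea, collapsed into one file for illustration.
 * DEFINE_GLOBALS is a hypothetical guard macro, not a p2c name. */
#define DEFINE_GLOBALS          /* only the one defining module sets this */

/* --- what a shared header might contain --- */
#ifdef DEFINE_GLOBALS
# define vextern                /* expands to nothing: "int foo;" defines */
#else
# define vextern extern         /* other modules: "extern int foo;" declares */
#endif

vextern int foo;

/* --- code in the module that uses the variable --- */
int bump_foo(void)
{
    return ++foo;
}
```

Every module includes the same header text; only the module that sets the
guard ends up defining the variable.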
UseAnyptrMacros
Certain C reserved words have meanings which may vary
from one C implementation to another. P2c uses special
capitalized names for these words; these names are
defined as macros in the file p2c.h which all translated
programs include. You can set UseAnyptrMacros=0 to disable
the use of these macros. Note that the functions of
many of these macros can also be had directly using other
parameters; for example, UseConsts allows you to specify
whether your target language recognizes the word const in
constant declarations. The default is to use the Const
macro instead, so that your code will be portable to
either kind of implementation.
Signed expands to the reserved word signed if that word
is available, otherwise it is given a null definition.
Similarly, Const expands to const if that feature is
available. The words Volatile and Register are also
defined in p2c.h, although p2c does not use them at
present. The word Char expands to char by default, but
might need to be redefined to signed char or unsigned
char in a particular implementation. This is used for
the Pascal character type; lowercase char is used when
the desired meaning is "byte," not "character."
The word Static always expands to static by default.
This is used in situations where a function or variable
is declared static to make it local to the source file;
lowercase static is used for static local variables.
Thus you can redefine Static to be null if you want to
force private names to be public for purposes of debug-
ging.
The word Void expands to void in all cases; it is used
when declaring a function with no return value. The word
Anyptr is a typedef for void * or char * as necessary; it
represents a generic pointer.
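As a rough sketch, definitions of this kind for a compiler that supports
all of the ANSI keywords might look like the following. These are assumed
definitions written for illustration, not a copy of the real p2c.h; the
variable greeting and the three functions are hypothetical.

```c
/* Assumed definitions for an ANSI-capable compiler (not the real p2c.h). */
#define Signed   signed
#define Const    const
#define Volatile volatile
#define Register register
#define Char     char      /* could be "signed char" or "unsigned char" */
#define Static   static    /* redefine as empty to make private names public */
#define Void     void
typedef void *Anyptr;      /* "char *" where void * is unavailable */

/* Code written against the capitalized names. */
Static Const Char greeting[] = "hi";

Const Char *get_greeting(Void)
{
    return greeting;
}

Void set_first(Char *s, Char c)
{
    s[0] = c;
}

Anyptr as_generic(Char *s)
{
    return s;
}
```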
UsePPMacros The p2c.h header also declares two macros for function
prototyping, PP(x) and PV(). These macros are used as
follows:
Void foo PP( (int x, int y, Char *z) );
Char *bar PV( );
If prototypes are available, these macros will expand to
Void foo (int x, int y, Char *z);
Char *bar (void);
but if only old-style declarations are supported, you
instead get
Void foo ();
Char *bar ();
By default, p2c uses these macros for all function
declarations, but function definitions are written in
old-style C. The UsePPMacros parameter can be set to 0
to disable all use of PP and PV, or it can be set to 1 to
use the macros even when defining a function. (This is
accomplished by preceding each old-style definition with
a PP-style declaration.) If you know your code will
always be compiled on systems that support prototyping,
it is prettier to set Prototypes=1 or simply AnsiC=1 to
get true function prototypes.
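One plausible way to define such macros is sketched below. The switch
HAVE_PROTOTYPES is a hypothetical name chosen for this example, and the
real p2c.h may test different symbols; plain C types are used instead of
Void and Char to keep the sketch self-contained.

```c
/* Sketch of PP and PV; HAVE_PROTOTYPES is a hypothetical switch. */
#define HAVE_PROTOTYPES 1

#if HAVE_PROTOTYPES
# define PP(x) x          /* keep the parenthesized parameter list */
# define PV()  (void)     /* explicit "no parameters" */
#else
# define PP(x) ()         /* old style: drop the parameter list */
# define PV()  ()
#endif

/* Declarations in the style described above. */
int add PP( (int x, int y) );
int answer PV( );

/* Definitions, written here with prototypes for brevity. */
int add(int x, int y)
{
    return x + y;
}

int answer(void)
{
    return add(40, 2);
}
```

The doubled parentheses in the PP call are what allow a parameter list
containing commas to be passed as a single macro argument.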
EatNotes Notes and warning messages containing any of these
strings as sub-strings are not emitted. Each type of
message includes an identifier like [145]; you can add
this identifier to the EatNotes list to suppress that
message. Another useful form is to use a variable name
or other identifier to suppress warnings about that vari-
able. The strings are a space-separated list, and thus
may not contain embedded spaces. To suppress notes
around a section of code, use, e.g., {EatNotes+[145]} and
{EatNotes-[145]}. Most notes are generated during parsing,
but to suppress those generated during output the
string may need to remain in the list far beyond the
point where it appears to be generated. Use the string
"1" or "0" to disable or enable all notes, respectively.
ExpandIncludes The default action is to expand Pascal include files in-
line. This may not be desirable if include files are
being used to simulate modules. With ExpandIncludes=0,
p2c attempts to convert include files containing only
whole procedures and global declarations into analogous C
include files. This may not always work, though; if you
get error messages, don't use this option. By combining
this option with StaticFunctions=0, then doing some
fairly minor editing on the result, you can convert a
pseudo-modular Pascal program into a truly modular col-
lection of C source files.
ElimDeadCode Some transformations that p2c does on the program may
result in unreachable or "dead" code. By default p2c
removes such code, but sometimes it removes more than it
should. If you have "if false" segments which you wish
to retain in C, you may have to set ElimDeadCode=0.
SkipIndices Normally Pascal arrays not based at zero are "shifted"
down for C, preserving the total size of the array. A
Pascal array a[2..10] is translated to a C array a[9]
with references like "a[i]" changed to "a[i-2]" every-
where. If SkipIndices is set to a value of 2 or higher,
this array would instead be translated to a[11] with the
first two elements never used. This arrangement may gen-
erate incorrect code, though, for tricky source programs.
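The two layouts for the a[2..10] example can be compared side by side in C.
The array and function names below are illustrative only; they are not
names that p2c generates.

```c
/* Pascal:  var a : array [2..10] of integer;
 * Two possible C layouts for the same array. */
enum { LOW = 2, HIGH = 10 };

static int a_shifted[HIGH - LOW + 1];   /* 9 elements; a[i] becomes a[i-2] */
static int a_skipped[HIGH + 1];         /* 11 elements; slots 0 and 1 unused */

void store_shifted(int i, int v) { a_shifted[i - LOW] = v; }
int  load_shifted(int i)         { return a_shifted[i - LOW]; }

void store_skipped(int i, int v) { a_skipped[i] = v; }
int  load_skipped(int i)         { return a_skipped[i]; }
```

The skipped layout trades two unused elements for index expressions that
match the Pascal source exactly.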
FoldConstants Pascal non-structured constants generally translate to
#define's in C. Set this to 1 to have constants instan-
tiated directly into the code. This may be turned on or
off around specific constant declarations. Set this to 0
to force p2c to make absolutely no assumptions about the
constant's value in generated code, so that you can
change the constant later in the C code without invalidating
the translation. The default is to allow p2c to
take advantage of its knowledge of a constant's value,
such as by generating code that assumes the constant is
positive.
CharConsts This governs whether single-character string literals in
Pascal const declarations should be interpreted as characters
or strings. In other words, const a='x'; will
translate to #define a 'x' if CharConsts=1 (the default),
or to #define a "x" if CharConsts=0. Note that if p2c
guesses wrong, the generated code will not be wrong, just
uglier. For example, if a is written as a character constant
but it turns out to be used as a string, p2c will
have to write char-to-string conversion code each time
the constant is used.
VarStrings In HP Pascal, a parameter of the form "var s : string"
will match a string variable of any size; a hidden size
parameter is passed which may be accessed by the Pascal
strmax function. You can prevent p2c from creating a
hidden size parameter by setting VarStrings=0. (Note that
each function uses the value of VarStrings as of the
first declaration of the function that is parsed, which
is often in the interface section of a module.)
Prototypes Control whether ANSI C function prototypes are used.
Default is according to AnsiC. This also controls
whether to include parameter names or just their types in
situations where names are optional. The FullPrototyping
parameter allows prototypes to be generated for declara-
tions but not for definitions (older versions of
Lightspeed C required this). If you use a mixture of
prototypes and old-style definitions, types like short
and float will be promoted to int and double as required
by the ANSI standard, unless PromoteArgs is used to over-
ride this. The CastArgs parameter controls whether
type-casts are used in function arguments; by default
they are used only if prototypes are not available.
StaticLinks HP Pascal and Turbo Pascal each include the concept of
procedure or function pointers, though with somewhat different
syntaxes. P2c recognizes both notational styles.
Another difference is that HP's procedure pointers can
point to nested procedures, while Turbo's can point only
to global procedures. In HP Pascal a procedure pointer
must be stored as a struct containing both a pure C func-
tion pointer and a "static link," a pointer to the parent
procedure's locals. (The static link is NULL for global
procedures.) This notation can be forced by setting
StaticLinks=1. In Turbo, the default (StaticLinks=0) is to
use plain C function pointers with no static links. A
third option (StaticLinks=2) uses structures with static
links, but assumes the links are always NULL when calling
through a pointer (if you need compatibility with the HP
format but know your procedures are global).
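A procedure pointer with a static link can be pictured as the struct below.
This is a sketch under assumptions: the type and field names are
hypothetical, and the calling convention shown (always passing the link as
a first argument) is one simple choice that may not match the code p2c
generates.

```c
#include <stddef.h>

/* Hypothetical layout; type and field names are illustrative, not p2c's. */
typedef struct {
    int  (*proc)(void *link, int x);  /* pure C function pointer */
    void  *link;                      /* parent's locals; NULL if global */
} ProcPtr;

/* Locals of a parent procedure that a nested procedure needs. */
struct parent_locals { int offset; };

/* A "nested" procedure: reaches the parent's locals through the link. */
static int add_offset(void *link, int x)
{
    struct parent_locals *p = link;
    return x + p->offset;
}

/* A global procedure: its link is NULL and is ignored. */
static int twice(void *link, int x)
{
    (void)link;
    return 2 * x;
}

/* Calling through the pointer passes the stored link along. */
int call_proc(ProcPtr f, int x)
{
    return f.proc(f.link, x);
}
```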
SmallSetConst Pascal sets are translated into one of two formats,
depending on the size of the set. If all elements have
ordinal values in the range 0..31, the set is translated
as a single integer variable using bit operations. (The
SetBits parameter may be used to change the upper limit
of 31.) The SmallSetConst parameter controls whether
these small-sets are used, and, if so, how constant sets
should be represented in C. For larger sets, an array
of long is used. The s[0] element contains the number of
succeeding array elements which are in use. Set elements
in the range 0..31 are stored in the s[1] array element,
and so on. Sets are normalized so that s[s[0]] is
nonzero for any nonempty set. The standard run-time
library includes all the necessary procedures for operat-
ing on sets.
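The two set formats can be sketched with a few helper functions. These
helpers are hypothetical and written only to illustrate the layout
described above; they are not the routines of the p2c run-time library,
and SETBITS here is a local constant assumed to be 32.

```c
#include <assert.h>

#define SETBITS 32   /* bits per word, matching the 0..31 range above */

/* Small set: a single integer, one bit per element. */
unsigned long small_add(unsigned long s, int e)
{
    assert(e >= 0 && e < SETBITS);
    return s | (1UL << e);
}

int small_has(unsigned long s, int e)
{
    assert(e >= 0 && e < SETBITS);
    return (int)((s >> e) & 1UL);
}

/* Large set: s[0] counts the words in use; elements start in s[1]. */
int large_has(const long *s, int e)
{
    long word;

    assert(e >= 0);
    word = e / SETBITS + 1;              /* which array element holds e */
    if (word > s[0])
        return 0;                        /* beyond the words in use */
    return (int)(((unsigned long)s[word] >> (e % SETBITS)) & 1UL);
}

/* The caller must supply an array with room for e / SETBITS + 2 longs. */
void large_add(long *s, int e)
{
    long word;

    assert(e >= 0);
    word = e / SETBITS + 1;
    while (s[0] < word) {                /* grow, zero-filling new words */
        s[0]++;
        s[s[0]] = 0;
    }
    s[word] = (long)((unsigned long)s[word] | (1UL << (e % SETBITS)));
}
```

Adding an element keeps the normalization property: the highest word in
use always receives a bit when the set grows, so s[s[0]] stays nonzero.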
ReturnValueName
This is one of many "naming conventions" parameters.
Most of these take the form of a printf-like string containing
a %s where the relevant information should go.
In the case of ReturnValueName, the %s refers to a function
name and the resulting string gives the name of the
variable to use to hold the function's return value.
Such a variable will be made if a function contains
assignments to its return value buried within the body,
so that return statements cannot conveniently be used.
Some parameters (ReturnValueName included) do not require
the %s to be present in the format string; for example,
the standard p2crc file stores every function's return
value in a variable called Result.
AlternateName P2c normally translates Pascal names into C names verbatim,
but occasionally this is not possible. A Pascal
name may be a C reserved word or traditional C name like
putc, or there may be several like-named things that are
hidden from each other by Pascal's scoping rules but must
be global in C. In these situations p2c uses the parameter
AlternateName1 to generate an alternative name for
the symbol. The default is to add an underscore to the
name. There is also an AlternateName2 parameter for a
second alternate name, and an AlternateName parameter for
the nth alternate name. (The value for this parameter
should include both a %s and a %d, in either order.) If
these latter parameters are not defined, p2c applies
AlternateName1 many times over.
ExportSymbol Symbols in the interface section for a Pascal module are
formatted according to the value of ExportSymbol, if any.
It is not uncommon to use modulename_%s for this symbol;
the default is %s, i.e., no special treatment for
exported symbols. If you also define the Export_Symbol
parameter, that format is used instead for exported symbols
which contain an underscore character. If %S (with
a capital "S") appears in the format string it stands for
the current module name.
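The way such a naming pattern expands can be sketched as follows. The
function expand_name is a hypothetical helper written for this example and
is not part of p2c; it handles only the %s and %S escapes described above.

```c
#include <stddef.h>
#include <string.h>

/* Expand a naming pattern: %s -> symbol name, %S -> module name.
 * Hypothetical helper for illustration; not part of p2c.
 * Returns 0 on success, -1 if the result does not fit in out[outsize]. */
int expand_name(char *out, size_t outsize, const char *fmt,
                const char *symbol, const char *module)
{
    size_t used = 0;

    if (outsize == 0)
        return -1;
    while (*fmt) {
        const char *piece = NULL;
        size_t len = 1;

        if (fmt[0] == '%' && fmt[1] == 's')
            piece = symbol;
        else if (fmt[0] == '%' && fmt[1] == 'S')
            piece = module;

        if (piece) {
            len = strlen(piece);
            fmt += 2;                   /* skip the two-character escape */
        } else {
            piece = fmt++;              /* copy one literal character */
        }
        if (used + len >= outsize)
            return -1;                  /* leave room for the terminator */
        memcpy(out + used, piece, len);
        used += len;
    }
    out[used] = '\0';
    return 0;
}
```

With the pattern pascal_%s, a symbol named concat in module strings would
become pascal_concat; with %S_%s it would become strings_concat.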
Alias If the value of this parameter contains a %s, it is a
format string applied to the names of external functions
or variables. If the value does not contain a %s, it
becomes the name of the next external symbol which is
declared (after which the parameter is cleared).
Synonym This creates a synonym for another Pascal symbol or keyword.
The format is
Synonym old-name = new-name
All occurrences of old-name in the input text are treated
as if they were new-name by the parser. If new-name is a
keyword, old-name will be an equivalent keyword. If
new-name is the name of a predefined function, old-name
will behave in the same way as that function, and so on.
If new-name is omitted, then occurrences of old-name are
entirely ignored in the input file. Synonyms allow you
to skip over a keyword in your dialect of Pascal that is
not understood by p2c, or to simulate a keyword or predefined
identifier of your dialect with a similar one that
p2c recognizes. Note that all predefined functions are
available at all times; if you have a library routine
that behaves like, e.g., Turbo Pascal's getmem procedure,
you can make your routine a synonym for getmem even if
you are not translating in Turbo mode.
NameOf This defines the name to use in C for a specific symbol.
It must appear before the symbol is declared in the Pascal
code; it is usually placed in the local p2crc file
for the project. The format is
NameOf pascal-name = C-name
By default, Pascal names map directly onto C names with
no change (except for the various kinds of formatting
outlined above). If the pascal-name is of the form
module.name or procedure.name then the command applies
only to the instance of the Pascal name that is global to
that module, or local to that procedure. Otherwise, it
applies to all usages of the name.
VarMacro This is analogous to NameOf, but specifically for use
with Pascal variables. The righthand side can be most
any C expression; all references to the variable are
expanded into that C expression. Names used in the C
expression are taken verbatim. There is also a ConstMacro
parameter for translating constants as arbitrary
expressions. Note that the variable on the lefthand side
must actually be declared in the program or in a module
that it uses. The declaration for the variable will be
omitted from the generated code unless the Pascal-name
appears in the expression: If you ask to replace i with
i+1, the variable i will still be declared but its value
will be shifted accordingly. Note that if i appears on
the lefthand side of an assignment, p2c will use algebra
to "solve" for i.
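The effect of replacing i with i+1 can be illustrated with hand-written C.
The fragment below shows what a translation of the Pascal statements
i := 5; j := i * 2; could look like under that setting; it is a sketch for
illustration, not actual p2c output, and the function name is hypothetical.

```c
/* Pascal source (illustrative):   i := 5;  j := i * 2;
 * Under a hypothetical setting  VarMacro i = i+1  every Pascal "i"
 * stands for the C expression "i+1". */
static int i, j;

void translated(void)
{
    i = 5 - 1;        /* "i+1 = 5" solved for the C variable i */
    j = (i + 1) * 2;  /* reads of Pascal i expand to i+1 */
}
```

The C variable holds a shifted value, but every use sees the value the
Pascal program expects.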
In all cases where p2c parses C expressions, all C operators
are recognized except compound assignments like
`+='. (Increment and decrement operators are allowed.)
All variable and function names are assumed to have
integer type, even if they are names that occur in the
actual program. A type-specification operator `::' has
been introduced; it has the same precedence as `.' or `->'
but the righthand side must be a Pascal type identifier
(built-in or defined by your program before
the macro definition was parsed), or an arbitrary
Pascal type expression in parentheses. The lefthand
argument is then considered to have the specified type.
This may be necessary if your macro is used in situations
where the exact type of the expression must be known
(say, as the argument to a writeln).
FieldMacro Here the lefthand side must have the form record.field,
where record is the Pascal type or variable name for a
record, and field is a field in that record. The righthand
side must be a C expression generally including the
name record. All instances of that name are replaced by
the actual record being "dotted." For example,
FieldMacro Rect.topLeft = topLeft(Rect)
translates a[i].topLeft into topLeft(a[i]), where a is an
array of Rect.
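On the C side, the result of such a FieldMacro might be used as in the
sketch below. The Point and Rect layouts and the topLeft accessor are
hypothetical stand-ins chosen for this example.

```c
/* Hypothetical types and accessor; the real layouts may differ. */
typedef struct { int v, h; } Point;
typedef struct { int top, left, bottom, right; } Rect;

/* Accessor that builds the corner from the individual fields. */
static Point topLeft(Rect r)
{
    Point p;
    p.v = r.top;
    p.h = r.left;
    return p;
}

/* Pascal:  a[i].topLeft.h
 * With  FieldMacro Rect.topLeft = topLeft(Rect)  the field reference
 * is written as a call on the record being "dotted". */
int corner_h(const Rect *a, int i)
{
    return topLeft(a[i]).h;
}
```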
FuncMacro The lefthand side must be any Pascal function or pro-
cedure name plus a parameter list. The number of parame-
ters must match the number in the function's uses and
declaration. Calls to the function are replaced by the C
expression on the righthand side. For example,
FuncMacro PtInRect(p,r) = PtInRect(p,&r)
causes the second argument of PtInRect to be passed by
reference, even though the declaration says it's not. If
the function in question is actually defined in the pro-
gram or module being translated, the FuncMacro will not
affect the definition but it will affect all calls to the
function elsewhere in the module. FuncMacros can also be
applied to predefined or never-defined functions.
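As a rough illustration of the PtInRect example above, the following C
sketch shows the shape of the call before and after the macro is
applied. The Point and Rect layouts and the body of PtInRect are
hypothetical stand-ins chosen for this sketch; they are not taken from
any real library.

```c
/* Hypothetical stand-ins for the Pascal record types. */
typedef struct { int v, h; } Point;
typedef struct { int top, left, bottom, right; } Rect;

/* Hypothetical C routine that expects the rectangle by reference. */
int PtInRect(Point p, const Rect *r)
{
    return p.v >= r->top && p.v < r->bottom &&
           p.h >= r->left && p.h < r->right;
}

/* Pascal source:       if PtInRect(pt, bounds) then ...
   Without the macro:   PtInRect(pt, bounds)    (rectangle by value)
   With the FuncMacro:  PtInRect(pt, &bounds)   (rectangle by address) */
int point_is_inside(Point pt, Rect bounds)
{
    return PtInRect(pt, &bounds);
}
```

Only the call site changes; the definition of the function is left as
it was written.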
IncludeFrom This specifies that a given module's header should be
included from a given place. The second argument may be
surrounded by " " or < > as necessary; if the second
argument is omitted, no include directive will be gen-
erated for the module.
ImportFrom This specifies that a given module's Pascal interface
text can be found in the given file. The named file
should be either the source file for the module, or a
specially prepared file with the implementation section
removed for speed. If no ImportFrom entry is found for a
module, the path defined by the ImportDir list is
searched. Each entry in the path may contain a %s, which
expands to the name of the module. The default path
looks for %s.pas and %s.text in the current directory,
then for /tmp/qq/home/%s.imp (where /tmp/qq/home is the
p2c home directory).
StructFunction This parameter is a list of functions which follow the
p2c semantics for structure-valued functions (functions
returning arrays, sets, and strings, and structs in prim-
itive C dialects). For these functions, a pointer to a
return-value area is passed to the function as a special
first parameter. The function stores the result in this
area, then returns a copy of the pointer. (The standard
C function strcpy is an example of this concept. Sprintf
also behaves this way in some dialects; it always appears
on the StructFunction list regardless of the type of
implementation.) The system configuration file includes
a list of common structured functions so that p2c's
optimizer will know how to manipulate them.
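The calling convention described for StructFunction can be sketched in
C as follows. The function name concat_str is hypothetical and exists
only for this illustration; the point is the extra first parameter and
the returned copy of it.

```c
#include <string.h>

/* Hypothetical structured function: the caller supplies the
   return-value area as a special first parameter. */
char *concat_str(char *result, const char *a, const char *b)
{
    strcpy(result, a);      /* store the result in the caller's area */
    strcat(result, b);
    return result;          /* return a copy of the same pointer */
}
```

In this sketch the result area must be large enough for both strings
and must not overlap either argument.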
StrlapFunction Functions on this list are structured functions as above,
but with the ability to work in-place; that is, the same
pointer may be passed as both the return value area and a
regular parameter.
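A minimal sketch of a function that would qualify for the
StrlapFunction list, assuming a hypothetical upcase_str routine. Each
character is read before the same position is written, so the result
area and the source string may be the same buffer.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical structured function that can work in place:
   result and s may point at the same buffer. */
char *upcase_str(char *result, const char *s)
{
    size_t i = 0;

    for (; s[i] != '\0'; i++)
        result[i] = (char)toupper((unsigned char)s[i]);
    result[i] = '\0';
    return result;
}
```

A call such as upcase_str(s, s) is therefore safe here, which is not
true of structured functions in general.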
Deterministic Functions on this list have no side effects or side
dependencies. An example is the sin function in the
standard math library; two calls with the same parameter
values produce the same result, and have no effects other
than returning a value. P2c can make use of this
knowledge when optimizing code for efficiency or reada-
bility. Functions on this list are also assumed to be
relatively fast, so that it is acceptable to duplicate a
call to the function.
LeaveAlone Functions on this list are not subjected to the normal
built-in translation rules that p2c would otherwise use.
For example, adding writeln to this list would cause
writeln statements to be translated blindly into calls to
a C writeln() function, rather than into equivalent
printf calls. The built-in translation is also
suppressed if the function has a FuncMacro.
BufferedFile P2c normally assumes binary files will use read/write,
not get/put/^ notation. A file buffer variable will only
be created for a file if buffer notation is used for it.
For global file variables this may be detected too late
(a declaration without buffers may already have been
written). Such files can be listed in BufferedFile to
force p2c to allocate buffers for them; do this if you
get a warning message that says it is necessary. Set
BufferedFile=1 to buffer all files, in which case
UnBufferedFile allows you to force certain files not to have
buffers.
StructFiles If p2c still can't translate your file operations
correctly, you can set StructFiles=1 to cause Pascal
files to translate into structs which include the usual C
FILE pointer, as well as file buffer and file name
fields. While the resulting code doesn't look as much
like native C, the file structs will allow p2c to do a
correct translation in many more cases.
CheckFileEOF Normally only file-open operations are checked for
errors. Additional error checking, such as read-past-
end-of-file, can be enabled with parameters like Check-
FileEOF. These checks can make the code very ugly! If
I/O checking is enabled by the program ($iocheck on$ in
HP Pascal; {$I+} in Turbo; this is always the default
state), these checks will generate fatal errors unless
enclosed in an HP Pascal try-recover construct. If I/O
checking is disabled, these will cause the global variable
P_ioresult to be set to zero or nonzero according to
the outcome. The default for most of these options is to
check only when I/O checking is disabled.
ISSUES
Integer size. P2c normally generates code to work with either 16 or 32
bit ints. If you know your C integers will be 16 or 32 bits, set
IntSize appropriately. In particular setting IntSize=32 will generate
much cleaner code: p2c no longer must carefully cast function arguments
between int and long. These casts also will be unnecessary if ANSI pro-
totypes are available. To disable int/long casting because you know at
least one of these cases will hold, set CastLongArgs=0. (The CastArgs
parameter similarly controls other types of casts, such as between ints
and doubles.) The Integer16 parameter controls whether Pascal integers
are interpreted as 16 or 32 bits, or translated as native C integers.
The default value depends on the Language selected.
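The int/long casts discussed above have roughly the following shape.
This is a sketch with hypothetical function names, not actual
translator output. A prototype is in scope here, so the cast is
redundant in this particular file; it matters when the callee is
declared without a prototype and int is narrower than long.

```c
/* Hypothetical callee that takes a long argument. */
long half_of(long n)
{
    return n / 2;
}

/* The explicit cast keeps the call correct even if no prototype
   for half_of were visible and int were only 16 bits wide. */
long half_of_count(int count)
{
    return half_of((long)count);
}
```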
Signed/unsigned chars. Pascal characters are normally "weakly" inter-
preted as unsigned; this is controlled by UnsignedChar. The default is
"either," so that C's native char type may be used even if its signed-
ness is unknown. Code that uses characters outside of the range 0-127
may need a different setting. Alternatively, you can use the types
{SIGNED} char and {UNSIGNED} char in the few cases where it really
matters. These comments are controlled by the SignedComment and Unsig-
nedComment parameters. (The type {UNSIGNED} integer is also recog-
nized.) The SignedChar parameter tells whether C characters are signed
or unsigned (default is "unknown"). The HasSignedChar parameter tells
whether the phrase "signed char" is legal in the output. If it is not,
p2c may have to translate Pascal signed bytes into C shorts.
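The following sketch shows why the signedness of char matters for
characters outside the range 0-127, and how a cast to unsigned char
makes the result independent of the compiler's choice. The helper
names are hypothetical.

```c
/* Ordinal value of a character, the same whether plain char is
   signed or unsigned on this compiler. */
int char_ord(char c)
{
    return (unsigned char)c;
}

/* Comparison that treats both characters as unsigned, as Pascal
   does for its char type. */
int char_less(char a, char b)
{
    return (unsigned char)a < (unsigned char)b;
}
```

Without the casts, a character with code 200 would compare as less
than 'A' on a compiler where plain char is signed.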
Special types. P2c understands the following predefined Pascal type
names: integer, signed integers depending on Integer16; longint, signed
32-bit integers; unsigned, unsigned 32-bit integers; sword, signed 16-
bit integers; word, unsigned 16-bit integers; c_int, signed native C
integers; c_uint, unsigned native C integers; sbyte, signed 8-bit
integers; byte, unsigned 8-bit integers; real, floating-point numbers
depending on DoubleReals; single, single-precision floats; longreal,
double, and extended, double-precision floats; pointer and anyptr, gen-
eric pointers (assignment-compatible with any pointer type); string,
generic string of length StringDefault (normally 255); also, the usual
Pascal types char, boolean, and text. (If your Pascal uses different
names for these concepts, the Synonym option will come in handy.)
Embedded code. It is possible to write a Pascal comment containing C
code to be embedded into the output. See the descriptions of
EmbedComment and its relatives in the system p2crc file. These techniques are
helpful if you plan to do repeated translations of code that is still
being maintained in Pascal.
Comments and blank lines. P2c collects the comments in a procedure into
a list. All comments and statements are stamped with serial numbers
which are used to reattach comments to statements even after code has
been added, removed, or rearranged during translation. "Orphan" com-
ments attached to statements that have been lost are attached to nearby
statements or emitted at the end of the procedure. Blank lines are
treated as a kind of comment, so p2c will also reproduce your usage of
blank lines. If the comment mechanism goes awry, you can disable com-
ments with EatComments or disable their being attached to code with
SpitComments.
Indentation. P2c has a number of parameters to govern indentation of
code. The default values produce the GNU Emacs standard indentation
style, although p2c can do a better job since it knows more about the
code it is indenting. Indentation works by applying "indentation del-
tas," which are either absolute numbers (which override the previous
indentation), or signed relative numbers (which augment the previous
indentation). A delta of "+0" specifies no change in indentation. All
of the indentation options are described in the standard p2crc file.
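The rule for applying an indentation delta can be sketched as below.
The function name and the use of a text argument are assumptions made
for this illustration; only the absolute-versus-relative rule comes
from the description above.

```c
#include <stdlib.h>

/* Apply one indentation delta, given as text, to the current
   indentation.  A leading sign makes the delta relative; an
   unsigned number replaces the current indentation outright. */
int apply_indent_delta(int current, const char *delta)
{
    int n = atoi(delta);            /* atoi accepts a leading sign */

    if (delta[0] == '+' || delta[0] == '-')
        return current + n;
    return n;
}
```

With a current indentation of 4, the delta "+0" leaves it at 4, "-2"
gives 2, and "8" gives 8.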
Line breaking. P2c uses an algorithm similar to the TeX typesetter's
paragraph formatter for breaking long statements into multiple lines. A
"penalty" is assigned to various undesirable aspects of all possible
line breaks; the "badness" of a set of line breaks is approximately the
sum of all the penalties. Chief among these are serious penalties for
overrunning the desired maximum line length (default 78 columns), an
infinite penalty for overrunning the absolute maximum line length
(default 90), and progressively greater penalties for breaking at opera-
tors deeply nested in expressions. Parameters such as OpBreakPenalty
control the relative weights of various choices. BreakArith and its
neighbors control whether the operator at a line break should be placed
at the end of the previous line or at the beginning of the next. If you
don't want any oversize lines, define MaxLineWidth=78.
Unlike TeX, p2c's line breaker must actually try all possible sets of
break points. To avoid excessive computation, the total penalty contri-
buted at each decision point must sum to a nonnegative value; negative
values are clipped up to zero. This allows p2c to prune away obviously
undesirable alternatives in advance. The MaxLineBreakTries parameter
(default 5000) controls how many alternatives to try before giving up
and using the best so far.
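The search described above can be sketched as a small exhaustive
search with pruning. This is a simplified model, not the translator's
own code: it assumes a statement is a row of tokens with fixed widths,
one break penalty per token, and a single per-column penalty for
running past the desired width. All names are hypothetical.

```c
#include <limits.h>

typedef struct {
    const int *width;   /* width of each token, in columns */
    const int *brk;     /* penalty for breaking before token i */
    int ntok;
    int soft_max;       /* desired maximum line width */
    int hard_max;       /* absolute maximum line width */
    int over_penalty;   /* penalty per column past soft_max */
    int max_tries;      /* budget of complete alternatives */
    int tries;
    int best;           /* lowest total penalty found so far */
} Breaker;

/* Penalty for one finished line, or -1 if the line is too long. */
static int line_penalty(const Breaker *b, int w)
{
    if (w > b->hard_max)
        return -1;                  /* "infinite" penalty */
    if (w > b->soft_max)
        return (w - b->soft_max) * b->over_penalty;
    return 0;
}

/* Place token i; w is the width of the current line so far and
   sum is the penalty accumulated on earlier lines. */
static void search(Breaker *b, int i, int w, int sum)
{
    int p, bp;

    /* Penalties never go below zero, so sum can only grow:
       anything already as bad as the best is pruned. */
    if (sum >= b->best || b->tries >= b->max_tries)
        return;
    if (i == b->ntok) {             /* a complete set of breaks */
        b->tries++;
        p = line_penalty(b, w);
        if (p >= 0 && sum + p < b->best)
            b->best = sum + p;
        return;
    }
    /* Alternative 1: keep token i on the current line. */
    if (w + b->width[i] <= b->hard_max)
        search(b, i + 1, w + b->width[i], sum);
    /* Alternative 2: break before token i (never before the first). */
    if (i > 0) {
        p = line_penalty(b, w);
        bp = b->brk[i] < 0 ? 0 : b->brk[i];    /* clip up to zero */
        if (p >= 0)
            search(b, i + 1, b->width[i], sum + p + bp);
    }
}

/* Returns the lowest total penalty found, or INT_MAX if none. */
int best_break_penalty(const int *width, const int *brk, int ntok,
                       int soft_max, int hard_max,
                       int over_penalty, int max_tries)
{
    Breaker b;

    b.width = width;  b.brk = brk;  b.ntok = ntok;
    b.soft_max = soft_max;  b.hard_max = hard_max;
    b.over_penalty = over_penalty;  b.max_tries = max_tries;
    b.tries = 0;  b.best = INT_MAX;
    search(&b, 0, 0, 0);
    return b.best;
}
```

Because every contribution is nonnegative, the running total is a
lower bound on the final badness, which is what makes the early
return safe.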
PASCAL_MAIN. P2c generates a call to this function at the front of the
main program. In the (unmodified) run-time library all this does is
save argc and argv away because in both HP and Turbo these are accessed
as global variables. If you do not wish to use this feature, define
ArgCName to be argc, ArgVName to be argv, and MainName (normally
"PASCAL_MAIN") to be blank. This will work if argc and argv are never
accessed outside of your main program.
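The idea behind this hook can be sketched as follows. The names
saved_argc, saved_argv, demo_pascal_main and the two accessors are
hypothetical and are used here only to show the mechanism; they are
not the identifiers used by the run-time library.

```c
/* Hypothetical globals standing in for the saved copies of the
   command-line arguments. */
static int    saved_argc;
static char **saved_argv;

/* Sketch of a PASCAL_MAIN-style hook: stash argc and argv so that
   code outside main can reach them as globals. */
void demo_pascal_main(int argc, char **argv)
{
    saved_argc = argc;
    saved_argv = argv;
}

/* Number of arguments after the program name. */
int demo_param_count(void)
{
    return saved_argc - 1;
}

/* The n-th argument, or an empty string if n is out of range. */
const char *demo_param_str(int n)
{
    return (n >= 0 && n < saved_argc) ? saved_argv[n] : "";
}
```

A translated main program would call the hook once, as its first
statement, before any other code runs.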
BUGS
P2c was designed with the idea that clean, readable output in most cases
is worth more than guaranteed correct output in extreme cases. P2c is
not a compiler! However, ideally the "extreme" cases would include only
those which never arise in real life. Thus if p2c actually generates
incorrect code I will consider it a bug, but I will not apologize for
it. :-) Below are the major remaining cases where this is known to
occur.
Certain kinds of conformant array parameters (including multi-
dimensional conformant arrays) produce code that declares variable-
length arrays in C. Only a few C compilers, such as the GNU C compiler,
support this language extension. Otherwise some hand re-coding will be
required.
HP Pascal try-recover structures are translated into calls to TRY and
RECOVER macros, which are defined to simulate the construct using setjmp
and longjmp. If this emulation does not work, define the symbol FAKE_TRY
to cause these macros to become "inert." (In cases where the error is
detected by code physically within the body of the try statement, a C
goto to the recover section is always generated.) Also, local file
variables in scopes which are destroyed by an escape are not closed.
Non-local GOTO's and try-recover statements are each implemented, but
may conflict if both are used at once. Non-local GOTO's are fairly
careful about closing files that go out of scope but may fail to do so
in the presence of recursion.
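The general technique of simulating try-recover with setjmp and
longjmp can be sketched as below. This is not the definition of the
TRY and RECOVER macros; the names are hypothetical, and the sketch
supports only one active try at a time, whereas a real emulation
would need a stack of jump buffers for nesting.

```c
#include <setjmp.h>

/* Hypothetical: the recover point of the single active try. */
static jmp_buf recover_point;

/* Hypothetical stand-in for an escape raised inside the try body. */
static void demo_escape(int code)
{
    longjmp(recover_point, code);
}

static int checked_divide(int a, int b)
{
    if (b == 0)
        demo_escape(1);         /* does not return */
    return a / b;
}

/* Returns the quotient, or -1 if the try body escaped. */
int divide_or_recover(int a, int b)
{
    if (setjmp(recover_point) == 0) {
        /* try part */
        return checked_divide(a, b);
    } else {
        /* recover part: reached through longjmp */
        return -1;
    }
}
```

Note that longjmp unwinds without running any cleanup code in the
abandoned scopes, which is why files local to those scopes are left
open.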
Arrays containing files are not initialized to NULL as other files are.
In some cases, such as file variables allocated by NEW, the file is ini-
tialized but not automatically closed by DISPOSE.
LINK variables allowing sub-procedures access to their parents' vari-
ables are occasionally omitted by mistake, if the access is too indirect
for p2c to notice. If this happens, you can add an explicit reference
to a parent variable in the sub-procedure. A statement of the form
"a:=a" will count as a reference but then be optimized away by p2c.
Many aspects of Modula-2 are translated only superficially. For example,
the type-compatibility properties of the WORD and ARRAY OF WORD
types are only roughly modelled, as are the scope rules concerning
modules.
Parts of VAX Pascal are still untreated. In particular, the [UNSAFE]
attribute and a few others are not fully supported, nor are the
semantics of the OPEN procedure.
Turbo and VAX Pascal's double, quadruple, and extended real types all
translate to the C double type. Turbo's computational type is not
supported at all.
Because Pascal strings (with length bytes) are translated into C strings
(with null terminators), certain Pascal string tricks will not work in
the translated code. For example, the assignment s[0]:=chr(x) is
translated to s[x]=0 on the assumption that the string is being
shortened. If x is actually greater than the current length, but not of a
recognizable form like ord(s[0])+n, then the generated code will not
work. In VAX Pascal this corresponds to performing arithmetic on the
LENGTH field of a varying-length string.
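The effect of this translation can be seen in a small C sketch. The
helper names are hypothetical. Shortening works as intended, but
storing a terminator beyond the current end of the string leaves the
earlier terminator in place, so the string does not grow.

```c
#include <string.h>

/* What the Pascal statement s[0] := chr(x) becomes: the C string
   is cut off at position x. */
void set_length(char *s, int x)
{
    s[x] = '\0';
}

/* Current length as C sees it: up to the first null terminator. */
int c_length(const char *s)
{
    return (int)strlen(s);
}
```

After set_length(s, 3) on "abcdefgh" the string is "abc"; a later
set_length(s, 6) leaves c_length(s) at 3.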
Turbo Pascal's automatic clipping of strings is not supported. In
Turbo, if a ten-character string is assigned to a string[8] variable,
the last two characters are silently removed. The code produced by p2c
generally will overrun the target string instead! The StringTruncLimit
parameter (80 by default if Language=Turbo) specifies a string size
which should be considered "short"; assignments of potentially-long
strings to short string variables will cause a warning but will not
automatically truncate. The cure is to use copy in the Pascal source to
truncate the strings explicitly.
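For comparison, this is roughly what an explicitly truncating
assignment looks like on the C side. It is a sketch with hypothetical
names, modelling a Pascal string[8] as a nine-byte C array; it is not
the code the translator emits.

```c
#include <string.h>

#define STR8_MAX 8      /* capacity of the modelled string[8] */

/* Assign src to an 8-character string, clipping any excess the
   way Turbo Pascal would.  dst must have STR8_MAX + 1 bytes. */
void assign_str8(char *dst, const char *src)
{
    strncpy(dst, src, STR8_MAX);    /* copies at most 8 characters */
    dst[STR8_MAX] = '\0';           /* strncpy does not terminate
                                       when it truncates */
}
```

A plain strcpy in the same place would write past the end of dst
whenever src is longer than eight characters.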
FILES
file.xxx                   Pascal source files
file.c                     resulting C source file
module.h                   resulting C header file
p2crc                      local configuration file
.p2crc                     alternate local configuration file
/tmp/qq/home/p2crc         system-wide configuration file
/tmp/qq/home/system.imp    declarations for predefined functions
/tmp/qq/home/system.m2     analogous declarations for Modula-2
/tmp/qq/home/*.imp         interface text for standard modules
/tmp/qq/home/p2c/p2c.h     header file for translated programs
/tmp/qq/home/libp2c.a      run-time library
AUTHOR
Dave Gillespie, daveg@csvax.cs.caltech.edu.
Many thanks to William Bader, Steven Levi, Rick Koshi, Eric Raymond,
Magne Haveraaen, Dirk Grunwald, David Barto, Paul Fisher, Tom Schneider,
and others whose suggestions and bug reports have helped improve p2c in
countless ways.