Benchmarking and Performance with Object-Oriented Programming
in Fortran 90
Objective:
Large, complex computations benefit from modern programming
techniques for organizing and developing code. Object-oriented
programming is one such approach that is useful in scientific
computation, but good compilers are also critical to the successful
development of applications on high-performance computers.
A methodology for object-oriented programming in Fortran 90 has been
developed [1], but many compilers have had
difficulties with some aspects of this approach. To determine which
compilers support this programming style well, a set of benchmarks to
identify common compiler failures has been developed.
Approach:
Sixty benchmark examples were constructed from which thirteen representative
examples were selected. These 13 were the most challenging for
compiler vendors to implement correctly. All of these emulate C++
programming language examples.
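As an illustration only, the following C++ sketch shows the kind of constructs (encapsulated data, a constructor, overloaded member functions) that a Fortran 90 emulation has to reproduce. The class `Species` and all of its members are hypothetical; the sketch is not one of the thirteen benchmarks, whose sources are not reproduced here. The comments note one common way such constructs can be expressed in Fortran 90.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical class for illustration; it is not one of the benchmarks.
// One common way to express such a class in Fortran 90 is a MODULE that
// holds a derived TYPE with PRIVATE components, plus module procedures
// that take the object as an argument.
class Species {
public:
    // Constructor: in Fortran 90 this can be an ordinary module procedure.
    Species(double charge, std::size_t count)
        : charge_(charge), weights_(count, 1.0) {
        assert(count > 0);
    }

    // Sum of the charge over all particles of this species.
    double total_charge() const {
        double sum = 0.0;
        for (double w : weights_) sum += w;
        return charge_ * sum;
    }

    // Overloaded member functions: Fortran 90 can express these with a
    // generic INTERFACE block that maps one name to several procedures.
    void scale(double factor) {
        for (double& w : weights_) w *= factor;
    }
    void scale(std::size_t i, double factor) {
        assert(i < weights_.size());
        weights_[i] *= factor;
    }

private:
    double charge_;                // charge per unit weight
    std::vector<double> weights_;  // one weight per particle
};
```

A compiler that handles the Fortran 90 counterpart correctly must resolve the generic name to the right procedure and enforce the private components, which is the sort of behavior a conformance benchmark can probe.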
Accomplishments:
Eight vendor compilers were evaluated: IBM, Cray C90, Cray T3E,
DEC, Sun, Fujitsu, SGI, and Absoft. Two of them, IBM and
Fujitsu, passed all of the benchmark tests. (We have been working with
the vendors that experienced the most difficulties with these
benchmarks.) These benchmarks also provided a refined understanding
of the capabilities of Fortran 90 for object-oriented programming. To
examine how some of these techniques apply to a simulation program, a
3D Fortran 90 parallel plasma particle-in-cell (PIC) code was written
and benchmarked on the Cornell IBM SP2 supercomputer.
Cornell Theory Center IBM SP2 (32 PEs, 8 Million Particles)

                                     Time (seconds)
Language    Compiler   RS6000 Model 390 Chips  P2SC Super Chips  P2SC Optimized
----------  ---------  ----------------------  ----------------  --------------
Fortran 77  IBM xlf    N/A                     668.03            537.95
Fortran 90  IBM xlf90  1226.75                 622.60            488.88
C++         KAI KCC    2817.62                 1316.20           1173.31

3D Parallel Plasma PIC Experiments - CPU Times for Various Compilers
(KAI C++, IBM F90, and IBM F77 with IBM MPI on Cornell's SP2)
The most aggressive compiler options produced the fastest timings and
are represented in the table. The KAI C++ compiler with +K3 -O3
--abstract_pointer spent over 2 hours in the compilation process.
The IBM F90 compiler with -O3 -qlanglvl=90std -qstrict
-qalias=noaryovrlp used 5 minutes for compilation. (The KAI
compiler is generally considered the most efficient C++ compiler when
objects are used. This compiler generated slightly faster executables
than the IBM C++ compiler.)
Applying the hardware optimization switches (-qarch=pwr2
-qtune=pwr2) gave additional performance improvements.
These timings are illustrated in yellow.
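The relative timings implied by the table can be checked with a short calculation. The C++ sketch below simply stores the "P2SC Super Chips" and "P2SC Optimized" columns and forms ratios of run times; the names `Timing`, `table_timings`, and `time_ratio` are hypothetical helpers introduced for this illustration only.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One row of the timing table (seconds, 32 PEs, 8 million particles).
// The struct and function names are illustrative, not from the PIC code.
struct Timing {
    std::string language;
    double p2sc;            // "P2SC Super Chips" column
    double p2sc_optimized;  // "P2SC Optimized" column
};

// Values copied from the table above.
inline std::vector<Timing> table_timings() {
    return {
        {"Fortran 77", 668.03, 537.95},
        {"Fortran 90", 622.60, 488.88},
        {"C++", 1316.20, 1173.31},
    };
}

// Ratio of two run times; a value above 1 means `slower` took longer.
inline double time_ratio(double slower, double faster) {
    assert(faster > 0.0);
    return slower / faster;
}
```

For example, `time_ratio(1173.31, 488.88)` is about 2.40, so in the "P2SC Optimized" column the C++ run took roughly 2.4 times as long as the Fortran 90 run, while `time_ratio(537.95, 488.88)` is about 1.10 for Fortran 77 against Fortran 90. Within Fortran 90, `time_ratio(622.60, 488.88)` is about 1.27 between the two P2SC columns.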
We have found Fortran 90 very useful for large problems on
supercomputers: it is generally safer and faster than C++, and
sometimes faster than Fortran 77. (Fortran 90 derived-type objects
improved cache utilization, for large problems, over Fortran 77. The
C++ objects had the same storage organization.) Fortran 90 is less
powerful than C++, since it has fewer features and those available can
be restricted to enhance performance, but many of the advanced
features of C++ have not been required in scientific computing.
Nevertheless, advanced C++ features may be more appropriate for other
problem domains [2,3].
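The remark about storage organization can be pictured with a small C++ sketch. It assumes, purely for illustration, that an object groups all components of one particle so that they sit next to each other in memory, and contrasts this with keeping one separate array per component. The actual layouts used in the Fortran 77, Fortran 90, and C++ PIC codes are not described here, and the names `Particle`, `ParticleArrays`, `push`, and `particle_stride` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical particle object: the six phase-space components of one
// particle are adjacent in memory, so a vector of these objects is an
// "array of structures". Field names are illustrative only.
struct Particle {
    double x, y, z;     // position
    double vx, vy, vz;  // velocity
};

// Hypothetical alternative: one array per component.
struct ParticleArrays {
    std::vector<double> x, y, z, vx, vy, vz;
    explicit ParticleArrays(std::size_t n)
        : x(n), y(n), z(n), vx(n), vy(n), vz(n) {}
};

// Advance positions by one time step dt using the object layout.
inline void push(std::vector<Particle>& p, double dt) {
    for (Particle& q : p) {
        q.x += q.vx * dt;
        q.y += q.vy * dt;
        q.z += q.vz * dt;
    }
}

// The same update using the per-component arrays.
inline void push(ParticleArrays& p, double dt) {
    assert(p.vx.size() == p.x.size());
    for (std::size_t i = 0; i < p.x.size(); ++i) {
        p.x[i] += p.vx[i] * dt;
        p.y[i] += p.vy[i] * dt;
        p.z[i] += p.vz[i] * dt;
    }
}

// Bytes separating consecutive particles in the object layout.
inline std::size_t particle_stride() { return sizeof(Particle); }
```

With the object layout, updating one particle touches a single contiguous run of `particle_stride()` bytes, whereas the per-component version reads from six separate arrays. Which arrangement is faster depends on the access pattern and the machine, so this is a way to visualize the statement rather than a measurement.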
Significance:
Understanding the capabilities of Fortran 90 compilers is important
since Fortran 2000 will have full support for object-oriented
programming, relying upon the existing mechanisms present in Fortran 90.
These benchmarks also encourage vendors to produce correct compilers.
Status/Future Plans:
We will continue to investigate constructs that cause performance
penalties, although such penalties are often vendor dependent. Interlanguage
communication between Fortran 90 and C++ objects will also be
investigated further. These experiences are currently being applied in
the development of a Fortran 90 based parallel adaptive mesh refinement
code on the NASA Goddard Cray T3E.
Points of Contact:
Viktor K. Decyk
Jet Propulsion Laboratory
vdecyk@olympic.jpl.nasa.gov
(818) 393-2690

Charles D. Norton
Jet Propulsion Laboratory
Charles.D.Norton@jpl.nasa.gov
(818) 393-3920
References:
[1] V. K. Decyk, C. D. Norton, and B. K. Szymanski. "Expressing
    Object-Oriented Concepts in Fortran 90". ACM Fortran Forum, vol. 16,
    no. 1, pp. 13-18, April 1997. (Also to appear in NASA Tech Briefs.)
[2] V. K. Decyk, C. D. Norton, and B. K. Szymanski. "How to Express C++
    Concepts in Fortran 90". Submitted to J. Software-Practice and
    Experience, Feb 1997.
[3] C. D. Norton, V. K. Decyk, and B. K. Szymanski. "High Performance
    Object-Oriented Programming in Fortran 90". In CD-ROM Proc. Eighth
    SIAM Conference on Parallel Processing for Scientific Computing,
    March 14-17, 1997.
"Part of this research was conducted using the resources of the
Cornell Theory Center, which receives major funding from the National
Science Foundation (NSF) and New York State, with additional support
from the Advanced Research Projects Agency (ARPA), the National Center for
Research Resources at the National Institutes of Health (NIH), IBM
Corporation, and other members of the center's Corporate Partnership
Program."