Benchmarking and Performance with Object-Oriented Programming in Fortran 90

Objective:

Large and complex computations require modern programming techniques since their organization and development can benefit from new approaches. While object-oriented programming paradigms represent one approach useful in scientific computation, good compilers are also critical to the successful development of applications on high performance computers.

A methodology for object-oriented programming in Fortran 90 has been developed [1], but many compilers have had difficulties with some aspects of this approach. To determine which compilers support this programming style well, a set of benchmarks to identify common compiler failures has been developed.

Approach:

Sixty benchmark examples were constructed, from which thirteen representative examples were selected. These 13 were the most challenging for compiler vendors to implement correctly. All of them emulate examples written in C++.

Accomplishments:

Eight vendor compilers were evaluated: IBM, Cray C90, Cray T3E, DEC, Sun, Fujitsu, SGI, and Absoft. Two of them, IBM and Fujitsu, passed all of the benchmark tests. (We have been working with the vendors that experienced the most difficulties with these benchmarks.) These benchmarks also provided a refined understanding of the capabilities of Fortran 90 for object-oriented programming. To examine how some of these techniques apply to a simulation program, a 3D Fortran 90 parallel plasma particle-in-cell (PIC) code was written and benchmarked on the Cornell IBM SP2 supercomputer.

Cornell Theory Center IBM SP2 (32 PEs, 8 Million Particles)
Time (seconds)

Language     Compiler    RS6000 Model 390 Chips   P2SC Super Chips   P2SC Optimized
Fortran 77   IBM xlf     N/A                      668.03             537.95
Fortran 90   IBM xlf90   1226.75                  622.60             488.88
C++          KAI KCC     2817.62                  1316.20            1173.31

3D Parallel Plasma PIC Experiments - CPU Times for Various Compilers
(KAI C++, IBM F90, and IBM F77 with IBM MPI on Cornell's SP2)
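
The relative performance implied by these timings can be worked out with simple arithmetic. The sketch below, written in Python purely for illustration, recomputes the ratios against the Fortran 90 timings. The dictionary layout and the function name ratio are illustrative choices and are not part of the original benchmark; only the numbers come from the table.

```python
# CPU times in seconds, copied from the table
# (Cornell Theory Center IBM SP2, 32 PEs, 8 million particles).
# None marks the entry reported as N/A.
COLUMNS = ("RS6000 Model 390 Chips", "P2SC Super Chips", "P2SC Optimized")

times = {
    "Fortran 77": dict(zip(COLUMNS, (None, 668.03, 537.95))),
    "Fortran 90": dict(zip(COLUMNS, (1226.75, 622.60, 488.88))),
    "C++": dict(zip(COLUMNS, (2817.62, 1316.20, 1173.31))),
}


def ratio(language, baseline, column):
    """Return time(language) / time(baseline) for one column, or None if either is N/A."""
    a = times[language][column]
    b = times[baseline][column]
    if a is None or b is None:
        return None
    return a / b


# Print every ratio relative to Fortran 90, column by column.
for column in COLUMNS:
    for language in ("Fortran 77", "C++"):
        r = ratio(language, "Fortran 90", column)
        label = "N/A" if r is None else f"{r:.2f}x"
        print(f"{column:>22}: {language} / Fortran 90 = {label}")
```

With these numbers, the C++ code takes roughly 2.1 to 2.4 times as long as the Fortran 90 code, and the Fortran 77 code takes about 7 to 10 percent longer than Fortran 90 on the P2SC nodes.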

The most aggressive compiler options produced the fastest timings and are represented in the table. The KAI C++ compiler with +K3 -O3 --abstract_pointer spent over 2 hours in the compilation process. The IBM F90 compiler with -O3 -qlanglvl=90std -qstrict -qalias=noaryovrlp used 5 minutes for compilation. (The KAI compiler is generally considered the most efficient C++ compiler when objects are used. This compiler generated slightly faster executables than the IBM C++ compiler.)

Applying hardware optimization switches (-qarch=pwr2 -qtune=pwr2) yielded additional performance improvements. These timings appear in the P2SC Optimized column of the table (highlighted in yellow in the original).

We have found Fortran 90 very useful for large problems on supercomputers: it is generally safer and higher performing than C++, and sometimes than Fortran 77. (Fortran 90 derived-type objects improved cache utilization over Fortran 77 for large problems. The C++ objects had the same storage organization.) Fortran 90 is less powerful than C++, since it has fewer features and those available can be restricted to enhance performance, but many of the advanced features of C++ have not been required in scientific computing. Nevertheless, advanced C++ features may be more appropriate for other problem domains [2,3].
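
One way storage organization can affect cache utilization is sketched below in Python, purely as address arithmetic. It assumes, hypothetically, that each particle carries six double-precision values, that the cache line is 128 bytes, and that the alternative layout keeps each component in its own array. None of these details is stated in this report, and the actual Fortran 77 layout used in the benchmark may differ. The sketch counts how many cache lines must be touched to read all the values of one particle.

```python
# Hypothetical parameters, chosen only for illustration (not from the report).
FIELDS = 6             # e.g. x, y, z, vx, vy, vz per particle
BYTES_PER_VALUE = 8    # double precision
LINE_BYTES = 128       # assumed cache line size
N_PARTICLES = 8_000_000


def lines_touched(offsets):
    """Count the distinct cache lines covered by a list of byte offsets."""
    return len({offset // LINE_BYTES for offset in offsets})


def grouped_offsets(i):
    """Byte offsets of particle i's values when each particle is stored contiguously."""
    base = i * FIELDS * BYTES_PER_VALUE
    return [base + f * BYTES_PER_VALUE for f in range(FIELDS)]


def separate_offsets(i):
    """Byte offsets of particle i's values when each component has its own array,
    with the arrays laid end to end in memory."""
    array_bytes = N_PARTICLES * BYTES_PER_VALUE
    return [f * array_bytes + i * BYTES_PER_VALUE for f in range(FIELDS)]


particle = 1000
print("grouped layout :", lines_touched(grouped_offsets(particle)), "cache line(s)")
print("separate arrays:", lines_touched(separate_offsets(particle)), "cache line(s)")
```

Under these assumptions, a particle stored contiguously spans at most two cache lines, while the same particle spread over six separate arrays touches six. Whether this matters in practice depends on the access pattern of the code and on the hardware.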

Significance:

Understanding the capabilities of Fortran 90 compilers is important since Fortran 2000 will have full support for object-oriented programming, relying upon the existing mechanisms present in Fortran 90. These benchmarks also encourage vendors to produce correct compilers.

Status/Future Plans:

We will continue to investigate constructs that cause performance penalties, although these are often vendor dependent. Interlanguage communication between Fortran 90 and C++ objects will also be investigated further. These experiences are currently being applied in the development of a Fortran 90-based parallel adaptive mesh refinement code on the NASA Goddard Cray T3E.

Points of Contact:

Viktor K. Decyk                          Charles D. Norton
Jet Propulsion Laboratory                Jet Propulsion Laboratory
vdecyk@olympic.jpl.nasa.gov              Charles.D.Norton@jpl.nasa.gov
(818) 393-2690                           (818) 393-3920

References:

  1. V. K. Decyk, C. D. Norton, and B. K. Szymanski. "Expressing Object-Oriented Concepts in Fortran 90". ACM Fortran Forum, vol. 16, no. 1, pp. 13-18, April 1997. (Also to appear in NASA Tech Briefs.)

  2. V. K. Decyk, C. D. Norton, and B. K. Szymanski. "How to Express C++ Concepts in Fortran 90". Submitted to J. Software-Practice and Experience, Feb 1997.

  3. C. D. Norton, V. K. Decyk, and B. K. Szymanski. "High Performance Object-Oriented Programming in Fortran 90". In CD-ROM Proc. Eighth SIAM Conference on Parallel Processing for Scientific Computing, March 14-17, 1997.

"Part of this research was conducted using the resources of the Cornell Theory Center, which receives major funding from the National Science Foundation (NSF) and New York State, with additional support from the Advanced Research Projects Agency (ARPA), the National Center for Research Resources at the National Institutes of Health (NIH), IBM Corporation, and other members of the center's Corporate Partnership Program."