Parallel Calculation of Sensitivity Derivatives for Aircraft Design Using Automatic Differentiation

Objective: This work compares two computational approaches for calculating sensitivity derivatives (SD) from gradient code obtained by means of automatic differentiation (AD).
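To make the idea concrete, here is a minimal sketch of forward-mode automatic differentiation using dual numbers, the technique underlying source-transformation tools such as ADIFOR (which generates Fortran derivative code rather than operating at run time). All names below are illustrative, not part of ADIFOR or TLNS3D.

```python
# Minimal forward-mode AD sketch using dual numbers. Each Dual carries a
# value and a derivative ("seed") through arithmetic, so evaluating a
# function also yields its exact sensitivity derivative.

class Dual:
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__

def f(x):
    # Hypothetical stand-in for a flow-solver output quantity.
    return 3.0 * x * x + 2.0 * x

# Seed der=1.0 on the design variable to obtain df/dx alongside f(x).
x = Dual(2.0, 1.0)
y = f(x)
print(y.val, y.der)  # f(2) = 16.0, f'(2) = 14.0
```

With NDV design variables, forward-mode AD propagates one seed per variable, which is why the cost (and the opportunity for coarse-grained parallelism) scales with NDV.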

Approach: The ADIFOR (AD of Fortran) tool, developed by Argonne National Laboratory and Rice University, is applied to the TLNS3D thin-layer Navier-Stokes flow solver to obtain aerodynamic SD with respect to wing geometric design variables. The number of design variables (NDV) ranges from 1 to 60. Coarse-grained parallelization (shown in Figure 1) of the TLNS3D.AD code is employed on an IBM SP/1 workstation cluster with a Fortran-M wrapper to improve speed and memory use. Results from the initial (unoptimized) parallel implementation on the SP/1 are compared with the most efficient implementation to date of the TLNS3D.AD code on a single processor of the vector Cray Y-MP.
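The coarse-grained strategy can be sketched as follows: each processor independently evaluates the sensitivity of a solver output with respect to its assigned subset of design variables. This is an illustrative Python sketch, not the actual Fortran-M wrapper; the `solver` function is a hypothetical cheap surrogate for a flow-solver output, and one-sided finite differences stand in for the ADIFOR-generated derivative code.

```python
# Coarse-grained parallelization over design variables: the NDV
# sensitivity evaluations are independent, so they are farmed out to a
# pool of worker processes (NP workers).
from multiprocessing import Pool

def solver(design):
    # Hypothetical surrogate for a flow-solver output quantity.
    return sum((i + 1) * v * v for i, v in enumerate(design))

def sensitivity(args):
    """Derivative of solver output w.r.t. design variable i (one-sided FD)."""
    design, i, h = args
    base = solver(design)
    perturbed = list(design)
    perturbed[i] += h
    return (solver(perturbed) - base) / h

if __name__ == "__main__":
    design = [1.0] * 6  # NDV = 6 design variables
    h = 1.0e-6
    with Pool(processes=3) as pool:  # NP = 3 workers
        grads = pool.map(sensitivity,
                         [(design, i, h) for i in range(len(design))])
    print(grads)  # approximately [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
```

Because each worker's task is a full (and expensive) solver evaluation, communication is negligible relative to computation, which is what makes the coarse-grained approach effective as NDV grows.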

Accomplishment: Figure 2 shows the beneficial effects of SP/1 parallelization; as expected, the time required to compute the aerodynamic SD on a 97x25x17 viscous grid decreases significantly as the number of processors (NP) increases from 1 to 15. A fair comparison between the SP/1 and Y-MP implementations involves complex trade-offs among numerous parameters, including single-processor speed, Y-MP vector performance, total available memory, the amount of SP/1 parallelization employed, and machine life-cycle cost. Generally, though, on this grid the Y-MP computes the SD about 10 times faster than the SP/1 when NDV is small. However, the Y-MP is only about 2 times faster (or less) than the SP/1 as NDV increases and parallelization can be efficiently exploited on the SP/1.

Significance: Although the compute time on the vector Cray Y-MP is shorter than that on the parallel IBM SP/1, for most of the SD cases examined the difference is only about a factor of 2 or less; SD calculations for large NDV can be performed efficiently on the SP/1 using coarse-grained parallelization. Consideration of the total elapsed job time, rather than compute time, would favor the SP/1 even more for these cases. Moreover, the total machine resources of a 128-node SP/1 can accommodate about 1000 design variables, whereas the Cray can accommodate only about 100 design variables for this grid size.

Status/Plans: Other strategies exploiting more parallelization within the TLNS3D.AD code will be studied. Fortran-M has been installed on NASA Langley Research Center computers to allow these parallelization techniques to be mapped onto networks of heterogeneous workstations. Points of Contact:





curator: Larry Picha