NAS Parallel Benchmarks

The NAS Parallel Benchmarks (NPB) are a small set of programs designed to help evaluate the performance of parallel supercomputers. The benchmarks are derived from computational fluid dynamics (CFD) applications and consist of five kernels and three pseudo-applications in the original "pencil-and-paper" specification (NPB 1). The benchmark suite has been extended to include new benchmarks for unstructured adaptive meshes, parallel I/O, multi-zone applications, and computational grids. Problem sizes in NPB are predefined and grouped into classes. Reference implementations of NPB are available in commonly used programming models such as MPI and OpenMP (NPB 2 and NPB 3).

Benchmark Specifications

The original eight benchmarks specified in NPB 1 mimic the computation and data movement in CFD applications:

Five kernels
- IS - Integer Sort, random memory access
- EP - Embarrassingly Parallel
- CG - Conjugate Gradient, irregular memory access and communication
- MG - Multi-Grid on a sequence of meshes, long- and short-distance communication, memory intensive
- FT - discrete 3D fast Fourier Transform, all-to-all communication
Three pseudo-applications
- BT - Block Tri-diagonal solver
- SP - Scalar Penta-diagonal solver
- LU - Lower-Upper Gauss-Seidel solver

Multi-zone versions of NPB (NPB-MZ) are designed to exploit multiple levels of parallelism in applications and to test the effectiveness of multi-level and hybrid parallelization paradigms and tools. There are three benchmark problems, each derived from one of the single-zone pseudo-applications of NPB:

BT-MZ - uneven-size zones within a problem class, increased number of zones as problem class grows
SP-MZ - even-size zones within a problem class, increased number of zones as problem class grows
LU-MZ - even-size zones within a problem class, a fixed number of zones for all problem classes
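
To illustrate why uneven zone sizes stress load balancing more than even ones, the sketch below assigns zones to processes with a greedy largest-first heuristic and reports the resulting imbalance. The zone sizes, the function names `assign_zones` and `imbalance`, and the heuristic itself are assumptions made for illustration; this is not presented as the scheme used by the NPB-MZ reference implementations.

```python
import heapq

def assign_zones(zone_sizes, num_procs):
    """Greedy largest-first assignment of zones to processes.

    Returns a list with the total load (sum of zone sizes) per process.
    """
    # Min-heap of (current load, process index): the least loaded
    # process always receives the next zone.
    heap = [(0, p) for p in range(num_procs)]
    heapq.heapify(heap)
    loads = [0] * num_procs
    for size in sorted(zone_sizes, reverse=True):
        load, p = heapq.heappop(heap)
        loads[p] = load + size
        heapq.heappush(heap, (loads[p], p))
    return loads

def imbalance(loads):
    """Ratio of the maximum load to the mean load (1.0 means balanced)."""
    return max(loads) / (sum(loads) / len(loads))

# Hypothetical zone sizes (grid points per zone), for illustration only.
even_zones = [100] * 8                                # all zones equal
uneven_zones = [400, 100, 100, 50, 50, 40, 30, 30]    # sizes differ widely

print(imbalance(assign_zones(even_zones, 4)))    # 1.0
print(imbalance(assign_zones(uneven_zones, 4)))  # 2.0
```

With these made-up sizes, the single large zone dominates one process no matter how the remaining zones are placed, so parallelism inside a zone (the second level) is needed to recover balance.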

Benchmarks for unstructured computation, parallel I/O, and data movement

UA - Unstructured Adaptive mesh, dynamic and irregular memory access
BT-IO - test of different parallel I/O techniques
DC - Data Cube
DT - Data Traffic

GridNPB is designed specifically to rate the performance of computational grids. Each of the four benchmarks in the set consists of a collection of communicating tasks derived from the NPB. They represent distributed applications typically run on grids.

ED - Embarrassingly Distributed
HC - Helical Chain
VP - Visualization Pipeline
MB - Mixed Bag

Documents on benchmark specifications:

NAS Parallel Benchmarks, RNR-94-007 (PDF-425KB) for IS, EP, CG, MG, FT, BT, SP, LU
NAS Parallel Benchmarks I/O Version 2.4, NAS-03-002 (PDF-57KB) for BT-IO
Unstructured Adaptive NAS Parallel Benchmark, NAS-04-006 (PDF-184KB) for UA
NAS Parallel Benchmarks, Multi-Zone Versions, NAS-03-010 (PDF-128KB) for BT-MZ, SP-MZ, LU-MZ
Arithmetic Data Cube as a Data Intensive Benchmark, NAS-03-005 (PDF-65KB) for DC
NAS Grid Benchmarks Version 1.0, NAS-02-005 (PDF-352KB) for GridNPB

Benchmark Classes

Class S: small for quick test purposes
Class W: workstation size (a 1990s workstation; now likely too small)
Classes A, B, C: standard test problems; ~4X size increase going from one class to the next
Classes D, E, F: large test problems; ~16X size increase from each of the previous classes
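
The scaling between classes compounds quickly. The following sketch multiplies out the approximate per-step factors listed above to give each class's size relative to Class A. It assumes the "~4X" and "~16X" figures are exact, which they are not; actual problem sizes vary by benchmark. The names `STEP_FACTOR` and `relative_size` are illustrative.

```python
# Approximate growth factor when moving from the previous class to this
# one, taken from the class descriptions above. Classes S and W are left
# out because no scaling factor is stated for them.
STEP_FACTOR = {"B": 4, "C": 4, "D": 16, "E": 16, "F": 16}
CLASS_ORDER = ["A", "B", "C", "D", "E", "F"]

def relative_size(npb_class):
    """Approximate problem size relative to Class A (Class A == 1)."""
    size = 1
    for name in CLASS_ORDER[1 : CLASS_ORDER.index(npb_class) + 1]:
        size *= STEP_FACTOR[name]
    return size

for name in CLASS_ORDER:
    print(name, relative_size(name))
```

Under these assumptions Class C is about 16 times Class A, and Class F is about 65,536 times Class A.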

Details of NPB Problem Sizes and Parameters.

Reference Implementations

Vendors and others implement the detailed specifications in the NPB 1 report, using algorithms and programming models appropriate to their different machines. NPB 1 implementations are generally proprietary and are not distributed by NAS.

A set of reference implementations of the NPB specifications has been written and distributed by NAS as NPB 2 and NPB 3. These source-code implementations are intended to be run with little or no tuning, and approximate the performance a typical user can expect to obtain for a portable parallel program. NPB 2 contains MPI-based source-code implementations of the original eight benchmarks, and NPB 3 adds new benchmarks and problem classes as well as implementations using other programming models. NPB 3.4.2 is the latest release.

Summary of Source Code Releases with Download Links
Download Version	Benchmarks Included	Problem Classes	Programming Models Used	Major Changes
NPB 3.4.2 (GZIP, 428KB)	IS, EP, CG, MG, FT, BT, BT-IO, SP, LU, UA, DC, DT	S,W,A,B,C,D,E,F	MPI, OpenMP	added Class F, dynamic memory allocation
NPB 3.4.2-MZ (GZIP, 232KB)	BT-MZ, SP-MZ, LU-MZ	S,W,A,B,C,D,E,F	MPI+OpenMP, OpenMP	dynamic memory allocation
NPB 3.3.1 (GZIP, 625KB)	IS, EP, CG, MG, FT, BT, BT-IO, SP, LU, UA, DC, DT	S,W,A,B,C,D,E	MPI, OpenMP, serial	added Class E
NPB 3.3.1-MZ (GZIP, 336KB)	BT-MZ, SP-MZ, LU-MZ	S,W,A,B,C,D,E,F	MPI+OpenMP, OpenMP, serial	nested OpenMP version
GridNPB 3.1 (GZIP, 112KB)	ED, HC, VP, MB	S,W,A,B	Globus, Java, serial	added Globus version
NPB 3.0 (GZIP, 507KB)	IS, EP, CG, MG, FT, BT, SP, LU	S,W,A,B,C	OpenMP, HPF, Java	new programming paradigms
NPB 2.4.1 (GZIP, 164KB)	IS, EP, CG, MG, FT, BT, BT-IO, SP, LU	S,W,A,B,C,D	MPI	added BT-IO, Class D
NPB 2.3 (GZIP, 276KB)	IS, EP, CG, MG, FT, BT, SP, LU	S,W,A,B,C	MPI, serial	added CG, serial version

Reports and Results

NPB 2 and NPB 3

These reports document some details of and experience with the reference implementations in NPB 2 and NPB 3:

Results:

The results presented here are partly generated by NAS, and partly by computer vendors and the HPC community. NAS is currently working on creating a more automatic archiving facility, as well as a tool for creating performance plots on the fly.

NPB 2.1 Results Report, NAS-95-010 (PDF-213KB)

How To Submit Results:

NAS solicits NPB 2 and NPB 3 results from all sources. They will be incorporated into a database maintained by NAS, and will also be made available on this web page. Send results to npb@nas.nasa.gov. Details on what information should be submitted are contained in Appendix D of the NPB 2.0 report. This information includes:

Output generated by the benchmarks (includes timing information).
All files that were changed (including Makefiles).
Detailed information about the hardware and software on which the benchmark was run, and contact information for the submitter. Submitters should make every attempt to provide enough information so that their experiment(s) can be repeated by others.

NPB 1

Vendors and others implement the detailed specifications in the original "pencil and paper" benchmark specifications, using algorithms and programming models appropriate to their different machines. Submitted results are verified by the NAS Division.

Results:

How To Submit Results:

NAS solicits NPB 1 results from all sources, but especially from computer vendors. These results are compiled in tabular fashion in periodic reports that will appear on this web page. Send results to npb@nas.nasa.gov. Details on what information to include are contained in section 1.4 of the NPB 1 specification.

GridNPB 3

The NAS distribution of GridNPB contains serial and concurrent reference implementations in Fortran and Java, including a version that uses Globus as grid middleware. A variation of the grid benchmarks called "Rapid Fire" is included.

Reports on implementations: GridNPB 3.0, NAS-02-005 (PDF-352KB)

How To Submit Results:

Objective measures for grid performance are not yet available; NAS currently does not collect GridNPB performance results. A valid implementation should always report turnaround time and, preferably, resources used, including libraries, schedulers, resource brokers, and so forth.
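
As a minimal illustration of reporting turnaround time together with the resources used, the sketch below times a task from start to completion and bundles the result into a small record. The `measure_turnaround` helper, the record fields, the stand-in task, and the resource labels are all hypothetical; GridNPB does not define this format, and a real grid run would measure from job submission rather than from a local function call.

```python
import time

def measure_turnaround(run_task, resources):
    """Run a task and return a report with its turnaround time in seconds."""
    start = time.perf_counter()
    result = run_task()
    elapsed = time.perf_counter() - start
    return {
        "turnaround_seconds": elapsed,
        "resources": list(resources),  # e.g. libraries, schedulers, brokers
        "result": result,
    }

def dummy_task():
    """Stand-in for a benchmark task (hypothetical workload)."""
    return sum(range(1000))

report = measure_turnaround(
    dummy_task,
    ["library: example-lib", "scheduler: example-scheduler"],  # placeholder labels
)
print(f"turnaround: {report['turnaround_seconds']:.6f} s")
print("resources:", ", ".join(report["resources"]))
```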