# Numerical.FFT*



# short description

The BenchIT FFT kernels allow performance analysis of different FFT types (1D/2D/3D, float/double etc.) using different FFT libraries. Currently, the following libraries are supported: FFTW3 (GPL), MKL (Intel), ACML (AMD), SCSL (SGI).

All FFT kernels were generated automatically with the BenchIT Kernel Generator, which is available to developers only (you won't need it anyway).

# requirements

A C compiler and the libraries you want to do your measurements with.

# parameters

## kernel specific parameters

### BENCHIT_KERNEL_PROBLEMSIZE_MIN and BENCHIT_KERNEL_PROBLEMSIZE_MAX

These two parameters apply only to the "powers of two" kernels. They define the lower and upper bounds of the problem size as exponents of two. For example, with BENCHIT_KERNEL_PROBLEMSIZE_MIN = 1 and BENCHIT_KERNEL_PROBLEMSIZE_MAX = 20, the problem size iterates from 2^1 = 2 to 2^20 = 1048576.
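The mapping from the two exponents to concrete problem sizes can be sketched in a few lines of C. This is a minimal illustration, not the BenchIT source: the function name `pow2_problem_sizes` is hypothetical, and it assumes that every exponent between the two bounds is visited exactly once.

```c
#include <stddef.h>

/* Hypothetical helper, not part of BenchIT.
 * Fills `sizes` with 2^min_exp, 2^(min_exp+1), ..., 2^max_exp and returns
 * the number of entries written. Returns 0 if the bounds are inconsistent,
 * the largest size would not fit into an unsigned long, or `capacity` is
 * too small. */
size_t pow2_problem_sizes(unsigned min_exp, unsigned max_exp,
                          unsigned long *sizes, size_t capacity)
{
    if (min_exp > max_exp || max_exp >= 8 * sizeof(unsigned long))
        return 0;

    size_t count = (size_t)(max_exp - min_exp) + 1;
    if (count > capacity)
        return 0;

    for (size_t i = 0; i < count; ++i)
        sizes[i] = 1UL << (min_exp + i);   /* 2^(min_exp + i) */

    return count;
}
```

With the bounds from the example above (1 and 20), the helper produces 20 sizes, the first being 2 and the last 1048576.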

### BENCHIT_KERNEL_PROBLEMSIZES

This parameter applies only to the "non-powers of two" kernels. It explicitly defines all problem sizes that are to be measured, as a comma-separated list. We try to stick to the problem sizes that the FFTW team uses for its own speed measurements.
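To illustrate what a comma-separated list of problem sizes looks like once it is turned into numbers, here is a small C sketch. The function name `parse_problem_sizes` is hypothetical and the parsing rules (decimal values, optional spaces, zero rejected) are assumptions; the actual kernels may read the variable differently.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, not part of BenchIT.
 * Parses a list such as "6,9,12,15" into `sizes` and returns the number of
 * values read. Returns 0 on malformed input, on a zero or out-of-range
 * value, or if more than `capacity` values are present. */
size_t parse_problem_sizes(const char *list, unsigned long *sizes,
                           size_t capacity)
{
    size_t count = 0;
    const char *p = list;

    while (*p != '\0') {
        char *end;
        errno = 0;
        unsigned long value = strtoul(p, &end, 10);

        /* No digits consumed, overflow, zero size, or output array full. */
        if (end == p || errno != 0 || value == 0 || count == capacity)
            return 0;
        sizes[count++] = value;

        while (*end == ' ')          /* tolerate spaces after a number */
            ++end;
        if (*end == ',')
            p = end + 1;             /* continue with the next entry */
        else if (*end == '\0')
            p = end;                 /* end of list */
        else
            return 0;                /* unexpected character */
    }
    return count;
}
```

The values in the doc comment are arbitrary examples, not the sizes shipped with the kernels.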

## additional parameters

You can override every environment variable in this parameters file. Some important ones are listed there and commented out. For an explanation of these variables, please look at LOCALDEFS. For serial (single-CPU) kernels we set "BENCHIT_NUM_CPUS" to "1". We also use a high accuracy ("BENCHIT_RUN_ACCURACY") by overriding the LOCALDEFS default with "20".

# expected results

You will usually see a typical cache-dominated performance curve: high performance for small problem sizes and significantly lower performance for problem sizes whose data no longer fits into the processor cache. You can check your results against the FFTW speed results of a similar processor if you like.
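A rough way to estimate where the drop should appear is to compare the size of the data arrays with the cache size. The sketch below does this for a 1D complex transform. It is only an approximation under stated assumptions: the function names are hypothetical, the cache size is whatever you pass in, and twiddle-factor tables or library workspace are ignored.

```c
/* Hypothetical helpers, not part of BenchIT.
 * Bytes occupied by the data arrays of an n-point 1D complex FFT.
 * bytes_per_real is 8 for double precision and 4 for single precision;
 * an out-of-place transform needs a second array of the same size. */
unsigned long long fft_working_set_bytes(unsigned long long n,
                                         unsigned long long bytes_per_real,
                                         int out_of_place)
{
    unsigned long long per_array = n * 2ULL * bytes_per_real; /* re + im */
    return out_of_place ? 2ULL * per_array : per_array;
}

/* Smallest exponent e for which the arrays of a 2^e-point transform no
 * longer fit into a cache of cache_bytes bytes (search capped at 2^48). */
unsigned first_exponent_exceeding_cache(unsigned long long cache_bytes,
                                        unsigned long long bytes_per_real,
                                        int out_of_place)
{
    unsigned e = 0;
    while (e < 48 &&
           fft_working_set_bytes(1ULL << e, bytes_per_real, out_of_place)
               <= cache_bytes)
        ++e;
    return e;
}
```

For an assumed 1 MiB cache, a double precision in-place transform fits up to 2^16 points (16 bytes per point), so the estimate places the drop at 2^17; the out-of-place variant reaches the limit one exponent earlier.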

# detailed description

All kernels are written in C. The different kernels include:

- the libraries FFTW, MKL, ACML, and SCSL
- 1D, 2D, and 3D FFTs
- "powers of two" and "non-powers of two"
- single precision complex and double precision complex
- OpenMP version for the MKL and SCSL kernels (only 2D and 3D)

Each kernel performs an in-place measurement (input array and result array are the same) and an out-of-place measurement (input array and result array are different).
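The difference between the two modes can be shown with a deliberately tiny example. The following sketch is a hand-written 4-point forward DFT in plain C. It is not taken from any of the measured libraries, and the names `cplx` and `dft4` are hypothetical. What matters is only the calling convention: passing two different arrays gives an out-of-place transform, passing the same array twice gives an in-place one.

```c
/* Hypothetical types and names, for illustration only. */
typedef struct { double re, im; } cplx;

/* 4-point forward DFT: X[k] = sum_j x[j] * exp(-2*pi*i*j*k/4).
 * For n = 4 the twiddle factors are exactly 1, -i, -1 and i, so no
 * trigonometric functions are needed. */
void dft4(const cplx *in, cplx *out)
{
    /* Copy the inputs first so the routine stays correct when in == out. */
    cplx a = in[0], b = in[1], c = in[2], d = in[3];

    /* X0 = a + b + c + d */
    out[0].re = a.re + b.re + c.re + d.re;
    out[0].im = a.im + b.im + c.im + d.im;
    /* X1 = a - i*b - c + i*d */
    out[1].re = a.re + b.im - c.re - d.im;
    out[1].im = a.im - b.re - c.im + d.re;
    /* X2 = a - b + c - d */
    out[2].re = a.re - b.re + c.re - d.re;
    out[2].im = a.im - b.im + c.im - d.im;
    /* X3 = a + i*b - c - i*d */
    out[3].re = a.re - b.im - c.re + d.im;
    out[3].im = a.im + b.re - c.im - d.re;
}
```

Calling `dft4(x, y)` leaves `x` untouched and writes the spectrum to `y`, while `dft4(x, x)` overwrites the input with the result. The in-place call needs half the memory for data arrays, which is one reason the two modes can show different performance.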

This is a detailed list of all 64 FFT kernels that are currently available:

(n-dimensional, library, precision, powers-of-two/non-powers-of-two)

- 1D, FFTW3, double precision, powers of two
- 1D, FFTW3, double precision, non-powers of two
- 1D, MKL, double precision, powers of two
- 1D, MKL, double precision, non-powers of two
- 1D, SCSL, double precision, powers of two
- 1D, SCSL, double precision, non-powers of two
- 1D, ACML, double precision, powers of two
- 1D, ACML, double precision, non-powers of two
- 2D, FFTW3, double precision, powers of two
- 2D, FFTW3, double precision, non-powers of two
- 2D, MKL, double precision, powers of two
- 2D, MKL, double precision, non-powers of two
- 2D, SCSL, double precision, powers of two
- 2D, SCSL, double precision, non-powers of two
- 2D, ACML, double precision, powers of two
- 2D, ACML, double precision, non-powers of two
- 2D, MKL (OpenMP), double precision, powers of two
- 2D, MKL (OpenMP), double precision, non-powers of two
- 2D, SCSL (OpenMP), double precision, powers of two
- 2D, SCSL (OpenMP), double precision, non-powers of two
- 3D, FFTW3, double precision, powers of two
- 3D, FFTW3, double precision, non-powers of two
- 3D, MKL, double precision, powers of two
- 3D, MKL, double precision, non-powers of two
- 3D, SCSL, double precision, powers of two
- 3D, SCSL, double precision, non-powers of two
- 3D, ACML, double precision, powers of two
- 3D, ACML, double precision, non-powers of two
- 3D, MKL (OpenMP), double precision, powers of two
- 3D, MKL (OpenMP), double precision, non-powers of two
- 3D, SCSL (OpenMP), double precision, powers of two
- 3D, SCSL (OpenMP), double precision, non-powers of two
- 1D, FFTW3, single precision, powers of two
- 1D, FFTW3, single precision, non-powers of two
- 1D, MKL, single precision, powers of two
- 1D, MKL, single precision, non-powers of two
- 1D, SCSL, single precision, powers of two
- 1D, SCSL, single precision, non-powers of two
- 1D, ACML, single precision, powers of two
- 1D, ACML, single precision, non-powers of two
- 2D, FFTW3, single precision, powers of two
- 2D, FFTW3, single precision, non-powers of two
- 2D, MKL, single precision, powers of two
- 2D, MKL, single precision, non-powers of two
- 2D, SCSL, single precision, powers of two
- 2D, SCSL, single precision, non-powers of two
- 2D, ACML, single precision, powers of two
- 2D, ACML, single precision, non-powers of two
- 2D, MKL (OpenMP), single precision, powers of two
- 2D, MKL (OpenMP), single precision, non-powers of two
- 2D, SCSL (OpenMP), single precision, powers of two
- 2D, SCSL (OpenMP), single precision, non-powers of two
- 3D, FFTW3, single precision, powers of two
- 3D, FFTW3, single precision, non-powers of two
- 3D, MKL, single precision, powers of two
- 3D, MKL, single precision, non-powers of two
- 3D, SCSL, single precision, powers of two
- 3D, SCSL, single precision, non-powers of two
- 3D, ACML, single precision, powers of two
- 3D, ACML, single precision, non-powers of two
- 3D, MKL (OpenMP), single precision, powers of two
- 3D, MKL (OpenMP), single precision, non-powers of two
- 3D, SCSL (OpenMP), single precision, powers of two
- 3D, SCSL (OpenMP), single precision, non-powers of two
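The list above follows a regular pattern, which the following C sketch reproduces as a cross-check of the total. The function name `count_fft_kernels` is hypothetical; the only rule encoded is the one stated earlier, namely that the OpenMP variants exist for MKL and SCSL in 2D and 3D only.

```c
#include <stdio.h>

/* Hypothetical helper, not part of BenchIT.
 * Enumerates the kernel variants in the same order as the list above and
 * returns their number. If `print` is non-zero, one line is printed per
 * variant. */
int count_fft_kernels(int print)
{
    static const char *precisions[]  = { "double precision", "single precision" };
    static const char *serial_libs[] = { "FFTW3", "MKL", "SCSL", "ACML" };
    static const char *openmp_libs[] = { "MKL (OpenMP)", "SCSL (OpenMP)" };
    static const char *size_kinds[]  = { "powers of two", "non-powers of two" };
    int count = 0;

    for (int p = 0; p < 2; ++p) {
        for (int dim = 1; dim <= 3; ++dim) {
            /* Serial libraries exist for every dimension. */
            for (int l = 0; l < 4; ++l)
                for (int s = 0; s < 2; ++s) {
                    if (print)
                        printf("- %dD, %s, %s, %s\n", dim, serial_libs[l],
                               precisions[p], size_kinds[s]);
                    ++count;
                }
            /* OpenMP versions exist for 2D and 3D only. */
            if (dim >= 2)
                for (int l = 0; l < 2; ++l)
                    for (int s = 0; s < 2; ++s) {
                        if (print)
                            printf("- %dD, %s, %s, %s\n", dim, openmp_libs[l],
                                   precisions[p], size_kinds[s]);
                        ++count;
                    }
        }
    }
    return count;
}
```

Per precision this gives 8 variants in 1D and 12 each in 2D and 3D, that is 32, and 64 for both precisions together.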