Numerical.FFT*
From BenchIT-Wiki
short description
The BenchIT FFT kernels allow performance analysis of different FFT types (1D/2D/3D, float/double etc.) using different FFT libraries. Currently, the following libraries are supported: FFTW3 (GPL), MKL (Intel), ACML (AMD), SCSL (SGI).
All FFT kernels were generated automatically with the BenchIT Kernel Generator, which is available to developers only. You do not need it to run the kernels.
requirements
A C compiler and the libraries you want to do your measurements with.
parameters
kernel specific parameters
BENCHIT_KERNEL_PROBLEMSIZE_MIN and BENCHIT_KERNEL_PROBLEMSIZE_MAX
Only for "powers of two" kernels. These two variables define the lower and upper bound of the problemsize as exponents of two. For example, with BENCHIT_KERNEL_PROBLEMSIZE_MIN = 1 and BENCHIT_KERNEL_PROBLEMSIZE_MAX = 20, the problemsize iterates from 2^1 = 2 to 2^20 = 1048576.
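The mapping from the two exponents to actual problemsizes can be sketched as follows. This is only an illustration, not BenchIT code: the helper name `pow2_problemsizes` is hypothetical, and it assumes that one measurement is taken per exponent, with both bounds included.

```c
#include <stddef.h>

/* Hypothetical helper (not part of BenchIT): fills sizes[] with
 * 2^min_exp, 2^(min_exp+1), ..., 2^max_exp and returns the number of
 * entries written. Returns 0 if the range is invalid or the buffer is
 * too small. Assumption: both bounds are inclusive. */
size_t pow2_problemsizes(unsigned min_exp, unsigned max_exp,
                         unsigned long *sizes, size_t capacity)
{
    size_t count, i;

    /* unsigned long is only guaranteed to hold 32 bits */
    if (min_exp > max_exp || max_exp > 31)
        return 0;

    count = (size_t)(max_exp - min_exp) + 1;
    if (count > capacity)
        return 0;

    for (i = 0; i < count; ++i)
        sizes[i] = 1UL << (min_exp + (unsigned)i);

    return count;
}
```

Called with exponents 1 and 20, the helper yields 20 problemsizes, starting at 2 and ending at 1048576.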
BENCHIT_KERNEL_PROBLEMSIZES
Only for "non-powers of two" kernels. Explicitly defines all problemsizes to be measured, as a comma-separated list. We try to stick to the problemsizes that the FFTW team uses for its own speed measurements.
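A comma-separated list like this can be turned into an array of problemsizes with a few lines of C. The sketch below is not taken from the BenchIT sources. The function name `parse_problemsizes` is hypothetical, and the strictness (no empty entries, no negative or zero values) is an assumption about what a sensible parser would accept.

```c
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not part of BenchIT): parses a list such as
 * "6,9,12,15" into sizes[]. Returns the number of values parsed, or -1
 * if the list is malformed or holds more than `capacity` values. */
int parse_problemsizes(const char *list, unsigned long *sizes, size_t capacity)
{
    size_t count = 0;
    const char *p = list;

    if (list == NULL)
        return -1;

    while (*p != '\0') {
        char *end;
        unsigned long value;

        while (*p == ' ')                 /* allow blanks before a value */
            ++p;
        if (!isdigit((unsigned char)*p))  /* rejects "", "-3", ",," */
            return -1;

        errno = 0;
        value = strtoul(p, &end, 10);
        if (errno == ERANGE || value == 0 || count == capacity)
            return -1;
        sizes[count++] = value;

        while (*end == ' ')               /* allow blanks after a value */
            ++end;
        if (*end == ',')
            p = end + 1;
        else if (*end == '\0')
            p = end;
        else
            return -1;                    /* unexpected character */
    }
    return (int)count;
}
```

In a kernel, the input string would typically come from something like `getenv("BENCHIT_KERNEL_PROBLEMSIZES")`. Whether the generated kernels read the variable exactly this way is an assumption.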
additional parameters
You can override any environment variable in this parameters file. Some important ones are listed there and commented out. For an explanation of these variables, please look at LOCALDEFS. For serial (single-CPU) kernels we set "BENCHIT_NUM_CPUS" to "1". We also use a high accuracy ("BENCHIT_RUN_ACCURACY") by overriding the LOCALDEFS default with "20".
expected results
You will usually see a typical cache-dominated performance curve with high performance for small problemsizes and significantly lower performance for problemsizes that exceed the processor cache. You can check your results against the FFTW speed results of a similar processor if you like.
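To estimate where the drop is likely to occur, it helps to compare the size of the transformed data with the cache size. The following sketch computes the data footprint of one transform. It is an illustration only: the helper name is hypothetical, and it assumes complex elements of 8 bytes (single precision) or 16 bytes (double precision) and ignores any scratch space or twiddle tables the libraries allocate internally.

```c
/* Hypothetical helper (not part of BenchIT): bytes of array data touched
 * by one complex FFT over `points` elements in total (for 2D/3D, the
 * product of all dimensions). An out-of-place transform uses two arrays,
 * an in-place transform one. Assumes 4-byte float and 8-byte double. */
unsigned long long fft_working_set_bytes(unsigned long long points,
                                         int double_precision,
                                         int out_of_place)
{
    unsigned long long bytes_per_element = double_precision ? 16ULL : 8ULL;
    unsigned long long arrays = out_of_place ? 2ULL : 1ULL;

    return points * bytes_per_element * arrays;
}
```

For example, an in-place double precision transform with 2^20 points touches 16 MiB of array data, which already exceeds the last-level cache of many processors.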
detailed description
All kernels are written in C. The different kernels include:
- the libraries FFTW, MKL, ACML, and SCSL
- 1D, 2D, and 3D FFTs
- "powers of two" and "non-powers of two"
- single precision complex and double precision complex
- OpenMP versions of the MKL and SCSL kernels (2D and 3D only)
Each kernel performs an in-place measurement (input array and result array are the same) and an out-of-place measurement (input array and result array are different).
This is a detailed list of all 64 FFT kernels that are currently available:
(n-dimensional, library, precision, powers-of-two/non-powers-of-two)
- 1D, FFTW3, double precision, powers of two
- 1D, FFTW3, double precision, non-powers of two
- 1D, MKL, double precision, powers of two
- 1D, MKL, double precision, non-powers of two
- 1D, SCSL, double precision, powers of two
- 1D, SCSL, double precision, non-powers of two
- 1D, ACML, double precision, powers of two
- 1D, ACML, double precision, non-powers of two
- 2D, FFTW3, double precision, powers of two
- 2D, FFTW3, double precision, non-powers of two
- 2D, MKL, double precision, powers of two
- 2D, MKL, double precision, non-powers of two
- 2D, SCSL, double precision, powers of two
- 2D, SCSL, double precision, non-powers of two
- 2D, ACML, double precision, powers of two
- 2D, ACML, double precision, non-powers of two
- 2D, MKL (OpenMP), double precision, powers of two
- 2D, MKL (OpenMP), double precision, non-powers of two
- 2D, SCSL (OpenMP), double precision, powers of two
- 2D, SCSL (OpenMP), double precision, non-powers of two
- 3D, FFTW3, double precision, powers of two
- 3D, FFTW3, double precision, non-powers of two
- 3D, MKL, double precision, powers of two
- 3D, MKL, double precision, non-powers of two
- 3D, SCSL, double precision, powers of two
- 3D, SCSL, double precision, non-powers of two
- 3D, ACML, double precision, powers of two
- 3D, ACML, double precision, non-powers of two
- 3D, MKL (OpenMP), double precision, powers of two
- 3D, MKL (OpenMP), double precision, non-powers of two
- 3D, SCSL (OpenMP), double precision, powers of two
- 3D, SCSL (OpenMP), double precision, non-powers of two
- 1D, FFTW3, single precision, powers of two
- 1D, FFTW3, single precision, non-powers of two
- 1D, MKL, single precision, powers of two
- 1D, MKL, single precision, non-powers of two
- 1D, SCSL, single precision, powers of two
- 1D, SCSL, single precision, non-powers of two
- 1D, ACML, single precision, powers of two
- 1D, ACML, single precision, non-powers of two
- 2D, FFTW3, single precision, powers of two
- 2D, FFTW3, single precision, non-powers of two
- 2D, MKL, single precision, powers of two
- 2D, MKL, single precision, non-powers of two
- 2D, SCSL, single precision, powers of two
- 2D, SCSL, single precision, non-powers of two
- 2D, ACML, single precision, powers of two
- 2D, ACML, single precision, non-powers of two
- 2D, MKL (OpenMP), single precision, powers of two
- 2D, MKL (OpenMP), single precision, non-powers of two
- 2D, SCSL (OpenMP), single precision, powers of two
- 2D, SCSL (OpenMP), single precision, non-powers of two
- 3D, FFTW3, single precision, powers of two
- 3D, FFTW3, single precision, non-powers of two
- 3D, MKL, single precision, powers of two
- 3D, MKL, single precision, non-powers of two
- 3D, SCSL, single precision, powers of two
- 3D, SCSL, single precision, non-powers of two
- 3D, ACML, single precision, powers of two
- 3D, ACML, single precision, non-powers of two
- 3D, MKL (OpenMP), single precision, powers of two
- 3D, MKL (OpenMP), single precision, non-powers of two
- 3D, SCSL (OpenMP), single precision, powers of two
- 3D, SCSL (OpenMP), single precision, non-powers of two
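The total of 64 follows from the combinations described above: four libraries in 1D and six variants (four libraries plus the two OpenMP versions) in 2D and 3D, each in two precisions and two problemsize schemes. The sketch below enumerates them in the same order as the list. It is not the BenchIT Kernel Generator, and the function name `list_fft_kernels` is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper (not part of BenchIT): writes one line per kernel
 * variant to `out` (may be NULL to only count) and returns the number of
 * variants. */
int list_fft_kernels(FILE *out)
{
    static const char *const precisions[] = { "double", "single" };
    static const char *const libraries[] = {
        "FFTW3", "MKL", "SCSL", "ACML", "MKL (OpenMP)", "SCSL (OpenMP)"
    };
    static const char *const schemes[] = {
        "powers of two", "non-powers of two"
    };
    int count = 0;
    int p, dim, lib, s;

    for (p = 0; p < 2; ++p) {
        for (dim = 1; dim <= 3; ++dim) {
            for (lib = 0; lib < 6; ++lib) {
                /* the OpenMP variants exist for 2D and 3D only */
                if (lib >= 4 && dim == 1)
                    continue;
                for (s = 0; s < 2; ++s) {
                    if (out != NULL)
                        fprintf(out, "- %dD, %s, %s precision, %s\n",
                                dim, libraries[lib], precisions[p],
                                schemes[s]);
                    ++count;
                }
            }
        }
    }
    return count;
}
```

Per precision this gives 8 kernels in 1D and 12 each in 2D and 3D, so 32 per precision and 64 in total.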
