herumi's technical notes: x64

2012-06-20

2011-08-27

fast double precision exponential function with SSE

I wrote a fast double-precision exponential function using SSE2.

fmath.hpp (https://github.com/herumi/fmath, fast approximate float function fmath)

benchmark of fmath::expd
CPU	OS	compiler	std::exp	fmath::expd	fmath::expd_v (array version, per element)
Xeon X5650 2.67GHz	64-bit Linux	gcc 4.6.0	128.89	27.38	17.84
i7-2600 3.4GHz	64-bit Linux	gcc 4.4.5	69.11	12.10	8.25
i7-2600 3.4GHz	64-bit Windows 7	VC10	36.33	14.37	7.08

The function double fmath::expd(double), defined in fmath.hpp, is about five times faster than std::exp from gcc 4.6 on 64-bit Linux and about 2.5 times faster than std::exp from Visual Studio 2010 on 64-bit Windows.
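A comparison of this kind can be reproduced in rough form with a small timing harness. The following is only a sketch and not the code in fastexp.cpp: it uses std::chrono instead of Xbyak, so its numbers are wall-clock nanoseconds per call and are not directly comparable with the table above. The names ns_per_call and make_inputs are hypothetical helpers introduced for illustration.

```cpp
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

// Average wall-clock nanoseconds per call of f over the inputs in xs,
// repeated `repeat` times. The running sum is written to *sink so that
// the compiler cannot discard the calls as dead code.
template <class F>
double ns_per_call(F f, const std::vector<double>& xs, int repeat, double* sink)
{
    double acc = 0;
    const auto begin = std::chrono::steady_clock::now();
    for (int r = 0; r < repeat; r++) {
        for (double x : xs) acc += f(x);
    }
    const auto end = std::chrono::steady_clock::now();
    *sink = acc;
    const double ns =
        std::chrono::duration<double, std::nano>(end - begin).count();
    return ns / (static_cast<double>(xs.size()) * repeat);
}

// Evenly spaced inputs in [lo, hi). The input range is an arbitrary
// choice for this sketch, not the one used by the original benchmark.
inline std::vector<double> make_inputs(std::size_t n, double lo, double hi)
{
    std::vector<double> xs(n);
    for (std::size_t i = 0; i < n; i++) xs[i] = lo + (hi - lo) * i / n;
    return xs;
}
```

To time the standard library, pass a lambda such as `[](double x) { return std::exp(x); }`, since std::exp is overloaded and cannot be passed by name. After including fmath.hpp, the same harness can time fmath::expd by wrapping it in a lambda in the same way.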

The RMS (root mean square) error over 1,000,000 points drawn from a standard normal distribution is about 1.117645e-16.
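An RMS error of this kind could be measured along the following lines. This is a sketch under assumptions: the post does not say whether the error is absolute or relative, nor which reference value was used, so the code below takes the absolute difference against std::exp evaluated in long double. The names rms_error and candidate_exp are hypothetical; candidate_exp is a stand-in to be replaced with fmath::expd once fmath.hpp is included.

```cpp
#include <cmath>
#include <cstddef>
#include <random>

// Hypothetical stand-in for the function under test.
// Replace the body with fmath::expd(x) after including fmath.hpp.
inline double candidate_exp(double x) { return std::exp(x); }

// RMS of the absolute difference between f(x) and a long double reference,
// over n samples drawn from the standard normal distribution N(0, 1).
// Assumption: absolute (not relative) error. Note that on compilers where
// long double has the same precision as double, the reference is no more
// accurate than the function being tested.
template <class F>
double rms_error(F f, std::size_t n, unsigned seed = 12345)
{
    std::mt19937_64 rng(seed);
    std::normal_distribution<double> dist(0.0, 1.0);
    long double sum = 0;
    for (std::size_t i = 0; i < n; i++) {
        const double x = dist(rng);
        const long double ref = std::exp(static_cast<long double>(x));
        const long double diff = static_cast<long double>(f(x)) - ref;
        sum += diff * diff;
    }
    return static_cast<double>(std::sqrt(sum / n));
}
```

Calling `rms_error(candidate_exp, 1000000)` mirrors the sample count quoted above. The exact figure depends on the seed, the reference precision, and the error definition, so it should not be expected to match the reported value digit for digit.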

The benchmark source is fastexp.cpp, which requires Xbyak.

Results for several other environments are recorded in the header comment of fastexp.cpp.

Moreover, fmath.hpp provides fmath::exp(float) and fmath::log(float).
These functions are also 2 to 5 times faster than their standard library counterparts.

Give it a try if you need speed.
