Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NaN propagation in exponential function #1016

Open
nielskm opened this issue Oct 14, 2022 · 4 comments
Open

NaN propagation in exponential function #1016

nielskm opened this issue Oct 14, 2022 · 4 comments

Comments

@nielskm
Copy link
Contributor

nielskm commented Oct 14, 2022

When applying the Exp function to NaN on my CPU (Intel i7-1185G7 using the AVX3_DL target) the function returns 0.0 instead of the NaN that I expected (and that std::exp returns).

Is this a bug or performance-related feature of the highway Exp function?

@jan-wassenberg
Copy link
Member

Hi @nielskm , we were not focusing on NaN propagation in the design of the math library. This typically comes at a (minor but nonzero) cost.

The relevant line is return IfThenElseZero(Ge(x, kLowerBound), y);. Comparison of x=NaN returns false, so we get zero from IfThenElseZero. I suppose this could be turned into something like: IfThenZeroElse(Lt(x, kLowerBound), IfThenElse(Ge(x, kLowerBound), y, x). The first comparison is to flush tiny inputs to zero, the second keeps the result or replaces with x=NaN.
Are NaNs essential to your app?

@chriselrod
Copy link
Contributor

chriselrod commented Oct 17, 2022

const V y = impl.LoadExpShortRange(
d, Add(impl.ExpPoly(d, impl.ExpReduce(d, x, q)), kOne), q);

The vscalefpd instruction should handle NaNs correctly given AVX3, so correct NaN handling should be free on those targets. It should also make the current implementation of LoadExpShortRange cheaper for those targets

HWY_INLINE V LoadExpShortRange(D d, V x, VI32 e) {
const VI32 y = ShiftRight<1>(e);
return Mul(Mul(x, Pow2I(d, y)), Pow2I(d, Sub(e, y)));
}

https://www.felixcloutier.com/x86/VSCALEFPD.html
https://uops.info/html-instr/VSCALEFPD_ZMM_ZMM_ZMM.html

EDIT: I'm not sure exactly how it's handled / I'd have to look a lot closer, but can confirm that my implementation in Julia that uses vscalefpd returns NaN/0.0/Inf as appropriate without needing any checks.

@nielskm
Copy link
Contributor Author

nielskm commented Oct 18, 2022

Thank you both for your detailed replies! NaN propagation is not essential in my application, but I just happened to test highway Exp against std::exp and the exponential function implemented in Eigen3. Both of the others propagate NaNs, which made me wonder if this was a deliberate design choice in highway.

@jan-wassenberg
Copy link
Member

@chriselrod good point about using vscale for LoadExpShortRange. We could add an AddExponent op or similar to take advantage of that, if anyone is interested.
There's also vfixupimmpd for fixups when no scaling is required, but I'm not sure it's worth the cost on other platforms.

It was indeed a deliberate design choice but that can be changed if/when someone really cares about NaN propagation :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants