@simd enables unexpected fastmath optimizations #49387

Closed
mikmoore opened this issue Apr 17, 2023 · 2 comments
Labels
maths Mathematical functions

Comments

@mikmoore
Contributor

@simd results in @fastmath being applied. This is undocumented behavior (except for re-association of reduction variables). The behavior needs to be documented or the @fastmath needs to not occur.

Reduced example derived from an effort to write a faster norm(x, 2):

# Julia v1.8.0

function test(x)
	scale = exp2(768)
	s1 = 0.0
	s2 = 0.0
	@simd for z in x
		s1 += abs2(z)
		s2 += abs2(z*scale)
	end
	s = s2==Inf ? s1 : s2
	return s,s1,s2
end

test(1:3)
# (Inf, 14.0, Inf)

Inspecting the @code_llvm output shows many fast-math operations, and the @code_native output shows unauthorized fma instructions. The result should be s == s1 == 14.0, but instead we get s == s2 == Inf because the s2==Inf check is folded to false under the no-infinities ("ninf") assumption inherited from @fastmath.
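
For comparison, here is a minimal sketch of the same loop with the @simd annotation removed, so that ordinary IEEE semantics should apply to the s2==Inf check. The name test_plain is hypothetical and only serves to distinguish it from test above.

```julia
# Same reduction as `test`, but without @simd (hypothetical helper name).
function test_plain(x)
    scale = exp2(768)
    s1 = 0.0
    s2 = 0.0
    for z in x
        s1 += abs2(z)
        # (z*scale)^2 exceeds floatmax(Float64), so this overflows to Inf
        s2 += abs2(z * scale)
    end
    # Without fast-math flags the comparison against Inf is evaluated as written
    s = s2 == Inf ? s1 : s2
    return s, s1, s2
end

test_plain(1:3)
# (14.0, 14.0, Inf)
```

Comparing @code_llvm for test and test_plain is one way to see which instructions pick up the extra flags.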

@giordano
Contributor

I don't think "@simd invokes @fastmath" is an accurate description: as far as I know, @simd never invokes the @fastmath macro. Rather, the code it generates enables LLVM fast-math flags under certain conditions, I presume around

/// If Phi is part of a reduction cycle of FAdd, FSub, FMul or FDiv,
/// mark the ops as permitting reassociation/commuting.
/// As of LLVM 4.0, FDiv is not handled by the loop vectorizer
static void enableUnsafeAlgebraIfReduction(PHINode *Phi, Loop *L) JL_NOTSAFEPOINT
{
    typedef SmallVector<Instruction*, 8> chainVector;
    chainVector chain;
    Instruction *J;
    unsigned opcode = 0;
    for (Instruction *I = Phi; ; I=J) {
        J = NULL;
        // Find the user of instruction I that is within loop L.
        for (User *UI : I->users()) { /*}*/
            Instruction *U = cast<Instruction>(UI);
            if (L->contains(U)) {
                if (J) {
                    LLVM_DEBUG(dbgs() << "LSL: not a reduction var because op has two internal uses: " << *I << "\n");
                    return;
                }
                J = U;
            }
        }
        if (!J) {
            LLVM_DEBUG(dbgs() << "LSL: chain prematurely terminated at " << *I << "\n");
            return;
        }
        if (J == Phi) {
            // Found the entire chain.
            break;
        }
        if (opcode) {
            // Check that arithmetic op matches prior arithmetic ops in the chain.
            if (getReduceOpcode(J, I) != opcode) {
                LLVM_DEBUG(dbgs() << "LSL: chain broke at " << *J << " because of wrong opcode\n");
                return;
            }
        }
        else {
            // First arithmetic op in the chain.
            opcode = getReduceOpcode(J, I);
            if (!opcode) {
                LLVM_DEBUG(dbgs() << "LSL: first arithmetic op in chain is uninteresting" << *J << "\n");
                return;
            }
        }
        chain.push_back(J);
    }
    switch (opcode) {
        case Instruction::FAdd:
            ++AddChains;
            break;
        case Instruction::FMul:
            ++MulChains;
            break;
    }
    ++ReductionChains;
    int length = 0;
    for (chainVector::const_iterator K=chain.begin(); K!=chain.end(); ++K) {
        LLVM_DEBUG(dbgs() << "LSL: marking " << **K << "\n");
        (*K)->setFast(true);
        ++length;
    }
    ReductionChainLength += length;
    MaxChainLength.updateMax(length);
}
probably (*K)->setFast(true); in particular.

mikmoore changed the title from "@simd invokes @fastmath" to "@simd enables unexpected fastmath optimizations" on Apr 18, 2023
@vchuravy
Member

help?> @simd
  @simd

    •  Floating-point operations on reduction variables can be reordered,
       possibly causing different results than without @simd.

There is an argument for enabling only the reassoc and contract flags rather than the full set of fast-math flags.
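
To make those two flags concrete, the following sketch reproduces by hand what each one would permit: reassoc allows regrouping a sum, and contract allows fusing a multiply and an add into one fma. The variable names are illustrative, and the snippet does not use @simd itself; it only shows that either transformation can change a Float64 result without involving Inf or NaN.

```julia
# reassoc: regrouping a floating-point sum can change the rounded result
left  = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
println(left == right)      # false

# contract: a fused multiply-add skips the intermediate rounding of a*b
a = 1.0 + exp2(-30)
b = 1.0 - exp2(-30)
separate = a * b - 1.0      # a*b rounds to 1.0 first, so this is 0.0
fused    = fma(a, b, -1.0)  # exact product 1 - 2^-60 is kept: -2^-60
println(separate == fused)  # false
```

Neither effect could turn the s2==Inf check into a constant, which is what the no-infinities assumption does in the reduced example.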
