Skip to content

Commit 3bc09c7

Browse files
committed
[SCEVExpander] Allow udiv with isKnownNonZero(RHS) + add vscale case
Motivation here is to unblock LSRs ability to use ICmpZero uses - the major effect of which is to enable count down IVs. The test changes reflect this goal, but the potential impact is much broader since this isn't a change in LSR at all. SCEVExpander needs(*) to prove that expanding the expression is safe anywhere the SCEV expression is valid. In general, we can't expand any node which might fault (or exhibit UB) unless we can either a) prove it won't fault, or b) guard the faulting case. We'd been allowing non-zero constants here; this change extends it to non-zero values. vscale is never zero. This is already implemented in ValueTracking, and this change just adds the same logic in SCEV's range computation (which in turn drives isKnownNonZero). We should common up some logic here, but let's do that in separate changes. (*) As an aside, "needs" is such an interesting word here. First, we don't actually need to guard this at all; we could choose to emit a select for the RHS of ever udiv and remove this code entirely. Secondly, the property being checked here is way too strong. What the client actually needs is to expand the SCEV at some particular point in some particular loop. In the examples, the original urem dominates that loop and yet we completely ignore that information when analyzing legality. I don't plan to actively pursue either direction, just noting it for future reference. Differential Revision: https://reviews.llvm.org/D129710
1 parent ab02680 commit 3bc09c7

File tree

3 files changed

+15
-11
lines changed

3 files changed

+15
-11
lines changed

llvm/lib/Analysis/ScalarEvolution.cpp

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -6651,6 +6651,13 @@ ScalarEvolution::getRangeRef(const SCEV *S,
66516651
}
66526652
}
66536653

6654+
// vscale can't be equal to zero
6655+
if (const auto *II = dyn_cast<IntrinsicInst>(U->getValue()))
6656+
if (II->getIntrinsicID() == Intrinsic::vscale) {
6657+
ConstantRange Disallowed = APInt::getZero(BitWidth);
6658+
ConservativeResult = ConservativeResult.difference(Disallowed);
6659+
}
6660+
66546661
return setRange(U, SignHint, std::move(ConservativeResult));
66556662
}
66566663

llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp

Lines changed: 2 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -2572,9 +2572,7 @@ namespace {
25722572
// only needed when the expression includes some subexpression that is not IV
25732573
// derived.
25742574
//
2575-
// Currently, we only allow division by a nonzero constant here. If this is
2576-
// inadequate, we could easily allow division by SCEVUnknown by using
2577-
// ValueTracking to check isKnownNonZero().
2575+
// Currently, we only allow division by a value provably non-zero here.
25782576
//
25792577
// We cannot generally expand recurrences unless the step dominates the loop
25802578
// header. The expander handles the special case of affine recurrences by
@@ -2592,8 +2590,7 @@ struct SCEVFindUnsafe {
25922590

25932591
bool follow(const SCEV *S) {
25942592
if (const SCEVUDivExpr *D = dyn_cast<SCEVUDivExpr>(S)) {
2595-
const SCEVConstant *SC = dyn_cast<SCEVConstant>(D->getRHS());
2596-
if (!SC || SC->getValue()->isZero()) {
2593+
if (!SE.isKnownNonZero(D->getRHS())) {
25972594
IsUnsafe = true;
25982595
return false;
25992596
}

llvm/test/Transforms/LoopStrengthReduce/RISCV/icmp-zero.ll

Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -129,10 +129,10 @@ define void @icmp_zero_urem_nonzero(i64 %N, i64 %M, ptr %p) {
129129
; CHECK-NEXT: [[UREM:%.*]] = urem i64 [[N:%.*]], [[NONZERO]]
130130
; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
131131
; CHECK: vector.body:
132-
; CHECK-NEXT: [[IV:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[VECTOR_BODY]] ]
132+
; CHECK-NEXT: [[LSR_IV:%.*]] = phi i64 [ [[LSR_IV_NEXT:%.*]], [[VECTOR_BODY]] ], [ [[UREM]], [[ENTRY:%.*]] ]
133133
; CHECK-NEXT: store i64 0, ptr [[P:%.*]], align 8
134-
; CHECK-NEXT: [[IV_NEXT]] = add i64 [[IV]], 2
135-
; CHECK-NEXT: [[DONE:%.*]] = icmp eq i64 [[IV_NEXT]], [[UREM]]
134+
; CHECK-NEXT: [[LSR_IV_NEXT]] = add i64 [[LSR_IV]], -2
135+
; CHECK-NEXT: [[DONE:%.*]] = icmp eq i64 [[LSR_IV_NEXT]], 0
136136
; CHECK-NEXT: br i1 [[DONE]], label [[EXIT:%.*]], label [[VECTOR_BODY]]
137137
; CHECK: exit:
138138
; CHECK-NEXT: ret void
@@ -161,10 +161,10 @@ define void @icmp_zero_urem_vscale(i64 %N, ptr %p) {
161161
; CHECK-NEXT: [[UREM:%.*]] = urem i64 [[N:%.*]], [[VSCALE]]
162162
; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
163163
; CHECK: vector.body:
164-
; CHECK-NEXT: [[IV:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[VECTOR_BODY]] ]
164+
; CHECK-NEXT: [[LSR_IV:%.*]] = phi i64 [ [[LSR_IV_NEXT:%.*]], [[VECTOR_BODY]] ], [ [[UREM]], [[ENTRY:%.*]] ]
165165
; CHECK-NEXT: store i64 0, ptr [[P:%.*]], align 8
166-
; CHECK-NEXT: [[IV_NEXT]] = add i64 [[IV]], 2
167-
; CHECK-NEXT: [[DONE:%.*]] = icmp eq i64 [[IV_NEXT]], [[UREM]]
166+
; CHECK-NEXT: [[LSR_IV_NEXT]] = add i64 [[LSR_IV]], -2
167+
; CHECK-NEXT: [[DONE:%.*]] = icmp eq i64 [[LSR_IV_NEXT]], 0
168168
; CHECK-NEXT: br i1 [[DONE]], label [[EXIT:%.*]], label [[VECTOR_BODY]]
169169
; CHECK: exit:
170170
; CHECK-NEXT: ret void

0 commit comments

Comments
 (0)