Guard out-of-range shift counts in integer opcodes#559

Open
uwezkhan wants to merge 1 commit into
pydata:masterfrom
uwezkhan:shift-count-guard

@uwezkhan

The integer shift opcodes hand the per-element shift count straight to the C++ `<<` and `>>` operators. When `a<<b` or `a>>b` runs with a count that is negative or at least the operand width (32 for int, 64 for long long), the shift is undefined behavior. On arm64, `evaluate("a<<b")` with b=100 returns 80 because the hardware masks the count, while NumPy returns 0, and a UBSAN build traps right at the shift. All four combinations (int and long long, left and right) are affected.
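
To illustrate where a value like 80 can come from, here is a minimal sketch of a shift that keeps only the low 5 bits of the count, which is how a 32-bit variable shift behaves on arm64. The helper name `masked_lshift32` is hypothetical, and the operand value 5 in the usage note is an assumption (the description above does not state `a`); it is chosen because it reproduces the reported result.

```cpp
#include <cstdint>

// Hypothetical model, not NumExpr code: a 32-bit left shift that uses only
// the low 5 bits of the count, as the arm64 variable shift does.
constexpr std::uint32_t masked_lshift32(std::uint32_t a, std::uint32_t count) {
    return a << (count & 31u);  // 100 & 31 == 4, so the shift is by 4
}
```

With the assumed operand 5, `masked_lshift32(5, 100)` computes `5 << 4`, which is 80, whereas NumPy returns 0 for the same inputs.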

After the change, an out-of-range left shift yields 0, and an out-of-range right shift clamps the count to width-1 so the sign bit fills, which matches the result NumPy gives. The guard sits in the opcode next to the existing div/mod guards because the count is only known per element at run time, so a caller-side check could not cover array shift counts. The tradeoff is one extra unsigned compare per element; in-range counts produce the same values as before, and the branch is well predicted.
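
The guard described above could look roughly like the following. This is a sketch under stated assumptions, not the code in the PR: the names `guarded_lshift` and `guarded_rshift` are hypothetical, and the real opcodes in interp_body.cpp may be written quite differently. It assumes a two's-complement target where `>>` on a negative value is an arithmetic shift (guaranteed only from C++20, but true of mainstream compilers).

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical helpers illustrating the guard; not the PR's actual code.

// Left shift: any count outside [0, width) yields 0.
template <typename T>
constexpr T guarded_lshift(T a, T b) {
    using U = typename std::make_unsigned<T>::type;
    // Casting the count to unsigned turns negative counts into huge values,
    // so a single unsigned compare rejects both negative and too-large counts.
    return static_cast<U>(b) >= static_cast<U>(std::numeric_limits<U>::digits)
               ? T(0)
               // Shift in the unsigned domain so a negative `a` is well defined.
               : static_cast<T>(static_cast<U>(a) << b);
}

// Right shift: any count outside [0, width) is clamped to width-1,
// so the result is 0 for non-negative `a` and -1 for negative `a`.
template <typename T>
constexpr T guarded_rshift(T a, T b) {
    using U = typename std::make_unsigned<T>::type;
    return static_cast<U>(b) >= static_cast<U>(std::numeric_limits<U>::digits)
               ? static_cast<T>(a >> (std::numeric_limits<U>::digits - 1))
               : static_cast<T>(a >> b);
}
```

For example, `guarded_lshift<std::int32_t>(5, 100)` gives 0 and `guarded_rshift<std::int32_t>(-5, 100)` gives -1, while in-range counts such as `guarded_lshift<std::int32_t>(5, 4)` still give 80.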

@uwezkhan

Author

gentle ping

Copilot AI left a comment

Contributor

Pull request overview

This PR makes integer shift opcodes deterministic and NumPy-compatible when the per-element shift count is negative or exceeds the operand bit width, avoiding undefined behavior in the C++ interpreter loop.

Changes:

  • Guard OP_LSHIFT_* to return 0 when the shift count is out-of-range.
  • Guard OP_RSHIFT_* to clamp out-of-range counts to width-1 to preserve sign-fill behavior.
  • Add a regression test covering out-of-range shift counts for i4 and i8.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| numexpr/interp_body.cpp | Adds per-element range checks for 32-bit and 64-bit integer shift opcodes to avoid UB and match NumPy semantics. |
| numexpr/tests/test_numexpr.py | Adds a regression test asserting NumExpr matches NumPy for out-of-range shift counts. |

Comment on lines +464 to +469
for dtype in ('i4', 'i8'):
    x = array([5, -5, 0], dtype=dtype)
    for count in (-1, 64, 200):
        y = array([count] * len(x), dtype=dtype)
        assert_array_equal(evaluate("x << y"), x << y)
        assert_array_equal(evaluate("x >> y"), x >> y)