Skip to content

tensor[tile] when tile size is 1 returns a 1D tensor, instead of a scalar #275

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

joydddd
Copy link
Contributor

@joydddd joydddd commented Jul 11, 2025

@joydddd joydddd force-pushed the joydddd/stack/15 branch from 41f371d to bc6ea11 Compare July 11, 2025 01:08
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label Jul 11, 2025
…alar

stack-info: PR: #275, branch: joydddd/stack/15
@joydddd joydddd force-pushed the joydddd/stack/15 branch from bc6ea11 to 5348847 Compare July 11, 2025 01:16
@joydddd joydddd requested review from jansel, drisspg and yf225 July 11, 2025 02:09
Copy link
Contributor

@jansel jansel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a test for the fix?

Comment on lines -381 to +382
if fake_value.size(i) != 1:
stride = state.device_function.tensor_stride(fake_value, i).name
index_expr.append(f"{idx} * {stride}")
stride = state.device_function.tensor_stride(fake_value, i).name
index_expr.append(f"{idx} * {stride}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason for this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

something like this:
N = x.size(0)
for tile in hl.tile(N):
x_tile = x[tile]

When block_size=1, the if statement evaluates to be False, so the indexing ignore the N dimension, and generate
x_tile = tl.load(tile + tl.zeros([1], ...))

I'll add a test case for this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this checking the tensor size not the block size?

if block_size == 1:
extra_body.append(
statement_from_string(
f"{index_var} = {offset_var} + tl.zeros([1], {dtype})"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this do the same thing as arange? I'd expect we wuld need shape=[] or even just offset_var directly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this does the same thing. We don't need to make this change to fix tile indexing when block_size=1.
However, why does grid_codegen handle block_size == 1 differently with tl.zeros instead tl.arange?

@jansel
Copy link
Contributor

jansel commented Jul 12, 2025

The broadcasting behavior for size==1 tensors is intentional. We match numpy/pytorch broadcasting rules: https://numpy.org/devdocs/user/basics.broadcasting.html

I added some more tests for this here, which we should make sure this doesn't break: #285

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Meta Open Source bot.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants