Skip to content

Conversation

@mesejo
Copy link
Contributor

@mesejo mesejo commented Oct 16, 2025

Description of changes

N/A

Issues closed

@github-actions github-actions bot added tests Issues or PRs related to tests sql Backends that generate SQL labels Oct 16, 2025
@mesejo mesejo force-pushed the fix/order_by_first_and_last branch 2 times, most recently from cf7287a to 6713a4c Compare October 17, 2025 10:12
@mesejo mesejo marked this pull request as ready for review October 17, 2025 10:48
@cpcloud
Copy link
Member

cpcloud commented Oct 17, 2025

@mesejo Is there a test you can add to ensure this doesn't regress?

@mesejo mesejo force-pushed the fix/order_by_first_and_last branch from c015111 to 2fd80fc Compare October 17, 2025 16:25
@mesejo mesejo force-pushed the fix/order_by_first_and_last branch from 2fd80fc to d357c42 Compare October 17, 2025 17:30
@mesejo
Copy link
Contributor Author

mesejo commented Oct 17, 2025

@cpcloud, Thanks for the review, just added a test

@NickCrews
Copy link
Contributor

@mesejo I just updated the test, let me know if that is acceptable to you! I think using the handcrafted data will make it more clear what we are testing. Also added tests for descending order bys

@NickCrews
Copy link
Contributor

LGTM

@mesejo
Copy link
Contributor Author

mesejo commented Oct 17, 2025

@mesejo I just updated the test, let me know if that is acceptable to you! I think using the handcrafted data will make it more clear what we are testing. Also added tests for descending order bys

That's fine with me. Thanks!

@NickCrews
Copy link
Contributor

NickCrews commented Oct 17, 2025

@mesejo any idea why druid is failing on my version? On your version, the commit right before, it passed.

@NickCrews
Copy link
Contributor

I'm guessing because in your version you only operated over a single column, where in mine we are ordering by a different column than we are aggregating over? Probably just mark as xfail and call it a day?

@mesejo
Copy link
Contributor Author

mesejo commented Oct 17, 2025

@NickCrews, I think it is because of the memtable. The following works for the Druid backend:

@pytest.mark.parametrize(
    "method,expected",
    [
        pytest.param(lambda col: col.first(order_by="bigint_col"), 0, id="first_asc"),
        pytest.param(lambda col: col.last(order_by="bigint_col"), 9, id="last_asc"),
        pytest.param(
            lambda col: col.first(order_by=ibis._.bigint_col.desc()), 9, id="first_desc"
        ),
        pytest.param(
            lambda col: col.last(order_by=ibis._.bigint_col.desc()), 0, id="last_desc"
        ),
    ],
)
def test_first_last_ordered_in_mutate(alltypes, con, method, expected):
    # originally reported in https://github.com/ibis-project/ibis/issues/11656
    t = alltypes
    expr = t.mutate(new=method(t.int_col))
    actual = con.to_pyarrow(expr.new).to_pylist()
    assert actual == [expected] * len(actual)

@NickCrews
Copy link
Contributor

huh, good detective work, thanks for cleaning up my mess. That is weird, since that seems like such basic usage of memtable, and memtable doesn't appear to be broken otherwise on druid.

I don't love how the test data is so far away from the test (like I have no idea if 0 and 9 are actually the right results without looking up the test data).

But I'm willing to say good enough to get this fix in, but @cpcloud may feel differently.

@mesejo mesejo force-pushed the fix/order_by_first_and_last branch from 3bff124 to f57ce17 Compare October 18, 2025 04:54
@mesejo
Copy link
Contributor Author

mesejo commented Oct 18, 2025

@NickCrews, I changed the test to:

t = alltypes.select(
    a=ibis._.tinyint_col, val=ibis._.int_col, ob=ibis._.bigint_col
).filter(
    ((ibis._.val == 4) & (ibis._.ob == 40))
    | ((ibis._.val == 5) & (ibis._.ob == 50))
)
expr = t.mutate(new=method(t.val)).limit(10)
actual = con.to_pyarrow(expr.new).to_pylist()
assert actual == [expected] * 10 

This form makes it easy to reason about the shape and content of the data.

@NickCrews
Copy link
Contributor

That is better, thanks. still not ideal as handwritten data, but if we can't have that, then this really helps

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

sql Backends that generate SQL tests Issues or PRs related to tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug: incorrect results when using last with order_by

3 participants