chore: migrate to DF 49.0.0 #2040

Draft: wants to merge 5 commits into base: main

comphead
Contributor

comphead commented Jul 23, 2025

Which issue does this PR close?

A followup on #1997
Closes #1993 .

Rationale for this change

What changes are included in this PR?

How are these changes tested?

@codecov-commenter

codecov-commenter commented Jul 23, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 42.85%. Comparing base (f09f8af) to head (d614dca).
Report is 331 commits behind head on main.

Additional details and impacted files
@@              Coverage Diff              @@
##               main    #2040       +/-   ##
=============================================
- Coverage     56.12%   42.85%   -13.27%     
- Complexity      976     1034       +58     
=============================================
  Files           119      135       +16     
  Lines         11743    13136     +1393     
  Branches       2251     2390      +139     
=============================================
- Hits           6591     5630      -961     
- Misses         4012     6460     +2448     
+ Partials       1140     1046       -94     

@mbutrovich
Contributor

The upstream PR apache/datafusion#16290 changed the signatures of several functions so that they return Utf8View. I can start bringing in changes to support Utf8View from my branch (https://github.com/mbutrovich/datafusion-comet/tree/german_style_strings), but this has a huge blast radius for Comet. I'll investigate further.

@comphead
Contributor Author

Fixed the windows and crypto tests. The ignore nulls correctness issue is weird; I've localized it with a 3-row test:

  test("first/last with ignore null") {
    val data = Range(0, 3).flatMap(n => Seq((n, 1), (n, 2))).toDF("a", "b")
    withTempDir { dir =>
      val filename = s"${dir.getAbsolutePath}/first_last_ignore_null.parquet"
      data.repartition(2).write.parquet(filename)
      withSQLConf(CometConf.COMET_BATCH_SIZE.key -> "1") {
        spark.read.parquet(filename).createOrReplaceTempView("t1")
        for (expr <- Seq("first", "last")) {
          // deterministic query that should return one non-null value per group
          val df = spark.sql(
            s"SELECT a, $expr(IF(b==1,null,b)) IGNORE NULLS FROM t1 GROUP BY a ORDER BY a")
          checkSparkAnswerAndOperator(df)
        }
      }
    }
  }

It seems to have something to do with the batch size and the number of input files.
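
For reference, here is a minimal plain-Scala sketch of what the query above should return, independent of how rows are split into batches. It is only a model under stated assumptions: `None` stands in for SQL NULL, and `FirstLastIgnoreNulls`, `mergeFirst`, `mergeLast`, and `aggregate` are hypothetical names for illustration, not Comet or DataFusion APIs. It does not model `repartition(2)` or multiple input files, only batching.

```scala
// Plain-Scala model of the query; no Spark or Comet involved.
object FirstLastIgnoreNulls {
  // Same rows as the test: (a, b) for a in 0..2 and b in {1, 2}.
  val rows: Seq[(Int, Int)] =
    Range(0, 3).flatMap(n => Seq((n, 1), (n, 2)))

  // IF(b == 1, null, b), with None standing in for SQL NULL.
  val projected: Seq[(Int, Option[Int])] =
    rows.map { case (a, b) => (a, if (b == 1) None else Some(b)) }

  // Merging two partial states while ignoring nulls:
  // `first` keeps the earliest non-null value, `last` the latest.
  def mergeFirst(l: Option[Int], r: Option[Int]): Option[Int] = l.orElse(r)
  def mergeLast(l: Option[Int], r: Option[Int]): Option[Int] = r.orElse(l)

  // Hypothetical helper: split the input into batches, aggregate each
  // batch per group, then merge the per-batch states in input order.
  def aggregate(
      batchSize: Int,
      merge: (Option[Int], Option[Int]) => Option[Int]): Seq[(Int, Option[Int])] = {
    val partials: Seq[Map[Int, Option[Int]]] =
      projected.grouped(batchSize).toSeq.map { batch =>
        batch.groupBy(_._1).map { case (a, vs) =>
          a -> vs.map(_._2).reduce(merge)
        }
      }
    val merged = partials.foldLeft(Map.empty[Int, Option[Int]]) { (acc, part) =>
      part.foldLeft(acc) { case (m, (a, v)) =>
        m.updated(a, m.get(a).map(prev => merge(prev, v)).getOrElse(v))
      }
    }
    merged.toSeq.sortBy(_._1)
  }

  def main(args: Array[String]): Unit = {
    for (batchSize <- Seq(1, 2, 6)) {
      val first = aggregate(batchSize, mergeFirst)
      val last = aggregate(batchSize, mergeLast)
      println(s"batch=$batchSize first=$first last=$last")
    }
  }
}
```

Because each group has exactly one non-null value (b = 2), both `first` and `last` should yield 2 for every `a`, whatever the batch size. A null partial state from a one-row batch must not overwrite or shadow a non-null one during the merge.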

Successfully merging this pull request may close these issues.

Upgrade to DataFusion 49.0.0
3 participants