SN 0.5.6 -> 0.5.7 -> 0.5.8 #4362

Merged
merged 31 commits into from
Jul 24, 2025

Conversation

durban
Contributor

@durban durban commented Apr 6, 2025

Notes so far:

  • There is something weird going on with exception stacktraces; I haven't been able to minimize it so far. UPDATE: seems to work fine in CI, so possibly something is broken in my local environment.
  • ioAppTestsNative fails on macos-14 and ubuntu-22.04-arm

@durban
Contributor Author

durban commented Apr 7, 2025

Okay, I think I can turn on and off the (probably) segfaults with a scala-native config flag 😀

generateFunctionSourcePositions(true) // ioAppTestsNative fails in CI, and locally I get incorrect stacktraces
generateFunctionSourcePositions(false) // neither of those things happens

I can see that stacktraces are computed in a different way if that flag is true (and this seems to be new in 0.5.7). My best guess is that this different code path sometimes segfaults 🤷‍♂️ (and also causes incorrect stacktraces for me locally). Also, this seems related: scala-native/scala-native/pull/4094
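
For reference, a minimal sketch of how this flag might be toggled in an sbt build. It assumes the Scala Native sbt plugin (0.5.x) is enabled on the project; the `sourcePositions` value is a hypothetical name introduced only for illustration and is not part of this PR.

```scala
// build.sbt fragment (sketch); assumes the sbt-scala-native plugin is enabled.
// `sourcePositions` is a hypothetical helper used only to make the toggle visible.
val sourcePositions = false // `true` reproduces the failures described above

nativeConfig ~= { c =>
  // Enable all source-level debugging options, then override the one flag under test.
  c.withSourceLevelDebuggingConfig(
    _.enableAll.generateFunctionSourcePositions(sourcePositions)
  )
}
```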

@armanbilge
Member

Green CI, nice work!!

@durban
Contributor Author

durban commented Apr 9, 2025

Current status:

  • generateFunctionSourcePositions(true) makes ioAppTestsNative fail on ubuntu-22.04-arm (I think, although lately I haven't seen it) and macos-14 (definitely). I believe both of those are ARM64. With false, everything is green.
  • (true also causes incorrect exception stacktraces for me locally, but I can't minimize this either.)
  • The CI failures are usually "segfault-like" (i.e., SIGSEGV or SIGBUS), although sometimes it's just exit code 1 (I'm not sure what that's about).
  • -fcxx-exceptions / -fno-cxx-exceptions doesn't seem to make a difference either way.
  • If I disable tracing in CE (i.e., hardwire stackTracingMode = "none"), everything is green, even with generateFunctionSourcePositions(true). (Of course, to do this, I had to disable a few tests which need tracing to work. This was just an experiment; I'll obviously roll this back.)

Based on these, I can't definitively prove it, but I suspect this could be a scala-native issue. I can see that stacktraces are calculated in a different way in 0.5.7 if generateFunctionSourcePositions is true. My best guess is that this different code path sometimes segfaults 🤷‍♂️. (Also, this seems related: scala-native/scala-native#4094.)
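
The comment does not say how the clang flags were passed. One way such a flag could be supplied through the Scala Native sbt plugin is sketched below, assuming sbt-scala-native 0.5.x; this is an illustration, not necessarily what was done in this experiment.

```scala
// build.sbt fragment (sketch); assumes the sbt-scala-native plugin is enabled.
nativeConfig ~= { c =>
  // Append one clang flag to whatever compile options are already configured.
  // Swap in "-fcxx-exceptions" to try the opposite setting.
  c.withCompileOptions(c.compileOptions :+ "-fno-cxx-exceptions")
}
```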

@armanbilge armanbilge modified the milestone: v3.8.0 Apr 13, 2025
@durban
Contributor Author

durban commented Jun 9, 2025

I think it's scala-native/scala-native/issues/4366 (more details there).

@djspiewak
Member

Okay so I think what we're going to do is release 3.7 (or at least, an RC) against 0.5.6, just to unblock the ecosystem and move the train forward. Excellent digging btw, @durban.

@durban
Contributor Author

durban commented Jul 19, 2025

Good news, everyone! It seems we can avoid scala-native/scala-native/issues/4366 by turning on the SN optimizer (see details there, tl;dr: alwaysinline isn't).

Apparently the optimizer being on is the default, and, funnily enough, we probably turned it off to make multithreading work more easily. I don't know why anyone would want to turn it off (okay, I obviously know...), but we might just say in the release notes: "don't do that"?

@mergify mergify bot mentioned this pull request Jul 19, 2025
@durban
Contributor Author

durban commented Jul 19, 2025

Also, I can now confirm that my fix for the SN 0.5.7 segfaults (scala-native/scala-native#4301) was indeed correct. (Due to the new segfaults in 0.5.8 that wasn't certain until now. But now I've tried (#4447), and 0.5.7 also segfaults with the optimizer on.)

build.sbt Outdated
@@ -364,7 +364,7 @@ Global / tlCommandAliases ++= Map(
 lazy val nativeTestSettings = Seq(
   nativeConfig ~= { c =>
     c.withSourceLevelDebuggingConfig(_.enableAll.generateFunctionSourcePositions(false)) // `true` causes segfaults(?)
-      .withOptimize(false) // disable Scala Native optimizer
+      .withOptimize(true)
Contributor

true is the default, so you can remove the line altogether.

Suggested change (remove this line):
-      .withOptimize(true)

Contributor Author

As long as stuff is broken with false, I'd prefer to be explicit. Also, to remove that line would mean removing the explanatory comment (you've reviewed an outdated commit).

@djspiewak
Copy link
Member

Somehow I missed the fact that turning the optimizer on fixes this. Great news! @durban should we unwip this and merge the upgrade to 0.5.8?

@durban
Copy link
Contributor Author

durban commented Jul 23, 2025

Yeah, I'm gonna merge 3.x into this branch, to be sure. If CI is green, I think it's good to go.

@durban durban marked this pull request as ready for review July 23, 2025 20:17
@durban durban changed the title WIP: SN 0.5.6 -> 0.5.7 -> 0.5.8 SN 0.5.6 -> 0.5.7 -> 0.5.8 Jul 23, 2025
djspiewak
djspiewak previously approved these changes Jul 23, 2025
Member

@djspiewak djspiewak left a comment

🎉 All that work for 6 lines lol

build.sbt Outdated
c.withSourceLevelDebuggingConfig(_.enableAll) // enable generation of debug information
.withOptimize(false) // disable Scala Native optimizer
nativeConfig ~= { c =>
c.withSourceLevelDebuggingConfig(_.enableAll.generateFunctionSourcePositions(false)) // `true` causes segfaults(?)
Member

should this be recommended somewhere in a "scala native" docs page?

Contributor Author

Uhh... thanks, that's an outdated comment. We can use true in 0.5.8, I've fixed it. I'm gonna remove that false and the comment...

Member

hell yeee

Member

oops just saw the failing test

djspiewak
djspiewak previously approved these changes Jul 23, 2025
@durban
Contributor Author

durban commented Jul 23, 2025

There is a failure I don't remember seeing before 😠
https://github.com/typelevel/cats-effect/actions/runs/16481182824/job/46595663830#step:22:1633

@djspiewak
Member

Annoying. We'll have to be sure to experiment downstream (fs2 and http4s in particular) with whether this same configuration is required.

@durban
Contributor Author

durban commented Jul 24, 2025

To summarize: I thought I had fixed things (in SN) to work with generateFunctionSourcePositions(true), and I've definitely fixed something (scala-native/scala-native#4301). However, now there is a "new" error with that option. This is not a segfault, but a ClassCastException.

Interestingly, with generateFunctionSourcePositions(false), the error disappears. So the situation is that we need both withOptimize(true) and generateFunctionSourcePositions(false) to have a green CI.
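
As a sketch, that combination might look like this in build.sbt. The shape follows the nativeTestSettings fragment shown in the review diff of this PR and assumes the sbt-scala-native plugin (0.5.8); it is not necessarily the exact merged code.

```scala
// build.sbt fragment (sketch); assumes the sbt-scala-native plugin is enabled.
lazy val nativeTestSettings = Seq(
  nativeConfig ~= { c =>
    c.withSourceLevelDebuggingConfig(
      // Keep debug info, but skip function source positions:
      // `true` triggers the ClassCastException described in this comment.
      _.enableAll.generateFunctionSourcePositions(false)
    )
      // Optimizer on (the default); kept explicit because `false` is what hits
      // scala-native/scala-native issue 4366, as discussed earlier in this thread.
      .withOptimize(true)
  }
)
```

Downstream builds would apply these settings to their own Native projects; whether they need the same configuration is still an open question in this thread.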

We might document that as a workaround, and leave it at that. But I'm bothered by the details of this "new" error:

Error: Exception in thread "main" java.lang.ClassCastException: scala.scalanative.runtime.dwarf.DWARF$SubprogramDIE cannot be cast to scala.scalanative.runtime.Array
	at scala.scalanative.runtime.package$.throwClassCast(package.scala:189)
	at scala.scalanative.unsafe.package$UnsafeRichArray$.at$extension(package.scala:150)
	at java.lang.String.hashCode(String.scala:392)
	at scala.runtime.Statics$.anyHash(Statics.scala:76)
	at scala.runtime.Statics.anyHash(Statics.scala:76)
	at scala.collection.mutable.HashMap.scala$collection$mutable$HashMap$$computeHash(HashMap.scala:79)
	at scala.collection.mutable.HashMap.findNode(HashMap.scala:85)
	at scala.collection.mutable.HashMap.get(HashMap.scala:434)
	at catseffect.examples.NativeRunner$.main(examplesplatform.scala:61)
	at catseffect.examples.NativeRunner.main(examplesplatform.scala:61)
	at <none>.main(unknown:2)
	at <none>.(Unknown Source)
	at <none>.__libc_start_main(Unknown Source)
	at <none>._start(Unknown Source)

Two things:

  1. I really don't see how this is related to generateFunctionSourcePositions. Okay, there are no DWARF$SubprogramDIE instances created if that is false, but other objects could be, so:
  2. This very much looks like either a GC bug, or someone failing to keep a GC-discoverable reference to something they're still using. I've tried to look into it, and couldn't find any evidence of the latter. So it's concerning. And the generateFunctionSourcePositions correlation could be a complete coincidence, and not an actual workaround. I'm really not sure.

(In any case, I really don't think this is CE's fault.)

@djspiewak djspiewak merged commit 3d929b8 into typelevel:series/3.x Jul 24, 2025
71 checks passed

6 participants