Skip to content

Unbuffered output speed regression #18811

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
RivenSkaye opened this issue Apr 17, 2025 · 6 comments
Open

Unbuffered output speed regression #18811

RivenSkaye opened this issue Apr 17, 2025 · 6 comments
Labels
Issue-Bug It either shouldn't be doing this or needs an investigation. Needs-Attention The core contributors need to come back around and look at this ASAP. Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting

Comments

@RivenSkaye
Copy link

Windows Terminal version

1.22.10731.0

Windows build number

10.0.19045.5737

Other Software

Bat (the cat(1) clone with wings)
Any GNU util that doesn't ignore the -u flag (cat(1) ignores nowadays, bat is always unbuffered)
... Don't have more easily accessible examples on hand

Steps to reproduce

Run any command that produces half a screen or more of unbuffered output, e.g. tab autocompletions in bash (git, msys2, but not WSL)

Expected Behavior

(reasonably) fast printing of output

Actual Behavior

through WSL, exactly the thing I expect. When unbuffered, I can see every line being printed separately. When paging is involved, I can see line by line output happening. When coloration is involved, scrolling lines in paged output becomes extremely slow.

On the bright side, this does help pinpoint what utils have room for improvement. On the not-so-bright side I'm sometimes actively waiting for output to finish. This is something that's changed in the last couple of updates (I'm on the manual track and am not very frequent with checking, sorry)

@RivenSkaye RivenSkaye added Issue-Bug It either shouldn't be doing this or needs an investigation. Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting labels Apr 17, 2025
@lhecker
Copy link
Member

lhecker commented Apr 17, 2025

I believe I know what causes the performance regression, if any. Previously, we would incorrectly turn unbuffered IO into buffered IO when it didn't cross the viewport to cause scrolling. We stopped doing so in 1.22.

With the correctness aside, I'm not entirely sure I see a performance issue specific to Windows Terminal per se. Obviously, there's an increased overhead if buffering is disabled since there are more syscalls involved, so it cannot possibly be as fast as buffered IO. Since I'm not super certain what a good baseline is, I compared it with foot and kitty under Linux on the same hardware and running bat with no buffering was roughly as fast with them as it was with Windows Terminal via WSL.

As such, could you provide a specific scenario where the performance issue is noticeable and reproducible for me?

@lhecker lhecker added the Needs-Author-Feedback The original author of the issue/PR needs to come back and respond to something label Apr 17, 2025
@RivenSkaye
Copy link
Author

RivenSkaye commented Apr 17, 2025

I specifically mentioned "but not through WSL" (perhaps the wrapper buffers? It doesn't have the issue though)
In my case I've used MSYS2/MinGW64 using a starting setup similar to the commandline this vscode setup produces, including the MSYS env var being set beforehand using the environment JSON key available since somewhere in v1.18.

If you'd like me to produce a windows-gnu build of bat to run and/or a terminal profile, I'll gladly do so tomorrow.

Edit: I haven't tried an MSVC build yet. I'll also do that and if that makes a notable difference I'll move the issue to the MSYS2 repos. I assumed it was WT since that was the only thing that changed before the performance drop in unbuffered output.
Buffered output (e.g. less or bat --paging=always) isn't affected. And the less build is also windows-gnu as per the MinGW PKGBUILD file

@microsoft-github-policy-service microsoft-github-policy-service bot added Needs-Attention The core contributors need to come back around and look at this ASAP. and removed Needs-Author-Feedback The original author of the issue/PR needs to come back and respond to something labels Apr 17, 2025
@lhecker
Copy link
Member

lhecker commented Apr 17, 2025

It would definitely not be unexpected if this turned out to be a Windows Terminal "issue". As I mentioned above, we previously would not treat the unbuffered IO as buffered and only flush it out after ~16ms. It's now unbuffered and the processing is synchronous, as one would expect from a terminal. This may then result in worse performance for applications which do many, small, unbuffered writes.

If you're curious, the PR which made that change is this one: #17510
It's a very large PR because it is a complete rewrite of our Console API & VT layer.

I've tried the latest official bat-v0.25.0-x86_64-pc-windows-msvc.zip and it runs in roughly 100ms for a random ~1700 lines Rust file I have. Disabling coloring and writing to NUL (/dev/null) takes 10ms. Enabling coloring again takes ~55ms. The IO overhead with that version of bat on my file is thus roughly 100% (would be ~3ms if it buffered the output). But this is in line with my tests under Linux (same duration).
I did some tests and found that the MSVC version emits 1 write per 1 line of output. ~1700 writes aren't great, but they're also not too terrible. My assumption is that the UNIX/Linux version should work the same way.

I specifically mentioned "but not through WSL"

I apologize for that. 😣

If you'd like me to produce a windows-gnu build of bat to run [...]
Edit: I haven't tried an MSVC build yet. I'll also do that and if that makes a notable difference I'll move the issue to the MSYS2 repos. [...]

You bring up a good point. We maintain all console APIs, including any IO via Windows console handles. Sometimes we also try to solve MSYS2 bugs, but in this case, there's an official Windows version which presumably uses the same VT output as the UNIX version (compiled via MSYS2). So, if the issue doesn't reproduce with the MSVC version and also doesn't reproduce under WSL, I think chances are high that the MSYS2 maintainers would indeed be better equipped to debug this issue.

Otherwise, please do feel free of course to share your MSYS2 build of course!

@RivenSkaye
Copy link
Author

RivenSkaye commented Apr 22, 2025

Okay ran some tests, using my uhhh... More modern than MIPS32 system at the office, got some more detail on when and where things hit slowdowns. Peak machine perf means the differences are also more easily noticeable.

  1. windows-gnu and MSVC builds make no real difference
  2. All testing was done using copies of bat in the current working directory, using time as provided in MSYS2's /usr/bin

As for the results

  • --paging=never --color=never makes no notable difference to before the update
  • --paging=never --color=always is affected, but remarkably less than paged
  • --paging=always --color=always lets me watch output getting printed line by line¹
  • -p makes all of them faster, which I assume has to do with font rendering due to ligatures in all directions. Paged is still very slow though

As for the system:
Image

¹ This is both when forcing bat to be its own pager, as well as when using less. The full-term redrawing part of it is definitely getting hit harder I think.

Both bat builds added here for completeness:
bat-both.zip


Update: it seems colorization specifically is much slower

time ls --color=always
# snip: 3 lines, 11 columns, 32 files/dirs/symlinks total
# exact stats: 14 files, 13 dirs, 5 symlinks

real    0m1.619s
user    0m0.030s
sys     0m1.062s

❯ time ls --color=never
# snip the same

real    0m0.729s
user    0m0.078s
sys     0m0.436s

@lhecker
Copy link
Member

lhecker commented Apr 24, 2025

I'm having trouble reproducing these timings... By the way, does this issue only occur on your machines in your office? Another difference between 1.21 and 1.22 is that we now use overlapped IO and there's a possibility that your Anti-Virus is scanning the interactions. I only use Windows Defender, so I'm not sure how it behaves under other AV systems...

@RivenSkaye
Copy link
Author

That is a very good question - I'll try that on my home machine over the weekend and get back to you. Unfortunately I don't control the AV running here, so it just might be. I'd be very sorry sending you on this goose chase if that were the case, though it'd definitely lead to fast resolution

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Issue-Bug It either shouldn't be doing this or needs an investigation. Needs-Attention The core contributors need to come back around and look at this ASAP. Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting
Projects
None yet
Development

No branches or pull requests

2 participants