@egmontkob
Contributor

@egmontkob egmontkob commented Oct 20, 2025

Proposed changes

More robust reading of pwd, namely:

  • support trailing newlines
  • don't send long lines that might clog the kernel's cooked mode buffer
  • make sure to read the entire pwd if the writer is slow
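The third point can be illustrated with a minimal C sketch. It is not mc's actual code: the helper name `read_whole_path` and the end-of-path marker (a NUL byte, picked here because a path cannot contain one) are assumptions made for the example. The idea is simply to keep calling `read()` until the marker arrives instead of trusting the first chunk.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read from fd into buf until a NUL terminator shows up.
   Returns the number of path bytes before the NUL, or -1 on error, on EOF
   before the terminator, or if the buffer fills up without one. */
static ssize_t
read_whole_path (int fd, char *buf, size_t size)
{
    size_t used = 0;

    while (used < size)
    {
        ssize_t n = read (fd, buf + used, size - used);
        char *nul;

        if (n < 0)
        {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        if (n == 0)
            return -1;          /* writer closed the pipe too early */

        /* only the freshly read bytes can contain the terminator */
        nul = memchr (buf + used, '\0', (size_t) n);
        if (nul != NULL)
            return (ssize_t) (nul - buf);

        used += (size_t) n;     /* slow writer: go around and read more */
    }

    return -1;                  /* buffer full, no terminator seen */
}
```

Because the loop stops on the marker rather than on a newline, trailing newlines in the path survive as ordinary data.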

Checklist

👉 Our coding style can be found here: https://midnight-commander.org/coding-style/ 👈

  • I have referenced the issue(s) resolved by this PR (if any)
  • I have signed-off my contribution with git commit --amend -s
  • Lint and unit tests pass locally with my changes (make indent && make check)
  • I have added tests that prove my fix is effective or that my feature works
  • I have added the necessary documentation (if appropriate)

@github-actions github-actions bot added the "needs triage" (Needs triage by maintainers) and "prio: medium" (Has the potential to affect progress) labels Oct 20, 2025
@github-actions github-actions bot added this to the Future Releases milestone Oct 20, 2025
@zyv zyv added the "area: core" (Issues not related to a specific subsystem) label and removed the "needs triage" (Needs triage by maintainers) label Oct 20, 2025
@zyv zyv modified the milestones: Future Releases, 4.8.34 Oct 20, 2025
@mc-worker mc-worker requested review from mc-worker and removed request for aborodin October 20, 2025 16:38
@egmontkob egmontkob force-pushed the 2325_4480_pwd_newline branch from ceeb70e to d6b260d Compare October 20, 2025 17:26
@egmontkob
Contributor Author

Oops, I've accidentally pushed the branch into the main mc repo; I didn't mean that. Sorry for that! You can remove that branch if you wish.

@egmontkob
Contributor Author

Never mind, I've deleted it. Sorry for the noise.

@egmontkob
Contributor Author

There's a regression with tcsh:

Previously it could enter directories with a non-alphanumeric UTF-8 symbol (e.g. a heart) in their name; now it cannot. (It can still enter directories whose names contain alphanumeric UTF-8 characters.)

Let me play around a little bit with tcsh to see if I can fix this.

Let's hold off this PR for now.

Contributor

@ossilator ossilator left a comment

3rd commit msg: typo in 'platforms'

@egmontkob egmontkob force-pushed the 2325_4480_pwd_newline branch 2 times, most recently from 57e9ac6 to d267252 Compare October 20, 2025 20:36
@egmontkob
Contributor Author

Back to tcsh:

I think I can do either of these two things:

  • Use the $'...' string constant notation. These strings aren't byte-safe: they're forced to the locale's encoding (in practice UTF-8), so in a UTF-8 environment directories with invalid UTF-8 in their name won't work. But the method handles newline characters just fine.

  • Use `printf ...` command substitution. This method breaks newlines: in tcsh, not just the trailing ones but any internal newlines too. On the other hand it's binary-safe: we can enter any directory, even one with invalid UTF-8 in its name. Note though that it doesn't work in the reverse direction: if you hide the panels and cd to a directory with an invalid UTF-8 name, the subshell reporting it back to mc using echo $cwd:q will also mangle it and mc won't be able to cd there (already buggy in current mc). A workaround for that could be to invoke the external pwd -L utility.

So it's either-or: invalid UTF-8 but no newlines, or newlines but not invalid UTF-8.
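For readers unfamiliar with the first option, here is a minimal C sketch of wrapping a path in a $'...' constant. It is not the code from this PR: the function name `ansi_c_quote` is made up, and the escape set (`\n`, `\\`, `\'`) is the one bash documents for $'...'. Whether tcsh accepts exactly the same escapes is not verified here.

```c
#include <stddef.h>

/* Hypothetical helper: write src as a $'...' constant into dst.
   Returns the length of the result, or 0 if dst is too small. */
static size_t
ansi_c_quote (const char *src, char *dst, size_t dst_size)
{
    size_t out = 0;

    if (dst_size < 4)           /* need room for $'' plus the NUL */
        return 0;

    dst[out++] = '$';
    dst[out++] = '\'';

    for (; *src != '\0'; src++)
    {
        /* keep room for a 2-byte escape, the closing quote and the NUL */
        if (out + 4 > dst_size)
            return 0;

        switch (*src)
        {
        case '\n':              /* newline becomes the two characters \n */
            dst[out++] = '\\';
            dst[out++] = 'n';
            break;
        case '\\':
        case '\'':              /* backslash and quote get a backslash prefix */
            dst[out++] = '\\';
            dst[out++] = *src;
            break;
        default:                /* everything else is copied unchanged */
            dst[out++] = *src;
            break;
        }
    }

    dst[out++] = '\'';
    dst[out] = '\0';
    return out;
}
```

Since the newline travels as the two characters `\n`, it never reaches the shell as a real line break. The catch described above still applies: the shell interprets the resulting string in its locale, so this sketch does nothing for invalid UTF-8.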

And no, I'm not going to implement both and pick at runtime so that we only fail if a path contains both :)

Which one do you guys vote for?

@egmontkob
Contributor Author

tcsh: I was wrong, command substitution isn't binary safe either.

So, without terrible hacks, I can get newline working but not invalid UTF-8.


To get invalid UTF-8 working, I think this would do it:

  • save the value of LC_ALL
  • setenv LC_ALL C
  • cd target_directory
  • restore LC_ALL (or most likely: the lack thereof)
  • set cwd=target_directory # to fix the accents shown in the prompt

and when a command completes and tcsh sends its working directory to mc, invoke the external pwd -L.

I'm not gonna do this, it's just not worth it.

@zyv
Member

zyv commented Oct 21, 2025

> Which one do you guys vote for?

My thinking is that if I had to choose, I'd take newlines.

@egmontkob egmontkob force-pushed the 2325_4480_pwd_newline branch from d267252 to eefcf8c Compare October 21, 2025 07:39
@egmontkob
Contributor Author

Yup I'm going with newlines.

New commit pushed to fix regression with tcsh.

How confident are we that placing unquoted bytes in the 128..255 range on the command line is safe in every shell, i.e. that they don't have any special meaning to the shell?

@egmontkob
Contributor Author

New ideas for fully fixing tcsh filed in #4851.

But let's not do everything in this PR, let's leave that for another day.

Pushed an unchanged version, just rebased.

Please review.

Member

@zyv zyv left a comment

As far as I'm concerned, I'm satisfied with the current state. I have to admit that I have neither looked at the code too closely nor tested it on my machines, but I really like the new structure and the documentation around it.

Just as a general comment, if I leave line comments with just one statement (and not more than one sentence), it looks better to me to omit the final full stop. Which is of course not pursued consistently throughout the whole codebase due to age.

@zyv zyv changed the title from "2325 4480 pwd newline" to "Tickets #2325/#4480: cd into directories with long names and/or embedded newlines" Nov 3, 2025
@zyv zyv changed the title from "Tickets #2325/#4480: cd into directories with long names and/or embedded newlines" to "Tickets #2325/#4480: cd into directories with long names and/or embedded \n" Nov 3, 2025
… the subshell

If the subshell writes the working directory slowly, previously we could
read its beginning and stop there.

Signed-off-by: Egmont Koblinger <egmont@gmail.com>
This piece of code was never live in mc. It would work around a BusyBox bug
that was fixed in 2012.

Signed-off-by: Egmont Koblinger <egmont@gmail.com>
…ng a directory with special characters

Handle trailing '\n' character in the directory name.

Make sure to construct the cd command in physical lines no longer than
250 bytes so that we don't hit the small limit of the kernel's cooked mode
tty buffer size on some platforms.
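
A rough C sketch of the splitting step follows. It is an illustration under assumptions, not the code from this commit: it assumes the command text is already backslash-escaped, the helper name `next_piece_len` is made up, and the `limit` passed in must already leave room for whatever fixed text each physical line carries. How the pieces are joined again is shell-specific and left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: length of the next piece of the escaped text s
   (len bytes remain) that fits into limit bytes. A backslash is never
   separated from the byte it escapes. */
static size_t
next_piece_len (const char *s, size_t len, size_t limit)
{
    size_t i = 0;

    assert (limit >= 2);        /* otherwise an escape pair could never fit */

    while (i < len)
    {
        /* an escape pair counts as one indivisible 2-byte token */
        size_t token = (s[i] == '\\' && i + 1 < len) ? 2 : 1;

        if (i + token > limit)
            break;
        i += token;
    }

    return i;
}
```

A caller would invoke this repeatedly, advancing by the returned length each time, and send each piece on its own physical line.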

tcsh still has problems entering directories with special characters
(including invalid UTF-8) in their name. Other shells are now believed
to handle any directory name properly.

Signed-off-by: Egmont Koblinger <egmont@gmail.com>
…ibyte UTF-8

Don't escape safe shell characters commonly used in paths, such as
'/', '.', '-' and '_'.

Don't escape multibyte UTF-8 characters. Escaping each byte separately
in string assignments doesn't work in tcsh. The previous commit
introduces a regression here: tcsh cannot enter directories whose name
is valid UTF-8 but contains non-alphanumeric UTF-8 characters. It used
to work because printf would glue them together correctly, but we no
longer use printf and command substitution because that breaks newlines.
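
The escaping rule described in this message could be sketched in C roughly as below. This is a simplification rather than the committed code: the name `needs_escape` is made up, the safe set contains only the four characters named above (the message says "such as", so the real set may be larger), and every byte of 0x80 or above is passed through without checking that it belongs to a valid UTF-8 sequence.

```c
#include <stdbool.h>

/* Hypothetical predicate: does this byte need escaping before it is
   placed into the cd command? */
static bool
needs_escape (unsigned char c)
{
    if (c >= 0x80)
        return false;           /* treated as part of a multibyte character */

    if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9'))
        return false;           /* ASCII alphanumerics are safe */

    /* safe punctuation commonly found in paths */
    return !(c == '/' || c == '.' || c == '-' || c == '_');
}
```

Explicit ASCII ranges are used instead of `isalnum()` so that the result does not depend on the current locale.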

Signed-off-by: Egmont Koblinger <egmont@gmail.com>
@egmontkob egmontkob force-pushed the 2325_4480_pwd_newline branch from 013cd52 to 051cf0f Compare November 3, 2025 09:17
@egmontkob
Contributor Author

/rebase

> Just as a general comment, if I leave line comments with just one statement (and not more than one sentence), it looks better to me to omit the final full stop. Which is of course not pursued consistently throughout the whole codebase due to age.

I myself am inconsistent with that, too. Also whether to begin with uppercase or lowercase.

Fixed one such occurrence.

In the last commit, there's a one-liner and a two-liner that I'd like to keep consistent, so I kept the trailing dot.

@zyv zyv changed the title from "Tickets #2325/#4480: cd into directories with long names and/or embedded \n" to "Tickets #2325 & #4480: cd into directories with long names and/or embedded \n" Nov 5, 2025

Labels

area: core (Issues not related to a specific subsystem), prio: medium (Has the potential to affect progress)

4 participants