-
Notifications
You must be signed in to change notification settings - Fork 577
Add ability to emulate thread-safe locale operations #21971
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: blead
Are you sure you want to change the base?
Conversation
61fb53d
to
ea961d3
Compare
Locale information was originally global for an entire process. Later, it was realized that different threads could want to be running in different locales. Windows added this ability, and POSIX 2008 followed suit (though using a completely different API). When available, perl automatically uses these capabilities. But many platforms have neither, or their implementation, such as on Darwin, is buggy. This commit adds the capability for Perl programs to operate as if the platform were thread-safe. This implementation is based on the observation that the underlying locale matters only to relatively few libc calls, and only during their execution. It can be anything at all at any other time. perl keeps what the proper locale should be for each category in a a per-thread array. Each locale-dependent operation must be wrapped in mutex lock/unlock operations. The lock additionally compares what libc knows the locale to be, and what it should be for this thread at this time, and changes the actual locale to the proper value if necessary. That's all that is needed. This commit adds macros to perl.h, for example "MBTOWC_LOCK_", that expand to do the mutex lock, and change the global locale to the expected value. On perls built without this emulation capability, they are no-ops. All code in the perl core (unless I've missed something), are changed to use these macros (there weren't actually many places that needed this). Thus, any pure perl program will automatically become locale-thread-safe under this Configuration. In order for XS code to also become locale-thread-safe, it must use these macros to wrap calls to locale-dependent functions. Relatively few modules call such functions. For example, the only one I found that ships with the perl core is Time::Piece, and it has more fundamental issues with running under threads than this. I am preparing pull requests for it. Thus, this is not completely transparent to code like native-thread-safe locale handling is. Therefore ${^SAFE_LOCALES} returns 2 (instead of 1) for this type of thread-safety. Another deficiency compared to the native thread safety is when a thread calls a non-perl library that accesses the locale. The typical example is Gtk (though this particular application can be configured to not be problematic). With the native safe threads, everything works as long as only one such thread is used per Perl program. That thread would then be the only one operating in the global locale, hence there are no conflicts. With this emulation, all threads are operating in the global locale, and mutexes would have to be used to prevent conflicts. To minimize those, the code added in this commit restores the global locale when through to the state it was in when started. A major concern is the performance impact. This is after all trading speed for accuracy. lib/locale_threads.t is noticeably slower when this is being used. But that is doing multiple threads constantly using locale-dependent operations. I don't notice any change with the rest of the test suite. In pure perl, this only comes into play while in the scope of 'use locale' or when using some of the few POSIX:: functions that are locale-dependent. And to some extent when formatting, but the regular overhead there should dwarf what this adds. This commit leaves this feature off by default. The next commit changes that for the next few 5.39 development releases, so we can see if there is actually an issue.
ea961d3
to
2a87e59
Compare
@khwilliamson, this pull request has not received any discussion since it was filed more than six months ago. That is uncommon. I suspect that people aren't really convinced that Perl has a big problem here -- "big" in the sense that it's bothering a lot of people -- or "big" in the sense that it requires intervention in 14 different files, almost all of them deep in the Perl guts. As an indirect consequence of the lack of discussion of the p.r., it has acquired merge conflicts in 3 of those core files. If you wish to keep this p.r. in consideration, could you resolve those conflicts and re-push? Thanks. |
Locale information was originally global for an entire process. Later, it was realized that different threads could want to be running in different locales. Windows added this ability, and POSIX 2008 followed suit (though using a completely different API). When available, perl automatically uses these capabilities.
But many platforms have neither, or their implementation, such as on Darwin, is buggy. This commit adds the capability for Perl programs to operate as if the platform were thread-safe.
This implementation is based on the observation that the underlying locale matters only to relatively few libc calls, and only during their execution. It can be anything at all at any other time. perl keeps what the proper locale should be for each category in a a per-thread array. Each locale-dependent operation must be wrapped in mutex lock/unlock operations. The lock additionally compares what libc knows the locale to be, and what it should be for this thread at this time, and changes the actual locale to the proper value if necessary. That's all that is needed.
This commit adds macros to perl.h, for example "MBTOWC_LOCK_", that expand to do the mutex lock, and change the global locale to the expected value. On perls built without this emulation capability, they are no-ops. All code in the perl core (unless I've missed something), are changed to use these macros (there weren't actually many places that needed this). Thus, any pure perl program will automatically become locale-thread-safe under this Configuration.
In order for XS code to also become locale-thread-safe, it must use these macros to wrap calls to locale-dependent functions. Relatively few modules call such functions. For example, the only one I found that ships with the perl core is Time::Piece, and it has more fundamental issues with running under threads than this. I am preparing pull requests for it.
Thus, this is not completely transparent to code like native-thread-safe locale handling is. Therefore ${^SAFE_LOCALES} returns 2 (instead of 1) for this type of thread-safety.
Another deficiency compared to the native thread safety is when a thread calls a non-perl library that accesses the locale. The typical example is Gtk (though this particular application can be configured to not be problematic). With the native safe threads, everything works as long as only one such thread is used per Perl program. That thread would then be the only one operating in the global locale, hence there are no conflicts. With this emulation, all threads are operating in the global locale, and mutexes would have to be used to prevent conflicts. To minimize those, the code added in this commit restores the global locale when through to the state it was in when started.
A major concern is the performance impact. This is after all trading speed for accuracy. lib/locale_threads.t is noticeably slower when this is being used. But that is doing multiple threads constantly using locale-dependent operations. I don't notice any change with the rest of the test suite. In pure perl, this only comes into play while in the scope of 'use locale' or when using some of the few POSIX:: functions that are locale-dependent. And to some extent when formatting, but the regular overhead there should dwarf what this adds.
This commit leaves this feature off by default.