How can you make shell scripts portable and run faster? Those are the questions these test cases aim to answer.
Table of Contents
- 1.0 SHELL SCRIPT PERFORMANCE AND PORTABILITY
- 3.0 ABOUT PERFORMANCE
- 4.0 PORTABILITY
- 5.0 RANDOM NOTES
- 6.0 FURTHER READING
- COPYRIGHT
- LICENSE
The tests reflect results under Linux
using GNU utilities. The focus is on the
features found in
Bash
and
POSIX.1-2024
compliant sh
shells. The term compliant
is used here as "mostly POSIX compliant",
as there is no shell, and never has
been one, that is fully POSIX compliant.
POSIX is useful if you are looking for
more portable scripts. See also POSIX in
Wikipedia.
Please note that
sh
here refers to modern, best-effort POSIX-compatible, minimal shells like dash and posh. See section PORTABILITY, SHELLS AND POSIX.
On Linux-like systems, Bash is the sensible all-round choice for shell scripting: it can manipulate data in memory with arrays, associative arrays, and strings with an extended set of parameter expansions, and it supports regular expressions, including extracting regex matches, as well as functions.
On other operating systems, for example the BSDs, the obvious choice for shell scripting would be the fast Ksh (ksh93, mksh, etc.).
Shell scripting is about combining redirections, pipes, calling external utilities, and gluing them all together. Shell scripts are also quite portable by default, requiring no additional installation. Perl or Python excel in their respective fields, where the requirements differ from those of the shell.
Certain features in Bash are slow, but
knowing the cold spots and using
alternatives helps. On the other hand,
small POSIX sh shells, for example
dash, are much faster at calling
external processes and functions. More
about this in section
SHELLS AND PERFORMANCE.
The results presented in this README
provide only some highlights from the test
cases listed in RESULTS. Consider the raw
time results only as guidance, as they
reflect only the system used at the time
of testing. Instead, compare the relative
order in which each test case produced
the fastest results.
- RESULTS
- RESULTS-BRIEF
- RESULTS-PORTABILITY
- The test cases and code in bin/
- USAGE
- CONTRIBUTING
bin/ The tests
doc/ Results by "make doc"
COPYING License (GNU GPL)
INSTALL Install instructions
USAGE.md How to run the tests
CONTRIBUTING.md Writing test cases
-
Homepage: https://github.com/jaalto/project--shell-script-performance-and-portability
-
To report bugs: see homepage.
-
Source repository: see homepage.
-
Depends: Bash, GNU coreutils, file
/usr/share/dict/words
(Debian package: wamerican). -
Optional depends: GNU make. For some tests: GNU parallel.
Regardless of the shell you use for scripting (sh, ksh, bash), consider these factors.
-
If you run scripts on many small files, set up a RAM disk and copy the files to it. This can lead to massive speed gains. In Linux, see tmpfs, which allows you to set a size limit, unlike the memory-hogging ramfs, which can fill all available memory and potentially halt your server.
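As a sketch of the RAM disk tip: without root you cannot mount a new tmpfs, but on most Linux systems the existing /dev/shm tmpfs mount can be borrowed for scratch files (the directory and file names here are illustrative only).

```shell
# Use the tmpfs already mounted at /dev/shm if present;
# fall back to a normal disk-backed temporary directory.
dir=$(mktemp -d /dev/shm/work.XXXXXX 2>/dev/null) ||
    dir=$(mktemp -d)
printf 'data\n' > "$dir/sample"  # stand-in for copied files
# ... process the files in "$dir" entirely in RAM ...
cat "$dir/sample"
rm -rf "$dir"
```

With root access, a size-capped RAM disk can instead be created explicitly, for example with mount -t tmpfs -o size=512M tmpfs /mnt/ramdisk.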
-
If you know the files beforehand, preload them into memory. This can also lead to massive speed gains. In Linux, see vmtouch.
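A hedged sketch of preloading with vmtouch; the log file paths are examples only, and vmtouch is a separate install (Debian package: vmtouch).

```shell
# Preload a known file set into the page cache, if
# vmtouch is available; otherwise do nothing.
if command -v vmtouch > /dev/null; then
    # -t: touch (load) pages into memory
    vmtouch -t /var/log/*.log 2>/dev/null || :
    # -v: show how much of each file is resident
    vmtouch -v /var/log/*.log 2>/dev/null || :
fi
```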
-
If you have tasks that can be run concurrently, use the Perl-based GNU parallel for massive performance gains. See also how to use semaphores (tutorial) to wait for all concurrent tasks to finish before continuing with the rest of the pipeline. In some cases, even parallelizing work with GNU xargs --max-procs=0 can help.
-
Use GNU utilities. According to benchmarks, like those on StackOverflow, GNU grep is considerably faster and more optimized than the operating system's default. For shells, the GNU utilities consist mainly of coreutils, grep, and awk. If needed, arrange PATH to prefer GNU utilities (for example, on macOS).
-
Minimize extra processes as much as possible. In most cases, a single awk can handle entire sed, cut, and grep chains. The awk binary is very fast and more efficient than Perl or Python scripts where startup time and higher memory consumption are factors. Note: if you need to process large files, use a lot of regular expressions, or manipulate data extensively, there is probably nothing that can replace the speed of Perl unless you go to even lower-level languages like C. But then again, we assume that you know how to choose your tools in those cases.
cmd | awk '{...}'
# ... could probably
# replace all of these
cmd | head ... | cut ...
cmd | grep ... | sed ...
cmd | grep ... | grep -v ... | cut ...
-
Note: if you have plenty of RAM, no shortage of cores, and large files, then utilize pipelines
<cmd> | ...
as much as possible, because the Linux kernel will optimize things in memory better. On more powerful systems, many latency and performance issues are not as relevant.
-
Use shell built-ins (see Bash), not binaries:
echo # not /usr/bin/echo
printf # not /usr/bin/printf
[ ... ] # not /usr/bin/test
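The parallelization tip above can be sketched with GNU xargs; the .log file names and the scratch directory are illustrative only.

```shell
# Compress files concurrently with GNU xargs.
# --max-procs=0 runs as many processes in parallel as
# possible; -0 with printf '%s\0' handles any file name.
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'x\n' > a.log
printf 'y\n' > b.log
printf '%s\0' *.log | xargs -0 --max-procs=0 -n 1 gzip
ls          # a.log.gz  b.log.gz
cd / && rm -rf "$dir"
```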
-
In Bash, it is at least 60 times faster to perform regular expression string matching using the binary operator =~ rather than calling the external POSIX utilities expr or grep. NOTE: in POSIX sh, like dash, calling utilities is extremely fast. Compared to Bash's [[ ]], the expr in dash is only 5x slower, which is negligible because the time differences are measured in a few milliseconds. See code
str="abcdef"
re="b.*e"
# Bash, Ksh
[[ $str =~ $re ]]
# In Bash, at least 60x slower
expr match "$str" ".*$re"
# In Bash, at least 100x slower
echo "$str" | grep -E "$re"
# --------------------------------
# Different shells compared.
# --------------------------------
./run.sh --shell dash,ksh93,bash t-string-match-regexp.sh
Run shell: dash
# t1 <skip>
# t2 real 0.010s expr
# t3 real 0.010s grep
Run shell: ksh93
# t1 real 0.001s [[ =~ ]]
# t2 real 0.139s expr
# t3 real 0.262s grep
Run shell: bash
# t1 real 0.003s [[ =~ ]]
# t2 real 0.200s expr
# t3 real 0.348s grep
- In Bash, it is about 50 times faster
to do string manipulation in memory
than to call external utilities.
Seeing from the measurements just how
expensive calling out is reminds us
to utilize the POSIX #, ##, % and %%
parameter expansions. See more in Bash. See code.
str="/tmp/filename.txt.gz"
# (1) Almost instantaneous.
# Delete up to and including
# the first "."
ext=${str#*.}
# (2) In Bash, over 50x slower
#
# NOTE: identical in speed
# and execution to:
# cut -d "." -f 2,3 <<< "$str"
ext=$(echo "$str" | cut -d "." -f 2,3)
# (3) In Bash, over 70x slower
ext=$(echo "$str" | sed 's/^[^.]\+//')
# --------------------------------
# Different shells compared.
# --------------------------------
./run.sh --shell dash,ksh93,bash t-string-file-path-components.sh
Run shell: dash
# t3aExt real 0.009s (1)
# t3cExt real 0.008s (2)
# t3eExt real 0.009s (3)
Run shell: ksh93
# t3aExt real 0.001s
# t3cExt real 0.193s
# t3eExt real 0.288s
Run shell: bash
# t3aExt real 0.004s
# t3cExt real 0.358s
# t3eExt real 0.431s
- In Bash, it is about 10 times faster
to read a file into memory as a string
and use pattern matching or the
regular expression binary operator
=~ on the string. In-memory handling is much more efficient than calling the grep command on a file in Bash, especially if multiple matches are needed. See code.
# Bash, Ksh
str=$(< file)
if [[ $str =~ $regexp1 ]]; then
...
elif [[ $str =~ $regexp2 ]]; then
...
fi
# --------------------------------
# Different shells compared.
# --------------------------------
(1) read once + case..end
(2) loop do.. grep file ..done
(3) loop do.. case..end ..done
./run.sh --shell dash,ksh93,bash t-file-grep-vs-match-in-memory.sh
Run shell: dash
# t1b real 0.023s (1) once
# t2 real 0.018s (2) grep
# t3 real 0.021s (3) case
Run shell: ksh93
# t1b real 0.333s (1) once
# t2 real 0.208s (2) grep
# t3 real 0.453s (3) case
Run shell: bash
# t1b real 0.048s (1) once
# t2 real 0.277s (2) grep
# t3 real 0.415s (3) case
- In Bash, it is about 8 times faster
to use a nameref to return a value;
ret=$(fn)
is an inefficient way to call functions. On the other hand, in POSIX sh shells, like dash, there is practically no overhead in using $(fn). See code.
# An example only. Not needed in
# POSIX sh shells as ret=$(fn)
# is already fast.
fnNamerefPosix()
{
# NOTE: uses non-POSIX
# 'local' but it is widely
# supported in POSIX-compliant
# shells: dash, posh, mksh,
# ksh93 etc.
local retref=$1
shift
local arg=$1
eval "$retref=\$arg"
}
fnNamerefBash()
{
local -n retref=$1
shift
local arg=$1
retref=$arg
}
# Return value returned to
# variable 'ret'
fnNamerefPosix ret "arg"
fnNamerefBash ret "arg"
# --------------------------------
# Different shells compared.
# --------------------------------
./run.sh --shell dash,ksh93,bash t-function-return-value-nameref.sh
Run shell: dash
# t1 <skip>
# t2 real 0.006s fnNamerefPosix
# t3 real 0.005s ret=$(fn)
Run shell: ksh93
# t1 <skip>
# t2 real 0.004s fnNamerefPosix
# t3 real 0.005s ret=$(fn)
Run shell: bash
# t1 real 0.006s fnNamerefBash
# t2 real 0.006s fnNamerefPosix
# t3 real 0.094s ret=$(fn)
- In Bash, it is about 2 times faster
for line-by-line handling to read
the file into an array and then loop
through the array. The built-in
readarray
is a synonym for mapfile. See code.
# Bash
readarray < file
for line in "${MAPFILE[@]}"
do
...
done
# POSIX. In bash, slower
while read -r line
do
...
done < file
# --------------------------------
# Different shells compared.
# --------------------------------
./run.sh --shell dash,ksh93,bash t-file-read-content-loop.sh
Run shell: dash
# t1 <skip>
# t2 real 0.085 POSIX
Run shell: ksh93
# t1 <skip>
# t2 real 0.021 POSIX
Run shell: bash
# t1 real 0.045 readarray
# t2 real 0.108 POSIX
- In Bash, it is about 2 times faster to prefilter with grep, processing only matching lines, than to read the whole file in a loop and then select lines. Process substitution is more general because variables persist after the loop. dash is very fast compared to Bash. See code.
# Bash
while read -r ...
do
...
done < <(grep "$re" file)
# POSIX
# Problem: while runs in
# a separate environment
grep "$re" file |
while read -r ...
do
...
done
# POSIX
# NOTE: extra calls
# required for tmpfile
grep "$re" file > tmpfile
while read -r ...
do
...
done < tmpfile
rm tmpfile
# Bash, Slowest,
# in-loop prefilter
while read -r line
do
[[ $line =~ $re ]] || continue
...
done < file
# --------------------------------
# Different shells compared.
# --------------------------------
./run.sh --shell dash,ksh93,bash t-file-read-match-lines-loop-vs-grep.sh
Run shell: dash
# t1a real 0.015s grep prefilter
# t2a real 0.012s loop: case...esac
Run shell: ksh93
# t1a real 2.940s
# t2a real 1.504s
Run shell: bash
# t1a real 4.567s
# t2a real 10.88s
- It is about 10 times faster to split
a string into an array using a list
rather than a Bash here-string.
This is because the
HERE STRING
<<<
uses a pipe or a temporary file, whereas a Bash list operates entirely in memory. (The pipe buffer behavior was introduced in Bash 5.1; see changelog section c.) Warning: the (list) statement undergoes pathname expansion, so globbing characters like *, ?, etc. in the string would be a problem. Pathname expansion can be disabled. See code.
str="1:2:3"
# Bash, Ksh. Fastest.
IFS=":" eval 'array=($str)'
fn() # Bash
{
local str=$1
# Make 'set' local
local -
# Disable pathname
# expansion
set -o noglob
local -a array
IFS=":" eval 'array=($str)'
...
}
# Bash. Slower than 'eval'.
IFS=":" read -ra array <<< "$str"
# In Linux, see what Bash uses
# for HERE STRING: pipe or
# temporary file
bash -c 'ls -l --dereference /proc/self/fd/0 <<< hello'
- It is about 2 times faster to read
a file into a string using the Bash
command substitution
$(< file)
. NOTE: in POSIX sh, like dash, $(cat file) is extremely fast. See code.
# Bash
str=$(< file)
# In Bash: 1.8x slower
# Read max 100 KiB
read -r -N $((100 * 1024)) str < file
# In Bash: POSIX, 2.3x slower
str=$(cat file)
# --------------------------------
# Different shells compared.
# --------------------------------
./run.sh --shell dash,ksh93,bash t-file-read-into-string.sh
Run shell: dash
# t1 <skip>
# t2 <skip>
# t3 real 0.013s $(cat ...)
Run shell: ksh93
# t1 real 0.088s $(< ...)
# t2 real 0.095s read -N
# t3 real 0.267s $(cat ...)
Run shell: bash
# t1 real 0.139s $(< ...)
# t2 real 0.254s read -N
# t3 real 0.312s $(cat ...)
According to the results, none of these offer practical benefits.
- The Bash
brace expansion
{N..M}
might offer a negligible advantage. However, it may be impractical because N..M cannot be parameterized. Surprisingly, the simple and elegant $(seq N M) is fast, even though command substitution uses a subshell. The last POSIX while loop example was slightly slower in all subsequent tests. See code.
N=1
M=100
# Bash
for i in {1..100}
do
...
done
# POSIX, fast
for i in $(seq $N $M)
do
...
done
# Bash, slow
for ((i=$N; i <= $M; i++))
do
...
done
# POSIX, slowest
i=$N
while [ "$i" -le "$M" ]
do
i=$((i + 1))
done
- One might think that choosing
optimized
grep
options would make a difference. In practice, for typical file sizes (below a few megabytes), performance is nearly identical, even with the ignore-case option included. Nonetheless, there may be cases where selecting LANG=C, using --fixed-strings, and avoiding --ignore-case might improve performance, at least according to StackOverflow discussions about large files. See code.
# The same performance. Regexp
# engine does not seem to be
# the bottleneck
LANG=C grep --fixed-strings ...
LANG=C grep --extended-regexp ...
LANG=C grep --perl-regexp ...
LANG=C grep --ignore-case ...
None of these offer any advantages to speed up shell scripts.
- The Bash-specific expression
[[ ]]
might offer a minuscule advantage, but only in loops of 10,000 iterations. Unless the safeguards provided by Bash [[ ]] are important, the POSIX tests will do fine. See code.
[ "$var" = "1" ] # POSIX
[[ $var = 1 ]] # Bash
[ ! "$var" ] # POSIX
[[ ! $var ]] # Bash
[ -z "$var" ] # archaic
- There are no practical differences
between these. The POSIX
arithmetic expansion
$(( ))
will do fine. Note that the null command
:
utilizes the command's side effect to "do nothing, but evaluate the elements", and therefore may not be the most readable option. See code.
i=$((i + 1)) # POSIX, preferred
: $((i++)) # POSIX, Uhm
: $((i = i + 1)) # POSIX, Uhm
((i++)) # Bash, Ksh
let i++ # Bash, Ksh
- There is no performance
difference between the
Bash-specific expression
[[ ]]
for pattern matching compared to POSIX case..esac. Interestingly, pattern matching is 4x slower under dash compared to Bash. However, that means nothing, because the time differences are measured in mere milliseconds (0.002s). See code.
string="abcdef"
pattern="*cd*"
# Bash
if [[ $string == $pattern ]]; then
...
fi
# POSIX
case $string in
$pattern)
: # Same as true
;;
*)
false
;;
esac
# --------------------------------
# Different shells compared.
# --------------------------------
./run.sh --shell dash,ksh93,bash t-string-match-regexp.sh
Run shell: dash
# t1 <skip>
# t2 real 0.011 POSIX
Run shell: ksh93
# t1 real 0.004 [[ == ]]
# t2 real 0.002 POSIX
Run shell: bash
# t1 real 0.003 [[ == ]]
# t2 real 0.002 POSIX
- There is no performance difference
between a regular while loop and a
process substitution
loop. However, the latter is more
general, as any variable set during
the loop will persist afterwards,
and there is no need to clean up
temporary files as in the POSIX (1)
solution. The POSIX (1) loop is
marginally faster, but the speed gain
is lost by the extra
rm
command. See code.
# Bash, Ksh
while read -r ...
do
...
done < <(command)
# POSIX (1)
# Same, but with
# temporary file
command > file
while read -r ...
do
...
done < file
rm file
# POSIX (2)
# while is being run in
# separate environment
# due to pipe(|)
command |
while read -r ...
do
...
done
- With
grep
, the use of GNU parallel, a Perl program, makes things notably slower for typical file sizes. The idea of splitting a file into chunks of lines and running the search in parallel is intriguing, but the overhead of starting the Perl interpreter with parallel is orders of magnitude more expensive than running the already optimized grep only once. Usually the limiting factor when grepping a file is the disk's I/O speed. Otherwise, GNU parallel is excellent for making full use of multiple cores. Based on StackOverflow discussions, if file sizes are in the several hundreds of megabytes or larger, GNU parallel can help speed things up. See code.
# Possibly add: --block -1
parallel --pipepart --arg-file "$largefile" grep "$re"
In typical cases, the legacy sh
(Bourne shell)
is not a relevant target for shell
scripting. Linux and modern
UNIX operating systems have long
provided an sh
that is
POSIX-compliant enough. Nowadays sh
is usually a symbolic link to
dash
(on Linux since 2006) or
ksh
(on some BSDs), or it may point to
Bash
(on macOS).
Examples of pre-2000 shell scripting practises:
if [ x$a = y ] ...
# Variable length is non-zero
if [ -n "$a" ] ...
# Variable length is zero
if [ -z "$a" ] ...
# Deprecated in next POSIX
# version. Operands are
# not portable.
# -o (OR)
# -a (AND)
if [ "$a" = "y" -o "$b" = "y" ] ...
# POSIX allows leading
# opening "(" paren
case abc in
(a*) true
;;
(*) false
;;
esac
Modern equivalents:
# Equality
if [ "$a" = "y" ] ..
# Variable has something
if [ "$a" ] ...
# Variable is empty
if [ ! "$a" ] ...
if [ "$a" = "y" ] || [ "$b" = "y" ] ...
# Without leading "(" paren
case abc in
a*) : # "true"
;;
*) false
;;
esac
Writing shell scripts inherently involves considering several factors.
-
Personal scripts. When writing scripts for personal or administrative tasks, the choice of shell is unimportant. On Linux, the obvious choice is Bash. On BSD systems, it would be Ksh. On macOS, Zsh might be handy.
-
Portable scripts. If you intend to use the scripts across some operating systems — from Linux to Windows (Git Bash, Cygwin, MSYS2 [*][**]) — the obvious choice would be Bash. Between macOS and Linux, writing scripts in Bash is generally more portable than writing them in Zsh because Linux doesn't have Zsh installed by default. With macOS however, the choice of Bash is a bit more involved (see next).
-
POSIX-compliant scripts. If you intend to use the scripts across a variety of operating systems — from Linux, BSD, and macOS to various Windows Linux-like environments — the issues become quite complex. You are probably better off writing
sh
POSIX-compliant scripts and testing them with dash, since relying on Bash can lead to unexpected issues — different systems have different Bash versions, and there’s no guarantee that a script written on Linux will run without problems on older Bash versions, such as the outdated 3.2 version in/bin/bash
on macOS. Requiring users to install a newer version on macOS is not trivial because/bin/bash
is not replaceable.
[*] "Git Bash" is available with the
popular native Windows installation of
Git for Windows.
Under the hood, the installation is based on
MSYS2, which in turn is based on
Cygwin. The common denominator of
all native Windows Linux-like
environments is the
Cygwin
base which, in all
practical terms, provides the usual
command-line utilities,
including Bash. For curious readers,
Windows software
MobaXterm,
offers X server, terminals
and other connectivity features, but also
comes with a Cygwin-based
Bash shell with its own
Debian-style
apt
package manager which allows installing
additional Linux utilities.
[**] In Windows, there is also
the Windows Subsystem for Linux
(WSL),
where you can install (see wsl --list --online
)
Linux distributions like
Debian,
Ubuntu,
OpenSUSE and
Oracle Linux.
Bash is the obvious choice for shell
scripts in this environment.
As this document is more focused on
Linux, macOS, and BSD compatibility,
and less on legacy UNIX operating
systems, for all practical purposes
there is no need to attempt to write
pure POSIX shell scripts. Stricter
measures are required only if you
target legacy UNIX operating systems
whose sh
may not have changed in 30
years. Your best guide is probably
the wealth of knowledge collected by
the GNU autoconf project; see
"11 Portable Shell Programming".
For more discussion, see
4.6 MISCELLANEOUS NOTES.
Let's first consider the typical sh
shells in order of their strictness to
POSIX:
-
posh. Minimal
sh
, Policy-compliant Ordinary SHell. Very close to POSIX. Stricter than dash. Supports the local
keyword to define local variables in functions. The keyword is not defined in POSIX. -
dash. Minimal
sh
, Debian Almquist Shell. Close to POSIX. Supports the local
keyword. The shell aims to meet the requirements of the Debian Linux distribution. -
Busybox ash is based on dash with some more features added. Supports
local
keyword. See ServerFault "What's the Busybox default shell?"
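A minimal sketch of the non-POSIX local keyword that all of the shells above support:

```shell
# 'local' is not in POSIX, but dash, posh, mksh,
# Busybox ash (and Bash) all support it for
# function-scoped variables.
var=global
fn()
{
    local var=inner     # does not clobber the global
    echo "in fn: $var"
}
fn                      # prints: in fn: inner
echo "after: $var"      # prints: after: global
```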
Let's also consider what /bin/sh
might be in different operating
systems. For more about the history of
the sh
shell, see the well-rounded
discussion on StackExchange.
What does it mean to be "sh compatible"?
Picture
"Bourne Family Shells" by
tangentsoft.com
-
On Linux, most distributions already use, or are moving towards using, sh as a symlink to dash. Older Linux versions (Red Hat, Fedora, CentOS) used to have sh as a symlink to bash.
-
On the most conservative NetBSD, it is
ash
, the old Almquist shell. On FreeBSD, sh is also ash. On OpenBSD, sh is ksh (a pdksh derivative) from the Ksh family.
-
On many commercial, conservative UNIX systems,
sh
is nowadays the quite capable ksh93.
-
On macOS,
sh
points to
bash --posix
, where the Bash version is indefinitely stuck at 3.2.x due to Apple avoiding the GPL-3 license of later Bash versions. If you write /bin/sh scripts on macOS, it is a good idea to check them for portability with:
# Check better /bin/sh
# compliance
dash -nx script.sh
posh -nx script.sh
In practical terms, if you plan to aim for POSIX-compliant shell scripts, the best shells for testing would be posh and dash. You can also extend testing with BSD Ksh shells and other shells. See FURTHER READING for external utilities to check and improve shell scripts even more.
# Save in shell startup file
# like ~/.bashrc
shelltest()
{
local script shell
for script # Implicit "$@"
do
for shell in \
posh \
dash \
"busybox ash" \
mksh \
ksh \
bash \
zsh
do
if command -v "$shell" > /dev/null; then
echo "-- shell: $shell"
$shell -nx "$script"
fi
done
done
}
# Use is like:
shelltest script.sh
# External utility
shellcheck script.sh
# External utility
checkbashisms script.sh
Note that POSIX does not define the shebang — the traditional first line that indicates which interpreter to use. See the POSIX C language section "exec family of functions" and its RATIONALE:
(...) Another way that some historical implementations handle shell scripts is by recognizing the first two bytes of the file as the character string "#!" and using the remainder of the first line of the file as the name of the command interpreter to execute.
The first bytes of a script typically
contain two special ASCII codes, a
special comment #!
if you wish, which
is read by the kernel. Note that this
is a de facto convention, universally
supported even though it is not defined
by POSIX.
#! <interpreter> [word]
#
# 1. Whitespace is allowed after
#    "#!" for readability.
#
# 2. The <interpreter> must be
#    a full path name. Not like:
#
# #! sh
#
# 3. ONE word can be added
# after the <interpreter>.
# Any more than that may not
# be portable across Linux
# and some BSD Kernels.
#
# #! /bin/sh -eu
# #! /usr/bin/awk -f
# #! /usr/bin/env bash
# #! /usr/bin/env python3
Note that on macOS, /bin/bash
is
hard-coded to Bash version 3.2.57,
whereas in 2025 the latest Bash is
5.2.
You cannot uninstall it, even with root
access, without disabling System
Integrity Protection. If you install a
newer Bash version with brew install bash
, it will be located in
/usr/local/bin/bash
.
On macOS, to use the latest Bash, the
user must arrange /usr/local/bin
first in
PATH.
If the script starts with #! /bin/bash
, the user cannot arrange it
to run under a different Bash version
without modifying the script itself,
or, after modifying PATH
, running it
inconveniently with bash <script>
.
... portable
#! /usr/bin/env bash
... traditional
#! /bin/bash
There was a disruptive change from
Python 2.x to Python 3.x in 2008.
Older programs did not run without
changes under the new version. In Python
programs, the shebang should specify
the Python version explicitly, either
with python
(2.x) or python3
.
... The de facto interpreters
#! /usr/bin/python
#! /usr/bin/python3
... not supported
#! /usr/bin/python2
#! /usr/bin/python3.13.2
But this is not all. Python is one of those languages which might require multiple virtual environments per project. It is typical to manage these environments with tools like uv, or the older virtualenv, pyenv, etc. For even better portability, the following would allow the user to use their active Python environment:
... portable
#! /usr/bin/env python3
The fine print here is that
env
is a standard POSIX utility, but its
path is not mandated by POSIX. However,
in 99.9% of cases, the de facto
portable location is /usr/bin/env
.
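A quick way to verify that assumption on a given system:

```shell
# Show where 'env' actually lives here
command -v env      # typically /usr/bin/env
```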
In the end, the actual implementation of the shell you use (dash, bash, ksh...) is less important than what utilities you use and how you use them.
It's not just about choosing to write
in POSIX sh
; the utilities
called from the script also have to be
considered. Utilities such as echo
, cut
and
tail
make up a big part of scripts.
If you want to ensure portability,
check the options defined in
POSIX.
See the top left menu "Shell & Utilities",
followed by the bottom left menu
"4. Utilities".
Notable observations:
- Use POSIX
command -v
to check if a command exists. Note that POSIX also defines type, as in type <command> without any options. POSIX also defines the utility hash, as in hash <command>. The problem with type is that the semantics, return codes, support, or output are not necessarily uniform. The problems with hash are similar. Neither type nor hash is supported by posh; see table RESULTS-PORTABILITY. Note: which <command> is neither in POSIX nor portable. For more information about which, see shellcheck SC2230, BashFAQ 081, the StackOverflow discussion "How can I check if a program exists from a Bash script?", and the Debian project's plan to deprecate the command in the LWN article "Debian's which hunt".
REQUIRE="sqlite curl"
RequireFeatures ()
{
local cmd
for cmd # Implicit "$@"
do
if ! command -v "$cmd" > /dev/null; then
echo "ERROR: not in PATH: $cmd" >&2
return 1
fi
done
}
# Before program starts
RequireFeatures $REQUIRE || exit $?
...
- Use plain
echo
without any options. Use printf when more functionality is needed. Relying solely on printf may not be ideal: in POSIX-compliant sh shells, printf is not always a built-in command (e.g., in posh or mksh), which can lead to performance overhead from invoking an external process.
# POSIX
echo "line" # (1)
echo "line"
printf "no newline" # (2)
# Not POSIX
echo -e "line\nline" # (1)
echo -n "no newline" # (2)
-
Use grep -E. In 2001, POSIX removed egrep.
-
read. POSIX requires a VARIABLE, so always supply one. In Bash, the command defaults to the variable REPLY if omitted. You should also always use the option -r, which is explained in shellcheck SC2162, BashFAQ 001, POSIX IFS and BashWiki IFS. See in-depth details on how the read command reads characters, not lines, in the StackExchange discussion Understanding "IFS= read -r line".
# POSIX
REPLY=$(cat file)
# Bash, Ksh
# Read max 100 KiB to $REPLY
read -rN $((100 * 1024)) < file
case $REPLY in
*pattern*)
# match
;;
esac
set -- 1
# POSIX
# Shift all positional args
shift $#
# Any number greater than $#
# terminates the whole program
# in: dash, posh, mksh, ksh93 etc.
shift 2
As a case study, the Linux GNU sed(1)
options differ from, or are
incompatible with, other systems. The
GNU sed
--in-place
option for replacing file
content cannot be used on macOS and
BSD. Additionally, on macOS and BSD,
you will find GNU programs under a
g
prefix, such as gsed(1)
, etc. See
StackOverflow
"sed command with -i option failing on Mac, but works on Linux". For more
discussion about the topic, see
StackOverflow 1,
StackOverflow 2,
StackOverflow 3.
# Linux (works)
#
# GNU sed(1). Replace 'this'
# with 'that'
sed -i 's/this/that/g' file
# macOS (does not work)
#
# This does not work. The '-i'
# option has different syntax
# and semantics. There is no
# workaround to make the '-i'
# option work across all
# operating systems.
sed -i 's/this/that/g' file
# Maybe portable
#
# In many cases Perl might be
# available although it is not
# part of the POSIX utilities.
perl -i -pe 's/this/that/g' file
# Portable
#
# Avoid -i option.
tmp=$(mktemp)
sed 's/this/that/g' file > "$tmp" &&
mv "$tmp" file
rm -f "$tmp" # cleanup if sed failed
POSIX defines the awk
-v
option to set variables, but some
legacy awk implementations do not
support it. For maximum portability,
you can use assignments after the
program instead.
# Portable even to legacy awk
awk '{print var}' var=1 file
# POSIX -v; missing in some
# legacy awk implementations
awk -v var=1 '{print var}' file
However, don't forget that such
assignments are not evaluated until
they are encountered, that is, after
any BEGIN
action. To use awk for
operands without any files:
# Portable even to legacy awk
var=1 awk 'BEGIN {print ENVIRON["var"] + 1}' < /dev/null
# POSIX -v
awk -v var=1 'BEGIN {print var + 1; exit}'
- The shell's null command
:
might be slightly preferable to the utility true, according to GNU autoconf's manual "11.14 Limitations of Shell Builtins", which states that true might not always be a builtin and that "(...) the portable shell community tends to prefer using :".
while :
do
break
done
# Create an empty file
: > file
-
Prefer the POSIX
$(cmd)
command substitution instead of the legacy POSIX backticks, as in `cmd`. For more information, see BashFaq 098 and shellcheck SC2006. For 20 years, all modern
sh
shells have supported $()
, including UNIXes like AIX, HP-UX, and the conservative Oracle Solaris 10 (2005), whose support ends in 2026 (see Solaris version history).
# Easily nested
lastdir=$(basename "$(pwd)")
# Readability problems
lastdir=`basename \`pwd\``
See the Bash manual for how to use the
time
reserved word with the
TIMEFORMAT
variable to display results in
different formats. The use of time as a
reserved word permits the timing of
shell builtins, shell functions, and
pipelines.
TIMEFORMAT='real: %R' # '%R %U %S'
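For example, a minimal sketch timing a shell function with the time reserved word; the function body and iteration count are arbitrary.

```shell
# Bash-specific: TIMEFORMAT controls what the reserved
# word 'time' prints; %R is elapsed (real) seconds.
bash -c '
    TIMEFORMAT="real: %R"
    fn()
    {
        i=0
        while [ "$i" -lt 1000 ]
        do
            i=$((i + 1))
        done
    }
    time fn
' 2>&1
```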
You could also drop kernel cache before testing:
echo 3 > /proc/sys/vm/drop_caches
- Bash Manual
- Greg's Bash Wiki and FAQ https://mywiki.wooledge.org/BashGuide
- List of which features were added to specific releases of Bash https://mywiki.wooledge.org/BashFAQ/061
- GNU autoconf's manual section "11 Portable Shell Programming" Note: This presents information intended to overcome operating system portability issues dating back to the 1970s. Consider some tips with a grain of salt, given the capabilities of more modern POSIX-compliant shells.
- For cross platform operating system detection, see useful files to check: http://linuxmafia.com/faq/Admin/release-files.html
shellcheck
(Haskell) can help to improve and write portable POSIX scripts. It can statically lint scripts for potential mistakes. There is also a web interface where you can upload the script at https://www.shellcheck.net. In Debian, see package "shellcheck". The manual page is at https://manpages.debian.org/testing/shellcheck/shellcheck.1.en.html
checkbashisms
can help to improve and write portable POSIX scripts. In Debian, the command is available in package "devscripts". The manual page is at https://manpages.debian.org/testing/devscripts/checkbashisms.1.en.html
Relevant POSIX links from 2000 onward:
- https://en.wikipedia.org/wiki/POSIX
- POSIX.1-2024 IEEE Std 1003.1-2024 https://pubs.opengroup.org/onlinepubs/9799919799
- POSIX.1-2018 IEEE Std 1003.1-2018 https://pubs.opengroup.org/onlinepubs/9699919799 https://pubs.opengroup.org/onlinepubs/9699919799.2018edition/
- POSIX.1-2008 IEEE Std 1003.1-2008 https://pubs.opengroup.org/onlinepubs/9699919799.2008edition/
- POSIX.1-2004 (2001-2004) IEEE Std 1003.1-2004 https://pubs.opengroup.org/onlinepubs/009695399
Relevant UNIX standardization links. Single UNIX Specification (SUSv4) documents are derived from POSIX standards. For an operating system to become UNIX certified, it must meet all specified requirements, a process that is both costly and arduous. A well-known full certification of a Unix-like desktop system is Apple's macOS 10.5 Leopard in 2007. Read the story shared by Apple's project lead, Terry Lambert, in the Quora discussion "What goes into making an OS to be Unix compliant certified?"
- The Single UNIX Specification, Version 4 https://unix.org/version4/
- See discussion at StackExchange about "Difference between POSIX, Single UNIX Specification, and Open Group Base Specifications?".
- A comprehensive history of
ash
. "Ash (Almquist Shell) Variants" by Sven Mascheck https://www.in-ulm.de/~mascheck/various/ash/ - The late Jörg Schilling's
schilytools
contains
pbosh
shell that can be used for POSIX-sh-like testing. See discussion of preserving the project and some history at Reddit. - Super simple
s
command interpreter to write shell-like scripts (security oriented): https://github.com/rain-1/s
Copyright (C) 2024-2025 Jari Aalto
These programs are free software; you can redistribute them and/or modify them under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
These programs are distributed in the hope that they will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with these programs. If not, see http://www.gnu.org/licenses/.
License-tag: GPL-2.0-or-later
See https://spdx.org/licenses
Keywords: shell, sh, POSIX, bash, ksh, ksh93, programming, optimizing, performance, profiling, portability