Comments:"text processing - Why is printf better than echo? - Unix & Linux Stack Exchange"
URL:http://unix.stackexchange.com/q/65803/14305
Basically, it's a portability (and reliability) issue.
Initially, echo didn't accept any options and didn't expand anything. All it did was output its arguments separated by a space character and terminated by a newline character.
Then someone thought it would be nice if we could do things like echo "\n\t" to output newline or tab characters, or have an option not to output the trailing newline character.
They then thought harder, but instead of adding that functionality to the shell (as in perl, where \t inside double quotes actually means a tab character), they added it to echo.
David Korn realised the mistake and introduced a new form of shell quotes, $'...', which was later copied by bash and zsh, but it was far too late by that time.
Now when a standard Unix echo receives an argument which contains the two characters \ and t, instead of outputting them, it outputs a tab character. And as soon as it sees \c in an argument, it stops outputting (so the trailing newline is not output either).
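As a hedged illustration of that escape handling: POSIX specifies that the %b conversion of printf expands backslash escapes in its argument much like a Unix echo does, so it can show the effect without depending on which echo happens to be installed. This is only a sketch of the behaviour, not a statement about any particular echo implementation.

```shell
# %b expands backslash escapes in the argument, as a Unix (XSI) echo would.
printf '%b\n' 'a\tb'        # a, a tab character, b, then a newline
printf '%b\n' 'one\ctwo'    # stops at \c: "one" and no trailing newline
printf '\n'                 # finish the line that \c left open
printf '%s\n' 'a\tb'        # %s leaves the backslash and the t untouched
```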
Other shells/Unix vendors chose to do it differently: they added a -e option to expand escape sequences, and a -n option to not output the trailing newline. Some have a -E option to disable escape sequences, some have -n but not -e, and the list of escape sequences supported by one echo implementation is not necessarily the same as that supported by another.
Sven Mascheck has a nice page that shows the extent of the problem.
On those echo implementations that support options, there's generally no support for a -- to mark the end of options (zsh and possibly others support - for that, though), so for instance, it's difficult to output "-n" in many shells.
On some shells like bash or ksh93, the behaviour even depends on how the shell was compiled or on the environment. So two bash echos, even from the same version of bash, are not guaranteed to behave the same.
POSIX says: if the first argument is -n or any argument contains backslashes, then the behaviour is unspecified. bash's echo is not POSIX-compliant in that regard: for instance, echo -e does not output -e<newline> as POSIX requires. The Unix specification is stricter: it prohibits -n and requires expansion of some escape sequences, including the \c one to stop outputting.
Those specifications don't really come to the rescue here given that many implementations are not compliant.
All in all, you don't know what echo "$var" will output unless you can make sure that $var doesn't contain backslash characters and doesn't start with -. The POSIX specification actually does tell us to use printf instead in that case.
So what that means is that you can't use echo to display uncontrolled data. In other words, if you're writing a script and it is taking external input (from the user as arguments, or file names from the file system...), you can't use echo to display it.
This is OK:
echo >&2 Invalid file.
This is not:
echo >&2 "Invalid file: $file"
(though it will work OK with some (non-Unix) echo implementations like bash's, as long as the xpg_echo option has not been enabled in one way or another, such as at compilation time or via the environment).
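A minimal sketch of how the second message could be written so that it is safe for any file name. The helper name invalid_file is hypothetical and not part of the original answer; the only technique used is keeping the data out of the format string and passing it as an argument to %s.

```shell
# invalid_file: hypothetical helper that reports a file name on stderr verbatim.
invalid_file() {
    printf 'Invalid file: %s\n' "$1" >&2
}

# A name that would confuse many echo implementations: it starts with -
# and contains backslash sequences.
file='-n\tweird\cname'
invalid_file "$file"
```

Because the file name is only ever consumed by %s, neither the leading dash nor the backslashes are interpreted.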
printf, on the other hand, is more reliable, at least when it's limited to the basic usage of echo.
printf '%s\n' "$var"
will output the content of $var followed by a newline character, regardless of what characters it may contain.
printf %s "$var"
will output it without the trailing newline character.
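To make that concrete, here is a small sketch that runs both forms over a few values chosen (as an assumption, for illustration only) because each would be risky to pass to one echo implementation or another.

```shell
# Each value starts with a dash or contains a backslash sequence.
for var in '-n' '-e' 'C:\temp\new' 'stop\chere'; do
    printf '%s\n' "$var"    # prints the value unchanged, then a newline
done

printf %s "$var"            # last value again, without the trailing newline
printf '\n'
```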
Now, there are also differences between printf implementations. There's a core of features that is specified by POSIX, but then there are a lot of extensions. For instance, some support a %q to quote the arguments, but how it's done is shell-specific; some support \uxxxx for Unicode characters. The behaviour varies for printf '%10s\n' "$var" in multibyte locales, and there are at least three different outcomes of printf %b '\123'.
But in the end, if you stick to the POSIX feature set of printf and don't try doing anything fancy with it, you're out of trouble.
But remember that the first argument is the format, so it shouldn't contain variable/uncontrolled data.
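A short sketch of that rule. The helper name show_value and the sample value are assumptions made for the example; the point is only that the data travels as an argument while the format stays a fixed literal.

```shell
# show_value: hypothetical helper; data goes in an argument, never in the format.
show_value() {
    printf '%s\n' "$1"
}

var='50% off\n'
show_value "$var"     # prints $var exactly as it is, then a newline
# printf "$var"       # avoid: the % and \n inside $var would be read as format syntax
```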
A more reliable echo can be implemented using printf, like:
echo() ( # forking for local scope for $IFS
    IFS=" " # needed for "$*"
    printf '%s\n' "$*"
)
echo_n() (
    IFS=" "
    printf %s "$*"
)
echo_e() (
    IFS=" "
    printf '%b\n' "$*"
)
The fork can be avoided using local IFS on Linux (the LSB specification mandates local for Linux sh), or by writing it like:
echo() {
    if [ "$#" -gt 0 ]; then
        printf %s "$1"
        shift
    fi
    if [ "$#" -gt 0 ]; then
        printf ' %s' "$@"
    fi
    printf '\n'
}
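A usage sketch for the fork-free variant. The function is repeated here under the hypothetical name safe_echo, only so that the example is self-contained and does not shadow the shell's own echo.

```shell
# safe_echo: same logic as the fork-free function above, under a hypothetical name.
safe_echo() {
    if [ "$#" -gt 0 ]; then
        printf %s "$1"          # first argument, printed verbatim
        shift
    fi
    if [ "$#" -gt 0 ]; then
        printf ' %s' "$@"       # the format is reused for each remaining argument
    fi
    printf '\n'
}

safe_echo -n                # prints -n and a newline
safe_echo 'a\tb' -e c       # prints a\tb -e c
safe_echo                   # prints an empty line
```

The arguments are joined with single spaces without touching $IFS, because printf reuses the ' %s' format once per remaining argument.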