
text processing - Why is printf better than echo? - Unix & Linux Stack Exchange



URL:http://unix.stackexchange.com/q/65803/14305


Basically, it's a portability (and reliability) issue.

Initially, echo didn't accept any options and didn't expand anything. All it did was output its arguments, separated by a space character and terminated by a newline character.

Now, someone thought it would be nice if we could do things like echo "\n\t" to output newline or tab characters, or have an option not to output the trailing newline character.

They then thought harder but instead of adding that functionality to the shell (like perl where inside double quotes, \t actually means a tab character), they added it to echo.

David Korn realised the mistake and introduced a new form of shell quotes: $'...' which was later copied by bash and zsh but it was far too late by that time.

Now when a standard Unix echo receives an argument which contains the two characters \ and t, instead of outputting them, it outputs a tab character. And as soon as it sees \c in an argument, it stops outputting (so the trailing newline is not output either).
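
The behaviour described above can be reproduced portably with printf's %b conversion, which POSIX defines as expanding echo-style escape sequences in its argument. The sketch below uses it as a stand-in for a standard Unix echo; the variable names are only illustrative.

```shell
# %b expands backslash escapes in its argument, much as a Unix echo would.
with_tab=$(printf '%b\n' 'a\tb')   # the two characters \ and t become one tab

# \c stops all further output from that printf call, including the
# newline that comes from the format. The second printf adds a visible marker.
cut_short=$(printf '%b\n' 'abc\cdef'; printf '|')

printf '%s\n' "$with_tab" "$cut_short"
```

Everything after \c is dropped, so the marker follows abc directly.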

Other shells/Unix vendors chose to do it differently: they added a -e option to expand escape sequences, and a -n option to not output the trailing newline. Some have a -E to disable escape sequences, some have -n but not -e, and the list of escape sequences supported by one echo implementation is not necessarily the same as that supported by another.

Sven Mascheck has a nice page that shows the extent of the problem.

On those echo implementations that support options, there's generally no support for a -- to mark the end of options (zsh and possibly others support - for that, though), so for instance it's difficult to output "-n" in many shells.

On some shells like bash or ksh93, the behaviour even depends on how the shell was compiled or the environment. So two bash echos, even from the same version of bash are not guaranteed to behave the same.

POSIX says: if the first argument is -n or any argument contains backslashes, then the behaviour is unspecified. In that regard bash's echo is not POSIX-compliant: for instance, echo -e does not output -e<newline> as POSIX requires. The Unix specification is stricter: it prohibits -n and requires expansion of some escape sequences, including \c to stop outputting.

Those specifications don't really come to the rescue here given that many implementations are not compliant.

All in all, you don't know what echo "$var" will output unless you can make sure that $var doesn't contain backslash characters and doesn't start with -. The POSIX specification actually does tell us to use printf instead in that case.

So what that means is that you can't use echo to display uncontrolled data. In other words, if you're writing a script and it is taking external input (from the user as arguments, or file names from the file system...), you can't use echo to display it.

This is OK:

echo >&2 Invalid file.

This is not:

echo >&2 "Invalid file: $file"

(though it will work with some non-Unix echo implementations, such as bash's when the xpg_echo option has not been enabled in one way or another, for example at compilation time or via the environment).

printf, on the other hand, is more reliable, at least when it's limited to the basic usage of echo.

 printf '%s\n' "$var"

Will output the content of $var followed by a newline character regardless of what character it may contain.

 printf %s "$var"

will output it without the trailing newline character.
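
As a quick check of both forms, the sketch below feeds printf a value that would trip up many echo implementations. The variable names and the sample value are made up for illustration.

```shell
# Sample value: starts with a dash and contains backslash sequences.
var='-n \t\c C:\new'

# Command substitution strips trailing newlines, so a sentinel x is
# appended inside the substitution and removed again afterwards.
with_newline=$(printf '%s\n' "$var"; printf 'x')
without_newline=$(printf %s "$var"; printf 'x')
with_newline=${with_newline%x}
without_newline=${without_newline%x}

# The same pattern works for diagnostics on stderr.
printf >&2 'Invalid file: %s\n' "$var"
```

Both captured strings contain the value unchanged; they differ only by the final newline.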

Now, there are also differences between printf implementations. There's a core of features that is specified by POSIX, but then there are a lot of extensions. For instance, some support a %q to quote the arguments, but how it's done is shell-specific; some support \uxxxx for Unicode characters. The behaviour of printf '%10s\n' "$var" varies in multibyte locales, and there are at least three different outcomes for printf %b '\123'.

But in the end, if you stick to the POSIX feature set of printf and don't try doing anything fancy with it, you're out of trouble.

But remember the first argument is the format, so shouldn't contain variable/uncontrolled data.
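
To see why, compare what happens when data containing % or backslash sequences ends up in the format position. A small sketch; the variable names and the sample string are invented for the example.

```shell
var='100%s done\n'

# Risky: the data is interpreted as a format, so %s and \n are expanded.
as_format=$(printf "$var")

# Safer: a constant format, with the data passed as an argument.
as_argument=$(printf '%s\n' "$var")

printf '%s\n' "as format:   $as_format" "as argument: $as_argument"
```

Only the second form reproduces the original string; in the first, the %s is consumed as a conversion with no argument and the \n becomes a newline.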

A more reliable echo can be implemented using printf, like:

echo() ( # forking for local scope for $IFS
 IFS=" " # needed for "$*"
 printf '%s\n' "$*"
)
echo_n() (
 IFS=" "
 printf %s "$*"
)
echo_e() (
 IFS=" "
 printf '%b\n' "$*"
)

The fork can be avoided using local IFS on Linux (the LSB specification mandates local for Linux sh), or by writing it like:

echo() {
 if [ "$#" -gt 0 ]; then
  printf %s "$1"
  shift
 fi
 if [ "$#" -gt 0 ]; then
  printf ' %s' "$@"
 fi
 printf '\n'
}
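
A quick way to exercise the fork-free version is to give it a different name, so that the shell's own echo stays available for comparison. In this sketch safe_echo is a hypothetical name; the body is the same as the function above.

```shell
# Hypothetical name, chosen so the builtin echo is not shadowed.
safe_echo() {
    if [ "$#" -gt 0 ]; then
        printf %s "$1"        # first argument, no leading space
        shift
    fi
    if [ "$#" -gt 0 ]; then
        printf ' %s' "$@"     # format is reused for each remaining argument
    fi
    printf '\n'
}

safe_echo -n 'a\tb' '\c' end   # arguments come out verbatim, space-separated
safe_echo                      # no arguments: just a newline
```

Because every argument goes through %s, neither the leading -n nor the backslash sequences are interpreted.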
