Why does printf report an error for all but three ASCII-range Unicode codepoints, yet work fine for all the others?
Posted by fred.bear on Ask Ubuntu, 2011-01-09
The 'printf' I refer to is the standard-issue "program" (not the built-in): /usr/bin/printf
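(For reference, here is a quick way to tell the two apart; these commands are just my own sanity check, assuming bash and GNU coreutils.)
type -a printf                 # lists the shell built-in first, then /usr/bin/printf
type -P printf                 # prints only the path of the external binary
$(type -P printf) --version    # the coreutils printf reports a version; the built-in does not accept this option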
I was testing printf as a viable method of converting a Unicode codepoint hex literal into its Unicode character representation.
It was looking good, and seemed flawless (by the way, the built-in printf can't do this at all, I think)...
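For example (my own quick check, assuming a UTF-8 terminal and the GNU coreutils printf), codepoints above the ASCII range convert just fine:
$ /usr/bin/printf '\u00e9 \u20ac \u263a\n'
é € ☺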
I then thought to test it at the lower extreme end of the code spectrum, and it failed with an avalanche of errors, all in the ASCII range (= 7 bits).
The strangest thing was that three values printed normally (a quick check follows the list); they are:
- $ \u0024
- @ \u0040
- ` \u0060
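A side-by-side check (the exact commands are mine, but the behaviour is exactly as listed above):
$ /usr/bin/printf '\u0024 \u0040 \u0060\n'
$ @ `
$ /usr/bin/printf '\u0041\n'
/usr/bin/printf: invalid universal character name \u0041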
I'd like to know what is going on here. The ASCII character set is most definitely part of the Unicode code-point range...
I am puzzled, and still without a good way to bash-script this particular conversion. Suggestions are welcome.
To be entertained by that same avalanche of errors, paste the following code into a terminal...
# Here is one of the error messages
# /usr/bin/printf: invalid universal character name \u0041
# ...for them all, run the following script
(
# iterate over every two-hex-digit codepoint \u0000 .. \u00FF
for nib1 in {0..9} {A..F}; do
  for nib0 in {0..9} {A..F}; do
    # newline after codepoints whose high nibble is a digit, space otherwise
    [[ $nib1 < A ]] && nl="\n" || nl=" "
    $(type -P printf) "\u00$nib1$nib0$nl"
  done
done
echo
)