Can GNU sed (for Windows) handle Unicode? If so, is it a code-page/locale issue, or a switch?
Posted by Peter.O on Super User
Published on 2010-08-04T20:57:34Z
I've been using GNU sed on and off for a couple of years now. It spins me out a bit sometimes, but it does a good job... for single-byte character sets!
Now and then I notice references to GNU sed being Unicode-aware, but the closest thing I've seen to this is its "binary" mode, and binary is not Unicode.
Can GNU sed process a Unicode text file at code-point resolution, including (and especially) \r\n Windows line endings? If it can, does it expect UTF-8, UTF-16, or something else, and how does sed detect the encoding?
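To make the distinction concrete, here is a minimal Python sketch of what "byte resolution" versus "code-point resolution" means for \r\n. It is not sed and makes no claim about how GNU sed behaves; the helper names (`sniff_bom`, `count_crlf_bytes`, `count_crlf_codepoints`) are hypothetical, and falling back to UTF-8 when no byte-order mark is present is an assumption of the sketch.

```python
import codecs

# BOM signatures, checked with UTF-32 first so that a UTF-32 LE BOM
# (FF FE 00 00) is not mistaken for a UTF-16 LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return (encoding, bom_length), or (None, 0) when no BOM is present."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def count_crlf_bytes(data: bytes) -> int:
    """Byte-level view: count the adjacent byte pair 0D 0A."""
    return data.count(b"\r\n")


def count_crlf_codepoints(data: bytes, default: str = "utf-8") -> int:
    """Code-point view: decode first, then count CR LF pairs."""
    encoding, skip = sniff_bom(data)
    # Assumption: with no BOM, treat the input as `default` (UTF-8 here).
    text = data[skip:].decode(encoding or default)
    return text.count("\r\n")


# Two lines with non-ASCII characters and Windows line endings.
sample = "na\u00efve\r\ncaf\u00e9\r\n"
utf8_bytes = sample.encode("utf-8")
utf16_bytes = codecs.BOM_UTF16_LE + sample.encode("utf-16-le")

for label, data in (("UTF-8", utf8_bytes), ("UTF-16 LE", utf16_bytes)):
    print(label, sniff_bom(data), count_crlf_bytes(data), count_crlf_codepoints(data))
```

In UTF-16 LE, \r\n is stored as the bytes 0D 00 0A 00, so a search for the adjacent bytes 0D 0A finds nothing in this sample even though the decoded text contains two line endings. That gap is what the question is about.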
© Super User or respective owner