Can GNU sed (for Windows) handle Unicode? If so, is it a code-page/locale issue, or a switch?

Posted by Peter.O on Super User
Published on 2010-08-04T20:57:34Z Indexed on 2012/06/15 21:19 UTC

I've been using GNU sed on and off for a couple of years now. It spins me out a bit sometimes, but it does a good job... for single-byte character sets!
Now and then I notice references to GNU sed being Unicode-aware, but the closest thing I've seen to this is its "binary" mode, and binary is not Unicode.
Can GNU sed process a Unicode text file at code-point resolution, including (and especially) Windows \r\n line endings? If it can, does it expect UTF-8, UTF-16, or something else, and how does sed detect the encoding?
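
To make the question concrete, here is a minimal Python sketch (not sed itself; the sample string and variable names are just my own illustration) of why the encoding matters to a byte-oriented tool. It shows the same three code points, 'é' followed by \r\n, as UTF-8 bytes and as UTF-16 bytes:

```python
# Illustrative sample only: one non-ASCII character plus a Windows line ending.
sample = "\u00e9\r\n"  # 'é', CR, LF

code_points = len(sample)                 # Python str is indexed by code point
utf8_bytes = sample.encode("utf-8")       # 'é' becomes two bytes
utf16_bytes = sample.encode("utf-16-le")  # every code point here becomes two bytes (no BOM)

print(code_points)          # 3
print(utf8_bytes.hex(" "))  # c3 a9 0d 0a
print(utf16_bytes.hex(" ")) # e9 00 0d 00 0a 00

# A tool that works byte by byte sees CR LF as adjacent bytes only in UTF-8.
print(b"\r\n" in utf8_bytes)   # True
print(b"\r\n" in utf16_bytes)  # False: a 00 byte sits between 0d and 0a
```

So a tool working at byte resolution would count 'é' as two "characters" in UTF-8, and would not even find an adjacent \r\n pair in UTF-16. That is the behaviour I am trying to pin down for sed.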

© Super User or respective owner
