Removing control / special characters from log file

Posted by digitalsky on Stack Overflow
Published on 2011-11-25T20:16:08Z Indexed on 2011/11/26 1:51 UTC

I have a log file captured by tclsh that contains every backspace character (Ctrl-H, shown as "^H") and color-setting sequences (e.g. ^[[32m ... ^[[0m). What is an efficient way to remove them?

^[...m

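This one is easy, since I can just run "sed -i 's/^[\[[0-9;]*m//g' logfile" to remove them (here ^[ is the literal ESC character; matching only digits and semicolons keeps the pattern from swallowing text between two sequences on the same line).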

^H

Right now I have "sed -i s/.^H//", which "applies" one backspace per line on each pass, so I have to keep looping until no backspaces remain.

while grep -q ^H logfile; do sed -i 's/.^H//' logfile; done

"sed -i s/.^H//g" doesn't work when backspaces are consecutive: after removing one character/backspace pair, sed continues past the match and does not rescan, so the following backspaces are left behind. The loop takes 11 minutes for my log file of ~6k lines, which is too long.

Is there a better way to remove the backspaces?
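One possible approach, sketched here rather than taken from the original post, is to let sed do the looping itself with a label and the "t" (branch if a substitution succeeded) command, so the file is read only once instead of once per backspace. The function names apply_backspaces and strip_colors are hypothetical, and the sketch assumes the color sequences have the simple ESC[...m form shown above.

```shell
#!/bin/sh
# Literal control characters, built with printf so the sed scripts stay portable.
bs=$(printf '\b')     # backspace (Ctrl-H)
esc=$(printf '\033')  # escape (shown as ^[)

# Hypothetical helper: apply backspaces line by line in a single sed run.
# The loop deletes one "non-backspace character + backspace" pair at a time
# and branches back to :a until no such pair is left; any backspace that
# remains (e.g. at the start of a line) is then dropped.
apply_backspaces() {
    sed -e ':a' -e "s/[^${bs}]${bs}//" -e 'ta' -e "s/${bs}//g"
}

# Hypothetical helper: remove color sequences of the form ESC[<digits;>m.
strip_colors() {
    sed "s/${esc}\\[[0-9;]*m//g"
}

# Small demonstration on a sample line.
printf 'abc\b\bd \033[32mok\033[0m\n' | strip_colors | apply_backspaces
```

For a real file this could be run as "strip_colors < logfile | apply_backspaces > logfile.clean". Because the loop runs inside one sed process, the cost is proportional to the number of backspaces per line rather than to the number of passes over the whole file.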