How to remove control chars from a UTF-8 string
- by Mimefilt
Hi there,
I have a VB.NET program that processes the content of documents.
The program handles high volumes of documents in batches (2 million documents; 1 TB in total).
Some of these documents may contain control chars or chars like U+F0E8 (http://www.fileformat.info/info/unicode/char/f0e8/browsertest.htm).
Is there an easy and, above all, fast way to remove those chars (everything except space, newline, tab, ...)?
If the answer is regex: does anyone have a complete regex for me?
Thanks!
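
One way this could be approached, as a minimal sketch rather than a definitive answer: walk the string once and drop every character whose Unicode category is Control, PrivateUse (which covers U+F0E8), or OtherNotAssigned, while keeping tab, CR, and LF. The names `ControlCharFilter` and `StripUnwanted` are made up for illustration, and the choice of categories to drop is an assumption about what counts as "unwanted" here.

```vb
Imports System.Globalization
Imports System.Text

Module ControlCharFilter
    ' Hypothetical helper (not from the original post).
    ' Assumption: "unwanted" means Unicode categories Control, PrivateUse
    ' and OtherNotAssigned; tab, CR and LF are always kept.
    Function StripUnwanted(input As String) As String
        Dim sb As New StringBuilder(input.Length)
        For Each c As Char In input
            ' Keep the whitespace control characters explicitly.
            If c = ControlChars.Tab OrElse c = ControlChars.Cr OrElse c = ControlChars.Lf Then
                sb.Append(c)
                Continue For
            End If
            Select Case Char.GetUnicodeCategory(c)
                Case UnicodeCategory.Control,
                     UnicodeCategory.PrivateUse,
                     UnicodeCategory.OtherNotAssigned
                    ' Drop the character.
                Case Else
                    sb.Append(c)
            End Select
        Next
        Return sb.ToString()
    End Function
End Module
```

A regex with the same intent would be `[\p{Cc}\p{Co}\p{Cn}-[\t\r\n]]`, using .NET character class subtraction and replacing matches with an empty string. For a 1 TB batch the single-pass loop is probably cheaper, but that is a guess that should be measured on real documents.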