How relevant is UTF-7 when it comes to parsing emails?
- by J. Pablo Fernández
I recently implemented incoming emails for an application and boy, did I open the gates of hell? Since then every other day an email arrives that makes the app fail in a different way.
One of those things is emails encoded as UTF-7. Most emails come as ASCII, some of the Latin encodings, or thankfully, UTF-8.
Hotmail error messages (like email address doesn't exist or quota exceeded) seem to come as UTF-7. Unfortunately, UTF-7 is not an encoding Ruby understands:
> "hello world".encode("utf-8", "utf-7")
Encoding::ConverterNotFoundError: code converter not found (UTF-7 to UTF-8)
> Encoding::UTF_7
=> #<Encoding:UTF-7 (dummy)>
My application doesn't crash, it actually handles the email quite well, but it does send me a notification about the potential error.
I spent some time googling and I can't find anyone that implemented the conversion, at least not as a Ruby 1.9.3 Encoding::Converter.
So, my question is, since I never got an email with actual content, from an actual person, in UTF-7, how relevant is that encoding? can I safely ignore it?