How to accurately parse SMTP message status codes (DSN)?
- by Geo
RFC 1893 says that status codes come in the format quoted below; you can read more here.
But our bounce management system is having a hard time parsing the error status code from bounce messages. We are able to get the raw message, but depending on the email server the code appears in different places. Is there any rule for parsing this type of message to get better results? We are not looking for a 100% solution, but at least 80%.
This document defines a new set of status codes to report mail system
conditions. These status codes are intended to be used for media and
language independent status reporting. They are not intended for
system specific diagnostics.
The syntax of the new status codes is defined as:
status-code = class "." subject "." detail
class = "2"/"4"/"5"
subject = 1*3digit
detail = 1*3digit
White-space characters and comments are NOT allowed within a status-code. Each numeric sub-code within the status-code MUST be expressed without leading zero digits.
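For what it's worth, the grammar quoted above translates fairly directly into a regular expression. The following is only a minimal sketch, not a complete bounce parser. The function names are made up for illustration, and it assumes that when the bounce is a real DSN (RFC 1894) the code sits in a "Status:" field, with a fallback to scanning the free text (for example "550 5.1.1 User unknown") when that field is missing.

```python
import re
from typing import List, Optional

# One sub-code: 1 to 3 digits with no leading zero (a lone "0" is allowed).
_SUBCODE = r"(?:0|[1-9][0-9]{0,2})"

# status-code = class "." subject "." detail, with class limited to 2, 4 or 5.
# The lookarounds reject matches embedded in longer dotted numbers
# such as IP addresses or version strings.
STATUS_CODE_RE = re.compile(
    r"(?<![0-9.])([245])\.(" + _SUBCODE + r")\.(" + _SUBCODE + r")"
    r"(?![0-9])(?!\.[0-9])"
)

# Assumption: a DSN-style bounce carries a "Status:" field on its own line.
STATUS_FIELD_RE = re.compile(r"^Status:[ \t]*(\S+)", re.IGNORECASE | re.MULTILINE)


def find_status_codes(text: str) -> List[str]:
    """Return every well-formed status code found anywhere in the text."""
    return [m.group(0) for m in STATUS_CODE_RE.finditer(text)]


def extract_status(raw_message: str) -> Optional[str]:
    """Prefer a 'Status:' field; otherwise fall back to the first code in the text."""
    for m in STATUS_FIELD_RE.finditer(raw_message):
        if STATUS_CODE_RE.fullmatch(m.group(1)):
            return m.group(1)
    codes = find_status_codes(raw_message)
    return codes[0] if codes else None
```

Checking the structured field first and the free text second is a design guess aimed at the "80%" target: servers that emit a proper DSN are handled exactly, and the rest are handled on a best-effort basis.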
The RFC quote above says one thing, but the list below, from a leading bounce management tool, says something different. Where can I get a good source of standard status codes?
Return Code Description
0 UNDETERMINED - (ie. Recipient Reply)
10 HARD BOUNCE - (ie. User Unknown)
20 SOFT BOUNCE - General
21 SOFT BOUNCE - Dns Failure
22 SOFT BOUNCE - Mailbox Full
23 SOFT BOUNCE - Message Too Large
30 BOUNCE - NO EMAIL ADDRESS. VERY RARE!
40 GENERAL BOUNCE
50 MAIL BLOCK - General
51 MAIL BLOCK - Known Spammer
52 MAIL BLOCK - Spam Detected
53 MAIL BLOCK - Attachment Detected
54 MAIL BLOCK - Relay Denied
60 AUTO REPLY - (ie. Out Of Office)
70 TRANSIENT BOUNCE
80 SUBSCRIBE Request
90 UNSUBSCRIBE/REMOVE Request
100 CHALLENGE-RESPONSE
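As far as I can tell the two schemes are not directly comparable: the RFC codes describe what the mail system reported, while the list above looks like the tool's own bucketing. To illustrate how one might go from the former to the latter, here is a rough sketch. The mapping is entirely my own assumption and is not taken from the tool. It relies only on the RFC 1893 meanings of class 2 (success), class 4 (persistent transient failure), class 5 (permanent failure), X.2.2 (mailbox full), X.2.3 and X.3.4 (message too long or too big), and subject 7 (security or policy status).

```python
from typing import Tuple


def parse_status(code: str) -> Tuple[int, int, int]:
    """Split 'class.subject.detail' into three integers (raises ValueError if malformed)."""
    cls, subject, detail = (int(part) for part in code.split("."))
    return cls, subject, detail


def classify(code: str) -> int:
    """Map an RFC 1893 status code to one of the tool's return codes.

    Hypothetical mapping for illustration only; the tool's real rules are unknown.
    """
    cls, subject, detail = parse_status(code)
    if cls == 2:
        return 0   # success is not a bounce; treated as UNDETERMINED here
    if (subject, detail) == (2, 2):
        return 22  # SOFT BOUNCE - Mailbox Full
    if (subject, detail) in {(2, 3), (3, 4)}:
        return 23  # SOFT BOUNCE - Message Too Large
    if subject == 7:
        return 50  # MAIL BLOCK - General (security or policy status)
    if cls == 4:
        return 20  # SOFT BOUNCE - General (transient failure)
    if cls == 5:
        return 10  # HARD BOUNCE (permanent failure)
    return 0       # anything else: UNDETERMINED
```

For example, `classify("5.1.1")` gives 10 and `classify("4.2.2")` gives 22 under these assumptions. Categories such as auto-replies or unsubscribe requests have no RFC 1893 code at all, so they would need a different detection path.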