I am excited. I have had some Outlook/WordPress compatibility issues for almost a year now, but I was unable to find any sign that other people have been suffering from them.
I finally have a hypothesis to explain the problem. The hypothesis is untested and the problem is as yet unfixed, but this is already a bold step forward.
Symptoms
When read in Outlook, emails sent from WordPress are formatted poorly. All of the paragraphs are jammed together. Also, sometimes some of the email headers leak into the body – e.g. the top line of the email body will read:
Content-Type: text/plain; charset="UTF-8"
When the same email is read via a webmail client, it looks fine.
When the same email is read via Python’s poplib, there are CR characters stuck at the end of some lines even after poplib has allegedly filtered out both CR-LF and LF-CR sequences.
When the same email is read via Python’s poplib with debugging turned on, it seems each line in the body – and each of the troublesome lines in the header – is terminated with CR-CR-LF.
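The leftover CR is what you would expect from any reader that strips exactly one line terminator per line. The sketch below is a simplified model of that behaviour, not poplib's actual code; the function name strip_one_terminator and the sample lines are made up for illustration.

```python
def strip_one_terminator(raw_line: bytes) -> bytes:
    """Remove a single trailing CR-LF or LF, as a line-oriented reader might.

    Hypothetical helper: a simplified model, not poplib's implementation.
    """
    if raw_line.endswith(b"\r\n"):
        return raw_line[:-2]
    if raw_line.endswith(b"\n"):
        return raw_line[:-1]
    return raw_line


# A well-formed line and a line terminated with CR-CR-LF (invented sample data).
good = b"Hello world\r\n"
bad = b"Hello world\r\r\n"

print(strip_one_terminator(good))  # b'Hello world'
print(strip_one_terminator(bad))   # b'Hello world\r'
```

Only the final CR-LF is removed from the second line, so one stray CR survives, which matches what I saw at the end of the troublesome lines.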
Hypothesis
WordPress 2.0.2 uses PHP’s mail function.
According to Ben Cooke, the function requires different input when it is running on different operating systems: On Windows systems, you need to use CR-LF, as defined in the SMTP email standard. On Unix systems, you need to use the traditional Unix LF only – it will be converted to the CR-LF style that is required on the wire.
Here’s the kicker. The Unix system will replace every LF character with CR-LF, even if there is already a CR preceding it. That means a CR-LF sequence will be broken (Ben Cooke describes it as “hypercorrected”) into a CR-CR-LF sequence.
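The hypercorrection is easy to reproduce with a naive replacement. This is only a sketch of the behaviour described above, written in Python rather than PHP; the function name naive_lf_to_crlf and the sample strings are hypothetical, and I am not claiming this is how any particular mail transport is implemented.

```python
def naive_lf_to_crlf(text: str) -> str:
    """Replace every LF with CR-LF without checking for an existing CR.

    Hypothetical model of the conversion step described in the post.
    """
    return text.replace("\n", "\r\n")


# Invented sample input in the two styles.
unix_style = "Line one\nLine two\n"
windows_style = "Line one\r\nLine two\r\n"

print(repr(naive_lf_to_crlf(unix_style)))     # 'Line one\r\nLine two\r\n'
print(repr(naive_lf_to_crlf(windows_style)))  # 'Line one\r\r\nLine two\r\r\n'
```

LF-only input comes out as clean CR-LF, while input that already uses CR-LF comes out as CR-CR-LF.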
Many email clients accept the CR-CR-LF sequence without blinking. I suspect all the clients that WordPress testers use fall into this category.
<supposition>
Outlook treats a single CR-LF sequence as an end-of-line “hint”. Outlook reformats the text to fill the whole window, so it effectively discards that hint. Outlook treats a double CR-LF sequence as an end-of-paragraph.
So when WordPress incorrectly sends an end-of-paragraph CR-CR-LF-CR-CR-LF sequence, Outlook interprets it in two phases.
First, it searches for end-of-paragraphs (CR-LF-CR-LF) and finds none.
Then, it strips out any remaining LF and CR characters, as they are merely end-of-line hints, and can be ignored.
</supposition>
My hypothesis matches the facts neatly.
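The two-phase reading I am supposing can be modelled in a few lines. This is a sketch of my supposition only, not of Outlook's real behaviour; the function name supposed_outlook_render is hypothetical, and collapsing each leftover run of CR/LF into a single space (rather than deleting it outright) is an extra assumption made so the output stays readable.

```python
import re


def supposed_outlook_render(body: str) -> list[str]:
    """Model of the supposed two-phase parse (an assumption, not Outlook's code)."""
    # Phase 1: split into paragraphs on an exact CR-LF-CR-LF sequence.
    paragraphs = body.split("\r\n\r\n")
    # Phase 2: treat any remaining CR/LF runs as end-of-line hints and
    # collapse each run into one space (assumption for readability).
    return [re.sub(r"[\r\n]+", " ", p).strip() for p in paragraphs]


# Invented sample bodies: one well-formed, one with CR-CR-LF line endings.
correct = "First paragraph.\r\n\r\nSecond paragraph.\r\n"
broken = "First paragraph.\r\r\n\r\r\nSecond paragraph.\r\r\n"

print(supposed_outlook_render(correct))  # ['First paragraph.', 'Second paragraph.']
print(supposed_outlook_render(broken))   # ['First paragraph. Second paragraph.']
```

In the broken body the sequence CR-LF-CR-LF never occurs, so no paragraph boundary is found and the two paragraphs are jammed together.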
Solution
The solution is three-fold.
- Search the WordPress bug database again, now that I have a better idea what to search for, and see if this is already known.
- If not, raise the issue in the WordPress database.
- Consider writing a simple plugin which replaces the easily-pluggable wp_mail function in WordPress, with an equivalent one that calls one of the third-party open-source replacements for PHP’s mail function. They allegedly handle all of the platform incompatibility issues more gracefully.
Comment by Aristotle Pagaltzis on March 27, 2006
I have to stab myself in the eye with a fork at work, so I prefer to stab myself in the eye with a fork at home.
🙂
Comment by Julian on March 27, 2006
ROTFL!
I seem to pride myself in that at least I am stabbing myself with a fork into the same eye at home!
Comment by Julian on March 28, 2006
Update:
A new ticket has been raised against WordPress.
I am now beta-testing a new WordPress plugin that I wrote to overcome the problem.
Comment by Julian on March 29, 2006
Beta-testing is complete. The new plugin, CRCRLF, is now available!
I know it sounds minor, but this is really making a big difference to me for the usability of WordPress.
Comment by Sunny Kalsi on March 31, 2006
I’m with you on the snarky comments this time, Julian. At the end of the day, Outlook is doing what it can with the input it’s given. It’s not buggy behaviour that WordPress has to work around. It’s WordPress which has illegal output.