RMail and RFC-2047

Oliver Cromm · May 27, 2004

I was playing around with the RMail package and I was missing RFC-2047
support. I found the "module Rfc2047" in
<20031204151316.GC849@jupp%gmx.de>
but noticed the following:

In the regex to discover encoded words:

| WORD = %r{=\?([!#$%&'*+-/0-9A-Z\\^\`a-z{|}~]+)\?([BbQq])\?([!->@-~]+)\?=} # :nodoc:

I had to change % to \% to get it to run. Maybe it's just Cygwin.

The second thing is that the module doesn't correctly interpret the
"encoded-word - linear white space - encoded-word" sequence, where
all the white space between the two encoded words should be deleted.

So I added a regex to delete this whitespace before further processing:

module Rfc2047

WORD = %r{=\?([!#$\%&'*+-/0-9A-Z\\^\`a-z{|}~]+)\?([BbQq])\?([!->@-~]+)\?=} # :nodoc:
| WORDSEQ = %r{(=\?[!#$\%&'*+-/0-9A-Z\\^\`a-z{|}~]+\?[BbQq]\?[!->@-~]+\?=)\s*(=\?[!#$\%&'*+-/0-9A-Z\\^\`a-z{|}~]+\?[BbQq]\?[!->@-~]+\?=)}

[Comment skipped]

def Rfc2047.decode_to(target, from)
| from.gsub!(WORDSEQ, '\1\2')

out = from.gsub(WORD) do |word|
charset, encoding, text = $1, $2, $3

It works so far, but I wonder whether '\s*' is the correct expression
and whether there is a more efficient way to do this.
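
On the '\s*' question: RFC 2047 (section 6.2) says that linear white space separating two adjacent encoded-words is ignored, so matching whitespace at that spot is the right idea; '\s' is somewhat broader than linear white space, which is probably harmless here. One thing to watch with the pair-based WORDSEQ is that gsub consumes the second encoded word of each match, so in a run of three or more encoded words only every other gap gets removed. A possible alternative, as a minimal sketch: match one encoded word plus the whitespace after it, and only look ahead at the next word. ENCODED_WORD, ADJACENT_GAP and join_encoded_words are names made up for this illustration, and the simplified character classes are an assumption, not the module's WORD pattern.

```ruby
# One encoded-word without capture groups, so it can be reused inside
# larger patterns. Simplified: charset and text are "anything except
# '?' and whitespace" (an assumption, looser than RFC 2047's token rules).
ENCODED_WORD = /=\?[^?\s]+\?[BbQq]\?[^?\s]+\?=/

# An encoded-word followed by whitespace, but only if another encoded-word
# comes right after. The lookahead does not consume that next word, so it
# can start the following match in a longer run.
ADJACENT_GAP = /(#{ENCODED_WORD})\s+(?=#{ENCODED_WORD})/

# Returns a copy with the gaps removed; the argument is left untouched
# (unlike gsub!, which would modify the caller's string).
def join_encoded_words(str)
  str.gsub(ADJACENT_GAP, '\1')
end

header = "=?iso-8859-1?Q?a?= =?iso-8859-1?Q?b?=\r\n =?iso-8859-1?Q?c?= plain"
joined = join_encoded_words(header)
# joined == "=?iso-8859-1?Q?a?==?iso-8859-1?Q?b?==?iso-8859-1?Q?c?= plain"
```

The space before "plain" is kept, because only whitespace between two encoded words is dropped.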

I also observed that decoding of non-Western character sets (Win-1251 to
Big5) to UTF-8 didn't work. Does anybody already suspect why, or do I have
to track down the error further?
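
For comparison while tracking this down, here is a minimal sketch of the conversion step using String#encode from Ruby 1.9 and later. It is not what the Rfc2047 module does internally; payload_to_utf8 is a hypothetical helper, and it assumes the charset label in the encoded word is a name that Ruby's Encoding knows (for example "windows-1251" or "Big5"). The point it illustrates is that the decoded bytes have to be tagged with the source charset before they are transcoded.

```ruby
# Hypothetical helper: decode the payload of one encoded-word and
# transcode it to UTF-8. charset, encoding and text correspond to the
# three captures of an encoded-word pattern.
def payload_to_utf8(charset, encoding, text)
  raw = case encoding.upcase
        when 'B' then text.unpack('m').first               # Base64
        when 'Q' then text.tr('_', ' ').unpack('M').first  # "_" means space
        else raise ArgumentError, "unknown encoding #{encoding}"
        end
  # unpack returns binary bytes; label them with the source charset,
  # then convert. An unknown charset name raises ArgumentError here.
  raw.force_encoding(charset).encode('UTF-8')
end

greeting = payload_to_utf8('windows-1251', 'Q', '=CF=F0=E8=E2=E5=F2')
# greeting is the UTF-8 string "Привет"
```

If a sketch like this converts a given header correctly while the module does not, the problem is more likely in the charset name handling or the converter call than in the Base64/Q decoding.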
