Antoon Pardon
I want to do some postprocessing on messages from a particular mailbox.
So I use getmail which will fetch the messages and feed them to stdin
of my program.
As I don't know what encoding these messages will be in, I thought it
would be prudent to read stdin as binary data.
Using Python 3.3 on a Debian box, I have the following code:
#!/usr/bin/python3
import sys
from email import message_from_file
sys.stdin = sys.stdin.detach()
msg = message_from_file(sys.stdin)
which gives me the following traceback:
  File "/home/apardon/.getmail/verdeler", line 7, in <module>
    msg = message_from_file(sys.stdin)
  File "/usr/lib/python3.3/email/__init__.py", line 56, in message_from_file
    return Parser(*args, **kws).parse(fp)
  File "/usr/lib/python3.3/email/parser.py", line 58, in parse
    feedparser.feed(data)
  File "/usr/lib/python3.3/email/feedparser.py", line 167, in feed
    self._input.push(data)
  File "/usr/lib/python3.3/email/feedparser.py", line 100, in push
    data, self._partial = self._partial + data, ''
TypeError: Can't convert 'bytes' object to str implicitly
which seems rather odd. The following headers are in the message:
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
So why doesn't the email parser look up the charset and use that
to convert the data to a string?
What is the canonical way to parse an email message from stdin?
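
For reference, here is a minimal sketch of one way this is commonly done, not necessarily the only one. The traceback arises because message_from_file() expects a text-mode file object, so handing it the detached binary stream fails before any header is examined. The email package also provides message_from_binary_file(), which accepts a binary stream such as sys.stdin.buffer. The helper names parse_message, body_text and main are hypothetical, chosen only for illustration, and body_text assumes a non-multipart message.

```python
import sys
from email import message_from_binary_file


def parse_message(stream):
    """Parse one email message from a binary file-like object."""
    return message_from_binary_file(stream)


def body_text(msg):
    """Decode the body of a non-multipart message using its declared charset.

    Falls back to ASCII when no charset is declared; bytes that cannot be
    decoded are replaced rather than raising an error.
    """
    charset = msg.get_content_charset() or "ascii"
    raw = msg.get_payload(decode=True)  # bytes for non-multipart messages
    return raw.decode(charset, errors="replace")


def main():
    # sys.stdin.buffer is the underlying binary stream; no detach() needed.
    msg = parse_message(sys.stdin.buffer)
    print(msg["Subject"])
    print(body_text(msg))
```

A script invoked by getmail would call main() at the end. Reading bytes and letting the parser find the headers first is what makes the charset lookup possible: the charset is only known after the headers have been parsed.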