Matěj Cepl
Hi,
I have a script (https://github.com/mcepl/gg_scraper) that needs to
read possibly malformed mbox messages. I use subprocess.Popen() and
/usr/bin/formail to clean them up into correct mbox messages (with
a correct leading From line etc.). Now I am trying to run the tests for
my script on Travis-CI, where formail is not installed. Actually, I have
just learned that I can run apt-get install procmail in .travis.yml.
Still, I started to wonder whether I could make my script pure Python.
I know that
    msg = email.message_from_string(original_msg)
    print(msg.as_string(unixfrom=True))
works as a poor man's replacement for `formail -d`. Now I would like to
know how reliable a replacement it is. Does anybody have (or know of) a
corpus of poorly formatted messages that formail can fix, which I could
test against?
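To make the idea concrete, here is a minimal sketch of the pure-Python approach. The helper name normalize_message and the sample message are made up for illustration; only email.message_from_string() and as_string(unixfrom=True) come from the snippet above.

```python
import email


def normalize_message(raw_text):
    """Re-serialize one message so that it starts with an mbox 'From ' line.

    Hypothetical helper, not part of gg_scraper: it only wraps the two
    calls quoted above.
    """
    msg = email.message_from_string(raw_text)
    # unixfrom=True emits the envelope line; if the input had none,
    # the email package synthesizes "From nobody <current time>".
    return msg.as_string(unixfrom=True)


# Made-up sample message that lacks the leading "From " line.
raw = "Subject: test\nFrom: someone@example.com\n\nbody text\n"
fixed = normalize_message(raw)
print(fixed)
```

As far as I can tell, as_string() does not by default escape body lines that begin with "From ", so this alone may not be enough for bodies that contain such lines.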
Thanks a lot for any reply,
Matěj
--
http://www.ceplovi.cz/matej/, Jabber: (e-mail address removed)
GPG Finger: 89EF 4BC6 288A BF43 1BAB 25C3 E09F EF25 D964 84AC
Less is more or less more.
-- Y_Plentyn on #LinuxGER
(from fortunes -- I cannot resist