[reposting since the original seems to have vanished]
It is him alright, and I maintain HarvestMan, (Capital 'H', Capital
'M').
But I don't think Nirmal is a masochist, not in any way.
I don't understand why you called him a "masochist". The reason
we wrote our own cookie handling module using Python's Cookie module
was that the ClientCookie module was doing too many things, which
we wanted to avoid.
Anand, first, it wasn't said with the *slightest* malice (after all, I
meet my own criterion for masochism)!
Still, I certainly *do* think you're duplicating effort for no obvious
reason. You're quite at liberty to do that, of course, but so am I to
call you a masochist for it ;-)
ClientCookie is exactly what it says: it is a module that acts as
a web client apart from managing cookies. The ClientCookie module
borrows the methods of urllib2 like urlopen() and manages cookies
under the covers, so to speak. Good design, no doubt, but not what
we wanted.
Well, no. It provides convenient replacements for the urllib2
callables (that's what _urllib2_support.py is for). It certainly
doesn't *require* you use that urllib2-wrapping stuff. All it
requires is that you give CookieJar.extract_cookies and
CookieJar.add_cookie_header trivial request and response objects. In fact,
since I happen to know that HarvestMan uses urllib2, I know you
already *have* such objects.
No wrapping involved:
import urllib2, ClientCookie
cj = ClientCookie.CookieJar()
request = urllib2.Request("http://www.example.com/")
cj.add_cookie_header(request)
response = urllib2.urlopen(request)
cj.extract_cookies(response, request)
# ...etc
It couldn't really be much simpler! In fact, I see your own
CookieManager.SetCookie and CookieManager.add_cookie_header methods
are directly analogous (if slightly less convenient, less
standards-compliant, more ignorant of the de-facto cookie standard,
and more buggy -- well OK, maybe not more buggy ;-).
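For readers on Python 3, the same two-call pattern can be sketched with the standard library's http.cookiejar, whose CookieJar has add_cookie_header and extract_cookies methods of the same names. This is only an illustration under assumptions: FakeResponse, the session=abc123 cookie and the /page URL are made up here so that the example runs without a network, and nothing below is HarvestMan or ClientCookie code.

```python
import email
import urllib.request
from http.cookiejar import CookieJar


class FakeResponse:
    """Hypothetical stand-in for an HTTP response; the jar only calls info()."""

    def __init__(self, header_text):
        self._msg = email.message_from_string(header_text)

    def info(self):
        return self._msg


cj = CookieJar()

# First exchange: the (fake) server sets a cookie, and the jar extracts it.
request = urllib.request.Request("http://www.example.com/")
response = FakeResponse("Set-Cookie: session=abc123; Path=/\n\n")
cj.extract_cookies(response, request)

# Second exchange: the jar adds a Cookie header to the outgoing request.
next_request = urllib.request.Request("http://www.example.com/page")
cj.add_cookie_header(next_request)
cookie_header = next_request.get_header("Cookie")
```

With a real connection, response would simply be whatever urlopen(request) returns; no wrapping of the URL-opening machinery is needed either way.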
Well, OK, it could be simpler if you didn't even have to call those
methods on cj. But of course, that's what the urllib2-replacement
stuff is for, to add entirely automatic cookie handling:
ClientCookie.urlopen(request) # no need for any tiresome method calls
You're entirely free to use or ignore that stuff.
Why does that optional support *wrap* urllib2 instead of *extending*
it with a CookieHandler? Good question. If you want to have the
urllib2 interface with automatic cookie handling in the sense above,
you're currently *forced* to replace (parts of) urllib2 rather than
extend it, due to the current design of urllib2. I hope to change
that with a patch I've submitted and Jeremy Hylton plans to look at in
his Copious Free Time.
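The extension-style design described above, a cookie handler plugged into an opener rather than a replacement urlopen, can be sketched as follows. The sketch assumes Python 3's urllib.request and http.cookiejar, where such a handler exists as HTTPCookieProcessor; it is not the patch mentioned above, and no request is actually sent.

```python
import urllib.request
from http.cookiejar import CookieJar

cj = CookieJar()

# The handler adds cookies from the jar to outgoing requests and stores
# cookies from responses back into it, so no explicit calls on cj are needed.
cookie_handler = urllib.request.HTTPCookieProcessor(cj)
opener = urllib.request.build_opener(cookie_handler)

# opener.open("http://www.example.com/") would now handle cookies automatically.
```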
ClientCookie is something like a wrapper over urllib2 plus cookie
handling. We wanted a module that works *with* urllib2 and does
not wrap over it. With all due respect to CC, it cannot be modified
to do that without writing clunky code, which I did not want.
That's just a misunderstanding -- it *has* been that way for a long
time. Just ignore, or throw away, _urllib2_support.py!
Hence Nirmal read the RFCs for cookie handling and implemented
Well, as I tried to tell Nirmal (unfortunately it seems that email
address was dead), and as is explained in tiresome (though far from
exhaustive) detail in the ClientCookie docs, browsers don't actually
implement the RFCs. Well, lynx makes a good attempt at RFC 2109, and
Opera does for 2965, but as long as the big browsers don't implement
them -- and it's almost a certainty they never will -- nobody will
actually be *using* either standard! They're just there to trip you
up <0.7 wink>. As is the original cookie_spec.html, actually: it only
bears a passing resemblance to the de facto standard actually used on
the internet. Of course, as long as your userbase is sufficiently
small (please don't feel insulted: the userbase of my code is
doubtless pretty tiny), you may not notice (though I suspect you
will), but people like Ronald Tschalar, author of a Java library
called HTTPClient, have found themselves spending inordinate amounts of
time fixing problems that arise by trying to implement only the RFCs,
or by naively trying to combine the RFCs with the Netscape protocol.
Of course, you can get around that to a large extent by not bothering
to implement the security rules properly (which may be quite a
reasonable thing to do in some cases).
[...]
Nirmal is talking about RFCs because we want to have a very correct
technical cookie implementation for our module, no matter what web
servers do in their wheels and gears. I know that the SetCookie2 method
uses the latest cookie RFC, which almost no web server supports, but still
the question should be taken in a spirit of technical correctness.
Your module claims to implement 2109, officially obsolete. Fine
(though I suspect it's quite far from a full implementation). As for
2965 (the standard that officially obsoletes 2109, and which
ClientCookie does implement in addition to the Netscape protocol),
even David Kristol seems to have given up on it, as has the single guy
who was driving the effort to rescue it from compatibility problems
with the Netscape protocol (Daniel Kian McKiernan). RFC 2965 now
seems utterly dead as an internet protocol.
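To make the difference between the two protocols concrete, here is a minimal sketch, assuming Python 3's http.cookiejar, where DefaultCookiePolicy handles Netscape-style Set-Cookie headers by default and leaves RFC 2965 (Set-Cookie2) switched off unless rfc2965=True is passed. FakeResponse, cookies_accepted and the foo=bar cookie are hypothetical names used only for this illustration.

```python
import email
import urllib.request
from http.cookiejar import CookieJar, DefaultCookiePolicy


class FakeResponse:
    """Hypothetical stand-in for an HTTP response; the jar only calls info()."""

    def __init__(self, header_text):
        self._msg = email.message_from_string(header_text)

    def info(self):
        return self._msg


def cookies_accepted(policy):
    """Return how many cookies a jar with this policy keeps from a Set-Cookie2 header."""
    cj = CookieJar(policy)
    request = urllib.request.Request("http://www.example.com/")
    response = FakeResponse('Set-Cookie2: foo=bar; Version="1"; Path="/"\n\n')
    cj.extract_cookies(response, request)
    return len(cj)


# Default policy: Netscape protocol only, so the RFC 2965 header is ignored.
netscape_only = cookies_accepted(DefaultCookiePolicy())

# Opting in to RFC 2965 makes the same header acceptable.
with_rfc2965 = cookies_accepted(DefaultCookiePolicy(rfc2965=True))
```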
Yeah, it has an academic quality to it, not of much practical use maybe
[...]
As you say.
John