Cookie Module

N.K

Hi,

Python's existing Cookie module doesn't support the newer cookie header,
Set-Cookie2.
How do I submit a patch for that? I tried emailing the person who owns
that module, but got no response.

Thanks,
Nirmal
 
Peter Hansen

N.K said:
Python's existing Cookie module doesn't support the newer cookie header,
Set-Cookie2.
How do I submit a patch for that? I tried emailing the person who owns
that module, but got no response.

Have you looked at this page? http://www.python.org/dev/

Note the reference to the "Patch Manager"...

-Peter
 
John J. Lee

Python's existing Cookie module doesn't support the newer cookie header,
Set-Cookie2.
[...]

What do you want this for? I'm curious, and I suspect you're unaware
that the RFCs on cookies are *not* the standards followed by most of
the web -- the real 'standards' aren't really written down anywhere.

Wait a minute (my memory works slowly), didn't I write you an email
about this a while back? Are you the masochist^H^H^H^H^H^H^H^H^Hguy
who wrote some client-side cookie-handling code for a web-crawler
(called harvestman, IIRC?)?


John
 
Anand Pillai

It is him all right, and I maintain HarvestMan (capital 'H', capital
'M').
But I don't think Nirmal is a masochist, not in any way.

I don't understand why you called him a "masochist". The reason
we wrote our own cookie-handling module using Python's Cookie module
was that the ClientCookie module was doing too many things, which
we wanted to avoid.

ClientCookie is exactly what it says: it is a module that acts as
a web client apart from managing cookies. The ClientCookie module
borrows the methods of urllib2, like urlopen(), and manages cookies
under the covers, so to speak. Good design, no doubt, but not what
we wanted.

ClientCookie is something like a wrapper over urllib2 plus cookie
handling. We wanted a module which works *with* urllib2 and does
not wrap it. With all due respect to CC, it cannot be modified
to do that without writing clunky code, which I did not want.

Hence Nirmal read the RFCs for cookie handling and implemented
a module which works *along with* urllib2 rather than wrapping
it.
The module needs special calls to set the cookie, which are part of
the HarvestMan code (in another module), so in a way it is much
inferior to ClientCookie, which does it transparently.

Nirmal is talking about RFCs because we want to have a technically
very correct cookie implementation for our module, no matter what web
servers do in their wheels and gears. I know that the Set-Cookie2
header uses the latest cookie RFC, which almost no web server supports,
but still the question should be taken in a spirit of technical
correctness.

Yeah, it has an academic quality to it, not of much practical use maybe
right now, but I cannot understand how it makes the guy a masochist.

-Anand Pillai

 
N.K

What do you want this for? I'm curious, and I suspect you're unaware
that the RFCs on cookies are *not* the standards followed by most of
the web -- the real 'standards' aren't really written down anywhere.


True, I was forced to use RFC 2965. And there are more reserved
keywords, such as 'port', 'Discard', etc. Anyway, it is not a bad thing
to update a module.

_reserved = { "expires" : "expires",
              "path"    : "Path",
              "comment" : "Comment",
              "domain"  : "Domain",
              "max-age" : "Max-Age",
              "secure"  : "secure",
              "version" : "Version",
              }
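
For readers on current Python, where the Cookie module lives on as
http.cookies, the table above corresponds to the class attribute
Morsel._reserved. The following is only a minimal sketch of the kind of
update being proposed. It assumes a hypothetical subclass named
Morsel2965 and relies on the private attributes _reserved and _flags,
which are CPython implementation details and may change.

```python
from http.cookies import CookieError, Morsel


class Morsel2965(Morsel):
    """Hypothetical Morsel that also accepts RFC 2965 attributes."""

    # Extend the private table of reserved attribute names (assumption:
    # _reserved and _flags are CPython implementation details).
    _reserved = dict(
        Morsel._reserved,
        port="Port",
        discard="Discard",
        commenturl="CommentURL",
    )
    # Discard carries no value, so treat it as a flag like "secure".
    _flags = Morsel._flags | {"discard"}


# The stock Morsel rejects the RFC 2965 attribute name.
try:
    Morsel()["port"] = '"80"'
    stock_accepts_port = True
except CookieError:
    stock_accepts_port = False

morsel = Morsel2965()
morsel.set("session", "abc123", "abc123")
morsel["version"] = "1"
morsel["port"] = '"80"'
morsel["discard"] = True

# Render the cookie under the RFC 2965 header name.
header_line = morsel.output(header="Set-Cookie2:")
print(header_line)
```

This only covers generating the header text; it says nothing about
whether any client will honour it.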

Wait a minute (my memory works slowly), didn't I write you an email
about this a while back? Are you the masochist^H^H^H^H^H^H^H^H^Hguy
who wrote some client-side cookie-handling code for a web-crawler
(called harvestman, IIRC?)?


I discarded my Yahoo email ID because of spam. Sorry for not
replying. I am not that busy a guy; normally I reply to almost all
emails :)
 
JanC

(e-mail address removed) (Anand Pillai) wrote:
Yeah, it has an academic quality to it, not of much practical use maybe
right now, but I cannot understand how it makes the guy a masochist.

JJL did implement it in ClientCookie, so I'm sure he knows by experience
what a masochist is... ;-)
 
John J. Lee

True, I was forced to use RFC 2965.

Who forced you?

And there are more reserved
keywords, such as 'port', 'Discard', etc. Anyway, it is not a bad thing
to update a module.
[...]

What? I'm not certain I understand you, but I'll say again
(especially since my second response to Anand seems to have vanished
again): RFC 2965 is not only defunct as an internet protocol, it has
*never* been used by more than a vanishing fraction of the internet.


John
 
John J. Lee

[reposting since the original seems to have vanished]

It is him all right, and I maintain HarvestMan (capital 'H', capital
'M').
But I don't think Nirmal is a masochist, not in any way.

I don't understand why you called him a "masochist". The reason
we wrote our own cookie-handling module using Python's Cookie module
was that the ClientCookie module was doing too many things, which
we wanted to avoid.

Anand, first, it wasn't said with the *slightest* malice (after all, I
meet my own criterion for masochism)!

Still, I certainly *do* think you're duplicating effort for no obvious
reason. You're quite at liberty to do that, of course, but so am I to
call you a masochist for it ;-)

ClientCookie is exactly what it says: it is a module that acts as
a web client apart from managing cookies. The ClientCookie module
borrows the methods of urllib2, like urlopen(), and manages cookies
under the covers, so to speak. Good design, no doubt, but not what
we wanted.

Well, no. It provides convenient replacements for the urllib2
callables (that's what _urllib2_support.py is for). It certainly
doesn't *require* that you use that urllib2-wrapping stuff. All it
requires is that you give CookieJar.extract_cookies /
.add_cookie_header trivial request and response objects. In fact,
since I happen to know that HarvestMan uses urllib2, I know you
already *have* such objects.

No wrapping involved:

import urllib2, ClientCookie

cj = ClientCookie.CookieJar()
request = urllib2.Request("http://www.example.com/")
cj.add_cookie_header(request)
response = urllib2.urlopen(request)
cj.extract_cookies(response, request)
....etc
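
The same two-call pattern can be tried on current Python, where the
standard library offers http.cookiejar.CookieJar with extract_cookies
and add_cookie_header methods. What follows is a sketch, not
ClientCookie itself: FakeResponse is a hypothetical stand-in that
provides only the info() method the jar needs, so nothing touches the
network.

```python
import email.message
import urllib.request
from http.cookiejar import CookieJar


class FakeResponse:
    """Hypothetical minimal response: only info() is required by the jar."""

    def __init__(self, set_cookie_values):
        self._headers = email.message.Message()
        for value in set_cookie_values:
            # Assigning the same header name repeatedly appends headers.
            self._headers["Set-Cookie"] = value

    def info(self):
        return self._headers


cj = CookieJar()

# First exchange: the server sets a cookie, the jar stores it.
first = urllib.request.Request("http://www.example.com/")
cj.extract_cookies(FakeResponse(["session=abc123; Path=/"]), first)

# Second exchange: the jar adds the matching Cookie header.
second = urllib.request.Request("http://www.example.com/page")
cj.add_cookie_header(second)
cookie_header = second.get_header("Cookie")
print(cookie_header)
```

With a real server, the FakeResponse would simply be the object
returned by urlopen().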


It couldn't really be much simpler! In fact, I see your own
CookieManager.SetCookie and CookieManager.add_cookie_header methods
are directly analogous (if slightly less convenient, less
standards-compliant, more ignorant of the de-facto cookie standard,
and more buggy -- well OK, maybe not more buggy ;-).

Well, OK, it could be simpler if you didn't even have to call those
methods on cj. But of course, that's what the urllib2-replacement
stuff is for, to add entirely automatic cookie handling:

ClientCookie.urlopen(request) # no need for any tiresome method calls


You're entirely free to use or ignore that stuff.

Why does that optional support *wrap* urllib2 instead of *extending*
it with a CookieHandler? Good question. If you want to have the
urllib2 interface with automatic cookie handling in the sense above,
you're currently *forced* to replace (parts of) urllib2 rather than
extend it, due to the current design of urllib2. I hope to change
that with a patch I've submitted and Jeremy Hylton plans to look at in
his Copious Free Time.
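
For comparison, current Python does allow extending rather than
replacing: urllib.request ships an HTTPCookieProcessor handler that can
be passed to build_opener. A minimal sketch, assuming nothing beyond
the standard library and making no request:

```python
import urllib.request
from http.cookiejar import CookieJar

cj = CookieJar()
processor = urllib.request.HTTPCookieProcessor(cj)

# The handler is added alongside the default handlers; nothing in
# urllib.request is wrapped or replaced.
opener = urllib.request.build_opener(processor)
handler_names = [type(h).__name__ for h in opener.handlers]
print(handler_names)

# opener.open("http://www.example.com/") would now store and send
# cookies through cj without any explicit method calls on it.
```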

ClientCookie is something like a wrapper over urllib2 plus cookie
handling. We wanted a module which works *with* urllib2 and does
not wrap it. With all due respect to CC, it cannot be modified
to do that without writing clunky code, which I did not want.

That's just a misunderstanding -- it *has* been that way for a long
time. Just ignore, or throw away, _urllib2_support.py!

Hence Nirmal read the RFCs for cookie handling and implemented

Well, as I tried to tell Nirmal (unfortunately it seems that email
address was dead), and as is explained in tiresome (though far from
exhaustive) detail in the ClientCookie docs, browsers don't actually
implement the RFCs. Well, lynx makes a good attempt at RFC 2109, and
Opera does for 2965, but as long as the big browsers don't implement
them -- and it's almost a certainty they never will -- nobody will
actually be *using* either standard! They're just there to trip you
up <0.7 wink>. As is the original cookie_spec.html, actually: it only
bears a passing resemblance to the de facto standard actually used on
the internet. Of course, as long as your userbase is sufficiently
small (please don't feel insulted: the userbase of my code is
doubtless pretty tiny), you may not notice (though I suspect you
will), but people like Ronald Tschalar, author of a Java library
called HTTPClient, have found themselves spending inordinate amounts of
time fixing problems that arise by trying to implement only the RFCs,
or by naively trying to combine the RFCs with the Netscape protocol.
Of course, you can get around that to a large extent by not bothering
to implement the security rules properly (which may be quite a
reasonable thing to do in some cases).

[...]
Nirmal is talking about RFCs because we want to have a technically
very correct cookie implementation for our module, no matter what web
servers do in their wheels and gears. I know that the Set-Cookie2
header uses the latest cookie RFC, which almost no web server supports,
but still the question should be taken in a spirit of technical
correctness.

Your module claims to implement 2109, officially obsolete. Fine
(though I suspect it's quite far from a full implementation). As for
2965 (the standard that officially obsoletes 2109, and which
ClientCookie does implement in addition to the Netscape protocol),
even David Kristol seems to have given up on it, as has the single guy
who was driving the effort to rescue it from compatibility problems
with the Netscape protocol (Daniel Kian McKiernan). RFC 2965 now
seems utterly dead as an internet protocol.
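
The split between the Netscape protocol and RFC 2965 is still visible
in current Python's http.cookiejar, where DefaultCookiePolicy takes an
rfc2965 switch that is off by default. The sketch below feeds one
Set-Cookie2 header to two jars. FakeResponse and cookies_accepted are
hypothetical helpers for illustration, and no network access is
involved.

```python
import email.message
import urllib.request
from http.cookiejar import CookieJar, DefaultCookiePolicy


class FakeResponse:
    """Hypothetical minimal response carrying a single header."""

    def __init__(self, name, value):
        self._headers = email.message.Message()
        self._headers[name] = value

    def info(self):
        return self._headers


def cookies_accepted(policy):
    """Return how many cookies a jar with this policy keeps."""
    jar = CookieJar(policy)
    request = urllib.request.Request("http://www.example.com/")
    response = FakeResponse("Set-Cookie2", 'session=abc123; Version="1"')
    jar.extract_cookies(response, request)
    return len(jar)


# Default policy: Netscape cookies only, so Set-Cookie2 is ignored.
default_count = cookies_accepted(DefaultCookiePolicy())

# Opting in to RFC 2965 makes the same header acceptable.
rfc2965_count = cookies_accepted(DefaultCookiePolicy(rfc2965=True))

print(default_count, rfc2965_count)
```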

Yeah, it has an academic quality to it, not of much practical use maybe
[...]

As you say.


John
 
