TypeError: can't pickle HASH objects?

est

import md5
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python25\lib\pickle.py", line 1366, in dumps
    Pickler(file, protocol).dump(obj)
  File "C:\Python25\lib\pickle.py", line 224, in dump
    self.save(obj)
  File "C:\Python25\lib\pickle.py", line 306, in save
    rv = reduce(self.proto)
  File "C:\Python25\lib\copy_reg.py", line 69, in _reduce_ex
    raise TypeError, "can't pickle %s objects" % base.__name__
TypeError: can't pickle HASH objects

Why can't I pickle an md5 object? Is it because the md5 algorithm needs to
read 512 bits at a time?

I need to md5() some stream, pause (python.exe quits), and resume
later. It seems that the md5 and hashlib objects in the standard library
cannot be serialized?

Do I have to implement the md5 algorithm again for this special occasion?

Or is there any way to assign a digest when creating md5 objects?
 
mdsherry

I'm sure some of the regulars can correct me if I'm wrong, but looking
at the source code, it seems that this is the error that you'll see if
the object doesn't explicitly support pickling, or possibly isn't
composed of objects that do.

Examining the md5 and hashlib source files, it seems that they rely on
C implementations, and so have internal states opaque to Python. If
you feel confident, you could write your own MD5 class that would have
methods to dump and restore state, but I think you're out of luck when
it comes to the official module.

Mark Sherry
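
To make the "write your own MD5 class" idea concrete, here is a minimal sketch of a pure-Python MD5 whose unfinished state is plain data that pickle can handle. It is written for Python 3 rather than the 2.5 used in this thread, the names ResumableMD5 and get_state are made up for illustration, and it will be far slower than the C module, so treat it as a starting point rather than a drop-in replacement.

```python
import hashlib
import math
import pickle
import struct

# Per-step rotation amounts and sine-derived constants from RFC 1321.
_S = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
_K = [int(abs(math.sin(i + 1)) * 2**32) & 0xFFFFFFFF for i in range(64)]
_INIT = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)


def _compress(abcd, block):
    """Mix one 64-byte block into the four 32-bit chaining words."""
    m = struct.unpack("<16I", block)
    a, b, c, d = abcd
    for i in range(64):
        if i < 16:
            f, g = (b & c) | (~b & d), i
        elif i < 32:
            f, g = (d & b) | (~d & c), (5 * i + 1) % 16
        elif i < 48:
            f, g = b ^ c ^ d, (3 * i + 5) % 16
        else:
            f, g = c ^ (b | ~d), (7 * i) % 16
        x = (a + f + _K[i] + m[g]) & 0xFFFFFFFF
        rotated = ((x << _S[i]) | (x >> (32 - _S[i]))) & 0xFFFFFFFF
        a, d, c, b = d, c, b, (b + rotated) & 0xFFFFFFFF
    return tuple((x + y) & 0xFFFFFFFF for x, y in zip(abcd, (a, b, c, d)))


class ResumableMD5:
    """Hypothetical helper: MD5 with an explicit, picklable state tuple."""

    def __init__(self, state=None):
        # state is (chaining words, total bytes fed so far, unprocessed tail)
        self.abcd, self.length, self.tail = state or (_INIT, 0, b"")

    def get_state(self):
        return (self.abcd, self.length, self.tail)

    def update(self, data):
        self.length += len(data)
        buf = self.tail + data
        full = len(buf) - len(buf) % 64
        for off in range(0, full, 64):
            self.abcd = _compress(self.abcd, buf[off:off + 64])
        self.tail = buf[full:]  # keep the incomplete block for later

    def hexdigest(self):
        # Pad a copy, so this object can keep receiving data afterwards.
        final = ResumableMD5(self.get_state())
        pad = b"\x80" + b"\x00" * ((55 - self.length) % 64)
        final.update(pad + struct.pack("<Q", (self.length * 8) % 2**64))
        return struct.pack("<4I", *final.abcd).hex()


h = ResumableMD5()
h.update(b"this is a ")
saved = pickle.dumps(h.get_state())  # write this to disk before quitting

resumed = ResumableMD5(pickle.loads(saved))  # later, in a new process
resumed.update(b"short test")
# Compare against the C implementation as a sanity check.
print(resumed.hexdigest() == hashlib.md5(b"this is a short test").hexdigest())
```

The state is deliberately a tuple of ints and bytes, so it could just as well be stored with struct or JSON instead of pickle.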
 
Aaron "Castironpi" Brady

Can you just pickle the stream, the part of it you've read so far?
 
est

Wow. It's a gigabyte-sized file. I need to read it as a stream and md5 it
as I go, and the job may be interrupted for a while.
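
For comparison, the "remember what you have read so far" approach only needs the byte offset to be saved, at the price of re-reading and re-hashing that prefix after a restart, which is exactly the cost that hurts with a gigabyte-sized file. A rough Python 3 sketch, assuming a seekable stream; hash_prefix, finish and CHUNK are hypothetical names, and io.BytesIO stands in for the real file:

```python
import hashlib
import io

CHUNK = 64 * 1024  # hypothetical read size


def hash_prefix(stream, nbytes, chunk=CHUNK):
    """Rebuild an md5 object by re-reading the first nbytes of the stream."""
    h = hashlib.md5()
    stream.seek(0)
    remaining = nbytes
    while remaining > 0:
        block = stream.read(min(chunk, remaining))
        if not block:
            break  # stream is shorter than the saved offset
        h.update(block)
        remaining -= len(block)
    return h


def finish(stream, h, chunk=CHUNK):
    """Feed the rest of the stream, from its current position, into h."""
    for block in iter(lambda: stream.read(chunk), b""):
        h.update(block)
    return h.hexdigest()


data = io.BytesIO(b"this is a short test")
offset = 10                    # the only thing saved before the pause
h = hash_prefix(data, offset)  # after the restart: rebuild the digest state
print(finish(data, h))
```

Nothing about the digest itself is serialized here, so this works with any hash object, but the resume cost grows with the amount already processed.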
 
Aaron "Castironpi" Brady

est wrote:

No, I need to serialize the half-finished digest, not the file stream.

Does anyone have a solution?

I am looking at '_hashopenssl.c'. If you can find the implementation
of EVP_DigestUpdate, I'll give it a shot to help you write a ctypes
hack to store and write its state.
 
est

http://cvs.openssl.org/fileview?f=openssl/crypto/evp/digest.c

int EVP_DigestUpdate(EVP_MD_CTX *ctx, const void *data,
                     size_t count)
{
#ifdef OPENSSL_FIPS
    FIPS_selftest_check();
#endif
    return ctx->digest->update(ctx, data, count);
}


Is this the one?
 
Gabriel Genellina

Yep, they're implemented in C and have no provision for serializing.
If you can use the old _md5 module, it is far simpler to serialize; a
md5object just contains a small struct with 6 integers and 64 chars, no
pointers.

With some help from ctypes (and a lot of black magic!) one can extract the
desired state, and restore it afterwards:

--- begin code ---
import _md5
import ctypes

# 8 bytes of object header + 88 bytes of MD5 state (32-bit Python 2.5 build)
assert _md5.MD5Type.__basicsize__==96

def get_md5_state(m):
    # Copy the raw 88-byte state that follows the object header.
    if type(m) is not _md5.MD5Type:
        raise TypeError('not a _md5.MD5Type instance')
    return ctypes.string_at(id(m)+8, 88)

def set_md5_state(m, state):
    # Overwrite the state of m with bytes saved by get_md5_state.
    if type(m) is not _md5.MD5Type:
        raise TypeError('not a _md5.MD5Type instance')
    if not isinstance(state,str):
        raise TypeError('state must be str')
    if len(state)!=88:
        raise ValueError('len(state) must be 88')
    a88 = ctypes.c_char*88
    pstate = a88(*list(state))
    ctypes.memmove(id(m)+8, ctypes.byref(pstate), 88)

--- end code ---

py> m1 = _md5.new()
py> m1.update("this is a ")
py> s = get_md5_state(m1)
py> del m1
py>
py> m2 = _md5.new()
py> set_md5_state(m2, s)
py> m2.update("short test")
py> print m2.hexdigest()
95ad1986e9a9f19615cea00b7a44b912
py> print _md5.new("this is a short test").hexdigest()
95ad1986e9a9f19615cea00b7a44b912

The code above was only tested with Python 2.5.2 on Windows, not more than
you can see. It might or might not work with other versions or platforms.
It may even create a (small) black hole and eat your whole town. Use at
your own risk.
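
To see what those 88 bytes are likely to hold, here is a small Python 3 sketch that unpacks them with struct. The field order (two counter words, four digest words, a 64-byte buffer) and the little-endian layout are assumptions based on the "6 integers and 64 chars" description and the md5.h bundled with Python 2.5; describe_md5_state is a hypothetical helper, and the sample state is built by hand here, not captured from a real interpreter.

```python
import struct

# Assumed layout: count[2] (bit count, low word first), abcd[4], buf[64].
STATE_FORMAT = "<2I4I64s"
assert struct.calcsize(STATE_FORMAT) == 88


def describe_md5_state(state):
    """Split a saved 88-byte state into its (assumed) fields."""
    count_lo, count_hi, a, b, c, d, buf = struct.unpack(STATE_FORMAT, state)
    bits = (count_hi << 32) | count_lo
    pending = (bits // 8) % 64  # bytes waiting for a full 64-byte block
    return {"bits_hashed": bits, "abcd": (a, b, c, d), "pending": buf[:pending]}


# Hand-built state for the 10 bytes "this is a ": no full block has been
# processed yet, so the digest words still hold their initial values.
example = struct.pack(
    STATE_FORMAT, 80, 0,
    0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476,
    b"this is a ",
)
print(describe_md5_state(example))
```

If the layout assumption holds, this also explains why the trick is so fragile: the offset of 8 and the size of 88 are tied to one build's object header and struct definition.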
 
Aaron "Castironpi" Brady

Oops, I needed 'EVP_MD_CTX'. I went Googling and found it.

http://www.google.com/codesearch?hl...p&cs_f=cyassl-0.8.5/include/openssl/evp.h#l51

But does Gabriel's work for you?
 
est

Gabriel Genellina wrote (in reply to est's post of Wed, 01 Oct 2008 16:50:05 -0300):

With some help from ctypes (and a lot of black magic!) one can extract the
desired state, and restore it afterwards:

[code and disclaimer snipped]
WOW! I never expected Python could be coded like that! Thanks a lot!


I still need some hack for py2.5 on Linux. Maybe I just need to soft-link
/usr/lib/python2.4/lib-dynload/md5.so into py2.5 :)
py2.4 is pre-installed on most of the servers, I think.
 
