base-96

Kless

I think it would be very interesting if Python had a module for
working with base 96 too. [1]

It could be used to convert to base 96 the digests from the hashlib
module and the random bytes used in crypto (to create the salt, the IV,
or a key).

As you can see here [2], there are 94 printable ASCII characters
(decimal codes 33-126). So it only remains to add 2 more characters:
the space (code 32) and, last, one non-printable char (which doesn't
create any problem).


[1] http://svn.python.org/view/python/trunk/Modules/binascii.c
[2] http://en.wikipedia.org/wiki/ISO/IEC_8859-1
 
Tim Roberts

Kless said:
I think it would be very interesting if Python had a module for
working with base 96 too. [1]

Well, then, write one.

However, I'm not sure I see the point. Base 64 is convenient because each
character encodes exactly 6 bits, so 3 bytes translate exactly to 4
characters. With base 96, you would end up doing division instead of just
shifting and masking; the conversion isn't as "neat".
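
To make the contrast concrete, here is a minimal Python sketch. The helper names are made up for illustration and are not part of any standard module. Three bytes split into four 6-bit indices with nothing but shifts and masks, while base-96 digits have to be peeled off by repeated division:

```python
def base64_indices(three_bytes):
    """Split exactly 3 bytes into four 6-bit indices using shifts and masks."""
    b0, b1, b2 = three_bytes
    n = (b0 << 16) | (b1 << 8) | b2
    return [(n >> 18) & 0x3F, (n >> 12) & 0x3F, (n >> 6) & 0x3F, n & 0x3F]


def base96_digits(value, count):
    """Split an integer into `count` base-96 digits, most significant first.

    96 is not a power of two, so each digit costs a division.
    """
    digits = []
    for _ in range(count):
        value, digit = divmod(value, 96)
        digits.append(digit)
    return digits[::-1]


print(base64_indices(b"abc"))         # [24, 22, 9, 35]
print(base96_digits(5 * 96 + 7, 2))   # [5, 7]
```

The indices 24, 22, 9, 35 are the positions of "Y", "W", "J", "j" in the Base64 alphabet, which matches the usual encoding of "abc" as "YWJj".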

Kless said:
As you can see here [2], there are 94 printable ASCII characters
(decimal codes 33-126). So it only remains to add 2 more characters:
the space (code 32) and, last, one non-printable char (which doesn't
create any problem).

This leaves some tricky issues. How will you denote the end of a base 96
sequence? If every printable character can be part of the ciphertext, what
can you use as an end marker or a padding character?
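
For comparison, Base64 gets around this by reserving "=", which lies outside its 64-character alphabet, as the padding character. A quick check with the standard base64 module shows it appearing whenever the input length is not a multiple of 3:

```python
import base64

# Inputs of 1, 2 and 3 bytes: only the last fills a whole 3-byte group.
for data in (b"a", b"ab", b"abc"):
    print(data, base64.b64encode(data))
# b'a' b'YQ=='
# b'ab' b'YWI='
# b'abc' b'YWJj'
```

A base-96 alphabet that uses every printable character has no such spare symbol left over.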
 
Tim Roberts

Kless said:
I think it would be very interesting if Python had a module for
working with base 96 too. [1]

It could be used to convert to base 96 the digests from the hashlib
module and the random bytes used in crypto (to create the salt, the IV,
or a key).

As you can see here [2], there are 94 printable ASCII characters
(decimal codes 33-126). So it only remains to add 2 more characters:
the space (code 32) and, last, one non-printable char (which doesn't
create any problem).

Whether it creates problems depends on how you intend to use it. The
biggest use for Base64, for instance, is in translating binary files to a
form where they can be sent via email using only printable characters. If
you use a non-printable character, that's a problem for email.

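With Base64, 3 bytes become 4 characters. With Base96, each character
carries about 6.58 bits, so 9 bytes fit in 11 characters (5 bytes do not
quite fit in 6, because 96^6 is smaller than 256^5). So, you would reduce
the conversion penalty from 1.33 down to about 1.22.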
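
The expansion figures can be checked with exact integer arithmetic. In the sketch below, `chars_needed` is a hypothetical helper name; it finds the smallest number of digits in a given base that can represent every possible value of n bytes:

```python
def chars_needed(num_bytes, base=96):
    """Smallest digit count in `base` that can hold any `num_bytes`-byte value."""
    limit = 256 ** num_bytes   # number of distinct values to represent
    count, capacity = 0, 1
    while capacity < limit:
        capacity *= base
        count += 1
    return count


# Columns: bytes in, Base64 characters, Base96 characters.
for n in (3, 4, 5, 9):
    print(n, chars_needed(n, 64), chars_needed(n, 96))
# 3 4 4
# 4 6 5
# 5 7 7
# 9 12 11
```

By this count 4 bytes fit in 5 base-96 characters and 9 bytes fit in 11, while 5 bytes need 7 because 96**6 is less than 256**5. The Base64 column here ignores "=" padding, which always rounds the output up to a multiple of 4.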

It's not hard to write modules to translate from binary to Base96 and back
again, and doing so would be a great exercise to explore the issues in this
kind of encoding.
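
As a starting point for that exercise, here is a minimal pure-Python sketch. Everything in it is an assumption made for illustration: the names `b96encode` and `b96decode`, the alphabet (code points 32 to 127, i.e. space, the 94 printable characters, and DEL as the one non-printable), and the choice to treat the whole input as one big integer so that the output length alone determines the byte count and no padding character is needed.

```python
import hashlib

# Assumed alphabet: the 96 code points 32..127 (space, printable ASCII, DEL).
ALPHABET = bytes(range(32, 128))
BASE = len(ALPHABET)  # 96


def _chars_for(num_bytes):
    """Smallest number of base-96 digits that can hold any num_bytes-byte value."""
    limit = 256 ** num_bytes
    count, capacity = 0, 1
    while capacity < limit:
        capacity *= BASE
        count += 1
    return count


def _bytes_for(num_chars):
    """Largest byte count whose every value fits in num_chars base-96 digits."""
    capacity = BASE ** num_chars
    num_bytes = 0
    while 256 ** (num_bytes + 1) <= capacity:
        num_bytes += 1
    return num_bytes


def b96encode(data):
    """Encode bytes as a fixed-width base-96 number (most significant digit first)."""
    value = int.from_bytes(data, "big")
    out = bytearray()
    for _ in range(_chars_for(len(data))):
        value, digit = divmod(value, BASE)
        out.append(ALPHABET[digit])
    out.reverse()
    return bytes(out)


def b96decode(text):
    """Invert b96encode; the byte count is recovered from the text length."""
    value = 0
    for ch in text:
        value = value * BASE + ALPHABET.index(ch)  # ValueError on foreign bytes
    return value.to_bytes(_bytes_for(len(text)), "big")


digest = hashlib.sha256(b"example").digest()
encoded = b96encode(digest)
assert b96decode(encoded) == digest
print(len(digest), len(encoded))  # 32 39
```

A 32-byte digest comes out as 39 characters here, against 44 for Base64. The price is big-integer division over the whole input, which is fine for digests, salts and keys but would scale poorly for large files; a block-based scheme would be the obvious next thing to try.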
 
Kless

Tim Roberts said:
This leaves some tricky issues. How will you denote the end of a base 96
sequence? If every printable character can be part of the ciphertext, what
can you use as an end marker or a padding character?

Well, a Unicode (UTF-8) character, which isn't in the ASCII set, could be
used if it isn't possible to use a non-printable char.
 
Kless

Tim Roberts said:
Whether it creates problems depends on how you intend to use it. The
biggest use for Base64, for instance, is in translating binary files to a
form where they can be sent via email using only printable characters. If
you use a non-printable character, that's a problem for email.

Tests would have to be made; it's possible that non-printable chars
cause no problem at all.

Tim Roberts said:
It's not hard to write modules to translate from binary to Base96 and back
again, and doing so would be a great exercise to explore the issues in this
kind of encoding.

Yes, it's easy in Python, but the ideal would be to do the arithmetic
in C, as is done for base 64.
 
