Converting a number back to its original string (that was hashed to generate that number)

Ferrous Cranus

Now my website finally works as intended. Just visit the following links, please.
------------------------------------------------------------------------------
1. http://superhost.gr

2. http://superhost.gr/?show=log

3. http://i.imgur.com/89Eqmtf.png (this displays the database's column 'pin', a 5-digit number acting as a filepath indicator)

4. http://i.imgur.com/9js4Pz0.png (this is the detailed information for an html page, identified by the 'pin' column indicator instead of something like '/home/nikos/public_html/index.html')

Isn't it a nice solution?

I believe it is.

But what happens with http://superhost.gr/?show=stats

I just see the number that corresponds to a specific html page, and hence I need to convert that number back to its original string.

# ==========================================================
# generating a 5-digit integer based on the filepath, to identify the current html page
# ==========================================================

pin = int( htmlpage.encode("hex"), 16 ) % 100000

Now I need the opposite procedure. Will hex.decode(number) convert back to the original string?

I think not, because this isn't a hash procedure.
But how can it be done then?
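
To make the question concrete, here is a minimal sketch in Python 3 (the line above is Python 2). It assumes `int.from_bytes` as the equivalent of `int(htmlpage.encode("hex"), 16)`; the helper names `path_to_int` and `int_to_path` are made up for illustration. The big integer can be turned back into the string, but the `% 100000` step cannot be undone, because infinitely many integers share the same remainder.

```python
def path_to_int(path):
    """Encode the path's bytes as one big integer (no information is lost)."""
    return int.from_bytes(path.encode("utf-8"), "big")


def int_to_path(number):
    """Reverse path_to_int: turn the big integer back into the string."""
    length = (number.bit_length() + 7) // 8
    return number.to_bytes(length, "big").decode("utf-8")


htmlpage = "/home/nikos/public_html/index.html"
big = path_to_int(htmlpage)

# The encoding step alone is reversible.
print(int_to_path(big) == htmlpage)

# The modulo step is not: a different integer gives exactly the same pin.
pin = big % 100000
print((big + 100000) % 100000 == pin)
```

Both lines print `True`: the first because the full integer still holds every byte of the path, the second because adding any multiple of 100000 leaves the pin unchanged.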
 
Lele Gaifax

Ferrous Cranus said:
pin = int( htmlpage.encode("hex"), 16 ) % 100000

Now I need the opposite procedure.

As already said several times by different people in this thread, there
is no way you can get back the original string that produced a particular
“pin”: the function you are using is “lossy”, that is, information gets
lost in order to reduce a BIG string into a SMALL five-digit integer
number.

Ferrous Cranus said:
Will hex.decode(number) convert back to the original string?

NO. As people have demonstrated to you, you are going to meet collisions very
fast if you insist on going this way (even if you thought of a “smarter” way to
get a checksum out of your string by using a different weight for the
single characters, there is still a high chance of collisions, not
counting the final “modulo” operation). Once you get such a collision,
there is not enough information in that single tiny number to get back
the single string that generated it.
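
The collision argument can be checked directly. The following is only a sketch in Python 3: `make_pin` mirrors the pin formula quoted above, while `first_collision` and the generated `pageN.html` paths are hypothetical, invented for the demonstration. Since only 100000 pins exist, 100001 distinct paths are guaranteed to contain at least one collision, and in practice one shows up far earlier.

```python
def make_pin(path):
    """Python 3 version of: int(path.encode("hex"), 16) % 100000"""
    return int.from_bytes(path.encode("utf-8"), "big") % 100000


def first_collision(paths):
    """Return (earlier_path, later_path, pin) for the first repeated pin."""
    seen = {}  # pin -> first path that produced it
    for path in paths:
        pin = make_pin(path)
        if pin in seen:
            return seen[pin], path, pin
        seen[pin] = path
    return None


# Hypothetical file names; 100001 of them cannot all have different pins.
paths = ("/home/nikos/public_html/page%d.html" % i for i in range(100001))
first, second, pin = first_collision(paths)
print("%s and %s both map to pin %d" % (first, second, pin))
```

Given only that pin, no code can tell which of the two paths was meant.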

Imagine that, instead of using an integer checksum of your full path,
you “shrink” it by replacing each name in the path with its starting
letter, that is:

/home/ferrous/public_html/index.html => /h/f/p/i

That is evidently way shorter than the original, but you LOST information,
and you cannot write code in any language that reproduces the
original from it.

The only way out is either to use the fullpath as the primary key of your
table, or to use a mapping table with a bidirectional one-to-one mapping
between each fullpath and the corresponding "short" integer value.
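
A mapping table of that kind can be sketched in plain Python with two dictionaries. This is an illustration only: the class name `PageRegistry` and its methods are hypothetical, and the mapping lives in memory, so a real site would still have to persist it somewhere (for example in the database).

```python
class PageRegistry:
    """Two-way, one-to-one mapping between full paths and short integers."""

    def __init__(self):
        self._pin_by_path = {}
        self._path_by_pin = {}

    def pin_for(self, path):
        """Return the pin for path, assigning the next free number if new."""
        if path not in self._pin_by_path:
            pin = len(self._pin_by_path) + 1
            self._pin_by_path[path] = pin
            self._path_by_pin[pin] = path
        return self._pin_by_path[path]

    def path_for(self, pin):
        """Return the path registered under pin (KeyError if unknown)."""
        return self._path_by_pin[pin]


registry = PageRegistry()
pin = registry.pin_for("/home/ferrous/public_html/index.html")
print(pin, registry.path_for(pin))
```

The number carries no information about the path by itself; the path can be recovered only because the table remembers it.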

ciao, lele.
 
Ferrous Cranus

Please DON'T tell me to save both the pin <=> filepath and associate them (that can be done by SQL commands, I know).
I will not create any kind of primary/unique keys in the database.
I will not store the filepath in the database, just the number which indicates the filepath (html page).
Also no external table associating filepaths and numbers.
I want this to be solved only by Python code, not database oriented.

That is: I need to be able to map both ways, in a one-to-one relation, 5-digit-integer <=> string

int( hex ( string ) ) can encode a string to a number. Can this be decoded back? I guess that it can also be decoded/converted back, because it's not losing any information. It's encoding, not compressing.

But it's the % modulo that breaks the back-and-forth association.

So, the question is:

HOW to map both ways, in a one to one relation, (5-digit-integer <=> string) without losing any information?
 
Ferrous Cranus

On Wednesday, 23 January 2013 3:58:45 PM UTC+2, Dave Angel wrote:
Simple. Predefine the 100,000 legal strings, and don't let the user use
anything else. One way to do that would be to require a path string of
no more than 5 characters, and require them all to be of a restricted
alphabet of 10 characters. (eg. the alphabet could be 0-9, which is
obvious, or it could be ".aehilmpst" (no uppercase, no underscore, no
digits, no non-ascii, etc.))

In the realistic case of file paths or URLs, it CANNOT be done.
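
The restricted-alphabet idea quoted above can be sketched as follows. One assumption goes beyond the quote: names are required to be exactly 5 characters long (rather than "no more than 5"), because shorter names would push the count past 100,000 and break the one-to-one relation. The function names `name_to_pin` and `pin_to_name` are hypothetical.

```python
ALPHABET = ".aehilmpst"  # the 10 allowed characters, used as digits 0-9
LENGTH = 5               # assumption: every name is exactly 5 characters


def name_to_pin(name):
    """Read the name as a 5-digit base-10 number over ALPHABET."""
    if len(name) != LENGTH:
        raise ValueError("name must be exactly %d characters" % LENGTH)
    pin = 0
    for ch in name:
        # str.index raises ValueError for characters outside the alphabet
        pin = pin * 10 + ALPHABET.index(ch)
    return pin


def pin_to_name(pin):
    """Reverse name_to_pin."""
    if not 0 <= pin < 10 ** LENGTH:
        raise ValueError("pin out of range")
    chars = []
    for _ in range(LENGTH):
        pin, digit = divmod(pin, 10)
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars))


print(name_to_pin("email"))  # → 26145
print(pin_to_name(26145))    # → email
```

This works only because there are exactly 100,000 possible names, one per pin. An ordinary file path does not fit in that space.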

OK, it's not doable. I'll stop asking for it.
CHANGE of plans.
I will use the database solution, which is the easiest way to do it:

============================================================

import MySQLdb

# insert new page record in table counters or update it if already exists
try:
    cursor.execute( '''INSERT INTO counters(page, hits) VALUES(%s, %s)
                       ON DUPLICATE KEY UPDATE hits = hits + 1''', (htmlpage, 1) )
except MySQLdb.Error as e:
    print( "Query Error %d: %s" % (e.args[0], e.args[1]) )

# update existing visitor record if same pin and same host found
# ('page' is supposed to hold the pin of the current page)
try:
    cursor.execute( '''UPDATE visitors SET hits = hits + 1, useros = %s, browser = %s, date = %s WHERE pin = %s AND host = %s''', (useros, browser, date, page, host) )
except MySQLdb.Error as e:
    print( "Error %d: %s" % (e.args[0], e.args[1]) )

# insert new visitor record if above update did not affect a row
if cursor.rowcount == 0:
    cursor.execute( '''INSERT INTO visitors(hits, host, useros, browser, date) VALUES(%s, %s, %s, %s, %s)''', (1, host, useros, browser, date) )

============================================================

I can INSERT a row into the table "counters".
I cannot UPDATE or INSERT into the table "visitors" without knowing the "pin" primary key number the database created.

Can you help with this, please?
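
One possible approach, sketched under several assumptions: after the page row exists, read its pin back with a SELECT on the page and use that value for the visitors table. To keep the sketch self-contained it uses Python's built-in sqlite3 module instead of MySQLdb (so placeholders are `?` rather than `%s`), the schema shown is hypothetical and reduced to the columns that matter here, and `record_hit` is a made-up helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical schema: the database assigns pin, page is unique.
cur.execute("""CREATE TABLE counters (
                   pin  INTEGER PRIMARY KEY AUTOINCREMENT,
                   page TEXT UNIQUE NOT NULL,
                   hits INTEGER NOT NULL DEFAULT 1)""")
cur.execute("""CREATE TABLE visitors (
                   pin  INTEGER NOT NULL,
                   host TEXT NOT NULL,
                   hits INTEGER NOT NULL DEFAULT 1)""")


def record_hit(cur, page, host):
    """Count a hit for page and host; return the pin the database assigned."""
    # Update the page counter, or create the row on the first visit.
    cur.execute("UPDATE counters SET hits = hits + 1 WHERE page = ?", (page,))
    if cur.rowcount == 0:
        cur.execute("INSERT INTO counters (page, hits) VALUES (?, 1)", (page,))

    # Ask the database which pin belongs to this page.
    cur.execute("SELECT pin FROM counters WHERE page = ?", (page,))
    pin = cur.fetchone()[0]

    # With the pin known, the visitors table can be updated or filled.
    cur.execute("UPDATE visitors SET hits = hits + 1 WHERE pin = ? AND host = ?",
                (pin, host))
    if cur.rowcount == 0:
        cur.execute("INSERT INTO visitors (pin, host, hits) VALUES (?, ?, 1)",
                    (pin, host))
    return pin


pin = record_hit(cur, "/home/nikos/public_html/index.html", "example.org")
conn.commit()
```

The same SELECT should carry over to MySQLdb with `%s` placeholders; the lookup direction pin => page for the stats view would be the mirror query, `SELECT page FROM counters WHERE pin = ...`.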
 
 
