Using filepath method to identify an .html page

Ferrous Cranus

On Tuesday, 22 January 2013 9:16:34 PM UTC+2, Peter Otten wrote:
Ferrous Cranus wrote:


On Tuesday, 22 January 2013 6:11:20 PM UTC+2, Chris Angelico wrote:


all of it. You are asking something that is fundamentally
impossible[1]. There simply are not enough numbers to go around.


Fundamentally impossible?

OK: How about this in Perl:
$ cat testMD5.pl
use strict;

foreach my $url(qw@ /index.html /about/time.html @){
    hashit($url);
}

sub hashit {
    my $url=shift;
    my @ltrs=split(//,$url);   # one element per character
    my $hash = 0;

    # add up the character codes, keeping the result below 10000
    foreach my $ltr(@ltrs){
        $hash = ( $hash + ord($ltr)) %10000;
    }
    printf "%s: %0.4d\n",$url,$hash;
}

which yields:
$ perl testMD5.pl
/index.html: 1066
/about/time.html: 1547



$ cat clashes.pl
use strict;

foreach my $url(qw@
    /public/fails.html
    /large/cannot.html
    /number/being.html
    /hope/already.html
    /being/really.html
    /index/breath.html
    /can/although.html
@){
    hashit($url);
}

sub hashit {
    my $url=shift;
    my @ltrs=split(//,$url);
    my $hash = 0;

    foreach my $ltr(@ltrs){
        $hash = ( $hash + ord($ltr)) %10000;
    }
    printf "%s: %0.4d\n",$url,$hash
}

$ perl clashes.pl
/public/fails.html: 1743
/large/cannot.html: 1743
/number/being.html: 1743
/hope/already.html: 1743
/being/really.html: 1743
/index/breath.html: 1743
/can/although.html: 1743
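
For readers following along in Python, the same additive checksum can be sketched as below. This is only a transliteration of the Perl above, not code posted in the thread; the function name `additive_hash` and the `modulus` parameter are made up for illustration.

```python
def additive_hash(path: str, modulus: int = 10000) -> int:
    """Sum the character codes of `path`, reduced modulo `modulus`."""
    total = 0
    for ch in path:
        total = (total + ord(ch)) % modulus
    return total


# The seven paths from clashes.pl
paths = [
    "/public/fails.html",
    "/large/cannot.html",
    "/number/being.html",
    "/hope/already.html",
    "/being/really.html",
    "/index/breath.html",
    "/can/although.html",
]

for p in paths:
    print(f"{p}: {additive_hash(p):04d}")
```

Because addition ignores order, any two paths whose character codes add up to the same total collide, and all seven paths here total 1743.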



Hm, I must be holding it wrong...

my @i = split(//,$url); # put each letter in its own bin
my $j=0; # Initialize our
my $k=1; # hashing increment values
my @m=(); # workspace
foreach my $n(@i){
my $q=ord($n); # ASCII code for the character
$k += $j; # Increment our hash offset
$q += $k; # add the offset to the character code
$j = $k; # store that.
push @m,$q; # save the offset value
}

my $hashval=0; # initialize our hash value
# Generate that: sum the offset values modulo 10000
map { $hashval = ($hashval + $_) % 10000} @m;


Using that method ABC.html and CBA.html now have different values because each letter position's value gets bumped up increasingly from left to right.
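
A Python transliteration of the fragment above makes the claim easy to check. This is a sketch, not code from the thread; `offset_hash`, `rolling_hash` and the multiplier 31 are illustrative assumptions. Read literally, the offsets added at each position are 1, 2, 4, 8, ... and depend only on the position, so their total depends only on the length of the string. Two strings with the same letters in a different order therefore still get the same value.

```python
def offset_hash(path: str, modulus: int = 10000) -> int:
    """Literal transliteration of the Perl fragment (names are hypothetical)."""
    j, k = 0, 1
    shifted = []
    for ch in path:
        k += j                       # offset doubles: 1, 2, 4, 8, ...
        shifted.append(ord(ch) + k)  # character code plus positional offset
        j = k
    hashval = 0
    for value in shifted:
        hashval = (hashval + value) % modulus
    return hashval


def rolling_hash(path: str, modulus: int = 10000) -> int:
    """Order-sensitive variant for comparison (an assumption, not from the thread)."""
    hashval = 0
    for ch in path:
        hashval = (hashval * 31 + ord(ch)) % modulus
    return hashval


print(offset_hash("ABC.html"), offset_hash("CBA.html"))
print(rolling_hash("ABC.html"), rolling_hash("CBA.html"))
```

Multiplying the running value before adding each character, as in `rolling_hash`, does make the order matter for this pair. It still cannot avoid collisions in general, since only 10000 distinct results are available.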
 

Michael Torrie

<some perl code>
Using that method ABC.html and CBA.html now have different values
because each letter position's value gets bumped up increasingly from
left to right.

You have run this little "hash" algorithm on a whole bunch of files, say
C:\windows\system32 right? And how many collisions did you get?

You've already rejected using the file path or url as a key because it
could change. Why are you wanting to do this hash based on the file's
path or url anyway?
 
Ferrous Cranus

On Wednesday, 23 January 2013 5:25:36 PM UTC+2, Michael Torrie wrote:
You have run this little "hash" algorithm on a whole bunch of files, say
C:\windows\system32 right? And how many collisions did you get?

You've already rejected using the file path or url as a key because it
could change. Why are you wanting to do this hash based on the file's
path or url anyway?

No, it's inevitable; something must remain the same.

Filepath *must* be used.

Can you transliterate this code to Python code please?
 
Leonard, Arah

"his quote string is Cyrillic"?
If you're referring to the "Τη Τρίτη, 22 Ιανουαρίου 2013 6:23:16 μ.μ.
UTC+2, ο χρήστης Leonard, Arah έγραψε", that's Greek.

Cyrillic or not, it's all Greek to me. ;)
 
Mark Lawrence

3) This is a Python-specific resource and that's not even Python code. What next? Javascript? Ada? Fortran? COBOL? 8-bit x86 assembly with minimal comments written in Esperanto?

Please can we have CORAL 66 mentioned on the odd occasion.
4) The novelty of the entertainment resulting from this perversity has waned, even for me. The educational aspect to novice programmers has likewise run dry. I've now officially grown bored of your game and am joining everyone else who already has already gotten off of this kiddie ride. Congratulations on beating a dead horse into mince-meat and successfully milking the one-uddered cow until the pale is full. I hope that you enjoyed your meal.

Pail not pale :)
Or to borrow a phrase, "I say GOOD DAY, sir!"

Or madam?
 
