File::SortedSeek not working

worker

Hi, all,
I am using the File::SortedSeek module to search a big data file,
and it is not working.

The data file has these entries:

01/01/1960,0.75
01/02/1960,0.00
etc.

Basically, a date followed by a number.

Then I have another file that contains only the list of valid dates,
as follows:

01/05/1960
01/07/1960

So what I am doing is reading a line from the valid-date file, then using
File::SortedSeek to find the matching date in the data file. But no matter
what I try, it does not match at all.

Is that module real or just a quack?!
thx
bill
 
J. Gleixner

Check line 42.
 
xhoster

It is far more likely that you are a quack who is using the module
incorrectly. If you showed us the code, we might be able to verify
which of the two is the case.

Xho

--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.
 
worker

Hi, I am listing my little test code below; please help. The problem
is that it can't find a match.
I guess the main thing I didn't get right is the pattern that matches
those dates, so that's the part I am seeking help with.
thx.

####### Start test code #####
use File::SortedSeek;
my $dfile = './dtest.file';
my $file = './test.file';

open DTEST,">$dfile" or die "0bad\n";
open TEST, ">$file" or die "bad\n";
print DTEST "01/01/1960\n01/02/1960\n"; close(DTEST);

print TEST
"01/10/1949,0.1\n01/02/1950,0.2\n1/1/1960,7.0\n1/2/1960,8.0\n3/3/1980,9.0\n";
close(TEST);

open DTEST,"<$dfile" or die "0bad\n";
open TEST, "<$file" or die "bad\n";

my $line;
my $zline;
$line = <DTEST>;
chomp ($line);
$tell = File::SortedSeek::alphabetic(*TEST,$line,\&munge_string);

$zline = <TEST>;
print "Found it?:: $zline\n";
close(TEST);close(DTEST);
exit(0);

sub munge_string {
my $line = shift || return undef;
# return ($line =~ m/\|(\w+)$/) ? $1 : undef;
# return ($line =~ m/^[0-9]+\/[0-9]+\/[0-9]+,/) ? $1 : undef;
return ($line =~ m/^\/,/) ? $1 : undef;
}

#### End test code ####
 
xhoster

The example code produced a file like this:

01/10/1949,0.1
01/02/1950,0.2
1/1/1960,7.0
1/2/1960,8.0
3/3/1980,9.0

It is not canonical, as sometimes it is zero-padded and sometimes
it is not. It does appear to be in a reasonable sorted order, but I
don't know if that is by design or by accident. (If you designed something
to sort it properly, why doesn't it canonicalize it while it is at it?)

But it isn't clear if the semantics are day/month/year or month/day/year,
as both are compatible with the given order. I'm assuming day/month/year.

....
$line = <DTEST>;
chomp ($line);
$tell = File::SortedSeek::alphabetic(*TEST,$line,\&munge_string);

The query $line needs to be munged in a way compatible with the munging of
the lines being searched. Otherwise it won't work. The easiest way to do
this is to pass in munge_string($line) rather than $line.

sub munge_string {
my $line = shift || return undef;
# return ($line =~ m/\|(\w+)$/) ? $1 : undef;
# return ($line =~ m/^[0-9]+\/[0-9]+\/[0-9]+,/) ? $1 : undef;
return ($line =~ m/^\/,/) ? $1 : undef;
}

The function has to munge the data in such a way that the lines of the
file being searched are in alphabetic sorted order after the munging. With
the sort order your file already has, it thus has to reorder the fields so
that the most significant (year) comes first.

return ($line =~ m/(^[0-9]+)\/([0-9]+)\/([0-9]+)(,|$)/) ?
sprintf "%04d%02d%02d", $3,$2,$1 : undef;


The (,|$) is so that it will work on the query, which is not followed by a
comma, as well as on the lines of the file being searched.
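
As a rough check of the idea, here is a minimal sketch that uses only core Perl and does not call File::SortedSeek at all. The helper name munge_date is made up for illustration, and the day/month/year field order is the assumption stated above. It munges the sample lines into YYYYMMDD keys, confirms that those keys are already in string-sorted order, and shows that a bare query date munges to the same kind of key.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "D/M/YYYY" or "D/M/YYYY,value" into "YYYYMMDD".
# The day/month/year field order is an assumption, not a known fact
# about the data.
sub munge_date {
    my $line = shift;
    return undef unless defined $line;
    return $line =~ m{^(\d+)/(\d+)/(\d+)(?:,|$)}
        ? sprintf("%04d%02d%02d", $3, $2, $1)
        : undef;
}

# The lines written by the test script earlier in the thread.
my @lines = (
    "01/10/1949,0.1", "01/02/1950,0.2", "1/1/1960,7.0",
    "1/2/1960,8.0",   "3/3/1980,9.0",
);

# Munged keys must be in ascending string order for a sorted search to work.
my @keys   = map { munge_date($_) } @lines;
my @sorted = sort @keys;
print "keys: @keys\n";
print "already in string order: ",
    ("@keys" eq "@sorted" ? "yes" : "no"), "\n";

# The query has no trailing comma, but it munges to the same key format.
my $query_key = munge_date("01/01/1960");
print "query key: $query_key\n";
```

If the check prints "no", the munging does not agree with the file's sort order, and a sorted search over that file should not be expected to find anything.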

Xho

 
worker

Thanks for the explanation.
I made the following changes according to your suggestions:

$aSimDateMunged = &munge_string($aSimDate);
$tell = File::SortedSeek::alphabetic(*FRAW,$aSimDateMunged,
\&munge_string);


return ($line =~ m/(^[0-9]+)\/([0-9]+)\/([0-9]+)(,|$)/) ?
sprintf "%04d%02d%02d", $3,$1,$2 : undef;


my data is in month/day/year.


Once I tried it, this is what I got in response:

Name "main::tell" used only once: possible typo at ztest.pl line 22.

Ark, File::SortedSeek got to EOF
Failed to find: 'SCALAR(0x354c0)'
The search mode for the file was 'Ascending order'
$line: undef
$next: undef
File size: 74 Bytes
$top: 55 Bytes
$bottom: 74 Bytes
Perhaps try reversing the search mode
Are you using the correct method - alhpabetic or numeric?

If you think it is a bug please send a bug report to:
(e-mail address removed)
A sample of the file, the call to this module and
this error message will help to fix the problem
Use of uninitialized value in concatenation (.) or string at ztest.pl line 25, <TEST> line 8.
Found it?::

Help?
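
For reference, the month/day/year ordering described in this post can be checked without the module. The sketch below is only a stand-in: the names munge_mdy and find_line are hypothetical, and the plain in-memory binary search is not File::SortedSeek's code. It just illustrates that once both the query and the data lines are munged to the same YYYYMMDD key, the sample data from the test script can be searched as sorted data.

```perl
use strict;
use warnings;

# Hypothetical helper: "M/D/YYYY" or "M/D/YYYY,value" -> "YYYYMMDD".
sub munge_mdy {
    my $line = shift;
    return undef unless defined $line;
    return $line =~ m{^(\d+)/(\d+)/(\d+)(?:,|$)}
        ? sprintf("%04d%02d%02d", $3, $1, $2)
        : undef;
}

# Plain binary search over an in-memory list, comparing munged keys as
# strings. Illustration only; this is not how the module is implemented.
sub find_line {
    my ($lines, $query) = @_;
    my $want = munge_mdy($query);
    return undef unless defined $want;
    my ($lo, $hi) = (0, $#$lines);
    while ($lo <= $hi) {
        my $mid = int(($lo + $hi) / 2);
        my $key = munge_mdy($lines->[$mid]);
        return undef unless defined $key;    # unparseable line
        if    ($key lt $want) { $lo = $mid + 1 }
        elsif ($key gt $want) { $hi = $mid - 1 }
        else                  { return $lines->[$mid] }
    }
    return undef;                            # no exact match
}

# The lines written by the test script earlier in the thread.
my @lines = (
    "01/10/1949,0.1", "01/02/1950,0.2", "1/1/1960,7.0",
    "1/2/1960,8.0",   "3/3/1980,9.0",
);

for my $query ("01/01/1960", "01/02/1960", "01/05/1960") {
    my $hit = find_line(\@lines, $query);
    print "$query => ", (defined $hit ? $hit : "no match"), "\n";
}
```

With this sample data the first two queries find the 1/1/1960 and 1/2/1960 lines, and 01/05/1960 reports no match because that date is not in the list.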
 
worker

Hello?
could anyone help with this? thx.
 
xhoster

Where did $aSimDate and *FRAW come from? The point of posting test code
is that we can use it for testing. If you keep changing the variable
names, that kind of defeats the purpose.

worker said:
return ($line =~ m/(^[0-9]+)\/([0-9]+)\/([0-9]+)(,|$)/) ?
sprintf "%04d%02d%02d", $3,$1,$2 : undef;

my data is in month/day/year.

once I tried it, this is what I got as response:

Name "main::tell" used only once: possible typo at ztest.pl line 22.

Ark, File::SortedSeek got to EOF
Failed to find: 'SCALAR(0x354c0)'

Are you taking a reference to a scalar someplace that you aren't showing
us?

worker said:
Hello?
could anyone help with this? thx.

When I made the indicated changes to the example you originally provided,
it worked just fine. I don't think you are reliably transferring
code/information back and forth between this forum and your own code, so
there is little more I can do for you.

Xho

 
