reliability problem with Finance::QuoteHist::Yahoo

Ted

I have been able to get this package to work sometimes. However, it
invariably fails part way through my script, after downloading data
for between one and four ticker symbols. The error is:

Can't use an undefined value as an ARRAY reference at C:/Perl/site/lib/Finance/QuoteHist/Generic.pm line 863.

I have no information about how an undefined value is being passed
deep into the code of a package I didn't write. I took a quick look
into Generic.pm and Yahoo.pm, but found no enlightenment so far
(there's a lot of code there).

Sometimes it arises when quotes() is executed, and at other times, it
occurs when dividends() is executed.

The script I am using is appended below. There isn't anything special
about it. Much of it I copied from example code in the documentation
provided for the two finance packages I'm trying. About all I did was
merge two example scripts, and attempt to get the historical quote
data for one symbol at a time. So far, I have historical quote data
for four stocks. There are over 7000 that I need to retrieve, which is
obviously not manageable by downloading each one manually.

Neither Finance::QuoteHist::Generic nor Finance::QuoteHist::Yahoo
(which is derived from the former) says anything about error handling,
or provides a way to test whether or not data was successfully
downloaded.

Any assistance would be greatly appreciated.

Thanks

Ted

use Finance::TickerSymbols;
use Finance::QuoteHist::Yahoo;
use IO::File;

#my $fh1 = new IO::File "> values.csv";
#my $fh2 = new IO::File "> splits.csv";
#my $fh3 = new IO::File "> dividends.csv";
my $fname1;
my $fname2;
my $fname3;

for my $industry ( industries_list()) {
    print "\n\n$industry\n";
    mkdir "$industry";
    for my $symbol ( industry_list( $industry ) ) {
        print "$symbol\n\n";
        mkdir "$industry\\$symbol";
        $fname1 = "$industry\\$symbol\\values.csv";
        $fname2 = "$industry\\$symbol\\splits.csv";
        $fname3 = "$industry\\$symbol\\dividends.csv";
        $fh1 = new IO::File "> $fname1";
        $fh2 = new IO::File "> $fname2";
        $fh3 = new IO::File "> $fname3";
        $q = new Finance::QuoteHist::Yahoo
            (
             symbols => "$symbol",
             start_date => '01/01/1998',
             end_date => 'today'
            );
        # Values
        foreach $row ($q->quotes()) {
            if (defined $row ) {
                ($symbol, $date, $open, $high, $low, $close, $volume) = @$row;
                print $fh1 "$symbol, $date, $open, $high, $low, $close, $volume\n";
            } else {
                print "No data for $symbol\n";
                next;
            }
        }
        $fh1->close();
        # Splits
        foreach $row ($q->splits()) {
            if (defined $row ) {
                ($symbol, $date, $post, $pre) = @$row;
                print $fh2 "$symbol, $date, $post, $pre\n";
            } else {
                print "No splits data for $symbol\n";
                next;
            }
        }
        $fh2->close();
        # Dividends
        foreach $row ($q->dividends()) {
            if (defined $row ) {
                ($symbol, $date, $dividend) = @$row;
                print $fh3 "$symbol, $date, $dividend\n";
            } else {
                print "No dividend data for $symbol\n";
                next;
            }
        }
        $fh3->close();
    }
}
 
Jim Cochrane

Without even looking at your code my guess is that Yahoo! is not particularly
pleased about providing an ad-sponsored service to web-scrapers. Part of the
reason that free services change their page format is to block people from
doing this. Yahoo! changes their page format fairly frequently and I'm willing
to bet that their web programmers have access to CPAN. If you need 7000 quotes
then it's pretty clearly not for your personal portfolio. Maybe time for your
company to invest in a quote service.

If you have good software, configured to do intensive filtering, data
for 7000 tradables can be easily manageable for an individual trader.
On the other hand, if an individual trader needs a reliable and accurate
data feed for automated analysis, a free Yahoo service is probably a
poor choice.

It sounds like the package in question is not as robust as it could be,
allowing the user to encounter a run-time undefined value error instead
of producing a good error message. However, yahoo changing the server's
behavior or data format seems a likely initial cause of the problem.
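
One way to keep a long batch run going when a library call dies like this is to wrap each per-symbol fetch in a block eval and record the failure instead of letting it end the program. The following is only a minimal sketch: fetch_rows() is a hypothetical stand-in for the real quotes() or dividends() calls and is not part of Finance::QuoteHist, and the symbols are invented.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a library call that can die deep inside code
# we did not write. It is not part of Finance::QuoteHist.
sub fetch_rows {
    my ($symbol) = @_;
    if ($symbol eq 'BAD') {
        my $missing;                  # deliberately left undefined
        my @rows = @{$missing};       # dies here under "use strict"
        return @rows;
    }
    return ( [ $symbol, '2008/07/15', 10.5 ] );
}

my (%rows_for, %error_for);
for my $symbol (qw(AAA BAD CCC)) {
    my @rows;
    # eval returns 1 only if the fetch completed without dying.
    my $ok = eval { @rows = fetch_rows($symbol); 1 };
    if (!$ok) {
        $error_for{$symbol} = $@;     # keep the reason, then carry on
        next;
    }
    $rows_for{$symbol} = \@rows;
}

printf "%d symbols fetched, %d failed\n",
    scalar(keys %rows_for), scalar(keys %error_for);
print "  $_: $error_for{$_}" for sort keys %error_for;
```

This does not fix whatever makes the module fail; it only lets the loop move on to the next symbol and leaves a list of the ones to retry or investigate.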


Ted Byers

This project is at its beginning, proof-of-concept stage. The plan
was to do an initial test using data from Yahoo, and then proceed to
buying a datafeed from a commercial source.

We have had another project downloading a paltry 500 stocks daily for
about a year, without change or problem, but it doesn't get the data
for splits and dividends, and it is written in PHP, which I have never
used. Given the original programmer's tedious task of manually finding
new stocks to add to the download, and the documentation for these
packages, I'd thought rewriting in Perl would be the path of least
pain. But it looks like I may have to write my own code to do what
this package does, so that at least I can get a sensible error
message.

Cheers,

Ted
 
brian d foy

Ted Byers said:
This project is at its beginning, proof of concept, stage. The plan
was to do an initial test using data from yahoo, and then proceed to
buying a datafeed from a commercial source.

In that case you might make your own data source with whatever data you
like (and that data doesn't have to represent anything real). You can
then test it and use it any way that you like, and insert as many
special cases as you like.

These sorts of modules don't need to be anything fancy. It could be a
big data structure with a single subroutine to return the right thing
for each input. It doesn't need to think about it at all.
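
A rough sketch of what such a data source could look like: one hash of canned rows and one subroutine that hands back the rows for a symbol. Everything here is invented for illustration (the symbols TEST1 and EMPTY, the prices, and the name canned_quotes()); the column order simply follows the script posted earlier in the thread.

```perl
use strict;
use warnings;

# Canned records keyed by ticker symbol. Symbols and values are made up;
# each row is (date, open, high, low, close, volume).
my %CANNED = (
    TEST1 => [
        [ '2008/07/14', 10.00, 10.50,  9.80, 10.20, 15000 ],
        [ '2008/07/15', 10.20, 10.90, 10.10, 10.75, 18250 ],
    ],
    EMPTY => [],    # special case: a symbol with no history at all
);

# Return the rows for a symbol, each prefixed with the symbol itself.
# Unknown symbols give an empty list rather than an error.
sub canned_quotes {
    my ($symbol) = @_;
    my $rows = $CANNED{$symbol} || [];
    return map { [ $symbol, @$_ ] } @$rows;
}

for my $row (canned_quotes('TEST1')) {
    print join(', ', @$row), "\n";
}
```

Special cases (a symbol with no rows, a row with a missing field, and so on) can be added as further hash entries without touching the subroutine.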

Good luck :)
 
Ted Byers

I have seen the same idea in three different sources (one a
professional trader, one an investor, and the third Mandelbrot, who
cited several examples of its success). However, the investor did it
manually and thus undoubtedly missed countless opportunities; the
trader's implementation is far too brittle and too slow to respond to
avoid significant drawdowns; and Mandelbrot talked about it in terms
of empirical evidence for memory in price data rather than in terms of
what a trader or investor might do (but then everything he's written
that I have read has avoided such practicalities). I think I have
found a way to apply it that makes a profit even in bear markets,
while keeping the ability to withdraw to T-Bills should things get
really bad, avoiding the major drawdowns the markets may experience.
I have a first version that limits the drawdowns during a bear market
and often beats the market during bull markets, but it doesn't make a
profit during a bear market, so a portfolio may go a number of years
without making a profit (e.g. from 2000 through 2003). Of course,
right now it has all my simulated portfolios in cash, and so is not
making a profit even though, when I go through the data myself, I see
lots of opportunities to make one. This algorithm depends on data
representing systems with short to medium term memory in order to be
effective.

I have often used data source modules to provide simulated data for my
models, but I don't really want to try to simulate such data at this
time (unless there is a library that facilitates creation of
multifractal Brownian motion, along with routines to estimate the
relevant parameters of such motion from empirical data, both to test
them on data with known properties and to use with real world data).
I am out to exploit properties that Mandelbrot has convincingly shown
to be present in market data; properties that a simple RNG will not
have. There are a number of outstanding RNGs out there, and I use
them routinely, but I have not found an algorithm that uses them to
produce a multifractal time series (perhaps because of the nature and
extent of my search so far). The challenge of creating such a module
that produces data with the properties I am out to exploit seems at
present greater than the challenge of getting real data.

Thanks

Ted
 
Peter Scott

Ted Byers said:
I have been able to get this package to work sometimes. However,
invariably it fails on me part way through my script, after
downloading data for from one to four ticker symbols. The error is:

Can't use an undefined value as an ARRAY reference at C:/Perl/site/lib/Finance/QuoteHist/Generic.pm line 863.

Line 863 of the current version of that module does not attempt an array
dereference. Consider upgrading.

Ted Byers said:
I have no information about how an undefined value is being passed deep
into the code of a package I didn't write. I took a quick look into
Generic.pm and yahoo.pm, but found no enlightenment so far (there's a
lot of code there).

Sometimes it arises when quotes() is executed, and at other times, it
occurs when dividends() is executed.

I have been using Finance::QuoteHist::Yahoo nearly daily for several years
without once having a problem. Then again, I do not call splits() or
dividends(). I also do not use 'today' as a date and I always use a
starting date that I know data exists for. Construct a minimal program
requiring no user inputs that fails every time and post it here. If the
problem were changes to the Yahoo format it is extremely unlikely that the
behavior would be nondeterministic. I think it more likely that the
behavior is consistent for a particular ticker symbol.
 
Ted Byers

Hi Peter,

Peter Scott said:
Line 863 of the current version of that module does not attempt an array
dereference. Consider upgrading.

I am using version 1.11. I know that is about a year old, but I
haven't found a more recent release. Is there one?

Thanks

Ted
 
Ted Byers

Hi Peter,
A little more info...

Adding the following (to the appended script - slightly modified from
that shown above):

BEGIN { $SIG{__DIE__} = sub { require Carp; Carp::confess(@_) } }

Provides the following additional output:

Can't use an undefined value as an ARRAY reference at C:/Perl/site/lib/Finance/QuoteHist/Generic.pm line 863.
 at k:/MarketData/TickerSymbolsTest2.pl line 8
    main::__ANON__('Can\'t use an undefined value as an ARRAY reference at C:/Per...') called at C:/Perl/site/lib/Finance/QuoteHist/Generic.pm line 863
    Finance::QuoteHist::Generic::lineup('Finance::QuoteHist::Yahoo=HASH(0x1a309a8)') called at C:/Perl/site/lib/Finance/QuoteHist/Generic.pm line 422
    Finance::QuoteHist::Generic::__ANON__() called at C:/Perl/site/lib/Finance/QuoteHist/Generic.pm line 142
    Finance::QuoteHist::Generic::dividends('Finance::QuoteHist::Yahoo=HASH(0x1a309a8)') called at k:/MarketData/TickerSymbolsTest2.pl line 62

Compilation exited abnormally with code 2 at Wed Jul 16 15:29:31


Does this make more sense?

The ticker it dies on is ACLO.OB.

NB: Computing yesterday's date from today's postponed the crash until
dividends() is called. Is there a way to query Yahoo to find out the
inception date of a given ticker? If I had that, and provided it as
the start date, maybe that would clear things up enough for this to
be useful.

Thanks

Ted
================ new little script ================
use Finance::TickerSymbols;
use Finance::QuoteHist::Yahoo;
use IO::File;
use Date::Manip;

$|=1;

BEGIN { $SIG{__DIE__} = sub { require Carp; Carp::confess(@_) } }

Date_Init("TZ=EST5EDT");
my $end_date = ParseDate('today');   # kept apart from the per-row $date below
$end_date = Date_PrevWorkDay($end_date,1);

my $fname1;
my $fname2;
my $fname3;

for my $industry ( industries_list()) {
    print "\n\n$industry\n";
    mkdir "$industry";
    for my $symbol ( industry_list( $industry ) ) {
        print "$symbol\n\n";
        mkdir "$industry\\$symbol";
        $fname1 = "$industry\\$symbol\\values.csv";
        $fname2 = "$industry\\$symbol\\splits.csv";
        $fname3 = "$industry\\$symbol\\dividends.csv";
        $fh1 = new IO::File "> $fname1";
        $fh2 = new IO::File "> $fname2";
        $fh3 = new IO::File "> $fname3";
        print "flag 1\n";
        $q = new Finance::QuoteHist::Yahoo
            (
             symbols => "$symbol",
             end_date => $end_date
            );
        print "flag 2\t";
        # Values
        foreach $row ($q->quotes()) {
            print "flag 2a\t";
            if (defined $row ) {
                ($symbol, $date, $open, $high, $low, $close, $volume) = @$row;
                print $fh1 "$symbol, $date, $open, $high, $low, $close, $volume\n";
                print "flag 2b\n";
            } else {
                print "No data for $symbol\n";
                next;
            }
        }
        print "flag 3\t";
        $fh1->close();
        foreach $row ($q->splits()) {
            if (defined $row ) {
                ($symbol, $date, $post, $pre) = @$row;
                print $fh2 "$symbol, $date, $post, $pre\n";
            } else {
                print "No splits data for $symbol\n";
                next;
            }
        }
        print "flag 4\n";
        $fh2->close();
        foreach $row ($q->dividends()) {
            if (defined $row ) {
                ($symbol, $date, $dividend) = @$row;
                print $fh3 "$symbol, $date, $dividend\n";
            } else {
                print "No dividend data for $symbol\n";
                next;
            }
        }
        $fh3->close();
        print "flag 5\n";
    }
}
 
brian d foy

Ted Byers said:
I have often used data source modules to provide simulated data for my
models, but I don't really want to try to simulate such data at this
time (unless there is a library that facilitates creation of
multifractal Brownian motion,

I think you're overthinking it.

Grab a bunch of historical data however you like. Add any special cases
you like. Write a module to return records at the unit test level.

The simulation is just fetching records, not creating data.
 
Peter Scott

Ted Byers said:
I am using version 1.11. I know that is about a year old, but I
haven't found a more recent release. Is there one?

Then the line number is being misreported (very unusual for a run-time
error):

   859  sub target_mode {
   860    my $self = shift;
   861    if (@_) {
   862      $self->{target_mode} = shift;
   863    }
   864    $self->{target_mode} || $self->default_target_mode;
   865  }
   866

Check your file.

Ted Byers said:
Can't use an undefined value as an ARRAY reference at C:/Perl/site/lib/Finance/QuoteHist/Generic.pm line 863.
 at k:/MarketData/TickerSymbolsTest2.pl line 8
    main::__ANON__('Can\'t use an undefined value as an ARRAY reference at C:/Per...') called at C:/Perl/site/lib/Finance/QuoteHist/Generic.pm line 863
    Finance::QuoteHist::Generic::lineup('Finance::QuoteHist::Yahoo=HASH(0x1a309a8)') called at C:/Perl/site/lib/Finance/QuoteHist/Generic.pm line 422

In my copy of this file, ->lineup() is called at line 437. However, in
version 1.10 (from BackPAN), not only is lineup() called from line 422,
but:

   860  sub lineup {
   861    my $self = shift;
   862    $self->{lineup} = \@_ if @_;
   863    @{$self->{lineup}};
   864  }

And there's the array deref right where perl said. Therefore I am sure
you have an outdated version, whatever you think its number is. Upgrade
and rerun.
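
To illustrate why that line is fatal when no lineup has been set, here is a small self-contained sketch. The package name Demo::Lineup and both method names are made up for the example; the unguarded method has the same body as the version 1.10 sub quoted above, and the guarded variant is just one common way to avoid the error, not a claim about how the newer release handles it.

```perl
use strict;
use warnings;

# A stripped-down object with the same accessor shape as the quoted
# lineup() sub. The package and method names are invented for this sketch.
package Demo::Lineup;

sub new { return bless {}, shift }

# Same body as the quoted sub: dies if the lineup was never set,
# because undef cannot be dereferenced as an array under strict refs.
sub lineup_unguarded {
    my $self = shift;
    $self->{lineup} = \@_ if @_;
    @{ $self->{lineup} };
}

# Guarded variant: an unset lineup yields an empty list instead.
sub lineup_guarded {
    my $self = shift;
    $self->{lineup} = \@_ if @_;
    @{ $self->{lineup} || [] };
}

package main;

my $obj = Demo::Lineup->new;

my @safe  = $obj->lineup_guarded;                       # empty list, no error
my $ok    = eval { my @l = $obj->lineup_unguarded; 1 }; # traps the die
my $error = $ok ? '' : $@;

print "guarded call returned ", scalar(@safe), " items\n";
print "unguarded call died: $error" unless $ok;
```
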

Ted Byers said:
    Finance::QuoteHist::Generic::__ANON__() called at C:/Perl/site/lib/Finance/QuoteHist/Generic.pm line 142
    Finance::QuoteHist::Generic::dividends('Finance::QuoteHist::Yahoo=HASH(0x1a309a8)') called at k:/MarketData/TickerSymbolsTest2.pl line 62

Compilation exited abnormally with code 2 at Wed Jul 16 15:29:31

Does this make more sense?

The ticker it dies on is ACLO.OB.

NB: Computing yesterday's date from today's postponed the crash until
dividends() is called. Is there a way to query yahoo to find out the
inception date of a given ticker?

http://finance.yahoo.com/q/hp?s=ACLO.OB fills in the inception date as the
start date. I've not dealt with enough different symbols that I needed
to automate fetching it. I don't know if that's your problem though.
 
Ted Byers

You were right.

ActiveState's PPM repository was out of date even though it said it
was providing version 1.11. ActiveState's version was in fact 1.10.

I uninstalled it and installed from another repository (and verified
by looking in the files that I DID get version 1.11 this time), and
all problems went away.
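
For checking from a script rather than by opening the files, something along these lines prints the version a module declares and the path it was actually loaded from. It is a sketch only: module_info() is a name invented here, List::Util is used purely because it ships with Perl, and it assumes the installed file's own $VERSION is accurate, which this thread does not confirm for the mislabelled package.

```perl
use strict;
use warnings;

# Report the version a module declares and the file it was loaded from.
# Returns (version, path); dies if the module cannot be loaded.
sub module_info {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Foo::Bar -> Foo/Bar.pm
    require $file;
    return ($module->VERSION, $INC{$file});
}

# List::Util stands in for the real target; substitute
# Finance::QuoteHist::Generic on a machine where it is installed.
my ($version, $path) = module_info('List::Util');
print "List::Util $version loaded from $path\n";
```

The path is worth reading as well as the number, since it shows which of several installed copies Perl picked up.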

I now have a couple other questions for you.

1) Do you compute adjusted closing prices yourself, or do you download
them from Yahoo too? (If so, how?) Hmm, I just noticed the
documentation for label() mentions an 'adj' column. Does that mean
that if I change "($symbol, $date, $open, $high, $low, $close,
$volume) = @$row;" to "($symbol, $date, $open, $high, $low, $close,
$volume, $adj) = @$row;" I will get the adjusted prices too?

2) Finance::TickerSymbols seems to produce ticker symbols for which
Finance::QuoteHist::Yahoo doesn't seem to be able to get data. I
guess that is because it doesn't use all the sources or exchanges that
Finance::TickerSymbols uses. When I look at finance.yahoo, it seems
to list a large number of exchanges it gets data from. Is my guess
close to being correct? Is there a way to increase the percentage of
ticker symbols found by Finance::TickerSymbols that
Finance::QuoteHist::Yahoo can retrieve data for? And, out of
curiosity, is it possible to use it to find out whether or not a given
stock is traded on more than one exchange, and if so, get the data for
that stock from each exchange it is traded on? (That would be to test
the efficient markets theory: I want to see if the prices are the same
on all the exchanges a given stock is traded on.)

Thanks,

Ted
 
szr

Ted said:
On Fri, 11 Jul 2008 21:53:53 -0700, Ted wrote: [attributions fixed]
Can't use an undefined value as an ARRAY reference at
C:/Perl/site/lib/ Finance/QuoteHist/Generic.pm line 863.

Line 863 of the current version of that module does not
attempt an array dereference. Consider upgrading.

I am using version 1.11. I know that is about a year old,
but I haven't found a more recent release. Is there one?

Then the line number is being misreported (very unusual
for a run-time error):

859 sub target_mode {
860 my $self = shift;
861 if (@_) {
862 $self->{target_mode} = shift;
863 }
864 $self->{target_mode} || $self->default_target_mode;
865 }
866

Check your file.
[...]

You were right.

Activestate's PPM repository was out of date even though it
said it was providing version 1.11. Activestate's version
was in fact 1.10.

I uninstalled it and installed from another repository (and
verified by looking in the files that I DID get version 1.11
this time), and all problems went away.

If you have a C compiler (do you have Visual Studio?) you can just get
the newest tarball from CPAN itself and build it directly. I personally
stopped trusting ActiveState's repository long ago and have never really
had any problem building modules myself, just as I can on a Linux
system. This also allows you to get your hands on newer versions, as
ActiveState is notorious for lagging behind.
 
Ted Byers

Yes, I have Visual Studio 2005.

That is sorely tempting.

But first things first.

I am now hitting a problem with

$fh1 = new IO::File "> $fname1";

It always fails at the same place, after having been successfully
executed several thousand times. I am starting to wonder if there is
a limit on the number of times it can be called within one program. :-(
 
J. Gleixner

Ted Byers wrote:
[...]
I am now hitting a problem with

$fh1 = new IO::File "> $fname1";

It always fails at the same place, after having been successfully
executed several thousand times. I am starting to wonder if there is
a limit on the number of times it can be called within one program. :-(

This is a totally different issue, one that will require you to
post your code and a different subject, if you want any help.

Spend some time trying to debug it yourself. Make sure you're
closing the file and that the close was successful.
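
As a starting point for that kind of debugging, the sketch below checks the result of the open, each write, and the close, and includes the operating system's reason ($!) in the message. The helper name write_lines() and the file names are made up for the example; it uses a temporary directory so it can be run anywhere.

```perl
use strict;
use warnings;
use IO::File;
use File::Temp qw(tempdir);

# Open a file for writing, write the lines, and close it, dying with
# the operating system's error text ($!) if any step fails.
sub write_lines {
    my ($path, @lines) = @_;
    my $fh = IO::File->new($path, 'w')
        or die "Cannot open $path for writing: $!\n";
    for my $line (@lines) {
        print {$fh} "$line\n" or die "Cannot write to $path: $!\n";
    }
    $fh->close or die "Cannot close $path: $!\n";
    return scalar @lines;
}

my $dir   = tempdir(CLEANUP => 1);
my $path  = "$dir/values.csv";
my $count = write_lines($path, 'SYM, 2008/07/15, 10.5');
print "wrote $count line(s) to $path\n";

# A path inside a directory that does not exist fails at the open step,
# and the message now says why.
my $ok    = eval { write_lines("$dir/no_such_dir/values.csv", 'x'); 1 };
my $error = $ok ? '' : $@;
print "open failed as expected: $error" unless $ok;
```

With a check like this in place, a failure after several thousand iterations at least reports which path could not be opened and the reason the system gave.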
 
