inputting the ephemerides

Larry Gates · Jan 20, 2009

Happy Bye George Day!

I've been chipping away at a long-term project: investigating the
ephemeris. I think it would make a great way to continue exploring perl's
pattern-matching capabilities.

So I'll have a program that looks like this:

my $filename = 'eph3.txt';
open(my $fh, '<', $filename) or die "cannot open $filename: $!";
while (<$fh>) {
print $_;
}
close($fh)

# perl faulk10.pl

or

open(my $fh, '<', 'eph3.txt');
while (my $line = <$fh>) {
print $line;
}
close($fh)

# perl faulk7.pl

I'll want to have an explicit variable for the line, so I'll use the better
parts of the above.

The first thing I'll want to do is capture the first seven characters in a
line. We can assume that these will always be letters or spaces padded out
to the right.

After that, I want to strip away all the characters, as does the following
fortran routine. In this treatment $line would be inrec .

subroutine WasteNonDigits(inrec)
character*80 inrec
character*1 c1,c2
character*13 ValidDigits
data ValidDigits/'0123456789.-+'/
n=13
do i=1,80
c1=' '
c2=inrec(i:i)
do j=1,n
if(c2.eq.ValidDigits(j:j)) c1=c2
end do
inrec(i:i)=c1
end do
return
end subroutine
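
For readers who don't read Fortran: the routine walks the 80-character record and replaces every character that is not a digit, '.', '-' or '+' with a space. A rough Perl equivalent, as a minimal sketch (the sub name waste_non_digits is made up here, and it assumes the line has already been chomped):

```perl
use strict;
use warnings;

# Replace every character that is not a digit, '.', '+' or '-' with a space,
# mirroring what the Fortran loop does to inrec. The /c flag complements the
# search list; the trailing hyphen in the list is a literal '-'.
sub waste_non_digits {
    my ($line) = @_;
    (my $clean = $line) =~ tr/0-9.+-/ /c;
    return $clean;
}

my $sample = "Moon 21h 17m 19s -15 2.4' 62.4 ER 36.796 22.871 Up";
print waste_non_digits($sample), "\n";
```

The result has the same length as the input, so column positions are preserved; split ' ', $clean would then yield just the numeric tokens.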

Ultimately, I want to populate an object that I think would be pretty tame
by perl standards.

This is the data set:

C:\MinGW\source>type eph3.txt
! yesterday
# another comment

Sun 18h 41m 55s -23 5.4' 0.983 10.215 52.155 Up
Mercury 20h 2m 16s -22 12.5' 1.102 22.537 37.668 Up
Venus 21h 55m 33s -14 16.3' 0.795 39.872 11.703 Up
Moon 21h 17m 19s -15 2.4' 62.4 ER 36.796 22.871 Up
Mars 18h 11m 59s -24 6.1' 2.431 4.552 56.184 Up
Jupiter 20h 3m 35s -20 49.4' 6.034 23.867 38.203 Up
Saturn 11h 32m 59s +5 8.6' 9.018 -47.333 157.471 Set
Uranus 23h 21m 30s -4 57.9' 20.421 48.328 -18.527 Up
Neptune 21h 39m 30s -14 22.8' 30.748 38.963 16.599 Up
Pluto 18h 4m 34s -17 44.5' 32.543 7.443 62.142 Up

C:\MinGW\source>

Thanks for your comment.
--
larry gates

You know how people are sometimes rude on Usenet or on a mailing list.
Sometimes they'll write something that can only be taken as a deadly insult,
and then they have the unmitigated gall to put a smiley face on it, as if
that makes it all right. -- Larry Wall, 8th State of the Onion

Jürgen Exner · Jan 20, 2009

Larry Gates said:
So I'll have a program that looks like this:

my $filename = 'eph3.txt';
open(my $fh, '<', $filename) or die "cannot open $filename: $!";
while (<$fh>) {
print $_;
}
close($fh)

Not much of a program, don't you think?

open(my $fh, '<', 'eph3.txt');

You are missing error handling

while (my $line = <$fh>) {
print $line;
}
close($fh)

You know, this is terribly similar to something I have seen a few weeks
ago from some George character. Are you sure you are not suffering from
Dissociative Identity Disorder?

I'll want to have an explicit variable for the line, so I'll use the better
parts of the above.

The first thing I'll want to do is capture the first seven characters in a
line.

See "perldoc -f substr".

After that, I want to strip away all the characters, as does the following
fortran routine. In this treatment $line would be inrec .

And what exactly does that Fortran routine do? Not everyone speaks
Fortran, therefore an abstract specification or at least a detailed
description would be much better than dumping some code in some foreign
programming language.

Again, that George was dumping Fortran code into this NG, too.

Ultimately, I want to populate an object that I think would be pretty tame
by perl standards.

And what would that object be? An AoA? An AoH? A HoH?

jue

Tim Greer · Jan 20, 2009

Jürgen Exner said:
You know, this is terribly similar to something I have seen a few
weeks ago from some George character

He was George, said he was changing his posting name in celebration of
George being out of office. I also can't wrap my head around his
example of posting code that opens a file and prints and then another
snippet basically being the same, just without error checking. Weird.
I stopped reading and moved on.

RedGrittyBrick · Jan 20, 2009

Larry Gates wrote:
[...]

Thanks for your comment.

I didn't see any questions.

Good luck with your project.

George · Jan 20, 2009

He was George, said he was changing his posting name in celebration of
George being out of office. I also can't wrap my head around his
example of posting code that opens a file and prints and then another
snippet basically being the same, just without error checking. Weird.
I stopped reading and moved on.

Between the responses no source survived, so you can't wonder too much why
context is a problem.

Here's the resolution of your "weirdness:"

my $filename = 'eph3.txt';
open(my $fh, '<', $filename) or die "cannot open $filename: $!";

while (my $line = <$fh>){

print $line;
}
close($fh)

# perl faulk11.pl
--
George

You can fool some of the people all the time, and those are the ones you
want to concentrate on.
George W. Bush

Picture of the Day http://apod.nasa.gov/apod/

Larry Gates · Jan 20, 2009

Larry Gates wrote:
[...]

Thanks for your comment.

I didn't see any questions.

Good luck with your project.

I guess I didn't.

I was fishing for a perl method to deliver only the numbers and whitespace.

Tad J McClellan · Jan 20, 2009

The first thing I'll want to do is capture the first seven characters in a
line.

my $first7 = substr $line, 0, 7;

We can assume that these will always be letters or spaces padded out
to the right.

If you want to validate the data against those criteria, then:

die "'$first7' is 'bad' data\n" unless $first7 =~ /^[a-z]+\s*$/i;

After that, I want to strip away all the characters,

Errr... OK:

$first7 = '';

If you instead meant to say "strip away all the trailing space characters":

$first7 =~ s/\s+$//;
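
Putting the three steps together (take the first seven columns, validate, trim), a minimal sketch might look like this. The helper name name_field and the padded sample lines are made up for illustration, and it assumes the name really is padded with spaces out to column seven:

```perl
use strict;
use warnings;

# Hypothetical helper: return the name found in the first seven columns,
# or nothing if those columns are not letters followed by optional padding.
sub name_field {
    my ($line) = @_;
    my $first7 = substr $line, 0, 7;
    return unless $first7 =~ /^[a-z]+\s*$/i;   # letters, then padding only
    $first7 =~ s/\s+$//;                       # strip the padding
    return $first7;
}

for my $line ("Sun    18h 41m 55s", "Jupiter 20h 3m 35s", "! yesterday") {
    my $name = name_field($line);
    print defined $name ? "name: '$name'\n" : "skipped: '$line'\n";
}
```

Returning nothing instead of calling die lets the caller skip comment and blank lines rather than abort on them.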

treybianchini · Jan 20, 2009

Larry Gates said:

The first thing I'll want to do is capture the first seven characters in a
line.

my $first7 = substr $line, 0, 7;

We can assume that these will always be letters or spaces padded out
to the right.

If you want to validate the data against those criteria, then:

die "'$first7' is 'bad' data\n" unless $first7 =~ /^[a-z]+\s*$/i;

After that, I want to strip away all the characters,

Errr... OK:

$first7 = '';

If you instead meant to say "strip away all the trailing space characters":

$first7 =~ s/\s+$//;

Maybe you meant 'strip away all of the non numeric characters'.
$first7 =~ s/\D//g;
That will also take out any periods and negative signs which you might
want so you might have to tailor the expression a bit more than that.

Perhaps you could split the lines of the input file into fields and
then write some conditional logic which might do different
substitutions based on what expression the data matches before you
populate your objects.

$field =~ s/\s//g;
if ( $field =~ /\d+\./ ) {
...
} elsif ( $field =~ /^\d+\D/ ) {

etc...

}
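
To make the split-then-classify idea concrete, here is a minimal sketch. The classify helper and its four labels are invented for illustration, and the patterns assume fields shaped like the ones in the posted eph3.txt:

```perl
use strict;
use warnings;

# Hypothetical classifier: label a whitespace-separated field by its shape.
sub classify {
    my ($field) = @_;
    return 'decimal' if $field =~ /^[+-]?\d+\.\d+'?$/;   # 0.983, -47.333, 5.4'
    return 'integer' if $field =~ /^[+-]?\d+$/;          # -23, +5
    return 'time'    if $field =~ /^\d+[hms]$/;          # 18h, 2m, 16s
    return 'word';                                       # Sun, ER, Up
}

my $line = "Moon 21h 17m 19s -15 2.4' 62.4 ER 36.796 22.871 Up";
for my $field (split ' ', $line) {
    printf "%-8s %s\n", $field, classify($field);
}
```

Once each field has a label, the per-type substitutions (dropping the trailing h/m/s or the arc-minute mark) can be applied only where they make sense.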

A really "great way to continue exploring perl's pattern-matching
capabilities" might be to go to

http://www.perl.com/doc/manual/html/pod/perlre.html

or type

perldoc perlre <enter>

or read your textbook more closely maybe?

Good luck,
Trey

RedGrittyBrick · Jan 20, 2009

Larry said:
Larry Gates wrote:
[...]

Thanks for your comment.

I didn't see any questions.

Good luck with your project.

I guess I didn't.

I was fishing for a perl method to deliver only the numbers and whitespace.

perldoc -f substr
perldoc -f split

Assuming you don't mean
perl -p -e 's/[^\s\d]//g' ephemerides.txt

Tim Greer · Jan 20, 2009

George said:
Between the responses no source survived, so you can't wonder too much
why context is a problem.

Here's the resolution of your "weirdness:"

my $filename = 'eph3.txt';
open(my $fh, '<', $filename) or die "cannot open $filename: $!";

while (my $line = <$fh>){

print $line;
}
close($fh)

# perl faulk11.pl

I saw your original post (I don't have you blocked). You posted a
portion of code, such as above, that goes through each line and prints
it. It's not really doing anything related to the task you asked
about. What was "weird" was that you posted the same code again (for
the most part), but just failed to check the return value of open and
didn't add $_ to print (since it wouldn't be needed -- it wasn't in the
first one either). I fail to understand the purpose of you doing that?
You then posted about what you wanted to do, but you didn't post any
code relevant to it to show us what you have tried. I admit, I thought
that was weird.

Jim Gibson · Jan 20, 2009

This is the data set:

C:\MinGW\source>type eph3.txt
! yesterday
# another comment

Sun 18h 41m 55s -23 5.4' 0.983 10.215 52.155 Up
Mercury 20h 2m 16s -22 12.5' 1.102 22.537 37.668 Up
Venus 21h 55m 33s -14 16.3' 0.795 39.872 11.703 Up
Moon 21h 17m 19s -15 2.4' 62.4 ER 36.796 22.871 Up
Mars 18h 11m 59s -24 6.1' 2.431 4.552 56.184 Up
Jupiter 20h 3m 35s -20 49.4' 6.034 23.867 38.203 Up
Saturn 11h 32m 59s +5 8.6' 9.018 -47.333 157.471 Set
Uranus 23h 21m 30s -4 57.9' 20.421 48.328 -18.527 Up
Neptune 21h 39m 30s -14 22.8' 30.748 38.963 16.599 Up
Pluto 18h 4m 34s -17 44.5' 32.543 7.443 62.142 Up

C:\MinGW\source>

Thanks for your comment.

You can use the unpack function to unpack data lines with fixed-length
columns like your example. See 'perldoc -f unpack' for details, and
'perldoc -f pack' for template parameters. Note that the A parameter
will cause Perl to trim trailing blanks on unpacking.

Something like

my( $name, $hour, $min, $sec, ... ) =
unpack('A8 A4 A4 A4 ... ',$line);

should work.
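
A runnable version of that idea, as a sketch only: the record below is built with sprintf so the column widths are known exactly. The widths (8, 4, 4, 4) follow the template above; the sample record itself is made up and is not taken from eph3.txt.

```perl
use strict;
use warnings;

# Build a fixed-width record: name in 8 columns, then three 4-column fields.
my $line = sprintf '%-8s%-4s%-4s%-4s', 'Mercury', '20h', '2m', '16s';

# 'A' fields are space-padded text; unpack strips the trailing spaces.
my ($name, $hour, $min, $sec) = unpack 'A8 A4 A4 A4', $line;

print join('|', $name, $hour, $min, $sec), "\n";   # Mercury|20h|2m|16s
```

This only works when every record really uses the same column positions; tab-separated or variable-width input will not line up with a fixed template.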

Larry Gates · Jan 21, 2009

You can use the unpack function to unpack data lines with fixed-length
columns like your example. See 'perldoc -f unpack' for details, and
'perldoc -f pack' for template parameters. Note that the A parameter
will cause Perl to trim trailing blanks on unpacking.

Something like

my( $name, $hour, $min, $sec, ... ) =
unpack('A8 A4 A4 A4 ... ',$line);

should work.

Thanks, Jim. I've been working really hard to get my head around pattern
matching and tried something similar to what you write but was unable to
get output.

my $filename = 'eph6.txt';
open(my $fh, '<', $filename) or die "cannot open $filename: $!";

while (<$fh>) {

/(\w+)\W*(\d2).*(\d2).*(\d2)\W*([-|+]\d+).*(\d+\.\d+).*(\d+\.\d+).*(-*\d+\.\d+).*(-*\d+\.\d+)\W*(\w+)\W*/;

print "$1\n";
print $_;
}
close($fh)

# perl faulk11.pl

C:\MinGW\source>perl faulk11.pl

Sun 19h 43m 51s -21° 17.8' 0.984 -35.020 87.148 Set

Mercury 20h 36m 41s -16° 59.3' 0.747 -22.075 84.236 Set

Venus 22h 51m 18s -7° 46.9' 0.691 10.142 72.919 Up

Moon 10h 24m 21s +7° 29.5' 58.6 ER -4.992 -102.785 Set

Mars 18h 58m 51s -23° 33.8' 2.398 -45.280 90.860 Set

Jupiter 20h 17m 22s -20° 8.1' 6.082 -27.618 83.843 Set

Saturn 11h 32m 29s +5° 16.0' 8.806 -19.672 -111.729 Set

Uranus 23h 23m 12s -4° 46.5' 20.638 18.211 70.235 Up

Neptune 21h 41m 17s -14° 13.9' 30.892 -7.527 77.864 Set

Pluto 18h 6m 40s -17° 44.9' 32.485 -52.833 108.052 Set
C:\MinGW\source>

So no better output yet, but I think I'm close here.
--
larry gates

Or I suppose we could always recontextualize the meaning of "is"
instead. There is prior art...
-- Larry Wall in <[email protected]>

Larry Gates · Jan 21, 2009

I saw your original post (I don't have you blocked). You posted a
portion of code, such as above, that goes through each line and prints
it. It's not really doing anything related to the task you asked
about. What was "weird" was that you posted the same code again (for
the most part), but just failed to check the return value of open and
didn't add $_ to print (since it wouldn't be needed -- it wasn't in the
first one either). I fail to understand the purpose of you doing that?
You then posted about what you wanted to do, but you didn't post any
code relevant to it to show us what you have tried. I admit, I thought
that was weird.

It turns out I've gone different ways twice with this as I try to get some
output here. This is essentially the same script that I posted downthread
in response to Jim Gibson:

my $filename = 'eph6.txt';
open(my $fh, '<', $filename) or die "cannot open $filename: $!";

while (<$fh>) {

/(\w+)\W*(\d2).*(\d2).*(\d2)\W*([-|+]\d+).*(\d+\.\d+).*(\d+\.\d+)
.*(-*\d+\.\d+).*(-*\d+\.\d+)\W*(\w+)\W*/;

print "$1\n";
print $_;
}
close($fh)

# perl faulk12.pl

I was under the impression that I created a bunch of variables like $1 $2
$3, but I don't have output yet.:-(
--
larry gates

One error message that would be of great benefit to novices is if we
could guess where the missing brace is based on indentation. (But not
*assuming* the missing brace, of course--this isn't Python...

-- Larry Wall in <[email protected]>

Tad J McClellan · Jan 21, 2009

Larry Gates said:
Thanks, Jim. I've been working really hard to get my head around pattern
matching

Pattern matching is not the Right Tool for fixed-width columnated data.

unpack() (or substr) is.

Tad J McClellan · Jan 21, 2009

Larry Gates said:
On Tue, 20 Jan 2009 09:09:44 -0800, Tim Greer wrote:

It turns out I've gone different ways twice with this as I try to get some
output here.

/(\w+)\W*(\d2).*(\d2).*(\d2)\W*([-|+]\d+).*(\d+\.\d+).*(\d+\.\d+)
.*(-*\d+\.\d+).*(-*\d+\.\d+)\W*(\w+)\W*/;

print "$1\n";

I was under the impression that I created a bunch of variables like $1 $2
$3,

The dollar-digit variables are only set when the pattern match *succeeds*.

Therefore, you should never use the dollar-digit variables unless
you have first ensured that the match in question succeeded:

if ( /(\w+)\W*(\d2).*(\d2).*(\d2)\W*([-|+]\d+).*(\d+\.\d+).*(\d+\.\d+)
.*(-*\d+\.\d+).*(-*\d+\.\d+)\W*(\w+)\W*/x ) {
print "$1\n";
}
else {
print "match failed!\n";
}

I doubt that \d2 does what you think it does.

It matches 2-digit strings where the 2nd digit is a "2".

You probably want \d{2} instead?

I doubt that [-|+] does what you think it does. It matches any of 3
characters: vertical bar, plus sign, minus sign.

You probably want [+-] instead.

You probably want \W+ rather than \W*

You probably want .*? rather than .*

but I don't have output yet.:-(

Don't try to do it all at once. Get it working a little at a time:

/(\w+)\W+/

/(\w+)\W+(\d{2})/

/(\w+)\W+(\d{2}).*?(\d{2})/

etc...

Note that all of this is moot, because pattern matching is
not the Right Tool for what you are trying to accomplish...
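
For anyone following along, a minimal sketch of the two main points: test the match before touching $1 and friends, and \d2 is not \d{2}. The helper name leading_fields and the sample lines are made up for illustration.

```perl
use strict;
use warnings;

# Return the captured fields, or an empty list if the line does not match.
# Testing the match result first means stale $1, $2, ... left over from an
# earlier successful match are never used by accident.
sub leading_fields {
    my ($line) = @_;
    if ( $line =~ /^(\w+)\W+(\d{2})h/ ) {
        return ($1, $2);
    }
    return;
}

for my $line ("Sun 18h 41m 55s", "# another comment") {
    my @f = leading_fields($line);
    print @f ? "matched: @f\n" : "match failed: $line\n";
}

# \d2 is "a digit followed by a literal 2"; \d{2} is "two digits".
print '\d2 on 18h: ',   ("18h" =~ /\d2/   ? "match" : "no match"), "\n";
print '\d{2} on 18h: ', ("18h" =~ /\d{2}/ ? "match" : "no match"), "\n";
```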

John W Kennedy · Jan 21, 2009

$line =~ /[^0123456789.-+]/ /g;

sln · Jan 21, 2009

You can use the unpack function to unpack data lines with fixed-length
columns like your example. See 'perldoc -f unpack' for details, and
'perldoc -f pack' for template parameters. Note that the A parameter
will cause Perl to trim trailing blanks on unpacking.

Something like

my( $name, $hour, $min, $sec, ... ) =
unpack('A8 A4 A4 A4 ... ',$line);

should work.

Thanks, Jim. I've been working really hard to get my head around pattern
matching and tried something similar to what you write but was unable to
get output.

my $filename = 'eph6.txt';
open(my $fh, '<', $filename) or die "cannot open $filename: $!";

while (<$fh>) {

/(\w+)\W*(\d2).*(\d2).*(\d2)\W*([-|+]\d+).*(\d+\.\d+).*(\d+\.\d+).*(-*\d+\.\d+).*(-*\d+\.\d+)\W*(\w+)\W*/;

[snip]

Interesting. You can pay me for flawless regex processing of all your data.
Or, you can do bullshit with your right hand.

sln

Larry Gates · Jan 21, 2009

Pattern matching is not the Right Tool for fixed-width columnated data.

unpack() (or substr) is.

There is a lot of variation in the data though. The moon has ER after the
distance. If we replace the tabs with space, I think I can make this work,
but as it is, I don't know that there exists a sequence of integers that
works for the integer part of the format descriptors.

my $filename = 'eph6.txt';
open(my $fh, '<', $filename) or die "cannot open $filename: $!";

while (<$fh>) {

my( $name, $hour, $min, $sec, ) =
unpack('A8 A4 A4 A4 ', $_);

print $name, $hour, $min, $sec, "\n";
}
close($fh)

# perl faulk13.pl

C:\MinGW\source>perl faulk13.pl
Sun 19h43m51s-21
Mercury20h36m41s
Venus 22h 51m 18s -7
Moon 10h 24m 21s +7
Mars 18h 58m 51s -23
Jupiter20h17m22s
Saturn 11h 32m 29s +
Uranus 23h 23m 12s -
Neptune21h41m17s
Pluto 18h 6m 40s -17

C:\MinGW\source>

Some haven't started to read the declination while others have a sign and
two digits.
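
Since the columns are separated by a mix of tabs and spaces rather than fixed widths, one option is to split on runs of whitespace instead. A minimal sketch under several assumptions: the helper name parse_line and the hash keys are guesses (the thread never labels the columns), and "ER" is treated as a unit marker that can simply be dropped.

```perl
use strict;
use warnings;

# Hypothetical parser: split on any run of whitespace (tabs or spaces),
# drop the Moon's "ER" marker, and name the remaining ten fields.
sub parse_line {
    my ($line) = @_;
    my @f = grep { $_ ne 'ER' } split ' ', $line;
    return unless @f == 10;            # skips blank lines and comments
    my %row;
    @row{qw(name ra_h ra_m ra_s dec_deg dec_min dist alt az state)} = @f;
    return \%row;
}

my $row = parse_line("Moon\t21h 17m 19s\t-15 2.4'\t62.4 ER\t36.796\t22.871\tUp");
print "$row->{name}: distance $row->{dist}, $row->{state}\n";
```

Pushing each returned hash reference onto an array gives an array of hashes, one per body, which may be the kind of tame object the original post had in mind.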

The data comes from
http://www.fourmilab.ch/cgi-bin/Yoursky
, where you can make a sky map from any city and any time.

Is it possible for perl to imitate the keystrokes and mouse clicks so as to
get these data directly?

Larry Gates · Jan 21, 2009

The dollar-digit variables are only set when the pattern match *succeeds*.

Therefore, you should never use the dollar-digit variables unless
you have first ensured that the match in question succeeded:

if ( /(\w+)\W*(\d2).*(\d2).*(\d2)\W*([-|+]\d+).*(\d+\.\d+).*(\d+\.\d+)
.*(-*\d+\.\d+).*(-*\d+\.\d+)\W*(\w+)\W*/x ) {
print "$1\n";
}
else {
print "match failed!\n";
}

I doubt that \d2 does what you think it does.

It matches 2-digit strings where the 2nd digit is a "2".

You probably want \d{2} instead?

I doubt that [-|+] does what you think it does. It matches any of 3
characters: vertical bar, plus sign, minus sign.

You probably want [+-] instead.

You probably want \W+ rather than \W*

You probably want .*? rather than .*

but I don't have output yet.:-(

Don't try to do it all at once. Get it working a little at a time:

/(\w+)\W+/

/(\w+)\W+(\d{2})/

/(\w+)\W+(\d{2}).*?(\d{2})/

etc...

Note that all of this is moot, because pattern matching is
not the Right Tool for what you are trying to accomplish...

I'm inclined to think that this is the way. With your changes, I'm getting
real clean input until I get halfway through:

my $filename = 'eph6.txt';
open(my $fh, '<', $filename) or die "cannot open $filename: $!";

while (<$fh>) {

/(\w+)\W+/;
/(\w+)\W+(\d{2})/;

/(\w+)\W+(\d{2}).*?(\d{2})/;

/(\w+)\W+(\d{2}).*?(\d{2}).*?(\d{2})/;
/(\w+)\W+(\d{2}).*?(\d{2}).*?(\d{2}).*?([-+]\d{2})/;

print "string one is $1\n";
print "string two is $2\n";
print "string three is $3\n";
print "string four is $4\n";
print "string five is $5\n";
print $_;
}
close($fh)

# perl faulk14.pl

C:\MinGW\source>perl faulk14.pl
string one is Sun
string two is 19
string three is 43
string four is 51
string five is -21
Sun 19h 43m 51s -21 17.8' 0.984 -35.020 87.148 Set
....
string one is Pluto
string two is 18
string three is 40
string four is 17
string five is -52
Pluto 18h 6m 40s -17 44.9' 32.485 -52.833 108.052 Set
C:\MinGW\source>

The sun is right, but pluto went to the -52. I think I'm missing a
quantifier in *?([-+]\d{2})
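
What seems to happen on the Pluto line is that \d{2} insists on two digits, so the single-digit "6m" is skipped and every later capture shifts one field to the right. A sketch of one possible fix, assuming the h/m/s unit letters are always present: allow one or two digits and anchor each number to its unit.

```perl
use strict;
use warnings;

# Anchor each number to its unit letter and allow one or two digits,
# so "6m" cannot be skipped the way it is with a bare \d{2}.
my $re = qr/^(\w+)\s+(\d{1,2})h\s+(\d{1,2})m\s+(\d{1,2})s\s+([-+]\d{1,2})/;

my $line = "Pluto 18h 6m 40s -17 44.9' 32.485 -52.833 108.052 Set";
if ( my @f = $line =~ $re ) {
    print "@f\n";    # Pluto 18 6 40 -17
}
else {
    print "match failed\n";
}
```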

John W. Krahn · Jan 21, 2009

John said:
$line =~ /[^0123456789.-+]/ /g;

$ perl -ce'$line =~ /[^0123456789.-+]/ /g;'
Invalid [] range ".-+" in regex; marked by <-- HERE in m/[^0123456789.-+
<-- HERE ]/ at -e line 1.

And if you fix that you are left with the match operator results divided
by the numerical value of the string 'g' in void context.

$ perl -cwe'$line =~ /[^0123456789.+-]/ /g;'
Unquoted string "g" may clash with future reserved word at -e line 1.
Useless use of division (/) in void context at -e line 1.
Name "main::line" used only once: possible typo at -e line 1.
-e syntax OK

John
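
For completeness, a sketch of what the one-liner was presumably aiming for: a substitution rather than a bare match, with the hyphen moved to the end of the character class so that it is a literal and not a range. The sample line is taken from the posted data; copying into $digits first is just a choice to leave $line untouched.

```perl
use strict;
use warnings;

my $line = "Saturn 11h 32m 59s +5 8.6' 9.018 -47.333 157.471 Set";

# s/// does the replacing; the hyphen goes last in the class so it is a
# literal '-' rather than a range operator.
(my $digits = $line) =~ s/[^0-9.+-]/ /g;

print "$digits\n";
```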
