Newbie file / string question

Ren

Suppose I have a file containing several lines like this:

:10000000E7280530AC00A530AD00AD0B0528AC0BE2

The data I want to extract are 8 hexadecimal strings, the first of
which is E728, like this:

:10000000 E728 0530 AC00 A530 AD00 AD0B 0528 AC0B E2

Also, the bytes in the string are reversed. The E728 needs to be 28E7,
0530 needs to be 3005 and so on.

I can do this in C++ and Pascal, but it seems like Perl may be more
suited for the task.

How is this accomplished using Perl?
 
Gunnar Hjalmarsson

Ren said:
Suppose I have a file containing several lines like this:

:10000000E7280530AC00A530AD00AD0B0528AC0BE2

The data I want to extract are 8 hexadecimal strings, the first of
which is E728, like this:

:10000000 E728 0530 AC00 A530 AD00 AD0B 0528 AC0B E2

Also, the bytes in the string are reversed. The E728 needs to be
28E7, 0530 needs to be 3005 and so on.


my @hex;
while (<DATA>) {
    s/:\d{8}//;    # strip the ":10000000" prefix
    # grab each 4-character word and swap its two bytes
    my @line = map { s/(..)(..)/$2$1/; $_ } /(\w{4})/g;
    push @hex, [ @line ];
}
print "@{$hex[0]}\n" for 0..$#hex;
 
Tad McClellan

Ren said:
:10000000E7280530AC00A530AD00AD0B0528AC0BE2

The data I want to extract are 8 hexadecimal strings, the first of
which is E728,


Since you don't tell us _why_ that is the start of the interesting
part, we will have to leave it to you to strip out the leading
stuff. Maybe this would do it:

s/^:\d{8}//; # strip prefix

??
like this:

:10000000 E728 0530 AC00 A530 AD00 AD0B 0528 AC0B E2


my @word16 = /(\w{2,4})/g;

Also, the bytes in the string are reversed. The E728 needs to be 28E7,
0530 needs to be 3005 and so on.


s/(\w\w)(\w\w)/$2$1/ for @word16; # swap "byte order"
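
Putting the three statements above together, a minimal sketch might look like the following. The sub name swap_words is made up for illustration, and the prefix pattern assumes the prefix is always eight decimal digits, as in the one sample line shown.

```perl
use strict;
use warnings;

# Hypothetical helper: takes one line of the file and returns the
# byte-swapped words, followed by the trailing 2-character field as-is.
sub swap_words {
    my ($line) = @_;
    $line =~ s/^:\d{8}//;                  # strip prefix (assumed: 8 decimal digits)
    my @word16 = $line =~ /(\w{2,4})/g;    # 4-character words, then the 2-character tail
    s/(\w\w)(\w\w)/$2$1/ for @word16;      # swap "byte order"; the 2-character tail cannot match
    return @word16;
}

print join(' ', swap_words(":10000000E7280530AC00A530AD00AD0B0528AC0BE2\n")), "\n";
# prints: 28E7 3005 00AC 30A5 00AD 0BAD 2805 0BAC E2
```

Note that \w{2,4} also picks up the final "E2". Using \w{4} instead would leave it out.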
 
Gunnar Hjalmarsson

Gunnar said:
my @hex;
while (<DATA>) {
s/:\d{8}//;
my @line = map { s/(..)(..)/$2$1/; $_ } /(\w{4})/g;
push @hex, [ @line ];
}
print "@{$hex[0]}\n" for 0..$#hex;

Correction:

print "@{$hex[$_]}\n" for 0..$#hex;
------------------^^
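
For anyone who wants to run the corrected version: the code above reads from DATA, which needs a __DATA__ section that was not posted. The sketch below reads the same kind of input through an in-memory filehandle instead. The second input line is made up for illustration.

```perl
use strict;
use warnings;

# Sample input; the first line is from the question, the second is invented.
my $text = <<'END';
:10000000E7280530AC00A530AD00AD0B0528AC0BE2
:100010000102030405060708090A0B0C0D0E0F1058
END

# Open the string as a file, standing in for the missing __DATA__ section.
open my $in, '<', \$text or die "Cannot open in-memory file: $!";

my @hex;
while (<$in>) {
    s/:\d{8}//;                                           # strip the 8-digit prefix
    my @line = map { s/(..)(..)/$2$1/; $_ } /(\w{4})/g;   # swap bytes in each word
    push @hex, [ @line ];
}
close $in;

print "@{$hex[$_]}\n" for 0..$#hex;
# prints:
# 28E7 3005 00AC 30A5 00AD 0BAD 2805 0BAC
# 0201 0403 0605 0807 0A09 0C0B 0E0D 100F
```

One caveat: \d{8} only matches decimal digits. If the prefix can ever contain the hex letters A-F (the sample does not show this), something like [0-9A-Fa-f]{8} would be needed instead.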
 
Anno Siegel

Gunnar Hjalmarsson said:
my @hex;
while (<DATA>) {
s/:\d{8}//;
my @line = map { s/(..)(..)/$2$1/; $_ } /(\w{4})/g;
push @hex, [ @line ];

"[ @line ]" makes a needless copy here. Since @line is declared with "my"
inside the loop, each iteration gets a new array, so "\@line" is sufficient.
}
print "@{$hex[0]}\n" for 0..$#hex;

Anno
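
A small sketch of the point about "\@line", using two made-up input strings: because @line is a fresh lexical on every pass through the loop, storing a reference to it does not make the rows share storage.

```perl
use strict;
use warnings;

my @rows;
for my $text ('E7280530', 'AC00A530') {
    my @line = $text =~ /(\w{4})/g;   # a new @line on every iteration
    push @rows, \@line;               # store a reference, no copy made
}

# Each element refers to its own array, so the first row is not overwritten.
print "@$_\n" for @rows;
print "distinct arrays\n" if $rows[0] != $rows[1];
# prints:
# E728 0530
# AC00 A530
# distinct arrays
```

If @line were declared once outside the loop, every element of @rows would refer to that same array, and the copy made by "[ @line ]" would be needed after all.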
 
Anno Siegel

Gunnar Hjalmarsson said:
Gunnar said:
my @hex;
while (<DATA>) {
s/:\d{8}//;
my @line = map { s/(..)(..)/$2$1/; $_ } /(\w{4})/g;
push @hex, [ @line ];
}
print "@{$hex[0]}\n" for 0..$#hex;

Correction:

print "@{$hex[$_]}\n" for 0..$#hex;

Correction:

print "@$_\n" for @hex;

Anno
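
To compare the two print statements side by side, here is a rough sketch with a small hand-built @hex (the two rows are just sample values). It collects what each form would print into a string.

```perl
use strict;
use warnings;

my @hex = ( [qw(28E7 3005)], [qw(00AC 30A5)] );

# Index-based form: $_ is a row number, used to look up the row.
my $by_index = '';
$by_index .= "@{$hex[$_]}\n" for 0 .. $#hex;

# Direct form: $_ is the row reference itself.
my $by_ref = '';
$by_ref .= "@$_\n" for @hex;

print $by_ref;
print "same output\n" if $by_index eq $by_ref;
# prints:
# 28E7 3005
# 00AC 30A5
# same output
```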
 
Anno Siegel

Gunnar Hjalmarsson said:
Anno said:
Gunnar said:
Correction:

print "@{$hex[$_]}\n" for 0..$#hex;

Correction:

print "@$_\n" for @hex;

Err.. Better, yes, but both are *correct*, right?

Well, yes. I'd still say it should be replaced, even though it works. When
I first read it, I didn't notice exactly what it does. Instead, I assumed
that @hex contained something that takes some doing to print.

That's the problem with all code that does more than needs to be done.
Even if it doesn't hurt, it leads the reader to false conclusions.

Anno
 
gnari

Anno Siegel said:
Gunnar Hjalmarsson said:
my @hex;
while (<DATA>) {
s/:\d{8}//;
my @line = map { s/(..)(..)/$2$1/; $_ } /(\w{4})/g;
push @hex, [ @line ];

"[ @line ]" makes a needless copy here. Since @line is declared inside the
loop, "\ @line" is sufficient.

or drop the variable
push @hex, [ map { s/(..)(..)/$2$1/; $_ } /(\w{4})/g ];


gnari
 
Gunnar Hjalmarsson

gnari said:
Anno said:
Gunnar said:
my @hex;
while (<DATA>) {
s/:\d{8}//;
my @line = map { s/(..)(..)/$2$1/; $_ } /(\w{4})/g;
push @hex, [ @line ];

"[ @line ]" makes a needless copy here. Since @line is declared
inside the loop, "\ @line" is sufficient.

or drop the variable
push @hex, [ map { s/(..)(..)/$2$1/; $_ } /(\w{4})/g ];

I thought of that, and dropped the idea for readability reasons. But
now when I look at it, it's not *that* terrible. Or is it? ;-)
 
Anno Siegel

Gunnar Hjalmarsson said:
gnari said:
Anno said:
Gunnar said:
my @hex;
while (<DATA>) {
s/:\d{8}//;
my @line = map { s/(..)(..)/$2$1/; $_ } /(\w{4})/g;
push @hex, [ @line ];

"[ @line ]" makes a needless copy here. Since @line is declared
inside the loop, "\ @line" is sufficient.

or drop the variable
push @hex, [ map { s/(..)(..)/$2$1/; $_ } /(\w{4})/g ];

I thought of that, and dropped the idea for readability reasons. But
now when I look at it, it's not *that* terrible. Or is it? ;-)

I think it's fine. The individual components are easy to understand.
It does take some Perl skills to see what the individual components
*are*, but then, it's Perl code.

One perceived problem with this kind of code is that the left-to-right
sequence doesn't represent the sequence of events when the code runs.
Obviously, "/(\w{4})/g" in list context must happen first, but it's almost
last.

The sequence of events is not necessarily the best way to understand the
code. At least in this case, reading left to right, with a few glances
(one, really) to the end of the statement, gives you what you need to know
in a reasonable sequence.

Another point about compact code is that, while it takes more time to read,
it also carries more information per line than longer code. It can't be
expected to read as fast.

Anno
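
As a rough illustration of the ordering point, the sketch below runs the compact statement and then the same work written out in the order it happens at run time. The input string and the variable names are made up for the example.

```perl
use strict;
use warnings;

$_ = "E7280530AC00";

# Compact form: written left to right, but the match at the far right runs first.
my $compact = [ map { s/(..)(..)/$2$1/; $_ } /(\w{4})/g ];

# The same work, one step per statement, in run-time order.
my @words   = /(\w{4})/g;                               # 1. match in list context
my @swapped = map { s/(..)(..)/$2$1/; $_ } @words;      # 2. swap the bytes of each word
                                                        #    (this also changes @words, since $_ is an alias)
my $stepwise = [ @swapped ];                            # 3. build the anonymous array

print "@$compact\n";
print "same result\n" if "@$compact" eq "@$stepwise";
# prints:
# 28E7 3005 00AC
# same result
```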
 
