ordered hashes

Julian · Dec 9, 2008

Hi,
Is there a best practice to implement an "ordered hash" in Perl?

For example I need to manipulate a csv file: each line is a record with
a fixed number of fields.

Here are some solutions but I don't find them very elegant:

Solution 1:
Two arrays: the first array holds the keys, the second array holds the
values.

my @arrkeys = ("first name", "age", "food", ...);
my @arrvalues = ("joe", 39, "corn flakes", ...);

Solution 2:
An array of hashes:
my @orderedhash = (
{ FieldName => "first name", Value => "Joe" },
{ FieldName => "age", Value => 39 },
...
);

Solution 3:
An array to keep the order of the fields, and then a hash.
my @fieldnames = ( "first name", "age", "food", ... );
my %h = ( "first name" => "joe", "age" => 39, ... );

TIA

david · Dec 9, 2008

Why do you need an ordered hash?
What are you trying to achieve in the program?

Martien Verbruggen · Dec 9, 2008

Julian said:
Is there a best practice to implement an "ordered hash" in Perl?

Tie::IxHash is available from CPAN.

[snip CSV example]

I've done things like this in the past, but have always opted to use an
array with the order of the keys, rather than using an ordered hash.

YMMV

Martien

Julian · Dec 9, 2008

bugbear wrote:

That's a 2D array.

Yes, the file is a 2D array. But I'm processing one line at a time (the
files I'm processing can have hundreds of thousands of records).
A line is a 1D array.
But for example the 24th field is the social security number, the 25th
field is the wage, and so on.
In my program I don't want to write $arr[23] = ..., $arr[24]=...
A hash of some kind is more appropriate:
$current_record{SSNumber} = ... , $current_record{Wage} = ...

Julian · Dec 9, 2008

david wrote:

why do you need an ordered hash ?
What are you trying to achieve in the program

It's a generic question. I often manipulate CSV files or tab-separated
files, where each field has a specific meaning. When I read or write a
line, I need an array. But when I work on the values of the fields I
prefer to work on a variable whose name has a meaning rather than on the
nth element of the array.

Jürgen Exner · Dec 9, 2008

Julian said:
bugbear wrote:

yes, the file is a 2D array. But I'm processing one line at a time (the
files I'm processing can have hundreds of thousands records).
A line is a 1D array.

Then I misunderstand your initial question, too.

But for example the 24th field is the social security number, the 25th
field is the wage, and so on.

Why would you want to order SSN and wage? With extremely few exceptions
the SSN will always be larger than the wage.

In my program I don't want to write $arr[23] = ..., $arr[24]=...
A hash of some kind is more appropriate:
$current_record{SSNumber} = ... , $current_record{Wage} = ...

Ok, then, why don't you do it? Where is the problem?

jue

david · Dec 9, 2008

bugbear wrote:

That's a 2D array.

yes, the file is a 2D array. But I'm processing one line at a time (the
files I'm processing can have hundreds of thousands records).
A line is a 1D array.
But for example the 24th field is the social security number, the 25th
field is the wage, and so on.
In my program I don't want to write $arr[23] = ..., $arr[24]=...
A hash of some kind is more appropriate:
$current_record{SSNumber} = ... , $current_record{Wage} = ...

You are trying to make something like an enum.

Maybe try this:

use constant {
    # ... indices of the other fields ...
    SSNumber => 23,
    Wage     => 24,
};

and then

$current_record[SSNumber] = ... , $current_record[Wage] = ...
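
As a rough, self-contained sketch of the constant-as-index idea above: the five field names, their positions, and the sample line are invented for illustration and are not the poster's real layout.

```perl
use strict;
use warnings;

# Hypothetical field positions for a short, five-column record.
use constant {
    FirstName => 0,
    LastName  => 1,
    City      => 2,
    SSNumber  => 3,
    Wage      => 4,
};

# Made-up sample line; a plain split is enough for this simple data.
my @current_record = split /,/, 'Joe,Smith,Paris,123-45-6789,2500';

# The constants index the array directly, so no hash is built per line.
$current_record[Wage] += 100;

# Array slices accept the constants as well.
my ( $ssn, $wage ) = @current_record[ SSNumber, Wage ];

print join( ',', @current_record ), "\n";
```

The record stays a plain array, so writing it back out needs no reordering. One thing to watch: inside hash braces a bare word such as `Wage` is treated as a string, so this trick is specific to array subscripts.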

Ron Bergin · Dec 9, 2008

bugbear wrote:

yes, the file is a 2D array. But I'm processing one line at a time (the
files I'm processing can have hundreds of thousands records).

Sounds to me that a database would be far more appropriate and more
efficient than a flat csv file.

Charlton Wilbur · Dec 9, 2008

J> It's a generic question. I often manipulate csv files or
J> tab-separated files, where each field has some semantic. When I
J> read or write a line, I need an array. But when I work on the
J> values of the fields I prefer to work on a variable whose name
J> has a meaning rather than on n'th element of the array.

my @field_labels = qw/date time transaction_id memo amount quantity total/;

# ...

my %item;
@item{@field_labels} = @fields_from_csv;

# ...
# do your processing on the values of %item
# ...

my @fields_to_output = @item{@field_labels};

The trick is to realize that you probably don't need an ordered hash for
most of the processing you want to do -- you just need to read the line
from the CSV in a specified order, to match the keys correctly, and you
need to output the line from the CSV in a specified order, to match what
the file format expects.

Charlton
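
For anyone who wants to run the hash-slice approach above end to end, here is a minimal sketch. The sample line and the `total` calculation are assumptions made up for the demonstration, and the bare `split` only copes with simple data (no quoted or embedded commas).

```perl
use strict;
use warnings;

my @field_labels = qw/date time transaction_id memo amount quantity total/;

# Made-up sample line; real-world CSV with quoting needs a proper parser.
my $line = '2008-12-09,10:15,T42,widgets,2.50,4,0';

# A limit of -1 keeps trailing empty fields, so the column count is stable.
my @fields_from_csv = split /,/, $line, -1;

my %item;
@item{@field_labels} = @fields_from_csv;    # hash slice: labels => values

# Work with meaningful names instead of positions.
$item{total} = $item{amount} * $item{quantity};

# Slice back out in the original column order.
my @fields_to_output = @item{@field_labels};
my $out = join ',', @fields_to_output;
print "$out\n";
```

The ordering lives only in `@field_labels`; the hash itself never has to be ordered.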

xhoster · Dec 9, 2008

Julian said:
Hi,
Is there a best practice to implement an "ordered hash" in Perl?

For example I need to manipulate a csv file: each line is a record with
a fixed number of fields.

Have you looked into a module, like Data::Table, that encapsulates the
entire table functionality?

Here are some solutions but I don't find them very elegant:

Solution 1:
Two arrays: the first array holds the keys, the second array holds the
values.

my @arrkeys = ("first name", "age", "food", ...);
my @arrvalues = ("joe", 39, "corn flakes", ...);

There is no efficient way to look up by field name.

Solution 2:
An array of hashes:
my @orderedhash = (
{ FieldName => "first name", Value => "Joe" },
{ FieldName => "age", Value => 39 },
...
);

Again, there is no efficient way to look up by field name.

Solution 3:
An array to keep the order of the field, and then a hash.
my @fieldnames = ( "first name", "age", "food",...);
my %h = ( "first name" => "joe", "age" => 39, ... );

This is how I would do it, if for some reason I wouldn't use
a CPAN module in the first place. @fieldnames would probably only need
to be used during input and output, not the main processing body.

Xho


Jim Gibson · Dec 9, 2008

See 'perldoc -q keep' "How can I always keep my hash sorted?"

Answer:

Use a database (e.g., DB_File) or the Tie::IxHash module.

Uri Guttman · Dec 9, 2008

MV> I've done things like this in the past, but have always opted to use an
MV> array with the order of the keys, rather than using an ordered hash.

i wrote this about a week ago.

chomp( my $header = <> ) ;
my @col_names = split( /\t/, $header ) ;

my %name2col ;
@name2col{ @col_names } = 0 .. $#col_names ;

my @wanted_names = qw(
NAME SADDR1 SADDR2 SADDR3 SADDR4 SADDR5 PHONE1 CUSTFLD3 CUSTFLD4 NOTEPAD ) ;

while( <> ) {

    chomp ;
    my %fields ;

    @fields{@wanted_names} =
        (split( /\t/, $_ ))[ @name2col{ @wanted_names } ] ;

    # ... process %fields here ...
}

the last statement is a favorite of mine. triple slicing. but only later
did i notice that one of the slices was constant and could be factored
out of the loop.

show me a better version line in any other lang.

uri
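
A runnable sketch of the same idea with the constant slice factored out of the loop, as mentioned above. The column names, the sample data, the in-memory filehandle, and the `@wanted_cols` and `@records` variables are assumptions added for illustration; they are not part of the original script.

```perl
use strict;
use warnings;

# Made-up tab-separated input: a header line, then two records.
my $data = join '',
    "NAME\tCITY\tPHONE1\tNOTEPAD\n",
    "Joe\tParis\t555-0100\tlikes corn flakes\n",
    "Ann\tLyon\t555-0199\tnone\n";
open my $in, '<', \$data or die "cannot open in-memory file: $!";

chomp( my $header = <$in> );
my @col_names = split /\t/, $header;

# Map each column name to its position in the line.
my %name2col;
@name2col{@col_names} = 0 .. $#col_names;

my @wanted_names = qw( NAME PHONE1 NOTEPAD );

# This slice never changes, so it is computed once, outside the loop.
my @wanted_cols = @name2col{@wanted_names};

my @records;
while ( my $line = <$in> ) {
    chomp $line;
    my %fields;
    # List slice picks the wanted columns; hash slice names them.
    @fields{@wanted_names} = ( split /\t/, $line, -1 )[@wanted_cols];
    push @records, \%fields;
}

print "$_->{NAME}: $_->{PHONE1}\n" for @records;
```

Because the header line drives `%name2col`, the script keeps working if the columns are reordered in the input file.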

Julian · Dec 11, 2008

Thanks folks.
I'll use Charlton's solution.

sln · Dec 11, 2008

J> It's a generic question. I often manipulate csv files or
J> tab-separated files, where each field has some semantic. When I
J> read or write a line, I need an array. But when I work on the
J> values of the fields I prefer to work on a variable whose name
J> has a meaning rather than on n'th element of the array.

my @field_labels = qw/date time transaction_id memo amount quantity total/;

# ...

my %item;
@item{@field_labels} = @fields_from_csv;

If a hash reference were used instead, would this be how it is written?

my $refitem = {};   # add stuff
@{$refitem}{@field_labels} = @fields_from_csv;

sln
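
On the hash reference question above: as far as I know, that slice syntax is valid Perl. A small sketch to check it, with made-up field values:

```perl
use strict;
use warnings;

my @field_labels    = qw/date time transaction_id memo amount quantity total/;
# Made-up values, one per label.
my @fields_from_csv = ( '2008-12-11', '09:30', 'T7', 'sample', 1.25, 2, 2.5 );

my $refitem = {};    # empty anonymous hash

# Hash slice through the reference: the block form wraps the reference.
@{$refitem}{@field_labels} = @fields_from_csv;

# For a simple scalar reference the inner braces are optional.
my @pair = @$refitem{qw/amount quantity/};

# A single element uses the arrow form.
my $memo = $refitem->{memo};

print "$memo: @pair\n";
```

The leading `@` marks the expression as a slice, and the curly subscript marks it as a hash slice, exactly as with a named hash.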
