How to update entries in a file

John

Hi,

I have a file entry in the following format:
<Employee number="111222">
<Department>Sales</Department>
<Surname>Jones</Surname>
<Name>Tom</Name>
</Employee number>


Then, I'd like to update another file with the above entry. What I need
though is to first check whether this entry already exists in the file to be
updated.
If the entry does not exist then I just append the above record to the file.
But if the entry exists then I need to delete the old entry first and
replace it with the new one [by just appending it to the file's end - no
need to put it back in the same spot].

I recognise that I will probably need to read the old file into a temporary
one, do my editing there and save the results to a new file. The new file
will then be renamed to the old file. Is each line in the file entry meant
to become a hash element?

Can someone give me some pointers as to where I should start looking?

Thanks.
 
Roland Reichenberg

Hi John,
I have a file entry in the following format:
<Employee number="111222">
<Department>Sales</Department>
<Surname>Jones</Surname>
<Name>Tom</Name>
</Employee number>

This looks like XML-formatted data.

John said:
Can someone give me some pointers as to where I should start looking?

Have you tried one of the Perl XML modules?
They may include the methods that you need.
Look at www.cpan.org for more information about Perl and XML.

Regards,

Roland
 
John

Roland Reichenberg said:
Hi John,


This looks like XML-formatted data


Have you tried the perl xml module?
Maybe this module includes the methods that you need.
Look at www.cpan.org for more information about perl and xml.

Regards,

Roland

Roland,
Yes it is meant to be XML formatted. Unfortunately, I forgot to mention that
I am unable to use any of the CPAN modules. :(
Sorry.
 
Eric J. Roode

Yes it is meant to be XML formatted. Unfortunately, I forgot to
mention that I am unable to use any of the CPAN modules. :(
Sorry.

Why on earth not? Is this a homework assignment?

--
Eric
$_ = reverse sort $ /. r , qw p ekca lre uJ reh
ts p , map $ _. $ " , qw e p h tona e and print

 
Tad McClellan

I recognise that I will probably need to read the old file into a temporary
one, do my editing there and save the results to a new file. The new file
will then be renamed to the old file.


Perl can do all of that "administrative" work for you, see
the -i switch (perlrun.pod) and $^I (perlvar.pod).
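As a rough sketch of what that looks like from inside a script rather than on the command line: setting $^I and @ARGV makes the <> operator edit the named file in place, keeping a backup. The sub name delete_matching_lines and the '.bak' suffix are illustrative choices, not something from the original post.

```perl
use strict;
use warnings;

# Remove every line matching $pattern from $file, editing it in place.
# Perl writes the new contents and keeps the original as "$file.bak".
# (delete_matching_lines is a hypothetical helper name.)
sub delete_matching_lines {
    my ($file, $pattern) = @_;
    local ($^I, @ARGV) = ('.bak', $file);    # same effect as: perl -i.bak
    while (my $line = <>) {
        # While $^I is set, print writes to the replacement file.
        print $line unless $line =~ $pattern;
    }
    return;
}

# Example call (file name is hypothetical):
# delete_matching_lines('test.xml', qr/<Name>Tom<\/Name>/);
```

This only filters line by line; removing a whole multi-line record would need extra state to track whether the current line is inside the record being dropped.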

Is each line in the file entry meant
to become a hash element?


Hashes are unordered, but the order of lines matters, so a hash
is not the Right Tool, it will break your data...
 
John

Tad McClellan said:

It is homework related so we need to do it the hard way.

BTW, I am pursuing the following option:
*******************************
....
sub fileupdate {
open DATA, ">>test.xml"; # append to file
foreach $line (@lines) {
print DATA $_; # print $_
}
print DATA @lines [0]; # print the first element in the array again
close DATA;
}
....
....
while <$addr> { # this is the input
@lines = <$addr>; # I'm putting it into an array
fileupdate(); # function call
}
....

Problems:
print DATA $_; <<< this prints the first element all the time
print DATA @lines [0]; <<< this prints the second element instead of the
first one


What am I missing?
 
John

Tad McClellan said:
Perl can do all of that "administrative" work for you, see
the -i switch (perlrun.pod) and $^I (perlvar.pod).




Hashes are unordered, but the order of lines matters, so a hash
is not the Right Tool, it will break your data...


Thanks Tad. See my other reply if you can.
 
Randal L. Schwartz

John> It is homework related so we need to do it the hard way.

Does your instructor know you are cheating?

You should hope I'm never the hiring manager at any future job to
which you apply. I'll walk you out the door so fast you won't know
what hit you.
 
Roy Johnson

John said:
Hi,

I have a file entry in the following format:
<Employee number="111222">
<Department>Sales</Department>
<Surname>Jones</Surname>
<Name>Tom</Name>
</Employee number>


Then, I'd like to update another file with the above entry. What I need
though is to first check whether this entry already exists in the file to be
updated.

Does that mean an entry with the same employee number (regardless of
other information) exists? If so, your key will be (only) the employee
number. The other fields can be stored in an anonymous hash, or array,
if the tags are consistent.

So start with a hash:
my %entry;
Parsing is up to you.

When you want to know if an entry exists, it's just
if ($entry{$emp_no}) {...
Although if your only reason for checking is to see whether you need
to make a new entry, that's automatic. Just write all the new entries,
and they'll overwrite any older versions of themselves.

When you're ready to assign your tags, it might look like:
$entry{$emp_no} = [$dept, $surname, $name];
or
$entry{$emp_no} = { department => $dept,
surname => $surname,
name => $name
};
or even (if the tags are variable)
$entry{$emp_no} = { %record_hash };
where you've parsed a single record's tag names as keys and the
contents as values into %record_hash.
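To make the idea concrete, here is a minimal sketch that fills such a hash from text in the sample layout. It assumes every record looks exactly like the example in the question (including the unusual `</Employee number>` closing line) and relies on pattern matching, not real XML parsing. The sub name parse_records is made up for illustration.

```perl
use strict;
use warnings;

# Parse records in the thread's sample layout into a hash keyed by
# employee number. Pattern matching only; this is not an XML parser.
# (parse_records is a hypothetical helper name.)
sub parse_records {
    my ($text) = @_;
    my %entry;
    while ($text =~ m{<Employee\s+number="(\d+)">(.*?)</Employee[^>]*>}gs) {
        my ($emp_no, $body) = ($1, $2);
        my %record_hash;
        while ($body =~ m{<(\w+)>([^<]*)</\1>}g) {
            $record_hash{$1} = $2;           # tag name => contents
        }
        $entry{$emp_no} = {%record_hash};    # a later record replaces an earlier one
    }
    return \%entry;
}

my $entry = parse_records(<<'END');
<Employee number="111222">
<Department>Sales</Department>
<Surname>Jones</Surname>
<Name>Tom</Name>
</Employee number>
<Employee number="111222">
<Department>Marketing</Department>
<Surname>Jones</Surname>
<Name>Tom</Name>
</Employee number>
END

print "found\n" if exists $entry->{111222};
print $entry->{111222}{Department}, "\n";    # prints "Marketing"
```

Because the second record has the same key, it silently replaces the first, which is the "automatic overwrite" behaviour described above. Note that the hash does not remember the order in which records appeared in the file.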
 
James Willmore

It is homework related so we need to do it the hard way.

Define "the hard way"? IMHO, using modules is the smart way to code
:)

<snipped code that is "homework">

And the ground rules your instructor laid out included getting help
off a newsgroup? Asking for help is one thing, but having someone else
solve the issues you're having is something else. More importantly,
did you read the FAQ for this group? You may find an answer there
(like, how to debug your own code).

I'll give you some tips that will help you solve your own issues ...
1) put '-w' on your first line
2) 'use strict'
3) 'use diagnostics' - since you are new to Perl, this will further
aid you in solving your own issues
4) read the documentation - more specifically, learn how to use
'perldoc'; type 'perldoc perldoc' for more info.
5) if your instructor(s) think Perl can _only_ be used through the
CGI, email me. I have run into _many_ persons who think that Perl can
_only_ be run through a web server. I have proof positive that it
_can_ run in other mediums as well (such as the command line - hint,
hint).

HTH

--
Jim

Copyright notice: all code written by the author in this post is
released under the GPL. http://www.gnu.org/licenses/gpl.txt
for more information.

a fortune quote ...
In Pocataligo, Georgia, it is a violation for a woman over 200
pounds and attired in shorts to pilot or ride in an airplane.
 
Roy Johnson

I think it's great that Perl is being used for homework. Please excuse
my cheeky comments below. Sometimes I think I'm funny. I will try to
include some helpful ones as well.

John said:
foreach $line (@lines) {
print DATA $_; # print $_

That comment may be a little obvious. You can print the whole array
with
print DATA @lines;

print DATA @lines [0]; # print the first element in the array

Technically, you're printing an array slice. To print an element, use
scalar notation. (Effectively, they're the same, here, but you should
get used to knowing which context you really want.)
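A small sketch of both points, assuming a three-line array. It collects output in a string (the variable $out is only there for illustration) so the result is easy to inspect:

```perl
use strict;
use warnings;

my @lines = ("line 1\n", "line 2\n", "line 3\n");
my $out   = '';

# Name the loop variable and use that same variable in the body;
# the original loop declared $line but printed $_.
foreach my $line (@lines) {
    $out .= $line;
}

# $lines[0] is a single element (scalar notation);
# @lines[0] would be a one-element array slice.
$out .= $lines[0];

print $out;
```

With warnings enabled, Perl flags the slice form with a "Scalar value @lines[0] better written as $lines[0]" message, which is one reason to keep warnings on while developing.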
while <$addr> { # this is the input
@lines = <$addr>; # I'm putting it into an array
fileupdate(); # function call

This is Not Good. Your while loop is reading line-by-line, but your
array read slurps the entire rest of the file. Figure out which way
you want to read it. Do you want to call fileupdate() for every line?
Or once for the entire file?
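To see the effect in isolation, here is a sketch that reads from an in-memory filehandle standing in for the socket (an assumption made so the example runs without a client):

```perl
use strict;
use warnings;

my $input = "line 1\nline 2\nline 3\n";

# Mixed reading: the while condition consumes line 1,
# then the array read takes everything that is left.
open my $fh, '<', \$input or die "cannot open in-memory file: $!";
my @lines;
while (my $first = <$fh>) {
    @lines = <$fh>;
}
close $fh;
print scalar(@lines), " lines captured\n";    # line 1 is not among them

# Consistent reading: one list-context read, no loop.
open $fh, '<', \$input or die "cannot open in-memory file: $!";
my @all = <$fh>;
close $fh;
print scalar(@all), " lines captured\n";
```

The first version ends up with two lines, the second with all three. Either read line by line inside the while loop, or slurp once outside it, but not both.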

Good luck,
Roy
 
Bart Lateur

John said:
Yes it is meant to be XML formatted. Unfortunately, I forgot to mention that
I am unable to use any of the CPAN modules. :(

You may just as well forget about getting any help, then. We're not
going to reinvent such huge wheels just because you can't use modules.

My advice would have been to look at AnyData, or DBI+DBD::AnyData, plus
AnyData::Format::XML, to directly manipulate XML as if it was a
database. (I've not tried it myself, though.)

p.s. At least look at the XML::SAX::PurePerl stuff, as it doesn't
require compilation/installation. At the very least you could steal some
ideas. It's the rough route, though.
 
Tassilo v. Parseval

Also sprach Randal L. Schwartz:
John> It is homework related so we need to do it the hard way.

Does your instructor know you are cheating?

Maybe he isn't cheating. Finding out where to get help might well be
within the specifications of the exercise. Actually and when it comes
to programming, it's the single most important thing you can learn:
knowing where to get the required information. This might be either
through documentation, tutorials, examples or even newsgroups.

At least this was how it was handled (and probably still is) in my
university.
You should hope I'm never the hiring manager at any future job to
which you apply. I'll walk you out the door so fast you won't know
what hit you.

Ohoo. And I was under the impression that those employees who get the
job done are the most valuable ones (from a shareholder-point-of-view).
Even when they cheat. Alas, I cheat when it solves a problem for me. :)

Tassilo
 
Eric Bohlman

It is homework related so we need to do it the hard way.

Somebody needs to tell your instructor that parsing XML is *not* a trivial
task and that giving students assignments in which they have to come up
with ways to sort-of-parse it will merely teach them very bad habits. Most
of what you and your fellow students are going to learn from this
assignment will be stuff that you'll eventually have to *unlearn*.
 
Tad McClellan

John said:
open DATA, ">>test.xml"; # append to file


You should always, yes *always*, check the return value from open():

open DATA, '>>test.xml' or die "could not open 'test.xml' $!";
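A slightly more defensive variant, offered as a sketch rather than the one true way: three-argument open with a lexical filehandle (available since Perl 5.6), checking print and close as well. The sub name append_lines is invented for this example. Using a lexical handle also avoids reusing the name DATA, which Perl reserves for reading the __DATA__ section of a script.

```perl
use strict;
use warnings;

# Append @lines to $file, dying with the OS error if any step fails.
# (append_lines is a hypothetical helper name.)
sub append_lines {
    my ($file, @lines) = @_;
    open my $fh, '>>', $file
        or die "could not open '$file' for appending: $!";
    print {$fh} @lines
        or die "could not write to '$file': $!";
    close $fh
        or die "could not close '$file': $!";
    return;
}

# Example call (file name taken from the thread):
# append_lines('test.xml', @lines);
```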

print DATA @lines [0]; # print the first element in the array


You should enable warnings when developing Perl code!

while <$addr> { # this is the input


That is not Perl code. You are wasting the time of *thousands* of
people around the world because you cannot be troubled to
provide actual code!

That's it. You've used up all your coupons.

So long.
 
John

Roy Johnson said:
John said:
Hi,

I have a file entry in the following format:
<Employee number="111222">
<Department>Sales</Department>
<Surname>Jones</Surname>
<Name>Tom</Name>
</Employee number>


Then, I'd like to update another file with the above entry. What I need
though is to first check whether this entry already exists in the file to be
updated.

Does that mean an entry with the same employee number (regardless of
other information) exists? If so, your key will be (only) the employee
number. The other fields can be stored in an anonymous hash, or array,
if the tags are consistent.

So start with a hash:
my %entry;
Parsing is up to you.

When you want to know if an entry exists, it's just
if ($entry{$emp_no}) {...
Although if your only reason for checking is to see whether you need
to make a new entry, that's automatic. Just write all the new entries,
and they'll overwrite any older versions of themselves.

When you're ready to assign your tags, it might look like:
$entry{$emp_no} = [$dept, $surname, $name];
or
$entry{$emp_no} = { department => $dept,
surname => $surname,
name => $name
};
or even (if the tags are variable)
$entry{$emp_no} = { %record_hash };
where you've parsed a single record's tag names as keys and the
contents as values into %record_hash.

Just to confirm:
FILE_A [this is the file containing the update info] will only have one
record at a time.
Each record is composed of 5 lines. Each record has identical tags as listed
previously.
Before one can update FILE_B [this is the file containing all records], one
needs to check
whether a record with a given EmployeeNumber already exists.
If no, then we just append the record to the end of FILE_B.
If yes, then we delete the old record in FILE_B and then we append the
record to the end of FILE_B.
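
Those steps could be sketched roughly as follows, without any CPAN modules. This assumes records always follow the five-line layout shown earlier and that the new record ends with a newline. It matches text patterns and does not parse XML properly. The sub name update_records and the ".tmp" suffix are illustrative only.

```perl
use strict;
use warnings;

# Replace-or-append: keep every record in $file_b except the one whose
# employee number matches, append $new_record, then rename the temporary
# file over $file_b. (update_records is a hypothetical helper name.)
sub update_records {
    my ($file_b, $new_record) = @_;
    my ($emp_no) = $new_record =~ /<Employee\s+number="(\d+)">/
        or die "no employee number in new record";

    my $old = '';
    if (-e $file_b) {
        open my $in, '<', $file_b or die "could not open '$file_b': $!";
        local $/;                            # slurp mode for this block
        $old = <$in>;
        $old = '' unless defined $old;
        close $in;
    }

    # Drop any existing record for this employee number.
    $old =~ s{<Employee\s+number="\Q$emp_no\E">.*?</Employee[^>]*>\n?}{}gs;

    my $tmp = "$file_b.tmp";                 # assumed temp-file name
    open my $out, '>', $tmp or die "could not open '$tmp': $!";
    print {$out} $old, $new_record or die "could not write '$tmp': $!";
    close $out or die "could not close '$tmp': $!";
    rename $tmp, $file_b or die "could not rename '$tmp' to '$file_b': $!";
    return;
}
```

Because the old record is dropped before the new one is written, the "does it already exist?" check and the "append" step collapse into one code path: the new record always ends up last in FILE_B, whether or not an older version was present.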

Thanks very much Roy. Will try. :)
 
John

Tad McClellan said:
John said:
open DATA, ">>test.xml"; # append to file


You should always, yes *always*, check the return value from open():

open DATA, '>>test.xml' or die "could not open 'test.xml' $!";

print DATA @lines [0]; # print the first element in the array


You should enable warnings when developing Perl code!

while <$addr> { # this is the input


That is not Perl code. You are wasting the time of *thousands* of
people around the world because you cannot be troubled to
provide actual code!

That's it. You've used up all your coupons.

So long.


My apologies for not being able to reply to everyone. This would just make
this thread into a spaghetti. This is not a direct reply to Tad's last post
but is intended to cover the last few posts from all of you. Whilst I may
not agree with everyone's opinions I respect your views.

Just to confirm, our instructor did not ban anyone from using newsgroups. We
were given a task and are supposed to complete it. How we go about doing it
is up to us. There was no list of resources that we should or should not
refer to.

For those of us who write books, not everything is covered in them and hence
there exists a need for forums, newsgroups, mailing lists etc. Furthermore,
such resources are best utilised when a specific problem or need arises.
Everyone learns in different ways. Some may only need to read a description
of what a car is to be able to understand its purpose. Others may need to
actually drive one. Some ppl read a book from start to finish, others skip
around to parts of interest or on a need to know basis.

I am very much against plagiarism and would not expect anyone to give me an
answer. Consequently, I had no problem saying that I'm working on a homework
related task. Mind you, this is but a component of it and I have tried to
word it in such a way as not to give myself an outright answer. I have in
the past read some of your columns Randal [not to mention the books]. I just
didn't take your accusations too well. Oh, and I doubt I'll be
applying for work where you are - it's not my area of interest. :)

So pls feel free to flame me when my search for clues is wasting your time
cause the code examples I provide are irrelevant but spare yourself from
judging the way in which I do my search. There will be those who support the
Monarchy and others who support the Republic. I'm probably somewhere in the
middle.

Anyway, that's just my opinion, I hope I have not offended anyone.

Back to the code for those who are still interested.
I am only including the below code examples as they seem to be the only ones
which are relevant [or maybe that's just my poor newbie judgement].

The below is done using sockets and I have the server and the client working
without a prob. The client passes the correct data to the server and the
server writes it to a file [or most of the data]. So I'm a bit stumped as to
why it's not 100% correct.

Code extract - server script:
************
sub fileupdate {
open DATA, ">>test.xml" or die "Could not open test.xml $!\n"; # open file to append data to
print DATA @lines; # add the data stored in the array
close DATA; # close the file
}
....
....
while <$addr> { # reads input from client
print $_; # this prints to the screen the first line of the file passed by the client
@lines <$addr>; # I want this array to capture the whole file passed from the client
}
fileupdate (); # calling function
************

Q: Why isn't line 1 passed into the array?
print $_ echoes line 1 on the server side [so we know it's been received]
and 'cat FILE_B' only shows:
line 2
line 3



Everything was working fine when I only had this:
************
while <$addr> {
open DATA, ">>test.xml" or die "Could not open test.xml $!\n";
print DATA $_;
close DATA;
}
************

For example, 'cat FILE_B' would correctly show:
line 1
line 2
line 3

FILE_A passed by the client has
line 1
line 2
line 3
as its contents.


Does this make sense?
Thanks.
 
John

Bart Lateur said:
You may just as well forget about getting any help, then. We're not
going to reinvent such huge wheels just because you can't use modules.

My advice would have been to look at AnyData, or DBI+DBD::AnyData, plus
AnyData::Format::XML, to directly manipulate XML as if it was a
database. (I've not tried it myself, though.)

p.s. At least look at the XML::SAX::PurePerl stuff, as it doesn't
require compilation/installation. At the very least you could steal some
ideas. It's the rough route, though.

Thanks Bart. I am actually looking at XML::Parser right now. Will pursue the
ones you suggested as well.
 
Randal L. Schwartz

Tassilo> Maybe he isn't cheating. Finding out where to get help might well be
Tassilo> within the specifications of the exercise.

I'll believe that only if his instructor comes in here and
acknowledges it. Until then, I'm not holding my breath. :)

Tassilo> Actually and when it comes
Tassilo> to programming, it's the single most important thing you can learn:
Tassilo> knowing where to get the required information. This might be either
Tassilo> through documentation, tutorials, examples or even newsgroups.

But you should still understand the basics, and not run to get help at
the first sign of weakness.

Tassilo> Ohoo. And I was under the impression that those employees who
Tassilo> get the job done are the most valuable ones (from a
Tassilo> shareholder-point-of-view). Even when they cheat. Alas, I
Tassilo> cheat when it solves a problem for me. :)

I'm sorry, I hire people that can solve basic tasks on their own, and
use resources for the advanced tasks. This was not an advanced task.
 
