YAML & readlines & modify text files

Dan George

Hello,

Can anyone help me with the following issue:

I have a YAML file that looks like this:

---
gm: Name01
gs: N01
---
gm: Name02
gs: N02

In ruby I'm trying to read all *.txt files in the current folder and
all sub-folders. The text files look like this:

Name-line: Name01
Link-line: blabla.something.N01&=bla

Name-line: Name02
Link-line: blabla.something.N02&=bla

My ruby code looks like this:

require 'yaml'
lnk = 'blabla.something.'
op = '&=bla'

name = File.open('name.yaml')
yp = YAML::load_documents(name) do |name|
  txt_files = Dir.glob('**/*.txt').each do |path|
    file = File.open(path).readlines.each { |line|

      if line.match(/Link-line/)
        then line.gsub!(/Link-line.*/, 'Link-line: '+
                        lnk + name['gs'] + op)
      end
    }

    File.open(path, 'w'){|f| f.write file}
  end
end


My problem is that the code ends up using the last 'gs' value from the
YAML file in every *.txt file.

I want it to read the Name-line in each file and after that use the
appropriate 'gs' value from the YAML file in "line.gsub!(/Link-line.*/,
'Link-line: '+ lnk + name['gs'] + op)" with the Link-line field.

I've been trying to find a way for some time now but I just can't seem
to be able to do it and I'm starting to have headaches :) so if anyone
has any ideas or improvements or critiques please don't hesitate to
reply.

Cheers :)
 
Stefano Crocco

On Wednesday 19 September 2007, Dan George wrote:
I want it to read the Name-line in each file and after that use the
appropriate 'gs' value from the YAML file in "line.gsub!(/Link-line.*/,
'Link-line: '+ lnk + name['gs'] + op)" with the Link-line field.

I'm not at all sure I understand correctly what you want to do. I think you
want to replace the line under

Name-line: Name01

with some text containing a string taken from the Name01 entry in the yaml
file. Is it correct? If it isn't, then please explain better what you mean.
Otherwise, read on.

In my opinion, you're storing data in the YAML file in the wrong way, because,
at each iteration, name contains only one pair of name/replacement string,
which forces you to iterate over all the files for each document in the YAML
file (and also makes the replacing code more complicated). I think your YAML
file should contain a single hash, with the names as keys and the replacement
strings as values:

---
Name01: N01
Name02: N02
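
If rewriting name.yaml by hand is not convenient, the same name-to-code hash can probably be built from the original multi-document layout. The following is only a sketch: build_lookup is a hypothetical helper, the YAML text is inlined so that it runs without a file on disk, and it relies on YAML.load_stream from the Psych-based yaml library in current Ruby versions rather than the load_documents call used in the question.

```ruby
require 'yaml'

# The multi-document layout from the question, inlined for the sketch.
MULTI_DOC = <<~YAML
  ---
  gm: Name01
  gs: N01
  ---
  gm: Name02
  gs: N02
YAML

# Hypothetical helper: collapse every document into one name => code hash.
def build_lookup(yaml_text)
  YAML.load_stream(yaml_text).each_with_object({}) do |doc, lookup|
    lookup[doc['gm']] = doc['gs']
  end
end

lookup = build_lookup(MULTI_DOC)
puts lookup['Name02'] # prints N02
```

With a hash like this, each text file only needs one pass: find its name, then look the code up directly.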

Then, you can do the following (untested):

require 'yaml'
lnk = 'blabla.something.'
op = '&=bla'
hash = File.open('name.yaml'){|f| YAML.load f}
Dir.glob('**/*.txt').each do |path|
  lines = File.readlines(path)
  lines.each_with_index do |line, i|
    if line.match(/Link-line/)
      match = lines[i-1].match(/Name-line:\s(.*)$/)[1]
      line.gsub!(/Link-line.*/, 'Link-line: '+ lnk+hash[match[1]]+
                 op) if match and hash.has_key?(match[1])
    end
  end
  File.open(path,'w'){|f| f.write lines}
end

When iterating on the lines, the block is passed not only the line, but also
the line number. This way, when you meet a Link-line line, you can access the
corresponding name-line using its index. It then matches the previous line
with a regexp to extract the name from it and stores the result in the match
variable. If match is not nil (i.e. if the name line had the expected format)
and the name is included in hash, the replacement is performed (of course,
you can skip this test if you're confident enough in the format of the files
and in the contents of the yaml file).
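
To make the index-based lookup concrete, here is a small sketch that runs on in-memory data instead of real files. The sample lines and the lookup hash are made up for illustration, and the prefix and suffix are the placeholder strings from the question. It keeps the MatchData object in match and only reads match[1] after checking that a match exists.

```ruby
# Hypothetical in-memory stand-ins for one .txt file and for name.yaml.
lines = <<~TXT.lines
  Name-line: Name01
  Link-line: blabla.something.OLD&=bla
TXT
lookup = { 'Name01' => 'N01' }

lines.each_with_index do |line, i|
  next unless line.match(/Link-line/)
  next if i.zero? # lines[-1] would wrap around to the last line

  match = lines[i - 1].match(/Name-line:\s(.*)$/) # MatchData, or nil
  next unless match && lookup.key?(match[1])

  line.gsub!(/Link-line.*/, "Link-line: blabla.something.#{lookup[match[1]]}&=bla")
end

puts lines
```

The extra check for index 0 is there because a negative index counts from the end of the array, so a Link-line on the first line would otherwise be compared against the last line of the file.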

I hope this helps

Stefano
 
Dan George

You are correct, that is what I want!

I tried what you suggested but I get some weird errors:

new.rb:45: undefined method `[]' for nil:NilClass (NoMethodError)
from D:/ruby/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:27:in `each_with_index'
from new.rb:43:in `each'
from new.rb:43:in `each_with_index'
from new.rb:43
from new.rb:41:in `each'
from new.rb:41

Line 45 is: match = lines[i-1].match(/Name-line:\s(.*)$/)[1]
Line 43 is: lines.each_with_index do |line, i|
Line 41 is: Dir.glob('**/*.jad').each do |path|

I don't get it, as far as I can tell the code is ok...
 
Stefano Crocco

I think it's because of a mistake in my code: the [1] part of line 45
shouldn't be there (it's a leftover from a previous version of the code).
If I'm right, the string doesn't match the regexp, so lines[i-1].match(...)
returns nil, which doesn't have a [] method, leading to the error you get.
Avoiding a call to nil.[] is the reason for the conditional at the end of the
following line (note that the conditional checks that match is not nil before
trying to extract an element from it), but this is useless if I call [] on
the line before. Removing that [1] from line 45 should solve your problem.

However, if you ran into this mistake, it may mean that there's something amiss
in either your .txt files or your yaml file (see the end of my previous
post).
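
The failure mode is easy to reproduce in isolation. The sketch below uses two made-up sample lines: one that matches the regexp and one that does not. It also shows String#[] with a regexp and a group index, which returns nil instead of raising when nothing matches; that is an alternative to the explicit nil check, not what the code in this thread uses.

```ruby
NAME_RE = /Name-line:\s(.*)$/

good = "Name-line: Name01\n"
bad  = "CATEG: CITY=0;STATE=0;\n"

# String#match returns nil when nothing matches, so chaining [1] raises.
error_class = nil
begin
  bad.match(NAME_RE)[1]
rescue NoMethodError => e
  error_class = e.class
end

# String#[] with a regexp and a group index returns nil rather than raising.
name_from_good = good[NAME_RE, 1]
name_from_bad  = bad[NAME_RE, 1]

p error_class, name_from_good, name_from_bad
```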

Stefano
 
Dan George

Stefano thanks for your replies so far.

That did it. The code executes ok, but
"match = lines[i-1].match(/Name-line:\s(.*)$/)" always returns nil (I did
"puts match" after it) and I don't understand why. I checked my txt files
and my yaml file and they are ok.

txt file has:

Name-line: New York
Link-line: etc

and YAML file has:

---
New York: NY

So the value from the Name is equal to the one in the YAML file. Am I
still missing something obvious?

Sorry if some things seem so obvious that I should understand them, but
I'm just a beginner. I started a couple of weeks back and the only way
for me to learn is from examples that I try myself. It annoys me to
see that something simple gives me so much trouble, but I don't want to
quit either :).
 
Stefano Crocco

There's nothing obvious in the problem you're having (at least, not obvious
to me). If, as you say, match is nil, it means that the trouble is outside
the yaml file (which is only used in the following line). So, either the data
doesn't have the expected format or the regexp isn't doing what I think it
should. Yet, trying in irb, the regexp matched the line you posted. I'm at a
loss here. The best suggestion I can give you is to try putting a

p lines[i-1]

before the match line and see if this gives some insight into what's
happening.
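
As a small illustration of why p is useful for this kind of debugging: it prints the inspect form of its argument, so quotes, trailing newlines and nil all become visible. The sample string below is made up for the sketch.

```ruby
sample = "CATEG: CITY=0;STATE=0;\n"

# p prints sample.inspect: the quotes and the trailing \n are shown literally.
p sample

# nil.inspect is the string "nil", so a failed match is easy to spot.
p nil

visible = sample.inspect
```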

Stefano
 
Dan George

First of all thanks for all your help so far.

I just don't get it.. the output looks like this:

"CATEG: CITY=0;STATE=0;\n"
nil
"CATEG: CITY=0;STATE=0;\n"
nil

and so on for all files.

CATEG being another line inside the text files... and the problem
might be because the Name-line isn't always above the Link-line. There
could be n lines between them, or it could be below; what I'm trying to
say is that I don't know where the Name-line is inside the text files.
 
Stefano Crocco

On Wednesday 19 September 2007, Dan George wrote:
CATEG being another line inside the text files... and the problem
might be because the Name-line isn't always above the Link-line. There
could be n lines between them, or it could be below; what I'm trying to
say is that I don't know where the Name-line is inside the text files.

This changes everything. I assumed (according to the example lines you posted)
that each Link-line had the corresponding Name-line above it. But, if there
isn't a relation between the positions of the two kinds of lines, how can you
know what to put in the link line? I mean, what is the relation which
connects a Link-line and the corresponding Name-line? Since (from what you
say now) the positions of the two kinds of lines are random (as far as this
problem is concerned, at any rate), are you able, given a single Link-line, to
tell which is the corresponding Name-line? If yes, how? Without
knowing this, I can't help you.

Stefano
 
Dan George

All I can say is that in each text file there will only be one Name-line
and one Link-line. The only connection between these two lines is
that the Link-line uses the shorter version of what is written in the
Name-line (e.g. if the Name-line is New York, the Link-line will use NY).

Isn't it possible to read the Name-line, take the corresponding value
from the YAML file, store it in a variable (string?) inside the code
and then use it in the Link-line, and when going to the next text file
read the Name-line again, take the value from YAML, and so on?
 
Stefano Crocco

On Thursday 20 September 2007, Dan George wrote:
All I can say is that in each text file there will only be one Name-line
and one Link-line. The only connection between these two lines is
that the Link-line uses the shorter version of what is written in the
Name-line (e.g. if the Name-line is New York, the Link-line will use NY).

Isn't it possible to read the Name-line, take the corresponding value
from the YAML file, store it in a variable (string?) inside the code
and then use it in the Link-line, and when going to the next text file
read the Name-line again, take the value from YAML, and so on?

If each file contains only one Name-line and one instance of the corresponding
Link-line, this should work:

require 'yaml'

lnk='blabla.something.'
op = '&=bla'
hash = File.open('name.yaml'){|f| YAML.load f}

Dir.glob('**/*.txt').each do |f|
  lines = File.readlines f
  name = nil
  link_idx = nil
  lines.each_with_index do |l, i|
    if l.match(/Name-line:\s+(.*)$/) then name $1
    elsif l.match(/Link-line/) then link = i
    end
    break if name and link_idx
  end
  if name
    rep = hash[name]
    if rep
      lines[link_idx]="Link-line: #{lnk}#{rep}#{op}"
      File.open(f, 'w'){|of| of.write lines}
    else puts "name.yaml doesn't contain an entry for #{name} (file #{f})"
    end
  else puts "Couldn't find a Name line in file #{f}"
  end
end

For each file, it loops over each line looking for a Name-line or a Link-line.
When it finds the former, it stores the name in the name variable; when it finds
the latter, it stores its index in the link_idx variable. When both are
found, the loop stops (no point in examining the remaining lines). If a name
has been found and it corresponds to an entry in the hash, the line with
index link_idx is replaced with a new one (I removed the call to gsub!, since
we're rebuilding the entire line, but you can put it back, if you need it),
then the array is written to the file. If the Name-line wasn't found, or if
the hash doesn't contain an entry for it, an error message is printed on
screen and the next file is processed.
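
The per-file logic described above can also be written as a small function that works on an array of lines, which makes it easy to try without touching any files. This is a sketch under a few assumptions: rewrite_link is a hypothetical helper name, the sample lines and lookup hash are made up, the name is assigned with name = and the index with link_idx =, and the rebuilt line gets an explicit trailing newline so that it does not run into the line after it.

```ruby
# Hypothetical helper: returns the updated lines, or nil when the file
# should be left alone (no Name-line, no Link-line, or no lookup entry).
def rewrite_link(lines, lookup, prefix, suffix)
  name = nil
  link_idx = nil
  lines.each_with_index do |line, i|
    if (m = line.match(/Name-line:\s+(.*)$/))
      name = m[1].strip
    elsif line.match(/Link-line/)
      link_idx = i
    end
    break if name && link_idx # both found, no need to read further
  end
  return nil unless name && link_idx && lookup.key?(name)

  updated = lines.dup
  updated[link_idx] = "Link-line: #{prefix}#{lookup[name]}#{suffix}\n"
  updated
end

# The Name-line deliberately comes after the Link-line here.
lines = ["CATEG: CITY=0;STATE=0;\n", "Link-line: 0\n", "Name-line: New York\n"]
result = rewrite_link(lines, { 'New York' => 'NY' }, 'blabla.something.', '&=bla')
puts result.join
```

In a real script the result would be written back with something like File.write(path, result.join) only when it is not nil, so files without a usable Name-line stay untouched.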

I hope this helps

Stefano
 
Dan George

Thanks for your reply Stefano!

I had to do:
link_idx = nil.to_i
otherwise I would get this error: `[]': no implicit conversion from
nil to integer (TypeError)

And it seems to be working, but if I use this:
lines[link_idx]="Link-line: #{lnk}#{rep}#{op}"
the Name-line is removed and replaced by the
"Link-line: #{lnk}#{rep}#{op}" line, but the old Link-line = 0 is kept too.

If I use
lines[link_idx].gsub!(/Link-line.*/, 'Link-line: '+lnk+rep+op)
nothing happens: no errors, no modified files, nothing.

I don't see anything wrong with the gsub!, so what might be the problem?
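
Both symptoms would be consistent with link_idx staying at 0 for the whole loop, for example if nothing ever assigns to it after the nil.to_i line. That is an assumption about the script as it was run, not something the post confirms. The sketch below, on made-up sample lines, shows the three Ruby behaviours involved.

```ruby
lines = ["Name-line: New York\n", "Link-line: 0\n"]

# nil.to_i is 0, which is a valid index (the first line), not "no index".
link_idx = nil.to_i

# 0 is truthy in Ruby, so a test such as `name and link_idx` passes with it.
zero_is_truthy = link_idx ? true : false

# gsub! returns nil and leaves the string unchanged when nothing matches,
# which is what happens if it is applied to the Name-line at index 0.
first = lines[link_idx].dup
gsub_result = first.gsub!(/Link-line.*/, 'Link-line: blabla.something.NY&=bla')

p link_idx, zero_is_truthy, gsub_result, first
```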
 
Stefano Crocco

For each file, it loops each line looking for a Name-line or a Link-lin= e.
When it finds the former, it stores the name in the name variable; when
it finds the latter, it stores its index in the link_idx variable. When
both are found, the loop stops (no point in examining the remaining
lines). If a name has been found and it corresponds to an entry in the
hash, the line with index link_idx is replaced with a new one (I removed
the call to gsub!, since we're rebuilding the entire line, but you can
put it back, if you need it), then the array is written to the file. If
the Name-line wasn't found, or if the hash doesn't contain an entry for
it, an error message is printed on screen and the next file is processe= d.

I hope this helps

Stefano

Thanks for your reply Stefano!

I had to do:
link_idx =3D nil.to_i
otherwise I would get this error: `[]': no implicit conversion from
nil to integer (TypeError)

And it seems to be working but if I use this:
lines[link_idx]=3D"Link-line: #{lnk}#{rep}#{op}"
the Name-line is removed and is replaced by the "Link-line:
#{lnk}#{rep}#{op}" but the old Link-line =3D 0 is kept too.

If I use
lines[link_idx].gsub!(/Link-line.*/, 'Link-line: '+lnk+rep
+op)
nothing happends, no errors, no modified files, no nothing.

I don't see anything wrong with the gsub! so what might be the problem?

Another couple of mistakes on my part, I'm afraid. This should work:

require 'yaml'

lnk = 'blabla.something.'
op = '&=bla'
hash = File.open('name.yaml') { |f| YAML.load f }

Dir.glob('**/*.txt').each do |f|
  lines = File.readlines f
  name = nil
  link_idx = nil
  lines.each_with_index do |l, i|
    #added missing = between name and $1
    if l.match(/Name-line:\s+(.*)$/) then name = $1
    #changed link = i to link_idx = i
    elsif l.match(/Link-line/) then link_idx = i
    end
    break if name and link_idx
  end
  #checking that also link_idx is not nil
  if name and link_idx
    rep = hash[name]
    if rep
      # trailing newline added so the following line is not merged into this one
      lines[link_idx] = "Link-line: #{lnk}#{rep}#{op}\n"
      # join explicitly: since Ruby 1.9, Array#to_s no longer concatenates the elements
      File.open(f, 'w') { |of| of.write lines.join }
    else puts "name.yaml doesn't contain an entry for the name #{name}"
    end
  #changed the error message
  else puts "Name or Link line are missing in file #{f}"
  end
end
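
As a side note on the symptoms reported with the link_idx = nil.to_i workaround, here is a minimal sketch of what was probably going on. The sample lines are made up for illustration and assume the Name-line happens to be the first line of the file. Since nil.to_i is 0, the assignment overwrites the first line instead of the Link-line, and gsub! applied to that first line finds nothing to replace, so the file is written back unchanged.

```ruby
# Hypothetical sample data standing in for one of the text files.
lines = [
  "Name-line: Name01\n",
  "CATEG: something\n",
  "Link-line: blabla.something.OLD&=bla\n"
]

# With the `link = i` typo, link_idx was never assigned and stayed nil.
link_idx = nil

# nil.to_i is 0, so the workaround silently points at the first line.
workaround_idx = link_idx.to_i

# Assigning at that index overwrites the Name-line, not the Link-line.
patched = lines.dup
patched[workaround_idx] = "Link-line: blabla.something.N01&=bla\n"

# gsub! on the first line finds no "Link-line" text, so it changes nothing
# and returns nil (dup keeps the sample data untouched).
gsub_result = lines[workaround_idx].dup.gsub!(/Link-line.*/, 'Link-line: new')

puts patched
p gsub_result
```

This is why the corrected script checks both name and link_idx before touching the file, rather than turning nil into an index.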

Stefano
 
D

Dan George

Thank you Stefano! Works like a charm now :)

I had figured out this part too (#added missing = between name and $1), but
I should have been more careful about this part: "#checking that also
link_idx is not nil"

I liked the #comments part, thank you!

Now I'm gonna try to search for text files inside a zip archive,
modify them, and then zip them back together.

Stefano, how long have you been using Ruby? You seem to know a
lot of stuff, and I was wondering how long it will take me to know half
of what you know.
 
D

Dan George

And I have another question since I couldn't find anything about this
subject.

If my YAML file contains a value like "New York: City: NYC",
it will obviously generate an error. What I want to know is whether it's
possible to replace the YAML separator ":" with something else, like ";",
so I can have a value in the YAML file like this: "New York: City; NYC"?
 
D

Dan George

Never mind this.. it was pretty obvious that I had to use "New York:
City": NYC in the YAML file :)
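
For anyone landing here with the same question, a minimal sketch of that quoting fix follows. The YAML text is inlined as a string rather than read from name.yaml, and the second entry is only there for contrast; both are assumptions made for illustration. A key that contains ": " has to be quoted so the parser treats it as a single scalar.

```ruby
require 'yaml'

# Hypothetical name.yaml content: the first key contains ": ", so it is quoted.
yaml_text = <<~YAML
  "New York: City": NYC
  Name01: N01
YAML

hash = YAML.load(yaml_text)

# The quoted key is looked up like any other string key.
puts hash["New York: City"]
puts hash["Name01"]
```

The lookup in the script (hash[name]) then works unchanged, as long as the Name-line in the text file contains exactly the same text as the quoted key.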
 
S

Stefano Crocco

Stefano, how long have you been using Ruby because you seem to know a
lot of stuff and I was wondering how long it will take me to know half
of what you know?

I've been using Ruby for about two years. And don't worry: you only need a
little time to get to know the language, and then progress becomes much quicker.

Stefano
 
