Arun Kumar
Hello everyone,
I'm 20 days new to Ruby, so please forgive any mistakes. I'm working
on a project that indexes certain words in a text document, and I also
store the file position where each word occurs. The problem is that
IO#pos points to the end of the file the whole time. Below is the code
I'm working on:
    File.open(file_name) do |f|
      f.readlines("\r\n\r\n").each do |para|
        para.scan(/\b\w+\b/).each do |word|
          word = word.downcase.stem
          # excludes empty and frequent words
          if (!stoplist.include? word) && (!word.empty?)
            unless freq.has_key?(word)
              # freq is a hash that stores an array containing index,
              # position of word (THE PROBLEM)
              freq[word] = [1, f.pos, file_name]
            else
              freq[word].to_a[0] += 1
              freq[word].to_a << f.pos << file_name
            end
            unless wfreq.has_key?(word)
              wfreq[word] = [1, f.pos, file_name]
            else
              wfreq[word].to_a[0] += 1
              wfreq[word].to_a << f.pos << file_name
            end
          end
        end
      end
    end
    File.open(file_name + ".yaml", "w") { |f| YAML.dump(freq, f) }
Also, it would be great if someone could tell me the replacement for
the deprecated 'to_a' method used above.
Any help is greatly appreciated.
--
|| श्री जानकीरघुनाथो विजयते ||
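A note on the symptom described above: readlines reads the whole file before the each block runs, so by the time any word is processed IO#pos is already at the end of the file. A minimal sketch of one possible workaround follows. It reads one paragraph at a time with gets and records the position before each read. The helper name index_words and the sample text are hypothetical, the stoplist and stem steps are left out to keep it self-contained, and StringIO stands in for the real file. The stored entries are already arrays, so no to_a call is needed; where a conversion really is wanted, Array(obj) is the usual replacement for the deprecated Object#to_a.

```ruby
require "stringio"

# Hypothetical helper (not from the original post): builds
# word => [count, offset, offset, ...] from any IO-like object.
def index_words(io, separator = "\r\n\r\n")
  index = Hash.new { |hash, key| hash[key] = [0] }
  loop do
    para_start = io.pos          # byte offset where this paragraph begins
    para = io.gets(separator)    # reads one paragraph; nil at end of input
    break if para.nil?
    para.scan(/\b\w+\b/) do |word|
      # Bytes preceding the match inside the paragraph, added to the
      # paragraph start, give an offset comparable to IO#pos.
      offset = para_start + Regexp.last_match.pre_match.bytesize
      entry = index[word.downcase]
      entry[0] += 1              # entry is already an Array, no to_a needed
      entry << offset
    end
  end
  index
end

# Assumed sample input: two paragraphs separated by a blank CRLF line.
sample = StringIO.new("Ruby is fun\r\n\r\nruby rocks")
index = index_words(sample)
p index["ruby"]
```

With a real file, the same helper could be called as File.open(file_name, "rb") { |f| index_words(f) }; opening in binary mode is an assumption made here so that byte offsets are not affected by newline translation.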