Reading column-formatted data

suresh

Hi

I am quite new to Ruby. I want to read column-formatted data like
the sample below:

Full name Q01 Q02 Q03
Anonymous1 Fair Fair Fair
Anonymous2 Excellent Fair Poor
Anonymous3 Fair Excellent Fair
Anonymous1 Poor Poor Fair
Anonymous2 Poor Fair Fair
Anonymous1 Fair Fair Fair
Anonymous3 Excellent Excellent Excellent


Depending on the column number, each of these strings (Fair, Excellent,
etc.) must be given a different weight. The weight is multiplied by the
number of times the string occurs in the column, and an index is then
calculated from the results.

I was doing this in Excel with the built-in COUNTIF function, but I want
to automate the whole process with a program.

What is the best approach in Ruby?

Thank you
suresh
 
Eustáquio 'TaQ' Rangel

Hi.
suresh said:
Depending on the column number, these strings (Fair, Excellent, etc)
must be given a different weight, multiplied by the number of these
strings occurrences in a column and an index is to be calculated.
what is the best approach in ruby?

Don't know if it's the best but:

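# Weight per column index and score per rating (example values).
weight = {0=>1, 1=>2, 2=>3}
scale = {"Poor"=>1, "Fair"=>2, "Excellent"=>3}

# Note: assumes suresh.txt holds only data rows (no header line).
File.foreach("suresh.txt") do |line|
  value, (name, *values) = 0, line.split
  values.each_with_index { |item, index| value += weight[index] * scale[item] }
  puts "[#{name}] [#{values.join(',')}] [#{value}]"
end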

Wow! Dr. Suresh from Heroes??? :)

--
Eustáquio "TaQ" Rangel
http://eustaquiorangel.com

"Premature optimisation is the root of all evil."
Donald Knuth
 
7stud --

suresh said:
multiplied by the number of these
strings occurrences in a column

The previous solution doesn't account for that requirement, and the
solution won't work on a file with column headings. Here is a way to
tabulate your data, so that you can calculate your index:

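require 'rubygems'
require 'fastercsv'

# One counter per question column (Q01, Q02, Q03) for each rating.
totals = {'Poor' => [0,0,0], 'Fair' => [0,0,0], 'Excellent' => [0,0,0]}
cols = 3  # number of question columns

FasterCSV.foreach('data.txt', :headers => true) do |row|
  cols.times do |i|
    # row[0] is the name; the ratings start at row[1].
    totals[row[i+1]][i] += 1
  end
end

p totals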

--output:--
{"Excellent"=>[2, 2, 1], "Poor"=>[2, 1, 1], "Fair"=>[3, 4, 5]}
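
From a table like this, the index suresh describes could be computed by weighting each count. The following is only a minimal sketch: the thread never states the real weights or the exact formula, so `rating_weight`, `column_weight`, and the "weighted sum per column" formula are all assumptions made for illustration.

```ruby
# Counts per rating for columns Q01, Q02, Q03 (the tabulated output above).
totals = {
  'Excellent' => [2, 2, 1],
  'Poor'      => [2, 1, 1],
  'Fair'      => [3, 4, 5]
}

# Hypothetical weights: one per rating and one per column.
# The original question does not give the real values.
rating_weight = { 'Poor' => 1, 'Fair' => 2, 'Excellent' => 3 }
column_weight = [1, 2, 3]

# For each column: sum(count * rating weight), scaled by the column weight.
column_index = column_weight.each_with_index.map do |col_w, i|
  col_w * totals.sum { |rating, counts| counts[i] * rating_weight[rating] }
end

p column_index
```

With these made-up weights the three columns score 14, 30 and 42; substitute your own weights and formula as needed.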
 
7stud --

7stud said:
FasterCSV.foreach('data.txt', :headers =>true) do |row|

I forgot to add that if your file doesn't use commas to separate the
columns, you have to specify the separator, which in your case looks
like a tab (\t):

FasterCSV.foreach('data.txt', :headers =>true, :col_sep =>"\t")
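
On Ruby 1.9 and later, FasterCSV was folded into the standard library as `csv`, and the same `headers` and `col_sep` options apply. As a rough sketch, here is the separator option applied to an inline sample, assuming the columns really are tab-separated (the `data` string stands in for data.txt and contains only the first two rows from the question):

```ruby
require 'csv'

# Inline sample standing in for data.txt; columns are assumed to be tab-separated.
data = "Full name\tQ01\tQ02\tQ03\n" \
       "Anonymous1\tFair\tFair\tFair\n" \
       "Anonymous2\tExcellent\tFair\tPoor\n"

# headers: true treats the first line as column names.
table = CSV.parse(data, headers: true, col_sep: "\t")

table.each do |row|
  puts "#{row['Full name']}: #{row['Q01']}, #{row['Q02']}, #{row['Q03']}"
end
```

Because the separator is a tab, the "Full name" header survives as a single column even though it contains a space.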
 
7stud --

This is more readable:

require 'rubygems'
require 'fastercsv'

totals = {'Poor' => [0,0,0], 'Fair' => [0,0,0], 'Excellent' => [0,0,0]}
target_cols = ['Q01', 'Q02', 'Q03']

FasterCSV.foreach('data.txt', :headers => true, :col_sep => " ") do |row|
  target_cols.each_with_index do |col_name, i|
    rating = row[col_name]
    totals[rating][i] += 1  # count this rating for column i
  end
end

p totals

--output:--
{"Excellent"=>[2, 2, 1], "Poor"=>[2, 1, 1], "Fair"=>[3, 4, 5]}
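
If installing a gem is not an option, the same COUNTIF-style tally can be sketched in plain Ruby. This version makes two assumptions that are not in the thread: fields are separated by whitespace, and the three ratings are always the last three fields of a line (which sidesteps the space inside the "Full name" header). The inline `data` string stands in for data.txt.

```ruby
# Inline sample standing in for data.txt (the rows from the original question).
data = <<~TEXT
  Full name Q01 Q02 Q03
  Anonymous1 Fair Fair Fair
  Anonymous2 Excellent Fair Poor
  Anonymous3 Fair Excellent Fair
  Anonymous1 Poor Poor Fair
  Anonymous2 Poor Fair Fair
  Anonymous1 Fair Fair Fair
  Anonymous3 Excellent Excellent Excellent
TEXT

# Skip the header row, then keep only the ratings.
# Assumption: the ratings are always the last three fields of a line.
ratings = data.lines.drop(1).map { |line| line.split.last(3) }

# The block gives every rating its own [0, 0, 0] the first time it is seen,
# so ratings do not have to be listed in advance.
totals = Hash.new { |hash, rating| hash[rating] = [0, 0, 0] }

ratings.each do |row|
  row.each_with_index { |rating, i| totals[rating][i] += 1 }
end

p totals
```

The counts match the output above; only the key order may differ, since ratings are added in the order they first appear.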
 
suresh

Hi,

Thanks for the wonderful ideas.

suresh
India

 
