If a column header matches a regexp, delete the column

Paul Shapiro

I need to check whether a CSV column header matches a pattern (see columns
2 and 3 below, S31 and S32) and, if it does, delete the entire column. I've
tried using:
/(S[0-9]+\s\b)/
but I don't think the regexp is the problem. My script uses both
the rio and fastercsv gems. The fastercsv documentation has been very
difficult for me to figure out (I'm new to Ruby). Thanks.

CSV:
Device ID,S31 Which best describes how you answered the online reading
comprehension quiz?,S32 Which best describes how you answered the online
timed retrieval quiz?,If you want your product to be easy to find in the
supermarket then you should make its container,"So that he can shift
attention between the radio and his incessantly talking girl friend when
she is in the car, Joe adjusts his radio",Early selection is most likely
to occur for,Early selection for a red target is most likely to occur
when there is,"In a lexical decision task, when the target is a bird
name, e.g. robin, it is usually preceded by the prime BODY but is
sometimes preceded by the prime BIRD.","In a lexical decision task, when
the target is a dog name, e.g. collie, it is usually preceded by the
prime CAR but is sometimes preceded by the prime DOG.","Suppose that
that you see a brief display with 12 colored letters: 4 red, 4 white,
and 4 blue. At the offset of the display you hear tone. A tone
instructs you to report only the letters of a particular color: high for
red, medium for white, and low for blue. About how many letters do you
report?",Sperling (1960) found that partial report produced the highest
estimate of the number of available letters when the tone occurred
,"According to the logic of Sperling’s (1960) partial report method, an
observer who reports three letters from a row in a 4 x 4 display that
was cued at the display’s offset must have seen at least",Sperling
(1960) found that the greatest difference between full and partial
report in the number available of letters was when the tone occurred
____ milliseconds after the offset of the visual display
96A39,6,4,4 c,4 c,5 c,5 c,5 c,4 i,3 c,1 c,1 i,5 i
1E90A4,5,3,4 c,4 c,5 c,5 c,2 i,5 c,3 c,1 c,4 i,4 i
1F7EE1,5,3,4 c,4 c,5 c,5 c,5 c,5 c,3 c,1 c,4 i,1 c
B8D35,4,3,4 c,5 i,5 c,5 c,5 c,4 i,3 c,1 c,3 c,1 c
9867B,6,4,4 c,1 i,5 c,-,-,3 i,4 i,1 c,3 c,4 i
1F7EDF,5,3,4 c,4 c,5 c,5 c,5 c,4 i,4 i,1 c,1 i,3 i
1F7DEC,5,3,1 i,4 c,5 c,5 c,4 i,5 c,3 c,3 i,3 c,-
1F7EF7,5,3,4 c,4 c,5 c,5 c,5 c,5 c,3 c,1 c,3 c,1 c
95738,6,-,4 c,2 i,5 c,5 c,2 i,5 c,4 i,1 c,4 i,1 c
9C46C,5,3,4 c,4 c,5 c,5 c,5 c,4 i,3 c,1 c,3 c,1 c
BEFC2,5,-,4 c,4 c,5 c,5 c,5 c,4 i,3 c,1 c,2 i,1 c
1F082A,-,1,4 c,4 c,5 c,5 c,5 c,4 i,3 c,1 c,1 i,1 c
68CE4,-,3,4 c,4 c,5 c,5 c,2 i,3 i,3 c,1 c,3 c,2 i
1F7E05,-,3,4 c,2 i,5 c,5 c,5 c,-,3 c,1 c,5 i,3 i
2020B2,-,3,4 c,4 c,5 c,5 c,5 c,4 i,3 c,1 c,1 i,1 c
1F7EED,-,1,4 c,4 c,5 c,5 c,5 c,-,3 c,1 c,1 i,1 c
1F7F11,-,3,4 c,5 i,5 c,5 c,2 i,5 c,4 i,4 i,3 c,5 i
BB147,-,-,4 c,4 c,5 c,5 c,1 i,4 i,3 c,1 c,3 c,1 c

Current Script:
#!/usr/bin/env ruby

require 'rubygems'
require 'roo'
require 'csv'
require 'fileutils'
require 'rio'
require 'fastercsv'

FileUtils.mkdir_p "/Users/pshapiro/Desktop/Excel/xls"
FileUtils.mkdir_p "/Users/pshapiro/Desktop/Excel/tmp"
FileUtils.mkdir_p "/Users/pshapiro/Desktop/Excel/csv"

@filesxls = Dir["/Users/pshapiro/Desktop/Excel/*.xls"]
for file in @filesxls
FileUtils.move(file,"/Users/pshapiro/Desktop/Excel/xls")
end

@filesxls = Dir["/Users/pshapiro/Desktop/Excel/xls/*.xls"]
@filetmp = Dir["/Users/pshapiro/Desktop/Excel/xls/*.xls_tmp"]

for file in @filesxls
convert = Excel.new(file)
convert.default_sheet = convert.sheets[0]
convert.to_csv(file+"_tmp")
end

@filestmp = Dir["/Users/pshapiro/Desktop/Excel/xls/*.xls_tmp"]

for file in @filestmp
FileUtils.move(file,"/Users/pshapiro/Desktop/Excel/tmp")
end

dir = "/Users/pshapiro/Desktop/Excel/tmp/"
files = Dir.entries(dir)
files.each do |f|
next if f == "." or f == ".."
oldFile = dir + "/" + f
newFile = dir + "/" + File.basename(f, '.*')
File.rename(oldFile, newFile)
end

files = Dir.entries(dir)
files.each do |f|
next if f == "." or f == ".."
oldFile = dir + "/" + f
newFile = dir + "/" + f + ".csv"
File.rename(oldFile, newFile)
end

@filescsv = Dir["/Users/pshapiro/Desktop/Excel/tmp/*.csv"]

for file in @filescsv
FileUtils.move(file,"/Users/pshapiro/Desktop/Excel/csv")
end

FileUtils.rm_rf("/Users/pshapiro/Desktop/Excel/tmp")

@filescsv = Dir["/Users/pshapiro/Desktop/Excel/csv/*.csv"]

for file in @filescsv
5.times {
text=""
File.open(file,"r"){|f|f.gets;text=f.read}
File.open(file,"w+"){|f| f.write(text)}
}
end

dir = "/Users/pshapiro/Desktop/Excel/csv/"
files = Dir.entries(dir)
files.each do |f|
next if f == "." or f == ".."
oldFile = dir + "/" + f
newFile = dir + "/" + File.basename(f, '.*') + ".tmp"
File.rename(oldFile, newFile)
end

@filescsv = Dir["/Users/pshapiro/Desktop/Excel/csv/*.tmp"]

for file in @filescsv
csv = FasterCSV.read(file, :headers => true)
lastc = csv.headers.length-1
# puts lastc
rio(file).csv.skipcolumns(1..2,lastc) > rio(file+".csv").csv(',')
end

@filescsv = Dir["/Users/pshapiro/Desktop/Excel/csv/*.tmp"]

for file in @filescsv
FileUtils.remove(file)
end

dir = "/Users/pshapiro/Desktop/Excel/csv"
files = Dir.entries(dir)
files.each do |f|
next if f == "." or f == ".."
oldFile = dir + "/" + f
newFile = dir + "/" + File.basename(f, '.*')
File.rename(oldFile, newFile)
end

2.times {
files = Dir.entries(dir)
files.each do |f|
next if f == "." or f == ".."
oldFile = dir + "/" + f
newFile = dir + "/" + File.basename(f, '.*')
File.rename(oldFile, newFile)
end
}

files = Dir.entries(dir)
files.each do |f|
next if f == "." or f == ".."
oldFile = dir + "/" + f
newFile = dir + "/" + f + ".csv"
File.rename(oldFile, newFile)
end

#####################################

@filescsv = Dir["/Users/pshapiro/Desktop/Excel/csv/*.csv"]

for file in @filescsv
csv = FasterCSV.read(file, :headers => true)
csv = csv.to_s
fields = FCSV.parse_line(csv)

fields.each do |f|
f.sub!(/[\d]+\)+[\s]/,'')
end

# puts fields

wline = FCSV.generate_line(fields)
astring = rio(file).contents
rio(file).csv.print(astring).close

text=""
File.open(file,"r"){|f|f.gets;text=f.read}
File.open(file,"w+"){|f| f.write(text)}

astring = rio(file).contents
rio(file).csv.print(wline+astring).close
end

puts "Successfully fixed Microsoft Excel documents!"
#puts teststr.gsub(/!(Device ID)|([BC\B0-9]+\.)\s/,'')

#puts "(Device ID)|[BC\B][0-9]+\."
#puts "(Device ID)|([BC\B0-9]+\.)"
#puts "!(Device ID)|([BC]\B[0-9]+\.\s)"

#####################################

@filescsv = Dir["/Users/pshapiro/Desktop/Excel/csv/*.csv"]

for file in @filescsv
csv = FasterCSV.read(file, :headers => true)
csv = csv.to_s
fields = FCSV.parse_line(csv)

fields.each do |f|
f.sub!(/!(Device ID)|([BC\B0-9]+\.)[\s]*/,'')
end

# puts fields

wline = FCSV.generate_line(fields)
astring = rio(file).contents
rio(file).csv.print(astring).close

text=""
File.open(file,"r"){|f|f.gets;text=f.read}
File.open(file,"w+"){|f| f.write(text)}

astring = rio(file).contents
rio(file).csv.print(wline+astring).close
end
 
James Gray

Paul, we need to improve your question asking skills a bit. :)

It's hard for us to help when we have to read through a lot of prose
and even more code to understand the issue you are facing. You'll get
a lot better responses, and quicker, if you try to simplify your
questions down for us. "Here are the five lines I'm trying to use to
delete a column of CSV. Can you tell me why it's failing?" for example.

Basically though, the steps are almost identical to what I gave you in
this message:

http://groups.google.com/group/comp.lang.ruby/msg/c9f4c7b71465b6af

Just copy the data, leaving out the column you don't want.

I hope that helps.

James Edward Gray II
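
A minimal sketch of the "copy the data, leaving out the column you don't want" idea. It assumes Ruby 1.9 or later, where FasterCSV became the standard `csv` library; the helper name `copy_without_columns` and the shortened sample data are hypothetical, not taken from the linked message.

```ruby
require "csv"

# Hypothetical helper: copy every row of the CSV text, leaving out each
# column whose header matches +pattern+. Returns the result as a String.
def copy_without_columns(csv_text, pattern)
  rows   = CSV.parse(csv_text)   # array of arrays, header row first
  header = rows.first
  # Indexes of the columns to keep (header does not match the pattern).
  keep = header.each_index.reject { |i| pattern.match?(header[i].to_s) }
  CSV.generate do |out|
    rows.each { |row| out << row.values_at(*keep) }
  end
end

# Shortened, made-up sample in the shape of the data from the question.
data = <<~CSV
  Device ID,S31 Which quiz?,S32 Which other quiz?,Question one
  96A39,6,4,4 c
  1E90A4,5,3,4 c
CSV

result = copy_without_columns(data, /\AS\d+\s/)
puts result
```

Working on plain arrays sidesteps header conversion entirely, at the cost of tracking column indexes by hand.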
 
Paul Shapiro

Ok. I would like to use fastercsv with maybe delete_if(&block) or
delete(index_or_header), where the index is selected by a regular
expression match. I have been unable to figure out the proper syntax for
these functions. For demoing the task, I've been working with this code:

table = FasterCSV.table('/Users/pshapiro/Desktop/Excel/csv/Attention.csv',
                        :headers => true, :return_headers => true)
junk = table.by_col.to_s

Hopefully, that was a little more clear?

James, I see you are the fastercsv developer, and I want to thank you
for the help you have given me thus far!
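
For reference, a sketch of how `delete_if` can select columns by a regular expression. It assumes Ruby 1.9 or later, where the same table API is available through the standard `csv` library; the sample data is made up and shortened. In column mode the block receives a header and that column's values.

```ruby
require "csv"

# Shortened, made-up sample in the shape of the data from the question.
data = <<~CSV
  Device ID,S31 Which quiz?,S32 Which other quiz?,Question one
  96A39,6,4,4 c
  1E90A4,5,3,4 c
CSV

# headers: true keeps the header text as plain Strings (no Symbol conversion).
table = CSV.parse(data, headers: true)

# by_col! switches the table to column mode, where delete_if yields
# [header, column_values] pairs instead of rows.
table.by_col!.delete_if { |header, _values| header.match?(/\AS\d+\s/) }

remaining = table.headers
csv_text  = table.to_csv
puts csv_text
```

Without `by_col!`, `delete_if` would iterate over rows, so the mode switch is what makes the header available to the block.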
 
James Gray

Paul said:
Ok. I would like to use fastercsv with maybe delete_if(&block) or
delete(index_or_header), where the index is selected by a regular
expression match.

Awesome. That helped. Thanks.

I would probably try something like this:

$ cat data.csv
A,B,C,D
1,2,3,4
5,6,7,8
$ cat del_col.rb
#!/usr/bin/env ruby -wKU

require "rubygems"
require "faster_csv"

table = FCSV.table("data.csv")
table.headers.each do |h|
  next unless h.to_s =~ /\A[bd]\z/i
  table.delete(h)
end
puts table

__END__
$ ruby del_col.rb
a,c
1,3
5,7

Hope that helps.

James Edward Gray II
 
Paul Shapiro

Alright. I'm not getting this to work. I'm doing this:

table = FCSV.table("/Users/pshapiro/Desktop/Excel/csv/Attention.csv")
table.headers.each do |h|
  next unless h.to_s =~ /(S[0-9]+\s\b)/
  table.delete(h)
end
puts table

The regexp I'm using is "(S[0-9]+\s\b)" to match "S31 Which best
describes how you answered the online reading comprehension quiz?"

I'm not sure if it's a problem with my Ruby syntax or an incorrect
regexp?
 
James Gray

It's FasterCSV kind of cheating you a little bit here. The table()
method uses some options to make it as easy as possible to
programmatically work with the resulting data structure. One of those
options turns the headers into Symbols, so your header really ends up
looking like:

:s31_which_best_describes_how_you_answered_the_online_reading_comprehension_quiz

Thus the Regexp needs to be adapted accordingly. Here's one you might
try:

/\AS\d+/i

Hope that helps.

James Edward Gray II
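
The header conversion described above can be checked directly. This sketch assumes Ruby 1.9 or later (the standard `csv` library) and a shortened, made-up header; it passes `header_converters: :symbol` explicitly, which is one of the options `table()` turns on.

```ruby
require "csv"

# Made-up two-column sample with a shortened version of the S31 header.
data = "Device ID,S31 Which best describes how you answered?\n96A39,6\n"

# Headers as plain Strings versus headers run through the :symbol converter.
as_strings = CSV.parse(data, headers: true).headers
as_symbols = CSV.parse(data, headers: true, header_converters: :symbol).headers

p as_strings
p as_symbols

original = /(S[0-9]+\s\b)/   # pattern from the question
adapted  = /\AS\d+/i         # pattern suggested in the reply

# Which headers each pattern matches, in column order.
string_match  = as_strings.map { |h| original.match?(h) }
symbol_match  = as_symbols.map { |h| original.match?(h.to_s) }
adapted_match = as_symbols.map { |h| adapted.match?(h.to_s) }
```

The original pattern matches the String header but not the Symbol form, which is lowercase and has underscores where the pattern expects whitespace.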
 
Paul Shapiro

I'm guessing it was incorrect to assume that the delete statement would
write out to the file? How would I do that? It prints correctly.
 
James Gray

Yes, the delete() call deletes a column out of the in-memory table
data structure. If you want to write the result to a file, you do it
the same way you wrote it to the screen:

open("new_data.csv", "w") do |csv| # create a file
  csv.puts table # write to that
end

Hope that helps.

James Edward Gray II
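
A small sketch of the same point, assuming Ruby 1.9 or later with the standard `csv` library: `delete` only changes the table in memory, and nothing reaches disk until the table is written out explicitly. The temporary directory and the inline sample data are assumptions made so the example is self-contained.

```ruby
require "csv"
require "tmpdir"

table = CSV.parse("a,b,c\n1,2,3\n4,5,6\n", headers: true)
table.delete("b")   # changes only the in-memory table

written = nil
Dir.mktmpdir do |dir|
  path = File.join(dir, "new_data.csv")   # hypothetical output path
  File.write(path, table.to_csv)          # the explicit write to disk
  written = File.read(path)               # read it back to confirm
end
puts written
```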
 
Paul Shapiro

That works great, but is there a way to keep it from outputting all of
those underscores and the case change?
 
James Gray

Sure, change the table() call to something like:

table = FCSV.table("data.csv", :return_headers => true)

and the output writing code to something like:

FCSV.open("new_data.csv", "w") do |csv|
  table.each do |row|
    csv << row
  end
end

Hope that helps.

James Edward Gray II
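
A self-contained sketch of why this keeps the original header text, assuming Ruby 1.9 or later with the standard `csv` library. It uses `CSV.parse` on made-up inline data with roughly the options `table()` applies (numeric field conversion is left out), and `CSV.generate` in place of a file so the result can be inspected.

```ruby
require "csv"

# Shortened, made-up sample in the shape of the data from the question.
data = <<~CSV
  Device ID,S31 Which quiz?,Question one
  96A39,6,4 c
  1E90A4,5,4 c
CSV

# Symbol headers for lookups; return_headers keeps the original header
# text as the first row of the table.
table = CSV.parse(data, headers: true, header_converters: :symbol,
                        return_headers: true)

table.headers.each do |h|
  table.delete(h) if h.to_s =~ /\As\d+_/   # headers are Symbols here
end

output = CSV.generate do |csv|
  table.each { |row| csv << row }          # first row is the header row
end
puts output
```

Because the header row travels with the data, deleting a column also removes its original header text, and writing the rows back out reproduces the untouched wording of the remaining headers.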
 
