Manipulating CSV files over SSH

Drew Olson

All -

I have a nice little script that takes a large CSV document and splits
it into 65,000-line chunks which are readable in Excel. However, I
currently have a .csv file which I'd like to split that is sitting on a
unix box. I have ssh access to the box, but the file is quite big
and I'd like to avoid downloading the whole file. Is there a simple way
to modify my script so that it uses the ssh library to access the
document and perform the splitting over the network?

Thanks in advance,
Drew
 
Logan Capaldo

ssh user@host -x 'cat large.csv' | your_script
should work pretty well for this.
 
Jan Svitok

The natural way would be to upload the script there and run it on the
remote host, provided that ruby is installed on the host.

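If you don't need CSV escaping (embedded newlines, etc.), then using
split(1) should be enough, i.e.

split -l 65000 filename prefix

for alphabetic suffixes (prefixaa, prefixab, ...) and

split -l 65000 -d filename prefix

for numeric ones (prefix00, prefix01, ...). Note that -d is a GNU split
option and may not be available on every unix.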
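
If the file does have quoted fields with embedded newlines, split(1) can cut a record in two, because it counts physical lines rather than CSV records. A rough sketch of a record-aware splitter using Ruby's standard csv library is below. The method name split_csv_records, the prefixNN.csv naming, and the default of 65,000 rows per file are assumptions for illustration, not taken from Drew's script, and it needs ruby on whatever machine runs it.

```ruby
require 'csv'

# Hypothetical helper (name and output naming are assumptions).
# Splits the CSV at input_path into files of at most rows_per_file records.
# Records are counted by the CSV parser, so a quoted field that contains a
# newline stays inside a single record. Returns the names of the files written.
def split_csv_records(input_path, prefix, rows_per_file = 65_000)
  written = []
  out = nil
  count = 0
  CSV.foreach(input_path) do |row|
    # Start a new output file for the first row and whenever the current one is full.
    if out.nil? || count >= rows_per_file
      out&.close
      name = format('%s%02d.csv', prefix, written.size)
      out = File.open(name, 'w')
      written << name
      count = 0
    end
    out.write(CSV.generate_line(row))
    count += 1
  end
  written
ensure
  out&.close
end
```

Called as split_csv_records('large.csv', 'part'), this would write part00.csv, part01.csv, and so on. Rows are re-serialized by CSV.generate_line, so the quoting in the output may differ cosmetically from the input.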
 
Cameron McBride

Logan Capaldo said:
ssh user@host -x 'cat large.csv' | your_script
should work pretty well for this.

That is essentially downloading the entire file, which he doesn't want to do.

Wow, hot topic. I was going to suggest what Jan Svitok did: The
easiest thing to do is to upload and run the ruby script on the unix
box.

If the script takes a while to run, you might want to check out
"screen" (but that is a bit off topic - ping me personally if you need
more info on this than you can google).

Cameron
 
Thomas Hafner

Drew Olson said:
I have a nice little script that takes a large CSV document and splits
it into 65,000 line chunks which are readable in Excel. However, I
currently have a .csv file which I'd like to split that is sitting on a
unix box.

Where shall the result of that splitting reside? Also on the same unix
box where the original CSV document resides?

Regards
Thomas
 
Drew Olson

Thomas Hafner said:
Where shall the result of that splitting reside? Also on the same unix
box where the original CSV document resides?

Yes, I'd like the split files to be on the unix box as well. Obviously,
the easy solution is to install rails but I can't (client paranoid,
etc). I do have ksh or perl to work with, but I have a working ruby
script which I'd like to use. Any other ideas?

-Drew
 
Drew Olson

Any ideas? If I can't do this with ruby, does anyone here have any ksh or
perl-foo to share?

-Drew
 
ara.t.howard

Drew Olson said:
Yes, I'd like the split files to be on the unix box as well. Obviously,
the easy solution is to install rails but I can't (client paranoid,
etc). I do have ksh or perl to work with, but I have a working ruby
script which I'd like to use. Any other ideas?

installing rails would be a twenty ton sledgehammer approach.

ruby reads the script from stdin if no script file is given (or if the script
name is '-'). you need to send the script to ruby on stdin via ssh and let it
process the file on the remote host, creating the output there as well.
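
you can check that behaviour locally before involving ssh at all. a small sketch, assuming a ruby on the local machine: the helper name run_script_on_stdin is made up for the example. it pipes a one-line script into a child 'ruby -' and shows that the arguments after the dash end up in ARGV.

```ruby
require 'open3'
require 'rbconfig'

# Hypothetical helper: feed +script+ to a child ruby on stdin.
# The '-' tells ruby to read the program from stdin; everything after it
# is passed to the program in ARGV.
def run_script_on_stdin(script, *args)
  out, status = Open3.capture2(RbConfig.ruby, '-', *args, stdin_data: script)
  raise "ruby exited with #{status.exitstatus}" unless status.success?
  out
end

puts run_script_on_stdin('puts ARGV.inspect', 'foo.csv') # prints ["foo.csv"]
```

over ssh the idea is the same, only the child ruby runs on the remote host and stdin arrives through the ssh connection.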

here's the code

harp:~ > cat a.rb
input = ARGV.shift
output = "#{ input }.out"

open(output, 'w') do |fd_out|
  open(input) do |fd_in|
    fd_in.each do |line|
      fd_out.puts line.split(',').inspect
    end
  end
end

here is the remote file

harp:~ > ssh fortytwo.merseine.nu cat foo.csv
1,2,3
a,b,c

we spawn ruby on the remote host reading from stdin, giving 'foo.csv' as an
argument and the file 'a.rb' as the script to run

harp:~ > ssh fortytwo.merseine.nu ruby - foo.csv < a.rb

this works as expected: the output is created on the remote host

harp:~ > ssh fortytwo.merseine.nu cat foo.csv.out
["1", "2", "3\n"]
["a", "b", "c\n"]
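
the same trick should carry over to the actual splitting job. a minimal sketch, assuming plain line-based chunks of 65,000 lines as in the original description (no handling of quoted newlines); the file name split.rb, the method split_lines and the '.000' style suffixes are made up for the example.

```ruby
# split.rb (hypothetical name). writes <input>.000, <input>.001, ... next to
# the input file, each holding at most CHUNK_LINES physical lines.
CHUNK_LINES = 65_000

def split_lines(input, chunk_lines = CHUNK_LINES)
  outputs = []
  out = nil
  File.foreach(input).with_index do |line, i|
    # open the next output file every chunk_lines lines
    if (i % chunk_lines).zero?
      out&.close
      outputs << format('%s.%03d', input, outputs.size)
      out = File.open(outputs.last, 'w')
    end
    out.write(line)
  end
  outputs
ensure
  out&.close
end

# only runs when a file name was passed after the '-' on the command line
puts split_lines(ARGV.shift) unless ARGV.empty?
```

saved locally as split.rb, it would be run the same way as a.rb above, for example

harp:~ > ssh fortytwo.merseine.nu ruby - large.csv < split.rb

and the chunks would be created on the remote host next to large.csv.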


hth.


-a
 
