large data file manipulation

Roland Hall · Dec 15, 2004

I'm looking for information on working with large data files using FSO and XML.

I have a program which creates a large CSV file, over 7 MB. It's a rate
table of freight shipping costs.
There are certain fields I do not need, and some are blank. A typical line
would be:

Raw data:

" ", "30142", "GA", "01001"," ", "MA"," ","100",018609,000000,000000,000000,014435,013181,010622,009022,007125,006569,006569,006569,006569,000000,000000,000000,000000

structure:

blank,fromzip,fromstate, tozip,blank, tostate,blank,class, mc, blank, blank,
blank, l5c, m5c, m1m, m2m, m5m, mxm, mxxm, mxxxm, mxlm, blank, blank,
blank,blank

I don't need the double quotes, the spaces, or any field marked blank in the
structure. It is my understanding that I can read this file in three ways:

read(b)
readLine
readAll

I chose readLine because I didn't want all 7 MB in memory at once, and
reading a set number of bytes doesn't fit because the line length is not
fixed. I manipulate each line and append the results to a new file every
1000 lines, finishing up with however many lines are left upon reaching
the end.
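A minimal sketch of that kind of line-at-a-time pass, written in Python purely for illustration (it is not the FSO script itself). The names KEEP, BATCH_SIZE, clean_line and convert are made up for this example. It assumes every line has the 25 fields shown in the structure and that no field contains a comma.

```python
import io

# Zero-based positions of the fields to keep, following the structure above:
# fromzip, fromstate, tozip, tostate, class, mc, l5c, m5c, m1m, m2m, m5m,
# mxm, mxxm, mxxxm, mxlm. Every position marked "blank" is skipped.
KEEP = [1, 2, 3, 5, 7, 8, 12, 13, 14, 15, 16, 17, 18, 19, 20]
BATCH_SIZE = 1000  # lines buffered before each write to the output


def clean_line(line):
    """Return the kept fields of one raw line, without quotes or spaces."""
    # Assumption: no field contains a comma, so a plain split is enough.
    fields = [f.strip().strip('"').strip() for f in line.split(",")]
    return ",".join(fields[i] for i in KEEP)


def convert(src, dst, batch_size=BATCH_SIZE):
    """Read src line by line and write cleaned lines to dst in batches."""
    batch = []
    written = 0
    for line in src:
        if not line.strip():
            continue  # skip empty lines
        batch.append(clean_line(line))
        if len(batch) == batch_size:
            dst.write("\n".join(batch) + "\n")
            written += len(batch)
            batch = []
    if batch:  # whatever is left on reaching the end
        dst.write("\n".join(batch) + "\n")
        written += len(batch)
    return written


# Usage with the sample line from this post, using in-memory files.
raw = ('" ", "30142", "GA", "01001"," ", "MA"," ","100",018609,000000,'
       '000000,000000,014435,013181,010622,009022,007125,006569,006569,'
       '006569,006569,000000,000000,000000,000000')
out = io.StringIO()
convert(io.StringIO(raw + "\n"), out)
print(out.getvalue())
```

With real files, src would be the raw CSV opened for reading and dst the new file opened for appending; only one batch of lines is held in memory at a time.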

My result file is a little over 3 MB [41380 lines of raw data]. The
conversion takes seconds and will only be run if shipping rates change. The
3 MB file is still too large to work with, so I have decided to split it up
in one of two ways: by state or by zip code range. "By state" gives me 50
files and zip range gives me 10. I'm not sure what the difference in size
will be or whether it will be noticeable. The rate table, or part of it,
will only be in memory long enough to get the rate and then be released.
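The "by state" split could be sketched roughly as below, again in Python for illustration only. It assumes the cleaned line layout starts with fromzip, fromstate, tozip, tostate, and that the split is on the ship-to state; the function name split_by_state and the rates_XX.csv file naming are hypothetical. It returns the size of each file it writes, which is one way to see how much the pieces differ.

```python
import os
from collections import defaultdict

# Position of tostate in a cleaned line: fromzip, fromstate, tozip, tostate, ...
TOSTATE = 3


def split_by_state(lines, out_dir):
    """Write one file per ship-to state; return {state: size in bytes}."""
    groups = defaultdict(list)
    for line in lines:
        line = line.strip()
        if line:
            groups[line.split(",")[TOSTATE]].append(line)

    sizes = {}
    for state, rows in groups.items():
        # Hypothetical naming scheme: rates_MA.csv, rates_GA.csv, ...
        path = os.path.join(out_dir, "rates_" + state + ".csv")
        with open(path, "w", newline="\n") as f:
            f.write("\n".join(rows) + "\n")
        sizes[state] = os.path.getsize(path)
    return sizes
```

This version groups the whole cleaned file in memory before writing, which seems acceptable at about 3 MB since the split only runs when the rates change. Splitting by zip range would be the same idea with a different grouping key.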

I have printing to the screen turned on during the debug process. You can
see it here:
http://kiddanger.com/dev/freight.asp

My questions are:

Since I have to use data files, would using XML instead of CSV be drastically
different as a lookup for my new file?
How much more efficient is it to retrieve information from XML than from a
CSV that is read in? To make a true comparison, the result will eventually
be multiple files, read in with readAll [if used as CSV], and then I would
search an array for the rate I needed.
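For reference, the CSV side of that comparison would look something like the sketch below (Python for illustration, not the actual script). It assumes the cleaned 15-field layout, and that a lookup is keyed on tozip and class; the function name find_rate and the choice of key are assumptions.

```python
# Field names of a cleaned line, in order.
COLUMNS = ["fromzip", "fromstate", "tozip", "tostate", "class", "mc",
           "l5c", "m5c", "m1m", "m2m", "m5m", "mxm", "mxxm", "mxxxm", "mxlm"]


def find_rate(text, tozip, freight_class, column):
    """Scan the whole file contents for a matching row; return one field."""
    idx = COLUMNS.index(column)
    for line in text.splitlines():
        fields = line.split(",")
        if (len(fields) == len(COLUMNS)
                and fields[2] == tozip
                and fields[4] == freight_class):
            return fields[idx]  # returned as the raw string from the file
    return None  # no matching row


# text stands in for the result of reading one split file in a single call.
text = ("30142,GA,01001,MA,100,018609,014435,013181,010622,009022,"
        "007125,006569,006569,006569,006569\n")
print(find_rate(text, "01001", "100", "l5c"))  # prints 014435
```

This is a linear scan, so the cost of each lookup grows with the number of lines in whichever file is loaded.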

If I used XML, would it be necessary to split the file up, as I would with
the CSV [by ship to state] or could I use the single file?
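The XML side might look like the sketch below. The element and attribute names are a hypothetical layout, not an existing file format, and Python's xml.etree.ElementTree stands in here for whichever XML parser would actually be used. The single row is the sample line from above.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: one <rate> element per cleaned line, fields as attributes.
SAMPLE = """<rates>
  <rate fromzip="30142" fromstate="GA" tozip="01001" tostate="MA"
        class="100" mc="018609" l5c="014435" m5c="013181" m1m="010622"/>
</rates>"""


def find_rate_xml(root, tozip, freight_class, column):
    """Return one attribute of the first <rate> matching tozip and class."""
    query = "rate[@tozip='%s'][@class='%s']" % (tozip, freight_class)
    node = root.find(query)
    return None if node is None else node.get(column)


# Parsing builds the whole document in memory before any lookup can run.
root = ET.fromstring(SAMPLE)
print(find_rate_xml(root, "01001", "100", "l5c"))
```

Note that a tree-style parser like this one loads the entire document before the query runs, so with a single file the whole table would be parsed for every lookup unless it is kept around between requests.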

Yes, I know SQL is better but I have to also have a version that does not
use a database.

TIA...

--
Roland Hall
/* This information is distributed in the hope that it will be useful, but
without any warranty; without even the implied warranty of merchantability
or fitness for a particular purpose. */
Technet Script Center - http://www.microsoft.com/technet/scriptcenter/
WSH 5.6 Documentation - http://msdn.microsoft.com/downloads/list/webdev.asp
MSDN Library - http://msdn.microsoft.com/library/default.asp
