Data Structures Question...

S

Saikrishna

Hi friends,
I am trying to choose the best possible data structure for the probelm
I am going to describe now.

I have lets say tens of thousands of numbers in file1 and tens of
thousands of numbers in another file2. file1 & file2 contents (only
numbers) can be entirely different.

Now the program should be able to read the files and give me a
difference file. Ofcourse this is a very easy implementation if I go
with primitive programming using "ifs" and "whiles".

This is what I need to do. First I need to rearrange the data in file1
in vary compact format so that my program uses as little RAM as
possible. SETS are one way of doing that.

For example if file1 contains 2,3,1,7,9,10,11,12,4,6,22 ( In reality
file1 may contain several thousand numbers) then I can rearrange them
like these sets

{1-4} // Numbers 1 to 4
{9-12} // Numbers 9 to 12
{6-7} // Numbers 6 to 7
{22} // Single number 22.

And similraly I need to rearrange contents of file2 in this format.

Now my question is the best possible method for storage of this
datastructure.
Linked Lists, Tree or others or simply use STLs? Also I will be happy
if somebody can give me an idea about the best way of doing
comparision process
between sets of file1 and file2 and produce the difference set in a
number format in a result file "file3".

Thanks,
Sai
 
T

Thomas Matthews

Saikrishna said:
Hi friends,
I am trying to choose the best possible data structure for the probelm
I am going to describe now.

I have lets say tens of thousands of numbers in file1 and tens of
thousands of numbers in another file2. file1 & file2 contents (only
numbers) can be entirely different.

Now the program should be able to read the files and give me a
difference file. Ofcourse this is a very easy implementation if I go
with primitive programming using "ifs" and "whiles".

This is what I need to do. First I need to rearrange the data in file1
in vary compact format so that my program uses as little RAM as
possible. SETS are one way of doing that.

For example if file1 contains 2,3,1,7,9,10,11,12,4,6,22 ( In reality
file1 may contain several thousand numbers) then I can rearrange them
like these sets

{1-4} // Numbers 1 to 4
{9-12} // Numbers 9 to 12
{6-7} // Numbers 6 to 7
{22} // Single number 22.

And similraly I need to rearrange contents of file2 in this format.

Now my question is the best possible method for storage of this
datastructure.
Linked Lists, Tree or others or simply use STLs? Also I will be happy
if somebody can give me an idea about the best way of doing
comparision process
between sets of file1 and file2 and produce the difference set in a
number format in a result file "file3".

Thanks,
Sai

Your "sets" are also termed "runs" in a typical Merge Sort
algorithm.

This is possibly an algorithm issue, not a language issue.
My suggestion is to sort both files into a third, then
remove duplicates. Perhaps the folks in can offer better advice. Follow-ups set.

--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.raos.demon.uk/acllc-c++/faq.html
Other sites:
http://www.josuttis.com -- C++ STL Library book
 
A

Alexander Malkis

There are three DIFFerent methods to attack the problem.

The first method is to look at the implementation of the diff utility
for Linux/Unix.
The second is to look at the implemenation of the sequnce-alignment from
biological-computer-science. There they determine the best aligments of
the gene sequences and it may have smth to do with your problem.
The laziest method is to ask in some programming / algorithmics group,
since algorithms/data structures are

O F F - T O P I C

here. That means that people here are not always qualified enough to
reply. (No offence meant.)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,982
Messages
2,570,186
Members
46,744
Latest member
CortneyMcK

Latest Threads

Top