pairwise comparison and merging of vector elements

A

Alan

I am wondering if anyone has any better idea of how to approach
this problem than I do. . . .

I have a vector of items (data). I have to do a pairwise
comparison of each item to each other item and apply logic to see if
one should be deleted. In the past I have done this using for
statements, like:

for (int i = 0; i < input_vector.size(); i++)
for (int j = i; i < input_vector.size(); j++)
{
. . . process data . . .
}

The problem is that I have do delete items in the middle, which makes
index-watching tricky. So, in the past, I have copied the vector
first and manipulated the copy vice the original when I process the
data. However, then I have to keep track of which items in the
original vector have already been merged away. For example, I compare
items 1 and 2, and I decide to delete 2. So, then after I finish all
the comparisons with item 1, I need to do that with item 3 (not 2,
which I`ve deleted in the copied vector.

In addition to being complicated in the past, this has been slow.

Is there a better approach? I come across the same scenario
frequently.

Thanks, Alan
 
V

Victor Bazarov

Alan said:
I am wondering if anyone has any better idea of how to approach
this problem than I do. . . .

I have a vector of items (data). I have to do a pairwise
comparison of each item to each other item and apply logic to see if
one should be deleted. In the past I have done this using for
statements, like:

for (int i = 0; i < input_vector.size(); i++)
for (int j = i; i < input_vector.size(); j++)
{
. . . process data . . .
}

The problem is that I have do delete items in the middle, which makes
index-watching tricky.

Tricky? Really?... Most of vector cleanup code I've seen used to
decrement the index ('j', for example) right after deletion of the
element which it indexes (so the next ++ would essentially keep it
"correct").
So, in the past, I have copied the vector
first and manipulated the copy vice the original when I process the
data. However, then I have to keep track of which items in the
original vector have already been merged away. For example, I compare
items 1 and 2, and I decide to delete 2. So, then after I finish all
the comparisons with item 1, I need to do that with item 3 (not 2,
which I`ve deleted in the copied vector.

In addition to being complicated in the past, this has been slow.

Is there a better approach? I come across the same scenario
frequently.

Keep the second vector<char> and "mark" the elements you've decided
to throw away. When going over your 'i' and 'j', first check with the
"marked" vector and if the element is set, skip to the next. Fast
and simple to understand. At the end walk from start and copy into
another vector only the elements that don't have corresponding "marked"
elements set.

Or write a proper functor and use 'remove_if'. You can rely on the
requirement that elements of a vector exist in an array...

And if you post a bit more of the code that illustrates your problem
we might point out inefficiencies further (if you want).

V
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,996
Messages
2,570,237
Members
46,825
Latest member
VernonQuy6

Latest Threads

Top