N
Nobody
I've been looking for a job for a while now, and have run into this
interview question twice now... and have stupidly kind of blown it twice...
(although I've gotten better)... time to finally figure this thing out...
basically the interview question is: given an unsorted listed such as:
3,1,3,7,1,2,4,4,3
find the FIRST UNIQUE number, in this case 7... and of course the list can
be millions long...
the first time I got this question, my solution was something like:
1) declare a second array A2 with n elements
2) loop through original array A1 and insert the element in its sorted
position in A2, if I find a match eliminate, mark all copies of that element
as "duplicate"
3) loop through original array A1 again and find the first number that is
not listed as a duplicate in A2
I think at the time of that interview, I pointed out that this wasnt a very
good algorithm as it would be something like O(2*(n^2)), but it will get the
job done (I did get offered that job actually, but stupidly turned it
down)...
I then offered up another solution at that interview which I thought was
better:
1) build a copy of A1 that has the original position of each element
2) sort A1
3) loop through sorted A1 deleting any duplicate items
4) now A1 will only contain unique items... so loop through it again looking
for the smallest position that was marked in step 1
this was kind of the algorithm I started out with in interview #2 (which I
really wanted, but didnt get)...as the interviewer asked me to give him the
run time of that algorithm which I said was:
step #1: n
step #2: n log n
step #3: n
step #4: n
resulting in 3n + n log n...
this was a lot better then the original algorithm, but still not too great
because of the 3n term...
I got my rejection from the 2nd place yesterday, so I kind of was up all
night trying to figure out how to get it faster...
I came up with something like this:
1) loop through A1 building a customized AVL tree... an AVL tree insertion
is O(log n), so building this tree should take O(n log n).. the customized
part is... each node inserted will bring along its original position in the
array (0..n)... if I run across a duplicate node, I will mark the position
of the node thats already in the tree as "-1" and dump the node I'm trying
to insert
so basically I've spent O(n log n) to combine steps 1, 2 & 3 of my last
solution... basically taking 2n + n log n down to O(n log n)...
2) but now I need to visit every node in the tree to find out which one has
the lowest position which is O(n) obviously ignoring any nodes I marked
as -1.
so this final "optimized" solution I came up with last night would still
require O(n + n log n).
Is there any way to optimize this further? or a better algorithm all
together? I'm thinking perhaps to keep track of the first unique node as I'm
creating the tree? but I couldn't nail down an idea that would work...
I'm sort of invisioning the tree having a member called m_nPos initialized
to 0... and as I'm building the tree, if I find a duplicate element at
m_nPos, to increment it. So basically I insert element 0, then later on, I
find a duplicate for it, so the first unique item can't possibly be at zero,
it has to be at least at position 1. But then as I did a few test cases, I
poke holes in this idea very easily.
Any ideas?
interview question twice now... and have stupidly kind of blown it twice...
(although I've gotten better)... time to finally figure this thing out...
basically the interview question is: given an unsorted listed such as:
3,1,3,7,1,2,4,4,3
find the FIRST UNIQUE number, in this case 7... and of course the list can
be millions long...
the first time I got this question, my solution was something like:
1) declare a second array A2 with n elements
2) loop through original array A1 and insert the element in its sorted
position in A2, if I find a match eliminate, mark all copies of that element
as "duplicate"
3) loop through original array A1 again and find the first number that is
not listed as a duplicate in A2
I think at the time of that interview, I pointed out that this wasnt a very
good algorithm as it would be something like O(2*(n^2)), but it will get the
job done (I did get offered that job actually, but stupidly turned it
down)...
I then offered up another solution at that interview which I thought was
better:
1) build a copy of A1 that has the original position of each element
2) sort A1
3) loop through sorted A1 deleting any duplicate items
4) now A1 will only contain unique items... so loop through it again looking
for the smallest position that was marked in step 1
this was kind of the algorithm I started out with in interview #2 (which I
really wanted, but didnt get)...as the interviewer asked me to give him the
run time of that algorithm which I said was:
step #1: n
step #2: n log n
step #3: n
step #4: n
resulting in 3n + n log n...
this was a lot better then the original algorithm, but still not too great
because of the 3n term...
I got my rejection from the 2nd place yesterday, so I kind of was up all
night trying to figure out how to get it faster...
I came up with something like this:
1) loop through A1 building a customized AVL tree... an AVL tree insertion
is O(log n), so building this tree should take O(n log n).. the customized
part is... each node inserted will bring along its original position in the
array (0..n)... if I run across a duplicate node, I will mark the position
of the node thats already in the tree as "-1" and dump the node I'm trying
to insert
so basically I've spent O(n log n) to combine steps 1, 2 & 3 of my last
solution... basically taking 2n + n log n down to O(n log n)...
2) but now I need to visit every node in the tree to find out which one has
the lowest position which is O(n) obviously ignoring any nodes I marked
as -1.
so this final "optimized" solution I came up with last night would still
require O(n + n log n).
Is there any way to optimize this further? or a better algorithm all
together? I'm thinking perhaps to keep track of the first unique node as I'm
creating the tree? but I couldn't nail down an idea that would work...
I'm sort of invisioning the tree having a member called m_nPos initialized
to 0... and as I'm building the tree, if I find a duplicate element at
m_nPos, to increment it. So basically I insert element 0, then later on, I
find a duplicate for it, so the first unique item can't possibly be at zero,
it has to be at least at position 1. But then as I did a few test cases, I
poke holes in this idea very easily.
Any ideas?