Math/CompSci Interview Question - Thoughts?

P

Pascal J. Bourguignon

mike3 said:
Andrew Tomazos said:
I was posed the following question in a technical interview for a
Software Engineering position by a major multinational NASDAQ company:
[Paraphrasing]  "You are given an array of 1,000,000 32-bit integers.
One int value x occurs 500,001 times or more in the array.  Specify an
algorithm to determine x."
The assumption being extra points for efficiency.
I initially came up with a linear space, linear time solution.  With
some prompting and hints from the interviewer we worked our way to a
smaller constant space and linear time algorithm.  That being
basically:
int findmode(int* p, int n)
{
    int count[32];
    for(int i = 0; i < 32; i++)
        count = 0;

    for (int i = 0; i < n; i++)
         for (int j = 0; j < 32; j++)
               if (p & (1 << j)) // if bit j is on
                    count[j]++;
               else
                    count[j]--;

    int x = 0;
    for (int i = 0; i < 32; i++)
        if (count > 0)
            x = x | (1 << i);

    return x;
}
The idea here is to count the frequency of each of the 32 bits in the
array separately, knowing that these bit counts will be dominated by
the mode.
The interviewer already knew the answer, so I can only assume the test
was based on how much help he had to give me to arrive at it.
My question about this is as follows.  I, perhaps boldly, claim to
employers to have a "masters-equivalant" experience/knowledge of
computer science and math.  Should I have been able to come up with
this solution without prompting or help?

If what you're asking is whether anybody having a master in CS and
maths would have been able to come up with this solution in the
interview time, I guess we can answer that definitely no, otherwise
there would be no point in trying to select candidates with this test.

Obviously, it's because some people (with or without a master diploma,
this really isn't relevant) get or don't get it, that this test is
useful for the recruiter.

Now if you want this kind of jobs, yes you should better be able to
come up with smart solutions to little puzzles like this in
interviews.


You mean he should be able to come up with them without prompting,
right?


Put yourself in the place of the employer! Between a guy who will
find the solution with two clues, and one who will find it alone, in
the same time, who will you hire?

What I'm saying here is tautologic: if you want the job that is
selected by this filter, then you better have to pass this filter!

(It's up to the employer to ensure that the filter matches the
requirements of the job).


You say not a "single" one, so does that mean no course would do it,
or
you'd need multiple ones? After all, courses and books (multiple) can
be used to increase the first 2 areas (knowledge of algorithms and
mathematics).

The later of course.

And how does one develop that? More specifically, the "speed" of
coming up with the solution, not just being able to come up with
one.

Learning, training, repeatition. Some say the brain is like a muscle,
you have to train it.

Basically, it is a kind of "intelligence", which you want to improve.


What would that be?

We cannot know in advance. The point here is that between somebody
who spends 100% of his time learning CS, and somebody who spends 90%
of his time learning CS, and 10% of his time 'learning' something
else, the later will probably be able to invent algorithms that the
first one won't.

For example, in the domain of artificial intelligence, it is
worthwhile to have some idea of neurobiology...
 
P

Pascal J. Bourguignon

mike3 said:
<snip>

Oh yes, and I forgot to add: how does one train the "symbolic"
thinking?


I don't have a statistically valid study, but I observed a few people
who didn't play with lego (or at least mecano) as a child had more
difficulty in building programs.
 
B

bartc

Andrew Tomazos said:
I was posed the following question in a technical interview for a
Software Engineering position by a major multinational NASDAQ company:

[Paraphrasing] "You are given an array of 1,000,000 32-bit integers.
One int value x occurs 500,001 times or more in the array. Specify an
algorithm to determine x."

How was it determined that one value occurs 500001 or more times?

Perhaps more attention should have been paid to x at the time the 500001
figure was calculated...

I quite liked Malcolm's idea of taking a small but random sample, choosing a
handful of likely candidates, then counting occurrences of those. If no
counts reach 500001, then repeat a few more times or until you know you've
been given dud information.
 
P

Phil Carmody

Andrew Tomazos said:
I was posed the following question in a technical interview for a
Software Engineering position by a major multinational NASDAQ company:

[Paraphrasing] "You are given an array of 1,000,000 32-bit integers.
One int value x occurs 500,001 times or more in the array. Specify an
algorithm to determine x."

The assumption being extra points for efficiency.

I initially came up with a linear space, linear time solution. With
some prompting and hints from the interviewer we worked our way to a
smaller constant space and linear time algorithm. That being
basically:

int findmode(int* p, int n)
{
int count[32];
for(int i = 0; i < 32; i++)
count = 0;

for (int i = 0; i < n; i++)
for (int j = 0; j < 32; j++)
if (p & (1 << j)) // if bit j is on
count[j]++;
else
count[j]--;

int x = 0;
for (int i = 0; i < 32; i++)
if (count > 0)
x = x | (1 << i);

return x;
}

The idea here is to count the frequency of each of the 32 bits in the
array separately, knowing that these bit counts will be dominated by
the mode.

The interviewer already knew the answer, so I can only assume the test
was based on how much help he had to give me to arrive at it.

My question about this is as follows. I, perhaps boldly, claim to
employers to have a "masters-equivilant" experience/knowledge of
computer science and math. Should I have been able to come up with
this solution without prompting or help?


What solution? The above exhibits undefined behaviour.

Phil
 
A

Andrew Tomazos

int findmode(int* p, int n)
{
    int count[32];
    for(int i = 0; i < 32; i++)
        count = 0;

    for (int i = 0; i < n; i++)
         for (int j = 0; j < 32; j++)
               if (p & (1 << j)) // if bit j is on
                    count[j]++;
               else
                    count[j]--;

    int x = 0;
    for (int i = 0; i < 32; i++)
        if (count > 0)
            x = x | (1 << i);

    return x;
}
The idea here is to count the frequency of each of the 32 bits in the
array separately, knowing that these bit counts will be dominated by
the mode.

What solution? The above exhibits undefined behaviour.


If you spot a bug you might bother saying what it is. I wrote this
quickly and haven't tested it. Are you referring to the assumption
that int is 32 bit, or is there a typo or some other problem?
-Andrew.
 
J

jbriggs444

On Nov 28, 4:07 am, Phil Carmody <[email protected]>
wrote:
int findmode(int* p, int n)
{
int count[32];
for(int i = 0; i < 32; i++)
count = 0;
for (int i = 0; i < n; i++)
for (int j = 0; j < 32; j++)
if (p & (1 << j)) // if bit j is on
count[j]++;
else
count[j]--;
int x = 0;
for (int i = 0; i < 32; i++)
if (count > 0)
x = x | (1 << i);
return x;
}
The idea here is to count the frequency of each of the 32 bits in the
array separately, knowing that these bit counts will be dominated by
the mode.
What solution? The above exhibits undefined behaviour.

If you spot a bug you might bother saying what it is. I wrote this
quickly and haven't tested it. Are you referring to the assumption
that int is 32 bit, or is there a typo or some other problem?
-Andrew.

I tried this with:
* int p[9] = {4, 4, 4, 5, 6, 7, 5, 5, 4};
* int mode = findmode(p, 9);
and it returns 5.
The bit-counting method correctly returns 4.

It looks to me like you missed Phase 1 from the Fischer&Salzberg
algorithm, namely
arrange the elements so that no two adjacent elements are the same.http://www.cs.yale.edu/publications/techreports/tr252.pdf- Hide quoted text -

- Show quoted text -


I think you missed the definition of the problem in this thread. It
is one of the givens of the problem that the array has a strict
"majority" value that appears in over half ot its entries.

You gave a 9 member array whose mode appears in less than half of the
elements. That violates the givens of the problem. The algorithm is
required to deliver correct results for input data that is within
spec. It is permitted to engage in undefined behavior for input data
that is out of spec.

However, unless you're going to snipe about implementation defined
precision for "int", I don't see any undefined behavior even for the
spec-violating input that you provided.

You go on to claim that the code quoted above gives an answer of 5 but
that the "bit-counting method" returns 4. What you fail to realize is
that the code above _is_ the bit-counting method
and that it returns 4, not 5.

I actually ran the code. Did you?

$ ./a.out
mode is 4

In summary, in spite of your accusation of ill-defined behavior, in
spite of your providing the algorithm with non-compliant input and in
spite of your claim of incorrect output, the fact of the matter is
that the algorithm got what you consider to be the right answer.

What were you complaining about again?
 
D

Danio

I think you missed the definition of the problem in this thread. It
is one of the givens of the problem that the array has a strict
"majority" value that appears in over half ot its entries.

Sorry my mistake - strict majority truned into modal value in my head
in between the original description and the latter discussions in this
thread.
You go on to claim that the code quoted above gives an answer of 5 but
that the "bit-counting method" returns 4. What you fail to realize is
that the code above _is_ the bit-counting method
and that it returns 4, not 5.

Again a mis-reading of Phil Carmody's quote - assumed that he would be
quoting
the most recent code posting of Andrew's not the initial one.
I actually ran the code. Did you? Yes

In summary, in spite of your accusation of ill-defined behavior, in
spite of your providing the algorithm with non-compliant input and in
spite of your claim of incorrect output, the fact of the matter is
that the algorithm got what you consider to be the right answer.

Well it wasn't my accusation of ill-defined behaviour: that was Phil
Carmody.
I was merely trying to find out what he didn't like about the
algorithm,
unfortunately with a horribly flawed test case and the wrong
algorithm...
 
O

Ostap S. B. M. Bender Jr.

I was posed the following question in a technical interview for a
Software Engineering position by a major multinational NASDAQ company:

[Paraphrasing]  "You are given an array of 1,000,000 32-bit integers.
One int value x occurs 500,001 times or more in the array.  Specify an
algorithm to determine x."

The assumption being extra points for efficiency.

I initially came up with a linear space, linear time solution.  With
some prompting and hints from the interviewer we worked our way to a
smaller constant space and linear time algorithm.  That being
basically:

int findmode(int* p, int n)
{
    int count[32];
    for(int i = 0; i < 32; i++)
        count = 0;

    for (int i = 0; i < n; i++)
         for (int j = 0; j < 32; j++)
               if (p & (1 << j)) // if bit j is on
                    count[j]++;
               else
                    count[j]--;

    int x = 0;
    for (int i = 0; i < 32; i++)
        if (count > 0)
            x = x | (1 << i);

    return x;

}

The idea here is to count the frequency of each of the 32 bits in the
array separately, knowing that these bit counts will be dominated by
the mode.

The interviewer already knew the answer, so I can only assume the test
was based on how much help he had to give me to arrive at it.

My question about this is as follows.  I, perhaps boldly, claim to
employers to have a "masters-equivilant" experience/knowledge of
computer science and math.  Should I have been able to come up with
this solution without prompting or help?

Would you expect someone with a CompSci Masters or PhD from some major
ivy league university to be able to come up with this solution without
help?

If so in what course or from which book would you expect to learn the
required knowledge to come up with the above solution?

Is the skill to be able to come up with such an algorithm something
that is learned by studying lots of algorithms and being able to mix
and match the "tricks"?  If so, what is a similar commonly known
algorithm(s) on which the above solution could have been based?

Or, is the ability to invent such a solution simply a matter of IQ?
Some people have the talent to solve the puzzle, see the pattern and
come up with the solution - and some just don't?


How badly would the following simple algorithm work:

Look at the first column. Determine which digit occurs more - 0 or 1 -
among the 1,000,000 integers, and mark those integers, that don't have
this digit, as "R" (for "rejects")

Then go to the second column. Determine which digit occurs more - 0 or
1 - among the integers not marked "R", and mark those integers, that
don't have this digit, as "R".

..
..
..

Then go to the i'th column. Determine which digit occurs more - 0 or 1
- among the integers not marked "R", and mark those integers, that
don't have this digit, as "R".

At the end, return any integer not marked by "R".
 
A

Andrew Tomazos

I was posed the following question in a technical interview for a
Software Engineering position by a major multinational NASDAQ company:
[Paraphrasing]  "You are given an array of 1,000,000 32-bit integers.
One int value x occurs 500,001 times or more in the array.  Specify an
algorithm to determine x."
The assumption being extra points for efficiency.
I initially came up with a linear space, linear time solution.  With
some prompting and hints from the interviewer we worked our way to a
smaller constant space and linear time algorithm.  That being
basically:
int findmode(int* p, int n)
{
    int count[32];
    for(int i = 0; i < 32; i++)
        count = 0;

    for (int i = 0; i < n; i++)
         for (int j = 0; j < 32; j++)
               if (p & (1 << j)) // if bit j is on
                    count[j]++;
               else
                    count[j]--;

    int x = 0;
    for (int i = 0; i < 32; i++)
        if (count > 0)
            x = x | (1 << i);

    return x;

The idea here is to count the frequency of each of the 32 bits in the
array separately, knowing that these bit counts will be dominated by
the mode.
The interviewer already knew the answer, so I can only assume the test
was based on how much help he had to give me to arrive at it.
My question about this is as follows.  I, perhaps boldly, claim to
employers to have a "masters-equivilant" experience/knowledge of
computer science and math.  Should I have been able to come up with
this solution without prompting or help?
Would you expect someone with a CompSci Masters or PhD from some major
ivy league university to be able to come up with this solution without
help?
If so in what course or from which book would you expect to learn the
required knowledge to come up with the above solution?
Is the skill to be able to come up with such an algorithm something
that is learned by studying lots of algorithms and being able to mix
and match the "tricks"?  If so, what is a similar commonly known
algorithm(s) on which the above solution could have been based?
Or, is the ability to invent such a solution simply a matter of IQ?
Some people have the talent to solve the puzzle, see the pattern and
come up with the solution - and some just don't?

How badly would the following simple algorithm work:

Look at the first column. Determine which digit occurs more - 0 or 1 -
among the 1,000,000 integers, and mark those integers, that don't have
this digit, as "R" (for "rejects")


This requires linear O(n) space to store these reject marks. The two
previously discussed algorithms only require constant O(1) space.
-Andrew.
 
O

Ostap S. B. M. Bender Jr.

I was posed the following question in a technical interview for a
Software Engineering position by a major multinational NASDAQ company:
[Paraphrasing]  "You are given an array of 1,000,000 32-bit integers.
One int value x occurs 500,001 times or more in the array.  Specify an
algorithm to determine x."
The assumption being extra points for efficiency.
I initially came up with a linear space, linear time solution.  With
some prompting and hints from the interviewer we worked our way to a
smaller constant space and linear time algorithm.  That being
basically:
int findmode(int* p, int n)
{
    int count[32];
    for(int i = 0; i < 32; i++)
        count = 0;
    for (int i = 0; i < n; i++)
         for (int j = 0; j < 32; j++)
               if (p & (1 << j)) // if bit j is on
                    count[j]++;
               else
                    count[j]--;
    int x = 0;
    for (int i = 0; i < 32; i++)
        if (count > 0)
            x = x | (1 << i);
    return x;
}
The idea here is to count the frequency of each of the 32 bits in the
array separately, knowing that these bit counts will be dominated by
the mode.
The interviewer already knew the answer, so I can only assume the test
was based on how much help he had to give me to arrive at it.
My question about this is as follows.  I, perhaps boldly, claim to
employers to have a "masters-equivilant" experience/knowledge of
computer science and math.  Should I have been able to come up with
this solution without prompting or help?
Would you expect someone with a CompSci Masters or PhD from some major
ivy league university to be able to come up with this solution without
help?
If so in what course or from which book would you expect to learn the
required knowledge to come up with the above solution?
Is the skill to be able to come up with such an algorithm something
that is learned by studying lots of algorithms and being able to mix
and match the "tricks"?  If so, what is a similar commonly known
algorithm(s) on which the above solution could have been based?
Or, is the ability to invent such a solution simply a matter of IQ?
Some people have the talent to solve the puzzle, see the pattern and
come up with the solution - and some just don't?

How badly would the following simple algorithm work:
Look at the first column. Determine which digit occurs more - 0 or 1 -
among the 1,000,000 integers, and mark those integers, that don't have
this digit, as "R" (for "rejects")

This requires linear O(n) space to store these reject marks.  The two
previously discussed algorithms only require constant O(1) space.
 


Do you mean this:

int findmode(int* p, int n)
{
int count[32];
for(int i = 0; i < 32; i++)
count = 0;

for (int i = 0; i < n; i++)
for (int j = 0; j < 32; j++)
if (p & (1 << j)) // if bit j is on
count[j]++;
else
count[j]--;

int x = 0;
for (int i = 0; i < 32; i++)
if (count > 0)
x = x | (1 << i);

return x;

}


Doesn't count[j] require O(log n) space?


But of course O(log n) is still better than O(n)
 
A

Andrew Tomazos

I was posed the following question in a technical interview for a
Software Engineering position by a major multinational NASDAQ company:
[Paraphrasing]  "You are given an array of 1,000,000 32-bit integers.
One int value x occurs 500,001 times or more in the array.  Specify an
algorithm to determine x."
The assumption being extra points for efficiency.
I initially came up with a linear space, linear time solution.  With
some prompting and hints from the interviewer we worked our way to a
smaller constant space and linear time algorithm.  That being
basically:
int findmode(int* p, int n)
{
    int count[32];
    for(int i = 0; i < 32; i++)
        count = 0;
    for (int i = 0; i < n; i++)
         for (int j = 0; j < 32; j++)
               if (p & (1 << j)) // if bit j is on
                    count[j]++;
               else
                    count[j]--;
    int x = 0;
    for (int i = 0; i < 32; i++)
        if (count > 0)
            x = x | (1 << i);
    return x;
}
The idea here is to count the frequency of each of the 32 bits in the
array separately, knowing that these bit counts will be dominated by
the mode.
The interviewer already knew the answer, so I can only assume the test
was based on how much help he had to give me to arrive at it.
My question about this is as follows.  I, perhaps boldly, claim to
employers to have a "masters-equivilant" experience/knowledge of
computer science and math.  Should I have been able to come up with
this solution without prompting or help?
Would you expect someone with a CompSci Masters or PhD from some major
ivy league university to be able to come up with this solution without
help?
If so in what course or from which book would you expect to learn the
required knowledge to come up with the above solution?
Is the skill to be able to come up with such an algorithm something
that is learned by studying lots of algorithms and being able to mix
and match the "tricks"?  If so, what is a similar commonly known
algorithm(s) on which the above solution could have been based?
Or, is the ability to invent such a solution simply a matter of IQ?
Some people have the talent to solve the puzzle, see the pattern and
come up with the solution - and some just don't?
How badly would the following simple algorithm work:
Look at the first column. Determine which digit occurs more - 0 or 1 -
among the 1,000,000 integers, and mark those integers, that don't have
this digit, as "R" (for "rejects")

This requires linear O(n) space to store these reject marks.  The two
previously discussed algorithms only require constant O(1) space.
 

Do you mean this:

int findmode(int* p, int n)
{
    int count[32];
    for(int i = 0; i < 32; i++)
        count = 0;

    for (int i = 0; i < n; i++)
         for (int j = 0; j < 32; j++)
               if (p & (1 << j)) // if bit j is on
                    count[j]++;
               else
                    count[j]--;

    int x = 0;
    for (int i = 0; i < 32; i++)
        if (count > 0)
            x = x | (1 << i);

    return x;

}

Doesn't count[j] require O(log n) space?

But of course O(log n) is still better than O(n)


Ummm, I'm not sure whether you are reffering to the number of ints
that make up the count array (32) or the size of the individual ints
themselves. The number of counts required is always 32 independant of
n, therefore that part is constant.

In the counting algorithm the size of each of the 32 ints has to be
large enough to hold a value of n, and so requires log2(n) bits. This
is also true of the Fisher algorithm which must maintain a similiar
count of maximum value n, and therefore also needs log2(n) bits.
These were generally referred to as constant space solutions, but I am
confused right now whether that is true anymore, and whether these two
solutions are actually technically O(log(n)) space. I remember
hearing something about there being an implied log operation on space
complexity in some context, but I'm not sure if I'm conflating this
with something else.
-Andrew.
 
W

websnarf

Andrew Tomazos said:
I was posed the following question in a technical interview for a
Software Engineering position by a major multinational NASDAQ company:

[Paraphrasing] "You are given an array of 1,000,000 32-bit integers.
One int value x occurs 500,001 times or more in the array. Specify an
algorithm to determine x."

I missed your original post, but dug up the following that might be of
interest:

http://www.cs.utexas.edu/users/moore/best-ideas/mjrty/index.html
 
R

Rob Johnson

Probably I miss something, but this algorithm gives D as the majority
element in the sequence AAAABCBCD. ???

The majority element has to account for more than 50% of the sample.
In the example above, there is no majority element (A is 4/9 of the
sample), so the algorithm does not guarantee anything.

Rob Johnson <[email protected]>
take out the trash before replying
to view any ASCII art, display article in a monospaced font
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,148
Messages
2,570,838
Members
47,385
Latest member
Joneswilliam01

Latest Threads

Top