List of integers & L.I.S.

B

Bryan Olson

n00m said:
> Oh!
> Seems you misunderstand me!
> See how the last block in your code should look:
>
> for tc in range(10):
> _ = stdin.readline()
> sequence = [int(ch) for ch in stdin.readline().split()]
> supers = supernumbers(sequence)
> print len(supers)
> for i in supers:
> print i,

Ah, I misunderstood the input. I thought they'd run it once for
each of their test cases. Handling exactly ten inputs sucks, so
how about:

while True:
if not stdin.readline().strip():
break
sequence = [int(ch) for ch in stdin.readline().split()]
supers = supernumbers(sequence)
print len(supers)
for i in supers:
print i,
print
 
B

Bryan Olson

n00m said:
> It also timed out:(

Could be. Yet you did write:
> It's incredibly fast!


I never intended to submit this program for competition. The
contest ranks in speed order, and there is no way Python can
compete with truly-compiled languages on such low-level code.
I'd bet money that the algorithm I used (coded in C) can run
with the winners. I also think I'd wager that the Python version
outright trumps them on code size.


My first version bombed for the zero-length sequence. That was a
mistake, sorry, but it may not be one of their test-cases. I
wonder how many of the accepted entries would perform properly.
 
N

n00m

Bryan said:
Could be. Yet you did write:
I just was obliged to exclaim "It's incredibly fast!"
because I THOUGHT your first version handled ALL TEN
testcases from the input. But the code read from the
*20-lines* input *ONLY 2* its first lines.

Usually they place heavy data testcase(s) at the end
of the (whole) input. Like this:

3
2 3 1
7
4 5 6 1 2 7 3
....
....
....
100000
456 2 6789 ... ... ... ... ... 55444 1 ... 234

Surely producing an answer for list [2, 3, 1] will
be "incredibly fast" for ANY language and for ANY
algorithm.
My first version bombed for the zero-length sequence. That was a
mistake, sorry, but it may not be one of their test-cases.
In my turn I can bet there's not an empty sequence testcase in the
input.
I
wonder how many of the accepted entries would perform properly.
Info of such kind they keep in secret (along with what the input
data are).

One more thing.
They (the e-judge's admins) are not gods and they don't warrant
that if they put 9 sec timelimit for a problem then this problem
can be "solved" in all accepted languages (e.g. in Python).
I never intended to submit this program for competition.
"Competition" is not quite relevant word here. It just LOOKS as
if it is a "regular" competetion. There nobody blames anybody.
Moreover, judging on my own experience, there nobody is even
interested in anybody. It's just a fun (but very useful fun).
 
N

n00m

Bryan;
My own version also timed out.
And now I can tell: it's incredibly SLOW.
Nevertheless it would be interesting to compare
speed of my code against yours. I can't do it myself
because my Python is of 2.3.4 version.

Just uncomment "your" part.


import bisect

def oops(w,a,b):
for m in w:
j=bisect.bisect_left(a,m)
a.insert(j,m)
b.insert(j,max(b[:j]+[0])+1)

def n00m(n,w):
a,b=[],[]
oops(w,a,b)
v=map(lambda x: -x, w[::-1])
c,d=[],[]
oops(v,c,d)
e=map(sum, zip(b, d[::-1]))
mx=max(e)
f=[]
for i in xrange(n):
if e==mx:
f.append(i+1)
print len(f)


def one_way(seq):
n = len(seq)
dominators = [n + 1] * (n * 1)
score = [None] * n
end = 0
for (i, x) in enumerate(seq):
low, high = 0, end
while high - low > 10:
mid = (low + high) >> 1
if dominators[mid] < x:
low = mid + 1
else:
high = mid + 1
while dominators[low] < x:
low += 1
dominators[low] = x
score = low
end = max(end, low + 1)
return score

def supernumbers(seq):
forscore = one_way(seq)
opposite = [len(seq) - x for x in reversed(seq)]
backscore = reversed(one_way(opposite))
score = map(sum, zip(forscore, backscore))
winner = max(score + [0])
return sorted([seq for i in range(len(seq)) if score ==
winner])

def b_olson(sequence):
supers = supernumbers(sequence)
print len(supers)



import random, time
n=50000
w=range(1,n+1)
random.shuffle(w)

t=time.time()
n00m(n,w)
print 'n00m:',time.time()-t

"""
t=time.time()
b_olson(w)
print 'b_olson:',time.time()-t
"""
 
T

Tim Peters

[Bryan Olson, on the problem at
http://spoj.sphere.pl/problems/SUPPER/
]
I never intended to submit this program for competition. The
contest ranks in speed order, and there is no way Python can
compete with truly-compiled languages on such low-level code.
I'd bet money that the algorithm I used (coded in C) can run
with the winners. I also think I'd wager that the Python version
outright trumps them on code size.

Oh, it's not that bad <wink>. I took a stab at a Python program for
this, and it passed (3.44 seconds). It just barely made it onto the
list of "best" solutions, which I also guess is ranked by elapsed
runtime. The Java program right above it took 2.91 seconds, but ate
more than 27x as much RAM ;-)

I didn't make any effort to speed this, beyond picking a reasonable
algorithm, so maybe someone else can slash the runtime (while I
usually enjoy such silliness, I can't make time for it at present).
I'll include the code below.

Alas, without access to the input data they use, it's hard to guess
what might be important in their data. On my home box, chewing over
random 100,000-element permutations took less than a second each
(including the time to generate them); I'm pretty sure they're using
slower HW than mine (3.4 GHz P5).
My first version bombed for the zero-length sequence. That was a
mistake, sorry, but it may not be one of their test-cases. I
wonder how many of the accepted entries would perform properly.

No idea here, and didn't even think about it.

Notes: the `all` returned by crack() is a list such that all is
list of all (index, value) pairs such that the longest increasing
subsequence ending with `value` is of length i+1; `value` is at index
`index` in the input permutation. The maximal LISs thus end with the
values in all[-1]. findall() iterates backwards over `all`, to
accumulate all the values that appear in _some_ maximal LIS. There's
clearly wasted work in findall() (if someone is looking for an
algorithmic point to attack). Curiously, no use is made of that
values are integers, outside of input and output; any values with a
total ordering would work fine in crack() and findall().

"""
# http://spoj.sphere.pl/problems/SUPPER/

def crack(xs):
from bisect import bisect_right as find
smallest = []
all = []
n = 0
for index, x in enumerate(xs):
i = find(smallest, x)
if i == n:
smallest.append(x)
all.append([(index, x)])
n += 1
else:
all.append((index, x))
if x < smallest:
smallest = x
return all

def findall(all):
constraints = all[-1]
allints = [pair[1] for pair in constraints]
for i in xrange(len(all) - 2, -1, -1):
survivors = []
for pair in all:
index, value = pair
for index_limit, value_limit in constraints:
if index < index_limit and value < value_limit:
survivors.append(pair)
allints.append(value)
break
constraints = survivors
return sorted(allints)

def main():
import sys
while 1:
n = sys.stdin.readline()
if not n:
break
n = int(n)
perm = map(int, sys.stdin.readline().split())
assert n == len(perm)
supers = findall(crack(perm))
perm = None # just to free memory
print len(supers)
print " ".join(map(str, supers))

if __name__ == "__main__":
main()
"""
 
N

n00m

Tim Peters;
INCREDIBLE~
241433 2005-09-11 04:23:40 Tim Peters accepted 3.44 7096 PYTH
BRAVO!
I just wonder have I grey cells enough for to understand how your
algo works... and hopefully it's not your last solved problem on
the contester.
I'm pretty sure they're using
slower HW than mine (3.4 GHz P5).
As I mentioned before their 4 identical machines are PIII Xeon 700MHz.

PS:
each accepted solution automatically gets into "Best Solutions" list.
 
T

Tim Peters

[Tim Peters, on the problem at
http://spoj.sphere.pl/problems/SUPPER/
]
[[email protected]]
INCREDIBLE~
241433 2005-09-11 04:23:40 Tim Peters accepted 3.44 7096 PYTH
BRAVO!

It's different now ;-) I added the two lines

import psyco
psyco.full()

and time dropped to 2.29, while memory consumption zoomed:

2005-09-11 18:44:57 Tim Peters accepted 2.29 42512 PYTH

That surprised me: my own test cases on Windows using Python 2.4.1
enjoyed the same order of speedup, but barely increased RAM usage.
I just wonder have I grey cells enough for to understand how your
algo works...

You do! Work out some small examples by hand, and it will become
clear; be sure to read the comments before the code, because they
explain it.
 
B

Bryan Olson

Tim said:
> [Bryan Olson, on the problem at
> http://spoj.sphere.pl/problems/SUPPER/
> ]
>
>>I never intended to submit this program for competition. The
>>contest ranks in speed order, and there is no way Python can
>>compete with truly-compiled languages on such low-level code.
>>I'd bet money that the algorithm I used (coded in C) can run
>>with the winners. I also think I'd wager that the Python version
>>outright trumps them on code size.
>
> Oh, it's not that bad <wink>. I took a stab at a Python program for
> this, and it passed (3.44 seconds). [...]
> I didn't make any effort to speed this, beyond picking a reasonable
> algorithm, so maybe someone else can slash the runtime

Hmmm ... I used the Theta(n lg n) algorithm ... how the heck...
Aha! The 'bisect' module changed since last I looked. It still
has the Python implementation, but then the last few lines say:

# Overwrite above definitions with a fast C implementation
try:
from _bisect import bisect_right, bisect_left, insort_left,
insort_right, insort, bisect
except ImportError:
pass

Binary-search is the inner-loop in this algorithm. I wrote my
own binary-search, so I was executing Theta(n lg n) Python
statements. Tim's use of bisect means that his inner-loop is
in C, so he does Theta(n) Python statements and Theta(n lg n) C
statement.

The key to fast Python: use a good algorithm, and keep the inner
loop in C.
 
T

Tim Peters

[Tim Peters, on the problem at
http://spoj.sphere.pl/problems/SUPPER/
]
[Bryan Olson]
Hmmm ... I used the Theta(n lg n) algorithm ... how the heck...
Aha! The 'bisect' module changed since last I looked. It still
has the Python implementation, but then the last few lines say:

# Overwrite above definitions with a fast C implementation
try:
from _bisect import bisect_right, bisect_left, insort_left, insort_right, insort, bisect
except ImportError:
pass

Binary-search is the inner-loop in this algorithm. I wrote my
own binary-search, so I was executing Theta(n lg n) Python
statements. Tim's use of bisect means that his inner-loop is
in C, so he does Theta(n) Python statements and Theta(n lg n) C
statement.

That's part of it, but doesn't account for enough. If I disable the C
implementation of bisect (replace the "from ..." import above with a
"pass"), the program takes about 50% longer, which would still leave
it well under the 9-second SPOJ time limit. That's without psyco.
With psyco, it takes about the same time with the Python
implementation of bisect as with the C implementation.

Proably more important was my findall() routine. It does redundant
work, and its runtime seems hard to analyze, but it's conceptually
simple and actual timing showed it takes about a third of the time
consumed by my crack() on random 100,000-element permutations. In
fact, findall() takes very close to the amount of time needed just to
read a giant line of input and convert it to a list of ints (without
psyco; with psyco converting the input takes longer than findall()).

Possible surprise: there's a simple trick that allows rewriting
findall() to produce the result list in sorted order directly, instead
of building it in "random" order and sorting it at the end. That made
no measurable difference.
The key to fast Python: use a good algorithm,
Absolutely!

and keep the inner loop in C.

Usually ;-)

Add another: especially for long-term maintenance, readability,
stability and performance, use a library function instead of rolling
your own. The chance that Raymond Hettinger is going to recode _your_
functions in C is approximately 0 ;-)
 
N

n00m

Got it! He is a kind of pythonic monsters.

Btw, why it's impossible to reply to old threads?
Namely, there're no more "Reply" link in them.
Only "Reply to author" etc.
 
T

Terry Reedy

n00m said:
Who is Raymond Hettinger?

The Python developer who, in the last few years, has perhaps been the most
active in coding or recoding library modules, such as itertools and sets,
in C. He also partipates in development discussions, helps with
SourceForge tracker items (RFEs, bugs, and patches) and occasionally posts
here, especially about his modules, for which he is the expert.

Terry J. Reedy
 
R

Reinhold Birkenfeld

n00m said:
Got it! He is a kind of pythonic monsters.

Btw, why it's impossible to reply to old threads?
Namely, there're no more "Reply" link in them.
Only "Reply to author" etc.

Perhaps because you are not using a real Usenet client?

Reinhold
 
A

antonmuhin

I hope nobody have posted similar solution (it's tested, but I didn't
submit it to contest):

from bisect import bisect_right as find

def supernumbers(ls):
indices = [0]*len(ls)
for i, e in enumerate(ls):
indices[e - 1] = i

result = [None]*len(ls)

borders = []

for i in indices:
ind = find(borders, i)
result = ind + 1
if ind == len(borders):
borders.append(i)
else:
borders[ind] = i

return result

At least, it looks shorter than other solutins.
Of course, it allows number of optimizations like
prefetching length and caching borders.append.

If I'm correct, it takes O(n*log (max sequence length)) time that can
be a win for huge datasets.

With the best regards,
anton.
 
A

antonmuhin

Hellom n00m!

I beg your pardon for not having replied for such long time---I was
really busy.

Yes, the posted code doesn't solve the supernumbers problem, but solves
the problem you posted first as I understood it :) --- for each number
find a position of this number in the longest sequence it belongs to.

Ok, anyway, I hope that I can suggest a solution to the supernumbers
problem (see below).

Actually, as I find out today testing it against Tim Peters's solution,
they are of the same kind, but I use one additional optimimization.

And the last note: I didn't persue the maximal performace, rather
clarity of the algorithm.

With the best regards,
anton.

from bisect import bisect as find

def supernumbers(l):
indices = [0]*len(l)
for i, e in enumerate(l):
indices[e - 1] = i

# Now we can find out index of each element in the longest ascending
sequence it belongs to
# I call this index 'generation'

# ess is an array indexed by generations and holding
# ascending list of generation's elements
# Note: suffix s stands for a list, therefore ess---list of lists
of elements
# Note: ess holds not the elements itself, but elements decreased
by 1
ess = []

# Borders i.e. positions that separate generations
borders = []

for e, i in enumerate(indices):
ind = find(borders, i)
if ind < len(borders):
borders[ind] = i
ess[ind].append(e)
else:
borders.append(i)
ess.append([e])

# Now we should filter out those elements that don't
# belong to the longest sequences
gens = reversed(ess) # walk generations in reverse order
fes = gens.next() # fes stands for elements' filter
r = fes # r is a list of supernumbers; obvioulsy all the elements of
the latest generation belong to r
for es in gens:
new_fes = []

s = 0 # As elements are sorted ascendingly we can check lesser and
lesser lists
for e in es:
s = find(fes, e, s)
if s == len(fes):
# There is no more elements in filter that are greater
# than elements in the current generation
break
if indices[fes] > indices[e]: # Nice---element e belongs to
the longest sequence
new_fes.append(e)

# Note: the following invariant holds: len(new_fes) > 0

fes = new_fes
r += new_fes

return [e + 1 for e in r] # As we used numbers from 0 internally
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,264
Messages
2,571,323
Members
48,007
Latest member
Elvis60357

Latest Threads

Top