Programming challenge: wildcard exclusion in cartesian products

Dinko Tenev · Mar 20, 2006

It would seem that your program is just filtering the full cartesian
product, right? The solution I'm looking for generates the elements
one-by-one so that it could be used in a loop.

OK, having read some of the comments so far, I have the feeling that I
may be missing the point in more than one way, so let's set this
straight:

If I understand correctly, for an alphabet S, and a subset W of S*
specified by the wildcards, you expect the enumeration of sequences of
length n to run in Theta( n*|S^n - W| ) instead of Theta( n*|S^n| ).

First, this doesn't seem to hold for your Python program. Try, for
example, S = { a, b, c }, W = { *a*b*, *b*c*, *c*a*, *b*a*, *c*b*,
*a*c* }, with some large values of n. Theta( n*|S^n - W| ) predicts
that the enumeration time should grow linearly with n, as |S^n - W| =
3, but if you take some measurements, you'd notice that it grows faster
than that.

Second, my current bet is that such an improvement in asymptotic
complexity is not possible, if we consider *both* pre-processing of the
wildcard set and subsequent enumeration. Speculation: the time for
building-up a smart structure to speed-up enumeration, together with
the time for enumerating the set using that structure, should sum up to
roughly Theta( n*|S^n| ), even with a really smart algorithm.

Even if you're willing to pay up-front for tighter loop execution
later, and you build a suitable structure for this purpose, you would
have to consider the structure's size, so here's another speculation:
such structure would likely take up Theta( |S^n| ) space in memory, in
the worst case.

I would really appreciate it if you could pour some light into what
you're trying to do exactly, and possibly point out anything that I
might have missed so far.

Cheers,

Dinko

Mark Tarver · Mar 20, 2006

Hi,

You wrote into the Qilang News group with your problem.
This is a solution in 17 lines of Qi for any n-product >= 2.
It falls short of your complete requirement since it uses
generate and then test, rather than interleaving the
two.

(define challenge
Patterns N X -> (filter (/. Y (member Y Patterns)) (n-product N X)))

(define n-product
2 X -> (cartesian-product l X X)
N X -> (cartesian-product c X (n-product (- N 1) X)))

(define cartesian-product
_ [ ] _ -> [ ]
c [X | Y] Z -> (append (map (/. W [X | W]) Z) (cartesian-product c Y
Z))
l [X | Y] Z -> (append (map (/. W [X W]) Z) (cartesian-product l Y
Z)))

(define filter
_ [] -> []
F [X | Y] -> (filter F Y) where (F X)
F [X | Y] -> [X | (filter F Y)])

(define member
_ [] -> false
X [Pattern | _] -> true where (query-prolog [[= Pattern X]])
X [_ | Patterns] -> (member X Patterns))

Notes:

Pattern filtering is done by a unification test within the member
function. You
can do this most easily by calling Qi Prolog using query-prolog.
Here's a test.

(42 -) (n-product 3 [a b c])
[[a a a] [a a b] [a a c] [a b a] [a b b] [a b c] [a c a] [a c b] [a c
c]
[b a a] [b a b] [b a c] [b b a] [b b b] [b b c] [b c a] [b c b] [b c
c]
[c a a] [c a b] [c a c] [c b a] [c b b] [c b c] [c c a] [c c b] [c c
c]]

OK, remove all lists beginning [a a ....].

(43-) (challenge [[a a | X]] 3 [a b c])
[[a b a] [a b b] [a b c] [a c a] [a c b] [a c c] [b a a] [b a b] [b a
c]
[b b a] [b b b] [b b c] [b c a] [b c b] [b c c] [c a a] [c a b] [c a
c]
[c b a] [c b b] [c b c] [c c a] [c c b] [c c c]]

Remove all lists beginning with a or b.

(51-) (challenge [[a | X] [b | X]] 3 [a b c])
[[c a a] [c a b] [c a c] [c b a] [c b b] [c b c] [c c a] [c c b] [c c
c]]

Mark

Mark Carter · Mar 20, 2006

I'd like to propose a coding challenge of my own. The challenge is to
reproduce the TEA (Tiny Encryption Algorith):
http://www.simonshepherd.supanet.com/tea.htm
in your language of choice.

Here's the code, just two simple functions:

void encipher(unsigned long *const v,unsigned long *const w,
const unsigned long *const k)
{
register unsigned long y=v[0],z=v[1],sum=0,delta=0x9E3779B9,
a=k[0],b=k[1],c=k[2],d=k[3],n=32;

while(n-->0)
{
sum += delta;
y += (z << 4)+a ^ z+sum ^ (z >> 5)+b;
z += (y << 4)+c ^ y+sum ^ (y >> 5)+d;
}

w[0]=y; w[1]=z;
}

void decipher(unsigned long *const v,unsigned long *const w,
const unsigned long *const k)
{
register unsigned long y=v[0],z=v[1],sum=0xC6EF3720,
delta=0x9E3779B9,a=k[0],b=k[1],
c=k[2],d=k[3],n=32;

/* sum = delta<<5, in general sum = delta * n */

while(n-->0)
{
z -= (y << 4)+c ^ y+sum ^ (y >> 5)+d;
y -= (z << 4)+a ^ z+sum ^ (z >> 5)+b;
sum -= delta;
}

w[0]=y; w[1]=z;
}

I had a crack at it in Lisp. My version doesn't work - but of greater
concern to me is that it doesn't appear nearly as compact as the C
version. Anyway, here's my Lisp code (no prizes for guessing that I'm a
noob to Lisp):

(defconstant delta 2654435769 ) ; delta= 0x9E3779B9

(defun floorn (n) (nth-value 0 (floor n)))

(defun >> (val num-bytes)
"Right-shift positive integer val by num-bytes"
(let* (t1 t2)
(setf t1 (expt 2 num-bytes))
(setf t2 (/ val t1))
(floor t2)))

(defun << (val num-bytes)
"Left-shift positive integer v by num-bytes"
(* val (expt 2 num-bytes)))

(defun <<4 (i) (<< i 4))

(defun byte-n (v n)
"Return the nth byte of a value v"
(let* ((bits-to-shift (* 8 (1- n)))
(shifted-value (>> v bits-to-shift)))
(logand shifted-value 256)))

(defun transform (v1 v2 v3 v4)
(let (t1 t2 t3)
(setf t1 (<<4 v1))
(setf t2 (expt v2 v1))
(setf t3 (expt v3 (>> v2 5)))
(+ t1 t2 t3 v4)))

(defun pack64 (b1 b2) (+ (<< b1 32) b2))

(defun encipher (v k)
(let ((sum 0)
(a (byte-n k 3)) ; a=k[0]
(b (byte-n k 2)) ; b=k[1]
(c (byte-n k 1)) ; c=k[2]
(d (byte-n k 0)) ; d=k[3]
(y (byte-n v 1)) ; y=v[4]
(z (byte-n v 0))) ; z=v[1]

(loop for n from 0 to 31 do ;n=32, while(n-->0)
(incf sum delta) ;sum += delta;
(incf y (transform z a sum b)) ; y += (z << 4)+a ^ z+sum ^ (z >> 5)+b
(incf z (transform y c sum d)) ;z += (y << 4)+c ^ y+sum ^ (y >> 5)+d;
)

(pack64 y z) ; w[0]=y; w[1]=z;
))

(defun decipher (v k)
(let ((sum 3337565984) ; 0xC6EF3720
(a (byte-n k 3)) ; a=k[0]
(b (byte-n k 2)) ; b=k[1]
(c (byte-n k 1)) ; c=k[2]
(d (byte-n k 0)) ; d=k[3]
(y (byte-n v 1)) ; y=v[4]
(z (byte-n v 0))) ; z=v[1]

(loop for n from 0 to 31 do ;n=32, while(n-->0)
(decf z (transform y c sum d)) ;z -= (y << 4)+c ^ y+sum ^ (y >> 5)+d;
(decf y (transform z a sum b)) ;y -= (z << 4)+a ^ z+sum ^ (z >> 5)+b;
(decf sum delta) ;sum -= delta;
)

(pack64 y z) ; w[0]=y; w[1]=z;
))

Mark Tarver · Mar 20, 2006

Interesting. But you probably need to post this as a new
message, since it is a distinctly different
problem from the one driving this thread.

Mark

Mark Carter · Mar 20, 2006

Mark said:
Interesting.

At the risk of being labelled a troll, one thought that is occuring to
me is that in Lisp it seems that sometimes it is difficult to achieve a
simple thing in a simple way. To clarify ... recently, I had been
trying to obtain md5 hashes of the files we had on our server (a
different exercise than the one I mentioned in my OP, just in case you
thought that I didn't understand the difference between encryption and
hashing). There is an md5 package for Lisp available on the web, which I
used with CLISP. I had a file that contained a non-standard character,
causing CLISP to throw an error when it tried to print it.

Well, I suppose I could have tried to figure out a way to cajole CLISP
into printing something it didn't want to print, but I was keen to give
Corman Lisp 2.5 a try-out anyway, so I tried the package on it. EXCEPT,
for some reason when you try to read a file with an :element-type of
(unsigned-byte 8) (or something similar), Corman didn't like it.

In the end, I hacked together an md5 DLL from some sources I found on
the internet. You can get the package here, together with Corman Lisp
bindings:
http://www.markcarter.me.uk/computing/freeware/md5mc/md5mc.htm

In the past, I had also employed a similar technique in order to get
access to some console functions that I was interested in.

My worry is that it seems to be a recurring theme with me ... get
stumped in Lisp, realise that it is probably just plain easier in C, and
then link the whole thing together in Lisp. Which is kinda less than
expected.

Geoffrey Summerhayes · Mar 20, 2006

Dinko Tenev said:
Oops...missed that part.

It took me a while to study the exchange on this topic more thoroughly,
and I now fully appreciate the fact that the problem calls for a much
more sophisticated approach.

Sorry for the hasty shot, I'll give it another shortly.

I wouldn't worry about it, Prolog generated the elements one-by-one.
The loop was the print,nl,fail line. Just beefing it up a bit, I
didn't take the time to clean it up though.

gen(_,0,[]).
gen(S,N,[H|T]):- N > 0, N1 is N - 1, member(H,S), gen(S,N1,T).

filter([],[]).
filter([X|T],[X|T1]):- filter(T,T1).
filter([*|T],L):- filter(T,L).
filter([*|T],[_|T1]):- filter([*|T],T1).

filter_list(L,[[and|T]|_]):- filter_and(L,T), !.
filter_list(L,[[or|T]|_]):- filter_list(L,T), !.
filter_list(L,[H|_]):- H \= [and|_], H \= [or|_], filter(H,L),!.
filter_list(L,[H|T]):- H \= [and|_], H \= [or|_], filter_list(L,T).

filter_and(_,[]) :- !.
filter_and(L,[H|T]):- filter_list(L,[H]), filter_and(L,T).

generate_member(X,S,N,[]):-gen(S,N,X).
generate_member(X,S,N,[H|T]):-gen(S,N,X),\+ filter_list(X,[H|T]).

1 ?- generate_member(X,[a,b],3,[[a,*,b],[b,*,a]]).

X = [a, a, a] ;

X = [a, b, a] ;

X = [b, a, b] ;

X = [b, b, b] ;

No
2 ?- generate_member(X,[1,2],3,[[and, [*,2], [or, [2,1,*], [1,2,*]]]]).

X = [1, 1, 1] ;

X = [1, 1, 2] ;

X = [1, 2, 1] ;

X = [2, 1, 1] ;

X = [2, 2, 1] ;

X = [2, 2, 2] ;

No

Pascal Bourguignon · Mar 20, 2006

Mark Carter said:
I'd like to propose a coding challenge of my own. The challenge is to
reproduce the TEA (Tiny Encryption Algorith):
http://www.simonshepherd.supanet.com/tea.htm
in your language of choice.

Here's the code, just two simple functions:

void encipher(unsigned long *const v,unsigned long *const w,
const unsigned long *const k)
{
register unsigned long y=v[0],z=v[1],sum=0,delta=0x9E3779B9,
a=k[0],b=k[1],c=k[2],d=k[3],n=32;

while(n-->0)
{
sum += delta;
y += (z << 4)+a ^ z+sum ^ (z >> 5)+b;
z += (y << 4)+c ^ y+sum ^ (y >> 5)+d;
}

w[0]=y; w[1]=z;
}

void decipher(unsigned long *const v,unsigned long *const w,
const unsigned long *const k)
{
register unsigned long y=v[0],z=v[1],sum=0xC6EF3720,
delta=0x9E3779B9,a=k[0],b=k[1],
c=k[2],d=k[3],n=32;

/* sum = delta<<5, in general sum = delta * n */

while(n-->0)
{
z -= (y << 4)+c ^ y+sum ^ (y >> 5)+d;
y -= (z << 4)+a ^ z+sum ^ (z >> 5)+b;
sum -= delta;
}

w[0]=y; w[1]=z;
}

I get it shorter than in C:

(defun op (x a b sum) (logxor (+ (ash x 4) a) (+ x sum) (+ (ash x -5) b)))
(declaim (inline op))
(defmacro ciploop ((v w k y z a b c d (sum init-sum) delta) &body body)
`(let ((,y (aref ,v 0)) (,z (aref ,v 1)) (,sum ,init-sum) (,delta #x9E3779B9)
(,a (aref ,k 0)) (,b (aref ,k 1)) (,c (aref ,k 2)) (,d (aref ,k 3)))
(loop repeat 32 do ,@body finally (setf (aref ,w 0) ,y (aref ,w 1) ,z))))
(defmacro c-incf (var expr) `(setf ,var (mod (+ ,var ,expr) #x100000000)))
(defmacro c-decf (var expr) `(setf ,var (mod (- ,var ,expr) #x100000000)))
(defun encipher (v w k)
(ciploop (v w k y z a b c d (sum 0) delta)
(c-incf sum delta) (c-incf y (op z a b sum)) (c-incf z (op y c d sum))))
(defun decipher (v w k)
(ciploop (v w k y z a b c d (sum #xC6EF3720) delta)
(c-decf z (op y c d sum)) (c-decf y (op z a b sum)) (c-decf sum delta)))

You can also easily modify it to implement the improved version of TEA...
Note that this Lisp programs will work equally well on a 16-bit,
32-bit or 64-bit Common Lisp implementation. The same cannot be said
of the C program above.

;; Let's add a testbed:

(defun word (a b c d)
(dpb a (byte 8 24) (dpb b (byte 8 16) (dpb c (byte 8 8) d))))

(defun read-words (bits what)
(loop
for bytes = (progn (format *query-io* "Please enter ~D bits of ~A: "
bits what)
(ext:convert-string-to-bytes
(read-line *query-io* nil nil) ext:*TERMINAL-ENCODING*))
while (< (* 8 (length bytes)) bits)
finally (return
(loop for i from 0 by 4 below (truncate (+ 7 bits) 8)
collect (word (aref bytes (+ i 0))
(aref bytes (+ i 1))
(aref bytes (+ i 2))
(aref bytes (+ i 3))) into words
finally (return (coerce words 'vector))))))

(defun test ()
(loop
with code = (vector 0 0)
with decr = (vector 0 0)
for clear = (read-words 64 "clear text")
for key = (read-words 128 "key")
do (progn (encipher clear code key)
(format t "(encipher ~S ~S)~% --> ~S~%" clear key code)
(decipher code decr key)
(format t "(decipher ~S ~S)~% --> ~S~%" code key decr)
(unless (equalp clear decr) (format t "!!! ERROR !!!~%")))))

[11]> (test)
Please enter 64 bits of clear text: Hello World!
Please enter 128 bits of key: John McCarthy invented LISP.
(encipher #(1214606444 1864390511) #(1248815214 541942595 1634890856 2032167278))
--> #(913593965 183139965)
(decipher #(913593965 183139965) #(1248815214 541942595 1634890856 2032167278))
--> #(1214606444 1864390511)
Please enter 64 bits of clear text: Big Secret: LISP!
Please enter 128 bits of key: A very very secure key.
(encipher #(1114203936 1399153522) #(1092646501 1920540790 1702000928 1936024437))
--> #(3198111104 1851109064)
(decipher #(3198111104 1851109064) #(1092646501 1920540790 1702000928 1936024437))
--> #(1114203936 1399153522)
Please enter 64 bits of clear text: ^C

--
__Pascal Bourguignon__ http://www.informatimago.com/

ATTENTION: Despite any other listing of product contents found
herein, the consumer is advised that, in actuality, this product
consists of 99.9999999999% empty space.

joswig · Mar 20, 2006

I had a crack at it in Lisp. My version doesn't work - but of greater

concern to me is that it doesn't appear nearly as compact as the C
version. Anyway, here's my Lisp code (no prizes for guessing that I'm a
noob to Lisp):

Lot's of things you can write more compact.
But compact is not always the best way to
write source. For me the most important
criteria is that I can return to some source
after, say, a year absence and everything
is clear and readable again.

(defconstant delta 2654435769 ) ; delta= 0x9E3779B9

(defconstant +delta+ #x9E3779B9)

(defun floorn (n) (nth-value 0 (floor n)))

is above used?

(defun >> (val num-bytes)
"Right-shift positive integer val by num-bytes"
(let* (t1 t2)
(setf t1 (expt 2 num-bytes))
(setf t2 (/ val t1))
(floor t2)))

(defun >> (val num-bytes)
"Right-shift positive integer val by num-bytes"
(floor (/ val (expt 2 num-bytes))))

(defun transform (v1 v2 v3 v4)
(let (t1 t2 t3)
(setf t1 (<<4 v1))
(setf t2 (expt v2 v1))
(setf t3 (expt v3 (>> v2 5)))
(+ t1 t2 t3 v4)))

(defun transform (v1 v2 v3 v4)
(+ (<<4 v1)
(expt v2 v1)
(expt v3 (>> v2 5))
v4))

and so on...

Alexander Schmolck · Mar 20, 2006

(defun >> (val num-bytes)
"Right-shift positive integer val by num-bytes"
(floor (/ val (expt 2 num-bytes))))

or just (floor val (expt 2 num-bytes))

'as

Christophe Rhodes · Mar 20, 2006

[ note followups ]

Mark Carter said:
I'd like to propose a coding challenge of my own. The challenge is to
reproduce the TEA (Tiny Encryption Algorith):
http://www.simonshepherd.supanet.com/tea.htm
in your language of choice.

Here's mine, in Common Lisp.

(defmacro define-tea-pair ((encrypt decrypt) (delta n) (i1 j1 i2 j2))
`(macrolet ((32bitize (form) `(logand ,form #xffffffff))
(tea-kernel (x i j)
`(logxor (+ (ash ,x 4) (aref key ,i)) (+ ,x sum)
(+ (ash ,x -5) (aref key ,j))))
(tea-loop ((text n sum) &body body)
`(let ((result (make-array 2 :element-type '(unsigned-byte 32)))
(y (aref ,text 0))
(z (aref ,text 1)))
(do ((n ,n (- n 1)) (sum ,sum))
((<= n 0)
(prog1 result
(setf (aref result 0) y (aref result 1) z)))
,@body))))
(defun ,encrypt (plaintext key &aux (delta ,delta))
(declare (type (simple-array (unsigned-byte 32) (2)) plaintext)
(type (simple-array (unsigned-byte 32) (4)) key))
(tea-loop (plaintext ,n 0)
(setq sum (32bitize (+ sum delta))
y (32bitize (+ y (tea-kernel z ,i1 ,j1)))
z (32bitize (+ z (tea-kernel y ,i2 ,j2))))))
(defun ,decrypt (ciphertext key &aux (delta ,delta))
(declare (type (simple-array (unsigned-byte 32) (2)) ciphertext)
(type (simple-array (unsigned-byte 32) (4)) key))
(tea-loop (ciphertext ,n (32bitize (* ,n delta)))
(setq z (32bitize (- z (tea-kernel y ,i2 ,j2)))
y (32bitize (- y (tea-kernel z ,i1 ,j1)))
sum (32bitize (- sum delta)))))))

(define-tea-pair (encipher decipher) (#x9e3779b9 32) (0 1 2 3))

So far, so ordinary; only marginally shorter than the C version; I'm
certainly nowhere near Pascal's 14 lines, although my version has the
advantage over his that each constant is mentioned only once, and all
quantities derived from it are computed at compile-time; I can define
a different pair with

(define-tea-pair (eprime dprime) (#xabcdef01) (3 1 2 0))

and the new functions are inverses of each other as before.

The other thing that might be of interest is the inner loop. There
are no declarations other than the argument declarations, and all the
code that I have written is portable Common Lisp, and should work in
any conforming implemenation. In SBCL (on the PowerPC; other
platforms are similar), the inner loop for ENCIPHER is

addis $nl0,$nl3,-25033
addi $nl3,$nl0,31161
rlwinm $nl5,$nl2,4,0,27
lwz $nl6,1($fdefn)
add $nl6,$nl5,$nl6
add $nl0,$nl2,$nl3
xor $cfunc,$nl6,$nl0
rlwinm $nl0,$nl2,27,5,31
mr $nl5,$nl0
lwz $nl6,5($fdefn)
add $nl6,$nl5,$nl6
xor $nl6,$cfunc,$nl6
add $nl1,$nl1,$nl6
rlwinm $nl5,$nl1,4,0,27
lwz $nl6,9($fdefn)
add $nl6,$nl5,$nl6
add $nl0,$nl1,$nl3
xor $cfunc,$nl6,$nl0
rlwinm $nl0,$nl1,27,5,31
mr $nl5,$nl0
lwz $nl6,13($fdefn)
add $nl6,$nl5,$nl6
xor $nl6,$cfunc,$nl6
add $nl2,$nl2,$nl6
addi $nl4,$nl4,-4
cmpwi cr0,$nl4,0
bt gt,l0

and while this may be opaque to some readers, the point is that it is
pretty much comparable to the C code in performance (the differences
between this disassembly and gcc -O2 lie in the fact that SBCL's
instruction scheduler is pretty much nonexistent on the PowerPC
architecture).

Christophe

wkehowski · Mar 21, 2006

After the basic fact of generating the exclusion - a considerable
achievement - the program should be interactive. What if the target set
has thousands or millions of elements? There should be a loop-like way
('do' in Haskell, for example) to peel off the elements one-by-one and
then terminate.

Dinko Tenev · Mar 21, 2006

After the basic fact of generating the exclusion - a considerable
achievement - the program should be interactive. What if the target set
has thousands or millions of elements? There should be a loop-like way
('do' in Haskell, for example) to peel off the elements one-by-one and
then terminate.

Um..."interactivity" is a bit tricky in Prolog

As Geoffrey pointed out in his posting, the solutions are generated
effectively one at a time. Following is a typical example of using the
generator:

generate_member( X, ... ), do_something_with( X ), fail.

The underlying semantics is, roughly, 1) bind X, 2) do_something_with(
X ), 3) fail, meaning reject this binding of X and backtrack. In this
particular case, backtracking is tantamount to going back to 1).

You can regard ( generate_member( X, ... ) ... fail ) as the equivalent
of a loop construct, and do_something_with( X ) as the loop body. At
any given time that the goal is being evaluated, there is only one
binding for X in effect.

On a side note, Haskell's "do" notation isn't really about loops. If
you're referring to Tomasz's code, it's rather mapM_ that can sort of
be thought of as looping through the list of values returned by
generateNotMatching

Cheers,

Dinko

Dinko Tenev · Mar 21, 2006

Dinko said:
Speculation: the time for
building-up a smart structure to speed-up enumeration, together with
the time for enumerating the set using that structure, should sum up to
roughly Theta( n*|S^n| ), even with a really smart algorithm.

OK, maybe not.

This might be the worst case, looking at S^n - W only, but it's not
quite clear what "worst case" means in the context of concrete
implementations. Surely, one can clog the program with zillions of
wildcards to test, so we can produce an arbitrarily "bad" case

--
but such a case is obviously of little or no practical importance. It
appears that, to make any sensible statements about performance in the
relevant cases, we have to take into account the size of the pattern
set used to specify W as well.

[...] here's another speculation:
such structure would likely take up Theta( |S^n| ) space in memory, in
the worst case.

....and similarly for "worst case" here.

Cheers,

Dinko

Geoffrey Summerhayes · Mar 21, 2006

After the basic fact of generating the exclusion - a considerable
achievement - the program should be interactive. What if the target set
has thousands or millions of elements? There should be a loop-like way
('do' in Haskell, for example) to peel off the elements one-by-one and
then terminate.

There is...(Q&D)

1 ?- generate_member(X,[1,2],3,[[and, [*,2], [or, [2,1,*], [1,2,*]]]]),
write(X),nl,write('Is this the term you were looking for? (y/n):'),
get(Y), ((Y is 121) -> true; fail). % 121 = 'y'

[1, 1, 1]
Is this the term you were looking for? (y/n):n
[1, 1, 2]
Is this the term you were looking for? (y/n):|: n
[1, 2, 1]
Is this the term you were looking for? (y/n):|: y

Yes
2 ?-

funkyj · Mar 21, 2006

Dinko said:
Doug Quale wrote:

Hmmm...storage is not an issue in the Prolog version. It generates a
candidate solution, then checks membership in the wildcard set, then
backtracks (backtracking is caused by "fail" in the test goal.) On
backtracking, it effectively "forgets" the last solution, so the memory
is freed up (or at least free to be reclaimed through GC.)

How about the other iterator characteristics?

when there is a huge solution space can I ask the prolog version to
give me the first 1000 solutions? The next 1000 solutions (i.e. 1000 -
1999)?

If you can do that then it would appear that generators have no
advantage over your prolog solution.

Dinko Tenev · Mar 22, 2006

funkyj said:
How about the other iterator characteristics?

when there is a huge solution space can I ask the prolog version to
give me the first 1000 solutions?

Geoffrey's post above offers one way to do this from within a REPL.

Within a program, as soon as you accept a solution, you're potentially
out of the "loop" (it is possible to use cut (!) to make this certain.)

The next 1000 solutions (i.e. 1000 - 1999)?

I am not quite sure how exactly you propose to do this in e.g. Python,
but if you mean to enumerate and skip over the first 1000 and then
continue with the rest, yes, this is also possible.

If you can do that then it would appear that generators have no
advantage over your prolog solution.

To be fair, the Python implementation has one significant advantage --
it offers reduced runtime complexity for quite a number of practical
cases (of course, this improvement is achievable in Prolog too,) though
I'm still inclined to think that it can be made to run in exponential
time.

Also, my Prolog version is non-compliant in that it treats a wildcard
as matching a single position, rather than matching any sequence.
Geoffrey's implementation does the right thing though.

Cheers,

Dinko

Dinko Tenev · Mar 22, 2006

OK, here's a case that will make your program run in exponential time:
S = { a, b }, W = { *a*b, *b*a } -- on my machine, it starts getting
ugly as soon as n is 15 or so. Note that S^n - W = { a^n, b^n }.

In general, whenever all the patterns in the set match against the last
position, your current implementation is guaranteed to have to sift
through all of S^n. I'd say the very idea of checking against a
blacklist is fundamentally flawed, as far as performance is concerned.

Dirk Thierbach · Mar 22, 2006

[Had to drop alt.comp.lang.haskell, otherwise my newsserver doesn't accept it]

Dinko Tenev said:
OK, here's a case that will make your program run in exponential time:
S = { a, b }, W = { *a*b, *b*a } -- on my machine, it starts getting
ugly as soon as n is 15 or so. Note that S^n - W = { a^n, b^n }.

In general, whenever all the patterns in the set match against the last
position, your current implementation is guaranteed to have to sift
through all of S^n. I'd say the very idea of checking against a
blacklist is fundamentally flawed, as far as performance is concerned.

If more time during preprocessing is allowed, another idea is to
treat the wildcard expressions as regular expressions, convert
each into a finite state machine, construct the "intersection" of
all these state machines, minimize it and then swap final and non-final
states. Then you can use the resulting automaton to efficiently
enumerate S^n - W. In the above case, the resulting FSM would have just
three states.

And it doesn't really matter what language you use to implement this
algorithm, it's the idea that counts. Notation aside, all
implementations will be quite similar.

- Dirk

Mark Carter · Mar 22, 2006

Mark said:
At the risk of being labelled a troll

One thing I just discovered, and by which I mean *really* discovered ...
is that Lisp is an interactive environment. I am working on trying to
verify the contents of disks. I noticed that the input formats are
slightly wrong, and needed correction. In fact, there's a whole host of
jiggery pokery that I need to do in order to massage and build up
everything the way it needs to be.

A programmers mindset is usually geared towards "writing applications".
What I'm currently doing in Lisp is building up functions as I need
them. Using emacs, I can just C-x C-e to make my functions "live", and
when it's time to stop for the day, save my working image so that I can
use it the next day.

It seems to me that only Forth or Scheme really matches this capability.
Ruby and Python come kinda close - they do have a REPL, but it's kinda
clunky to try to create functions on the fly, plus of course they don't
support the idea of an image.

Alexander Schmolck · Mar 22, 2006

Mark Carter said:
A programmers mindset is usually geared towards "writing applications". What
I'm currently doing in Lisp is building up functions as I need them. Using
emacs, I can just C-x C-e to make my functions "live", and when it's time to
stop for the day, save my working image so that I can use it the next day.

It seems to me that only Forth or Scheme really matches this capability.

Not really. I'd say matlab's and squeak's (and presumably most smalltalks')
interactive environment is quite superior to cl+slime. Emacs lisp is also
better and I'd also suspect erlang to be better in some ways, but I haven't
used it yet. APL and descendants (J and K) also tend to have quite reasonable
interactive facilities and so do CAS systems.

Ruby and Python come kinda close - they do have a REPL, but it's kinda
clunky to try to create functions on the fly, plus of course they don't
support the idea of an image.

I don't think interactively creating functions is much harder in python but
changing module and class definitions is.

Actually although cl+slime is rather nice it is inferior to the much less
sophisticated and capable ipython+emacs combo in some regards (runtime error
reporting in CL tends to be pretty crap; (i)python docstrings, ease of
interactive testing and shell integration are superior). It's also much
easier to serialize stuff in python than it is in common lisp (never mind
pickle; you can't even readably print hashtables -- and unless I'm missing
something this is not user-fixable which really sucks).

Of course it's *much* easier to supply a nice interactive experience for a
quite limited language such as matlab than it is for CL (mostly because CL is
more powerful and geared towards efficient code generation).

So CL might well have the best interactiveness/(expressiveness*speed) ratio.

'as

Programming math challenge gives wrong answer	2	Aug 6, 2023
wildcard exclusion in cartesian products	3	Mar 26, 2006
Complex Python challenge 1	3	Jan 30, 2023
Cash register challenge	2	Mar 8, 2022
Some thougts on cartesian products	44	Jan 22, 2006
Dynamic programming	3	Jan 9, 2023
BITCOIN PROGRAMMING - CODE INCLUDED - needs slight modification in linux terminal - NSA please do not block	0	Nov 2, 2024
Gray Scott model in FEniCS	0	Nov 20, 2024

Programming challenge: wildcard exclusion in cartesian products

Dinko Tenev

Mark Tarver

Mark Carter

Mark Tarver

Mark Carter

Geoffrey Summerhayes

Pascal Bourguignon

joswig

Alexander Schmolck

Christophe Rhodes

wkehowski

Dinko Tenev

Dinko Tenev

Geoffrey Summerhayes

funkyj

Dinko Tenev

Dinko Tenev

Dirk Thierbach

Mark Carter

Alexander Schmolck

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads