Programming challenge: wildcard exclusion in cartesian products

T

Tomasz Zielonka

-- The states are lists of regular expressions
-- where [a,b,..] means match a or b or...

I haven't run or studied your program yet myself but what I had in mind
was that the list of wc's are *all* to be excluded, so the list
[wc1..wcn] is to correspond generating all tuples matching not(wc1 and
.. and wcn). Maybe you're already doing that. The wc's themselves
could be logical statements among the 'primitive' wc's. That's why I
named it the 'wildcard exclusion problem'. It's a lot easier to specify
a list of simpler wc's than create a long logical expression.

I missed "any logical combination" :-(

It would be quite easy to fix my first program, but I don't have the
time to do it right now.

Best regards
Tomasz
 
W

wkehowski

It would seem that your program is just filtering the full cartesian
product, right? The solution I'm looking for generates the elements
one-by-one so that it could be used in a loop.
 
D

Dan Piponi

(e-mail address removed) said:
I haven't run or studied your program yet myself but what I had in mind
was that the list of wc's are *all* to be excluded

I think it already does what you want. You might want to change

run n alphabet pat =
generate terminal null transition [pat] n alphabet to

to

run n alphabet pat =
generate terminal null transition pat n alphabet

to allow lists of patterns where the patterns in a list are ORed
together.
 
D

Dinko Tenev

It would seem that your program is just filtering the full cartesian
product, right? The solution I'm looking for generates the elements
one-by-one so that it could be used in a loop.

Oops...missed that part.

It took me a while to study the exchange on this topic more thoroughly,
and I now fully appreciate the fact that the problem calls for a much
more sophisticated approach.

Sorry for the hasty shot, I'll give it another shortly.

Cheers,

Dinko
 
A

Azolex

sa said:
in k:

cp:{[c;n;p]+(n#c)_vs(!_ c^n)_dvl,/{2_sv+(,/,/:\:)/(),/:mad:[x;&x=-1;:[;!c]]}'p}

That one goes a long way as a proof of eg evolution theory, you know,
monkeys reproducing shakespeare with a typewriter k-board and all that :)
examples:

cp[2;3;,0 -1 1]
(0 0 0
0 1 0
1 0 0
1 0 1
1 1 0
1 1 1)

cp[2;3;(0 -1 1;1 -1 0)]
(0 0 0
0 1 0
1 0 1
1 1 1)

cp[2;3;(0 -1 1;1 -1 1)]
(0 0 0
0 1 0
1 0 0
1 1 0)

arguments of cp:

c = cardinality of the input set
n = power
p = list of patterns (-1 = wildcard)

the algorithm directly computes the target set. in other words,
it does not generate the set, then filter the matches from the
target.

modifying cp to accept s instead of the cardinality of s,
patterns expressed in terms of elements of s, &c. adds nothing
of interest to the problem.
 
G

Geoffrey Summerhayes

Wade said:
Another attempt. I have made no special attempt to create an
exclusion language, just used an anonymous lambda predicate.

FWIW, here's my Q-and-D pattern matcher (only partially tested).

(defun match(list pattern &optional (test #'eql))
"Match a list of atoms against a pattern list
using :all as a 0-to-many wildcard, :single as a
1-to-1 wildcard, a list of elements or a single
element to match a specific place. Optional
argument test for comparing elements (default eql).

Returns: T if match is made, NIL otherwise.

Examples: (match '(0 1 2 3 4 5) ':)all (2 3) 3 :single 5 :all)) => T
(match '(0 1 2 3 4 5) ':)all (2 3) 3 :single 5 :single)) =>
NIL"
(let ((current (first pattern))
(next-pattern (rest pattern))
(candidate (first list)))
(cond ((and (null pattern) (null list))
t)
((and (eq :single current) candidate)
(match (rest list) next-pattern test))
((eq :all current)
(loop for new-list on list
when (match new-list next-pattern test)
do (return-from match t))
(null next-pattern)) ; last case null remainder
((if(atom current)
(funcall test candidate current)
(member candidate current :test test))
(match (rest list) next-pattern test)))))
 
A

Alan Crowe

What I have in mind is the efficient, <enumerated> generation of the
complement S^n/WC(S^n). A good program should initialize, generate, and
terminate.

T=cartprodex(S,n,WC); //initialize
for all i in T do
what you want with i
test to see if any more
terminate if not

and it should do this without explicitly generating WC and then
complementing. For example, if the cardinality of S is m, and the WC is
just '*a*b*', with a != b, then EX(S^n):=S^n\WC(S^n) has cardinality
(m-1)^(n-1)*(m+n-1). Specifically, if m=5 and n=10, then |EX|=3670016
while |S^10|=9765625, so that |EX|/|S^10| is about 0.3758. In general
the program should directly generate EX from arbitrary WC. Of course,
in practice the WC should themselves occur in a logically consistent
manner, but let's just assume they're a given.

The following code doesn't build a data structure. It
recurses n deep and those n stack frames contain state that
tracks progress through the product. But it still generates
and tests, taking time proportional to S^n. Is this what you
had in mind?

;; A macro to let you loop over the S^n possibilities
;; without building a big data structure
;; The macro is structured as syntactic sugar on
;; a higher order function
;;
(defmacro cartesian-power ((variable set exponent) &body code)
(let ((body-function (gensym)))
`(flet ((,body-function(,variable),@code))
(cartesian-power-hof ,set ,exponent '() (function ,body-function)))))

;; The standard idea of using recursion to implement a nest
;; of loops of indefinite depth
;;
(defun cartesian-power-hof (set exponent prefix f)
(if (zerop exponent)
(funcall f prefix)
(dolist (item set)
(cartesian-power-hof set
(- exponent 1)
(cons item prefix)
f))))

;; A simple recursive pattern match
;; I haven't thought through the implications
;; I guess that it is exponentially slow on some long
;; patterns
;;
(defun wild-match (pattern data)
(cond ((endp pattern) (endp data))
((endp data) (or (endp pattern)
(equal pattern ':)wild))))
((eql (car pattern) :wild)
(or (null (cdr pattern))
(wild-match (cdr pattern)
data)
(wild-match pattern
(cdr data))))
('literal-pattern
(and (eql (car pattern)
(car data))
(wild-match (cdr pattern)
(cdr data))))))

;; close over a data item to get a function
;; suitable for checking several patterns
(defun match-data (data)
(lambda(pattern)
(wild-match pattern data)))

;; Use the macro and the utilities to count how many are not excluded
(defun count-remainder (set exponent &rest exclusions)
(let ((count 0))
(cartesian-power (item set exponent)
(when (notany (match-data item) exclusions)
(incf count)))
count))

CL-USER> (loop for i from 3 to 10
do (format t "~&~4D~10D" i
(count-remainder '(a b c d e) i ':)wild a :wild b :wild))))
3 112
4 512
5 2304
6 10240
7 45056
8 196608
9 851968
10 3670016

I can see how a pattern such as (a b :wild) would knock out
an element from each of the first two sets so reducing the
task from m^n to (m-1)^2 * m^(n-2).

Also :)wild a :wild) knocks it down from m^n to (m-1)^n

However I can only see the exploitation of :)wild a :wild b
:wild) as a hairy special case. Pick one of n places for the
first a. Pick earlier elements from the set excluding a,
pick later elements from the set excluding b. Add in all the
items with a missing altogether (so b can be used freely).
I cannot see what algorithm exploits the constraints more
generally. Is there a standard technique, page nn of Knuth
or the like? Is that what you are actually wanting to see
coded?

Alan Crowe
Edinburgh
Scotland
 
K

Kaz Kylheku

The wildcard exclusion problem is interesting enough to have many
distinct, elegant solutions in as many languages.

In that case, you should have crossposted to comp.lang.python also.

Your program looks like a dog's breakfast.
 
F

funkyj

It would seem that your program is just filtering the full cartesian
product, right? The solution I'm looking for generates the elements
one-by-one so that it could be used in a loop.

One advantage of a generator over filtering the full product is that I,
as the user of the generator, am not obligated to iterate over the
entire solution space.

Are there other _practical_ advantages of generators over mapping &
filtering complete sets?
 
D

Doug Quale

funkyj said:
One advantage of a generator over filtering the full product is that I,
as the user of the generator, am not obligated to iterate over the
entire solution space.

Are there other _practical_ advantages of generators over mapping &
filtering complete sets?

Storage. You can iterate over problem spaces much too large to fit in
memory. Also generate + iterate can be faster because of reduced
memory pressure.
 
W

wkehowski

Yes, the program is quite a jumble: but it works. And I didn't post to
python newsgroup since I was limited to just 5 newsgroups and didn't
feel like doing multiple postings to multiple newsgroups.
 
W

wkehowski

Here is an nice intro to K:

http://www.kuro5hin.org/?op=displaystory;sid=2002/11/14/22741/791

"This is where K starts to set itself from apart from most of the
common programming languages in use today. You rarely write loops in K
(KDB is 100% loop-free), instead you use adverbs. An adverb modifies a
function, returning another function, changing the ways it operates
over its arguments and what it does with it's return values."

How about an interactive loop-like version? Generating the target set
is good for baby test cases but not if the cardinality of the target is
large. Does that make the problem more intersesting?
 
W

wkehowski

When I run this I get through ghc I get

C:\Documents and Settings\User\My Documents\wildcard>ghc
"./wc-zielonka.hs"
compilation IS NOT required
C:/Languages/ghc/ghc-6.4.1/libHSrts.a(Main.o)(.text+0x1d):Main.c:
undefined refe
rence to `__stginit_ZCMain'
C:/Languages/ghc/ghc-6.4.1/libHSrts.a(Main.o)(.text+0x43):Main.c:
undefined refe
rence to `ZCMain_main_closure'
collect2: ld returned 1 exit status

Unless there's a command line option I'm missing?
 
W

wkehowski

The cardinality of excluding '*a*b*' from S^n should be
(m-1)^(n-1)*(m+n-1), where m=|S|. For m=5 this becomes 4^(n-1)*(n+4),
and your table fits this formula. As far as generating and testing, an
'ideal' solution would be to 'start from the ground up', as in
excluding length 2 wc, and then length 3, etc, until all wc's have been
excluded. The 'ideal' solution would intrinsically exclude wc's and not
test against a background generation of all of S^n. Does that make
sense?
 
W

wkehowski

Nice! How to put it in a loop? I'm totally a newbie to Lisp myself,
just gettng into Graham and Touretzky. Let's create a problem. Suppose
after excluding I want to know if the digits sum to 12, say, like maybe
they're part of a partition. S={0,..6}, S^5, excluding "*1*5*" and
"1*2*3*", say. How would I do that?
 
J

Joachim Durchholz

"This is where K starts to set itself from apart from most of the
common programming languages in use today. You rarely write loops in K
(KDB is 100% loop-free), instead you use adverbs. An adverb modifies a
function, returning another function, changing the ways it operates
over its arguments and what it does with it's return values."

Doesn't sound too different from what closures do. Or lazy parameter
passing.
<rant> I'm not sure whether the K designer actually fits that
description, but there are too many language designers around
reinventing the wheel, arguing whether it should have seven, eight or
thirteen sides... </rant>

Regards,
Jo
 
W

wkehowski

OK, a bad case of RTFM. I saved your file as WildCartesian.hs and then

1) command line: ghci WildCartesian.hs
2) Get some loading messages
3) command line: test

and it works! But how do I compile it to get a program with command
line arguments? I'm looking through Daume's tutorial right now.
 
D

Dinko Tenev

Doug said:
Storage. You can iterate over problem spaces much too large to fit in
memory. Also generate + iterate can be faster because of reduced
memory pressure.

Hmmm...storage is not an issue in the Prolog version. It generates a
candidate solution, then checks membership in the wildcard set, then
backtracks (backtracking is caused by "fail" in the test goal.) On
backtracking, it effectively "forgets" the last solution, so the memory
is freed up (or at least free to be reclaimed through GC.)

Cheers,

Dinko
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,291
Messages
2,571,453
Members
48,137
Latest member
IndiraMcCo

Latest Threads

Top