Matching words and letters

Sandman · Dec 8, 2004

I just can't seem to wrap my mind around how to attack this problem in the best
way, so I thought I'd just post here and see if anyone here has done something
similar and/or has a logical solution to it.

I have a series of letters, say "kljetilamnop", then I have a big textfile with
legit words, but let's use a short one here:

mall
bill
kill
lot
pile
antilope
hello

Now, I want to cycle through this word list and match it to the letters, and
return true if the word can be formed by the letters. So, the above would
return:

mall
kill
lot
antilope

The result is to be used in a Scrabble-like fashion, which leads me to the
other part of my problem. When matching these words, I need to be able to
specify that some of the characters are "locked" in their position. So, let's
say that the letter "k" in the letter string is a fixed letter, then the only
result would be "kill" of course. I imagine that this would be easiest if I fed
the function the string and also the locked position in an array (1,6,8 for
example).

Anyway, this could easily become really complex, and if I just get the basic
scrambled-letters to match to words, I think I can manage my way from there.
So, any help appreciated.

Jim Keenan · Dec 8, 2004

Sandman wrote:

Anyway, this could easily become really complex, and if I just get the basic
scrambled-letters to match to words, I think I can manage my way from there.
So, any help appreciated.

See if the is_LsubsetR() method of my CPAN module List::Compare
(http://search.cpan.org/~jkeenan/List-Compare-0.31/Compare.pm) is useful
to you.

jimk

A. Sinan Unur · Dec 9, 2004

I have a series of letters, say "kljetilamnop", then I have a big
textfile with legit words, but let's use a short one here:

mall
bill
kill
lot
pile
antilope
hello

Now, I want to cycle through this word list and match it to the
letters, and return true if the word can be formed by the letters. So,
the above would return:

mall
kill
lot
antilope

pile should be in that list as well.

Now, undoubtedly, there is a better way to do this, but here is something
that works for the data you posted:

#! /usr/bin/perl

use strict;
use warnings;

use Data::Dumper;

my $source = normalize('kljetilamnop');

LINE: while (my $word = <DATA>) {
    chomp $word;
    my $target = normalize($word);
    for my $c (keys %{ $target }) {
        next LINE unless(
            defined($source->{$c}) && ($source->{$c} >= $target->{$c})
        );
    }
    print "$word\n";
}

# Returns a hash ref mapping each letter to the number of times it occurs.
sub normalize {
    my ($string) = @_;
    my @letters = split //, $string;
    my %count;
    $count{$_}++ for (@letters);
    return \%count;
}

__DATA__
mall
bill
kill
lot
pile
antilope
hello

D:\Home\Dload>perl rs.pl
mall
kill
lot
pile
antilope

which leads me to the other part of my problem.

I am going to ignore that part.

So, any help appreciated.

Hope this helps.

Sinan.
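
For the "locked" part that the reply above sets aside, a minimal sketch follows. It assumes that a locked position N (1-based, as in the "1,6,8" example) means the word must carry, at position N, the same letter the letter string has at position N. The names letter_counts and can_form are hypothetical helpers, not part of any posted code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count how often each letter occurs in a string.
sub letter_counts {
    my ($string) = @_;
    my %count;
    $count{$_}++ for split //, $string;
    return \%count;
}

# Hypothetical helper: true if $word can be built from $letters and
# agrees with $letters at every locked (1-based) position.
sub can_form {
    my ($letters, $word, @locked) = @_;
    my $have = letter_counts($letters);
    my $need = letter_counts($word);
    for my $c (keys %$need) {
        return 0 if ($have->{$c} || 0) < $need->{$c};
    }
    for my $pos (@locked) {
        # A position past the end of either string cannot match.
        return 0 if $pos > length($word) || $pos > length($letters);
        return 0 if substr($word, $pos - 1, 1) ne substr($letters, $pos - 1, 1);
    }
    return 1;
}

my @list   = qw(mall bill kill lot pile antilope hello);
my @free   = grep { can_form('kljetilamnop', $_) } @list;
my @locked = grep { can_form('kljetilamnop', $_, 1) } @list;
print "no locks: @free\n";
print "lock 1:   @locked\n";
```

With position 1 (the "k") locked, only "kill" should remain, which matches the expectation in the original question.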

Anno Siegel · Dec 9, 2004

Sandman said:
I just can't seem to wrap my mind around how to attack this problem in the best
way, so I thought I'd just post here and see if anyone here has done something
similar and/or has a logical solution to it.

I have a series of letters, say "kljetilamnop", then I have a big textfile with
legit words, but let's use a short one here:

mall
bill
kill
lot
pile
antilope
hello

Now, I want to cycle through this word list and match it to the letters, and
return true if the word can be formed by the letters. So, the above would
return:

mall
kill
lot
antilope

[snip other requirements]

One way of looking at this is to consider a string (word, or list of
available letters) as a multiset of its letters. A multiset is like
a set, but each element can appear with a multiplicity. The bag of
available letters, for instance, would contain each letter of
"kjetiamnop" once and the letter "l" twice.

A word can be built from the given letters when its bag is contained
(in the obvious sense of multisets) in the bag of given letters.
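
As a rough illustration of that containment test using only core Perl, a bag can be modelled as a hash from letter to multiplicity. This is a sketch under that assumption; bag and bag_contained are hypothetical names and do not come from any CPAN module.

```perl
use strict;
use warnings;

# Build a bag (letter => multiplicity) from a string.
sub bag {
    my ($string) = @_;
    my %b;
    $b{$_}++ for split //, $string;
    return \%b;
}

# True if every element of $small occurs in $big at least as often.
sub bag_contained {
    my ($small, $big) = @_;
    for my $e (keys %$small) {
        return 0 if $small->{$e} > ($big->{$e} || 0);
    }
    return 1;
}

my $letters = bag('kljetilamnop');
print join(' ', map { "$_=$letters->{$_}" } sort keys %$letters), "\n";
for my $word (qw(mall hello)) {
    printf "%s: %s\n", $word,
        bag_contained(bag($word), $letters) ? 'contained' : 'not contained';
}
```

Here "l" has multiplicity 2, so "mall" is contained, while "hello" is not because "h" is missing from the bag of letters.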

Fortunately, there is a CPAN module that implements bags. Unfortunately,
the implementation doesn't have a "contained" method, so we'll have to
extend the class. Fortunately, the class is well-written, so inheritance
makes this easy:

#!/usr/local/bin/perl
use strict; use warnings; $| = 1;

my $letters = CBag->new->insert( map { $_ => 1 } split //, 'kljetilamnop');

while ( <DATA> ) {
    chomp;
    my $word = CBag->new->insert( map { $_ => 1 } split //);
    print "$_\n" if $word <= $letters;
}

# Extend Set::Bag to support comparison
{
    package CBag;
    use base 'Set::Bag';

    # True if every element of $bag occurs in $other at least as often.
    sub contained {
        my ( $bag, $other) = @_;
        for ( $bag->elements ) {
            return 0 if $bag->grab( $_) > ( $other->grab( $_) || 0);
        }
        return 1;
    }

    use overload(
        '<=' => 'contained',
    );

}

__DATA__
mall
bill
kill
lot
pile
antilope
hello

Josef Moellers · Dec 9, 2004

Sandman wrote:

[ elaborate problem description deleted ]

A very simple/simplistic solution:

my @words = qw(mall bill kill lot pile antilope hello);
my $fixed = 'k';
my $set = 'ljetilamnop' . $fixed;

foreach (@words) {
    print "$_\n" if m/[$fixed]/ && m/^[$set]*$/;
}

A. Sinan Unur · Dec 10, 2004

Sandman wrote:

[ elaborate problem description deleted ]

A very simple/simplistic solution:

my @words = qw(mall bill kill lot pile antilope hello);
my $fixed = 'k';
my $set = 'ljetilamnop' . $fixed;

foreach (@words) {
    print "$_\n" if m/[$fixed]/ && m/^[$set]*$/;
}

How is this a solution?

D:\Home>perl tttt.pl
kill

Sinan
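
A character class only tests which letters occur, not how often. If a regex-only test is wanted, one possible sketch is to sort both sides first: build a pattern with one optional slot per available letter, in sorted order, and match the sorted word against it. The helper name pool_regex is hypothetical, and the fixed-letter requirement is not handled here.

```perl
use strict;
use warnings;

# Build a pattern such as \Aa?e?i?j?k?l?l?m?n?o?p?t?\z from the letter pool.
# Each available letter contributes exactly one optional slot.
sub pool_regex {
    my ($letters) = @_;
    my $body = join '',
               map  { quotemeta($_) . '?' }
               sort { $a cmp $b } split //, $letters;
    return qr/\A$body\z/;
}

my $re = pool_regex('kljetilamnop');
my @hits;
for my $word (qw(mall bill kill lot pile antilope hello)) {
    # Sort the word's letters the same way the pool was sorted.
    my $sorted = join '', sort { $a cmp $b } split //, $word;
    push @hits, $word if $sorted =~ $re;
}
print "$_\n" for @hits;
```

Because the pool contains "l" twice, the pattern has two "l?" slots, so "mall" and "kill" pass while a word needing three would not.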

Sandman · Dec 10, 2004

A. Sinan Unur said:
pile should be in that list as well.

Yes, sorry.

Now, undoubtedly, there is a better way to do this, but here is something
that works for the data you posted:

Yes, and I've adapted it for my uses - it works great!

##########################

#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
exit unless $ARGV[0];
$|=1; # Don't buffer output

my %words;

# file with legal words
open(FILE, "</home/sandman/conf/rim.db") or die "Cannot open word file: $!";
while (<FILE>)
{
    chomp;
    my @list=split (/\t/);
    foreach (@list){
        # no need to include too long words.
        next if length($_) > (length($ARGV[0])+1);
        # And we don't care for too short words
        next if length($_) < 3;
        $words{lc($_)}++;
    }
}
close(FILE);

my $source = normalize($ARGV[0]);
my $prnr;
my $word;

LINE: foreach $word (keys %words){
    chomp $word;
    my $target = normalize($word);
    # skip words that need more of any letter than the input provides
    for my $c (keys %{ $target }) {
        next LINE unless(
            defined($source->{$c}) && ($source->{$c} >= $target->{$c})
        );
    }
    my @input = split //, $ARGV[0];
    my @match = split //, $word;
    if ($ARGV[1]){
        foreach (split /,/, $ARGV[1]){
            # for every fixed position (1-based); a word too short to
            # have that position cannot match
            next LINE unless defined($match[$_-1])
                && $input[$_-1] eq $match[$_-1];
        }
    }
    printf "%-12s ", $word;
    $prnr++;
    print "\n" if ($prnr % 6) == 0;
}
print "\n";

# Returns a hash ref mapping each letter to the number of times it occurs.
sub normalize {
    my ($string) = @_;
    my @letters = split //, $string;
    my %count;
    $count{$_}++ for (@letters);
    return \%count;
}

##########################

sandman~> alfapet kamrslsi 1,2
kali kams kari karl karm kass
kal kam kar kas

##########################

These are Swedish words, so it's quite correct.

Thank you!

Anno Siegel · Dec 10, 2004

Sandman said:
I just can't seem to wrap my mind around how to attack this problem in the best
way, so I thought I'd just post here and see if anyone here has done something
similar and/or has a logical solution to it.

I have a series of letters, say "kljetilamnop", then I have a big textfile with
legit words, but let's use a short one here:

mall
bill
kill
lot
pile
antilope
hello

Now, I want to cycle through this word list and match it to the letters, and
return true if the word can be formed by the letters. So, the above would
return:

mall
kill
lot
antilope

Yet another way:

while ( <DATA> ) {
    chomp;
    ( my $x = $_) =~ tr/kljetilamnop//d;
    print "$_\n" unless length $x;
}

Anno
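
Note that tr///d removes every occurrence of the listed letters, so the snippet above checks only which letters a word uses, not how many of each. It gives the right answer for the sample list, but a word such as "tool" would also pass although "o" is available only once. A possible variant that respects the counts, sketched here with a hypothetical helper named can_spell, removes one letter from a copy of the pool for each letter of the word:

```perl
use strict;
use warnings;

# True if $word can be spelled by using each letter of $letters at most once.
sub can_spell {
    my ($letters, $word) = @_;
    my $pool = $letters;    # work on a copy so the caller's string is untouched
    for my $c (split //, $word) {
        # Remove one occurrence of $c from the pool, or give up.
        return 0 unless $pool =~ s/\Q$c\E//;
    }
    return 1;
}

for my $word (qw(mall pile tool noon)) {
    printf "%-5s %s\n", $word,
        can_spell('kljetilamnop', $word) ? 'yes' : 'no';
}
```

This trades the single tr pass for one substitution per letter, which is slower but still short enough for a word-list filter.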
