[QUIZ] Numbers Can Be Words (#133)

Ruby Quiz

The three rules of Ruby Quiz:

1. Please do not post any solutions or spoiler discussion for this quiz until
48 hours have passed from the time on this message.

2. Support Ruby Quiz by submitting ideas as often as you can:

http://www.rubyquiz.com/

3. Enjoy!

Suggestion: A [QUIZ] in the subject of emails about the problem helps everyone
on Ruby Talk follow the discussion. Please reply to the original quiz message,
if you can.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

by Morton Goldberg

When working with hexadecimal numbers it is likely that you've noticed some hex
numbers are also words. For example, 'bad' and 'face' are both English words and
valid hex numbers (2989 and 64206, respectively, in decimal). I got to thinking
that it would be interesting to find out how many and which hex numbers were
also valid English words. Of course, almost immediately I started to think of
generalizations. What about other bases? What about languages other than
English?
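
The decimal values quoted above can be checked directly, since String#to_i accepts a radix. A minimal sketch (the variable names are only for illustration):

```ruby
# String#to_i(base) parses a string as a number in the given base (2..36).
bad  = "bad".to_i(16)   # 11*256 + 10*16 + 13
face = "face".to_i(16)  # 15*4096 + 10*256 + 12*16 + 14

puts bad   # prints 2989
puts face  # prints 64206

# Integer#to_s(base) converts back and recovers the word.
puts bad.to_s(16)  # prints bad
```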

Your mission is to pick a word list in some language (it will have to be one that
uses roman letters) and write Ruby code to filter the list to extract all the
words which are valid numbers in a given base. For many bases this isn't an
interesting task--for bases 2-10, the filter comes up empty; for bases 11-13,
the filter output is uninteresting (IMO); for bases approaching 36, the filter
passes almost everything (also uninteresting IMO). However, for bases in the
range from 14 to about 22, the results can be interesting and even surprising,
especially if one constrains the filter to accept only words of some length.

I used `/usr/share/dict/words` for my word list. Participants who don't have
that list on their system or want a different one can go to Kevin's Word List
Page (http://wordlist.sourceforge.net/) as a source of other word lists.

Some points you might want to consider: Do you want to omit short words like 'a'
and 'ad'? (I made word length a parameter). Do you want to allow capitalized
words (I prohibited them)? Do you want to restrict the bases allowed (I didn't)?
 
Karl von Laudermann

I'm not sure why this quiz is being phrased as "numbers that are
words". Aren't you just asking for a program that finds words that use
only the first n letters of the alphabet? Or am I missing something
obvious (tends to happen :) )?

Actually, one interesting variant that would tie it to numbers would
be if you could include digits that look like letters, i.e.:
0 -> O
1 -> I
2 -> Z
5 -> S
6 -> G
8 -> B

In this case, even numbers in base 10 could be words.
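
One way this variant might be sketched: map each look-alike digit to its letter, then look the result up in a word list. The helper name `digits_to_word` is made up for illustration, and the dictionary lookup is left out.

```ruby
# Digit-to-letter look-alikes from the table above.
LOOKALIKES = { "0" => "O", "1" => "I", "2" => "Z",
               "5" => "S", "6" => "G", "8" => "B" }.freeze

# Hypothetical helper: returns the letters a decimal number "spells",
# or nil if any digit has no look-alike letter (3, 4, 7, 9).
def digits_to_word(number)
  digits = number.to_s
  return nil unless digits.each_char.all? { |d| LOOKALIKES.key?(d) }
  digits.gsub(/\d/, LOOKALIKES)
end

puts digits_to_word(8055)          # prints BOSS
puts digits_to_word(1234).inspect  # prints nil
```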
 
James Edward Gray II

I'm not sure why this quiz is being phrased as "numbers that are
words". Aren't you just asking for a program that finds words that use
only the first n letters of the alphabet? Or am I missing something
obvious (tends to happen :) )?

That's pretty much the quiz, yes. It's not too hard to solve, but
the results are pretty interesting.

James Edward Gray II
 
Morton Goldberg

Here are some solutions to this quiz. The first solution deliberately
avoids using regular expressions. Note the use of next to skip
words that are too short or capitalized, and of break to stop
iterating once the (sorted) word list moves past the last first letter
that is a valid digit in the given base.

<code>
WORD_LIST = "/usr/share/dict/words"
WORDS = File.read(WORD_LIST).split

def number_words(base=16, min_letters=3)
  result = []
  WORDS.each do |w|
    next if w.size < min_letters || (?A..?Z).include?(w[0])
    break if w[0] > ?a + (base - 11)
    result << w if w.to_i(base).to_s(base) == w
  end
  result
end
</code>

<example>
number_words(18, 5) # => ["abaca", "abaff", "accede", "achage",
"adage", "added", "adead", "aface", "ahead", "bacaba", "bacach",
"bacca", "baccae", "bache", "badge", "baggage", "bagged", "beach",
"beached", "beachhead", "beaded", "bebed", "bedad", "bedded",
"bedead", "bedeaf", "beech", "beedged", "beefhead", "beefheaded",
"beehead", "beeheaded", "begad", "behead", "behedge", "cabbage",
"cabbagehead", "cabda", "cache", "cadge", "caeca", "caffa", "caged",
"chafe", "chaff", "chebec", "cheecha", "dabba", "dagaba", "dagga",
"dahabeah", "deadhead", "debadge", "decad", "decade", "deedeed",
"deface", "degged", "dhabb", "echea", "edged", "efface", "egghead",
"facade", "faced", "faded", "fadge", "feedhead", "gabgab", "gadbee",
"gadded", "gadge", "gaffe", "gagee", "geggee", "hache", "haggada",
"hagged", "headache", "headed", "hedge"]
</example>
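
The code above was written for Ruby 1.8, where `?a` and `w[0]` are Integers. A rough adaptation for later Ruby versions, not the original as posted, might look like the following. It takes the word list as a parameter (an assumption made here so the sketch runs without a dictionary file) and, like the original, relies on the list being sorted.

```ruby
# Adaptation for modern Ruby; `words` is assumed to be sorted, as the
# original assumes of /usr/share/dict/words.
def number_words(words, base = 16, min_letters = 3)
  return [] if base < 11              # no letter digits below base 11
  biggest = (base - 1).to_s(base)     # "f" for base 16, "h" for base 18
  result = []
  words.each do |w|
    next if w.size < min_letters || w =~ /\A[A-Z]/
    break if w[0] > biggest           # past the last usable first letter
    # The round trip succeeds only if every character is a valid digit.
    result << w if w.to_i(base).to_s(base) == w
  end
  result
end

sample = %w[Abe ad added bead beach cafe face hedge zoo]
p number_words(sample)         # => ["added", "bead", "cafe", "face"]
p number_words(sample, 18, 5)  # => ["added", "beach", "hedge"]
```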

The second solution uses #inject rather than #each, but doesn't seem
to be much of an improvement, if any. I found it interesting because
it's one of the few times I've ever needed to pass an argument to break
and next.

<code>
WORD_LIST = "/usr/share/dict/words"
WORDS = File.read(WORD_LIST).split

def number_words(base=16, min_letters=3)
  WORDS.inject([]) do |result, w|
    next result if w.size < min_letters || (?A..?Z).include?(w[0])
    break result if w[0] > ?a + (base - 11)
    result << w if w.to_i(base).to_s(base) == w
    result
  end
end
</code>

<example>
number_words(20, 7) # => ["accidia", "accidie", "acidific",
"babiche", "bacchiac", "bacchic", "bacchii", "badiaga", "baggage",
"beached", "beachhead", "beedged", "beefhead", "beefheaded",
"beehead", "beeheaded", "behedge", "bighead", "cabbage",
"cabbagehead", "caddice", "caddiced", "caffeic", "cheecha",
"cicadid", "dahabeah", "deadhead", "debadge", "debeige", "decadic",
"decafid", "decided", "deedeed", "deicide", "diffide", "edifice",
"egghead", "feedhead", "giffgaff", "haggada", "haggadic", "headache",
"jibhead"]
</example>
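
Why the arguments are needed: inside #inject the block's value becomes the next accumulator, so `next` has to hand the accumulator back explicitly, and `break` with a value makes that value the result of the whole #inject call. A small standalone illustration (the numbers are arbitrary):

```ruby
# Sum the even numbers, stopping at the first even number above 10.
total = [2, 3, 4, 12, 6].inject(0) do |sum, n|
  next sum unless n.even?   # skip n but keep the running sum
  break sum if n > 10       # stop early; inject returns sum
  sum + n
end

puts total  # prints 6
```

A bare `next` here would pass nil along as the accumulator, and the following iteration would fail on `nil + n`.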

In my third and last solution, I take the obvious route and use
regular expressions. Maybe regular expressions are better after all.

<code>
WORD_LIST = "/usr/share/dict/words"
WORDS = File.read(WORD_LIST).split

def number_words(base=16, min_letters=3)
  biggest_digit = (?a + (base - 11))
  regex = /\A[a-#{biggest_digit.chr}]+\z/
  result = []
  WORDS.each do |w|
    next if w.size < min_letters || w =~ /^[A-Z]/
    break if w[0] > biggest_digit
    result << w if w =~ regex
  end
  result
end
</code>

The following are all the hex numbers in the word list that have at
least three letters.

<example>
number_words # => ["aba", "abac", "abaca", "abaff", "abb", "abed",
"acca", "accede", "ace", "adad", "add", "adda", "added", "ade",
"adead", "aface", "affa", "baa", "baba", "babe", "bac", "bacaba",
"bacca", "baccae", "bad", "bade", "bae", "baff", "bead", "beaded",
"bebed", "bed", "bedad", "bedded", "bedead", "bedeaf", "bee", "beef",
"cab", "caba", "cabda", "cad", "cade", "caeca", "caffa", "cede",
"cee", "dab", "dabb", "dabba", "dace", "dad", "dada", "dade", "dae",
"daff", "dead", "deaf", "deb", "decad", "decade", "dee", "deed",
"deedeed", "deface", "ebb", "ecad", "edea", "efface", "facade",
"face", "faced", "fad", "fade", "faded", "fae", "faff", "fed", "fee",
"feed"]
</example>
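
For current Ruby versions, the regex approach can be condensed further. This is a sketch under assumptions rather than the code as posted: `letter_number_words` is a made-up name, the word list is passed in so the example is self-contained, and the minimum length is folded into the pattern, which also removes the need for a sorted list.

```ruby
def letter_number_words(words, base = 16, min_letters = 3)
  return [] unless (11..36).cover?(base)
  biggest_digit = (base - 1).to_s(base)   # "f" for base 16
  # Lowercase letters only, so capitalized words never match.
  regex = /\A[a-#{biggest_digit}]{#{min_letters},}\z/
  words.grep(regex)
end

p letter_number_words(%w[Bad ad bad decade face facet])
# => ["bad", "decade", "face"]
```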

Regards, Morton
 
Eugene Kalenkovich

Here are my 4 solutions :) (all use ?, so they will not work in 1.9)
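
The `?` caveat comes from a change in character literals: in Ruby 1.8 `?a` evaluates to the Integer 97, while from 1.9 on it is the String "a". A hedged sketch of how the same letter arithmetic could be written for later versions with String#ord and Integer#chr (`top_letter` is a name invented here):

```ruby
# Highest letter digit for a base, e.g. base 16 -> "f".
def top_letter(base)
  ("a".ord - 11 + base).chr
end

puts top_letter(16)  # prints f
puts top_letter(36)  # prints z
```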

# solution #1 - Simple one-liner

p File.read(ARGV[0]).split("\n").reject{|w| w !~
%r"^[a-#{(?a-11+ARGV[1].to_i).chr}]+$"}.sort_by{|w| [w.length,w]} if
(?a...?z)===?a-11+ARGV[1].to_i

# solution #2 - Non-hackery substs, like Olaf

p File.read(ARGV[0]).split("\n").reject{|w| w !~
%r"^[a-#{(?a-11+ARGV[1].to_i).chr}|lO]+$"}.sort_by{|w| [w.length,w]} if
(?a...?k)===?a-11+ARGV[1].to_i

# solution #3 - c001 hackerz

p File.read(ARGV[0]).split("\n").reject{|w| w !~
%r"^[a-#{(?a-11+ARGV[1].to_i).chr}|lo]+$"i}.map{|w|
w.downcase.gsub('o','0').gsub('l','1')}.sort_by{|w| [w.length,w]} if
(?a...?k)===?a-11+ARGV[1].to_i

# solution #4 - B16 5H0UT1N6 HACKER2

base=ARGV[1].to_i
base_=base+?a-11

raise "Bad base: [#{base}]" if base<1 || base_>?z

sub0=base_ < ?o
sub1=base>1 && base_ < ?l
sub2=base>2 && base_ < ?z
sub5=base>5 && base_ < ?s
sub6=base>6 && base_ < ?g
sub8=base>8 && base_ < ?b

reg="^["
reg<<'O' if sub0
reg<<'I' if sub1
reg<<'Z' if sub2
reg<<'S' if sub5
reg<<'G' if sub6
reg<<'B' if sub8
reg<<"|a-#{base_.chr}" if base>10
reg<<']+$'

result=File.read(ARGV[0]).split("\n").reject{|w| w !~ %r"#{reg}"i}.map{|w|
w.upcase}.sort_by{|w| [w.length,w]}
result.map!{|w| w.gsub('O','0')} if sub0
result.map!{|w| w.gsub('I','1')} if sub1
result.map!{|w| w.gsub('Z','2')} if sub2
result.map!{|w| w.gsub('S','5')} if sub5
result.map!{|w| w.gsub('G','6')} if sub6
result.map!{|w| w.gsub('B','8')} if sub8
result.reject!{|w| w !~ /[A-Z]/} # NUM8ER5-0NLY LIKE 61885 ARE N0T READA8LE
p result
 
Carl Porth

Just a simple regex; the rest is option parsing.

#!/usr/bin/env ruby -wKU

require "optparse"

options = {
  :base        => 16,
  :min_length  => 1,
  :word_file   => "/usr/share/dict/words",
  :ignore_case => false  # same key the -i option sets below
}

ARGV.options do |opts|
  opts.banner = "Usage: #{File.basename($PROGRAM_NAME)} [OPTIONS]"

  opts.separator ""
  opts.separator "Specific Options:"

  opts.on( "-b", "--base BASE", Integer,
           "Specify base (default #{options[:base]})" ) do |base|
    options[:base] = base
  end

  opts.on( "-l", "--min-word-length LENGTH", Integer,
           "Specify minimum length" ) do |length|
    options[:min_length] = length
  end

  opts.on( "-w", "--word-file FILE",
           "Specify word file",
           "(default #{options[:word_file]})" ) do |word_file|
    options[:word_file] = word_file
  end

  opts.on( "-i", "--ignore-case",
           "Ignore case distinctions in word file." ) do |i|
    options[:ignore_case] = true
  end

  opts.separator "Common Options:"

  opts.on( "-h", "--help",
           "Show this message." ) do
    puts opts
    exit
  end

  begin
    opts.parse!
  rescue
    puts opts
    exit
  end
end

# Letters usable as digits: "a" up to the highest digit of the base.
# For bases of 10 or less the range is empty and there is nothing to do.
last_letter = (options[:base] - 1).to_s(options[:base])
letters = ("a"..last_letter).to_a.join
exit if letters.size.zero?

criteria = Regexp.new("^[#{letters}]{#{options[:min_length]},}$",
                      options[:ignore_case])

open(options[:word_file]).each do |word|
  puts word if word =~ criteria
end


Ken Bloom

There is no end of numerological[0,1] variations that could be used by
anyone who feels the need for an additional challenge this week.

Well then, along those lines I have a Hebrew gematria counter. Give it
words on the command line, and it will tell you the gematria of each
word and the total gematria.

I use this to check when converting Hebrew citations of Jewish books into
English for the benefit of those reading English newsgroups.

#!/usr/bin/env ruby
# Written for Ruby 1.8 ($KCODE, jcode, and SyncEnumerator from generator)
$KCODE = "u"
require "jcode"
require 'generator'
class String
  def u_reverse; split(//).reverse.join; end
end

LETTERVALUES=Hash.new(0).merge \
  Hash['א' => 1, 'ב' => 2, 'ג' => 3, 'ד' => 4, 'ה' => 5,
  'ו' => 6, 'ז' => 7, 'ח' => 8, 'ט' => 9, 'י' => 10, 'כ' => 20,
  'ל' => 30, 'מ' => 40, 'נ' => 50, 'ס' => 60, 'ע' => 70, 'פ' => 80,
  'צ' => 90, 'ק' => 100, 'ר' => 200, 'ש' => 300, 'ת' => 400,
  'ם' => 40, 'ך' => 20 , 'ן' => 50, 'ף' => 80, 'ץ' => 90]
gematrias=ARGV.collect do |word|
  word.split(//).inject(0) do |t,l|
    t+LETTERVALUES[l]
  end
end

SyncEnumerator.new(ARGV, gematrias).each do |word,value|
  #reverse the word to print it RTL if all of the characters in it
  #are hebrew letters

  #note that this doesn't find nikudot, but then we don't care
  #anyway because the terminal mangles nikudot -- the result will be
  #so mangled anyway that we don't care whether it's reversed
  word=word.u_reverse if word.split(//)-LETTERVALUES.keys==[]
  printf "%s %d\n", word, value
end

printf "Total %d\n", gematrias.inject {|t,l| t+l}
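
For readers on a current Ruby, where strings are encoding-aware and `$KCODE`/`jcode` are not needed, the core of the calculation might be sketched like this. It is an illustration only: `LETTER_VALUES` and `gematria` are names chosen here, and the letter values are the ones from the table above.

```ruby
# Letter values as in the table above; final forms share the base value.
LETTER_VALUES = {
  "א" => 1, "ב" => 2, "ג" => 3, "ד" => 4, "ה" => 5,
  "ו" => 6, "ז" => 7, "ח" => 8, "ט" => 9, "י" => 10, "כ" => 20,
  "ל" => 30, "מ" => 40, "נ" => 50, "ס" => 60, "ע" => 70, "פ" => 80,
  "צ" => 90, "ק" => 100, "ר" => 200, "ש" => 300, "ת" => 400,
  "ם" => 40, "ך" => 20, "ן" => 50, "ף" => 80, "ץ" => 90
}.freeze

# Sum the letter values; any other character counts as 0.
def gematria(word)
  word.each_char.sum { |letter| LETTER_VALUES.fetch(letter, 0) }
end

puts gematria("חי")  # prints 18
```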
 
James Edward Gray II

I have come up with this one-liner:

----------8<----------
puts File.readlines('/usr/share/dict/words').grep(/\A[a-#{((b=ARGV[0].to_i)-1).to_s(b)}]+\Z/)
---------->8----------

I used a one-liner too:

ruby -sne 'print if $_.downcase =~ /\A[\d\s#{("a".."z").to_a.join[0...($size.to_i - 10)]}]+\Z/' -- -size=12 /usr/share/dict/words

James Edward Gray II
 
Robert Klemme

My solution.

robert

Attachment: word-filter.rb

#!ruby

if ARGV.empty?
  puts "use: #$0 base min max <word files>"
  exit 0
end

base = ARGV.shift.to_i
min  = ARGV.shift.to_i
max  = ARGV.shift.to_i

raise "Low base" unless base > 10
raise "min max error" unless max >= min && min > 0

filter = Regexp.new "^[a-#{(?a + base - 11).chr}]{#{min},#{max}}$",
  Regexp::IGNORECASE

ARGF.each do |line|
  puts line if filter === line
end
 
Douglas F Shearer

Crude but effective.

Written in about 20 minutes.

###################################################

@words = File.new('/usr/share/dict/words').read.downcase.scan(/[a-z]+/).uniq
@chars = '0123456789abcdefghijklmnopqrstuvwxyz'

def print_matches(base,minsize=0)

  print "Base: " + base.to_s + "\n"

  alphabet = @chars[0,base]

  print "Alphabet: " + alphabet + "\n\nMatching Words:\n\n"

  @words.each do |w|

    if w.length >= minsize
      hexword = true
      w.each_byte { |c|
        if !alphabet.include?(c.chr)
          hexword = false
          break
        end
      }
      p w if hexword
    end
  end

end

print_matches 18,4

#################################################

Output:

Base: 18
Alphabet: 0123456789abcdefgh

Matching Words:

"ababdeh"
"abac"
"abaca"
"abaff"
"abba"
"abed"
"acca"
"accede"
"achage"
"ache"
"adad"
"adage"
"adda"
"added"
"adead"
"aface"
"affa"
"agade"
"agag"
"aged"
"agee"
"agha"
...

Douglas F Shearer
http://douglasfshearer.com
 
