[QUIZ] Secret Agent 00111 (#94)

R

Ruby Quiz

The three rules of Ruby Quiz:

1. Please do not post any solutions or spoiler discussion for this quiz until
48 hours have passed from the time on this message.

2. Support Ruby Quiz by submitting ideas as often as you can:

http://www.rubyquiz.com/

3. Enjoy!

Suggestion: A [QUIZ] in the subject of emails about the problem helps everyone
on Ruby Talk follow the discussion. Please reply to the original quiz message,
if you can.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

by Mauricio Fernandez

"Secret Agent 00111 is back at the Casino again, playing a game of chance, while
the fate of mankind hangs in the balance. Each game consists of a series of
favorable events (probability p), terminated by the first occurrence of an
unfavorable event (probability q = 1 - p). More specifically, the game is
roulette, and the unfavorable event is the occurrence of 0, which has a
probability of q = 1 / 37. No one seriously doubts that 00111 will come through
again, but the Secret Service is quite concerned about communicating the
blow-by-blow description to Whitehall.

The bartender, who is a free-lance agent, has a binary channel available, but he
charges a stiff fee for each bit sent. The problem perplexing the Service is how
to encode the vicissitudes of the wheel[1] so as to place the least strain on
the Royal Exchequer."

00111 will be needing the communication device in 48H. Q has decided to give him
a programmable gizmo, with a built-in Ruby interpreter, in the hope that 00111
will code/play with it on his own and do his best to return it in the original
condition, very unlike all the other gadgets he's been given before.

Therefore, Q can enjoy the luxury of coding in Ruby, and he's confident he will
be able to complete the task in time. Q's first thought of using Zlib::Deflate
to compress the data as shown in the following example:

# EVENTS will be a large sequence of boolean events:
# true favorable event (00111 wins)
# false unfavorable event (00111 loses)
# we generate a sample one as follows:
EVENTS = (0..1000).map{ rand() > 1.0 / 37 }

# compress
require 'zlib'
string = EVENTS.map{|x| x ? "0" : "1"}.join("")
string.size # => 1001
compressed = Zlib::Deflate.deflate(string)
compressed.size # => 69

# now 'compressed' is transmitted by the bartender

However, Q has been wary of Zlib since he ran into some sort of BufferError when
he was installing Mongrel using RubyGems, and he still mourns the sad demise of
00110, which was the direct result of a binary incompatibility between a Ruby
build and a third-party extension.

Moreover, the MI6 has been spying on the management of the Casino, which faced a
similar problem and solved it long ago, and intercepted the following message at
the time the Casino's system was being developed:

TEST RUN 43
-----------
Original size: 10000
Compressed to: 242 bytes
Result using deflate: 462 bytes

Q keeps a very detailed log of his activities, so you know he has been reading
up on Information Theory, and has found a paper that addresses the very problem
he needs to solve:

Run-length encodings--S. W. Golomb (1966); IEEE Trans Info Theory 12(3):399[1]

He also found some information in the Wikipedia[2], and wrote the following unit
tests before feeling indisposed while reading RailsBlob[3]:

http://rubyquiz.com/test_00111_gizmo.rb

Q has also wrote some code to exert the SecretAgent00111CommunicationGizmo:

http://rubyquiz.com/00111.rb

Finally, Q left a note with the expected (approximate) output one should get
when running 00111.rb:

Dear myself,

please make sure that
ruby 00111.rb
writes something like this to stdout before you hand your beautiful creation
to the brutish 00111:

Probability : 0.972972972972973
Approx. with: 0.957603280698574
Exponent: 4 (p**4 = 1/2; m = 2 ** e = 16)
Decoded correctly? yes
Original size: 1000
Compressed to: 23 bytes (2.3%)
Compare to 57 bytes using deflate...
=========================================
Testing buffered encoder/decoder
Decoded correctly? (b) yes
Decoded correctly? (c) yes

Sincerely,

Q

Unfortunately, it seems Q is not doing any better, but management is sure you
will be able to complete Q's code (that is, write 00111_gizmo.rb such that the
above unit tests pass and the sample program works) in time, given all the
material he left.

00111's mission will require _at least_ the following tests to pass:

* test_rle
* test_unrle
* test_basic_encoding
* test_basic_decoding
* test_round_trip_probabilistic

The other tests are optional (you can choose to comment them, but the sample
program won't execute correctly if they fail, though), but you should probably
do your best to make them pass: this is an excellent opportunity to prove your
skills, and it seems Q's retirement is not too far away...

Good luck.

PS: you owe me one, man. Such opportunities are rare.
 
J

James Edward Gray II

It's my fault this quiz was released late and I think it is a fun
problem. I don't want to cheat anyone out of a chance to solve it,
so I'm extending the quiz one week. I will summarize a week from
tomorrow.

James Edward Gray II
 
K

Kero

Q keeps a very detailed log of his activities, so you know he has been reading
up on Information Theory, and has found a paper that addresses the very problem
he needs to solve:

Run-length encodings--S. W. Golomb (1966); IEEE Trans Info Theory 12(3):399[1]

Gave me the chance to actually read Golomb's paper, which I never had :)

[snip]
Unfortunately, it seems Q is not doing any better, but management is sure you
will be able to complete Q's code (that is, write 00111_gizmo.rb such that the
above unit tests pass and the sample program works) in time, given all the
material he left.

00111's mission will require _at least_ the following tests to pass:

* test_rle
* test_unrle
* test_basic_encoding
* test_basic_decoding
* test_round_trip_probabilistic

The other tests are optional (you can choose to comment them, but the sample
program won't execute correctly if they fail, though), but you should probably
do your best to make them pass: this is an excellent opportunity to prove your
skills, and it seems Q's retirement is not too far away...

I'm all for TDD, but a little bit of explanation would have been welcome
1) Golomb's paper mentions m, you use an exponent, such that 2**exponent is m
(so we're missing all the fun with M; aren't these the Rice codes?)
2) placing the exponent in a BYTE before the bits is... practical...
3) why does there seem to be an extra 'false' trailing, in (some!) of the
bitstreams of the testcases?

After I figured out all of that (I happily started with arbitrary m... and
with hindsight, (3) kept me confused longer than it should have; tests
working on infinite bitstreams before turning to the finite bit-strings (and
then byte-strings) would have helped (unfortunately test_decoder_partial is
derived from those finite TEST_VECTORS and therefore carries its problems
back into the infinite bitstream domain :( ) ) I managed:

Probability : 0.972972972972973
Approx. with: 0.957603280698574
Exponent: 4 (p**4 = 1/2; m = 2 ** e = 16)
Decoded correctly? yes
Original size: 1000
Compressed to: 27 bytes (2.7%)
Compare to 72 bytes using deflate...
================================================================================
Testing buffered encoder/decoder
Decoded correctly? (b) yes
Decoded correctly? (c) yes

Now I'll have to figure out something very simple: when is the new deadline
of this quiz :p

Regards,
Kero.
 
J

James Edward Gray II

Now I'll have to figure out something very simple: when is the new
deadline
of this quiz :p

<laughs> I will release the summary Thursday morning, which means
you need to have it in before I start writing sometime Wednesday if
you want me to look it over.

James Edward Gray II
 
K

Kero

Now I'll have to figure out something very simple: when is the new
<laughs> I will release the summary Thursday morning, which means
you need to have it in before I start writing sometime Wednesday if
you want me to look it over.

OK, here we go :)

Somehow, I dug into the tests and missed test_rle and test_unrle above the
TEST_VECTORS. A pity, because they *do* get your mind set onto those
trailing false's. Plus, especially test_unrle is trivial to implement.
Run like `ruby test_00111_gizmo.rb -n test_unrle` :)

Then on with test_basic_en/decoding. Encoder#<< shows directly why the extra
false is required for a finite bitstream: the code doesn't do a bloody thing
when appending 'true's. In order to get finite bitstrings (36 out of 37
which end with 'true') across the wire, you have to append a 'false'. For
consistency *always* append a 'false' when you close the string. and that is
precisely what Encoder#finish does.

This way, Encoder#<< works for infinite bitstreams, too.

And then you need some padding (with ones! so the decoder won't be able to
decode them!) for those "practical" bytes.

The decoder has a statemachine to see whether its reading bits for the
multiplier :)div) or the remainder :)rem, which is a binary encoded number
as we all know for these powers-of-two codes; this is not the case for the
general Golomb codes, you'd need an extra offset and sometimes one bit
less). I had to add a state for the exponent :)exp) in the beginning and
move some code around, instead of just reading the exponent in initialize,
because at that time the (string)io can still be empty.

Note how Decoder#read returns as much as can be processed from the
bitstream, up until what is received (dealing with byte boundaries;
additional (fewer than eight) bits could be in the bitstream, but not yet
available for the decoder; see how "practical" bytes are?)

The Gizmo::encode and Gizmo::decode look amazingly like the test_encoder and
test_decoder tests. Sorry :)

By this time I was still messing with the trailing false, which I had to
append to some tests. Apart from that, test_decode_partial and
test_round_trip_probabilistic came almost for free. When I finally saw the
light, I could remove the trailing 'false' in Gizmo::decode, put the tests
back to what they were supposed to be, and immediately after that, Q's
00111.rb ran as it was supposed to.

Thanks for the Quiz, taught me Golomb codes and Ruby's StringIO.
Oh, and I realize now that test_decoder tests the infinite streams.

And then, the code!
-----

# Author: Kero van Gelder
# Copyright: Kero van Gelder, can be distributed under LGPL

require 'stringio'

module SecretAgent00111CommunicationGizmo

class UndefinedRLE < RuntimeError
end

def self.unrle(ary)
ary.inject([]) {|un, val|
un.concat([true] * val)
un << false
}
end

def self.rle(ary)
result = []
while not ary.empty?
nr = 0
while not ary.empty? and ary[0]
nr += 1
ary.shift
end
raise UndefinedRLE if ary.empty?
ary.shift
result << nr
end
raise UndefinedRLE if result.empty?
result
end

Log2 = Math.log(2)

def self.encode(ary, exponent)
io = StringIO.new
encoder = Encoder.new(exponent, io)
ary.each{|result| encoder << result}
encoder.finish
io.string
end

def self.decode(bitstring)
ary = Decoder.new(StringIO.new(bitstring)).read
ary.pop
ary
end

class Encoder
attr_reader :exponent, :io

def initialize(exponent, io)
@exponent, @io = exponent, io
io.print [exponent].pack("c")
@ones = 0
@buf = ""
# @finished = false
end

def << bool
finished? and raise "Channel closed"
if bool
@ones += 1
else
m = 2**exponent
div, rem = @ones.divmod m
@buf << ("1" * div)
@buf << (("%0#{exponent+1}s" % (rem).to_s(2)).gsub(/ /, "0"))
if (blen = @buf.length) >= 8
#p [exponent, div, rem, @buf, blen]
bytes = blen / 8
io.print([@buf.slice!(0, bytes * 8)].pack("B*"))
end
#p io.string
@ones = 0
end
end

def finish
self << false
@finished = true
@buf << ("1" * (8 - @buf.length % 8)) if @buf.length % 8 > 0
io.print([@buf].pack("B*"))
end

def finished?
@finished
end
end

class Decoder
attr_reader :io
def initialize(io)
@io = io
@state = :exp
@buf = ""
@div = 0
@rem = ""
end

def exponent
if @state == :exp # and io.ready?
@exponent = io.read(1).unpack("c")[0]
@state = :div
end
@exponent
end

def read
if @state == :exp # and io.ready?
@exponent = io.read(1).unpack("c")[0]
@state = :div
end
result = []
@buf << io.read.unpack("B*")[0]
while not @buf.empty?
#p [@buf, @state, @exponent, @div, @rem]

if @state == :div
while not @buf.empty? and @buf[0] == ?1
@buf.slice!(0, 1)
@div += 1
end
@state = :rem unless @buf.empty?
end
#p [@buf, @state, @div, @rem]

if @state == :rem
req = exponent + 1 - @rem.length
if @buf.length < req
@rem << @buf.slice!(0..-1)
else
@rem << @buf.slice!(0, req)
result.concat([true] * (@div * 2**exponent + @rem.to_i(2)))
result << false
@div = 0
@rem = ""
@state = :div
end
end
end
result
end

end
end
 
B

Boris Prinz

Hi,

here's my solution. It's based on strings of "1"s and "0"s, so the
encoding
can be done mainly with a format string ("%0#{@exponent+1}B") and
the decoding with regular expressions.

Regards,
Boris


### 00111_gizmo.rb

require 'stringio'

module SecretAgent00111CommunicationGizmo
class ExponentUndefined < Exception
end

class UndefinedRLE < Exception
end

def self.decode data
events = Decoder.new(StringIO.new(data)).read
events.pop # remove last false
events
end

def self.encode events, exponent
io = StringIO.new
e = Encoder.new(exponent, io)
events.each do |result|
e << result
end
e.finish
io.string
end

def self.rle events
raise UndefinedRLE.new if events.size == 0
rl = []
truevals = 0
events.each do |result|
if result
truevals += 1
else
rl << truevals
truevals = 0
end
end
raise UndefinedRLE.new if truevals != 0
rl
end

def self.unrle rl
events = []
rl.each {|n| events += [true]*n + [false]}
events
end

class Encoder
def initialize exponent, io
@exponent = exponent
@io = io
@events = []
@bits = ''
@io << exponent.chr
end

def << result
@events << result
if result == false
encode_events
end
end

def encode_events
n = @events.size - 1
@events = []
@bits << "1" * (n / 2**@exponent)
@bits << "%0#{@exponent+1}B" % (n % 2**@exponent)
flush_bytes
end

def flush_bytes
# write every 8 bits to the stream:
@bits.gsub!(/[01]{8}/) do |byte|
@io << byte.to_i(2).chr
''
end
end

def finish
@events << false
encode_events
# fill up remaining byte with "1"s:
rest = @bits.size % 8
if rest != 0
@bits << "1" * (8 - rest)
end
flush_bytes
end
end

class Decoder
def bits(string)
string.unpack("B*")[0]
end

def initialize io
@io = io
@exponent = nil
@bits = ''
@t = @n = nil
end

def exponent
return @exponent if @exponent
e = @io.getc
if e
@exponent = e
else
raise ExponentUndefined.new
end
@exponent
end

def add_events
n = @n
n += @t * 2**exponent if @t
@t = @n = nil
@events += SecretAgent00111CommunicationGizmo.unrle([n])
end

def decode_bits
loop do
found = false
if @bits =~ /^(1+)0/
@t = $1.size
@bits.sub!(/^(1+)/, '')
found = true
end
re = Regexp.new("^0([01]{#{exponent}})")
if @bits =~ re
@n = $1.to_i(2)
@bits.sub!(re, '')
add_events
found = true
end
return unless found
end
end

def read
@events = []
begin
exponent
data = @io.read
@bits += bits(data)
decode_bits
rescue ExponentUndefined
# exponent not available yet in stream, try again later
end
@events
end
end
end
 
E

Eric Torreborre

Hi,

Here's my "solution" to the quiz. It passes the mandatory tests but not the
StringIO tests (encode and decode_partial).

I found it interesting to know about the Golomb coding, but I still can get
ideas right about using IO with bits and bytes. This certainly points me to
another area of study,...

The units tests were very helpful because they covered a lot of different
cases (showing a new bug almost each time!) but the probabilistic test case
was even more interesting because it detected a bug outside of the test
suite. I have heard about QuickCheck which is an automatic testing tool for
Haskell (http://www.math.chalmers.se/~rjmh/QuickCheck/). This tool also
makes use of random generated test cases and I am wondering if theses ideas
could be imported into the Ruby world.

Eric.

====================================
class SecretAgent00111CommunicationGizmo
class UndefinedRLE < Exception
end

def self.rle(tries)
# raise an error if there are no tries or if the array doesn't end with
a false try
if (tries.empty? || tries[-1])
raise UndefinedRLE.new("tries empty: #{tries.empty?} - tries ends
with: #{tries[-1]}")
end

# array[0..-2] returns an array with the last element removed
tries.map{|x| x ? "1" : "0"}.join("").split("0", -1)[0..-2].inject([])
do |result, tries_row|
result << tries_row.size
end
end

def self.unrle(tries)
tries.inject([]) do |result, tries_row|
tries_row.times {result << true}
result << false
end
end

def self.encode(array, exponent)
tries, result = rle(array << false), ""
tries.inject(result) do |result, tries_number|
# divide the tries number with a power of two
a, b = tries_number.divmod(2 ** exponent)
# the quotient is unary-encoded ('1' repeated a times)
# the remainder is binary encoded
result << '1' * a + sprintf("0%0*b", exponent, b)
end

# pad the result with '1's in order to have an appropriate number of
bytes
result << '1' * (-result.length % 8)
# add the exponent used for calculation and pack
[exponent].pack("c") + [result].pack("B*")
end

def self.decode(bitstring)
exponent, result = decode_exponent_and_values(bitstring)
result
end

def self.decode_exponent_and_values(bitstring)
bitstring = bitstring.unpack("B*")[0]
# the exponent is binary encoded on 8 bits
exponent = bitstring.slice!(0..7).to_i(2)

result = []
# stop analyzing the bit if empty or if anything that's left is padding
while(not bitstring.empty? and not bitstring =~ /\A1+$/)

# the quotient is unary encoded at the beginning of the bitstring if
!= 0
coded_quotient = ""
bitstring.scan(/(\A1+)0+\w*/){|x| coded_quotient = x[0]}
bitstring.slice!(0, coded_quotient.size)
quotient = coded_quotient.size

# the remainder is binary encoded on exponent + 1 bits
remainder = bitstring.slice!(0..exponent).to_i(2)
result << quotient * (2 ** exponent) + remainder
end
return exponent, unrle(result)
end

class Encoder
def initialize(exponent, io)
@values = []
@io = io
@exponent = exponent
end

def << value
@values << value
end

def finish
@io << SecretAgent00111CommunicationGizmo.encode(@values, @exponent)
end
end

class Decoder
attr :exponent
def initialize(io)
@exponent, @array =
SecretAgent00111CommunicationGizmo.decode_exponent_and_values(io.string)
end

def read
@array + [false]
end
end
end
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,982
Messages
2,570,190
Members
46,736
Latest member
zacharyharris

Latest Threads

Top