## Unit Conversion (#183)
The right way, generally, to do a task such as unit conversion is to
see if someone has already done all the hard work for you. As was
pointed out, there are several options in this respect:
* The [Stick][1] library for Ruby; a [brief summary][2] was provided.
Stick provides a value class (i.e. quantity with units), conversions,
syntactic sugar and more.
* Google's search engine can act as a calculator, including unit
conversions. Using Google's API is one option; another is
screen-scraping, as was done by _Peter Szinek_. (Of course, as noted,
you must have an active Internet connection to use this solution.)
* As was pointed out by _Ryan Davis_, there is a BSD/Un*x command
and library called `units` which does this same task. Transform the
arguments, pass them to a shell, and capture the output.
Many thanks to _Martin Boese_, whose solution had to be empirically
confirmed. Repeatedly.
But I'm going to look at the solution from _Robert Dober_. While it is
limited as posted, his data-driven approach could be expanded to
include more conversions.
To understand how the expression `1.0.in.to.mm` will generate the
string "25.4mm", I'll trace it a step at a time, looking at the
relevant bits of code.
First, we have the float value `1.0`, but where does the method `in`
come from? Clearly, class `Float` gets something by way of extension:

    class Float
      include Conversion
    end
Module `Conversion` only defines one method that will extend `Float`
(with the rest of `Conversion` being helper classes and code executed
when `Conversion` is first evaluated). That method is `method_missing`:

    def method_missing unit_name
      pc = ProxyClasses[ unit_name.to_s ] || super( unit_name )
      pc::new self
    end
So we will look for `ProxyClasses["in"]` and, if not found, we just
call to the parent class and hope it knows what to do with method call
`in`. But in this case, we're expecting to find something in
`ProxyClasses`... a Class, in fact, which we attempt to instantiate
immediately using `new`. But where do we fill `ProxyClasses`?
Ah, that would be the code right below `method_missing` in his
solution: the code that makes use of `LineParser`.

    conversions = LineParser::new
    File::open "units.txt" do | f |
      f.each do | line |
        conversions.parse_line line
      end
    end
Robert provided a minimal `units.txt` data file to show how the code
works. (Note that the line beginning "use SI" is part of the data file
and not a mistake; see `parse_line` for how that is handled.)

    1 in = 0.0254 m
    1 l = 0.001 m3
    use SI prefixes for m g l m3
It could be expanded greatly to support many more units. As each line
is read, the `LineParser` object parses them, keeping track of the
conversion rules -- I'll come back to that later. What I want to look
at first is what gets done with those rules:

    conversions.traverse do | src_unit, tgt_unit, conversion |
      ( ProxyClasses[ src_unit ] ||= Class::new ProxyClass ).module_eval do
        define_method tgt_unit do (@value * conversion).to_s + tgt_unit end
      end
    end
`traverse` enumerates every valid conversion, yielding the source
unit, the target unit, and the conversion factor. And here we see
where the entries of `ProxyClasses` originate: new anonymous
subclasses of `ProxyClass` are created by `Class::new ProxyClass`, but
only if one doesn't already exist for the particular source unit. Note
the use of the `||=` operator, which evaluates the right side and
assigns it to the left only if the left was initially nil (or false).
After ensuring that the proxy class corresponding to the source unit
exists, we call `module_eval` to add a method to that anonymous class.
The method name is the target unit, and the method multiplies the
value by the conversion factor, converts the result to a string, and
appends the target unit.
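To make the `||=` and `define_method` combination concrete, here is a minimal sketch of the same idea outside of Robert's code. The `RULES` array is a hypothetical stand-in for what `traverse` would yield, and `PROXY_CLASSES` and the tiny `ProxyClass` are my own simplified versions, not the originals.

```ruby
# Simplified base class: it only remembers the wrapped value.
class ProxyClass
  def initialize(value)
    @value = value
  end
end

# Maps a source-unit name to its anonymous proxy class.
PROXY_CLASSES = {}

# Hypothetical, hand-written rules standing in for what traverse yields.
RULES = [
  ["in", "m",  0.0254],
  ["in", "cm", 2.54],
  ["m",  "in", 1 / 0.0254]
]

RULES.each do |src_unit, tgt_unit, factor|
  # ||= builds the subclass only the first time a source unit is seen.
  klass = (PROXY_CLASSES[src_unit] ||= Class.new(ProxyClass))
  klass.class_eval do
    # The block is a closure, so tgt_unit and factor stay available.
    define_method(tgt_unit) { (@value * factor).to_s + tgt_unit }
  end
end

puts PROXY_CLASSES["in"].new(2.0).m   # inches converted to meters, as a string
puts PROXY_CLASSES.keys.inspect       # one proxy class per source unit
```

Both `"in"` rules land on the same anonymous class, which is the whole point of the `||=`.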
So, getting back to our example `1.0.in.to.mm`, we've now found the
`ProxyClass` corresponding to `1.0.in`. And we know that `ProxyClass`
also has methods named by target units, which includes one that
corresponds to the last part of the example: `.mm`.
If you're wondering about `to`, every `ProxyClass` defines that method
to return self: essentially a useless function (in the sense that
`1.0.in.to.mm` does nothing more than `1.0.in.mm`). Its existence
mimics other libraries, and the point is readability. (An alternative
would be a more traditional call, such as `1.0.convert(:in, :mm)` or
similar.)
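The whole dispatch chain can be sketched in a few lines. This is a simplified illustration, not Robert's code: the `PROXIES` hash and `InchProxy` are hand-built, the `respond_to_missing?` hook is my addition, and `convert` is a hypothetical version of the "more traditional call" mentioned above.

```ruby
class ProxyClass
  def initialize(value)
    @value = value
  end

  # Pure syntactic sugar: returns the proxy itself so that
  # 1.0.in.to.mm behaves exactly like 1.0.in.mm.
  def to
    self
  end
end

# Hypothetical, hand-built proxy for inches with a single target unit.
InchProxy = Class.new(ProxyClass) do
  define_method(:mm) { (@value * 25.4).to_s + "mm" }
end

PROXIES = { "in" => InchProxy }

module Conversion
  # Unknown methods are looked up as unit names; anything else falls
  # through to the normal NoMethodError via super.
  def method_missing(unit_name, *args)
    proxy_class = PROXIES[unit_name.to_s]
    return super unless proxy_class
    proxy_class.new(self)
  end

  def respond_to_missing?(unit_name, include_private = false)
    PROXIES.key?(unit_name.to_s) || super
  end

  # Hypothetical "traditional" alternative to the chained syntax.
  def convert(from, to)
    PROXIES.fetch(from.to_s).new(self).public_send(to)
  end
end

class Float
  include Conversion
end

puts 1.0.in.to.mm            # prints 25.4mm
puts 1.0.in.mm               # same result without the sugar
puts 1.0.convert(:in, :mm)   # same result, traditional style
```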
So once these proxy classes exist, there's very little effort going on
to evaluate calls such as our example. And creating the proxy classes
isn't much more difficult, assuming you have a proper conversion
table. Now we come back to `LineParser` and what happens beyond its
`parse_line` method. (I'll skip `parse_line` itself, since it is a
few, simple regular expressions.)
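For the curious, here is a rough guess at what such line parsing could look like. To be clear, this is not Robert's `parse_line`: the class name `LineParserSketch`, both regular expressions, and the two result arrays are assumptions of mine, chosen only so that the three sample lines of `units.txt` are recognized.

```ruby
# Hypothetical stand-in for LineParser; only the line-splitting logic is shown.
class LineParserSketch
  attr_reader :conversions, :si_units

  def initialize
    @conversions = []  # token lists such as ["1", "in", "=", "0.0254", "m"]
    @si_units    = []  # units named on the "use SI prefixes for" line
  end

  def parse_line(line)
    line = line.strip
    if (match = line.match(/\Ause\s+SI\s+prefixes\s+for\s+(.+)\z/))
      # Everything after "for" is a whitespace-separated list of units.
      @si_units.concat(match[1].split)
    elsif line.match?(/\A\S+\s+\S+\s+=\s+\S+\s+\S+\z/)
      # "value unit = value unit": keep the five tokens, "=" included.
      @conversions << line.split
    end
    # Any other line is silently ignored in this sketch.
  end
end

parser = LineParserSketch.new
[
  "1 in = 0.0254 m",
  "1 l = 0.001 m3",
  "use SI prefixes for m g l m3"
].each { |line| parser.parse_line(line) }

p parser.conversions
p parser.si_units
```

Keeping the `=` token in the split line matches the five-argument shape of `add_conversion`, which has a dummy parameter in that position.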
Most of the `units.txt` data that defines our conversions is handled
by `add_conversion`, which simply receives the split fields of each
data line as its arguments. The conversion table (stored in `@c`) is a
two-layered hash -- a hash of hashes -- and is set up with this code:

    def add_conversion lhs_value, lhs_unit, equal_dummy, rhs_value,
                       rhs_unit
      @c[ lhs_unit ][ rhs_unit ] = Float( rhs_value ) / Float( lhs_value )
      @c[ rhs_unit ][ lhs_unit ] = Float( lhs_value ) / Float( rhs_value )
    end
The conversion ratio (and the inverse conversion ratio) are stored in
two places based on the indexing order. By storing both ratios/orders,
we can convert in "both directions". That is, for our example, not
only can we convert inches to millimeters, but millimeters to inches.
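A small sketch of that table in isolation may help. How `@c` is initialized isn't shown in the excerpt, so the `Hash.new` with a default block below is my assumption; I also pass the table in as an explicit argument rather than using an instance variable.

```ruby
# Assumption: inner hashes are created automatically on first access.
table = Hash.new { |hash, unit| hash[unit] = {} }

# Same arithmetic as add_conversion, with the table as a parameter.
def add_conversion(table, lhs_value, lhs_unit, _equals, rhs_value, rhs_unit)
  table[lhs_unit][rhs_unit] = Float(rhs_value) / Float(lhs_value)
  table[rhs_unit][lhs_unit] = Float(lhs_value) / Float(rhs_value)
end

# Splitting the data line yields exactly the five expected arguments.
add_conversion(table, *"1 in = 0.0254 m".split)

puts table["in"]["m"]   # factor for inches -> meters
puts table["m"]["in"]   # factor for meters -> inches (the inverse)
```

Multiplying a value in the outer key's unit by the stored factor gives the value in the inner key's unit.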
The last bit of file parsing is adding appropriate metric prefixes (SI
units). One line in the file indicates which units are worthy of
metric prefixes. In the data file provided, we see that meters can
accept metric prefixes (such as "kilo" and "milli"), but inches will
not. These prefixes are handled by `add_si_unit_for`:

    def add_si_unit_for unit
      SIUnits.each do | prefix, conversion |
        @c[ prefix + unit ][ unit ] = conversion
        @c[ unit ][ prefix + unit ] = 1 / conversion
      end
    end
Here, `unit` is the particular unit for which we want to support
metric prefixes. `SIUnits` is the hash that maps each metric prefix
(as a character) to its corresponding order of magnitude. For every
unit and metric prefix, two more conversions are added, each the
inverse of the other: conversions between the naked unit and the
adorned unit (e.g. from meters to millimeters, and vice versa).
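Here is a sketch of the prefix expansion on its own. The contents of `SIUnits` aren't shown in the excerpt, so the three-entry `SI_PREFIXES` hash below is a hypothetical subset, and the table is again passed in explicitly and assumed to auto-create its inner hashes.

```ruby
# Hypothetical subset of prefixes: symbol => multiplier.
SI_PREFIXES = { "k" => 1e3, "c" => 1e-2, "m" => 1e-3 }

# Assumption: inner hashes are created automatically on first access.
table = Hash.new { |hash, unit| hash[unit] = {} }

def add_si_units_for(table, unit)
  SI_PREFIXES.each do |prefix, factor|
    table[prefix + unit][unit] = factor       # e.g. 1 km is 1000 m
    table[unit][prefix + unit] = 1 / factor   # e.g. 1 m is 0.001 km
  end
end

add_si_units_for(table, "m")

puts table.keys.inspect   # the naked unit plus one entry per prefix
puts table["km"]["m"]     # factor for kilometers -> meters
```

Note that the prefix `m` applied to the unit `m` produces `mm`, so prefix and unit names share one namespace in this scheme.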
Finally, `traverse` is an enumerator that will yield (via `blk.call`)
every valid combination of units and the appropriate conversion
factor. It manages this without storing every conversion (e.g. we
store the inches to meters conversion, and the meters to millimeters
conversion, but don't explicitly store inches to millimeters).
Enumerating every possible, valid conversion is done in the private
method `_traverse`:

    def _traverse src_unit, unit_conversions, traversed_units, f=1.0, &blk
      unit_conversions.each do | new_unit, conversion |
        next if traversed_units.include? new_unit
        blk.call src_unit, new_unit, f * conversion
        _traverse src_unit, @c[ new_unit ], traversed_units + [ new_unit ],
                  f * conversion, &blk
      end
    end
The final, recursive step here is what allows us to build a transitive
closure of all units. `src_unit` is, of course, the source unit (e.g.
inches). `unit_conversions` is the hash of units and conversion
factors, containing all possible immediate conversions from the unit
currently being expanded (initially the source). As you can see, we
enumerate those into `new_unit` and `conversion`.
We skip a target unit if it's already been visited (i.e. it is in
`traversed_units`). Otherwise, we yield to the caller (`blk.call`) and
recurse, now converting the source unit to everything the target unit
can itself be converted to, making sure to update `traversed_units` so
that the recursion eventually terminates.
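To watch the traversal derive inches to millimeters from two stored hops, here is a standalone sketch. The public `traverse` isn't shown in the excerpt, so `each_conversion` and the way it seeds the visited list with the source unit are my guesses; `walk` follows the same shape as `_traverse`, and `TABLE` is a hand-written, hypothetical conversion table.

```ruby
# Hand-written table: only direct conversions are stored.
TABLE = {
  "in" => { "m" => 0.0254 },
  "m"  => { "in" => 1 / 0.0254, "mm" => 1000.0 },
  "mm" => { "m" => 0.001 }
}

# Assumed entry point: start one walk per source unit, with the source
# itself already marked as visited.
def each_conversion(table, &blk)
  table.each_key do |src_unit|
    walk(table, src_unit, table[src_unit], [src_unit], &blk)
  end
end

# Same shape as _traverse: yield each reachable unit with the factor
# accumulated along the path, then recurse from that unit.
def walk(table, src_unit, neighbours, visited, factor = 1.0, &blk)
  neighbours.each do |new_unit, conversion|
    next if visited.include?(new_unit)
    blk.call(src_unit, new_unit, factor * conversion)
    walk(table, src_unit, table[new_unit], visited + [new_unit],
         factor * conversion, &blk)
  end
end

found = []
each_conversion(TABLE) { |src, tgt, factor| found << [src, tgt, factor] }
found.each { |src, tgt, factor| puts "1 #{src} = #{factor} #{tgt}" }
```

Because the visited list is built per path (`visited + [new_unit]` creates a new array), a table with several routes between two units could yield the same pair more than once; with the proxy-class code that would simply redefine the method.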
[1]: http://stick.rubyforge.org/
[2]: http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/320583