Ruby in XML.

John Carter

I have just stuck this on the wiki:
http://www.rubygarden.org/ruby?RubyInXML

I like XML.

There is a firm standard, and there is a rich toolset for working with it.

I like HTML. It is simply the fastest way to deliver good looking
documents to the widest audience.

No surprise, then, that I like XHTML: it _is_ HTML in XML. I can validate
my XHTML documents and know that they conform exactly to the standard, and
hence will render properly on a wide set of browsers.

I love Ruby. It is quite the easiest way to program. It has a lovely XML
API called REXML.

I sometimes need to do spreadsheet sort of things. Basically a document
that describes my reasoning and findings, supported by numbers.

A long time ago, when I still did Perl, I found by actual trials that I was
about as fast in Perl as the average guy is with a spreadsheet. Sometimes
faster, sometimes slower. But for the next hundred data sets, my Perl
scripts were a thousand times faster.

So I don't do spreadsheets these days, I write Ruby scripts.

So I have taken to combining Ruby & HTML. Sometimes via CGI. It works for
me.

But sometimes I have documents that are more HTML than Ruby. So it makes
sense to write them in HTML, with a bit of Ruby embedded. That's where erb
and eruby live.

But I don't like erb and eruby's tags. I can't validate my XHTML.

So add REXML and I present a very small script I call rubyexml. Ruby
Embedded in XML.

#!/usr/bin/ruby -w

require 'rexml/document'
require 'rexml/streamlistener'
require 'pp'

# All evals are evaluated in the context of an instance of this class.
# Extend this, or add this method to a class of your own.
class Context

  # Replace every #{...} in value with the result of evaluating its body.
  def eval_value( value)
    value.gsub( %r{ \#\{ ( [^\}]+ ) \} }x) do | match|
      instance_eval( $1).to_s
    end
  end
end


# This does the work.
class Listener
  include REXML::StreamListener

  def initialize( context)
    @context = context
  end

  # Evaluate the body of each comment as Ruby and print the result.
  def comment( text)
    print @context.instance_eval( text)
  rescue SyntaxError => details
    pp @context
    pp text
    raise "Failed to compile '#{text}' in context : #{details}"
  end

  # Echo the opening tag, expanding #{...} inside attribute values.
  def tag_start(name,attrs)
    print "<",name
    attrs.each_pair do |key, value|
      print " #{key}=\"#{@context.eval_value( value)}\""
    end
    print ">"
  end

  def tag_end( name)
    print "</", name, ">"
  end

  # Echo text, expanding #{...} on the way through.
  def text( text)
    print @context.eval_value(text)
  end

  def cdata( ctext)
    text( ctext)
  end
end

# This comes for free from REXML. Stream parse an XML document.
REXML::Document::parse_stream( REXML::SourceFactory::create_from( STDIN),
                               Listener::new( Context.new))


So take a chunk of XHTML...

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
  "xhtml11.dtd" >
<html xmlns="HTTP://www.w3.org/TR/xhtml"
      xmlns:xlink="HTTP://www.w3.org/XML/XLink/0.9"
      xml:lang="en" >
  <head>
    <title>

    </title>
  </head>

  <body>
    <h1>
      The answer to life, the universe and everything is <!-- 44 - 2 -->
    </h1>

    <p>
      The following image is <!-- @file_name = "pretty_picture.jpg" -->

      <img src="#{@file_name}" alt = "#{@file_name.sub(/\.jpg/,'')}"/>
    </p>
  </body>
</html>

It validates as correct XML against the XHTML DTD.

Feed it through rubyexml and get...

<html xmlns:xlink="HTTP://www.w3.org/XML/XLink/0.9" xml:lang="en"
      xmlns="HTTP://www.w3.org/TR/xhtml">
  <head>
    <title>

    </title>
  </head>

  <body>
    <h1>
      The answer to life, the universe and everything is 42
    </h1>

    <p>
      The following image is pretty_picture.jpg

      <img src="pretty_picture.jpg" alt="pretty_picture"></img>
    </p>
  </body>
</html>

Just so blooming simple.

And if you have a big hairy object that knows all the deeper secrets of
life, just change rubyexml to...

REXML::Document::parse_stream( REXML::SourceFactory::create_from( STDIN),
                               Listener::new( BigHairyObjectThatKnowsTheDeeperSecretsOfLife.new))

And you can refer to all its instance variables and methods.

It's all so blooming simple!
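As a rough illustration of swapping in a context object of your own, here is a minimal sketch. The Report class, its @title variable and its total method are invented for the example; only eval_value comes from the script above, and it is copied into the class as the script's comment suggests.

```ruby
# Hypothetical context object: Report, @title and total are made up for
# illustration. eval_value is the method from the rubyexml script.
class Report
  def initialize(title, numbers)
    @title = title
    @numbers = numbers
  end

  # A method the document can call from inside #{...}.
  def total
    @numbers.inject(0) { |sum, n| sum + n }
  end

  # Expand each #{...} in value by evaluating its body against this object.
  def eval_value(value)
    value.gsub(/\#\{([^}]+)\}/) { instance_eval($1).to_s }
  end
end

report = Report.new("Sales", [1, 2, 3])
# Single quotes keep Ruby from interpolating the string before eval_value sees it.
puts report.eval_value('#{@title}: total is #{total}')
```

This prints "Sales: total is 6". Passing such an object to Listener::new gives every comment and every #{...} in the document access to the same instance variables and methods.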



John Carter Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email : (e-mail address removed)
New Zealand

"At first I hoped that such a technically unsound project would
collapse but I soon realized it was doomed to success. Almost
anything in software can be implemented, sold, and even used given
enough determination. There is nothing a mere scientist can say that
will stand against the flood of a hundred million dollars. But there
is one quality that cannot be purchased in this way---and that is
reliability. The price of reliability is the pursuit of the utmost
simplicity. It is a price which the very rich find most hard to
pay." -- C.A.R. Hoare in The Emperor's Old Clothes,
Turing Award Lecture (27 October 1980)
 
David Mitchell

How does this work for loops? For example this won't work:

<!-- for @i in 0...5 -->
<img src="#{@i}.jpg"/>
<!-- end -->

How would you suggest I achieve this? Perhaps:

<!-- for @i in 0...5
print "<img src='#{@i}.jpg'/>"
end -->

I can quickly see that becoming an escaping nightmare.

I really like the idea but I would want it to be this flexible.

David

John Carter

David Mitchell wrote:
How does this work for loops? For example this won't work:

Yup. Thought about it. Didn't come up with any bright thunks..

<!--
(0...5).inject('') {|m,i|
  m+= "<img src=\"#{i}.jpg\"/>"
}
-->

Not bright, but will work.
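
To see what the listener would print for that comment, the body can be run on its own. This is only a sketch of the expression outside any XML; the variable names images and same are picked for the example, and the map/join form is an alternative spelling, not something from the script.

```ruby
# The body of the comment, evaluated as plain Ruby. inject threads the
# accumulated markup through the block and returns the final string.
images = (0...5).inject('') { |markup, i| markup + "<img src=\"#{i}.jpg\"/>" }

# An equivalent spelling with map and join.
same = (0...5).map { |i| "<img src=\"#{i}.jpg\"/>" }.join

puts images
puts images == same
```

The comment's value is a single string holding all five img tags, which is what gets printed in place of the comment.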

David Mitchell wrote:
I really like the idea but I would want it to be this flexible.

Given flexible or simple, I chose simple.

Possibly this is merely a lack of imagination on my part.

Perhaps flexible and simple is possible. I wanted it to be able to
validate as vanilla XHTML.

But that is what wikis are for. If I missed something, click on "edit
this page".




Carter's Compass...

I know I'm on the right track when by deleting code I'm adding
functionality.
 
James Britt

John Carter wrote:
...
So add REXML and I present a very small script I call rubyexml. Ruby ...


So take a chunk of XHTML...

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"xhtml11.dtd" > ...


<body>
<h1>
The answer to life, the universe and everything is <!-- 44 - 2 -->
</h1>

Question: If your document has instructions for processing, why not use
processing instructions? Why munge the semantics of the comments syntax?

Maybe take a look at how Nitro does this.


--

http://www.ruby-doc.org - The Ruby Documentation Site
http://www.rubyxml.com - News, Articles, and Listings for Ruby & XML
http://www.rubystuff.com - The Ruby Store for Ruby Stuff
http://www.jamesbritt.com - Playing with Better Toys
 
David Mitchell

Hey,

John said:
Given flexible or simple, I chose simple.
..
Perhaps flexible and simple is possible. I wanted it to be able to
validate as vanilla XHTML.

Ok, I don't think the example I posted falls outside the bounds of
simple. Maybe it makes your script a little more complicated but not the
syntax that the end user must use. I suspect if you hold off the actual
evaluation of the comments until the page is output then you might find
this task simpler. That is, build a single ruby script in memory that
contains the logic for outputting the page, then just eval it at the end.
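
A minimal sketch of that deferred approach, under stated assumptions: ScriptBuilder, emit and the _out variable are names invented here, not part of rubyexml. Every event appends a line to one Ruby script, comments are copied in as raw code, and nothing is evaluated until the whole document has been read. Attribute values are not escaped, so this is an illustration rather than a finished tool.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener that translates the document into one Ruby script.
class ScriptBuilder
  include REXML::StreamListener

  attr_reader :script

  def initialize
    @script = String.new("_out = String.new\n")
  end

  # Comments hold raw Ruby, so a loop may open in one comment and close in another.
  def comment(text)
    @script << text.strip << "\n"
  end

  def tag_start(name, attrs)
    markup = "<#{name}"
    attrs.each_pair { |key, value| markup << " #{key}=\"#{value}\"" }
    emit(markup << ">")
  end

  def tag_end(name)
    emit("</#{name}>")
  end

  def text(text)
    emit(text)
  end

  private

  # Literal markup becomes a dumped string literal; each #{...} becomes an
  # expression that is evaluated only when the generated script runs.
  def emit(markup)
    markup.split(/(\#\{[^}]+\})/).each do |part|
      next if part.empty?
      if (m = part.match(/\A\#\{([^}]+)\}\z/))
        @script << "_out << (#{m[1]}).to_s\n"
      else
        @script << "_out << #{part.dump}\n"
      end
    end
  end
end

# Single quotes keep #{i} literal until the generated script is evaluated.
source = '<ul><!-- (1..3).each do |i| --><li>item #{i}</li><!-- end --></ul>'

builder = ScriptBuilder.new
REXML::Document.parse_stream(source, builder)
puts eval(builder.script + "_out\n")
```

Because the loop body is just more lines of the generated script, nested loops need no extra state in the listener. To run the script against a context object instead of the top level, instance_eval on that object could replace the bare eval.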
 
John Carter

David Mitchell wrote:
Ok, I don't think the example I posted falls outside the bounds of simple.
Maybe it makes your script a little more complicated but not the syntax that
the end user must use. I suspect if you hold off the actual evaluation of the
comments until the page is output then you might find this task simpler.

I guess where I started was wanting to loop over the rows of a table.

<table>
  <!-- (0..5).each { |i| -->
  <tr>
    <td>
      #{i} - <!-- @name -->
    </td>
    <td>
      <img src="#{i}.jpg" alt="#{i}"/>
    </td>
  </tr>
  <!-- } -->
</table>


But then I'm holding all kinds of state, and what happens if I want to
iterate over the columns of the table as well? (Nested loops.)


<table>
  <!-- (0..5).each { |i| -->
  <tr>
    <td>
      #{i} - <!-- @name -->
    </td>
    <!-- (0..6).each do |j| -->
    <td>
      #{i*10 + j}
    </td>
    <!-- end -->
  </tr>
  <!-- } -->
</table>


I'm then operating a fairly hairy state machine in my Listener class or
I'm no longer using the one (longish) line StreamParser.

However, if anyone gets the urge to embellish what I have done, as I say,
that's what wikis are for.






Carter's Clarification of Murphy's Law.

"Things only ever go right so that they may go more spectacularly wrong later."

From this principle, all of life and physics may be deduced.
 
John Carter

James Britt wrote:
Question: If your document has instructions for processing, why not use
processing instructions? Why munge the semantics of the comments syntax?

The short answer is I had read the XML standard so long ago I forgot
about them....

I knew they existed, but I feared they had some deep meaning I didn't
want to clash with.

I will change to using them.

Thank you.

James Britt wrote:
Maybe take a look at how Nitro does this.

Will do.
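
For the switch to processing instructions mentioned above, a minimal sketch might look like the following. PIListener and the "r" target name are assumptions made for this example; REXML's StreamListener does provide an instruction(name, instruction) callback, which is the only library hook relied on here.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener: evaluates <?r ... ?> processing instructions
# instead of comments. The "r" target is an arbitrary choice.
class PIListener
  include REXML::StreamListener

  attr_reader :output

  def initialize(context)
    @context = context
    @output = String.new
  end

  # REXML calls this for each <?name instruction?>. Other targets are
  # left alone so they keep whatever meaning they already have.
  def instruction(name, instruction)
    return unless name == "r"
    @output << @context.instance_eval(instruction.to_s.strip).to_s
  end

  # Pass text through unchanged; tags are ignored to keep the sketch short.
  def text(text)
    @output << text
  end
end

listener = PIListener.new(Object.new)
REXML::Document.parse_stream('<h1>The answer is <?r 44 - 2 ?></h1>', listener)
puts listener.output
```

Comments then go back to being plain comments, and only instructions addressed to the chosen target are treated as code.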


David Mitchell

I hate to break this to you John, but the link you posted is to a blank
wiki page. Perhaps you got the wrong link? The page
http://www.rubygarden.org/ruby?RubyInXML doesn't exist.

Yes, you end up with a hairy state machine, but it doesn't mean you lose
the one-line StreamParser.

Cheers

David

John Carter

David Mitchell wrote:
I hate to break this to you John, but the link you posted is to a blank wiki
page. Perhaps you got the wrong link? The page
http://www.rubygarden.org/ruby?RubyInXML doesn't exist.

Nope, seems to be all there when I look. From two separate machines.

David Mitchell wrote:
Yes, you end up with a hairy state machine, but it doesn't mean you lose the
one-line StreamParser.

True. I meant the choice was a hairy state machine xor using the "eat whole
doc" REXML parser instead of the stream parser.



Daniel Brockman

John Carter said:
I can validate my XHTML documents and know that they conform exactly to
the standard, and hence will render properly on a wide set of browsers.

If only it were that simple...
 
George Moschovitis

Hello,

Nitro already implements this. Have a look at www.nitrohq.com.

The normal way to do this is:

<ul>
<?r for item in items ?>
<li>#{item.title}</li>
<?r end ?>
</ul>

If you include the morphing shader you can also write it as:

<ul>
<li each="item in items">#{item.title}</li>
</ul>

And since Nitro always gives you one more option (sic), you can do:

<ul>
<% for item in items %>
<li>#{item.title}</li>
<% end %>
</ul>

Suit yourself ;-)

Of course Nitro can do so much more. You can use XSLT on top of your
XHTML page, or the new cool Elements system (similar to JSP tag
libraries).


regards,
George.
 
John Carter

You have probably been caught by the RubyGarden tarpit:

http://www.ruby-talk.org/cgi-bin/scat.rb/ruby/ruby-talk/137405

Hmm. Unfortunately I live behind a large corporate firewall. So reverse
DNS is never going to work right.

I tried putting in other wiki links to that page and found that I
had to wade through the entire existing page and find every existing http:
and convert it to HTTP:.

I gave up on that pretty fast.


Jim Weirich

John Carter wrote:
Hmm. Unfortunately I live behind a large corporate firewall. So reverse
DNS is never going to work right.

As long as it resolves back to your corporate firewall, it should be OK. I
live behind an extremely unfriendly firewall, and it causes no problems for
the wiki.

And if it does cause a problem, just define a preferences setting (which
sets a cookie in your browser). That will make you avoid the tarpit as well.

John Carter wrote:
I tried putting in other wiki links to that page and found that I
had to wade through the entire existing page and find every existing http:
and convert it to HTTP:.

Yeah, that is pretty annoying. I disabled that feature tonight. Although
helpful in the short run, it did nothing for long term spam avoidance. You
should be able to post links using http: again.
 
George Moschovitis

Except that isn't valid XML anymore; maybe a templating library
along Amrita2 or XTemplate would be more appropriate?

Yep, it isn't valid XHTML and this is discouraged. I find it useful
when defining, for example, email templates, or if you would like to use
some Rails code without many changes...
 
