Windows ActiveState Perl: MSXML transformNodeToObject finally succeeded



I have finally found a solution for my long-standing problem
with Xslt-transformation under Windows ActiveState Perl and
I thought that other people might have the same problem so I
would like to share my solution with the group. I hope you
don't mind this long post, here is the story:

I had read an article by Shawn Ribordy on
('MSXML, It's Not Just for VB Programmers Anymore')
in which he described how to do Xslt-transform on XML-files
using the "transformNodeToObject" method of a Win32::OLE

The following lines are copied straight from his article:

"Great...", I thought, "...let's try this at home".

So I sat down at my Windows XP computer (with ActiveState
Perl v5.8.7 and the latest Msxml2.DOMDocument.4.0/SP2
installed), fired up notepad.exe and pasted Shawn's example
straight into my Perl program. His example worked, but that
was as far as it got!

When I started to use my own XSLT stylesheet, things went
seriously wrong. I knew that my own stylesheets had some
problems, but I hoped (and expected) that the
transformNodeToObject() method would throw something useful
at me, which unfortunately it did not. The problem was
that Shawn's example did not have any error handling.

I googled every possible combination of (perl, xslt, msxml,
win32, errorhandling) under the sun and I searched CPAN to
destruction, but to no avail.

After months of "pulling out my hair", I finally
stumbled upon the following variables/functions, which
allowed me to correctly and reliably test for (almost)
every possible error condition:
- Win32::OLE->LastError()
- $doc->{parseError}->{reason}
- $doc->{parseError}->{line}
- $doc->{parseError}->{linePos}
- $doc->{parseError}->{srcText}
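As a rough illustration of how the parseError fields listed above can be folded into one readable message, here is a minimal sketch. The helper name format_parse_error is hypothetical, and the hash below is only a stand-in with made-up values for a document object whose Load() failed, so the snippet runs without Win32::OLE.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): build a one-line
# message from an object that exposes the MSXML parseError fields.
sub format_parse_error {
    my ($doc, $label) = @_;
    my $err    = $doc->{parseError};
    my $reason = defined $err->{reason} ? $err->{reason} : '';
    $reason =~ s/\s+$//;    # drop the trailing CR/LF of the reason text
    return sprintf q{%s: line %s, pos %s, reason: %s, text: '%s'},
        $label, $err->{line}, $err->{linePos}, $reason, $err->{srcText};
}

# Stand-in for a failed $doc->Load(): a plain hash shaped like the
# OLE object, filled with made-up sample values.
my $fake_doc = { parseError => {
    reason  => "sample reason text\r\n",
    line    => 3,
    linePos => 16,
    srcText => ' <data>aaaa</data1>',
} };

print format_parse_error($fake_doc, 'test2.xml'), "\n";
```

With a real Win32::OLE document object, the same hash-style property access should apply, but that part is an assumption the sketch does not exercise.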

With the improved error-handling, I was now able to
experiment with different situations in my xslt-stylesheets.
Here is what I experienced:

XML-input-files: Use <?xml version='1.0' encoding='...'?>
In your XML-Input-Files, always specify the encoding in the
first line <?xml version='1.0' encoding='...'?>. This is
typically 'ISO-8859-1' for Western European text (plain ASCII
is a subset of it), but could also be 'UTF-8' or 'UTF-16' if
your XML-Input-File is saved that way.
If you don't respect the correct encoding, you will end up
with an error ("An invalid character was found in text

XSLT-files: Use <?xml version='1.0' encoding='...'?>
In your XSLT-Files, always specify the encoding in the first
line <?xml version='1.0' encoding='...'?>.
Strictly speaking, it is not necessary to specify the encoding
in the first line of the XSLT-file; a simple
<?xml version='1.0'?> is enough. But by doing so, you let
MSXML guess the encoding, which it does correctly in about 95%
of the cases. In the remaining 5% of the cases,
MSXML gets it wrong and you end up with an error
("Switch from current encoding to specified encoding not
supported"). Consequently, I suggest always specifying the
actual encoding directly in the first line of the XSLT-file.
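To make the "always declare the encoding" advice checkable before handing a file to MSXML, one could inspect the first line of the XML or XSLT file. The following is only a sketch: the helper name declared_encoding is hypothetical, and it assumes the declaration sits on the first line with no byte-order mark in front of it.

```perl
use strict;
use warnings;

# Hypothetical helper: return the encoding named in an XML declaration,
# or undef when the declaration carries no encoding attribute.
sub declared_encoding {
    my ($first_line) = @_;
    return undef unless defined $first_line;
    if ($first_line =~ /^<\?xml\b[^>]*\bencoding\s*=\s*(["'])([^"']+)\1/) {
        return $2;
    }
    return undef;
}

for my $line (q{<?xml version='1.0' encoding='ISO-8859-1'?>},
              q{<?xml version='1.0'?>}) {
    my $enc = declared_encoding($line);
    print defined($enc) ? "declares $enc\n" : "no encoding declared\n";
}
```

A caller could read the first line of each input file, pass it to this helper, and refuse to continue when it returns undef.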

XSLT-files: Use <xsl:output encoding='ISO-8859-1'/>
It is more convenient to use
<xsl:output encoding='ISO-8859-1'/> in your XSLT-file. This
works very well, even with accented characters and umlauts.
You can use other encodings (such as
<xsl:output encoding='UTF-8'/>), and the XML-Output-File
will be displayed correctly in Internet Explorer, but then
you will find it inconvenient that Notepad no longer displays
the XML-Output-file correctly.

XSLT-files: Use <xsl:output method='xml'/>
If you want to generate HTML, you can do so easily by
generating an XML file with its tags in HTML syntax
(such as <p>, <table>, <hr/>, etc.). However, do not
attempt to use <xsl:output method='html'/> in your XSLT-file;
use <xsl:output method='xml'/> instead (even if you want to
generate 'Html', think of 'XHtml' and use
<xsl:output method='xml'/>). You may end up in tears when you
discover that with <xsl:output method='html'/>, your
encoding does not work the way you want it to. And you might
even discover that '&#160;' and/or '&nbsp;' will cause an error
after having erased your output-file! Why is that so? I
don't know.
The ultimate rule is: never use 'html' as your method in
<xsl:output method='...'/>; use
<xsl:output method='xml'/> at all times.

XSLT-files: Use <xsl:output indent='yes'/>
This advice is more for convenience than anything else. If
you specify <xsl:output indent='yes'/> and you look at your
XML-Output-file with Notepad, you will find that its
linebreaks are more conveniently located than they would
have been without <xsl:output indent='yes'/>. It is still
not perfect, but it is better. So finally, the
<xsl:output... /> line in your XSLT-file should look like
<xsl:output method='xml' indent='yes' encoding='ISO-8859-1'/>

In XSLT-files: Use '&#160;' instead of '&nbsp;'
The entity '&nbsp;' does not work with MSXML: it is an HTML
entity, and XML predefines only &lt;, &gt;, &amp;, &apos;
and &quot;. If you want your XSLT-file to generate a
non-breaking space, use '&#160;' instead.

....that's the end of my list.

For those of you who want to try, here is a test program:

use strict;
use warnings;
use Win32::OLE;

my $MxErr;

testcase(1, 'transformation succeeds');
testcase(2, 'unbalanced tags in *.xml');
testcase(3, 'unbalanced tags in *.xsl');
testcase(4, 'syntax error in *.xsl');
testcase(5, 'output method=html fails');

sub testcase {
    my ($Case, $Description) = @_;

    # Write test$Case.xml and trf$Case.xsl for this test case
    makefiles($Case);

    print "Testcase no $Case: $Description\n";

    print "\n\nThis is the xml file 'test$Case.xml':\n";
    print "=============================================\n";
    system("type test$Case.xml");
    print "=============================================\n";

    print "\n\nThis is the xsl file 'trf$Case.xsl':\n";
    print "=============================================\n";
    system("type trf$Case.xsl");
    print "=============================================\n";

    my $success = TransformXslt(xml  => "test$Case.xml",
                                xslt => "trf$Case.xsl",
                                out  => "output$Case.html");

    if ($success) {
        print "\n\nTransformXslt succeeded, result:\n";
        print "=========================================\n";
        system("type output$Case.html");
        print "=========================================\n";
    }
    else {
        print "\n\nProblem with TransformXslt:\n";
        print "=========================================\n";
        print "$MxErr\n";
        print "=========================================\n";
    }
    print "\n";
}

sub makefiles {
    my ($Case) = @_;

    # Each test case injects one defect into otherwise valid files
    my $XData   = ($Case == 2 ? 'data1'  : 'data');
    my $XTitle  = ($Case == 3 ? 'title1' : 'title');
    my $XFunc   = ($Case == 4 ? 'r([?'   : '.');
    my $XMethod = ($Case == 5 ? 'html'   : 'xml');

    # Write the XML input file
    open my $xml_fh, '>', "test$Case.xml"
        or die "err write test$Case.xml: $!";
    print $xml_fh qq{<?xml version="1.0"}.
                  qq{ encoding="ISO-8859-1"?>\n};
    print $xml_fh qq{<index>\n};
    print $xml_fh qq{  <data>aaaa</$XData>\n};
    print $xml_fh qq{  <data>bbbb</data>\n};
    print $xml_fh qq{</index>\n};
    close $xml_fh;

    # Write the XSLT stylesheet
    open my $xsl_fh, '>', "trf$Case.xsl"
        or die "err write trf$Case.xsl: $!";
    print $xsl_fh qq{<?xml version="1.0"}.
                  qq{ encoding="ISO-8859-1"?>\n};
    print $xsl_fh qq{<xsl:stylesheet version="1.0"\n};
    print $xsl_fh qq{xmlns:xsl="}.
                  qq{http://www.w3.org/1999/XSL/Transform">\n};
    print $xsl_fh qq{  <xsl:output method="$XMethod" indent=}.
                  qq{"yes" encoding="ISO-8859-1"/>\n};
    print $xsl_fh qq{  <xsl:template match="/">\n};
    print $xsl_fh qq{    <html>\n};
    print $xsl_fh qq{      <body>\n};
    print $xsl_fh qq{        <title>Test</$XTitle>\n};
    print $xsl_fh qq{        <p>nonbreaking&#160;space</p>\n};
    print $xsl_fh qq{        <hr/>\n};
    print $xsl_fh qq{        <xsl:for-each select="index/data">\n};
    print $xsl_fh qq{          <p>Test: *** <xsl:value-of}.
                  qq{ select="$XFunc"/> ***</p>\n};
    print $xsl_fh qq{        </xsl:for-each>\n};
    print $xsl_fh qq{      </body>\n};
    print $xsl_fh qq{    </html>\n};
    print $xsl_fh qq{  </xsl:template>\n};
    print $xsl_fh qq{</xsl:stylesheet>\n};
    close $xsl_fh;
}

sub TransformXslt {
    # Called with named pairs (xml => ..., xslt => ..., out => ...),
    # so the values sit at the odd positions of @_
    my ($xml_input_file, $xslt_file, $xml_output_file)
        = ($_[1], $_[3], $_[5]);
    $MxErr = '';
    my $DomDocument = 'Msxml2.DOMDocument.4.0';

    # Load the document (Xml-Input-File)
    my $xml_input_doc = Win32::OLE->new($DomDocument);
    unless ($xml_input_doc) {
        $MxErr = qq{Mx-0040: Couldn't create Win32::OLE}.
                 qq{ $DomDocument for XML-Input-File}.
                 qq{ "$xml_input_file"};
        return;
    }

    $xml_input_doc->{async} = 'False';
    $xml_input_doc->{validateOnParse} = 'True';
    if (!$xml_input_doc->Load($xml_input_file)) {
        my $Rs = $xml_input_doc->{parseError}->{reason};
        $Rs =~ s/\r//g; chomp $Rs;
        my $Ln = $xml_input_doc->{parseError}->{line};
        my $Ps = $xml_input_doc->{parseError}->{linePos};
        my $Tx = $xml_input_doc->{parseError}->{srcText};
        $MxErr = qq{Mx-0060: XML-Input-File}.
                 qq{ "$xml_input_file"}.
                 qq{ did not load for $DomDocument at line}.
                 qq{ $Ln, pos $Ps, reason: $Rs, text: '$Tx'};
        return;
    }

    # create Output-object
    my $xml_output_doc = Win32::OLE->new($DomDocument);
    unless ($xml_output_doc) {
        $MxErr = qq{Mx-0055: Couldn't create Win32::OLE}.
                 qq{ $DomDocument for XML-Output-File}.
                 qq{ "$xml_output_file"};
        return;
    }

    # Load the Stylesheet (Xsl-File)
    my $xslt_doc = Win32::OLE->new($DomDocument);
    unless ($xslt_doc) {
        $MxErr = qq{Mx-0050: Couldn't create Win32::OLE}.
                 qq{ $DomDocument for XSLT-File "$xslt_file"};
        return;
    }

    $xslt_doc->{async} = 'False';
    $xslt_doc->{validateOnParse} = 'True';
    if (!$xslt_doc->Load($xslt_file)) {
        my $Rs = $xslt_doc->{parseError}->{reason};
        $Rs =~ s/\r//g; chomp $Rs;
        my $Ln = $xslt_doc->{parseError}->{line};
        my $Ps = $xslt_doc->{parseError}->{linePos};
        my $Tx = $xslt_doc->{parseError}->{srcText};
        $MxErr = qq{Mx-0070: XSLT-file "$xslt_file" did not}.
                 qq{ load for $DomDocument at line}.
                 qq{ $Ln, pos $Ps, reason: $Rs, text: '$Tx'};
        return;
    }

    # Do the work: transform xml using an xslt stylesheet
    $xml_input_doc->transformNodeToObject($xslt_doc, $xml_output_doc);
    if (Win32::OLE->LastError()) {
        my $Rs = Win32::OLE->LastError(); $Rs =~ s/\s+/ /g;
        $MxErr = qq{Mx-0080: XSLT-file "$xslt_file" has}.
                 qq{ syntax-errors for $DomDocument, }.
                 qq{reason: $Rs};
        return;
    }

    # Save the done work to the output-file
    $xml_output_doc->save($xml_output_file);
    if (Win32::OLE->LastError()) {
        my $Rs = Win32::OLE->LastError(); $Rs =~ s/\s+/ /g;
        $MxErr = qq{Mx-0090: Can't save to output-file}.
                 qq{ "$xml_output_file" for $DomDocument, }.
                 qq{reason: $Rs};
        return;
    }

    # "-z" tests for empty file, which is considered to be
    # a fatal error
    if (-z $xml_output_file) {
        $MxErr = qq{Mx-0100: A fatal error occurred in either}.
                 qq{ your XSLT-file "$xslt_file", or in}.
                 qq{ your XML-input-file "$xml_input_file",}.
                 qq{ the output-file "$xml_output_file" will}.
                 qq{ be empty.};
        return;
    }

    return 1;
}


tuser said:
I have finally found a solution for my long-standing problem
with Xslt-transformation under Windows ActiveState Perl and
I thought that other people might have the same problem so I
would like to share my solution with the group.

Thank you very much for posting this. While I doubt that I will ever
have any need for your specific solution, I hope that you will serve as
an example to others. Many times when researching a problem, I will
find USENET posts from people with the same problem as me, but never any
hint of how it was eventually solved. I sincerely wish that others will
remember this post and share whatever solutions they find for their


I have finally found a solution for my long-standing problem
with Xslt-transformation under Windows ActiveState Perl and
I thought that other people might have the same problem so I
would like to share my solution with the group. I hope you
don't mind this long post, here is the story:

I had read an article by Shawn Ribordy on
('MSXML, It's Not Just for VB Programmers Anymore')
in which he described how to do Xslt-transform on XML-files
using the "transformNodeToObject" method of a Win32::OLE
Good job! You have used a bunch of modules.
Style sheet transforms? I'm willing to bet you don't know
a rats ass about markup at all !! You've quoted code and
folks that do though...
I wouldn't hire you to clean the toilets!

Tad McClellan

sub TransformXslt {
my ($xml_input_file, $xslt_file, $xml_output_file)
= ($_[1], $_[3], $_[5]);

An "array slice" would make that much prettier:

my ($xml_input_file, $xslt_file, $xml_output_file) = @_[1,3,5];


Tad said:
sub TransformXslt {
my ($xml_input_file, $xslt_file, $xml_output_file)
= ($_[1], $_[3], $_[5]);

An "array slice" would make that much prettier:

my ($xml_input_file, $xslt_file, $xml_output_file) = @_[1,3,5];

Thanks for your input. I hadn't thought of using array slices in Perl;
I will use that in my program.
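Since TransformXslt is called with name/value pairs (xml => ..., xslt => ..., out => ...), another option besides slicing @_ by position is to read the arguments into a hash and take a hash slice. This is only a sketch of that alternative; the sub name transform_args is hypothetical and is not part of the posted program.

```perl
use strict;
use warnings;

# Hypothetical helper: accept name/value pairs in any order and
# return the three file names in a fixed order.
sub transform_args {
    my %arg = @_;
    for my $key (qw(xml xslt out)) {
        die "missing argument '$key'\n" unless defined $arg{$key};
    }
    return @arg{qw(xml xslt out)};    # hash slice
}

my ($in, $xsl, $out) = transform_args(
    out  => 'output1.html',
    xml  => 'test1.xml',
    xslt => 'trf1.xsl',
);
print "$in $xsl $out\n";
```

Unlike @_[1,3,5], this does not depend on the caller listing the pairs in a particular order, at the cost of a few extra lines.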

