Generate html report from directory of XML docs

sysxperts

Hi,

I have a mail server that generates an archive in a directory for every
message sent or received. Each archive has an associated XML file
with <sender>, <receiver>, <subject> and other email-related tags, and
all files have exactly the same format. I would like to generate reports in
a web page based on the content of these XML files, but I am not sure where
to start. I know how to display an individual XML file in a browser
by linking an XSLT stylesheet to it, but not how to go about
traversing the directory to collect the info and display the aggregate
content as a report. Any pointers on the best approach and appropriate
documentation would be greatly appreciated. All of the tutorials and
samples I've found transform a single XML source file and
give no direction for handling multiple XML sources.
 
William Park

Shell is your friend, especially when dealing with files. Post the sample
input you have and the output you want, and we'll go from there.

--
William Park <[email protected]>, Toronto, Canada
ThinFlash: Linux thin-client on USB key (flash) drive
http://home.eol.ca/~parkw/thinflash.html
BashDiff: Super Bash shell
http://freshmeat.net/projects/bashdiff/
 
sysxperts

Thanks for the quick reply.

Unfortunately I don't currently have access to the files in question, but I
think I can elaborate further based on your response.
As noted previously, each XML source contains tags for <sender>,
<receiver> and so on, and there are on average 5,000 of these files
generated per day. So let's say I don't really care about the formatting of
the text, other than having the contents of all files output to HTML in
tabular format, so that I get a listing of all emails transmitted, grouped
by <sender>, and end up with something like the table below. The source file
naming convention is ARCH<msg-id>.XML.

Sent By    Received By    Subject       <- headings
user1      bob            test
user1      joe            test2
user2      bob            ..........

thx
 
William Park

Printing out is pretty easy. But, I can't parse the input if I don't
have it.
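
For illustration only, here is a minimal Python sketch of the kind of report described above. No sample file was posted, so the layout is an assumption: each ARCH<msg-id>.XML file is taken to contain one <sender>, one <receiver> and one <subject> element somewhere under its root. The function names (read_message, build_report) are hypothetical.

```python
import html
import xml.etree.ElementTree as ET
from pathlib import Path

# Tag names mentioned in the thread; their nesting in the real files is unknown.
FIELDS = ("sender", "receiver", "subject")


def read_message(path):
    """Return (sender, receiver, subject) for one archive XML file."""
    root = ET.parse(path).getroot()
    # './/tag' searches at any depth below the root; missing tags become "".
    return tuple((root.findtext(f".//{tag}") or "").strip() for tag in FIELDS)


def build_report(directory):
    """Build an HTML table of all ARCH*.XML files, ordered by sender."""
    rows = []
    for path in sorted(Path(directory).glob("ARCH*.XML")):
        try:
            rows.append(read_message(path))
        except ET.ParseError:
            continue  # skip malformed files rather than abort the whole report
    rows.sort()  # tuples compare on sender first, so equal senders end up adjacent
    lines = [
        "<table>",
        "<tr><th>Sent By</th><th>Received By</th><th>Subject</th></tr>",
    ]
    for row in rows:
        cells = "".join(f"<td>{html.escape(value)}</td>" for value in row)
        lines.append(f"<tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)
```

Calling build_report("/path/to/archive") returns the table as a string, which can be written to an .html file. At about 5,000 small files per day, parsing each file in turn like this should be workable, but that is untested against the real data.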

 
Philippe Poulard

hi,

Hmm, this is a job for Active Tags!

Active Tags let you easily mix functional tags with literals,
much as XSLT does, and you can simply aggregate several XML files into
a single one.

Have a look at the RefleX web site; there are complete examples in the
tutorial section:
http://reflex.gforge.inria.fr/
RefleX is a Java implementation of Active Tags.

A simple Active Sheet looks like this:

<?xml version="1.0" encoding="iso-8859-1"?>
<xcl:active-sheet
    xmlns:io="http://www.inria.fr/xml/active-tags/io"
    xmlns:xcl="http://www.inria.fr/xml/active-tags/xcl">
    <!-- where the XML files are -->
    <xcl:set name="base-dir" value="{ io:file( 'file:///path/to/dir' ) }"/>

    <!-- create a single document that contains all the others -->
    <xcl:document name="all" type="SAX">
        <!-- using SAX is better for processing a large number of files -->
        <document><!-- the root element, as a literal -->
            <!-- select all XML files under the base dir -->
            <xcl:for-each name="file" select="{ $base-dir//*[@io:is-file][@io:extension='xml'] }">
                <xcl:parse name="xml" source="{ $file }"/>
                <!-- put the parsed file in the global document -->
                { $xml }
            </xcl:for-each>
        </document>
    </xcl:document>
    <!-- XML to HTML -->
    <xcl:transform output="file:///path/to/output.html" source="{ $all }"
        stylesheet="file:///path/to/stylesheet.xsl"/>
    <!-- if you omit the stylesheet attribute, the XML document
         will simply be serialized -->
</xcl:active-sheet>

Of course, depending on which files you select, you can create one HTML
output per day, one per XML file, or a single one, according to your needs.
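
For readers without RefleX at hand, the aggregation step of the Active Sheet above (wrapping every parsed file in a single <document> root) can be approximated in plain Python. This is only a rough sketch and not part of Active Tags: the function names are hypothetical, the extension is matched case-insensitively because the archives are named ARCH<msg-id>.XML, and the whole result is built in memory, which is closer to the "DOM" type than to "SAX".

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def aggregate(base_dir):
    """Return one <document> element holding the root of every XML file under base_dir."""
    combined = ET.Element("document")
    # rglob("*") walks sub-directories too, like $base-dir//* in the Active Sheet.
    for path in sorted(Path(base_dir).rglob("*")):
        if path.is_file() and path.suffix.lower() == ".xml":
            combined.append(ET.parse(path).getroot())
    return combined


def write_aggregate(base_dir, output_path):
    """Serialize the combined document, e.g. as input for an XSLT processor."""
    tree = ET.ElementTree(aggregate(base_dir))
    tree.write(output_path, encoding="utf-8", xml_declaration=True)
```

The file written by write_aggregate can then be transformed with a single stylesheet, just like any other single XML source. Write it outside base_dir so that a later run does not pick it up as input.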

There are also ways to do the same within a web server instead of from
the command line as shown above.

NOTE: for the moment, the value of the "type" attribute in
<xcl:document> is "SAX" or "DOM", but it might change to "event" or
"tree" in a future release, as specified in the documentation of the XCL
module.

Enjoy!

--
Cordialement,

///
(. .)
--------ooO--(_)--Ooo--------
| Philippe Poulard |
 
