Hi Shiperton,
Since you posted in an Excel newsgroup, I had assumed that you
would be willing to run your own VBA macros, but you are still
looking to generate all the code you don't want and then try
to strip it out afterwards.
As I understand your question, you are specifically interested in
converting an Excel worksheet into HTML. My macros are set
up to work from a selection but you can change them to whatever
you want on your own computer.
You looked at Tidy
http://tidy.sourceforge.net/
though what they gave you was a link to a zip file.
I use HTML-Kit, which you will see as a link on the Tidy webpage. It incorporates
Tidy and I am quite happy with it. But HTML-Kit is for
editing HTML code and (PF9) checking its syntax (using Tidy). So I
doubt very much that Tidy is what you are looking for, as
you would have to provide it with your completed HTML code,
and it does not strip out Office code.
The Office 2000 HTML filter should work with the HTML output
from Excel 2002 just like it does with the HTML output from
Excel 2000. Basically it has nothing to do with Excel; you run
it afterwards. It is just going to eliminate the round-tripping code
(Excel --> HTML --> Word --> Access --> Excel --> Word ); it is
not going to eliminate the extra garbage that maintains font size,
cell widths, etc., which you do not want.
If you are into writing your own HTML, I would once more suggest
taking a look at my webpage on HTML conversion from Excel
http://www.mvps.org/dmcritchie/excel/xl2html.htm
I have generated some sample HTML output from an Excel file:
Save As HTML from Excel produced 204 KB, while the macros
at my site produced 59 KB. You can check out the files yourself at:
http://www.mvps.org/dmcritchie/excel/xl2html.htm#comparison
Instructions to install macro coding
http://www.mvps.org/dmcritchie/excel/getstarted.htm
The code is at
http://www.mvps.org/dmcritchie/excel/code/xl2htmlx.txt
I write my own HTML code, and use a macro to generate the tables
needed from the current selection: without the gray row and
column headings (macro XL2HTML) or with the headings
(macro XL2HTMLx).
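To give a rough idea of what generating the table directly looks
like, here is a minimal sketch in Python. It is not the VBA code of
XL2HTML or XL2HTMLx; the function names selection_to_html and
column_letter are made up for illustration, and the selection is
assumed to be a rectangular list of rows already read from the sheet.

```python
from html import escape


def column_letter(index):
    """Convert a 0-based column index to an Excel-style letter (0 -> A, 26 -> AA)."""
    letters = ""
    index += 1
    while index > 0:
        index, rem = divmod(index - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def selection_to_html(rows, first_row=1, first_col=0, headings=False):
    """Build a plain HTML table from a rectangular list of rows.

    headings=False emits the cells only; headings=True adds a top row of
    column letters and a left column of row numbers (hypothetical
    stand-ins for the gray Excel headings).
    """
    lines = ['<table border="1">']
    if headings:
        width = len(rows[0]) if rows else 0
        letters = "".join(
            f"<th>{column_letter(first_col + c)}</th>" for c in range(width)
        )
        # The empty <th> is the corner cell above the row numbers.
        lines.append(f"<tr><th></th>{letters}</tr>")
    for r, row in enumerate(rows):
        # Escape cell text so characters like & and < stay valid HTML.
        cells = "".join(f"<td>{escape(str(value))}</td>" for value in row)
        if headings:
            cells = f"<th>{first_row + r}</th>" + cells
        lines.append(f"<tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [["Item", "Qty"], ["Nuts & bolts", 3]]
    print(selection_to_html(sample))
    print(selection_to_html(sample, headings=True))
```

The output carries no per-cell style, class, or width attributes,
which is where most of the size difference comes from.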
Most of the tables on my pages were generated with earlier versions of
the macro. I broke down and did add color and alignment,
which is a small tradeoff compared to the 3 to 10 times as much
code you get from Excel or Front Page.
The current Microsoft Office solution is to generate all the
horrendous code with all the round-tripping code and then
run the Office 2000 HTML Filter
to remove the round-tripping code. But it is still going
to have the junk to make it look just like an Excel page,
overriding formatting that HTML generally does much better
when left to its own devices.