On Sun, 11 Sep 2005, Roy Schestowitz wrote (seen on alt.html):
[...]
* Fragment the output as required, probably by hand (WYSIWYG programs
like Word have no notion of structure or semantics)
This isn't by any means aimed at you personally, but your posting
triggered a response from me, and it looks as if knowledge is proceeding
backwards.
Proper use of MS Word uses Styles, oriented towards the structure of the
document. (If I had my way, I'd rip the direct styling buttons out of the
main menu of Word, and hide them away in an Advanced Users menu). Such
properly-made Word documents are reasonably capable of being converted
well to structural HTML, and a stylesheet suitable for web use can then be
applied (it usually won't be the same "style sheet" (= style template) as
would be suitable for a printed Word document, of course!).
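
To illustrate the principle, here is a minimal sketch of a style-driven conversion. It assumes the document has already been read into (style name, text) pairs; the style names, the STYLE_TO_TAG table and the paragraphs_to_html function are all hypothetical, and this is not how rtftohtml or any other particular converter is implemented.

```python
from html import escape

# Hypothetical mapping from Word paragraph style names to HTML elements.
# A real converter would let the user configure this table.
STYLE_TO_TAG = {
    "Heading 1": "h1",
    "Heading 2": "h2",
    "Normal": "p",
    "Quote": "blockquote",
}


def paragraphs_to_html(paragraphs):
    """Convert (style_name, text) pairs into structural HTML, one element per line."""
    lines = []
    for style, text in paragraphs:
        tag = STYLE_TO_TAG.get(style, "p")  # unknown styles fall back to <p>
        lines.append(f"<{tag}>{escape(text)}</{tag}>")
    return "\n".join(lines)
```

The output carries no fonts, sizes or colours at all; those would come from whatever web stylesheet is applied afterwards.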
I had some experience, around 1997-8, with the (payware) rtftohtml program
- subsequently renamed and marketed under the company name Logictran - it
had this pretty-much sorted out. I must admit I haven't got experience of
it since the change of name, but I can say that the principles of the
original program seemed to be what I was looking for, unlike most of the
other pseudo-WYSIWYG garbage from other places (that offended all sense of
what is suitable for the WWW).
With that rtftohtml program, decently structured Word could be turned into
decently structured HTML, and split on chapter or section headings quite
automatically, with HTML indexes and a table of contents generated
automatically. OK, there were some rough edges, but at least the
principles showed up just fine. I find it sad that some 7 years later we
seem to have fallen back to the stone age of direct styling and
pseudo-WYSIWYG in most of the Word conversions that I have seen.
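
The splitting described above can be sketched roughly as follows, again assuming the document is available as (style name, text) pairs. The function names, the "Heading 1" split style and the sectionN.html file naming are assumptions made for illustration only, not a description of what rtftohtml actually did.

```python
from html import escape


def split_into_sections(paragraphs, split_style="Heading 1"):
    """Group (style_name, text) pairs into sections, starting a new one at each heading."""
    sections = []
    for style, text in paragraphs:
        if style == split_style:
            sections.append({"title": text, "body": []})
        elif sections:
            sections[-1]["body"].append(text)
        # Text before the first heading is ignored in this sketch.
    return sections


def build_toc(sections):
    """Build an HTML list linking to one (hypothetical) file per section."""
    items = [
        f'<li><a href="section{i}.html">{escape(s["title"])}</a></li>'
        for i, s in enumerate(sections, start=1)
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

None of this is possible if the headings are merely paragraphs that somebody made bold and 18pt by hand; the splitter has nothing to key on.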
[Note - there are other programs called rtftohtml or rtf2html - it may be
that some of them do a similar job; I can't speak for or against them.
I'm just commenting as a reasonably satisfied user of version 4 of this
particular program from around 1998 onwards.]