Andreas Wenzke said:
Care to elaborate a little on this?
I separate the code into sub-units.
To parse an XML file, the obvious sub-units would be: a
character source (a source of Unicode code points),
then a scanner (lexical analyzer), then a parser (syntactic
analyzer). But you also need to know whether the parser is
to build a DOM (document object model), to call client
functions (like a SAX parser), or to do something else.
Anyway, between those units, there are interfaces.
Interfaces are also known as APIs and, similar to abstract
data types, they are sets of documented calls. So I start by
writing them.
Only then do I start to write implementations of these
calls.
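As a rough sketch of what "interfaces first" could look like in C++:
the three units become abstract classes, and only one trivial
implementation (an in-memory character source) is written afterwards.
All names here (CodePointSource, Scanner, ContentHandler, StringSource,
Token, TokenKind) are hypothetical and chosen for illustration; a real
XML scanner would need many more token kinds, and the callback set is
only loosely modeled on SAX.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical interface: a source of Unicode code points.
class CodePointSource {
public:
    virtual ~CodePointSource() = default;
    // Stores the next code point in `out` and returns true,
    // or returns false at the end of the input.
    virtual bool next(char32_t& out) = 0;
};

// Hypothetical token kinds; deliberately incomplete.
enum class TokenKind { TagOpen, TagClose, Text, End };

struct Token {
    TokenKind kind;
    std::u32string text;
};

// Hypothetical interface: the scanner (lexical analyzer)
// turns code points into tokens.
class Scanner {
public:
    virtual ~Scanner() = default;
    virtual Token nextToken() = 0;
};

// Hypothetical interface: SAX-like client callbacks that a
// parser (syntactic analyzer) would invoke.
class ContentHandler {
public:
    virtual ~ContentHandler() = default;
    virtual void startElement(const std::u32string& name) = 0;
    virtual void endElement(const std::u32string& name) = 0;
    virtual void characters(const std::u32string& text) = 0;
};

// One simple implementation, written only after the
// interface above was fixed: code points from memory.
class StringSource : public CodePointSource {
public:
    explicit StringSource(std::u32string s) : data_(std::move(s)) {}

    bool next(char32_t& out) override {
        if (pos_ >= data_.size()) return false;
        out = data_[pos_++];
        return true;
    }

private:
    std::u32string data_;
    std::size_t pos_ = 0;
};
```

A file-based or UTF-8-decoding source can later be swapped in behind
CodePointSource without touching the scanner or the parser.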
Some German-language notes of mine about software design:
http://www.purl.org/stefan_ram/pub/aufbau_grosser_programme
Andreas Wenzke said:
The file-reading part is only a very small part of the whole project.
Implementing UTF-8 parsing isn't likely to have any benefits for my
program (strings will be stored "as is" anyway) and probably isn't going
to earn me many bonus points. However, it would probably make things
more complicated as I'd have to distinguish between ANSI and Unicode chars.
The XML specification says:
»All XML processors MUST accept the UTF-8 and UTF-16
encodings of Unicode [Unicode]« (uppercase emphasis
was done by the W3C, not by me [Stefan Ram])
http://www.w3.org/TR/REC-xml/
(ISO-8859-1 processing, on the other hand, is not required.)
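For a sense of scale, here is a minimal sketch of the UTF-8 half of
that requirement: a function that decodes one UTF-8 sequence into a
code point. The name decodeUtf8 and its signature are my own
assumptions, not taken from any library. It rejects overlong forms,
surrogates and values above U+10FFFF, but it does not deal with a
byte order mark, with UTF-16, or with the encoding declaration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: decodes the UTF-8 sequence starting at
// s[pos]. On success, stores the code point in cp, advances pos
// past the sequence and returns true. On malformed or truncated
// input, returns false and leaves pos unchanged (cp may have
// been overwritten).
bool decodeUtf8(const std::string& s, std::size_t& pos, char32_t& cp) {
    if (pos >= s.size()) return false;

    const unsigned char b0 = static_cast<unsigned char>(s[pos]);
    std::size_t len = 0;   // total length of the sequence in bytes
    char32_t minimum = 0;  // smallest code point allowed for this length

    if (b0 < 0x80)                { cp = b0;        len = 1; minimum = 0; }
    else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; len = 2; minimum = 0x80; }
    else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; len = 3; minimum = 0x800; }
    else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; len = 4; minimum = 0x10000; }
    else return false;  // continuation byte or invalid lead byte

    if (len > s.size() - pos) return false;  // truncated sequence

    for (std::size_t i = 1; i < len; ++i) {
        const unsigned char b = static_cast<unsigned char>(s[pos + i]);
        if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
        cp = (cp << 6) | (b & 0x3F);
    }

    // Reject overlong encodings, surrogates and out-of-range values.
    if (cp < minimum || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return false;

    pos += len;
    return true;
}
```

Calling it in a loop until pos reaches s.size() yields the code
points one by one, so the rest of the program never needs to
distinguish between byte-sized and wider characters.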
Reading the XML specification and then writing a correct
implementation is a huge project. Now, you tell me this is
only a very small part of the whole project. You are to use C++,
but then are not allowed to use C++, you are to read XML,
but then are not required to read XML as it's specified.
Such an attitude of doing a huge project in such a messy way
(calling »C++« what is not C++, calling »XML« what is not XML)
seems to be highly inappropriate for a scientific university.
It would even be inappropriate for any other teaching situation,
like, say, a »university of applied sciences« (»Fachhochschule«).
Let me end this post with a quote from Rob Walling:
»I've known smart developers who don't pay attention to detail.
The result is misspelled database columns, uncommented code,
projects that aren't checked into source control,
software that's not unit tested, unimplemented features,
and so on. All of these can be easily dealt with if
you're building a Google mash-up or a five page website.
But in corporate development each of these screw-ups is
a death knell.
So I'll say it very loud, but I promise I'll only say it once:
I have /never, ever, ever/ seen a great software
developer who does not have amazing attention to detail.«