Input file donkey work.

Malcolm

I volunteered to write a program for someone to calculate Wagner trees.
(A Wagner tree is a phylogenetic tree made by comparing lists of traits,
which are coded as integers, and for which the basal or most primitive value
is known).
It took 2-3 hours to get a tree right, based on this input.

typedef struct
{
    int ntraits;    /* number of traits */
    int nspecies;   /* number of species (excluding ancestor) */
    int *ancestor;  /* traits for ancestor */
    int **traits;   /* traits for species */
    char **names;   /* names of species */
} WAG_INPUT;

Now of course the user can't fill up this structure directly, so I wondered
if it might be nice to define a file format.

Example file

Wagner Tree file
Version 1.0

Traits
No digits
Tail length
Skin Colour = Grey, Dark Grey, Black, Brown

Species
Oinkus woinkus, 2, 4, Black
Equus geegee, 1, 10, Dark Grey
Moocow 2, 8, Brown
Ancestor 5, 1, Grey

Now coding this up is proving to be a complete nightmare, because of course
input is hostile and you have to deal with anything the user throws at you.
I wonder if I am missing a trick (I hardly ever use text input files).
Otherwise it looks like the input parser is going to be substantially bigger
than the rest of the program.
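
One way to keep the parser small is to read the file a line at a time and push every line through the same two helpers: one that splits on commas and trims whitespace, and one that converts a field to an int and rejects anything else. The following is a minimal sketch, not a full parser. The names trim, split_fields and parse_int are made up for illustration, and it assumes every field, including the species name, is followed by a comma.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Trim leading and trailing whitespace in place; returns the trimmed start. */
static char *trim(char *s)
{
    char *end;

    while (isspace((unsigned char)*s))
        s++;
    end = s + strlen(s);
    while (end > s && isspace((unsigned char)end[-1]))
        end--;
    *end = '\0';
    return s;
}

/* Split line at commas into at most maxfields trimmed fields.
   Modifies line. Returns the field count, or -1 if there are too many. */
static int split_fields(char *line, char **fields, int maxfields)
{
    int n = 0;
    char *p = line;

    for (;;) {
        char *comma = strchr(p, ',');

        if (n == maxfields)
            return -1;
        if (comma)
            *comma = '\0';
        fields[n++] = trim(p);
        if (!comma)
            break;
        p = comma + 1;
    }
    return n;
}

/* Convert a whole field to int. Returns 0 on success, -1 on any error
   (empty field, trailing junk, or value out of range). */
static int parse_int(const char *s, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s || *end != '\0' || errno == ERANGE ||
        v < INT_MIN || v > INT_MAX)
        return -1;
    *out = (int)v;
    return 0;
}
```

With helpers like these, a species line is valid only if split_fields returns ntraits + 1 and every numeric field passes parse_int, so most hostile input is rejected in one place rather than by checks scattered through the parser.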
 
Nick Landsberg

Yes, this is grunt work and not as much fun.
<OT>
To save yourself some time, see whether a freeware or
public-domain XML parser is available to you. I hear
that you can define a data schema for it and
have it read the data. But, on second thought,
asking an unsophisticated user to write
XML is ... never mind.
</OT>
 
CBFalconer

It doesn't sound that bad, but the overriding consideration in any
input coding is that IT IS GOING TO CHANGE. So you want to build
something that is easily reconfigured. It looks like you want to
convert each name into an index value, and that may be limiting. A
hash table may handle that conveniently. Then you need some sort
of list of allowable fields, with default values for missing
fields. Then you need a list of allowable entries for each
field. This can probably be handled with enums and range limits,
but that is awkward (in C) when the configuration comes from an
input file.
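
As a rough illustration of the name-to-index step, here is a minimal sketch that keeps the allowable entries for a field in a run-time table instead of a compile-time enum. It uses a plain linear search rather than a hash table, which should be adequate for a handful of values per trait. TRAIT_DEF, same_name and value_index are hypothetical names, and the case-insensitive match is an assumption about what users would expect.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical description of one trait declared in the Traits section. */
typedef struct
{
    const char *name;          /* e.g. "Skin Colour" */
    const char *const *values; /* allowed value names, NULL if numeric */
    int nvalues;               /* number of entries in values */
} TRAIT_DEF;

/* Case-insensitive comparison; returns 1 if the strings match. */
static int same_name(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b; /* equal only if both strings ended together */
}

/* Return the index of text among the trait's allowed values,
   or -1 if it is not an allowed value. */
static int value_index(const TRAIT_DEF *trait, const char *text)
{
    int i;

    for (i = 0; i < trait->nvalues; i++)
        if (same_name(trait->values[i], text))
            return i;
    return -1;
}
```

For the example file, "Skin Colour = Grey, Dark Grey, Black, Brown" would give a table of four values, and "Black" on a species line would be stored as index 2. Swapping the search for a hash table later would not change the interface.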

You will probably need some sort of input order restriction, i.e.
an entry without all ancestors defined is faulty.
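
An order restriction of this kind can be enforced with a small state record that is updated once per line. The sketch below reads the restriction as "define before use": the Traits section must be complete before any Species line is accepted. That reading is an assumption. PARSE_STATE, LINE_STATUS and note_line are hypothetical names, and the caller is assumed to have already consumed the header lines and skipped blank lines.

```c
#include <string.h>

typedef enum { SEC_NONE, SEC_TRAITS, SEC_SPECIES } SECTION;

typedef enum
{
    LINE_OK,
    ERR_NO_SECTION,            /* data line before any section heading */
    ERR_SPECIES_BEFORE_TRAITS, /* Species heading with no traits declared */
    ERR_TRAITS_AFTER_SPECIES   /* Traits heading after species were started */
} LINE_STATUS;

typedef struct
{
    SECTION section; /* section currently being read */
    int ntraits;     /* number of traits declared so far */
} PARSE_STATE;

/* Update the state for one trimmed, non-blank line and report
   whether the line is allowed at this point in the file. */
static LINE_STATUS note_line(PARSE_STATE *st, const char *line)
{
    if (strcmp(line, "Traits") == 0) {
        if (st->section == SEC_SPECIES)
            return ERR_TRAITS_AFTER_SPECIES;
        st->section = SEC_TRAITS;
        return LINE_OK;
    }
    if (strcmp(line, "Species") == 0) {
        if (st->ntraits == 0)
            return ERR_SPECIES_BEFORE_TRAITS;
        st->section = SEC_SPECIES;
        return LINE_OK;
    }
    if (st->section == SEC_NONE)
        return ERR_NO_SECTION;
    if (st->section == SEC_TRAITS)
        st->ntraits++; /* each line in the Traits section declares one trait */
    return LINE_OK;
}
```

Because the trait count is fixed by the time the first species line arrives, each species line can then be checked against ntraits immediately instead of being validated in a second pass.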
 
