Daniel Carrera
Hi all,
I have a difficult problem and I need some smart people to give me a hand.
So I knew where to go for that.
I'm trying to figure out how to write a parser for BibTeX files. It's not
easy. A single BibTeX entry might look like this:
@BOOK{texbook,
author = "Donald O'brian",
title = "The {{\TeX}book}",
publisher = 'Addison-Wesley',
year = 1984,
key = {Don's key}
}
I think you can see the problem.
There is a nested collection of {squiggly} brackets, as well as "double
quotes" and 'single quotes'. I'm not sure either how to represent this
structure or how to parse it.
If I only had to deal with {brackets} I could use an n-ary tree. To
parse it, I would start with one node and move one character at a time.
Every time I saw a { I'd make a new node. Every time I saw a } I would
come back up.
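
To make the brackets-only idea concrete, here is a rough sketch in Python. The names Node and parse_braces are made up for illustration, and it assumes the text contains nothing special except { and } (no quotes, no escaped braces):

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    """One {...} group; children are plain strings or nested Nodes."""
    children: List[Union[str, "Node"]] = field(default_factory=list)


def parse_braces(text: str) -> Node:
    """Build an n-ary tree from text, walking one character at a time."""
    root = Node()
    stack = [root]   # path from the root down to the node being filled
    buf = []         # plain characters seen since the last brace

    def flush() -> None:
        # Store any pending plain text as a child of the current node.
        if buf:
            stack[-1].children.append("".join(buf))
            buf.clear()

    for ch in text:
        if ch == "{":
            flush()
            child = Node()
            stack[-1].children.append(child)
            stack.append(child)      # make a new node and go down
        elif ch == "}":
            flush()
            if len(stack) == 1:
                raise ValueError("unmatched '}'")
            stack.pop()              # come back up
        else:
            buf.append(ch)

    flush()
    if len(stack) != 1:
        raise ValueError("unmatched '{'")
    return root
```

For example, parse_braces(r"The {{\TeX}book}") gives a root holding the string "The " and one child node, which in turn holds a node for \TeX followed by the string "book".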
Now, when you add "double" quotes, the problem becomes more complicated,
but doable. I could first extract all the quotes and use an array where
quoted and non-quoted text alternates (for instance) and then parse using
the brackets to make an n-ary tree.
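
The alternating array could look something like this in Python. Again this is only a sketch: split_on_double_quotes is a made-up name, and it assumes there are no escaped double quotes and that a double quote never appears inside {brackets}:

```python
def split_on_double_quotes(text: str) -> list:
    """Split text so that quoted and non-quoted pieces alternate.

    Even indices (0, 2, ...) hold text outside double quotes,
    odd indices (1, 3, ...) hold text that was inside them.
    """
    parts = text.split('"')
    # Balanced quotes always produce an odd number of pieces.
    if len(parts) % 2 == 0:
        raise ValueError("unbalanced double quote")
    return parts
```

Each piece could then be handed to the bracket parser separately. Note that this says nothing yet about 'single' quotes.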
But if I have 'single' quotes also, things can get very complicated. I
will have to deal with things like:
{Dan's book}
and
"O'brian"
And at this point I am truly at a loss.
I hope one of the more experienced programmers here can offer some
insight.
Thanks a lot,