J
Juliano
Hello, everybody.
I'm a linguist with practical skills on computers/programming.
We've been working with an ontology at my department, and now I need
to create a GUI viewer for the flat file we have.
I tried to write an Ontology class which manages the data read and
parsed from the flat file, but it takes a relatively long time.
Besides, we have plans to set up a website for online display of said
ontology. So, I have been being pushed towards changing the basic plan
and build a DB so that data access will be faster and easier for both
the desktop GUI and the web app. Right now, I'm trying to work with
sqlite, since it can be used as a separate file for the GUI and as a
DB for Django (which may be the choice for the web interface).
I have been redaing some books on DBs but I kind of get stuck when it
comes to the normalization and the planning of the tables. The problem
is that we have basically four fields that can be arranged in a tree-
like structure. Eg:
Concept
|----- Slot
| `------ Facet
| `------ Filler
`----- Slot
`------ Facet
`------ Filler
`------ Filler
...
So, for ONE *concept*, we have, usually, MANY *slots*, each *slot* has
ONE *facet*, and each *facet* can have MORE THAN ONE *filler*.
Besides, some *slots* and *fillers* are themselves *concepts*,
creating a sort of recursive reference.
<begin table>
line_no concepts slots facets fillers
------------------------------------------------------------------------------
00000 ABANDON DEFINITION VALUE "to leave or desert something or
someone"
00001 ABANDON IS-A VALUE EXIT
00002 ABANDON LEXE MAP-LEX "leave behind-V1"
00003 ABANDON LEXE MAP-LEX abandon-V1
(...)
97420 ZULU DEFINITION VALUE "a language or dialect spoken in south
africa and others"
97421 ZULU INSTANCE-OF VALUE OTHER-*****-KORDOFANIAN-LANGUAGE
97422 ZULU LANGUAGE-OF INV LESOTHO
97423 ZULU LANGUAGE-OF INV SOUTH-AFRICA
<end table>
I tried to create index tables for concepts, slots, facets and
fillers, which gave me the following table:
<begin table>
line_no concepts slots facets fillers
------------------------------------------------------------------------------
00000 cn_00000 sl_00048 fc_00007 fl_07349
00001 cn_00000 cn_02605 fc_00007 cn_01768
00002 cn_00000 sl_00121 fc_00002 fl_04329
00003 cn_00000 sl_00121 fc_00002 fl_15009
(...)
97420 cn_05429 sl_00048 fc_00007 fl_01340
97421 cn_05429 cn_02493 fc_00007 cn_03526
97422 cn_05429 cn_02750 fc_00001 cn_02816
97423 cn_05429 cn_02750 fc_00001 cn_04580
<end table>
(cn_XXXXX from concept index, sl_XXXXX from slot index,
fc_XXXXX from facet index, fl_XXXXX from filler index.)
As you can see, only concepts and facets are populated by their own
type of data.
Whereas slots and fillers can be populated by their own types or by
concepts.
What would be a good way to create tables for this situation?
In fact, this is the first time I've ever tried to create a DB, so I'm
completely lost.
I'm looking forward to a reply...
Thank you very much,
Juliano
I'm a linguist with practical skills on computers/programming.
We've been working with an ontology at my department, and now I need
to create a GUI viewer for the flat file we have.
I tried to write an Ontology class which manages the data read and
parsed from the flat file, but it takes a relatively long time.
Besides, we have plans to set up a website for online display of said
ontology. So, I have been being pushed towards changing the basic plan
and build a DB so that data access will be faster and easier for both
the desktop GUI and the web app. Right now, I'm trying to work with
sqlite, since it can be used as a separate file for the GUI and as a
DB for Django (which may be the choice for the web interface).
I have been redaing some books on DBs but I kind of get stuck when it
comes to the normalization and the planning of the tables. The problem
is that we have basically four fields that can be arranged in a tree-
like structure. Eg:
Concept
|----- Slot
| `------ Facet
| `------ Filler
`----- Slot
`------ Facet
`------ Filler
`------ Filler
...
So, for ONE *concept*, we have, usually, MANY *slots*, each *slot* has
ONE *facet*, and each *facet* can have MORE THAN ONE *filler*.
Besides, some *slots* and *fillers* are themselves *concepts*,
creating a sort of recursive reference.
<begin table>
line_no concepts slots facets fillers
------------------------------------------------------------------------------
00000 ABANDON DEFINITION VALUE "to leave or desert something or
someone"
00001 ABANDON IS-A VALUE EXIT
00002 ABANDON LEXE MAP-LEX "leave behind-V1"
00003 ABANDON LEXE MAP-LEX abandon-V1
(...)
97420 ZULU DEFINITION VALUE "a language or dialect spoken in south
africa and others"
97421 ZULU INSTANCE-OF VALUE OTHER-*****-KORDOFANIAN-LANGUAGE
97422 ZULU LANGUAGE-OF INV LESOTHO
97423 ZULU LANGUAGE-OF INV SOUTH-AFRICA
<end table>
I tried to create index tables for concepts, slots, facets and
fillers, which gave me the following table:
<begin table>
line_no concepts slots facets fillers
------------------------------------------------------------------------------
00000 cn_00000 sl_00048 fc_00007 fl_07349
00001 cn_00000 cn_02605 fc_00007 cn_01768
00002 cn_00000 sl_00121 fc_00002 fl_04329
00003 cn_00000 sl_00121 fc_00002 fl_15009
(...)
97420 cn_05429 sl_00048 fc_00007 fl_01340
97421 cn_05429 cn_02493 fc_00007 cn_03526
97422 cn_05429 cn_02750 fc_00001 cn_02816
97423 cn_05429 cn_02750 fc_00001 cn_04580
<end table>
(cn_XXXXX from concept index, sl_XXXXX from slot index,
fc_XXXXX from facet index, fl_XXXXX from filler index.)
As you can see, only concepts and facets are populated by their own
type of data.
Whereas slots and fillers can be populated by their own types or by
concepts.
What would be a good way to create tables for this situation?
In fact, this is the first time I've ever tried to create a DB, so I'm
completely lost.
I'm looking forward to a reply...
Thank you very much,
Juliano