L
Lee
Definitely a newbie question, so please bear with me.
I'm reading "Programming the Semantic Web" by Segaran, Evans, and Tayor.
It's about the Semantic Web BUT it uses python to build a "toy" triple
store claimed to have good performance in the "tens of thousands" of
triples.
Just in case anybody doesnt know what an RDF triple is (not that it
matters for my question) think of it as an ordered 3 tuple representing
a Subject, a Predicate, and an Object eg: (John, loves, Mary) (Mary,
has-a, lamb) {theSky, has-color,blue}
To build the triple store entirely in Python, the authors recommend
using the Python hash. Three hashes actually (I get that. You
want to have a hash with the major index being the Subject in one hash,
the Predicate in another hash, or the Object for the third hash)
He creates a class SimpleGraph which initializes itself by setting the
three hashes names _spo, _pos, and _osp thus
class SimpleGraph;
def __init__(self);
self._spo={};
self._pos=();
self._osp={};
So far so good. I get the convention with the double underbars for the
initializer but
Q1: Not the main question but while I'm here....I'm a little fuzzy on
the convention about the use of the single underbar in the definition of
the hashes. Id the idea to "underbar" all objects and methods that
belong to the class? Why do that?
But now the good stuff:
Our authors define the hashes thus: (showing only one of the three
hashes because they're all the same idea)
self._pos = {predicate:{object:set( [subject] ) }}
Q2: Wha? Two surprises ...
1) Why not {predicate:{object:subject}} i.e.
pos[predicate][object]=subject....why the set( [object] ) construct?
putting the object into a list and turning the list into a set to be the
"value" part of a name:value pair. Why not just use the naked subject
for the value?
2) Why not something like pos[predicate][object][subject] = 1
......or any constant. The idea being to create the set of three indexes.
If the triple exists in the hash, its "in" your tripple store. If not,
then there's no such triple.
I'm reading "Programming the Semantic Web" by Segaran, Evans, and Tayor.
It's about the Semantic Web BUT it uses python to build a "toy" triple
store claimed to have good performance in the "tens of thousands" of
triples.
Just in case anybody doesnt know what an RDF triple is (not that it
matters for my question) think of it as an ordered 3 tuple representing
a Subject, a Predicate, and an Object eg: (John, loves, Mary) (Mary,
has-a, lamb) {theSky, has-color,blue}
To build the triple store entirely in Python, the authors recommend
using the Python hash. Three hashes actually (I get that. You
want to have a hash with the major index being the Subject in one hash,
the Predicate in another hash, or the Object for the third hash)
He creates a class SimpleGraph which initializes itself by setting the
three hashes names _spo, _pos, and _osp thus
class SimpleGraph;
def __init__(self);
self._spo={};
self._pos=();
self._osp={};
So far so good. I get the convention with the double underbars for the
initializer but
Q1: Not the main question but while I'm here....I'm a little fuzzy on
the convention about the use of the single underbar in the definition of
the hashes. Id the idea to "underbar" all objects and methods that
belong to the class? Why do that?
But now the good stuff:
Our authors define the hashes thus: (showing only one of the three
hashes because they're all the same idea)
self._pos = {predicate:{object:set( [subject] ) }}
Q2: Wha? Two surprises ...
1) Why not {predicate:{object:subject}} i.e.
pos[predicate][object]=subject....why the set( [object] ) construct?
putting the object into a list and turning the list into a set to be the
"value" part of a name:value pair. Why not just use the naked subject
for the value?
2) Why not something like pos[predicate][object][subject] = 1
......or any constant. The idea being to create the set of three indexes.
If the triple exists in the hash, its "in" your tripple store. If not,
then there's no such triple.