Tom Anderson
Hay guise,
I'm wondering if JCR, aka the Content Repository API for Java, could be
useful to me. I've been reading about it - including reading the spec -
but i'm still not really sure. Does anyone have any hands-on experience
with it?
The thing is, my use is not managing anything really document-like, but
nonetheless, it looks like the fundamentals of JCR are still a good fit. I
want to manage a catalogue for an e-commerce site - there's a
hierarchical structure comprising a root category at the top, with
subcategories below it, then at some level products, which in turn contain
SKUs. Each of these has a number of properties, things like name,
description, price, and so on. Possibly also images, although those could
be handled externally to the repository.
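To make the shape of that hierarchy concrete, here is a minimal sketch in plain Java. It is not JCR code: the class name, the Kind enum and the method names are all made up for illustration, and the nesting rule (categories hold subcategories or products, products hold SKUs) is just one reading of the description above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of the catalogue tree. None of this is JCR API.
final class CatalogueItem {
    enum Kind { CATEGORY, PRODUCT, SKU }

    private final Kind kind;
    private final String name;
    private final CatalogueItem parent; // null for the root category
    private final Map<String, Object> properties = new LinkedHashMap<>();
    private final List<CatalogueItem> children = new ArrayList<>();

    private CatalogueItem(Kind kind, String name, CatalogueItem parent) {
        this.kind = kind;
        this.name = name;
        this.parent = parent;
    }

    static CatalogueItem root(String name) {
        return new CatalogueItem(Kind.CATEGORY, name, null);
    }

    // Categories hold subcategories or products, products hold SKUs, SKUs are leaves.
    CatalogueItem addChild(Kind childKind, String childName) {
        boolean allowed;
        if (kind == Kind.CATEGORY) {
            allowed = childKind == Kind.CATEGORY || childKind == Kind.PRODUCT;
        } else if (kind == Kind.PRODUCT) {
            allowed = childKind == Kind.SKU;
        } else {
            allowed = false;
        }
        if (!allowed) {
            throw new IllegalArgumentException(kind + " cannot contain " + childKind);
        }
        CatalogueItem child = new CatalogueItem(childKind, childName, this);
        children.add(child);
        return child;
    }

    // Sets a property such as name, description or price; returns this for chaining.
    CatalogueItem set(String property, Object value) {
        properties.put(property, value);
        return this;
    }

    Object get(String property) {
        return properties.get(property);
    }

    Kind kind() {
        return kind;
    }

    List<CatalogueItem> children() {
        return Collections.unmodifiableList(children);
    }

    // Slash-separated path from the root, in the style of a repository path.
    String path() {
        return (parent == null ? "" : parent.path()) + "/" + name;
    }
}
```

With this, something like root("catalogue").addChild(Kind.CATEGORY, "shoes").addChild(Kind.PRODUCT, "boot") builds a branch, and adding a product under a SKU is rejected.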
Now, on the site itself, this stuff is all read-only, with some simple
access patterns based on individual item lookups plus a few kinds of
query, followed by reading properties from the objects found. Here,
something like JPA or another ORM or OODB approach (or even just POJOs
stored with serialisation) is probably right - there's no need for the
complexity of JCR.
However, the site is only half the story. There's also a backoffice
server, on which the merchandising team prepare the data that gets used on
the site. Lots of this is coming from feeds from other backoffice systems,
but there's still manual editing on top of that. To let people do their
work without stepping on each other's toes, there's a rather CVS-like
model, where there's a single canonical version of the data, plus a set of
workspaces, one for each user: users can update their workspace from the
canonical version, and commit changes from their workspace into it, with
merge conflicts being detected and handled. The system maintains a history
of these commits, and the versions of the data that existed at each stage.
There's a workflow mechanism associated with this, because edits might
need to be approved by a supervisor before being committed. There's then a
mechanism for pushing the contents of the canonical version to the site -
the two use separate databases, so this is a final checkpoint for making
sure everything is okay.
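The update/commit cycle described above can be sketched in plain Java as below. This is only an illustration of the CVS-like model, not how JCR or any particular implementation does it: CanonicalStore, UserWorkspace and all their methods are invented names, values are plain strings keyed by path, and a conflict is taken to mean "the canonical copy of this path changed after the workspace was last updated".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical canonical version of the data, with a history of committed states.
final class CanonicalStore {
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Integer> lastChanged = new HashMap<>(); // path -> revision
    private final List<Map<String, String>> history = new ArrayList<>();
    private int revision = 0;

    synchronized int revision() {
        return revision;
    }

    synchronized String get(String path) {
        return values.get(path);
    }

    // State of the data as it was after the given commit (1-based revision).
    synchronized Map<String, String> versionAt(int rev) {
        return new HashMap<>(history.get(rev - 1));
    }

    // Applies changes made against baseRevision. Returns the conflicting paths;
    // an empty list means the commit went through as a new revision.
    synchronized List<String> commit(int baseRevision, Map<String, String> changes) {
        List<String> conflicts = new ArrayList<>();
        for (String path : changes.keySet()) {
            Integer changedAt = lastChanged.get(path);
            if (changedAt != null && changedAt > baseRevision) {
                conflicts.add(path);
            }
        }
        if (!conflicts.isEmpty()) {
            return conflicts;
        }
        revision++;
        for (Map.Entry<String, String> change : changes.entrySet()) {
            values.put(change.getKey(), change.getValue());
            lastChanged.put(change.getKey(), revision);
        }
        history.add(new HashMap<>(values));
        return conflicts;
    }
}

// Hypothetical per-user workspace holding uncommitted edits.
final class UserWorkspace {
    private final CanonicalStore canonical;
    private final Map<String, String> pending = new TreeMap<>(); // sorted for stable output
    private int baseRevision;

    UserWorkspace(CanonicalStore canonical) {
        this.canonical = canonical;
        this.baseRevision = canonical.revision();
    }

    void set(String path, String value) {
        pending.put(path, value);
    }

    // Simplification: updating keeps pending edits, so they will override
    // canonical changes on the next commit. A real system needs a merge step here.
    void update() {
        baseRevision = canonical.revision();
    }

    List<String> commit() {
        List<String> conflicts = canonical.commit(baseRevision, pending);
        if (conflicts.isEmpty()) {
            pending.clear();
            baseRevision = canonical.revision();
        }
        return conflicts;
    }
}
```

The approval workflow is left out entirely; in this sketch it would sit between the workspace's commit() and the store actually accepting the change.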
This is all rather more complicated than the 'objects which live in a box'
model embodied by things like JPA. It does, however, seem like a pretty
good fit for JCR, which has these ideas of workspaces, updates, merges,
and so on. Am i right in thinking that?
The things i'm not sure about are:
- Whether JCR implementations will work well with large numbers (on the
order of 200 000 items all told) of fairly small items, rather than the
smaller numbers of larger items that are more typical.
- Whether there's any kind of ready-made generic metadata-driven UI (free
or commercial) for the backoffice server that i can slap on top of my
repository, rather than having to handwrite it.
- Whether JCR's workspace and versioning model really works the way i want
it to. I don't think JCR has the idea of a canonical workspace, but that's
no big deal - i'd just set one up and designate it canonical by fiat.
- Whether JCR implementations out there actually support the bits of the
workspace and versioning model i need.
- Whether, and how, JCR implementations would let me specify the quite
rigid node type definitions i need. I don't want users to be able to add
random extra properties to items, for instance.
- How i'd handle the push from the backoffice JCR store to the site's JPA
or whatever, or whether i even would - should i just use JCR at the front
too? The API is much less fun to work with than JPA, by the look of it.
- How i'd integrate workflow with the editing and pushing process.
- That there are N other showstopping problems i haven't even thought of!
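The "rigid node type" requirement in the list above amounts to something like the following check, sketched in plain Java. It does not use JCR's node type machinery; StrictItemType and validate are hypothetical names, and the rule assumed here is that every declared property is required and nothing undeclared is allowed.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical strict type: a fixed set of property names, each with a Java type.
final class StrictItemType {
    private final Map<String, Class<?>> declared;

    StrictItemType(Map<String, Class<?>> declared) {
        this.declared = Map.copyOf(declared);
    }

    // Returns the problems found; an empty set means the properties fit the type.
    Set<String> validate(Map<String, Object> properties) {
        Set<String> problems = new TreeSet<>();
        for (Map.Entry<String, Object> entry : properties.entrySet()) {
            Class<?> expected = declared.get(entry.getKey());
            if (expected == null) {
                problems.add("unknown property: " + entry.getKey());
            } else if (!expected.isInstance(entry.getValue())) {
                problems.add("wrong type for property: " + entry.getKey());
            }
        }
        for (String name : declared.keySet()) {
            if (!properties.containsKey(name)) {
                problems.add("missing property: " + name);
            }
        }
        return problems;
    }
}
```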
Any thoughts welcome!
tom
PS Bonus for those who read to the end - an article about JCR that posits
that it's the kind of data architecture Sir Thomas More would have wanted:
http://www.artima.com/lejava/articles/contentrepository.html
Certainly, when i'm choosing software, the hypothetical opinions of
Renaissance intellectuals weigh heavily in my evaluation. For example, we
chose CentOS Linux as our preferred development platform because we
thought Francis Bacon would have been cool with it.