Philip said:
Well OK, but that's a very different argument. Yes, joins can be
expensive. They're often still the best option, though. The first step
people usually take to get away from joins is denormalization which can
improve SELECT performance at the expense of slowing down INSERTs,
UPDATEs, and DELETEs, not to mention complicating one's code and data
model. Is that a worthwhile trade?
I'd say that in more than 99% of situations: NO.
More than that: if I haven't normalized my data as it should have been
normalized, I wouldn't be able to do complicated querying that I really,
really have to be able to do due to business logic. A few of my queries
have a few hundred lines each with many sub-queries and multiple
many-to-many joins: I *dread the thought* what would happen if I had to
reliably do it in a denormalized db and still ensure data integrity
across all the business logic contexts. And performance is still more
than good enough: so there's no point for me, as of the contexts I
normally work in, to denormalize data at all.
It's just interesting for me to see what happens in that <1% of situations.
Depends on the application. As I
said, sometimes the cure is worse than the disease.
Don't worry about joins until you know they're a problem. As Knuth said,
premature optimization is the root of all evil.
Sure -- the cost of joins is just interesting to me as a 'corner case'.
I don't have datasets large enough for this to matter in the first place
(and I probably won't have them that huge).
PS - Looks like you're using Postgres -- excellent choice. I miss using it.
If you can, I'd recommend using SQLAlchemy layer on top of
Oracle/Mysql/Sqlite, if that's what you have to use: this *largely*
insulates you from the problems below and it does the job of translating
into a peculiar dialect very well. For my purposes, SQLAlchemy worked
wonderfully: it's very flexible, it has middle-level sql expression
language if normal querying is not flexible enough (and normal querying
is VERY flexible), it has a ton of nifty features like autoloading and
rarely fails bc of some lower-level DB quirk AND its high-level object
syntax is so similar to SQL that you quickly & intuitively grasp it.
(and if you have to/prefer writing some query in "low-level" SQL, as I
have done a few times, it's still easy to make SQLAlchemy slurp the
result into objects provided you ensure there are all of the necessary
columns in the query result)
Regards,
mk