Dmitry Borodaenko
What is Samizdat?
Samizdat is a generic RDF-based engine for building collaboration and
open publishing web sites. Samizdat provides users with means to
cooperate and coordinate on all kinds of activities, including media
activism, resource sharing, education and research, advocacy, and so on.
Samizdat intends to promote values of freedom, openness, equality, and
cooperation.
The Samizdat library includes four stand-alone modules that can be used
outside the Samizdat engine: Cache (thread-safe time-limited object
cache with flexible replacement policy), Storage (RDF storage over a
relational database), Sanitize (whitelist XSS filter based on HTMLTidy
and REXML), and Antispam (simple wiki spam filter).
What's new in Samizdat 0.6.1?
The main goal of the 0.6.x series is to address the shortcomings that
were identified in the IMC CMS Survey in November 2006 [0]. This version
takes care of the most important part: security. New security features
in Samizdat 0.6.1 include CSRF protection, the Antispam module,
per-resource moderation logs, and a moderation requests tracker.
[0] http://samizdat.nongnu.org/doc/CMSSurveyReportSamizdat.html
Samizdat's internals have changed beyond recognition since the previous
release. The engine code has been refactored into an MVC architecture,
Samizdat Cache now uses a deadlock-proof two-level locking algorithm,
and RDF Storage has undergone a massive overhaul that made it possible
to support optional sub-patterns in Squish queries. The
Apache/PostgreSQL combo is no longer the only way to install Samizdat:
the Lighttpd web server and the MySQL and SQLite3 databases are now
supported. The database schema has changed once again; see below on how
to upgrade.
There are also a lot of small features and usability improvements here
and there. The tired "next page" link is replaced with a proper
pagination system, file sizes are displayed next to download links,
replies are sorted by id instead of last edit date, posting a comment to
a multi-page thread redirects to the thread's last page, translations
don't appear in the replies list and can't be replied to, and error
reporting is more detailed and less confusing to users. The user
interface has been translated into several more languages, with varying
degrees of completeness.
And the "cherry on top" prize goes to the RSS import module, with
special thanks to Boud, who evangelized this feature for a long time and
created the first implementation.
Changes in more detail:
- RSS import: each site can configure a list of feeds to be syndicated
into the front page; feeds are stored in a persistent DRb cache and
updated by the samizdat-import-feeds script
- CSRF protection: cross-site request forgery is a type of web exploit
that relies on one site fooling the user's browser into submitting to
another site a request that would apply changes on the user's behalf;
to prevent that, every form that submits changes to a Samizdat site
includes a unique ID that is also stored on the server and
cross-checked when the form is submitted
- Antispam module: a list of regular expressions is loaded from a
configured location and stored in a persistent DRb cache; messages by
users with configured access levels (by default, only guests) are
compared against the list and rejected if a match is found
- per-resource moderation logs: whenever a resource has been moderated
(for a message, this includes moderation actions applied to its
replies), a link to the resource's moderation log appears in its header;
the "show hidden" option was removed from the UI as it was made obsolete
by this feature
- moderation requests tracker: users with roles that allow posting
messages are also able to request moderation of a message;
unacknowledged moderation requests are listed on a tracker page that
allows moderators to act on requests on the spot
- new translations: Spanish, German, Japanese (very rough), Chinese (in
progress)
- MVC architecture: originally, the Nitro and Rails frameworks were
reviewed as candidates, but both were discarded: Nitro was missing some
important features, while Rails required too many code changes and
wasn't friendly to "multiple sites per host" setups; instead, a tiny
Samizdat-specific MVC library was implemented: in just 300 lines, it
provides dispatcher, controller, and view classes (no ORM, as we
already have RDF for that); later, a 100-line DataSet class was added
to support the new pagination system
- deadlock-proof Cache: the new two-level locking algorithm has made
Samizdat Cache re-entrant (it's now possible to invoke fetch_or_add
from inside a block passed to an outer fetch_or_add invocation) and
protected against deadlocks and livelocks (beware: the algorithm
relies on RubyForge patch #11680, which as of today is included in
the Debian package of Ruby 1.8, but not upstream); the replacement
policy can now be overridden (see CacheEntry#replacement_index); a
configurable rate limit ensures that at least a given amount of time
passes between two flushes and prevents a situation where rapid site
updates don't give the engine enough time to update the cache
- optional sub-patterns in Squish: the new OPTIONAL section of a Squish
query allows augmenting the query pattern graph with sub-graphs that
may or may not match against the site knowledge base; per-statement
FILTER conditions help to put restrictions on variable values closer to
where the variables are defined; these changes bring Samizdat Squish
semantically closer to the W3C-recommended SPARQL RDF query language
- Lighttpd: see doc/examples/lighttpd.conf on how to set up Samizdat
with Lighttpd in FastCGI mode; due to limitations of Lighty's rewrite
capabilities, it's tricky to make it properly handle static content
(e.g. site logo); otherwise this setup is well-tested and stable
- MySQL: now that MySQL 5 supports triggers and transactions, it is
possible to run Samizdat on MySQL; database generation scripts are
included in the package (and no, there's no performance difference
between PostgreSQL and MySQL); one gotcha when migrating from
PostgreSQL to MySQL is the latter's peculiar understanding of Unicode
string equality: if your database has member logins that differ only
in case, you will not be able to migrate it to MySQL, as it will
consider them a clash on a unique field; to prevent that from happening
in the future, Samizdat now enforces lowercase login names
- SQLite3: if you want to play with Samizdat without installing a
heavy-duty DBMS, or if your hosting only allows you to run scripts
from your home directory and doesn't provide database access, you can
hook Samizdat to SQLite3 and still get a functional site; beware that
SQLite3 (or at least its Ruby/DBI driver) has a tendency to lock up
under heavy load, so if you expect lots of traffic, PostgreSQL is
still the way to go
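The CSRF protection item above describes a per-form ID that is stored on
the server and cross-checked on submission. A minimal sketch of that
idea in Ruby follows. It is not Samizdat's actual code: the
FormTokenStore class, its method names, the in-memory hash, and the
single-use policy are all assumptions made for illustration.

```ruby
require 'securerandom'

# Hypothetical illustration only; names and storage are assumptions,
# not Samizdat's real classes.
class FormTokenStore
  def initialize
    @tokens = {}   # session id => token issued for that session's form
  end

  # Generate a fresh random token, remember it on the server side, and
  # return it so it can be embedded in the form as a hidden field.
  def issue(session_id)
    @tokens[session_id] = SecureRandom.hex(16)
  end

  # Accept a submission only if the token sent with the form matches the
  # one stored for this session. The stored token is removed on every
  # check, so each token can be used at most once.
  def valid?(session_id, submitted)
    expected = @tokens.delete(session_id)
    return false if expected.nil? || submitted.nil?
    return false unless expected.bytesize == submitted.bytesize

    # Compare every byte so that timing does not reveal where the first
    # mismatch occurs.
    diff = 0
    expected.bytes.zip(submitted.bytes) { |a, b| diff |= a ^ b }
    diff.zero?
  end
end

store = FormTokenStore.new
token = store.issue("session-1")    # rendered into the form
store.valid?("session-1", token)    # legitimate submission: accepted
store.valid?("session-1", token)    # replay of the same token: rejected
```

A request forged by another site cannot read the token embedded in the
form, so it fails the check. A real deployment would keep the tokens in
persistent session storage rather than in a process-local hash.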
How to upgrade?
The Apache configuration was changed to rewrite everything to the MVC
dispatcher; review the changes in doc/examples/apache.conf and merge
them into the configurations of your sites.
The following SQL commands need to be run by the database user that owns
your tables to bring them up to date with version 0.6.1:
ALTER TABLE Member RENAME COLUMN passwd TO password;
UPDATE Moderation SET action = 'replace' WHERE action = 'displace';
CREATE INDEX Resource_uriref_idx ON Resource (uriref);
CREATE INDEX Resource_published_date_idx ON Resource (published_date);
CREATE INDEX Statement_object_idx ON Statement (object);
CREATE INDEX Vote_proposition_idx ON Vote (proposition);
CREATE INDEX Moderation_resource_idx ON Moderation (resource);
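If the upgrade is combined with a move from PostgreSQL to MySQL, the
login clash mentioned in the MySQL item can be checked for in advance.
The sketch below assumes the member logins have already been exported
into a Ruby array; the case_clashes helper is a hypothetical name, and
plain lowercasing only approximates MySQL's comparison, since some
MySQL collations treat more than letter case as equal.

```ruby
# Hypothetical helper: group logins by their lowercased form and return
# only the groups that contain more than one distinct spelling.
def case_clashes(logins)
  logins.group_by { |login| login.downcase }
        .values
        .select { |group| group.size > 1 }
end

# Example input is made up for illustration.
case_clashes(%w[alice Alice bob])   # => [["alice", "Alice"]]
```

Any group it reports has to be resolved by renaming logins before the
data can be loaded into a table with a unique login column.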
Where to get it?
Project page: http://samizdat.nongnu.org/
Download: http://savannah.nongnu.org/download/samizdat/samizdat-0.6.1.tar.gz
Debian package: apt-get install samizdat
(http://packages.qa.debian.org/s/samizdat.html)