Where To Put Validation Logic?

Tom Wardrop

This is probably a question you could put to any programmer who writes
object-oriented code.

I'm developing a Mongo object mapper for Ruby, inspired by Candy's
write-on-set approach. I'm currently still in the planning phase, trying
to work out the scope of the project and the most appropriate interface.
One question I now find myself asking is whether to include validation
helpers and, if so, where they should reside: in the model, or in the
controller? The answer, I guess, also depends on the scope of the project.
Will the mapper act as a foundation for all models (like most mappers),
or will it be more like a tool that models will simply use, if they have
to - something I'm still undecided on, but which I'm hoping this thread
may help me with.

While validation serves as a means to ensure the integrity of data, it
can also serve as a means to validate user input at the application
level. If you look at validations as a means to ensure data integrity,
then naturally, such logic should reside in the model; the data layer of
the application. On the other hand, looking at validation as a means to
validate user input and interactions with the UI, you would think the
controller would be the better fit.

When you think about it, you're really dealing with two types or levels
of validation: user-level and data-level. A good example that
demonstrates this: assume you had an application that can accept a
home phone number, a work phone number, and a mobile/cell phone number.
No field is mandatory, but at least one phone number must be given. You
could call this multi-field validation. In theory, the data layer of
your application shouldn't give a damn about whether or not at least one
phone number is set. All the data layer should care about is whether it
has a valid phone or not. As I see it, such "multi-field" validation
can be classed as general application logic, which should be the
responsibility of the view/controller, not the data layer, whose only
responsibility should be persisting the data.

Sadly, I don't know the right answer here, hence why I've come to this
forum. Hopefully some of you have some thoughts or opinions to share on
this matter?

Remember, one of the main reasons for bringing up this topic is that
I'm writing a Mongo object mapper that writes to Mongo when a field is
set, hence such multi-field validation can't easily be achieved. This
made me challenge my understanding of validations and which part of the
application should be responsible for them.
 
Tony Arcieri

I think the traditional Rails thinking is going to be: in the model, and
more specifically the Rails 3 thinking is going to be: in an
ActiveModel-derived model.
 
Tom Wardrop

I honestly don't give a damn about rails. In fact, if anything, I'm
trying to go against the rails-way by breaking out of the common
ActiveRecord-esque interface (write, validate, save) by using a
write-on-set which takes advantage of MongoDB.

I was more asking this question from a general programming perspective.
Thanks for your reply nonetheless.
 
Phillip Gawlowski

> I honestly don't give a damn about rails. In fact, if anything, I'm
> trying to go against the rails-way by breaking out of the common
> ActiveRecord-esque interface (write, validate, save) by using a
> write-on-set which takes advantage of MongoDB.

Well, validation logic should be in the same place that deals with the
database. In MVC, that's the model.


P.S.: Active Record is one possible pattern for dealing with the DB
in an MVC environment, of which ActiveRecord is the canonical Rails
implementation.

--
Phillip Gawlowski

Though the folk I have met,
(Ah, how soon!) they forget
When I've moved on to some other place,
There may be one or two,
When I've played and passed through,
Who'll remember my song or my face.
 
David Masover

> I'm developing a Mongo object mapper for Ruby, inspired by Candy's
> write-on-set approach. I'm currently still in the planning phase, trying
> to work out the scope of the project and the most appropriate interface.

I may be trying to shoehorn this into what I already know, but I think this
could be done well as a DataMapper adapter, rather than as an entirely new
ORM. (Better yet, does the existing DataMapper adapter do what you want?) If
that's the case, the obvious answer is:

> One question I now find myself asking is whether to include validation
> helpers and, if so, where they should reside: in the model, or in the
> controller?

DataMapper already includes them, and puts them in the model.

> Will the mapper act as a foundation for all models (like most mappers),
> or will it be more like a tool that models will simply use, if they have
> to - something I'm still undecided on, but which I'm hoping this thread
> may help me with.

I would suggest that if there isn't already a good Ruby API for this kind of
stuff, start with that. It might take a form that essentially looks like a
mapper, but it really doesn't need to do any validation.

For example, look at dm-appengine. (I might be a teensy-bit biased here...)
There's already a somewhat clumsy Java API, which appengine-jruby enhances
_slightly_. It includes some of what you'd expect from an ORM, in that you do
get "entity" objects which represent the state of an entity in the datastore.
But fundamentally, the entire Java API, while it _could_ be used directly (and
would be much nicer than writing raw GQL), is just a backend for dm-appengine.

The biggest change App Engine makes is removing some features and adding some
types. This is actually pretty reasonable to do in DataMapper now.

In the case of Mongo, it looks like the biggest thing DataMapper doesn't
already offer is the ability to atomically update a single field -- for
instance, it might be nice to declare a field as 'immediate' in some sense,
so you could translate:

joe.n += 1

into something like the Mongo command:

db.people.update( { name:"Joe" }, { $inc: { n : 1 } } );

It's not perfect, because that entails actually retrieving the object first,
but the problems here are more or less the same as the problems of writing an
efficient 'update' command for SQL.
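
As a rough illustration of that translation, here is a minimal sketch. The class name ImmediateCounter and its ops log are hypothetical, and no real MongoDB driver is involved; the sketch only records the update document that an adapter might send.

```ruby
# Hypothetical sketch: ImmediateCounter and its `ops` log are invented for
# illustration; nothing here talks to a real database.
class ImmediateCounter
  attr_reader :n, :ops

  def initialize(selector, n)
    @selector = selector   # e.g. { "name" => "Joe" }
    @n = n                 # value already read from the store
    @ops = []              # update operations that would be sent to Mongo
  end

  # Ruby expands `joe.n += 1` into `joe.n = joe.n + 1`, so the setter only
  # sees the new value and has to derive the delta from the cached old one.
  # This is why the object must be retrieved first.
  def n=(new_value)
    delta = new_value - @n
    @ops << [@selector, { "$inc" => { "n" => delta } }] unless delta.zero?
    @n = new_value
  end
end

joe = ImmediateCounter.new({ "name" => "Joe" }, 41)
joe.n += 1   # records an "$inc" operation instead of overwriting the field
```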

> If you look at validations as a means to ensure data integrity,
> then naturally, such logic should reside in the model; the data layer of
> the application. On the other hand, looking at validation as a means to
> validate user input and interactions with the UI, you would think the
> controller would be the better fit.

Not really.

If it's validation, what we're talking about is answering the question, "What
states are valid for this object, and what are the valid transitions between
those states?" It makes sense to me that this should be in the model, and that
it is _always_ a data integrity question.

Think of it this way: If you were to create another controller, or another
view -- basically, if there were another UI, or even an entirely different
program accessing your data -- would you want it to obey the same constraints?
If so, how are you going to share them? The model is the most obvious place.

It also makes sense to me that it belongs in the model, and no lower -- while
some databases may support a subset of the validations you want, you're going
to want something Turing-complete. And if you end up writing lots of Turing-
complete validations in your database layer, that's pushing too much of your
application logic into your database. If you want multiple applications to be
able to access the same database, either share the model code or expose an API
on top of your application, something like REST.

I actually tend to favor the REST approach, which would tend to make this a
moot question -- if you actually use MVC to expose that API, everything is
going to hit your controllers anyway. But conceptually, all that's really
doing is using an MVC framework to implement the model as a network service,
rather than as part of a monolithic application.

> A good example that
> demonstrates this: assume you had an application that can accept a
> home phone number, a work phone number, and a mobile/cell phone number.
> No field is mandatory, but at least one phone number must be given. You
> could call this multi-field validation.

I would.

> In theory, the data layer of
> your application shouldn't give a damn about whether or not at least one
> phone number is set.

Why not?

The way I see it, either you require at least one phone number, or you don't.
If you do, presumably your application has a good reason to care, a good
reason it's inconveniencing the user by forcing them to give you a phone
number.

If your application doesn't give a damn, why should your users?

> All the data layer should care about is whether it
> has a valid phone or not.

But that's just it -- I see a valid phone as a component of a valid...
whatever record you're applying that to. (User? Employee? Contact?)

Having a valid record absolutely _is_ the responsibility of the model, at
least.

> As I see it, such "multi-field" validation
> can be classed as general application logic, which should be the
> responsibility of the view/controller, not the data layer, whose only
> responsibility should be persisting the data.

Maybe not the data layer, but I don't think it belongs in the controller, and
certainly not the view!

> Remember, one of the main reasons for bringing up this topic is that
> I'm writing a Mongo object mapper that writes to Mongo when a field is
> set, hence such multi-field validation can't easily be achieved.

Why not?

Take it from the standpoint of locking in a traditional database. Either you
need some sort of locks to preserve data integrity, in which case you can be
sure the change you've made is valid, or you don't, and you can't. Take your
phone number example -- if you were to implement it on top of the API you've
described, you've got a race condition which could result in a record having
no phone numbers, and thus being invalid, even though your UI forced the user
to enter a phone number.

But just because you've decided to dismiss the possibility of that race
condition (or prevent it some other way) doesn't mean you can't do _any_
validation on the entire record. It's no reason you can't have code which
pulls in the other two phone number fields and checks that at least one of the
three has a valid phone number before writing all three back out.
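
A minimal sketch of that kind of whole-record check follows. Contact, PHONE_FIELDS and PHONE_FORMAT are hypothetical names, the phone pattern is a deliberately loose assumption, and a plain hash stands in for the stored record.

```ruby
# Hypothetical sketch: Contact, PHONE_FIELDS and PHONE_FORMAT are invented
# names, and @stored stands in for whatever is already persisted.
class Contact
  PHONE_FIELDS = [:home_phone, :work_phone, :mobile_phone].freeze
  PHONE_FORMAT = /\A\+?[\d\s()-]{6,20}\z/   # loose pattern, an assumption

  def initialize(stored = {})
    @stored = stored
  end

  # Merge the proposed changes over the stored values and validate the
  # record as a whole before anything would be written back.
  def valid_update?(changes)
    merged = @stored.merge(changes)
    phones = PHONE_FIELDS.map { |field| merged[field].to_s.strip }
                         .reject(&:empty?)
    return false if phones.empty?          # at least one number is required
    phones.all? { |phone| PHONE_FORMAT.match?(phone) }
  end
end

contact = Contact.new(home_phone: "555 123 4567")
contact.valid_update?(home_phone: nil)   # would leave the record with no phone
```

The check still has the race condition described above unless the read and the write are protected some other way; it only shows that multi-field validation does not need a full-record save.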

And I think that's how DM validations would effectively work. I think that's
how they _do_ effectively work with SQL. DataMapper is just lazy enough that
it's certainly capable of reading as much of a record as it needs (and of
lazily ignoring fields it doesn't) and then writing only the fields it thinks
it's updating. In fact, if you don't need to retrieve an object, you can also
do things like:

Person.all(:name => 'Joe').update(:n => 5)

I'm pretty sure that translates directly into the SQL you'd expect -- a single
mass UPDATE statement. I'm pretty sure it'd work more or less the same way in
a Mongo adapter.
 
Tom Wardrop

I appreciate the reply, David.

A little while after my initial post, I actually started coming to a
conclusion similar to what you've described. The model should define the
valid state of the system. This obviously does present a problem for a
write-on-set database interface, because if you only set one field, even
if you soon follow up by setting multiple other fields, the new or
updated record will become invalid because it'll be missing field x, y
and z.

I did manage to come up with a solution however, which is something I
had already planned to implement, but for a different purpose. I've
decided to use a queuing construct. So if you've created a new object,
or you need to update multiple fields, you can wrap such operations in a
"set" block. For example...

john.set do |doc|
  doc[:age] = 23
  doc[:hair] = 'long'
  doc[:skin] = 'fair'
end

Or you could simply pass in a hash depending on the situation...

john.set age: 23, hair: 'long', skin: 'fair'

A single-field write-on-set operation can still be performed using:
john[:hair] = 'long', but if you wish to perform an operation that will
invalidate the object for any period of time, you'll have to wrap the
operations in a 'set' block.

Behind the scenes, I've just implemented a simple queueing system which,
in essence, builds a hash of new values that are only committed when
commit() is called. Essentially, the above set() method does the
following internally...

john.queue_updates = true
john[:age] = 23
john[:hair] = 'long'
john[:skin] = 'fair'
# ...do plenty of other unrelated stuff here if you want...
john.commit
# Turn off queueing if you no longer wish to queue further operations.
john.queue_updates = false
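
For readers who want to see the shape of such a queue, here is a minimal sketch. QueuedDoc is a hypothetical class, and its in-memory store hash stands in for MongoDB; it is not the actual implementation described in this post.

```ruby
# Hypothetical sketch: QueuedDoc and its `store` hash are invented stand-ins
# for the real mapper and for the database.
class QueuedDoc
  attr_accessor :queue_updates
  attr_reader :store

  def initialize
    @store = {}             # pretend this is the persisted document
    @pending = {}           # queued field values, not yet written
    @queue_updates = false
  end

  def []=(field, value)
    if queue_updates
      @pending[field] = value   # hold the value until commit
    else
      @store[field] = value     # write-on-set: persist immediately
    end
  end

  # Reads see queued values first, then persisted ones.
  def [](field)
    @pending.fetch(field) { @store[field] }
  end

  def commit
    @store.merge!(@pending)
    @pending.clear
    self
  end

  # Queue everything assigned via the hash and/or the block, then commit once.
  def set(fields = nil)
    previous = queue_updates
    self.queue_updates = true
    fields.each { |key, value| self[key] = value } if fields
    yield self if block_given?
    commit
  ensure
    self.queue_updates = previous
  end
end

john = QueuedDoc.new
john.set { |doc| doc[:age] = 23; doc[:hair] = 'long' }
john.set skin: 'fair'
```

Restoring the previous queue_updates value in the ensure clause is one possible design choice; it lets a set block be used inside code that has already turned queueing on.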

I'm pretty happy with the solution I've managed to come up with here, as
it's not just a construct for validation, but also allows one to batch
operations to improve application performance. I'm yet to define the
validation syntax, but it will be something like this...


class Person
  define_valid do
    # Parentheses are needed when a {...} block follows the arguments.
    cond(:name, :age, :email) { |val| !val.to_s.empty? }
    cond(:name) { |val| val =~ /.+ .+/ }
    cond(:age) { |val| val.is_a?(Integer) && (0..160) === val }
    cond(:email) { |val|
      val.match(/\A[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\z/) }
  end
end

As you can see, it uses blocks to define the validation logic. You can
also map multiple fields to the same validation block. I still need to
come up with an elegant way of setting validation messages. One idea is
to simply return a string from the validation block when the condition
fails, e.g.

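cond :name do |val|
  # A `return` here would try to exit the enclosing method, so the
  # block's last value is used as the message instead.
  "Name field must contain both a first and a last name." unless val =~ /.+ .+/
end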

Whenever data is about to be committed, the validation conditions can be
checked against the fields they apply to. This method allows one to
define validations as complex or as simple as one desires.
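
A minimal sketch of how such conditions could be collected and checked at commit time is shown below. The names define_valid and cond come from the post, while the Model base class, errors_for, and the convention that a block returns nil when the value is fine and a String message when it is not are assumptions made for illustration.

```ruby
# Hypothetical sketch: Model and errors_for are invented; only the names
# define_valid and cond come from the post above.
class Model
  def self.conditions
    @conditions ||= []
  end

  # Evaluate the block in the context of the class so `cond` resolves.
  def self.define_valid(&block)
    class_eval(&block)
  end

  def self.cond(*fields, &check)
    conditions << [fields, check]
  end

  # Returns a hash of field => [messages]; an empty hash means valid.
  def self.errors_for(values)
    errors = {}
    conditions.each do |fields, check|
      fields.each do |field|
        message = check.call(values[field])
        (errors[field] ||= []) << message if message.is_a?(String)
      end
    end
    errors
  end
end

class Person < Model
  define_valid do
    cond(:name, :email) { |val| "must not be empty" if val.to_s.empty? }
    cond(:name) { |val| "needs a first and a last name" unless val.to_s =~ /.+ .+/ }
    cond(:age) do |val|
      "must be an Integer from 0 to 160" unless val.is_a?(Integer) && (0..160).cover?(val)
    end
  end
end

errors = Person.errors_for(name: "John", age: 200, email: "")
```

A commit method could call errors_for on the queued values and refuse to write when the result is not empty.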
 
