CAPTCHA in Ruby

Federico

Hello,
I'm creating a public registration process in Rails and I would like to
add some kind of spam filtering...
Does anyone know of a library/implementation/gem for CAPTCHA anti-spam
in Ruby, or better, one already integrated into Rails (validates_capcha)?

thanks

-Federico
 
Robbie Carlton

try

http://www.google.co.uk/search?q=ruby+captcha

or better yet

http://captcha.rubyforge.org/

I don't understand the relevance of the unbind method linked to above. I'm
assuming it was a mistake. If I'm being dense, will someone explain what the
relevance is to me :)


 
Austin Ziegler

Federico said:
Hello,
I'm creating a public registration process in Rails and I would like to
add some kind of spam filtering...
Does anyone know of a library/implementation/gem for CAPTCHA anti-spam
in Ruby, or better, one already integrated into Rails (validates_capcha)?

Remember, though, that CAPTCHAs are NOT accessible, and you may need
to provide an alternate means for verification that is as secure.
CAPTCHAs are also imperfect and have been shown to be able to be
broken by computer programs.

-austin
--
Austin Ziegler * (e-mail address removed)
* Alternate: (e-mail address removed)
 
Robbie Carlton

But they stop the average spambot, which is what they're for I think.
The simplest accessible alternative would be email verification, but this
obviously slows the whole thing down.
Has anyone thought of an accessible alternative that can be embedded on the
page?

 
Austin Ziegler

Robbie Carlton said:
But they stop the average spambot, which is what they're for I think.
The simplest accessible alternative would be email verification, but this
obviously slows the whole thing down.
Has anyone thought of an accessible alternative that can be embedded on the
page?

Two things. First, they don't work against the average spambot of 2005
or later. The average spambot of 2005 has gocr or something like that
built in. (I'm using spambot because that's what you're using; what
you're talking about is actually crawlers and registration bots.) The
problem with CAPTCHA systems is that something complex enough to
defeat a computer OCR system will be enough to lock out a significant
portion of your potential users. Second, a lot of people *have*
thought about it. I'm unimpressed with most solutions.

http://www.standards-schmandards.com/index.php?2005/01/01/11-captcha
http://www.petefreitag.com/item/376.cfm
http://www.w3.org/WAI/intro/captcha.php
http://www.bestkungfu.com/archive/date/2005/01/captcha-state-of-the-union-2005/
http://www.bestkungfu.com/?p=445

Basically, my advice is to forget CAPTCHA and go with double
verification. You can even provide multiple levels of user
accessibility, allowing immediate access but nothing that could be
construed as spam until they have verified their identity in some way
that is accessible.
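
A minimal sketch of the verification step described above, assuming a hypothetical PendingVerification class and a token that is delivered out of band (email, SMS, or whatever channel the user can actually reach). None of these names come from the thread or from Rails, and the 24-hour lifetime is an arbitrary assumption.

```ruby
require "securerandom"

# Hypothetical record for a signup that still has to be confirmed.
class PendingVerification
  attr_reader :token

  # ttl is the token lifetime in seconds (assumed default: 24 hours).
  def initialize(ttl: 24 * 3600, now: Time.now)
    @token = SecureRandom.urlsafe_base64(32) # random, hard-to-guess token
    @expires_at = now + ttl
    @confirmed = false
  end

  # Marks the signup as confirmed only if the token matches and has not expired.
  def confirm(submitted, now: Time.now)
    return false if now > @expires_at || submitted.to_s != @token
    @confirmed = true
  end

  def confirmed?
    @confirmed
  end
end

pending = PendingVerification.new
puts "Send this token to the user: #{pending.token}"
```

Until confirmed? returns true, the account would stay on the restricted level of access.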

-austin
 
Stephen Veit

I have seen the case where a subscriber is asked to solve a simple
math problem. E.g., "what is twelve plus twenty-three?" This would
certainly be accessible. You could think of different types of
questions like "Enter the number that follows fifty-five." or "What
number comes before thirty-two?"
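
A rough sketch of how such questions could be generated, assuming a hypothetical TextChallenge module (not an existing gem) and numbers limited to 0..99. The three question templates are taken from the examples above; everything else is illustrative.

```ruby
# Hypothetical helper that builds plain-text arithmetic challenges.
module TextChallenge
  ONES = %w[zero one two three four five six seven eight nine ten eleven twelve
            thirteen fourteen fifteen sixteen seventeen eighteen nineteen].freeze
  TENS = %w[_ _ twenty thirty forty fifty sixty seventy eighty ninety].freeze

  # Spells out an integer from 0 to 99 in English words.
  def self.words(n)
    raise ArgumentError, "expected 0..99, got #{n}" unless (0..99).cover?(n)
    return ONES[n] if n < 20
    tens, ones = n.divmod(10)
    ones.zero? ? TENS[tens] : "#{TENS[tens]}-#{ONES[ones]}"
  end

  # Returns [question, expected_answer]. Pass a seeded Random for repeatable output.
  def self.generate(rng = Random.new)
    case rng.rand(3)
    when 0
      a = rng.rand(1..49)
      b = rng.rand(1..49)
      ["What is #{words(a)} plus #{words(b)}?", a + b]
    when 1
      n = rng.rand(1..98)
      ["Enter the number that follows #{words(n)}.", n + 1]
    else
      n = rng.rand(2..99)
      ["What number comes before #{words(n)}?", n - 1]
    end
  end

  # Compares the submitted text with the expected answer written in digits.
  def self.correct?(expected, submitted)
    submitted.to_s.strip == expected.to_s
  end
end

question, answer = TextChallenge.generate
puts question
```

The expected answer would have to be kept on the server side (for example in the session), never in the page itself.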
 
Lyndon Samson

Robbie Carlton said:
try http://www.google.co.uk/search?q=ruby+captcha
or better yet http://captcha.rubyforge.org/
I don't understand the relevance of the unbind method linked to above. I'm
assuming it was a mistake. If I'm being dense, will someone explain what
the relevance is to me :)


Sorry, my bad, response to wrong thread!

 
Gavin Kistner

Stephen Veit said:
I have seen the case where a subscriber is asked to solve a simple
math problem. E.g., "what is twelve plus twenty-three?" This would
certainly be accessible. You could think of different types of
questions like "Enter the number that follows fifty-five." or "What
number comes before thirty-two?"

Interesting. That would probably keep out existing general-purpose
rakes. But the moment your site becomes popular or targeted, it seems
to me that it would not be difficult to write a program to answer
your questions. Even if you include 33 flavors of how to phrase the
question ("Enter an integer that is not less than (not equal to)
eighty (reduced by the value represented by the roman numeral V) and
more 'n seventy with the number of non-thumbs on a standard hand
added to it.") the engineered bot could be written to handle 20% of
your phrases, and that would be enough.

[OT]
I smell a couple of fun Ruby Quizzes here. One is simply to write an
english-to-numeric processor.
value = Numeric.from_english("eight-hundred thousand, twenty-three hundred fifteen")

Another quiz might be to write such a challenge/response captcha
system. Make the questions as clear and varied as possible.

Another might be, given a series of questions like the above, to
write a 'bot that could answer them.
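
For the first quiz idea, here is one possible sketch. It assumes a hypothetical EnglishNumber.parse rather than the Numeric.from_english call imagined above (which does not exist in Ruby), and it only knows words up to "million". It is also a hint at how little code a targeted bot would need.

```ruby
# Hypothetical English-to-integer parser; handles units, tens, "hundred",
# "thousand" and "million", and ignores the word "and".
module EnglishNumber
  SMALL = %w[zero one two three four five six seven eight nine ten eleven twelve
             thirteen fourteen fifteen sixteen seventeen eighteen nineteen]
  TENS = %w[twenty thirty forty fifty sixty seventy eighty ninety]
  SCALES = { "thousand" => 1_000, "million" => 1_000_000 }.freeze

  # Word => value lookup table, e.g. "seven" => 7, "forty" => 40.
  VALUES = {}
  SMALL.each_with_index { |word, i| VALUES[word] = i }
  TENS.each_with_index { |word, i| VALUES[word] = (i + 2) * 10 }
  VALUES.freeze

  def self.parse(text)
    total = 0   # sum of completed thousand/million groups
    current = 0 # value of the group being read
    text.downcase.scan(/[a-z]+/).each do |word|
      if VALUES.key?(word)
        current += VALUES[word]
      elsif word == "hundred"
        current *= 100
      elsif SCALES.key?(word)
        total += current * SCALES[word]
        current = 0
      elsif word != "and"
        raise ArgumentError, "unknown number word: #{word}"
      end
    end
    total + current
  end
end

puts EnglishNumber.parse("eight-hundred thousand, twenty-three hundred fifteen")
```

With this approach the sample phrase reads as 800,000 plus 2,300 plus 15.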
 
Robbie Carlton

I like the maths question idea. Is there any set of enumerable problems that
would cause a human no difficulties, for which there isn't a general
solution algorithm?
I realise this is a maths question, but just wondered if anyone had any
thoughts.

 
Gavin Kistner

Robbie Carlton said:
I like the maths question idea. Is there any set of enumerable problems that
would cause a human no difficulties, for which there isn't a general
solution algorithm?
I realise this is a maths question, but just wondered if anyone had any
thoughts.

Well, taking that thought further, you could even keep simpletons (or
at least those bad at math) out of your site, by setting your
question difficulty to an appropriate level.

"Type the number that comes right before seventy-five."

"What is fifteen plus twelve minus six?"

"What is six x minus three q, if q is two and x three?"

"What is two squared, cubed?"

"Is the cosine of zero one or zero?"

"What trigonometric function of an angle of a right triangle yields
the ratio of the adjacent side's length divided by the hypotenuse?"

"What is the derivative of 2x^2, when x is 3?"

"What is the dot product of the vectors [4 7] and [3 4]?"

"What is the cross product of the vectors [4 7] and [3 4]?"

...and I'll stop there, before I embarrass myself trying to come up
with tougher questions.
 
Robbie Carlton

"Is the cosine of zero one or zero?"

"What trigonometric function of an angle of a right triangle yields
the ratio of the adjacent side's length divided by the hypotenuse?"

Careful with those trig questions. Soon no one will know what you're talking
about: http://physorg.com/news6555.html

 
Steven Lumos

Austin Ziegler said:
Two things. First, they don't work against the average spambot of 2005
or later. The average spambot of 2005 has gocr or something like that
built in. (I'm using spambot because that's what you're using; what
you're talking about is actually crawlers and registration bots.) The
problem with CAPTCHA systems is that something complex enough to
defeat a computer OCR system will be enough to lock out a significant
portion of your potential users. Second, a lot of people *have*
thought about it. I'm unimpressed with most solutions.

OCR isn't *that* easy. Humans--even young children--far exceed
machines in discerning even relatively clean machine-print characters.

The research at Lehigh is interesting.

http://www.cse.lehigh.edu/~baird/research_hips.html

Austin Ziegler said:
Basically, my advice is to forget CAPTCHA and go with double
verification. You can even provide multiple levels of user
accessibility, allowing immediate access but nothing that could be
construed as spam until they have verified their identity in some way
that is accessible.

-austin

I guess you're talking about email, but that is considerably less
difficult for a machine to pass than CAPTCHA. Verifying that some
thing that gave you an email address has the ability to view messages
sent to that address doesn't prove much.

Steve
 
Austin Ziegler

On 9/20/05, Steven Lumos said:
OCR isn't *that* easy. Humans--even young children--far exceed
machines in discerning even relatively clean machine-print characters.

Yes, I understand that. However, CAPTCHA is also proving to be
relatively ineffective and against accessibility standards. If you have
to follow US Federal 508 guidelines, you shouldn't use CAPTCHA. As noted
on the various discussions that I linked to, the large sites that
spawned CAPTCHA have now abandoned it.

[...]

Interesting, but I believe it will be ultimately fruitless. If I am
visually impaired but do not, for example, have audio attached to my
computer, then an audio CAPTCHA is just as limiting as a visual CAPTCHA.
Even the logic puzzle CAPTCHAs -- the most promising of CAPTCHAs -- are
often culturally or linguistically exclusive.

Steven Lumos said:
I guess you're talking about email, but that is considerably less
difficult for a machine to pass than CAPTCHA. Verifying that some
thing that gave you an email address has the ability to view messages
sent to that address doesn't prove much.

Not necessarily email. Google has solved this for GMail and Google Talk
with SMS, as the number of people who own computers and the number of
people who own cellphones has a high correspondence.

Other systems can solve it with multiple levels of privilege. If you
have a bulletin board, then someone who has signed up but not yet
verified might have command set X (maybe posting new messages to the
support forum once every four hours and replies to any forum once every
fifteen minutes). After they've verified, they might have the base
restrictions lifted and get command set X + Y (posting new messages
to any forum every thirty minutes, replies every five minutes). After
they've participated on the site for ten days continuously or thirty
days sporadically, they get full posting and reply privileges. Or maybe
they don't get PM capabilities until thirty days.
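
The levels described above could be modelled roughly as follows. This is only a sketch: the Tier struct, tier_for and may_post? are hypothetical names, the intervals are the ones from the example, and the PM restriction is left out.

```ruby
# Hypothetical privilege tiers; intervals are in seconds and mirror the
# numbers in the example above. An interval of 0 means no rate limit.
Tier = Struct.new(:name, :new_post_interval, :reply_interval, :any_forum)

TIERS = {
  unverified: Tier.new(:unverified, 4 * 3600, 15 * 60, false),
  verified:   Tier.new(:verified, 30 * 60, 5 * 60, true),
  trusted:    Tier.new(:trusted, 0, 0, true)
}.freeze

# Picks a tier from the verification flag and the participation history.
def tier_for(verified:, continuous_days:, total_days:)
  return TIERS[:unverified] unless verified
  return TIERS[:trusted] if continuous_days >= 10 || total_days >= 30
  TIERS[:verified]
end

# kind is :new_post or :reply; last_at is the time of the user's previous
# action of that kind, or nil if there was none.
def may_post?(tier, kind:, forum:, last_at:, now: Time.now)
  # Restricted tiers may only start new threads in the support forum.
  return false if kind == :new_post && !tier.any_forum && forum != :support
  interval = kind == :new_post ? tier.new_post_interval : tier.reply_interval
  last_at.nil? || now - last_at >= interval
end

newcomer = tier_for(verified: false, continuous_days: 0, total_days: 0)
puts may_post?(newcomer, kind: :new_post, forum: :support, last_at: nil)
```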

CAPTCHAs don't work nearly as well as people think and they're
inaccessible. There is a reason that Ruwiki will never support them.

-austin
 
James Edward Gray II

Austin Ziegler said:
Yes, I understand that. However, CAPTCHA is also proving to be
relatively ineffective and against accessibility standards. If you have
to follow US Federal 508 guidelines, you shouldn't use CAPTCHA. As noted
on the various discussions that I linked to, the large sites that
spawned CAPTCHA have now abandoned it.

I have good eyesight and they still annoy me. I can't believe adding
a user-hostile feature is a good idea. :)

Stay tuned for this week's Ruby Quiz though, which covers this very
topic.

James Edward Gray II
 
Austin Ziegler

James Edward Gray II said:
I have good eyesight and they still annoy me. I can't believe adding
a user-hostile feature is a good idea. :)

With my glasses or contact lenses, I have *great* vision, and I've got
better colour acuity than most people do (it's how I can still see
meaningful things without my glasses, actually). But you're right -- I
can't see that, either.

-austin
 
Devin Mullins

Steven Lumos said:
OCR isn't *that* easy. Humans--even young children--far exceed
machines in discerning even relatively clean machine-print characters.

Austin said:
Yes, I understand that. However, CAPTCHA is also proving to be
relatively ineffective and against accessibility standards. If you have
to follow US Federal 508 guidelines, you shouldn't use CAPTCHA. As noted
on the various discussions that I linked to, the large sites that
spawned CAPTCHA have now abandoned it.

That's interesting to me. I don't follow either subject much, but I'm
very interested in website accessibility. Is the Ticketmaster way of
providing either a visual or an aural CAPTCHA not sufficient?

Austin said:
Interesting, but I believe it will be ultimately fruitless. If I am
visually impaired but do not, for example, have audio attached to my
computer, then an audio CAPTCHA is just as limiting as a visual CAPTCHA.

Would a large-print CAPTCHA suffice in this case? People who are too
visually impaired to read large print would have to have audio, I'd assume.

Austin said:
CAPTCHAs don't work nearly as well as people think and they're
inaccessible. There is a reason that Ruwiki will never support them.

I do agree that they are a big pain in the butt to deal with, as a user.
Your multiple-level thing seems like the right approach. Yahoo Groups
offers the option for admins to moderate new users only -- after which
the admin can manually give you full posting rights.

Devin
 
Kirk Haines

Steven Lumos said:
OCR isn't *that* easy. Humans--even young children--far exceed
machines in discerning even relatively clean machine-print characters.

If your eyesight is normal. There are LOTS of people who do not have normal
eyesight.

<anecdote>
My wife holds a driver's license (and drives), is an EMT, and can pretty much
do anything anyone else does, but captchas are difficult for her because of
her eyesight. If a normal set of eyes is likened to a monitor running at
1280x1024, her eyes are like an 800x600 or 640x480 monitor.
It's still crisp and clear, but the resolution is lower. That alone makes
many styles of captcha difficult for her.
</anecdote>

For a large chunk of the population, captchas are quick and easy and fine, but
for a small but not insignificant chunk, they are difficult to impossible.
And even if a machine can only handle 10% of captchas, that is a high enough
percentage to render them ineffective, IMHO.


Kirk Haines
 
Devin Mullins

Kirk said:
And even if a machine can only handle 10% of captchas, that is a high enough
percentage to render them ineffective, IMHO.

Not if:
1. the processing time to attempt a captcha response is prohibitively large
2. the system forces a prohibitively large delay after an unsuccessful
captcha attempt.
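
Point 2 could look something like the sketch below, assuming a hypothetical in-memory AttemptThrottle keyed by client (an IP address, say) with a delay that doubles after each failure. The 5-second base and one-hour cap are made-up values, and a real site would need shared storage rather than a Hash.

```ruby
# Hypothetical throttle: each failed attempt doubles the wait before the
# same client may try again, up to max_delay seconds.
class AttemptThrottle
  def initialize(base_delay: 5, max_delay: 3600)
    @base_delay = base_delay
    @max_delay = max_delay
    @failures = Hash.new(0) # client => consecutive failures
    @blocked_until = {}     # client => Time when the next attempt is allowed
  end

  # True if the client is not currently serving a delay.
  def allowed?(client, now: Time.now)
    blocked = @blocked_until[client]
    blocked.nil? || now >= blocked
  end

  # Registers a failed attempt and returns the delay (in seconds) imposed.
  def record_failure(client, now: Time.now)
    @failures[client] += 1
    delay = [@base_delay * 2**(@failures[client] - 1), @max_delay].min
    @blocked_until[client] = now + delay
    delay
  end

  # A correct answer clears the client's history.
  def record_success(client)
    @failures.delete(client)
    @blocked_until.delete(client)
  end
end

throttle = AttemptThrottle.new
throttle.record_failure("203.0.113.7")
puts throttle.allowed?("203.0.113.7")
```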

But again, I don't like 'em either. Just setting facts straight. :)

Devin
 
Alan Chen

Robbie said:
But they stop the average spambot, which is what they're for I think.
The simplest accessible alternative would be email verification, but this
obviously slows the whole thing down.
Has anyone thought of an accessible alternative that can be embedded on the
page?

How about asking the user to respond by decoding a "spam speak" encoded
word or question? (On the plus side, if the scheme fails and you can
get access to the algorithm to decode it, you can use that as a spam
email filtering test... ;)
 
