Simple Regexp help

Joe Blow

How can I test a word to make sure it ONLY contains certain characters?

Say I have an expression like /[A-Z]/i

How could I have "Testing" pass but "Testing123" fail?

PS. I cannot dynamically create the expression so it looks like this
/[A-Z{10}]/
 
Tim Greer

Joe said:
How can I test a word to make sure it ONLY contains certain
characters?

Say i have an expression like /[A-Z]/i

How could i have "Testing" pass but "Testing123" fail?

PS. I can not dynamically create the expression so it looks like this
/[A-Z{10}]/

If you only want a-z, anchor the character class to the start and end of
the string; otherwise it'll match any string that contains a letter,
regardless of what else is in it. I.e., /^[a-z]+$/i will only
match a-z characters from start to end. You can then change the
character class to whatever you wish. ^ is the start of the string and
$ is the end of the string. + is one or more characters, so ^[a-z]+$ is
one or more characters in the character class [], being a-z, and nothing
else.
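
A minimal sketch of the difference, using the two strings from the question. The constant names UNANCHORED and ANCHORED are made up for illustration and are not part of the original suggestion:

```ruby
# Hypothetical names, for illustration only.
UNANCHORED = /[a-z]/i      # succeeds as soon as ANY letter is present
ANCHORED   = /^[a-z]+$/i   # the anchored pattern suggested above

["Testing", "Testing123"].each do |word|
  loose  = !(word =~ UNANCHORED).nil?
  strict = !(word =~ ANCHORED).nil?
  puts "#{word}: unanchored=#{loose} anchored=#{strict}"
end
```

Here "Testing123" passes the unanchored pattern because it contains letters, but fails the anchored one because of the digits.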
 
Siep Korteling

Tim said:
Joe said:
How can I test a word to make sure it ONLY contains certain
characters?

Say i have an expression like /[A-Z]/i

How could i have "Testing" pass but "Testing123" fail?

PS. I can not dynamically create the expression so it looks like this
/[A-Z{10}]/

If you only want a-z, set the character class to start and end the
string, otherwise it'll match anything with a character that's in the
alphabet, regardless of what follows it. (...)

You can also test for the opposite, anything not in the range a-z.

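class String
  # True when the string contains no character outside a-z (case-insensitive).
  # Note that the empty string also returns true.
  def all_letters?
    (self =~ /[^a-z]/i).nil?
  end
end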

puts "Testing".all_letters?
# => true
puts "Tes34ting".all_letters?
# => false
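
The same negated-class check can also be sketched without reopening String. The method name only_letters? is hypothetical, and String#match? assumes Ruby 2.4 or later:

```ruby
# Hypothetical helper: true when str has no character outside a-z / A-Z.
def only_letters?(str)
  !str.match?(/[^a-z]/i)
end

puts only_letters?("Testing")     # true
puts only_letters?("Tes34ting")   # false
puts only_letters?("")            # true: an empty string has no illegal characters
puts only_letters?("Testing\n")   # false: the newline is outside a-z
```

Because this looks for any single illegal character, it needs no anchors at all.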

hth,

Siep
 
Robert Klemme

Joe said:
How can I test a word to make sure it ONLY contains certain
characters?

Say i have an expression like /[A-Z]/i

How could i have "Testing" pass but "Testing123" fail?

PS. I can not dynamically create the expression so it looks like this
/[A-Z{10}]/

If you only want a-z, set the character class to start and end the
string, otherwise it'll match anything with a character that's in the
alphabet, regardless of what follows it. I.e., /^[a-z]+$/i will only
match a-z characters from start to end.

Note, this is not totally safe:

irb(main):001:0> s = "foo\n"
=> "foo\n"
irb(main):002:0> /^[a-z]+$/i =~ s
=> 0

As you can see, even a string with a newline at the end passes. This
version is safer because \A and \z really do anchor at the start and end
of the string:

irb(main):003:0> /\A[a-z]+\z/i =~ s
=> nil

You can then change the
character class to whatever you wish. ^ is the start of the string and
$ is the end of the string.

Actually, ^ matches at the start of a line and $ at the end of a line (see
above). To anchor at the start and end of the string, you need \A and \z.

+ is one or more characters, so ^[a-z]+$ is
one or more characters in the character class [], being a-z, and nothing
else.

Note also that, depending on your definition of a legal string, the
expression should probably use * instead of +, because the empty string
does not contain any illegal characters either.
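
A small sketch of that difference, assuming Ruby 2.4+ for Regexp#match?. The constant names PLUS and STAR are made up for illustration:

```ruby
# Hypothetical names, for illustration only.
PLUS = /\A[a-z]+\z/i   # at least one letter required
STAR = /\A[a-z]*\z/i   # zero or more letters, so "" passes

["", "Testing", "Testing123"].each do |s|
  puts "#{s.inspect}: plus=#{PLUS.match?(s)} star=#{STAR.match?(s)}"
end
```

Only the empty string is treated differently by the two patterns; which behaviour is right depends on whether an empty word counts as valid input.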

Kind regards

robert
 
Joe Blow

Robert, thanks. \A and \z are what I was looking for.

Can you explain what they do, though? I cannot seem to find any
relevant information on them.
 
Jesús Gabriel y Galán

Joe said:
Robert, thanks. \A and \z are what I was looking for.

Can you explain what they do, though? I cannot seem to find any
relevant information on them.

\A matches the beginning of the string and \z matches the end of the string.
If you don't "anchor" the regexp with those, the pattern can match
anywhere within the string, even just a substring. So for example:

irb(main):001:0> a = "abcdef"
=> "abcdef"
irb(main):005:0> a =~ /bc/
=> 1

This means that the regexp /bc/ has matched the string at position
one. Notice that if you anchor it, it no longer matches:

irb(main):006:0> a =~ /\Abc\z/
=> nil
irb(main):007:0> a =~ /\Aabcdef\z/
=> 0
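
To tie this back to the ^ and $ discussion earlier in the thread, here is a short sketch comparing the anchors on strings that contain newlines. The sample strings are made up; \Z (capital Z) is not mentioned above but is a related Ruby anchor that also matches just before a single trailing newline:

```ruby
s = "123\nabc\n456"

p s =~ /^[a-z]+$/       # => 4   (^ and $ match around the middle line)
p s =~ /\A[a-z]+\z/     # => nil (the whole string is not only letters)

t = "foo\n"
p t =~ /\A[a-z]+\z/     # => nil (\z is the very end of the string)
p t =~ /\A[a-z]+\Z/     # => 0   (\Z also allows one trailing newline)
```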

www.rubular.com has a regular expression editor, where you can test,
and also a quick reference guide and a link to an online copy of the
pickaxe, although to be honest, not much is said about \A and \z
specifically.

You can also search here for more info about regexps:

http://www.regular-expressions.info/

Hope this helps,

Jesus.
 
Tim Greer

Robert said:
irb(main):001:0> s = "foo\n"
=> "foo\n"
irb(main):002:0> /^[a-z]+$/i =~ s
=> 0

Not knowing exactly what the OP wanted, I only gave a quick example, so
what is "safe" is relative. Anyway, \A and \z are indeed probably the
better choice, even without knowing exactly what they want.
 
