Regular Expression help

Tony De

On to my next learning exercise. As I parse a file I need to pull an IP
address out of a line. Now I thought a regular expression would be the
ticket, but it's giving me a problem. The following line is an example
string I need to pull one of two IP addresses out of (they are not
always formed the same):

Received: from mmds-111-19-22-30.twm.ca.internet.net (HELO
?192.168.1.2?) (222.222.222.22)

I need that last IP address. Now the problem is that I can't always
count on it being enclosed in parens, although I can expect the right
paren to always be there.

So here's my regex:
sourceip = line.scan(/\b(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\b/)
And as you might expect, it is pulling both IP addresses. Is there a
way I can adjust the expression to grab the second IP testing for the
")" or is there another method I can use? Short of dissecting the
entire string backwards and testing whether I have a number or a char,
decimal and at most 3 chars from it, etc?

tonyd
 
David A. Black

Hi --

What you want is an IP address, possibly followed by ')' and
definitely coming at the end of the string (give or take a newline
character after it). That can be expressed like this:

/((\d{1,3}\.){3}\d{1,3})(?=\)?\Z)/

I've got 3 occurrences of (\d{1,3}\.), followed by the same thing
without a dot. I've stipulated that this submatch be "looking at"
(i.e., positioned just before) an optional ')' followed by the end of
the string. (\Z gives you end of string, ignoring a possible terminal
newline.)
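To make the role of \Z concrete, here is a small sketch comparing it with \z, which only matches at the very end of the string and does not tolerate a trailing newline. The sample strings and the variable names (ip, upper_z, lower_z) are made up for illustration and are not from the original log file.

```ruby
# Illustrative strings only; they are not taken from the original log.
with_newline    = "(222.222.222.22)\n"
without_newline = "(222.222.222.22)"

# Dotted-quad sub-pattern, reused in both regexes below.
ip = /(?:\d{1,3}\.){3}\d{1,3}/

# \Z matches at the end of the string, or just before a final newline.
upper_z = /#{ip}(?=\)?\Z)/
# \z matches only at the very end of the string.
lower_z = /#{ip}(?=\)?\z)/

upper_with    = with_newline[upper_z]     # "222.222.222.22"
upper_without = without_newline[upper_z]  # "222.222.222.22"
lower_with    = with_newline[lower_z]     # nil, the "\n" gets in the way
lower_without = without_newline[lower_z]  # "222.222.222.22"
```

Because the ')' sits inside the lookahead, it is checked but never becomes part of the match itself.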

With your line, it gives you:

irb(main):039:0> line[re] # re.match(line)[0], or whatever
=> "222.222.222.22"
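
For use while parsing a file, the pattern above could be wrapped in a small method. This is only a sketch: trailing_ip is a hypothetical name, and the header string assumes the wrapped sample from the question is really one logical line joined by a single space.

```ruby
# Hypothetical helper: returns the IP at the end of the line, or nil.
# The second argument to String#[] selects capture group 1.
def trailing_ip(line)
  line[/((\d{1,3}\.){3}\d{1,3})(?=\)?\Z)/, 1]
end

# Assumption: the sample header is one line; it is split here only to
# keep the source narrow.
header = "Received: from mmds-111-19-22-30.twm.ca.internet.net (HELO " \
         "?192.168.1.2?) (222.222.222.22)"

puts trailing_ip(header)                     # 222.222.222.22
puts trailing_ip("from host 10.0.0.1)\n")    # 10.0.0.1
p    trailing_ip("10.0.0.1 then more text")  # nil
```

Returning nil for a line with no trailing address lets the caller skip such lines instead of handling an exception.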


David

 
Tony De

David, you rock. I'll give it a try. Those expressions make my head
hurt. But I've been taking in
http://www.regular-expressions.info/tutorial.html. It seems to cover a
lot of foundation and application. Thanks again!

tonyd
 
Jesús Gabriel y Galán

Another possibility (if I understood correctly): check for the
mandatory ')' in the HELO part, followed by any characters (this could
be tightened to match the specific spaces), followed by the numbers and
dots for the IP:

a = "Received: from mmds-111-19-22-30.twm.ca.internet.net (HELO
?192.168.1.2?) (222.222.222.22)"
a.match(/\).*?(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/)[1]

gives: "222.222.222.22"
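
One thing to watch with this form is that match returns nil when the line has no ')' before an address, so calling [1] on the result would raise. Below is a rough sketch with a nil guard. The method names ip_after_paren and last_ip and the sample lines are made up for illustration, and last_ip is a different technique that was not proposed in this thread (scan for every dotted quad and keep the last one).

```ruby
# Hypothetical helper around the pattern above; returns nil instead of
# raising when the line does not match.
def ip_after_paren(line)
  m = line.match(/\).*?(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/)
  m && m[1]
end

# Alternative, not from this thread: collect every dotted quad and keep
# the last one. The group is non-capturing so scan returns plain strings.
def last_ip(line)
  line.scan(/\b(?:\d{1,3}\.){3}\d{1,3}\b/).last
end

# Made-up sample lines for illustration.
in_parens = "from mail.example (HELO ?192.168.1.2?) (222.222.222.22)"
bare      = "from mail.example (HELO ?192.168.1.2?) 222.222.222.22"
no_paren  = "from 222.222.222.22"

[in_parens, bare, no_paren].each do |line|
  p [ip_after_paren(line), last_ip(line)]
end
```

Both helpers agree on the first two lines. On the third, which has no ')' at all, ip_after_paren returns nil while last_ip still finds the address.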

Jesus.
 
Tony De

Thanks Jesus,

I appreciate your input as well. You guys have been a great deal of
help.

tonyd
 
