string search and modification

Jim Britain

I know absolutely nothing about Python. My background is shell
scripts, assembly language, and C programming. Currently I work in
network support.

This is a portion of a Python script written by aaronsinclair. The
full script can be found at:
http://forums.ev1servers.net/printthread.php?t=50435&page=3&pp=25

It monitors sendmail logfiles for dictionary attacks, and blocks with
additions to iptables.



The part I am having a problem with is the regular expression in the
re.search function.

Basically, it is insufficiently qualified.

Troublesome example logfile line:
Sep 6 00:46:32 tabor sendmail[26642]: k867kMH5026642:
dsl-kk-dynamic-013.38.22.125.touchtelindia.net [125.22.38.13] (may be
forged)
Possible SMTP RCPT flood, throttling.

(all one line in the logfile)

What is happening is that two sections of this logfile line qualify,
and the search matches on the wrong one.
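
The mismatch can be reproduced outside the running script. The following is only a sketch, assuming the sample line above is rejoined into one string; the variable names (line, loose, first, every) are illustrative and do not come from the original script.

```python
import re

# The log line from the post, rejoined into a single string.
line = ("Sep 6 00:46:32 tabor sendmail[26642]: k867kMH5026642: "
        "dsl-kk-dynamic-013.38.22.125.touchtelindia.net [125.22.38.13] "
        "(may be forged) Possible SMTP RCPT flood, throttling.")

# The unanchored pattern used by the script.
loose = r"[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}"

# re.search returns the leftmost match. Here that is the dotted run
# inside the reverse-DNS hostname, not the bracketed address.
first = re.search(loose, line).group()

# re.findall shows both candidates, in the order they appear.
every = re.findall(loose, line)

print(first)
print(every)
```

Because re.search stops at the leftmost match, the hostname fragment wins whenever the reverse DNS name embeds dotted digits ahead of the bracketed address.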

What I would like is for every successful match to return the value
from within the brackets.

I have tried putting \[ at the beginning of the pattern, but have been
unsuccessful at editing the qualifying character back out again and
returning the real IP string (if indeed I even got a match).

This script runs in the background, so to debug it I would have to
build a complete test environment and rewrite the whole darn thing to
run visibly and use different files.

So I thought I would ask, like a beginner, for the trivial solution
(besides, I have been up all night and all day).

Thanks in advance for any help. Quick searches of online tutorial
documentation, and of the books I have, turned up nothing that led to
a solution.

I would like to match [123.123.123.123] (including the qualifying
brackets), but be able to simply return the contents, without the
brackets.

(Perl would be easy, but it's not Python)

def identifyHost(self):

    for line in self.fileContents:
        if re.search("throttling", line.lower()):
            ip = re.search("[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}",
                           line)

            if ip.group() in self.ignoreList:
                continue
            if not ip.group() in self.banList:
                self.banList.append(ip.group())
 
Paul Rubin

Jim Britain said:
I would like to match [123.123.123.123] (including the qualifying
brackets), but be able to simply return the contents, without the
brackets.
>>> p=r'\[((\d{1,3}\.){3}\d{1,3})\]'
>>> m = 'dsl-kk-dynamic-013.38.22.125.touchtelindia.net [125.22.38.13] (may be'
>>> g=re.search(p,m)
>>> g.group(1)
'125.22.38.13'

g.group(1) matches the stuff in the first set of parens, which excludes
the square brackets.
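
For anyone puzzled by the group numbering, here is a small sketch of what each group of that pattern holds for the sample string. The second half uses a non-capturing (?:...) group, which is a variation of mine and not part of the pattern as posted; the variable names are illustrative.

```python
import re

text = 'dsl-kk-dynamic-013.38.22.125.touchtelindia.net [125.22.38.13] (may be'

# The pattern as posted: group 1 is the whole address, group 2 is the
# repeated "digits plus dot" piece.
capturing = re.search(r'\[((\d{1,3}\.){3}\d{1,3})\]', text)
whole = capturing.group(0)        # full match, brackets included
address = capturing.group(1)      # contents of the outer parentheses
last_repeat = capturing.group(2)  # last repetition of the inner group

# Variation: (?:...) groups without capturing, so the address is the
# only group reported.
non_capturing = re.search(r'\[((?:\d{1,3}\.){3}\d{1,3})\]', text)
groups = non_capturing.groups()

print(whole, address, last_repeat, groups)
```

Either form returns the same address from group(1); the non-capturing form just avoids the extra group 2.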
 
Jim Britain


Final integration:

def identifyHost(self):
    for line in self.fileContents:
        if re.search("throttling", line.lower()):
            p=r'\[((\d{1,3}\.){3}\d{1,3})\]'
            ip=re.search(p,line)
            if ip.group(1) in self.ignoreList:
                continue
            if not ip.group(1) in self.banList:
                self.banList.append(ip.group(1))


Thanks for the help.
Jim
--
 
John Machin

Jim said:
Final integration:

def identifyHost(self):
    for line in self.fileContents:
        if re.search("throttling", line.lower()):
            p=r'\[((\d{1,3}\.){3}\d{1,3})\]'
            ip=re.search(p,line)

A prudent pessimist might test for the complete absence of an IP
address:
            if not ip:
                print("Huh?") # or whatever
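
Putting the pieces together, here is a hedged sketch of the loop with that guard in place. It is written as a standalone function so it can be tried without the background script: find_throttled_hosts, BRACKETED_IP, and the parameter names are hypothetical stand-ins for the method and the self.fileContents, self.ignoreList, and self.banList attributes, and skipping a line with no address is just one possible choice of "whatever".

```python
import re

# Bracketed dotted quad; group 1 is the address without the brackets.
BRACKETED_IP = re.compile(r'\[((?:\d{1,3}\.){3}\d{1,3})\]')

def find_throttled_hosts(lines, ignore_list, ban_list):
    """Append newly seen bracketed IPs from throttling lines to ban_list."""
    for line in lines:
        if "throttling" not in line.lower():
            continue
        match = BRACKETED_IP.search(line)
        if match is None:
            # No bracketed address on this line; skip it rather than
            # crash on match.group().
            continue
        ip = match.group(1)
        if ip in ignore_list or ip in ban_list:
            continue
        ban_list.append(ip)
    return ban_list

sample = [
    "relay [125.22.38.13] Possible SMTP RCPT flood, throttling.",
    "no address on this line, throttling.",
    "relay [10.0.0.1] Possible SMTP RCPT flood, throttling.",
    "relay [125.22.38.13] Possible SMTP RCPT flood, throttling.",
    "relay [1.2.3.4] accepted",
]
banned = find_throttled_hosts(sample, ["10.0.0.1"], [])
print(banned)
```

Without the None check, the second sample line would raise AttributeError, since re.search returns None when nothing matches.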
 
