Hello,
I want to parse a log file containing many lines in the same format.
My log files are about 50 MB each (350k lines), so I need something
quite fast. The fastest solution I have come up with so far uses
StringScanner.
I save what I get into variables and then pass them all into a Struct
I created. Each new struct is then appended to an Array that holds all
the structs.
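A minimal sketch of the Struct-plus-Array arrangement described above. The name `LogEntry` is hypothetical, and the field names are assumed from the variable names used in the test code; the values are what one would expect for the sample log line.

```ruby
# Hypothetical struct; field names mirror the variables in the test code.
LogEntry = Struct.new(:time, :rule_no, :stat, :interface,
                      :out_ip, :out_port, :in_ip, :in_port, :proto)

entries = []  # holds one LogEntry per parsed line

# Values as they would be extracted from the sample line.
entries << LogEntry.new("1140908573.050732", "19", "pass", "sis1:",
                        "80.202.226.15", "50000",
                        "192.168.0.6", "52525", "UDP")

puts entries.first.out_ip  # prints 80.202.226.15
```

With 350k lines per file, this creates one Struct instance per line, so the Array grows to the same number of entries.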
Here's my test code:
require 'strscan'
a = "1140908573.050732 rule 19/0(match): pass unkn(255) on sis1: " \
    "80.202.226.15.50000 > 192.168.0.6.52525: UDP, length 64"
s = StringScanner.new(a)
time = s.scan(/\d+\.\d+/)
s.pos += 6                       # skip " rule "
rule_no = s.scan(/\d+/)
s.skip(/[\d\D]*?\s/)             # skip up to and including the next space
stat = s.scan(/\w+/)
s.skip(/.*on\s/)                 # skip everything through "on "
interface = s.scan(/\w+\:/)
s.skip(/\s+/)                    # skip the whitespace before the source address
out_ip = s.scan(/(\d+\.){3}\d{0,3}/)
s.pos += 1                       # skip the "." between address and port
out_port = s.scan(/\d+/)
s.skip(/\D+/)                    # skip " > "
in_ip = s.scan(/(\d+\.){3}\d{0,3}/)
s.pos += 1                       # skip the "." between address and port
in_port = s.scan(/\d+/)
s.pos += 2                       # skip ": "
proto = s.scan(/\w+/)
proto
s.pos += 1                       # skip ","
Running that in a loop 10,000 times takes about 0.6 seconds.
Is there a better or faster way of doing this?
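A rough sketch of how the 10,000-iteration timing could be reproduced with Ruby's standard Benchmark module, alongside one possible alternative: a single Regexp with capture groups. The method names `parse_with_scanner` and `parse_with_regexp` are hypothetical, the scanner skips are rewritten as patterns rather than fixed offsets (an assumption, not the code from the post), and no claim is made here about which variant is faster.

```ruby
require 'strscan'
require 'benchmark'

# StringScanner variant; skips use patterns instead of fixed offsets (assumption).
def parse_with_scanner(line)
  s = StringScanner.new(line)
  time = s.scan(/\d+\.\d+/)
  s.skip(/ rule /)
  rule_no = s.scan(/\d+/)
  s.skip(/\S*\s/)                      # rest of the rule field, e.g. "/0(match): "
  stat = s.scan(/\w+/)
  s.skip(/.*? on /)
  interface = s.scan(/\w+:/)
  s.skip(/\s+/)
  out_ip = s.scan(/(?:\d+\.){3}\d+/)
  s.skip(/\./)                         # "." between address and port
  out_port = s.scan(/\d+/)
  s.skip(/ > /)
  in_ip = s.scan(/(?:\d+\.){3}\d+/)
  s.skip(/\./)
  in_port = s.scan(/\d+/)
  s.skip(/: /)
  proto = s.scan(/\w+/)
  [time, rule_no, stat, interface, out_ip, out_port, in_ip, in_port, proto]
end

# Single-Regexp variant (hypothetical alternative); returns nil on no match.
def parse_with_regexp(line)
  m = /\A(\d+\.\d+) rule (\d+)\S* (\w+) .*? on (\w+:) ((?:\d+\.){3}\d+)\.(\d+) > ((?:\d+\.){3}\d+)\.(\d+): (\w+)/.match(line)
  m && m.captures
end

line = "1140908573.050732 rule 19/0(match): pass unkn(255) on sis1: " \
       "80.202.226.15.50000 > 192.168.0.6.52525: UDP, length 64"
n = 10_000

scanner_time = Benchmark.realtime { n.times { parse_with_scanner(line) } }
regexp_time  = Benchmark.realtime { n.times { parse_with_regexp(line) } }

puts format("StringScanner: %.3fs, single Regexp: %.3fs", scanner_time, regexp_time)
```

Both methods return the nine fields in the same order, so either result can be splatted into a Struct constructor. Timings will vary by machine and Ruby version.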
Regards,
Ricardo.