Hash counting

Stuart Clarke

I am trying to load some data into a hash and then count how many times
each value occurs. If a value occurs 5 or more times, I add some data to
an array. Below is my code, which I will explain:

eventdateID.push eventsbydate.gsub(/\s/, '')[0..7] +
eventsbydate[26..30]
counts = Hash.new(0)
if eventdateID.find {|d| (counts[d] +=1) >= 5}
@alerts.push("#{event.event_id} #{@tab} #{event.time_written}
#{@tab}#{event.event_type} #{@tab} #{type}")
end

The first line takes a time and date value, uses gsub and slicing to turn
it into an ID value, and pushes that ID onto an array. We then process
the array, and if an entry (a date/time ID) occurs 5 or more times we add
some data to another array.
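
To make the ID construction concrete, here is a minimal sketch. It assumes the time string is laid out like the sample below; the real format of event.time_written is not shown in this thread, so the sample string is hypothetical. The strftime line is an alternative (not the code from the post) that does not depend on character positions.

```ruby
# Hypothetical sample; the actual format of event.time_written is not shown.
eventsbydate = "Mon Feb 02 10:15:30 +0000 2009"

# Same slicing as in the post: the first 8 characters of the
# whitespace-stripped string, plus characters 26..30 of the original.
id = eventsbydate.gsub(/\s/, '')[0..7] + eventsbydate[26..30]
puts id                          # MonFeb022009

# Alternative that does not rely on fixed offsets, assuming a Time object.
t = Time.utc(2009, 2, 2, 10, 15, 30)
puts t.strftime("%a%b%d%Y")      # MonFeb022009
```

If the time string has a different layout (for example a longer time zone name), the fixed offsets pick up different characters, which is one reason to prefer strftime.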

My testing with this code is not picking up on any such occurrences,
which I know exist; see below:

MonFeb022009
MonFeb022009
MonFeb022009
MonFeb022009
MonFeb022009
MonFeb022009
MonFeb022009
MonFeb022009
MonFeb022009

Does anyone have any ideas why my code is not working?

I do not get errors, it just does not return any data.

Thanks in advance
 
Brian Candler

STDERR.puts is your friend.
eventdateID.push eventsbydate.gsub(/\s/, '')[0..7] +
eventsbydate[26..30]

STDERR.puts "A: #{eventdateID.inspect}"
counts = Hash.new(0)
if eventdateID.find {|d| (counts[d] +=1) >= 5}
@alerts.push("#{event.event_id} #{@tab} #{event.time_written}
#{@tab}#{event.event_type} #{@tab} #{type}")
end

STDERR.puts "B: #{@alerts.inspect}"

Then you can see if the data is what you expect before you go into the
loop.

Note that 'find' will abort after one successful match. Is that what you
want?
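
A small sketch of what "abort after one successful match" means in practice. The sample IDs are made up for illustration:

```ruby
ids = %w[A A A A A A B B]   # hypothetical sample IDs

counts = Hash.new(0)
# find stops iterating as soon as the block returns a truthy value.
hit = ids.find { |d| (counts[d] += 1) >= 5 }

puts hit             # A
puts counts["A"]     # 5 (the sixth "A" was never visited)
puts counts.key?("B") # false (iteration stopped before reaching "B")
```

So after find returns, counts is only a partial tally and should not be used as a histogram.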
 
Ilan Berci

Stuart,

I believe this will get you closer to what you want..

[1,2,2,3,3,3].inject({}) do |hash, val|
hash[val] ||= 0
hash[val] +=1
hash
end

=> {1=>1,2=>2,3=>3}

hth

ilan


 
Robert Klemme

You probably rather want

eventdateID.push eventsbydate.gsub(/\s/, '')[0..7] +
eventsbydate[26..30]
counts = Hash.new(0)
eventdateID.each {|d| counts[d] +=1}
counts.each do |d,cnt|
@alerts.push("#{event.event_id} #{@tab} #{event.time_written}
#{@tab}#{event.event_type} #{@tab} #{type}") if cnt >= 5
end
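
A self-contained sketch of the same count-then-filter idea, runnable without the log files. The sample IDs and the alert text are made up; the real code would push the event details instead:

```ruby
# Hypothetical sample IDs standing in for eventdateID.
event_date_ids = %w[MonFeb022009] * 7 + %w[TueAug052008] * 2 + %w[WedAug062008]

# First pass: build the complete histogram.
counts = Hash.new(0)
event_date_ids.each { |d| counts[d] += 1 }

# Second pass: report only the IDs seen at least 5 times.
alerts = []
counts.each do |id, cnt|
  alerts.push("#{id} occurred #{cnt} times") if cnt >= 5
end

puts alerts   # MonFeb022009 occurred 7 times
```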

Cheers

robert
 
Stuart Clarke

Thanks Robert.

This is closer to what I need. However, I am still getting no result;
everything works until we get to this section:
counts = Hash.new(0)
eventdateID.each {|d| counts[d] +=1}
counts.each do |d,cnt|
@alerts.push("#{event.event_id} #{@tab} #{event.time_written}
#{@tab}#{event.event_type} #{@tab} #{type}") if cnt >= 5
end

Does it make any difference that the data being read into
eventdateID is alphanumeric, e.g.:

MonFeb022009
MonFeb022009
MonFeb022009

Many thanks.


 
Stuart Clarke

I have worked out the problem but I am a little unsure how to solve it.

We have counts which holds all of the event IDs, however |d, cnt| is
not counting the number of matching ID numbers and it just assigns each
ID the number 1.

So given this example, we would expect cnt to find the ID of ?? as
occurring more than 5 times and ignore the rest:

MonFeb022009
MonFeb022009
MonFeb022009
MonFeb022009
MonFeb022009
MonFeb022009
MonFeb022009
TueAug052008
TueAug052008
WedAug062008

However, instead cnt is just placing the number 1 for each ID, for example:

1
1
1
1
1
1
1
1
1
1

Can anyone help me with a fix? Many thanks

 
Robert Klemme

2009/2/3 Stuart Clarke said:
I have worked out the problem but I am a little unsure how to solve it.

We have counts which holds all of the event ID's, however |d, cnt| is
not counting the number of matching ID numbers and it just assigns each
ID the number 1.

What does that mean? What's in the Hash?

So given this example, we would expect cnt to find the ID of ?? as
occurring more than 5 times and ignore the rest:

MonFeb022009
MonFeb022009
MonFeb022009
MonFeb022009
MonFeb022009
MonFeb022009
MonFeb022009
TueAug052008
TueAug052008
WedAug062008

However, instead cnt is just placing the number 1 for each ID, for example:

1
1
1
1
1
1
1
1
1
1

Can anyone help me with a fix? Many thanks

Frankly, you lost me there. Please do this:

require 'pp'

File.open('/tmp/log', 'w') {|io| io.write(counts.pretty_inspect)}

And look at the output and / or post it here.
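
For anyone who wants to try this without the original program, a minimal sketch. The sample data and the log file name are hypothetical, and Dir.tmpdir is used here instead of a hard-coded /tmp so it also runs on Windows:

```ruby
require 'pp'
require 'tmpdir'

counts = Hash.new(0)
%w[MonFeb022009 MonFeb022009 WedAug062008].each { |d| counts[d] += 1 }

# Write a readable dump of the hash to a log file in the temp directory.
log_path = File.join(Dir.tmpdir, 'counts.log')   # hypothetical file name
File.open(log_path, 'w') { |io| io.write(counts.pretty_inspect) }

# Print the dump so it can be pasted into a reply.
puts File.read(log_path)
```

The dump shows the keys and their counts together, which is more useful for diagnosis than puts on the hash.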

robert
 
Stuart Clarke

Thanks for replying and sorry for the confusion.

My hash (counts) contains date and time IDs like this: TueAug052008

When I do a puts on counts I get a list of these as per their date and
time values, which is what I want. However, counting to see if there are
more than 5 occurrences of one of the ID values fails and doesn't find
anything in my data set.

I have done as you asked and the output is as follows:

{"WedAug062008"=>1}

This suggests there is a problem. Just for your information doing an
output on counts (the hash) gives this:

MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
WedAug062008
WedAug062008

Thanks for your help
 
Robert Klemme

2009/2/3 Stuart Clarke said:
Thanks for replying and sorry for the confusion.

My hash (counts) contains date and time ID's like so TueAug052008

Obviously not, as the output below demonstrates that there is just a
single entry in the Hash.

When I do a puts on counts I get a list of these as per their date and
time values which is what I want. However counting to see if there are
more than 5 occurrences of one of the ID values fails and doesn't find
anything in my data set.

I have done as you asked and the output is as follows:

{"WedAug062008"=>1}

Looks like there is a lot missing.

This suggests there is a problem. Just for your information doing an
output on counts (the hash) gives this:

What does "doing an output" mean? Please be more specific (e.g. by
posting complete code, ideally a test case that someone else can
execute), otherwise nobody can help you.

MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
WedAug062008
WedAug062008

Cheers

robert
 
Stuart Clarke

OK, I will get straight to the code causing the problem. First off, you
need to know that 'eventdateID' is an array full of values taken from log
files. A sample of the values in this array:
MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
WedAug062008
WedAug062008


Then I have the following code:
counts = Hash.new(0)
eventdateID.each {|d| counts[d] +=1}
counts.each do |d,cnt|
@alerts.push("#{event.event_id} #{@tab} #{event.time_written}
#{@tab}#{event.event_type} #{@tab} #{type}") if cnt >= 5
end


The @alerts.push data is again specific to the logs I am parsing.
Basically each record in the log is given an ID number based on the time
and date values which goes into eventdateID. The purpose of the code
above is to check if any of the ID numbers occur more than 5 times in
eventdateID.


counts = Hash.new(0) - empty hash called counts
eventdateID.each {|d| counts[d] +=1} - process each ID value in
eventdateID and load into the hash counts
counts.each do |d,cnt| - process counts and see how many of each ID
value exist
@alerts.push ............. if cnt >=5 - If there are more than 5 of an
ID push some of the log data to an array which matches the eventdateID


I have done some checking

eventdateID.each {|d| counts[d] +=1}
@alerts.push

gives

MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
WedAug062008
WedAug062008

At this stage we are on the right lines: we have the hash counts with
some date IDs in it.

Another test was:

eventdateID.each {|d| counts[d] +=1}
counts.each do |d,cnt|
@alerts.push cnt

This gives

1
1
1
1
1
1
1

This is where the problem is. I want it to identify that:

MonFeb0220091 occurs 6 times in the counts hash
WedAug062008 occurs twice in the counts hash

As a result of this I am expecting my code to output the log data to the
@alerts array based on the eventdateID MonFeb0220091, as it occurs more
than 5 times. Below is my code again to summarise. The restriction is
that you do not have the logs, but I can assure you the data in
eventdateID are values like this: MonFeb0220091.

Code block:

eventdateID.push eventsbydate.gsub(/\s/, '')[0..7] +
eventsbydate[26..30]
counts = Hash.new(0)
eventdateID.each {|d| counts[d] +=1}
counts.each do |d,cnt|
@alerts.push("#{event.event_id} #{@tab} #{event.time_written}
#{@tab}#{event.event_type} #{@tab} #{type}") if cnt >= 5
end


Thanks again.
 
Jesús Gabriel y Galán

Ok I will get straight to the code causing the problem, so first off you
need to know that 'eventdateID' is an array full of values taken from log
files. A sample of the values in this array are:
MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
WedAug062008
WedAug062008


Then I have the following code:
counts = Hash.new(0)
eventdateID.each {|d| counts[d] +=1}
counts.each do |d,cnt|
@alerts.push("#{event.event_id} #{@tab} #{event.time_written}
#{@tab}#{event.event_type} #{@tab} #{type}") if cnt >= 5
end
Another test was:

eventdateID.each {|d| counts[d] +=1}
counts.each do |d,cnt|
@alerts.push cnt

This gives

1
1
1
1
1
1
1

Sorry Stuart, can you show the exact code that produces that output
(including the puts that you are using to print those values)? I ask
because this works for me as it is:


irb(main):009:0> eventDateID = %w{MonFeb0220091 MonFeb0220091
MonFeb0220091 MonFeb0220091 MonFeb0220091 MonFeb0220091 WedAug062008
WedAug062008}
=> ["MonFeb0220091", "MonFeb0220091", "MonFeb0220091",
"MonFeb0220091", "MonFeb0220091", "MonFeb0220091", "WedAug062008",
"WedAug062008"]
irb(main):010:0> counts = Hash.new(0)
=> {}
irb(main):011:0> eventDateID.each {|d| counts[d] += 1}
=> ["MonFeb0220091", "MonFeb0220091", "MonFeb0220091",
"MonFeb0220091", "MonFeb0220091", "MonFeb0220091", "WedAug062008",
"WedAug062008"]
irb(main):012:0> counts
=> {"MonFeb0220091"=>6, "WedAug062008"=>2}
irb(main):013:0> @alerts = []
=> []
irb(main):014:0> counts.each do |id, cnt|
irb(main):015:1* @alerts.push(id) if cnt >= 5
irb(main):016:1> end
=> {"MonFeb0220091"=>6, "WedAug062008"=>2}
irb(main):017:0> @alerts
=> ["MonFeb0220091"]


If each element in the array eventDateID is stored in the hash as a
different key (which is what seems to be happening), maybe what is
inside the array are not strings, but another class that has a
different implementation of eql?.
Can you inspect the eventDateID array to check that?
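
To illustrate the hypothesis, a minimal sketch. RawId is a made-up wrapper class, not anything from the event log library; it only shows how objects that do not define eql? and hash end up as separate keys even when their text is identical:

```ruby
# Hypothetical wrapper class that does not define eql? or hash,
# so two instances with the same text are different hash keys.
class RawId
  attr_reader :text

  def initialize(text)
    @text = text
  end
end

counts = Hash.new(0)
[RawId.new("MonFeb022009"), RawId.new("MonFeb022009")].each { |d| counts[d] += 1 }
p counts.values    # [1, 1]: two separate keys

# Counting by the String value restores value-based comparison.
by_text = Hash.new(0)
[RawId.new("MonFeb022009"), RawId.new("MonFeb022009")].each { |d| by_text[d.text] += 1 }
p by_text.values   # [2]
```

Running eventDateID.map { |d| d.class } on the real array would show quickly whether this applies.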

Jesus.
 
Stuart Clarke

Thanks for getting back to me.

I did something similar to you in Fxri earlier and got the same results.
It seems you may be correct, and eventdateID and id, cnt do not like
each other so much. After doing an inspect on the eventdateID array I
only get the following:

["WedAug062008"]

This is strange, as it seems to be missing all the other data. For your
information, in my actual code I do @alerts.push(counts)

counts = Hash.new(0)
eventdateID.each {|d| counts[d] += 1}
@alerts.push(counts)
counts.each do |id,cnt|

and get what is expected:

MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
WedAug062008
WedAug062008

It's the next step, counts.each do |id,cnt|, which is the problem.

If each element in the array eventDateID is stored in the hash as a
different key (which is what seems to be happening), maybe what is
inside the array are not strings, but another class that has a
different implementation of eql?.

Not sure what you mean by this. This is how I make eventdateID, just
regular expressions on a string from a struct:

eventdateID.push eventsbydate.gsub(/\s/, '')[0..7] +
eventsbydate[26..30]

Many thanks

 
Jesús Gabriel y Galán

Thanks for getting back to me.
and get what is expected:

MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
WedAug062008
WedAug062008

It's the next step, counts.each do |id,cnt|, which is the problem.

Sorry, but can you post a complete executable piece of code we can use
to reproduce the problem?
You have this:

eventdateID.push eventsbydate.gsub(/\s/, '')[0..7] + eventsbydate[26..30]

but what is eventdateID? Maybe you have an earlier line of code like
eventdateID = [].
I'd like to see the complete picture. Also, what is eventsbydate?
By the way, now I'm realizing that eventsbydate might be a string, so
how can eventdateID contain more than 1 entry at all?
If that's true, then

eventsbydate.gsub(/\s/, '')[0..7] + eventsbydate[26..30]

is also a string. So you are pushing a single string into eventdateID,
so when you later iterate you only get one iteration. Perhaps you have
a loop around the piece of code you showed? If that's the case, then
it makes sense that you never get more than 1 count per entry, because
you are creating the hash every time. So, I think it would be easier
if you pasted the complete program.
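
A minimal sketch of that suspicion, using made-up IDs instead of the real log data. The first loop mimics code that rebuilds the array and the hash for every event; the second creates the hash once:

```ruby
ids = %w[MonFeb022009 MonFeb022009 MonFeb022009 WedAug062008]  # hypothetical

# Buggy shape: the array and the hash are rebuilt for every event,
# so no count can ever exceed 1.
seen = []
ids.each do |id|
  event_date_ids = []
  event_date_ids.push(id)
  counts = Hash.new(0)
  event_date_ids.each { |d| counts[d] += 1 }
  seen << counts[id]
end
p seen                       # [1, 1, 1, 1]

# Fixed shape: create the hash once, before the loop.
totals = Hash.new(0)
ids.each { |id| totals[id] += 1 }
p totals["MonFeb022009"]     # 3
```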

Not sure what you mean by this.

It was another hypothesis, but I think you can forget about it, since
I'm pretty sure now that with the piece of code you showed you are
only ever pushing one string into eventdateID.

Jesus.
 
Stuart Clarke

Thanks for your response. It makes a lot more sense and you are on the
right lines I think. There is other code around this but it does not
bear much relevance:

def scanEVTWithSource(file, source)
@alerts = []
@evtLogArray = []
begin
#read the contents of the event logs files
evtLog = EventLog.open_backup(file, source)

#put data into an array
@evtLogArray = evtLog.read.sort { |a, b| (a.event_id <=>
b.event_id).nonzero? || (a.time_written <=> b.time_written)}

#event log data collected
evtLog.close

if evtLogArray.length == 0
return
end

#failed logons where more than 10 have occurred in a day
if event.event_id == 529
eventdateID = []
#assign all time written values to the eventsbydate array
eventsbydate = "#{event.time_written}"
eventdateID.push eventsbydate.gsub(/\s/, '')[0..7] +
eventsbydate[26..30]
counts = Hash.new(0)
eventdateID.each {|d| counts[d] += 1}
counts.each do |id,cnt|
@alerts.push("#{event.event_id} #{@tab} #{event.time_written}
#{@tab} #{event.event_type} #{@tab} #{type}") if cnt >= 5
end
end
end


I will explain this.

The scanEVTWithSource(file, source) - takes data and arguments from two
other methods which assist with the reading of the log files.

@evtLogArray - an array full of log data which is inspected in structs

The rest we know about, but for example event.event_id is a struct to
inspect the ID field.

Hope this helps and thank you very much for your help. You are right:
eventsbydate is a string based on data from the event.time_written
struct, using gsub etc. to chomp it down into the values you have already
seen.

Regards

 
David A. Black

Hi --

Thanks for your response. It makes a lot more sense and you are on the
right lines I think. There is other code around this but it does not
bear much relevance:

def scanEVTWithSource(file, source)
@alerts = []
@evtLogArray = []
begin
#read the contents of the event logs files
evtLog = EventLog.open_backup(file, source)

#put data into an array
@evtLogArray = evtLog.read.sort { |a, b| (a.event_id <=>
b.event_id).nonzero? || (a.time_written <=> b.time_written)}

I haven't really been following this thread but this caught my eye and
I thought I'd mention this other technique:

array.sort_by {|e| [e.event_id, e.time_written] }
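
A quick sketch showing that the two forms give the same order. Rec is a made-up Struct standing in for the log records, and plain integers stand in for the time values:

```ruby
Rec = Struct.new(:event_id, :time_written)   # hypothetical stand-in for the log records

events = [
  Rec.new(529, 3),
  Rec.new(100, 2),
  Rec.new(529, 1),
]

# The comparison block from the original code.
with_block = events.sort do |a, b|
  (a.event_id <=> b.event_id).nonzero? || (a.time_written <=> b.time_written)
end

# Arrays compare element by element, so this sorts by id, then by time.
with_sort_by = events.sort_by { |e| [e.event_id, e.time_written] }

p with_sort_by.map(&:to_a)     # [[100, 2], [529, 1], [529, 3]]
p with_block == with_sort_by   # true
```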


David

--
David A. Black / Ruby Power and Light, LLC
Ruby/Rails consulting & training: http://www.rubypal.com
Coming in 2009: The Well-Grounded Rubyist (http://manning.com/black2)

http://www.wishsight.com => Independent, social wishlist management!
 
Jesús Gabriel y Galán

Thanks for your response. It makes a lot more sense and you are on the
right lines I think. There is other code around this but it does not
bear much relevance:

def scanEVTWithSource(file, source)
@alerts = []
@evtLogArray = []

This is unneeded, since you later assign another array to this
variable without using this one.
begin
#read the contents of the event logs files
evtLog = EventLog.open_backup(file, source)

#put data into an array
@evtLogArray = evtLog.read.sort { |a, b| (a.event_id <=>
b.event_id).nonzero? || (a.time_written <=> b.time_written)}

Are you sure you want to put this in an instance variable?
#event log data collected
evtLog.close
if evtLogArray.length == 0

Shouldn't this be checking the @evtLogArray?
return
end

#failed logons where more than 10 have occurred in a day
if event.event_id == 529

Here we are reaching the culprit, I think. What is event? It's not
defined in this method...
eventdateID = []
#assign all time written values to the eventsbydate array
eventsbydate = "#{event.time_written}"
eventdateID.push eventsbydate.gsub(/\s/, '')[0..7] +
eventsbydate[26..30]
counts = Hash.new(0)
eventdateID.each {|d| counts[d] += 1}
counts.each do |id,cnt|
@alerts.push("#{event.event_id} #{@tab} #{event.time_written}
#{@tab} #{event.event_type} #{@tab} #{type}") if cnt >= 5
end
end
end

Let me try to write what I think you want, because I still think the
above code is not what you are actually running: as written, it will
raise a NameError at the evtLogArray.length call (only @evtLogArray is
ever assigned, not the local variable evtLogArray). The following is
untested:


def scanEVTWithSource(file, source)
  @alerts = []
  # read the contents of the event log files
  evtLog = EventLog.open_backup(file, source)
  # put data into an array; sort it using David's advice
  evtLogArray = evtLog.read.sort_by { |e| [e.event_id, e.time_written] }

  # event log data collected
  evtLog.close
  return if evtLogArray.length == 0

  # Important part here: create the hash outside the loop
  # and, actually, do a loop on evtLogArray
  counts = Hash.new(0)
  # select relevant events, mapping them to the modified string
  events = evtLogArray.select { |event| event.event_id == 529 }
  events.each do |event|
    event_time = event.time_written.to_s
    eventsbydate = event_time.gsub(/\s/, '')[0..7] + event_time[26..30]
    counts[eventsbydate] += 1
  end
  counts.each do |id, cnt|
    # Now I have a problem here: what we are putting in the hash
    # is a string, not an event object
    # @alerts.push("#{event.event_id} #{@tab} #{event.time_written} #{@tab} #{event.event_type} #{@tab} #{type}") if cnt >= 5
    @alerts.push(id) if cnt >= 5
  end
end

I hope this helps. I don't have time now to solve the issue about you
wanting to push the event object to the alerts array, instead of just
the calculated string, but I hope you find a way to do that easily.
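
One possible way to keep the event objects, sketched under assumptions: LogEvent is a made-up Struct standing in for the real log records, the sample data is invented, and the day ID is built with strftime rather than the string slicing above. Grouping with group_by keeps the events themselves, so their fields are still available when building the alert lines:

```ruby
LogEvent = Struct.new(:event_id, :time_written, :event_type)  # hypothetical

# Hypothetical sample data: six failed logons on one day, one on another.
events = Array.new(6) { |i| LogEvent.new(529, Time.utc(2009, 2, 2, 10, i), "failure") }
events << LogEvent.new(529, Time.utc(2008, 8, 6, 9, 0), "failure")

# Group the event objects themselves by day ID instead of only counting strings.
by_day = events.group_by { |e| e.time_written.strftime("%a%b%d%Y") }

alerts = []
by_day.each do |day_id, day_events|
  next if day_events.length < 5
  day_events.each do |e|
    alerts.push("#{e.event_id}\t#{e.time_written}\t#{e.event_type}")
  end
end

puts alerts.length   # 6
```

Only the day with at least 5 events contributes alert lines; the single event from the other day is skipped.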

Let me know if this helped.

Jesus.
 
Robert Klemme

2009/2/3 Jesús Gabriel y Galán said:
Thanks for your response. It makes a lot more sense and you are on the
right lines I think. There is other code around this but it does not
bear much relevance:

def scanEVTWithSource(file, source)
@alerts = []
@evtLogArray = []

This is unneeded, since you later assign another array to this
variable without using this one.

Also, when reinitializing these variables on each method call then
chances are that they can be local variables and not instance
variables - unless, of course, some other method in the class (which
class?) uses the leftovers of scanEVTWithSource in those instance
variables.

I am suspecting the issue is somewhere above the method. For example,
you might have a loop calling scanEVTWithSource and expecting that
counts are aggregated throughout all calls but they aren't since you
reinitialize the Hash on each call.

I absolutely agree with your other comments. I still think we haven't
seen all the code. Also, the whole problem is not very clear to me
either.

Cheers

robert

--
remember.guy do |as, often| as.you_can - without end
 
Martin DeMello

counts = Hash.new(0)
eventdateID.each {|d| counts[d] += 1}

Here is your problem. Hash.new(0) means "when I query the hash, and
the key I request is not in there, return 0". It does not actually add
{key => 0} to the hash itself. To do that, you need the block form of
Hash.new, which passes the hash itself and the key to the block:

counts = Hash.new {|h, k| h[k] = 0}

irb(main):001:0> a = Hash.new(0)
=> {}
irb(main):002:0> b = Hash.new {|h,k| h[k] = 0}
=> {}
irb(main):003:0> a['hello']
=> 0
irb(main):004:0> b['hello']
=> 0
irb(main):005:0> a
=> {}
irb(main):006:0> b
=> {"hello"=>0}

martin
 
Jesús Gabriel y Galán

counts = Hash.new(0)
eventdateID.each {|d| counts[d] += 1}

Here is your problem. Hash.new(0) means "when I query the hash, and
the key I request is not in there, return 0". It does not actually add
{key => 0} to the hash itself.

This is true, but counts[d] += 1 is actually counts[d] = counts[d] + 1
so the RHS will evaluate to 1 the first time, assigning it to the hash:

irb(main):001:0> h = Hash.new(0)
=> {}
irb(main):002:0> h["a"] += 1
=> 1
irb(main):003:0> h
=> {"a"=>1}

So the above snippet is correct for generating a histogram.
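
A short sketch comparing both forms on the same made-up input. For counting with += they behave identically; they differ only when a missing key is merely read:

```ruby
ids = %w[a b a c a]   # hypothetical sample

plain = Hash.new(0)
ids.each { |k| plain[k] += 1 }          # expands to plain[k] = plain[k] + 1

block_form = Hash.new { |h, k| h[k] = 0 }
ids.each { |k| block_form[k] += 1 }

p plain == block_form     # true
p plain["a"]              # 3

# A bare read of a missing key stores nothing in the plain form,
# but the block form assigns the default as a side effect.
plain["zzz"]
block_form["zzz"]
p plain.key?("zzz")       # false
p block_form.key?("zzz")  # true
```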

Jesus.
 
