Help With PyParsing of output from win32pdhutil.ShowAllProcesses()

Steve

Hi All (especially Paul McGuire!)

Could you lend a hand with the grammar and parsing of the output from the
function win32pdhutil.ShowAllProcesses()?

This is the code that I have so far (it is very clumsy at the moment):


import string
import win32api
import win32pdhutil
import re
import pyparsing


process_info = win32pdhutil.ShowAllProcesses()

print process_info
print

## Output from ShowAllProcesses :

##Process Name ID Process,% Processor Time,% User Time,% Privileged Time,Virtual Bytes Peak,Virtual Bytes
##PyScripter 2572 0 0 0 96370688 96370688
##vmnetdhcp 1184 0 0 0 13942784 13942784
##vmount2 780 0 0 0 40497152 38400000
##ipoint 260 0 0 0 63074304 58531840


sProcess_Info = str(process_info)
print('type = ', type(sProcess_Info))

## Try some test data :
test = ('Process Name ID Process,% Processor Time,% User Time,'
        '% Privileged Time,Virtual Bytes Peak,Virtual Bytes',
        'PyScripter 2572 0 0 0 96370688 96370688',
        'vmnetdhcp 1184 0 0 0 13942784 13942784',
        'vmount2 780 0 0 0 40497152 38400000',
        'ipoint 260 0 0 0 63074304 58531840')

heading = pyparsing.Literal(
    'Process Name ID Process,% Processor Time,% User Time,'
    '% Privileged Time,Virtual Bytes Peak,Virtual Bytes').suppress()
integer = pyparsing.Word(pyparsing.nums)
process_name = pyparsing.Word(pyparsing.alphas)

#ProcessList = heading + process_name + pyparsing.OneOrMore(integer)
ProcessList = process_name + pyparsing.OneOrMore(integer)

# Now parse data and print results

for current_line in test:
    print('Current line = %s') % (current_line)

    try:
        data = ProcessList.parseString(current_line)
        print "data:", data
    except pyparsing.ParseException:
        # the heading line does not match the grammar; skip it
        pass


print('\n\nParse Actual data : \n\n')
## Parse the actual data from ShowAllProcesses :

ProcessList = heading + process_name + pyparsing.OneOrMore(integer)
data = ProcessList.parseString(sProcess_Info)
print "data:", data
print "data.asList():", data.asList()
print "data keys:", data.keys()



=====

Output from run :


Process Name ID Process,% Processor Time,% User Time,% Privileged Time,Virtual Bytes Peak,Virtual Bytes
PyScripter 2572 0 0 0 101416960 97730560
vmnetdhcp 1184 0 0 0 13942784 13942784
vmount2 780 0 0 0 40497152 38400000
ipoint 260 0 0 0 65175552 58535936
DockingDirector 916 0 0 0 102903808 101695488
vmnat 832 0 0 0 15757312 15757312
svchost 1060 0 0 0 74764288 72294400
svchost 1120 0 0 0 46632960 45846528
svchost 1768 0 0 0 131002368 113393664
svchost 1988 0 0 0 33619968 31047680
svchost 236 0 0 0 39841792 39055360
System 4 0 0 0 3624960 1921024
.....

None

('type = ', <type 'str'>)
Current line = Process Name ID Process,% Processor Time,% User Time,% Privileged Time,Virtual Bytes Peak,Virtual Bytes
Current line = PyScripter 2572 0 0 0 96370688 96370688
data: ['PyScripter', '2572', '0', '0', '0', '96370688', '96370688']
Current line = vmnetdhcp 1184 0 0 0 13942784 13942784
data: ['vmnetdhcp', '1184', '0', '0', '0', '13942784', '13942784']
Current line = vmount2 780 0 0 0 40497152 38400000
data: ['vmount', '2', '780', '0', '0', '0', '40497152', '38400000']
Current line = ipoint 260 0 0 0 63074304 58531840
data: ['ipoint', '260', '0', '0', '0', '63074304', '58531840']


Parse Actual data :


Traceback (most recent call last):
  File "ProcessInfo.py", line 55, in <module>
    data = ProcessList.parseString(sProcess_Info)
  File "C:\Python25\lib\site-packages\pyparsing.py", line 821, in parseString
    loc, tokens = self._parse( instring.expandtabs(), 0 )
  File "C:\Python25\lib\site-packages\pyparsing.py", line 712, in _parseNoCache
    loc,tokens = self.parseImpl( instring, preloc, doActions )
  File "C:\Python25\lib\site-packages\pyparsing.py", line 1864, in parseImpl
    loc, resultlist = self.exprs[0]._parse( instring, loc, doActions, callPreParse=False )
  File "C:\Python25\lib\site-packages\pyparsing.py", line 716, in _parseNoCache
    loc,tokens = self.parseImpl( instring, preloc, doActions )
  File "C:\Python25\lib\site-packages\pyparsing.py", line 2106, in parseImpl
    return self.expr._parse( instring, loc, doActions, callPreParse=False )
  File "C:\Python25\lib\site-packages\pyparsing.py", line 716, in _parseNoCache
    loc,tokens = self.parseImpl( instring, preloc, doActions )
  File "C:\Python25\lib\site-packages\pyparsing.py", line 1118, in parseImpl
    raise exc
pyparsing.ParseException: Expected "Process Name ID Process,% Processor Time,% User Time,% Privileged Time,Virtual Bytes Peak,Virtual Bytes" (at char 0), (line:1, col:1)



Many thanks!

Steve
 
David

Any particular reason you need to use pyparsing? It seems like
overkill for such simple data.

Here's an example:

import pprint

X = """Process Name ID Process,% Processor Time,% User Time,% Privileged Time,Virtual Bytes Peak,Virtual Bytes
PyScripter 2572 0 0 0 96370688 96370688
vmnetdhcp 1184 0 0 0 13942784 13942784
vmount2 780 0 0 0 40497152 38400000
ipoint 260 0 0 0 63074304 58531840"""

data = []
for line in X.split('\n')[1:]:  # Skip the first row
    split = line.split()
    row = [split[0]]  # Get the process name
    row += [int(x) for x in split[1:]]  # Convert strings to int, fail if any aren't.
    data.append(row)

pprint.pprint(data)

# Output follows:
#
#[['PyScripter', 2572, 0, 0, 0, 96370688, 96370688],
# ['vmnetdhcp', 1184, 0, 0, 0, 13942784, 13942784],
# ['vmount2', 780, 0, 0, 0, 40497152, 38400000],
# ['ipoint', 260, 0, 0, 0, 63074304, 58531840]]
#
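
If you would rather look values up by column name than by position, the heading line can supply the names. A minimal sketch, assuming the heading has the shape shown above ("Process Name", a space, then comma-separated counter names) and that process names contain no whitespace; parse_table is only an illustrative helper name, not part of any library:

```python
SAMPLE = (
    "Process Name ID Process,% Processor Time,% User Time,"
    "% Privileged Time,Virtual Bytes Peak,Virtual Bytes\n"
    "PyScripter 2572 0 0 0 96370688 96370688\n"
    "vmount2 780 0 0 0 40497152 38400000\n"
)

NAME_COLUMN = "Process Name"


def parse_table(text):
    """Return one dict per data row, keyed by the heading's column names."""
    lines = [line for line in text.splitlines() if line.strip()]
    heading, rows = lines[0], lines[1:]
    # Assumption: the heading is "Process Name <counter>,<counter>,..."
    counters = heading[len(NAME_COLUMN):].strip().split(",")
    records = []
    for line in rows:
        fields = line.split()
        record = {NAME_COLUMN: fields[0]}
        # int() raises ValueError if a field is not numeric.
        record.update(zip(counters, (int(f) for f in fields[1:])))
        records.append(record)
    return records


records = parse_table(SAMPLE)
print(records[1]["Virtual Bytes"])
```

With this, records[1]["Virtual Bytes Peak"] reads more clearly than row[5], at the cost of one dict per row.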
 
Steve

Hi All,

I did a lot of digging into the code in the module, win32pdhutil, and
decided to create some custom methods.


Added to win32pdhutil:



def ShowAllProcessesAsList():

    object = find_pdh_counter_localized_name("Process")
    items, instances = win32pdh.EnumObjectItems(None, None, object,
                                                win32pdh.PERF_DETAIL_WIZARD)

    # Need to track multiple instances of the same name.
    instance_dict = {}
    all_process_dict = {}

    for instance in instances:
        try:
            instance_dict[instance] = instance_dict[instance] + 1
        except KeyError:
            instance_dict[instance] = 0

    # Bit of a hack to get useful info.

    items = [find_pdh_counter_localized_name("ID Process")] + items[:5]
    # print items
    # print "Process Name", string.join(items,",")

    all_process_dict['Headings'] = items  # add headings to dict

    for instance, max_instances in instance_dict.items():

        for inum in xrange(max_instances+1):
            hq = win32pdh.OpenQuery()
            hcs = []
            row = []

            for item in items:
                path = win32pdh.MakeCounterPath(
                    (None, object, instance, None, inum, item))
                hcs.append(win32pdh.AddCounter(hq, path))

            win32pdh.CollectQueryData(hq)
            # as per http://support.microsoft.com/default.aspx?scid=kb;EN-US;q262938,
            # some "%" based counters need two collections
            time.sleep(0.01)
            win32pdh.CollectQueryData(hq)
            # print "%-15s\t" % (instance[:15]),

            row.append(instance[:15])

            for hc in hcs:
                type, val = win32pdh.GetFormattedCounterValue(
                    hc, win32pdh.PDH_FMT_LONG)
                # print "item : %5d" % (val),
                row.append(val)
                win32pdh.RemoveCounter(hc)

            # print
            # print ' row = ', instance ,row
            all_process_dict[instance] = row  # add current row to dict

            win32pdh.CloseQuery(hq)

    return all_process_dict


def ShowSingleProcessAsList(sProcessName):

    object = find_pdh_counter_localized_name("Process")
    items, instances = win32pdh.EnumObjectItems(None, None, object,
                                                win32pdh.PERF_DETAIL_WIZARD)

    # Need to track multiple instances of the same name.
    instance_dict = {}
    all_process_dict = {}

    for instance in instances:
        try:
            instance_dict[instance] = instance_dict[instance] + 1
        except KeyError:
            instance_dict[instance] = 0

    # Bit of a hack to get useful info.

    items = [find_pdh_counter_localized_name("ID Process")] + items[:5]
    # print items
    # print "Process Name", string.join(items,",")

    # all_process_dict['Headings'] = items  # add headings to dict

    # print 'instance dict = ', instance_dict
    # print

    if sProcessName in instance_dict:
        instance = sProcessName
        max_instances = instance_dict[sProcessName]
        # print sProcessName, ' max_instances = ', max_instances

        for inum in xrange(max_instances+1):
            hq = win32pdh.OpenQuery()
            hcs = []
            row = []

            for item in items:
                path = win32pdh.MakeCounterPath(
                    (None, object, instance, None, inum, item))
                hcs.append(win32pdh.AddCounter(hq, path))

            try:
                win32pdh.CollectQueryData(hq)
            except:
                # process not found - set to all zeros
                all_process_dict[sProcessName] = [0,0,0,0,0,0,0]
                break

            # as per http://support.microsoft.com/default.aspx?scid=kb;EN-US;q262938,
            # some "%" based counters need two collections
            time.sleep(0.01)
            win32pdh.CollectQueryData(hq)
            # print "%-15s\t" % (instance[:15]),

            row.append(instance[:15])

            for hc in hcs:
                type, val = win32pdh.GetFormattedCounterValue(
                    hc, win32pdh.PDH_FMT_LONG)
                # print "item : %5d" % (val),
                row.append(val)
                win32pdh.RemoveCounter(hc)

            # print
            # print ' row = ', instance ,row
            all_process_dict[instance] = row  # add current row to dict

            win32pdh.CloseQuery(hq)
    else:
        # process not found - set to all zeros
        all_process_dict[sProcessName] = [0,0,0,0,0,0,0]

    return all_process_dict

=============================

Demo :

import time
import win32pdhutil  # with customized methods in win32pdhutil (above)


###################################################################
# GetMemoryStats #
###################################################################

def GetMemoryStats(sProcessName, iPauseTime):

    Memory_Dict = {}

    ## Headings ['ProcessName', '% Processor Time', '% User Time', '% Privileged Time', 'Virtual Bytes Peak', 'Virtual Bytes']
    ## machine process = {'firefox': ['firefox', 2364, 0, 0, 0, 242847744, 211558400]}

    loop_counter = 0

    print('\n\n** Starting Free Memory Sampler **\n\n')
    print('Process : %s\n Delay : %d seconds\n\n') % (sProcessName, iPauseTime)
    print('\n\nPress : Ctrl-C to stop and output stats...\n\n')

    try:

        while 1:
            print('Sample : %d') % loop_counter
            row = []

            machine_process = win32pdhutil.ShowSingleProcessAsList(sProcessName)
            # print 'machine process = ', machine_process
            row.append(machine_process[sProcessName][5])  # Virtual Bytes Peak
            row.append(machine_process[sProcessName][6])  # Virtual Bytes

            Memory_Dict[loop_counter] = row  # add values to the dictionary
            loop_counter += 1
            time.sleep(iPauseTime)

    except KeyboardInterrupt:  # Ctrl-C encountered
        print "End of Sample...\n\n"

    return Memory_Dict


###################################################################
############# M A I N ###########################
###################################################################

def Main():

    iPause_time = 5  # pause time - seconds
    sProcessName = 'firefox'  # Process to watch
    sReportFileName = 'MemoryStats.csv'  # output filename

    Memory_Dict = GetMemoryStats(sProcessName, iPause_time)

    outfile = open(sReportFileName, "w")  # send output to a file
    outfile.write('SampleTime, VirtualBytesMax, VirtualBytes\n')

    for current_stat in sorted(Memory_Dict):  # write samples in order
        line = ('%s,%d,%d\n') % (current_stat,
                                 Memory_Dict[current_stat][0],
                                 Memory_Dict[current_stat][1])
        outfile.write(line)

    outfile.close()  # close output file


if __name__ == "__main__":
    Main()


-------------------------

I have found that the process you want to monitor needs to be started
before this script is started. The script will handle the process
disappearing and set the stats to zeros.
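
For anyone who wants to try the sampling and reporting logic without a Windows box, here is a rough Python 3 sketch of the same idea with the data source passed in as a function. It assumes rows shaped like the ones ShowSingleProcessAsList returns (name, ID, three "%" counters, Virtual Bytes Peak, Virtual Bytes). The names collect_samples, write_report and fake_sampler are made up for illustration, and the pause and Ctrl-C handling are left out:

```python
import csv
import io

# Same all-zeros placeholder row the script uses for a missing process.
NOT_FOUND = [0, 0, 0, 0, 0, 0, 0]


def collect_samples(sample_fn, process_name, count):
    """Call sample_fn(process_name) `count` times, keeping results in order."""
    samples = []
    for _ in range(count):
        table = sample_fn(process_name)
        row = table.get(process_name, NOT_FOUND)
        samples.append((row[5], row[6]))  # Virtual Bytes Peak, Virtual Bytes
    return samples


def write_report(samples, outfile):
    """Write one CSV line per sample, numbered from 0."""
    writer = csv.writer(outfile, lineterminator="\n")
    writer.writerow(["SampleTime", "VirtualBytesMax", "VirtualBytes"])
    for index, (peak, current) in enumerate(samples):
        writer.writerow([index, peak, current])


def fake_sampler(name):
    # Hypothetical stand-in for ShowSingleProcessAsList, fixed values only.
    return {name: [name, 2364, 0, 0, 0, 242847744, 211558400]}


buffer = io.StringIO()
write_report(collect_samples(fake_sampler, "firefox", 2), buffer)
print(buffer.getvalue())
```

A list keeps the samples in the order they were taken, so no counter key is needed, and the csv module takes care of the separators.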

Enjoy!

Steve
 
Paul McGuire

Steve -

Well, your first issue is not a pyparsing one, but one of redirecting
stdout. win32pdhutil.ShowAllProcesses does not *return* the output
you listed, it just prints it to stdout. The value returned is None,
which is why you are having trouble parsing it (even after converting
None to a string).

For you to parse out this data, you will need to redirect stdout to a
string buffer, run ShowAllProcesses, and then put stdout back the way
it was. Python's cStringIO module is perfect for this:


from cStringIO import StringIO
import sys
import win32pdhutil

save_stdout = sys.stdout
process_info = StringIO()
sys.stdout = process_info

win32pdhutil.ShowAllProcesses()
sys.stdout = save_stdout
sProcess_Info = process_info.getvalue()


*Now* you have all that data captured into a processable string.
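
On Python 3 the same capture can be written with io.StringIO and contextlib.redirect_stdout, which puts sys.stdout back even if the call raises. A minimal sketch; show_rows is a hypothetical stand-in for win32pdhutil.ShowAllProcesses so the example runs on any platform, and capture_stdout is just an illustrative helper name:

```python
import contextlib
import io


def show_rows():
    """Stand-in for a function that prints its result and returns None."""
    print("Process Name ID Process,% Processor Time")
    print("System 4 0")


def capture_stdout(func):
    """Run func() and return everything it printed as one string."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func()
    return buffer.getvalue()


captured = capture_stdout(show_rows)
print(captured.splitlines()[1].split())
```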

As others have mentioned, this data is pretty predictably formatted,
so pyparsing may be more than you need. How about plain old split?


for line in sProcess_Info.splitlines()[1:]:
    data = line.split()
    print data


Done!

Still have an urge to parse with pyparsing? Here are some comments on
your grammar:

- Your definition of process_name was not sufficient on my system. I
had some processes running whose names include numeric digits and
other non-alphas. I needed to modify process_name to:

process_name = pyparsing.Word(pyparsing.alphanums+"_.-")

- Similarly, some of my values returned by ShowAllProcesses had
negative values, so your definition of integer needs to comprehend an
optional leading '-' sign. (This actually sounds like a bug in
win32pdhutil - I don't think any of these listed quantities should
report a negative value.)

- Whenever I have integers in a grammar, I usually convert them to
ints at parse time, using a parse action:

integer.setParseAction( lambda tokens : int(tokens[0]) )

- The tabular format of this data, and the fact that the initial entry
in each row appears to be a label of some sort invites the use of the
pyparsing Dict class. I note that you are already trying to extract
keys from the parsed data, so it looks like you are already thinking
along these lines. (Unfortunately, it is very likely you will get
duplicate keys, since process names do not have to be unique - this
will involve some loss of data in this example.) The Dict class auto-
generates results names in the parsed results. Dict turns out to be
awkward to use directly, so I added the dictOf method to simplify
things. The concept of dictOf(keyExpr,valueExpr) is "parse a list of
dict entries, each of which is a key-value pair; while parsing, label
each entry with the parsed key." In your example, this would be:

ProcessList = heading + pyparsing.dictOf(process_name,
pyparsing.OneOrMore(integer) )

The key is a leading process_name, and the value is the following list
of integers. With this, you can print out the results using:


data = ProcessList.parseString(sProcess_Info)

print "data keys:", data.keys()
for k in sorted(data.keys()):
    print k, ":", data[k]


Getting:

BCMWLTRY : [684, 0, 0, 0, 54353920, 53010432]
CLI : [248, 0, 0, 0, 171941888, 153014272]
D4 : [2904, 0, 0, 0, 37527552, 36413440]
F-StopW : [2064, 0, 0, 0, 33669120, 30121984]
....
(again, note that the multiple entries for "CLI" have been reduced to
a single dict entry)
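
If the duplicate names matter, one option is to skip the Dict and keep each row as its own group. The sketch below also folds in the wider process_name and a signed integer from the notes above. It assumes pyparsing is installed, uses the older camelCase method names seen in this thread, runs on made-up sample rows (including the negative value), and leaves out the heading line:

```python
import pyparsing as pp

# Process names on a real system can contain digits and some punctuation.
process_name = pp.Word(pp.alphanums + "_.-")

# Optional leading '-' glued to the digits, converted to int while parsing.
integer = pp.Combine(pp.Optional("-") + pp.Word(pp.nums))
integer.setParseAction(lambda tokens: int(tokens[0]))

# Group each row so rows with the same process name stay separate.
row = pp.Group(process_name + pp.OneOrMore(integer))
table = pp.OneOrMore(row)

sample = """\
svchost 1060 0 0 0 74764288 72294400
svchost 1120 0 0 0 -46632960 45846528
vmount2 780 0 0 0 40497152 38400000
"""

rows = table.parseString(sample).asList()
for name, *values in rows:
    print(name, values)
```

Both svchost rows survive, and vmount2 is no longer split into 'vmount' and '2'.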

You could get similar results using something like:

data = dict((vals[0],vals[1:]) for vals in
map(str.split,sProcess_Info.splitlines()))

But then you would never have learned about dictOf!
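
Like dictOf, the one-liner keeps a single entry per process name. If every svchost row should survive, a plain-Python variation can collect a list of rows under each name. A sketch on sample data taken from the output earlier in the thread, which also skips the heading line and any blank lines:

```python
from collections import defaultdict

sample = """\
Process Name ID Process,% Processor Time,% User Time,% Privileged Time,Virtual Bytes Peak,Virtual Bytes
svchost 1060 0 0 0 74764288 72294400
svchost 1120 0 0 0 46632960 45846528
System 4 0 0 0 3624960 1921024
"""

# Map each process name to a list of rows, one row per instance.
by_name = defaultdict(list)
for line in sample.splitlines()[1:]:  # skip the heading line
    fields = line.split()
    if not fields:  # ignore blank lines
        continue
    by_name[fields[0]].append([int(f) for f in fields[1:]])

print(len(by_name["svchost"]))
```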

Enjoy!
-- Paul
 
