Jens Theisen
Hello,
I want to apologise in advance for this being off topic. It's neither
a C nor a C++ question, but one about profiling in general, though I think
my chances are best to find the answer in the C/C++ community.
I have a C++ program to profile and went about it by producing large
history files of calling dependencies with associated times. It is
presumably similar to gprof's data format and could be converted.
I'm now looking for utilities that can evaluate such a tree. I only
looked at gprof's manual page and that seemed too basic to me.
I already wrote some Python scripts that can extract stuff like:
- Break the time down into that spent in each entity (libraries, files,
functions, ...), excluding the time spent on their behalf in other
such entities. Print it in order.
- Give a tree of (libraries, files, functions) according to their
calling dependencies, but don't recurse into those subentities below
a certain threshold of consumed time. Give the number of calls and
time (including the time spent in subentities).
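To make these two reports concrete, here is a minimal sketch of what the
scripts compute. The Node class, its field names and the sample numbers are
made up for illustration; they are not the actual format of my history
files. It assumes each node already stores its inclusive time.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical call-tree node; 'time' is inclusive of all children."""
    name: str
    calls: int = 0
    time: float = 0.0
    children: dict = field(default_factory=dict)


def exclusive_times(root):
    """Sum exclusive time per entity name, largest first."""
    totals = defaultdict(float)
    stack = [root]
    while stack:
        node = stack.pop()
        child_time = sum(c.time for c in node.children.values())
        totals[node.name] += node.time - child_time
        stack.extend(node.children.values())
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


def render(node, threshold, depth=0, out=None):
    """Return indented tree lines, skipping children below the threshold."""
    out = [] if out is None else out
    out.append(f"{'  ' * depth}{node.name} calls={node.calls} time={node.time:g}")
    for child in sorted(node.children.values(), key=lambda c: c.time, reverse=True):
        if child.time >= threshold:
            render(child, threshold, depth + 1, out)
    return out


# Made-up sample data: main -> f -> g, and main -> g directly.
root = Node("main", 1, 10.0)
f = Node("f", 2, 6.0)
f.children["g"] = Node("g", 5, 4.0)
root.children = {"f": f, "g": Node("g", 1, 1.0)}

flat_report = exclusive_times(root)
tree_report = render(root, threshold=2.0)
```

With this sample, flat_report lists g first because its exclusive time is
summed over both call sites, and tree_report leaves out the direct
main -> g call because it falls below the threshold.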
All this is already quite useful, but I also dearly want to have:
- My script is already "merging together" subtrees on the same
level. For instance, if f calls g a number of times, and g calls h a
number of times, you only get f->g->h once in the tree view of the
data, the number at the line of h being the total calls to h through
this call chain. However, if a call to f occurs at a number of
different places in the code, those occurrences will not be merged
together: I can't, say, view any call to f as the "whole program"
and forget other calls, merging all these subtrees into one.
- I have a general structure of attributes for each node (i.e. library,
file and function names), but I also want to have additional attributes
like the value of a parameter the function was called with, which may
only apply to some functions. I need to filter out certain values of
these attributes or break the data down by their values.
- Reverse tree view: Given a (library, file, function), how often was
it called and how much time did it spend on behalf of all its
calling parents, grandparents, etc., again breaking recursion at a
certain time threshold. A special case of this is the ability to
extract a backtrace from the profiling data, which is useful on its
own.
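A rough sketch of what I mean by the first and third wishes, again with a
made-up Node class and made-up numbers rather than my real data format. It
assumes non-recursive functions; with recursion, inclusive times would be
counted more than once in the reverse view.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical call-tree node; 'time' is inclusive of all children."""
    name: str
    calls: int = 0
    time: float = 0.0
    children: dict = field(default_factory=dict)


def merge_into(dst, src):
    """Add src's counters and subtree into dst, merging children by name."""
    dst.calls += src.calls
    dst.time += src.time
    for name, child in src.children.items():
        merge_into(dst.children.setdefault(name, Node(name)), child)


def reroot(root, name):
    """Treat every occurrence of 'name' as the whole program."""
    merged = Node(name)
    stack = [root]
    while stack:
        node = stack.pop()
        if node.name == name:
            merge_into(merged, node)  # nested matches are already inside
        else:
            stack.extend(node.children.values())
    return merged


def callers(root, name):
    """Reverse view: 'name' at the top, its callers as descendants."""
    inverted = Node(name)

    def walk(node, path):
        if node.name == name:
            inverted.calls += node.calls
            inverted.time += node.time
            cur = inverted
            for ancestor in reversed(path):  # parent first, then grandparent
                cur = cur.children.setdefault(ancestor, Node(ancestor))
                cur.calls += node.calls
                cur.time += node.time
        for child in node.children.values():
            walk(child, path + [node.name])

    walk(root, [])
    return inverted


# Made-up sample data: main -> f -> g and main -> h -> f -> g.
root = Node("main", 1, 10.0)
f1 = Node("f", 2, 6.0)
f1.children["g"] = Node("g", 5, 4.0)
f2 = Node("f", 3, 2.0)
f2.children["g"] = Node("g", 2, 1.5)
h = Node("h", 1, 3.0)
h.children["f"] = f2
root.children = {"f": f1, "h": h}

as_program = reroot(root, "f")
who_calls_g = callers(root, "g")
```

Here as_program merges both f subtrees into one, and who_calls_g shows g
being reached through f, which in turn is reached from main and from h. A
single branch of the reverse view, read top to bottom, is a backtrace.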
I have a number of vague ideas how to go about this.
My scripts are in Python, but they are messy and slow. An exotic
approach would be XSLT, since it's designed to operate on tree data. I
have zero experience in this regard - is this possible? How does it
perform? Another one is trying to use a database as storage for the
trees, but I'm not sure if all of these operations will be efficient.
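For the database idea, this is roughly what I have in mind: a minimal
sketch using Python's sqlite3 module with an in-memory database. The table
layout (one row per tree node with a parent id), the column names and the
sample rows are assumptions for illustration only, and it says nothing yet
about how this performs on large history files.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node (id INTEGER PRIMARY KEY, parent INTEGER,"
    " name TEXT, calls INTEGER, time REAL)"
)
# Made-up sample tree: main -> f -> g and main -> h -> f -> g.
conn.executemany(
    "INSERT INTO node VALUES (?, ?, ?, ?, ?)",
    [
        (1, None, "main", 1, 10.0),
        (2, 1, "f", 2, 6.0),
        (3, 2, "g", 5, 4.0),
        (4, 1, "h", 1, 3.5),
        (5, 4, "f", 3, 2.0),
        (6, 5, "g", 2, 1.5),
    ],
)


def backtrace(conn, node_id):
    """Walk parent links upwards with a recursive query."""
    sql = """
        WITH RECURSIVE chain(id, parent, name, depth) AS (
            SELECT id, parent, name, 0 FROM node WHERE id = ?
            UNION ALL
            SELECT n.id, n.parent, n.name, c.depth + 1
            FROM node n JOIN chain c ON n.id = c.parent
        )
        SELECT name FROM chain ORDER BY depth
    """
    return [row[0] for row in conn.execute(sql, (node_id,))]


def exclusive_by_name(conn):
    """Exclusive time per name: inclusive time minus the children's time."""
    sql = """
        SELECT n.name, SUM(n.time - COALESCE(k.t, 0)) AS excl
        FROM node n
        LEFT JOIN (SELECT parent, SUM(time) AS t FROM node GROUP BY parent) k
               ON k.parent = n.id
        GROUP BY n.name
        ORDER BY excl DESC
    """
    return conn.execute(sql).fetchall()


trace = backtrace(conn, 6)
flat = exclusive_by_name(conn)
```

The backtrace and the flat breakdown map onto queries fairly directly; the
merging and re-rooting operations are the ones I'm less sure about.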
The most realistic option I would go about in the absence of further
advice and existing utilities is to rewrite my scripts in C++.
All of this needs proper thinking through, so I'd much rather
use an existing solution. Is there one?
Cheers,
Jens