[ ... ]
If you notice how it's coded, the stack contains a list of _pointers_
to preallocated vectors.
Okay -- I'm afraid I haven't had a chance to look at the code much yet.
I've gotten an idea of the basic structure, but not much about the
individual bits and pieces.
[ ... ]
I push _pointers_ on the stack, not the vector objects themselves. Each
vector is allocated with "new", and the resulting pointer is filed
away on the stack. When I need a vector, I pop its pointer off, use it,
then push it back on.
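
For anyone following along, here is a minimal sketch of that pop/use/push-back
pattern. It is not the actual FG3DStack code: BufferPool and Buffer are
made-up names, std::unique_ptr stands in for auto_ptr (which newer standards
have dropped), and std::mutex stands in for the CriticalSection class that
shows up in the profile.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for the real buffer type (RawDigit in the thread).
using Buffer = std::vector<int>;

// A stack of pointers to preallocated buffers. Only pointers move on and
// off the stack; the buffers themselves are allocated once, up front.
class BufferPool {
public:
    BufferPool(std::size_t count, std::size_t size) {
        for (std::size_t i = 0; i < count; ++i)
            free_.push_back(std::make_unique<Buffer>(size));
    }

    // Take a buffer off the stack; returns a null pointer if none are left.
    std::unique_ptr<Buffer> pop() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (free_.empty()) return nullptr;
        std::unique_ptr<Buffer> p = std::move(free_.back());
        free_.pop_back();
        return p;
    }

    // Put a buffer back on the stack so it can be reused.
    void push(std::unique_ptr<Buffer> p) {
        std::lock_guard<std::mutex> lock(mutex_);
        free_.push_back(std::move(p));
    }

    // Number of buffers currently on the stack.
    std::size_t available() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return free_.size();
    }

private:
    mutable std::mutex mutex_;                    // guards free_
    std::vector<std::unique_ptr<Buffer>> free_;   // the stack of pointers
};
```

A caller would do something like "auto p = pool.pop();", work on *p, then
"pool.push(std::move(p));". Note that every pop and push takes the lock,
which is the same kind of cost the Enter/Leave lines in a profile would
reflect.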
Should I just use raw pointers instead of auto_ptr?
auto_ptr is pretty lightweight -- it's not showing up as something
that's taking much time in the profile...
[ ... ]
Hey, if I can do this, then why not just assign vector buffers to each
thread directly and forget about stacks altogether? That was the
problem the stack was intended to solve in the first place. If there
is another way to tackle it, then I'll toss the whole stack idea and
go for it!
Sounds reasonable to me. Generally, when you start up a thread, you do
it by specifying a function that will be called in the new thread. If
you want something done on a per-thread basis, that's the usual place to
start. Unfortunately, I don't know enough about how you're doing the
threading to say much more than that.
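
As a rough illustration of the per-thread idea, here is a sketch that assumes
std::thread rather than whatever threading API you are actually using. The
names worker and run_workers, the buffer size, and the dummy arithmetic are
all invented for the example. The point is only that the scratch buffer is a
local of the thread function, so each thread owns one and no lock is needed
to get at it.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Function run by each thread. The scratch buffer lives in the thread
// function itself, so it is private to that thread.
void worker(std::size_t id, std::size_t iterations,
            std::vector<long>& results) {
    std::vector<int> scratch(64);   // allocated once per thread
    long sum = 0;
    for (std::size_t i = 0; i < iterations; ++i) {
        std::size_t slot = i % scratch.size();
        scratch[slot] = static_cast<int>(i);   // placeholder "work"
        sum += scratch[slot];
    }
    results[id] = sum;   // each thread writes only its own element
}

// Start thread_count threads, wait for all of them, return their results.
std::vector<long> run_workers(std::size_t thread_count,
                              std::size_t iterations) {
    std::vector<long> results(thread_count, 0);
    std::vector<std::thread> threads;
    for (std::size_t id = 0; id < thread_count; ++id)
        threads.emplace_back(worker, id, iterations, std::ref(results));
    for (auto& t : threads)
        t.join();
    return results;
}
```

Calling "run_workers(4, 10)" starts four threads that each sum 0 through 9
using their own buffer. Whether this fits depends on how the buffers are
shared in the real code, which I can't tell from here.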
PS. What did your profiling say the rest of the time was spent on?
I thought of posting it before. The formatting gets hosed because the
lines are long, but here are the significant parts:
Func time      %  Func+child      %    Count  Function
------------------------------------------------------------------
 1318.725   18.4    6778.675   94.6  3000000  RawDigit::Mul(class RawDigit const &,class RawDigit const &) (rawdigit.obj)
 1148.997   16.0    5270.801   73.6  1000000  RawDigit::MulInPlace(class RawDigit const &) (rawdigit.obj)
  790.983   11.0    1199.126   16.7  2000000  RawDigit::Zeroize(void) (rawdigit.obj)
  771.520   10.8    1333.048   18.6  1000000  FG3DStack<class RawDigit>::Push(class std::auto_ptr<class RawDigit>) (rawdigit.obj)
  605.482    8.4     977.610   13.6  1000000  FG3DStack<class RawDigit>::Pop(void) (rawdigit.obj)
  412.900    5.8     613.727    8.6  1000000  RawDigit::Copy(class RawDigit const &) (rawdigit.obj)
  408.143    5.7     408.143    5.7  2000000  RawDigit::Zeroize(int,int) (rawdigit.obj)
  380.550    5.3    7165.489  100.0        1  _main (rawdigit.obj)
  379.897    5.3     379.897    5.3  2000000  CriticalSection::Leave(void) (cs.obj)
  363.641    5.1     363.641    5.1  2000000  CriticalSection::Enter(void) (cs.obj)
  200.827    2.8     200.827    2.8  1000000  RawDigit::Copy(class RawDigit const &,int,int,int,int) (rawdigit.obj)
  190.118    2.7     190.119    2.7  1000000  std::vector<class RawDigit *,class std::allocator<class RawDigit *> >::insert(class RawDigit * *,unsigned int,class RawDigit * const &) (rawdigit.obj)
  187.439    2.6     187.441    2.6  1000000  RawDigit::Resize(int) (rawdigit.obj)
    5.920    0.1       5.920    0.1      430  std::basic_filebuf<char,struct std::char_traits<char> >::overflow(int) (iostream.obj)
Just keep in mind that this is done with an old implementation; a newer
one might easily change things, especially wrt the standard library.
IOW, this is probably better than nothing, but don't take it as the
absolute and final word.