Recordsets vs. Arrays?

Mark J. McGinty

Chris Hohmann said:
[Chris] Yes, but I had the same problem Bob had.

Sorry about that, it's fixed, I tested the dl from my laptop. I also ended
up configuring a new net block for my office on the fly, I was whining to my
ISP about a pesky little drop-off problem, and they noticed I was in a wrong
block, so I was down for an hour or so today. (Thank goodness for short
TTL.)

Give it another try when you get a chance..
http://www.databoundzone.com/highrestimer.zip

Also, I don't have VB6 at work and I use Linux at home.

Wait, you use WHAT?! :) I run 'nix in a virtual from time to time -- man
I did not see that coming, I even missed it the first time I read that post.
<LOL> Ok then, ahem... moving right along....

Um, VB6 wouldn't do you a lot of good, except for maybe the object browser,
I wrote the component in VC++ with ATL, mostly because dealing with 64 bit
integers in VB was entirely impractical, plus I wanted to keep overhead to a
minimum.

I included the compiled DLL, so building yourself is unnecessary (though if
the places were switched, I'd probably build it just for GP, if possible --
the Internet has become a hostile place.) You will have to call REGSVR32 to
register it, didn't bother with an installer...

But there's just no getting past the Linux thing... guess I didn't think it
was a requirement. :)

[Chris] I think the data is stored in its native binary format, perhaps a
byte array, and a PAGE worth of data is cast into variants as you move
through the recordset. I suspect that the difference between server-side
and client-side cursors is that server-side cursors request multiple
binary chunks as needed, whereas client-side cursors request one huge binary
chunk.

Agreed. One interesting aspect of server cursors is the almost lazy use of
the network. Client cursor is on more of a mission; server cursor is
apparently trying to avoid burdening the network?


-Mark
 
Michael D. Kersey

Chris said:
- The result set (a million rows) seems awfully large: perhaps testing
also with a range of smaller values (and some variety of column counts,
also - in my experience most result sets are more than two columns) would
be useful?

Whose test are you talking about? I believe Mark's test uses the same data
as the original article which uses approximately 400 rows and two columns.
As far as I can tell, the only changes he made were to replace the Timer
calls with calls to his higher-resolution timer component and to use
explicit field object references, which he discussed earlier in the thread.

- As one poster stated earlier, the size of the result set may push the
server to extremes (out of physical memory). This is a problem that is
easily fixed these days: memory is cheap, programming isn't. [Well, it
wasn't until recently!8-(] How much memory is on the IIS server? The
database server?


Yes, this is a valid point. Memory utilization should be taken into account
when deciding among these various methods. This is especially true when the
selected method must scale to many concurrent users. I mention it in
passing in the original article but it could certainly do with a more
thorough treatment. However, for the purposes of the original test, execution
speed was the only metric.

- One of the benchmarks had multiple rs.GetString() method calls, which
may bias the test results. While multiple GetString() method calls are
interesting, a second subroutine that performs only a single GetString()
method call (as is done in another of the benchmarks) would be nice for
comparison.


Oh, I see, you're talking about the revised code I posted in response to
Bob. That's what the million row reference was about as well?

Yes. Sorry about the confusion: I was looking at two benchmarks at the
same time.

The purpose of that code was to incorporate some of the observations that
had been made thus far in the thread. Specifically:
1. Mitigate the 15 ms Timer() resolution issue by using a larger resultset. I
opted for this instead of the inner loop suggested by Mark, since it is more
representative of how one moves through a recordset, namely in one pass. It
also avoids contaminating the results with additional loop-variable costs.
2. Create a baseline measure for object creation and opening time.
3. Use explicit field object references in the recordset iteration method.
4. Add the SaveXML procedure to test the performance of persisting data
directly to Response using the Recordset.Save method. I've always been
curious to see how this would perform.

I'm glad you added this. I've seen it used several times and remember
being quite puzzled the first time I saw it used!

5. Use multiple calls to GetString to mitigate the effects of large string
concatenation. I did start out using a single call, but the script was
timing out.

BTW do you remember whether that was a script timeout
(Server.ScriptTimeout) or a connection (conn.connectionTimeout) timeout?
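
To illustrate point 1 in the list above: a timer that only advances every
15 ms or so cannot resolve a short operation, which is why a larger workload
helps. The following is a minimal Python sketch of that effect, not code from
the thread. TICK_NS, coarse_ns, measure and worst_case_error are hypothetical
names, and the 15 ms tick is simply the figure quoted above.

```python
import time

# Assumed granularity of the coarse timer: the ~15 ms quoted in the thread.
TICK_NS = 15_000_000


def coarse_ns(now_ns: int) -> int:
    """Round a nanosecond timestamp down to the last coarse tick."""
    return now_ns - (now_ns % TICK_NS)


def measure(work, repetitions: int):
    """Time `repetitions` calls of `work` with a fine and a simulated coarse clock.

    Returns (fine_elapsed_ns, coarse_elapsed_ns).
    """
    start = time.perf_counter_ns()
    for _ in range(repetitions):
        work()
    end = time.perf_counter_ns()
    return end - start, coarse_ns(end) - coarse_ns(start)


def worst_case_error(elapsed_ns: int) -> float:
    """Upper bound on the relative error that one tick of uncertainty can cause."""
    return TICK_NS / elapsed_ns
```

Calling measure with a small repetition count will usually report a coarse
elapsed time of 0 or one full tick, whatever the real cost was. As the
workload grows, the one-tick uncertainty shrinks relative to the total, which
is what worst_case_error expresses.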
 
Chris Hohmann

Michael D. Kersey said:
BTW do you remember whether that was a script timeout
(Server.ScriptTimeout) or a connection (conn.connectionTimeout) timeout?

It was a script timeout.
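
For readers following point 5 (several GetString calls instead of one), the
pattern is to fetch a fixed number of rows per call and write each piece out
immediately, so no single huge string is ever built. A rough Python sketch of
that idea follows. It does not use ADO: get_string and write_in_chunks are
hypothetical stand-ins for a GetString call limited by its NumRows argument
and for writing to Response, and the delimiters are arbitrary choices.

```python
import io
from itertools import islice


def get_string(rows, num_rows, col_sep="\t", row_sep="\n"):
    """Format up to `num_rows` rows taken from `rows` as one string.

    Returns "" once the iterator is exhausted (hypothetical stand-in for
    a GetString call that is limited to a number of rows).
    """
    chunk = islice(rows, num_rows)
    return "".join(col_sep.join(map(str, row)) + row_sep for row in chunk)


def write_in_chunks(rows, out, num_rows=1000):
    """Write all rows to `out`, `num_rows` rows per call; return the call count."""
    rows = iter(rows)  # one shared iterator, so each call resumes where the last stopped
    calls = 0
    while True:
        text = get_string(rows, num_rows)
        if not text:
            break
        out.write(text)
        calls += 1
    return calls
```

With 2,500 two-column rows and num_rows=1000, write_in_chunks makes three
writes, and the combined output matches what a single large call would have
produced.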
 
Chris Hohmann

Mark J. McGinty said:
Chris Hohmann said:
[Chris] Yes, but I had the same problem Bob had.

Sorry about that, it's fixed, I tested the dl from my laptop. I also
ended up configuring a new net block for my office on the fly, I was
whining to my ISP about a pesky little drop-off problem, and they noticed
I was in a wrong block, so I was down for an hour or so today. (Thank
goodness for short TTL.)

Give it another try when you get a chance..
http://www.databoundzone.com/highrestimer.zip

I already did. :)

Wait, you use WHAT?! :) I run 'nix in a virtual from time to time --
man I did not see that coming, I even missed it the first time I read that
post. <LOL> Ok then, ahem... moving right along....

I should have said, I use Linux WITH Mono at home. :)

Um, VB6 wouldn't do you a lot of good, except for maybe the object
browser, I wrote the component in VC++ with ATL, mostly because dealing
with 64 bit integers in VB was entirely impractical, plus I wanted to keep
overhead to a minimum.

Sorry, force of habit. I should have said, I don't have Visual Studio 6 at
work.

I included the compiled DLL, so building yourself is unnecessary (though
if the places were switched, I'd probably build it just for GP, if
possible -- the Internet has become a hostile place.) You will have to
call REGSVR32 to register it, didn't bother with an installer...

Yes, I found it in the ReleaseUMinDependency folder eventually. It takes me
a little longer than most, but I get there eventually. :) On a side note, I
did some looking into converting your component to a Windows Script
Component (WSC). One problem I'm encountering is how to get access to
QueryPerformanceCounter et al. I see that I can get to it via the
Win32_PerfRawData_PerfOS_Processor in Windows Management
Instrumentation (WMI), but that seems like a lot of overhead just for a
timer. Do you have any idea how I can call QueryPerformanceCounter directly
from WSC? I don't think it's possible, but I'd love to be wrong. The reason
I'd like to convert it to a WSC is so that it could be included in a rewrite
of the article without requiring the reader to register the component.
 
Mark J. McGinty

[snip]
Yes, I found it in the ReleaseUMinDependency folder eventually. It takes
me a little longer than most, but I get there eventually. :)

A little confusing, I agree. Last night I reorganized the zip a little: I
created folders DLL, Test Scripts and Docs, and moved files appropriately.

I also fancied-up the docs
(http://www.databoundzone.com/highrestimerdocs.htm) for the HiResTimer
interface (which I like a lot more than my first attempts.) I'm happy
enough with its set of properties and method, but am totally open to any
object name suggestions. I'm thinking it will only take about 10 minutes to
create a new component and copy the code from HiResTimer to it -- way easier
than trying to remove the old interfaces from the IDL and all... so that is
my intent.

On a side note, I did some looking into converting your component to a
Windows Script Component (WSC). One problem I'm encountering is how to
get access to QueryPerformanceCounter et al. I see that I can get to it
via the Win32_PerfRawData_PerfOS_Processor in Windows Management
Instrumentation (WMI), but that seems like a lot of overhead just for a
timer. Do you have any idea how I can call QueryPerformanceCounter
directly from WSC? I don't think it's possible, but I'd love to be wrong.
The reason I'd like to convert it to a WSC is so that it could be included
in a rewrite of the article without requiring the reader to register the
component.

I didn't even think about WMI... but you're right that's a lot of o/h,
likely rendering it useless for our purposes here. I think you're right
about it being impossible from WSC, for lack of access to the API for one
thing, and difficulty handling LARGE_INTEGER for another.

I could wrap it in a quick Install Vise setup app, and I may be able to get
permission to sign it, if that'd help. The installer would at least give
them a painless way to register and remove it.

I did some looking-into the threadedness/SMP aspect of it. The counter
value will be read from whatever CPU is executing the thread, but it doesn't
matter because all chips will apparently have the same value. I tested it
out on a 2-processor box and it seems to hold true (I didn't get exact
matches but that's due to the way threads are scheduled, I believe; the
amount of difference seems consistent with theory.) So I'm assuming it's a
non-issue.

What are your thoughts as far as the ideal thread model goes? Is 'both' the
right choice?


-Mark
 
Chris Hohmann

Mark J. McGinty said:
[snip]
Yes, I found it in the ReleaseUMinDependency folder eventually. It takes
me a little longer than most, but I get there eventually. :)

A little confusing, I agree. Last night I reorganized the zip a little: I
created folders DLL, Test Scripts and Docs, and moved files appropriately.

I also fancied-up the docs
(http://www.databoundzone.com/highrestimerdocs.htm) for the HiResTimer
interface (which I like a lot more than my first attempts.) I'm happy
enough with its set of properties and method, but am totally open to any
object name suggestions. I'm thinking it will only take about 10 minutes
to create a new component and copy the code from HiResTimer to it -- way
easier than trying to remove the old interfaces from the IDL and all... so
that is my intent.

On a side note, I did some looking into converting your component to a
Windows Script Component (WSC). One problem I'm encountering is how to
get access to QueryPerformanceCounter et al. I see that I can get to it
via the Win32_PerfRawData_PerfOS_Processor in Windows Management
Instrumentation (WMI), but that seems like a lot of overhead just for a
timer. Do you have any idea how I can call QueryPerformanceCounter
directly from WSC? I don't think it's possible, but I'd love to be wrong.
The reason I'd like to convert it to a WSC is so that it could be
included in a rewrite of the article without requiring the reader to
register the component.

I didn't even think about WMI... but you're right that's a lot of o/h,
likely rendering it useless for our purposes here. I think you're right
about it being impossible from WSC, for lack of access to the API for one
thing, and difficulty handling LARGE_INTEGER for another.

Yeah, that's what I feared. With regards to LARGE_INTEGERS, I think you
cheat the system by using Currency.

I could wrap it in a quick Install Vise setup app, and I may be able to
get permission to sign it, if that'd help. The installer would at least
give them a painless way to register and remove it.

I did some looking-into the threadedness/SMP aspect of it. The counter
value will be read from whatever CPU is executing the thread, but it
doesn't matter because all chips will apparently have the same value. I
tested it out on a 2-processor box and it seems to hold true (I didn't get
exact matches but that's due to the way threads are scheduled, I believe;
the amount of difference seems consistent with theory.) So I'm assuming
it's a non-issue.

What are your thoughts as far as the ideal thread model goes? Is 'both' the
right choice?

We're not storing the timer in the Application/Session scope so I think
either model is fine. With regard to the absence of GetThreadAffinity you
mentioned earlier, I don't think it's necessary since, on success,
SetThreadAffinityMask returns the thread's previous affinity mask.
 
Chris Hohmann

Yeah, that's what I feared. With regards to LARGE_INTEGERS, I think you
cheat the system by using Currency.

That should read ..."I think you CAN cheat the system by using Currency".
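
The Currency trick works because a COM Currency value is a 64-bit integer
scaled by 10,000, so a raw 64-bit counter read into a Currency simply appears
divided by 10,000. Since the counter readings and the frequency are scaled
alike, the factor cancels when computing elapsed time. The Python sketch below
shows only that arithmetic; it does not call QueryPerformanceCounter, and the
function names and sample numbers are made up for illustration.

```python
from fractions import Fraction

# A Currency value is a 64-bit integer with an implied scale of 10,000.
CURRENCY_SCALE = 10_000


def as_currency(raw_ticks: int) -> Fraction:
    """Value seen when a raw 64-bit tick count is interpreted as Currency."""
    return Fraction(raw_ticks, CURRENCY_SCALE)


def elapsed_seconds(start, end, frequency):
    """Elapsed time from two counter readings and the counter frequency."""
    return (end - start) / frequency


# Hypothetical readings: a 3,579,545 Hz counter sampled two seconds apart.
freq = 3_579_545
start = 123_456_789
end = start + 2 * freq

raw_result = elapsed_seconds(start, end, freq)
currency_result = elapsed_seconds(
    as_currency(start), as_currency(end), as_currency(freq)
)
```

Both raw_result and currency_result come out as two seconds, since the 10,000
scale divides out of the ratio.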
 
Mark J. McGinty

We're not storing the timer in the Application/Session scope so I think
either model is fine. With regard to the absence of GetThreadAffinity you
mentioned earlier, I don't think it's necessary since, on success,
SetThreadAffinityMask returns the thread's previous affinity mask.

Yeah I saw that, but didn't feel really comfortable with altering such a
thing blindly (setting it to the process' affinity mask) within the process
space of a server app, just to find out what it was before I changed it...
didn't seem like a polite thing to do. :)

-Mark
 
Mark J. McGinty

Wow this was a really long thread! :)

I got some info about the processor affinity thing:

When running on an SMP box, is there any way to determine which CPU is
executing your code? [...]

The thread's affinity mask as well as the ideal processor could
be retrieved with the NtQueryInformationThread native API function.
However, I do not know a way to get (in user mode) the thread's
soft affinity (which is used internally by the scheduler to "mark"
the last CPU a thread was executed on).

Actually I should mention what brought this up: I wrote a quickie COM
object to call QueryPerformanceCounter from within ASP, for
profiling purposes. [...]

Since the implementation of QueryPerformanceCounter may read a
counter from the CPU, does the system make sure it always reads that
counter from the same CPU?

No, it does not. Most implementations (depends on the HAL) simply
use the rdtsc assembly instruction to read the CPU-internal clock
count. However, as each CPU gets the same clock signal, their
rdtsc values should also be the same.

The QueryPerformanceCounter spec gives some more evidence: "On a
multiprocessor machine, it should not matter which processor is
called. However, you can get different results on different
processors due to bugs in the BIOS or the HAL."


So it depends on the machine...

Well, I suppose you can assume that a more or less recent SMP
machine has no such bugs in the BIOS or HAL.


[Written by Daniel Lohman]


-Mark
 
