DataSet too Memory Intensive!

Z

ziondreams

I've got a couple of very simple web services that query a MSSQL
database, filling a DataSet with the data, applying an XSLT stylesheet
to the data, then returning an XmlDocument (as an XmlNode). I'd also
add that I'm using C#. I've recently been told by my company that I
won't be able to use DataSets as they have a very large memory
footprint and the possibility of pulling back very large amounts of
data is pretty large.

Therefore, I need some other way to get the data from the database and
get it into a form that I can apply an XSLT stylesheet to it. I've been
messing with XmlReader and actually calling a
[SqlCommand].executeXmlReader() but then I can't get that into anything
else to work with it (for example, a XmlDocument, etc.) I would prefer
to have the final result (before the stylesheet transformation) in an
XmlDataDocument, as that's what I pass to my Transform call.

I'm pretty new to WebServices in .Net (month or so) and I'm sure I'm
just missing something minimal. I scoured the newsgroups and couldn't
find anything related so I thought I'd post here. Thanks in advance for
any replies!!
 
D

Dan Rogers

Hi,

If I understand it, you are using dataset as a way to access data and
manage query/response. This isn't unusual. The normal areas where the
kinds of concerns you mention come up are typically due to exposing
"search" methods that let the caller specify either the entire query, or a
"where" clause component that lets them generate an enormous response set.

If I were an operations architect again, I'd be concerned if I saw the
ability to generate uncontrollably large data sets with no built in row
limits (say, 100 results, to narrow down, narrow your search), coupled with
XSLT and the DOM all in one method invocation. Each of these, especially
when coupled with large result sets, can make a substantially able server
fail under high volume loads.

You mention you are returning XmlNode's. Is there any reason you need to
manipulate your data internal to your method as XML? A more efficient
implementation might create a set of native .NET classes that compose the
data structures you want to accept/return, and use datasets coupled with
row-limit throttles, to simply retrieve the data. The basic algorithm then
becomes run the query, copy the results into your native data structure,
and return it. This allows the serializer to create the resulting XML, and
avoids spinning up additional memory and performance intensive operations
(e.g. DOM and XSLT on the server).

I would suggest introducing server side row limits, paging if necessary
*(SQL Server and Datasets support this nicely) to accommodate the
client-side need to get around row limits, and eliminating XML on the
server side by copying the data from the temporary dataset to an in-memory
CLR class that holds native data types.

I hope this helps,

Dan Rogers
Microsoft Corporation


--------------------
From: "ziondreams" <[email protected]>
Newsgroups: microsoft.public.dotnet.framework.aspnet.webservices
Subject: DataSet too Memory Intensive!
Date: 30 Nov 2004 15:20:14 -0800
Organization: http://groups.google.com
Lines: 21
Message-ID: <[email protected]>
NNTP-Posting-Host: 206.230.226.3
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
X-Trace: posting.google.com 1101856817 31540 127.0.0.1 (30 Nov 2004 23:20:17 GMT)
X-Complaints-To: (e-mail address removed)
NNTP-Posting-Date: Tue, 30 Nov 2004 23:20:17 +0000 (UTC)
User-Agent: G2/0.2
Complaints-To: (e-mail address removed)
Injection-Info: f14g2000cwb.googlegroups.com; posting-host=206.230.226.3;
posting-account=T5zaQg0AAACHYmNSDRj4LD8AynSRRKnF
Path: cpmsftngxa10.phx.gbl!TK2MSFTNGXA03.phx.gbl!TK2MSFTNGXA01.phx.gbl!TK2MSFTFEED
02.phx.gbl!tornado.fastwebnet.it!tiscali!newsfeed1.ip.tiscali.net!news.glorb
.com!postnews.google.com!f14g2000cwb.googlegroups.com!not-for-mail
Xref: cpmsftngxa10.phx.gbl microsoft.public.dotnet.framework.aspnet.webservices:26910
X-Tomcat-NG: microsoft.public.dotnet.framework.aspnet.webservices

I've got a couple of very simple web services that query a MSSQL
database, filling a DataSet with the data, applying an XSLT stylesheet
to the data, then returning an XmlDocument (as an XmlNode). I'd also
add that I'm using C#. I've recently been told by my company that I
won't be able to use DataSets as they have a very large memory
footprint and the possibility of pulling back very large amounts of
data is pretty large.

Therefore, I need some other way to get the data from the database and
get it into a form that I can apply an XSLT stylesheet to it. I've been
messing with XmlReader and actually calling a
[SqlCommand].executeXmlReader() but then I can't get that into anything
else to work with it (for example, a XmlDocument, etc.) I would prefer
to have the final result (before the stylesheet transformation) in an
XmlDataDocument, as that's what I pass to my Transform call.

I'm pretty new to WebServices in .Net (month or so) and I'm sure I'm
just missing something minimal. I scoured the newsgroups and couldn't
find anything related so I thought I'd post here. Thanks in advance for
any replies!!
 
Z

ziondreams

Thank you Dan!

Unfortunetly, there are a few things specific to the project that would
limit the uses specified in your post. I'll try and clear some things
up here.

First of all, there is a reason I need to manipulate my data format as
XML server side. The reason for this is the way our client base is set
up. Every one of our clients have a differing data structure and can
even update that structure on our databases whenever they please. This
means that the names of the columns they will want to retrieve are
completely dynamic with the exception of native fields to our system,
such as First Name, etc. The way I understand using .Net classes to
return a structure from the web service is that you have to have the
structure (schema, if you will) set up in the class before. Because
there is no way of knowing what fields the user will request, or the
field names (because they can have fields with whatever names they
want) that will be requested, I don't believe this would work with this
project. Am I right?

Next, I would love to use paging (row limits) and will ask my superiors
if that is an option. However, for now lets assume it's not as I think
the client wants to be able to get the data all in one go. (I know, it
could be a nightmare.) So, if that is the case, and with my above
statements, is there an easy way to achieve what I asked in my first
post? (I'm really hoping so!) :)

Also, I'm not quite sure what you mean when you mention DOM. I've seen
this term used before and I think it refers to Document Object Model,
but what does it mean in this context?

Finally, I will mention that the need to validate an XML schema using
xsd is not a great need for us. If I understand correctly, that is a
great benifit of using native data types in .Net classes right?

Well, thanks for all of your help and I hope I wasn't too confusing! :)
 
D

Dan Rogers

Hi,

Given these things, I would suggest not implementing this as a web service,
but as an HTTP web page applicaiton that returns XML instead. The reason is
that adding in a web service facade to this problem really just complicates
the situation by increaing the number of infrastructure settings you have
to defeat to make your service work - this is what you've been
experiencing, from what I can tell.

A benefit of moving back to say, ASP pages, is you can then use some other
means of getting data - such as ADO. But this really doesn't solve your
problem of having large in-memory footprints. By the way, yes, I was
referring to the DOM (document object model) - which is what you will find
in the XmlDocument class and in our Parser - My bad if I refer to all of
these things generically as "the DOM".

That said, it still doesn't solve your issues around using too many server
resources - so I would probably flag this as a limited # of users design -
as it will almost certainly fall over under load unless provisioned on a
VERY fast, high capacity web server - and even then, we are talking
hundreds or dozens of simultaneous requests at most.

You _CAN_ create a service interface that does what you need - as I suspect
you have created an interface that either exposes XmlNode as the return
type and argument to your methods - or you can use "object" as the type.
Having a system that lets the users define the data that is stored is a
really difficult problem, I do understand.

Sorry if I can't be of more help - making this less memory intensive
without being able to introduce throttles on request/response size is going
to require "extreme magic"...

Good luck

Dan Rogers
Microsoft Corporation


--------------------
 
B

Ben Strackany

Have you looked into using "for xml explicit" from within SQL Server? Then
do something like

XmlDocument newDocument = new XmlDocument();
mySqlCommand.Connection = mySqlConnection;
mySqlCommand.CommandText = "exec spStoredProc blah blah blah";
mySqlCommand.CommandType = CommandType.Text;

XMLResultReader = mySqlCommand.ExecuteXmlReader();
if (XMLResultReader != null) {
newDocument.Load(XMLResultReader);
}
 
Z

ziondreams

Thank you for your input Dan! I will see if that will work for us. I
hadn't even realized I was implementing web services in a way that
wasn't really meant to be. Also, thank you Ben for your comments.
Ironically, that was exactly how I was attempting to load the data.
And, if I can get that to work correctly I think it will be the best
way to go! The problem with that approach lies in the fact that if more
than one row returns, then only that one row gets loaded into the
XmlDocument. (by the way, I'm using FOR XML AUTO because I couldn't get
EXPLICIT to work.)

Is there a way to loop through the XmlReader and somehow grab each row
returned and load it into it's own node or somthing? Thanks again!
 
Z

ziondreams

Thank you Ben for your comments!
Ironically, that was exactly how I was attempting to load the data.
And, if I can get that to work correctly I think it will be the best
way to go! The problem with that approach lies in the fact that if more
than one row returns, then only that one row gets loaded into the
XmlDocument. (by the way, I'm using FOR XML AUTO because I couldn't get
EXPLICIT to work.)

Is there a way to loop through the XmlReader and somehow grab each row
returned and load it into it's own node or somthing? Thanks again!
 
Z

ziondreams

Thank you for your input Dan! I will see if that will work for us. I
hadn't even realized I was implementing web services in a way that
wasn't really meant to be.
 
D

Dan Rogers

To do that, you have to an enclosing XML element in your stored proc.

Dan
--------------------
 
Z

ziondreams

Thank you for your input Dan! I will see if that will work for us. I
hadn't even realized I was implementing web services in a way that
wasn't really meant to be. Also, thank you Ben for your comments.
Ironically, that was exactly how I was attempting to load the data.
And, if I can get that to work correctly I think it will be the best
way to go! The problem with that approach lies in the fact that if more
than one row returns, then only that one row gets loaded into the
XmlDocument. (by the way, I'm using FOR XML AUTO because I couldn't get
EXPLICIT to work.)

Is there a way to loop through the XmlReader and somehow grab each row
returned and load it into it's own node or somthing? Thanks again!
 
D

Dan Rogers

Hi Peter,

To handle the repeating elements issue, you need to add a root element at
the source. SQL XML will gladly turn repeating rows into what it regards
as independent XML instances. The trouble is that in a given XML stream,
there is only supposed to be one root element. The work around is to make
your stored proc/SQL XML statement add a fake root so that whether one or
multiple rows come back, only one root is returned to the XML reader.

Dan
--------------------
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,997
Messages
2,570,239
Members
46,827
Latest member
DMUK_Beginner

Latest Threads

Top