Leaking memory when writing to URL

Simon Andrews

I've got a class which sends multipart MIME data as a POST request to a
web server. It works fine with small files, but larger files (still
under 50 MB) cause it to throw an OutOfMemoryError:

Exception in thread "Thread-0" java.lang.OutOfMemoryError: Java heap space

I've looked through the relevant code and can't see anywhere that I'm
caching the data being sent, and I really can't see why the memory
consumption would be related to file size.

I've found that if I don't write the file data to the OutputStream
(comment out outStream.write(bytes,0,a);) then it completes OK so it
looks like the OutputStream is caching something - but I can't see any
option to control this. I've tried flushing after each write, but with
no effect.

A cut-down version of the relevant bit of code is below. Any clues as
to how to debug this are most welcome.

Cheers

Simon.

import java.io.*;
import java.net.*;

public class TestFileUpload {

    public static void main(String[] args) {
        OutputStream outStream = null;
        try {
            HttpURLConnection h = (HttpURLConnection)
                new URL("http://localhost/cgi-bin/test.cgi").openConnection();

            h.setAllowUserInteraction(false);
            h.setRequestMethod("POST");
            h.setDoOutput(true);
            h.setUseCaches(false);

            h.connect();

            outStream = h.getOutputStream();
        }
        catch (Exception e) {
            e.printStackTrace();
        }

        int byteCount = 0;

        FileInputStream fi = null;
        try {
            fi = new FileInputStream(new File("C:/big.xml"));
        }
        catch (FileNotFoundException e) {
            e.printStackTrace();
        }
        DataInputStream di = new DataInputStream(fi);

        byte[] bytes = new byte[1024];

        try {
            int a;
            while ((a = di.read(bytes)) > 0) {
                outStream.write(bytes, 0, a);
                byteCount += a;
            }

            outStream.flush();
            outStream.close();
        }
        catch (IOException ioe) {
            ioe.printStackTrace();
        }

    }
}
 
Robert Klemme

Simon said:
I've got a class which sends multipart MIME data as a POST request to
a web server. It works fine with small files, but larger files (still
<50Mb) cause it to throw an OutOfMemory exception:

Exception in thread "Thread-0" java.lang.OutOfMemoryError: Java heap
space

I've looked through the relevant code and can't see anywhere that I'm
caching the data being sent, and I really can't see why the memory
consumption would be related to file size.

Maybe it's storing the file in memory to be able to write a
Content-Length header. For serious applications I'd always use another
HTTP client, such as the one from Apache:
http://jakarta.apache.org/commons/httpclient/

I've found that if I don't write the file data to the OutputStream
(comment out outStream.write(bytes,0,a);) then it completes OK so it
looks like the OutputStream is caching something - but I can't see any
option to control this. I've tried flushing after each write, but
with no effect.

Btw, I suggest you rethink the way you use exceptions. Either catch them
at the end of the method or declare them as thrown; catching them in
the middle and continuing doesn't seem a good way to deal with failure,
IMHO.

Kind regards

robert
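
For illustration, the exception-handling advice above could look roughly like the following. This is a minimal sketch, not the code from the original script: the class name CopyWithExceptions and the helper copy() are hypothetical, System.out stands in for the connection's OutputStream, and try-with-resources assumes Java 7 or later.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class CopyWithExceptions {

    // Hypothetical helper: copies everything from 'in' to 'out' and returns
    // the number of bytes copied. It declares IOException instead of catching
    // it in the middle, so a failure stops the copy and reaches the caller.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[1024];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) > 0) {
            out.write(buffer, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) {
        // One try block around the whole operation: if opening the file or
        // copying fails, nothing after the failure runs with a null stream.
        try (InputStream in = new FileInputStream("C:/big.xml")) {
            long sent = copy(in, System.out); // System.out is only a stand-in target
            System.err.println("Copied " + sent + " bytes");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
```

With this shape, a missing file is reported once and the copy loop is never reached, instead of continuing with a null stream as in the posted code.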
 
Simon Andrews

Robert said:
Maybe it's storing the file in mem to be able to write a content-size
header. For serious application I'd always use another HTTP client such
as the one from apache.
http://jakarta.apache.org/commons/httpclient/

That would make sense (even though I think the Content-Length header is
optional for these kinds of requests). I'll look into Jakarta - cheers!

Btw, I suggest you rethink the way you use exceptions. Either catch them
at the end of the method or declare them as thrown but catching them in
the middle and continuing doesn't seem a good way to deal with failure
IMHO.

The code posted was a hacked-together version of what's actually used in
the script (which is a lot better controlled but contains a lot of
irrelevant stuff) - the exception handling here is the automated stuff
Eclipse adds just to allow it to compile :)

Thanks again

Simon.
 
Simon Andrews

Robert said:
Maybe it's storing the file in mem to be able to write a content-size
header.

After a bit more playing this led me to what actually happens.

In fact the HttpURLConnection doesn't calculate a Content-Length for you
(you can set one yourself, but it's not checked) - but it does cache all
of the data sent to it. For some reason it doesn't actually send any
data until you call either getInputStream() or getResponseCode() - no
matter how much data you've written to the OutputStream, it never leaves
the client until one of those methods is called. This makes it useless
for sending large POST requests.

I'd like to think that there was a way to override this so that the
caching is removed and the data is sent as soon as it's written - but I
couldn't find one. If anyone can tell me how to prevent this caching
I'd certainly appreciate it!

Cheers

Simon.
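
One way to check the observation above is to count what a local server has received before and after getResponseCode() is called. The following is only a sketch under assumptions: the class BufferingDemo and its run() method are hypothetical, and com.sun.net.httpserver.HttpServer (bundled with Sun/Oracle/OpenJDK builds since Java 6) is used purely as a stand-in for the real web server.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.util.concurrent.atomic.AtomicLong;

class BufferingDemo {

    // Hypothetical helper: POSTs 'bodySize' bytes to a throwaway local server and
    // returns { bytes the server saw before getResponseCode(), bytes seen after }.
    static long[] run(int bodySize) throws IOException, InterruptedException {
        AtomicLong seen = new AtomicLong();
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            // Count every request-body byte that actually arrives at the server.
            InputStream in = exchange.getRequestBody();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) > 0) {
                seen.addAndGet(n);
            }
            exchange.sendResponseHeaders(200, -1); // 200 with no response body
            exchange.close();
        });
        server.start();
        try {
            URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort() + "/");
            HttpURLConnection h = (HttpURLConnection) url.openConnection();
            h.setRequestMethod("POST");
            h.setDoOutput(true);

            OutputStream out = h.getOutputStream();
            out.write(new byte[bodySize]);
            out.flush();
            out.close();

            Thread.sleep(500);          // give any sent data time to arrive
            long before = seen.get();
            h.getResponseCode();        // per the thread, this triggers the send
            long after = seen.get();
            h.disconnect();
            return new long[] { before, after };
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws Exception {
        long[] result = run(1000000);
        System.out.println("before getResponseCode(): " + result[0] + " bytes");
        System.out.println("after getResponseCode():  " + result[1] + " bytes");
    }
}
```

If the first number stays at zero while the second matches the body size, the whole body was held on the client until the response was requested, which is consistent with heap usage growing with file size.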
 
Robert Klemme

Simon said:
That would make sense (even though I think the content-size is
optional for these kinds of requests. I'll look into Jakarta -
cheers!

Yeah, but the HTTP implementation of the JDK is quite rudimentary. It
even allows proxy usage only via an ugly hack...

The code posted was a hacked together version of what's actually used
in the script (which is a lot better controlled but contains a lot of
irrelevant stuff) - the exception handling here is the automated stuff
eclipse adds just to allow it to compile :)

My Eclipse lets me choose whether I want a try-catch block or a throws
declaration. :)

Thanks again

You're welcome!

Kind regards

robert
 
Svante Frey

Simon said:
In fact the HttpURLConnection doesn't calculate a content-size for you
(you can set one yourself, but it's not checked) - but it does cache all
of the data sent to it. For some reason it doesn't actually send any
data until you either call getInputStream() or getResponseCode() - no
matter how much data you've sent to the OutputStream it never leaves the
client until one of those methods is called. This makes it useless for
sending large POST requests.

I'd like to think that there was a way to override this so that the
caching is removed and the data is sent as soon as it's written - but I
couldn't find one. If anyone can tell me how to prevent this caching
I'd certainly appreciate it!

Sun added this possibility in Java 1.5: have a look at the
documentation for HttpURLConnection.setChunkedStreamingMode. I don't
know if there is any way to do it in earlier Java versions.

Also see http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5026745 and
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4212479 for some
discussion of this problem.
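
As a rough illustration of the setChunkedStreamingMode suggestion, the upload from the first post might be restructured as below. This is a sketch under assumptions, not a tested fix for the original script: the class ChunkedUpload, the helper post() and the 8192-byte chunk size are made up for the example, while the URL and file path are the ones from the original post.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

class ChunkedUpload {

    // Hypothetical helper: POSTs 'body' to 'url' using chunked transfer
    // encoding and returns the HTTP status code of the response.
    static int post(URL url, InputStream body) throws IOException {
        HttpURLConnection h = (HttpURLConnection) url.openConnection();
        h.setRequestMethod("POST");
        h.setDoOutput(true);
        h.setUseCaches(false);
        // Must be set before the connection is opened. It asks the connection
        // to stream the body in chunks instead of buffering all of it first.
        h.setChunkedStreamingMode(8192); // chunk size is an arbitrary choice

        OutputStream out = h.getOutputStream();
        try {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = body.read(buffer)) > 0) {
                out.write(buffer, 0, n);
            }
        } finally {
            out.close(); // finishes the chunked request body
        }
        try {
            return h.getResponseCode();
        } finally {
            h.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new FileInputStream("C:/big.xml");
        try {
            int status = post(new URL("http://localhost/cgi-bin/test.cgi"), in);
            System.out.println("Server answered with HTTP " + status);
        } finally {
            in.close();
        }
    }
}
```

The server has to accept chunked request bodies for this to work, and some CGI setups may not. If the total size is known in advance, HttpURLConnection.setFixedLengthStreamingMode, also added in Java 1.5, is an alternative that sends a Content-Length header instead.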
 
Robert Klemme

Simon said:
After a bit more playing this led me to what actually happens.

In fact the HttpURLConnection doesn't calculate a content-size for you
(you can set one yourself, but it's not checked) - but it does cache
all of the data sent to it. For some reason it doesn't actually send
any data until you either call getInputStream() or getResponseCode() -
no matter how much data you've sent to the OutputStream it never
leaves the client until one of those methods is called. This makes
it useless for sending large POST requests.

I'd like to think that there was a way to override this so that the
caching is removed and the data is sent as soon as it's written - but
I couldn't find one. If anyone can tell me how to prevent this
caching I'd certainly appreciate it!

Just create a thread that reads all content from InputStream and discards
it.

robert
 
