please confirm something (jdbc)

T

Thomas Kellerer

Lew wrote on 29.11.2007 16:01:
Thomas said:
Note that using ByteArrayOutputStream.toByteArray() will create a copy
of its internal buffer so that will double the amount of memory that
you need. That's the reason I wrote my own dynamic byte array (where I
can access the byte array without copying it around)

Using ByteArrayOutputStream.toByteArray() will also guarantee that "the
valid contents of the buffer have been copied into [the resultant
array]", and that the resultant array is the right size. I find these
guarantees to be worth the copy overhead for reasonable-sized objects
(under a couple MB, as is typical for, say, graphic images, one use case
for BLOBs).

Given all the memory re-allocation that goes on inside the
ByteArrayOutputStream anyway, I'm not sure that one more allocation is
all that much extra overhead. If you do it right, the internal array of
the ByteArrayOutputStream will immediately become eligible for GC right
away anyway, so the "efficiency" you gain from avoiding that one copy
might not be worth the extra risk and effort of maintaining a custom class.

It depends on how L your BLOB is, naturally. I can see how a large
enough object would require striped processing and other tricks, but
like most optimization this strikes me as one of those "Don't do it 'til
you have to, and don't think you have to until you've measured" scenarios.
Valid points, indeed, especially the one about the array beeing GC'ed pretty
soon. But still, it will make a difference if I hold 20 or 40MB in memory.

Well I am processing BLOB where I don't know the size (and the number) in
advance, and they could potentially be quite big. My own ByteBuffer does
basically the same as ByteArrayOutputStream with the exception that I expose
access to the array (which is also guaranteed to have the right size).

I only partially agree with your optimization strategy though. If I have two
ways of coding something, and I know that one will definitely use less memory
(or be quicker for whatever reason) then I will choose the one "better" one. I
think one reason for slow software is the attitude "I'll fix the performance
issue later", because "later" you'll have so many places to fix that it will
probably be hard to nail down the real cause.

Cheers
Thomas
 
S

stc

Thomas Kellerer said:
available() cannot be used to retrieve the number of bytes for an
InputStream.

It does not claim to return the length of the underlying stream. Actually
I doubt that it even knows the size of the "source".
You should create your tmp buffer with a fixed size (e.g. new byte[1024]),
then it should work. I'm not sure I understand what you mean with "read()
blocks" but that could well be caused by your buffer of length zero.

That's probably it. I used available() because it gave me what I needed when
called on ByteArrayInputStream so I didn't bother to check the javadocs.
I'll try with a fixed-size buffer tomorrow and see what happens but there's
no reason not to work :)

Regarding the getBytes(), I can confirm that it works with PostgreSQL 8.1.4
(with 8.1.407 JDBC3 driver).

Thanks for the help...
 
M

Martin Gregorie

Martin said:
It looks to me as if, for a PipedInputStream, the value returned by
available() can be invalidated by the content of the byte array but that
read() will always correctly recognise the end of the byte array.

I'm about to modify my code to check this assumption and will let you
know the result.
Here's the follow-up: I was right in my guess. My solution is to replace:

InputStream is = msg.getContent();
prepsql.setBinaryStream("content", is, is.available());

with
InputStream is = msg.getContent();
int isc = is.available();
ByteArrayOutputStream bos = new ByteArrayOutputStream(isc);
int b;
while ((b = is.read()) != -1)
bos.write(b);

byte[] ba = bos.getByteArray();
bos.close();
ByteArrayInputStream bis =
new ByteArrayInputStream(ba);
prepsql.setBinaryStream("content", bis, ba.length);

Its a bit ugly but it works reliably and, by declaring the
ByteOutputStream and byte[] locally in a method the overall storage can
be minimised. The value of isc might be anywhere in the range 0 to
ba.length, so supplying it as an argument to the ByteArrayOutputStream
constructor may help performance and certainly can't do any harm.

FWIW the value returned by InputStream.available() does seem to be
implementation independent. It seems to always be the size of the array
for ByteArrayInputStream.
 
S

steve

I'll take that into consideration. Thanks again...

no it does not!!

Heres one at the start, I also have 2 & 3 blobs at the same time , in any
position.

String sql =
"Select object_code,client_file_name,filedatestamp from
client_code_java where deleted=0 and rep_index=?";
PreparedStatement st = dbconn.prepareStatement(sql);

st.setString(1, indexkey);

// Bind the replication index
rset = st.executeQuery();

// Execute Query
oracle.sql.BLOB blob = null;

while (rset.next()) {
blob = (oracle.sql.BLOB) rset.getObject(1);
filename = rset.getString(2);
newModifiedTime = rset.getLong(3);
 
S

steve

Thomas Kellerer said:
available() cannot be used to retrieve the number of bytes for an
InputStream.

It does not claim to return the length of the underlying stream. Actually
I doubt that it even knows the size of the "source".
You should create your tmp buffer with a fixed size (e.g. new byte[1024]),
then it should work. I'm not sure I understand what you mean with "read()
blocks" but that could well be caused by your buffer of length zero.

That's probably it. I used available() because it gave me what I needed when
called on ByteArrayInputStream so I didn't bother to check the javadocs.
I'll try with a fixed-size buffer tomorrow and see what happens but there's
no reason not to work :)

Regarding the getBytes(), I can confirm that it works with PostgreSQL 8.1.4
(with 8.1.407 JDBC3 driver).

Thanks for the help...

1.don't try to read the whole blob into memory, process it as a buffered
stream.


I.E

while (rset.next()) {
blob = (oracle.sql.BLOB) rset.getObject(1);
thebuffer = new BufferedInputStream(blob.getBinaryStream());
}

or write it to a temp file.



2. store the size of the blob , in an extra field in the record, it's way
faster to recover


newmessageClob.setString(1, message);
st.setString(1, "");
st.setString(2, "");
st.setString(3, theuser);
st.setString(4, sendto);
st.setString(5, subject);
st.setString(6, "Packingfile.zip");
st.setLong(7, newmessageClob.length());
st.setClob(8, newmessageClob);
st.setBlob(9, newfileBlob);
st.setLong(10, newfileBlob.length());
st.setInt(11, 1);
st.executeUpdate();
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,995
Messages
2,570,236
Members
46,822
Latest member
israfaceZa

Latest Threads

Top