StringBuffer/StringBuilder efficiency

L

Lew

Roedy said:
nd as someone else mentioned, the new method would change
the behavior of today's StringBuilder:
       StringBuilder buff = new StringBuilder();
       buff.append( new String[] { "Hello, ", "world!" } );
       System.out.println(buff);
I don't follow.  The API has an extra method, but I can't see how the
result would be any different from traditional StringBuilder or
StringBuffer.

     With today's StringBuilder, the output on my system is

        [Ljava.lang.String;@3e25a5

With an append(String...) method -- really an append(String[])
method -- I would presumably get the output

        Hello, world!

... an incompatible quiet change.  (For the better, I agree,
but incompatible nonetheless and hence not to be undertaken
rashly.  I don't know of any code that relies on the current
behavior, other than the toy fragment I posted, but I would
not bet that no such code exists.)

This was mentioned upthread on 29 September.
 
R

Roedy Green

With today's StringBuilder, the output on my system is

[Ljava.lang.String;@3e25a5

Hmm. I never before noticed that append could take an Object.

However their is no existing String( Object ) constructor, which is
where I really want the concatenation logic for maximum efficiency.
--
Roedy Green Canadian Mind Products
http://mindprod.com

When you can’t find a bug, you are probably looking in the wrong place. When you can’t find your glasses, you don’t keep scanning the same spot because you are convinced that is where you left them.
~ Roedy
 
R

Roedy Green

How about giving the multi-append a new name, such as "appendAll"? That
way, append of an array would have the append(Object) meaning, but
appendAll of a string[] would append each of its elements.

You could also invent a new StringBuilder, called Cat that was
stripped down of backtracking logic for speed -- doing nothing but cat
(append) operations. However, it still needs the gluing core logic to
reside in String where it can allocate the final String char[] value
the correct size and copy the pieces in directly.
--
Roedy Green Canadian Mind Products
http://mindprod.com

When you can’t find a bug, you are probably looking in the wrong place. When you can’t find your glasses, you don’t keep scanning the same spot because you are convinced that is where you left them.
~ Roedy
 
R

Roedy Green

Isn't it the count of DNS hostnames serving HTTP on port 80?

I think it includes all the parked domains that cyber-squatters are
sitting on. On Netcraft's graphs, look at the red line instead.

The number I would really like in the number of Java-powered server
boxes.
--
Roedy Green Canadian Mind Products
http://mindprod.com

When you can’t find a bug, you are probably looking in the wrong place. When you can’t find your glasses, you don’t keep scanning the same spot because you are convinced that is where you left them.
~ Roedy
 
R

Roedy Green

I think it includes all the parked domains that cyber-squatters are
sitting on. On Netcraft's graphs, look at the red line instead.

Yes. That gives a figure about 72 million. I took my figure from
people quoting Netcraft, not from the graph. That clearly was a
mistake.

"There are over 72 million active websites, many with multiple
servers. Imagine the aggregate bill for the electricity, cooling and
hardware. If you did something to improve the efficiency of them by
even 1%, you would have hugely more than justified your existence in
terms of reducing your energy and green house gas emission footprint."

--
Roedy Green Canadian Mind Products
http://mindprod.com

When you can’t find a bug, you are probably looking in the wrong place. When you can’t find your glasses, you don’t keep scanning the same spot because you are convinced that is where you left them.
~ Roedy
 
D

Dave Searles

Roedy said:
System.arraycopy is a native method.

So it can call memcpy.
I would guess for long strings it tries to do the bulk of the copying
with maximally aligned chunks. One nice side effect of Java is every
object is likely paragraph aligned naturally, which makes copying
naturally efficient. [rest deleted]

Those are considerations if you ever implement your own memcpy.
 
L

Lew

Roedy said:
Hmm. I never before noticed that append could take an Object.

It's been mentioned thrice at least in this thread, most recently in a
response to the very message to which you responded. It's also extremely
clear in the 'StringBuilder' Javadocs that that overload exists.

People should really be in the habit of reviewing the Javadocs for API classes
that they are contemplating using or extending.
 
D

Dave Searles

Thomas said:
It probably calls memmove() (my implementation does), which is almost as
fast as memcpy() (only a small constant overhead to decide in which
direction the copy shall be done) and properly supports overlapping
sources and destinations.

System.arraycopy doesn't need to do that, though, does it? Unless it
supports the same array as source and destination.
 
R

Roedy Green

I wrote and posted FastCat, code that saves addresses of Strings, and
concatentate them when you call toString. It needs a new String
constructor to avoid the final copy, but as it is, it does not buy me
anything noticeable over regular StringBuilders with optimised buffer
sizes. I benchmarked it in my app that expands static macros for my
website.

It is easier to accurately estimate the number of chunks than the
total size of the final String.

--
Roedy Green Canadian Mind Products
http://mindprod.com

I advocate that super programmers who can juggle vastly more complex balls than average guys can, should be banned, by management, from dragging the average crowd into system complexity zones where the whole team will start to drown.
~ Jan V.
 
R

Roedy Green

I wrote and posted FastCat, code that saves addresses of Strings, and
concatentate them when you call toString. It needs a new String
constructor to avoid the final copy, but as it is, it does not buy me
anything noticeable over regular StringBuilders with optimised buffer
sizes. I benchmarked it in my app that expands static macros for my
website.

One of the things I did not appreciate, is each chunk requires a
4-byte overhead, If your chunks are only 2 to 4 bytes long, that is
quite a huge percentage.

I did not convert over StringBuffers that were working a char at a
time.
--
Roedy Green Canadian Mind Products
http://mindprod.com

I advocate that super programmers who can juggle vastly more complex balls than average guys can, should be banned, by management, from dragging the average crowd into system complexity zones where the whole team will start to drown.
~ Jan V.
 
T

Tom Anderson

One of the things I did not appreciate, is each chunk requires a
4-byte overhead, If your chunks are only 2 to 4 bytes long, that is
quite a huge percentage.

I did not convert over StringBuffers that were working a char at a
time.

Okay, so maybe we can have a hybrid approach that accumulates large
strings in the list, but when fed short strings or individual characters,
puts them into a buffer - which it then has to push onto the list when the
next large thing comes along, of course.

final static int SMALL_THRESHOLD = 8;
StringBuilder smallStrings;
List<CharSequence> pieces;

void append(String s) {
if (s.length() < SMALL_THRESHOLD) {
if (smallStrings == null) smallStrings = new StringBuilder();
smallStrings.append(s);
}
else {
if (smallStrings != null) {
pieces.add(smallStrings);
smallStrings = null;
}
pieces.add(s);
}
}

You'd want a bit more finesse than this, to avoid ever growing
smallStrings - if it fills up, you want to push it onto pieces and make a
new one. StringBuilder is probably the wrong choice; you could roll a
little FixedSizeStringBuffer thing.

tom
 
R

Roedy Green

It probably calls memmove() (my implementation does), which is almost as
fast as memcpy() (only a small constant overhead to decide in which
direction the copy shall be done) and properly supports overlapping
sources and destinations.

For a string to array copy it knows there is no overleap. For array to
array it does not. Since both use arraycopy, I guess it does not make
that optimisation.

--
Roedy Green Canadian Mind Products
http://mindprod.com

I advocate that super programmers who can juggle vastly more complex balls than average guys can, should be banned, by management, from dragging the average crowd into system complexity zones where the whole team will start to drown.
~ Jan V.
 
R

Roedy Green

Okay, so maybe we can have a hybrid approach that accumulates large
strings in the list, but when fed short strings or individual characters,
puts them into a buffer

I was thinking along those lines last night. I occurred then that my
pieces ArrayList would contain both Strings and char[]s.

I would need an ArrayList<Object> to contain both. I don't think
there as a way te specify an ArrayList of either String or char[].
Object is the only class they have in common.

Since char[] is not a fully fledged class, there is no hope of it
implementing some common interface with String.

The optimum value of threshhold I could find by experiment. By
setting it to 0, it degenerates to the FastCat algorithm, by cranking
it up high, it degenerates to the accumulating buffer algorithm I
described in my original post.

My concern is if you get an above threshold piece when you have a
largely empty buffer, you could end up wasting a LOT of space.


--
Roedy Green Canadian Mind Products
http://mindprod.com

I advocate that super programmers who can juggle vastly more complex balls than average guys can, should be banned, by management, from dragging the average crowd into system complexity zones where the whole team will start to drown.
~ Jan V.
 
L

Lew

Roedy said:
Okay, so maybe we can have a hybrid approach that accumulates large
strings in the list, but when fed short strings or individual characters,
puts them into a buffer

I was thinking along those lines last night. I occurred then that my
pieces ArrayList would contain both Strings and char[]s.

I would need an ArrayList<Object> to contain both. I don't think
there as a way te specify an ArrayList of either String or char[].
Object is the only class they have in common.

Since char[] is not a fully fledged class, there is no hope of it
implementing some common interface with String.

Technical nit: arrays are "fully fledged" classes.

Technical nit: 'char []' and 'String' do implement an interface in common.
 
M

Mike Schilling

Lew said:
Roedy said:
Okay, so maybe we can have a hybrid approach that accumulates
large
strings in the list, but when fed short strings or individual
characters, puts them into a buffer

I was thinking along those lines last night. I occurred then that
my
pieces ArrayList would contain both Strings and char[]s.

I would need an ArrayList<Object> to contain both. I don't think
there as a way te specify an ArrayList of either String or char[].
Object is the only class they have in common.

Since char[] is not a fully fledged class, there is no hope of it
implementing some common interface with String.

Technical nit: arrays are "fully fledged" classes.

You know what Roedy means, though. It isn't possible for char[] to
implement an interface that Object[] does not. Though creating a
class that contains a (possibly mutable) char[] and implements
CharSquence is trival, and creating a new instance of it is almost
free. Having done so, you can use ArrayList<CharSequence>.
 
L

Lew

Mike said:
You know what Roedy means, though.

That's why I called my comments "nits".
It isn't possible for char[] to
implement an interface that Object[] does not.

That isn't what he said. He didn't discuss Object[] or even String[]. He
said there isn't an interface that char[] and String share.
Since char[] is not a fully fledged class, there is no hope of it
implementing some common interface with String.

While it doesn't invalidate his main point (nor did I say that main point was
not valid), it also is not true.
 
M

Mike Schilling

Patricia said:
Mike said:
Lew said:
Roedy Green wrote:
Okay, so maybe we can have a hybrid approach that accumulates
large
strings in the list, but when fed short strings or individual
characters, puts them into a buffer
I was thinking along those lines last night. I occurred then
that
my
pieces ArrayList would contain both Strings and char[]s.

I would need an ArrayList<Object> to contain both. I don't think
there as a way te specify an ArrayList of either String or
char[].
Object is the only class they have in common.

Since char[] is not a fully fledged class, there is no hope of it
implementing some common interface with String.
Technical nit: arrays are "fully fledged" classes.

You know what Roedy means, though. It isn't possible for char[] to
implement an interface that Object[] does not. Though creating a
class that contains a (possibly mutable) char[] and implements
CharSquence is trival, and creating a new instance of it is almost
free. Having done so, you can use ArrayList<CharSequence>.

It seems to me that what you need is a class that wraps a char[],
handles appends, and implements CharSequence. StringBuilder does all
those things.

If the char[] is large, you might want a class that wraps it without
copying it.
 
L

Lew

Mike said:
If the char[] is large, you might want a class that wraps it without
copying it.

You really don't, unless you're willing for the string you're building to
change after you've added the array.
 
M

Mike Schilling

Lew said:
Mike said:
If the char[] is large, you might want a class that wraps it
without
copying it.

You really don't, unless you're willing for the string you're
building to change after you've added the array.

There are times I'll take that chance, e.g when I'm parsing a file and
know that no other code has access to the char arrays being created.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,995
Messages
2,570,236
Members
46,822
Latest member
israfaceZa

Latest Threads

Top