String 15 chars long; 56KB size

Jaap de Bergen

Hello,

I'm writing a parser that uses Pattern and Matcher from
java.util.regex. After about 300 parses I get an OutOfMemoryError.

After running an optimizer I discovered that the string returned by a
regular expression match occupies 56KB. The strange thing is that the
length of the string is only 15 or 16 chars.

The code I'm using:
=======
m = Pattern.compile(REGEX, Pattern.MULTILINE | Pattern.CASE_INSENSITIVE)
        .matcher(INPUT);
while (m.find()) {
    temp = m.group(1);
    wm.add(temp);
}
=======

The temp variable is 56KB big but only 15 chars long.

I'm now using:
======
temp = new String(m.group(1));
======

This seems to work, no more OutOfMemoryErrors :) And temp is only a
couple of bytes big. Could someone explain why the temp variable is now
so much smaller? Does it have something to do with references to
objects that Pattern/Matcher use internally?

Thanks!


Jaap
 
Christophe Vanfleteren

The group() call probably returns a substring of the INPUT String. The
String returned by substring() shares the character array of the
original String and only uses a different offset and length, which is why
someVeryLongString.substring(0, 1) keeps as much memory alive as
someVeryLongString.

When you use new String(someSubstring), a new char array is created, sized to
the length of someSubstring, so it is much smaller.
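
As a minimal sketch of that workaround, the loop from the question could copy each group before storing it. The class name GroupCopyDemo, the method extractGroups and the sample regex are hypothetical, made up for illustration. The copy matters on JDK versions where substring() shares the backing array (before Java 7 update 6, as far as I know); on later versions substring() already copies, so the extra new String is redundant but harmless.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupCopyDemo {

    // Collects group(1) of every match, copying each result into its own String
    // so the stored value does not depend on the (possibly large) input.
    static List<String> extractGroups(String regex, String input) {
        List<String> results = new ArrayList<>();
        Matcher m = Pattern.compile(regex, Pattern.MULTILINE | Pattern.CASE_INSENSITIVE)
                .matcher(input);
        while (m.find()) {
            // new String(...) allocates a character array sized to the match only.
            results.add(new String(m.group(1)));
        }
        return results;
    }

    public static void main(String[] args) {
        // Build a large input with one small match at the end.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 10000; i++) {
            sb.append("filler ");
        }
        sb.append("name=Example");

        List<String> names = extractGroups("name=(\\w+)", sb.toString());
        System.out.println(names);
    }
}
```

Once the large input is no longer referenced anywhere else, it can be garbage collected even though the short strings are still held in the list.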
 
Jaap de Bergen

Thanks for the helpful reply!



Jaap
 
