regex replace \\ by \


Stephan Ehlert

Hi,

I have a problem with java.regex. In a given String \\ has to be replaced by \

My code leads to the following exception:
java.lang.StringIndexOutOfBoundsException: String index out of range: 1
at java.lang.String.charAt(String.java:444)
at java.util.regex.Matcher.appendReplacement(Matcher.java:551)


This is my code:
....
Pattern p = Pattern.compile("\\\\");
Matcher m = p.matcher(s);
StringBuffer sb = new StringBuffer();
boolean result = m.find();
while(result) {
m.appendReplacement(sb, "\\");
result = m.find();
}
m.appendTail(sb);
System.out.println(sb.toString());
....

Any idea?
Thanks,

Stephan
 

jAnO!

Stephan Ehlert said:
Hi,

I have a problem with java.regex. In a given String \\ has to be replaced by \

My code leads to the following exception:
java.lang.StringIndexOutOfBoundsException: String index out of range: 1
at java.lang.String.charAt(String.java:444)
at java.util.regex.Matcher.appendReplacement(Matcher.java:551)


This is my code:
...
Pattern p = Pattern.compile("\\\\");
Matcher m = p.matcher(s);
StringBuffer sb = new StringBuffer();
boolean result = m.find();
while(result) {
m.appendReplacement(sb, "\\");
result = m.find();
}
m.appendTail(sb);
System.out.println(sb.toString());
...

Any idea?
Can be done simpler:

String bla =
Pattern.compile("\\\\").matcher(yourString).replaceAll("\\");

Or String bla = yourString.replaceAll("\\\\", "\\"); should work .
 

John C. Bollinger

Stephan said:
I have a problem with java.regex. In a given String \\ has to be replaced by \

My code leads to the following exception:
java.lang.StringIndexOutOfBoundsException: String index out of range: 1
at java.lang.String.charAt(String.java:444)
at java.util.regex.Matcher.appendReplacement(Matcher.java:551)


This is my code:
...
Pattern p = Pattern.compile("\\\\");
Matcher m = p.matcher(s);
StringBuffer sb = new StringBuffer();
boolean result = m.find();
while(result) {
m.appendReplacement(sb, "\\");
result = m.find();
}
m.appendTail(sb);
System.out.println(sb.toString());

The backslash character is tricky. It is a metacharacter in both Java
source and in a Java regex. A literal single backslash character in a
regex that appears in Java source as a String literal must therefore be
double escaped: "\\\\". That's a String literal containing two
backslashes (each escaped), which will be interpreted as one literal
backslash by the regex engine. Note, however, that you only need one
level of escaping for a String literal that will not be processed as a
regex (such as the replacement text). Finally, consider using
String.replaceAll() to accomplish this feat -- it is much cleaner than
what you are attempting (although it does something similar behind the
scenes). Put that all together for your solution.


John Bollinger
(e-mail address removed)
 

Alan Moore

Hi,

I have a problem with java.regex. In a given String \\ has to be replaced by \

My code leads to the following exception:
java.lang.StringIndexOutOfBoundsException: String index out of range: 1
at java.lang.String.charAt(String.java:444)
at java.util.regex.Matcher.appendReplacement(Matcher.java:551)


This is my code:
...
Pattern p = Pattern.compile("\\\\");
Matcher m = p.matcher(s);
StringBuffer sb = new StringBuffer();
boolean result = m.find();
while(result) {
m.appendReplacement(sb, "\\");
result = m.find();
}
m.appendTail(sb);
System.out.println(sb.toString());
...

The backslash character is special in both the regex and the
replacement string, so it has to be double escaped in both places.
That means, for every single backslash you want to match or insert,
you have to put *four* backslashes in the regex or replacement string.
Thus is born this monster:

str = str.replaceAll("\\\\\\\\", "\\\\");

As John said, this is equivalent to what you're doing with the
appendXXX methods, as is jAnO!'s approach. It's just that neither of
them used enough backslashes.
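
To make the two levels of escaping concrete, here is a minimal sketch. The class and method names (BackslashLevels, collapse, loneBackslashFails) are made up for illustration. It catches RuntimeException on purpose, on the assumption that the exact exception type for a dangling backslash in the replacement text may differ between Java versions.

```java
public class BackslashLevels {

    // Replace every pair of backslashes with a single backslash.
    static String collapse(String s) {
        // Regex literal: 8 backslashes in source -> 4 chars in the String
        //   -> the regex engine sees two escaped backslashes.
        // Replacement literal: 4 backslashes in source -> 2 chars in the String
        //   -> the replacement processor inserts one backslash.
        return s.replaceAll("\\\\\\\\", "\\\\");
    }

    // Returns true if a lone backslash used as replacement text is rejected.
    // The input must contain a backslash, otherwise nothing is replaced
    // and the replacement text is never processed.
    static boolean loneBackslashFails(String s) {
        try {
            s.replaceAll("\\\\", "\\");
            return false;
        } catch (RuntimeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String s = "a\\\\b"; // four chars: a, backslash, backslash, b
        System.out.println(s + " -> " + collapse(s));
        System.out.println("lone backslash rejected: " + loneBackslashFails(s));
    }
}
```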
 

Stephan Ehlert

Hello again,

thanks for your fast replies. But unfortunately none of your solutions
solves my problem. I have written a second example. The String
"someText\\nsomeText2"
has to be replaced by "someText\nsomeText2" , so that someText2 is
written to a new line when I write the whole String into a file.
Note that the replacement shouldn't work with \n only, but with other
expressions like \t, too.

Thanks, Stephan

The code:
class XYZ {

public static void main(final String[] args) {

String s = "someText\\nsomeText2";
System.out.println(s);

// s = s.replaceAll("\\\\\\\\", "\\\\"); // no match
// s = s.replaceAll("\\\\", ""); // deletes \\
// s = s.replaceAll("\\\\n", "\\\n"); // replaces \\n by \n
// s = s.replaceAll("\\\\", "\\"); // should replace \\ by \, but exc.
System.out.println(s);
}
}
 

John C. Bollinger

Alan said:
The backslash character is special in both the regex and the
replacement string, so it has to be double escaped in both places.

Oops, right you are. I overlooked the fact that the replacement string,
although not a regex, is nevertheless not treated as an opaque replacement.
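
If you want the replacement to be taken literally, there are two ways to sidestep the extra escaping. This is only a sketch, and it assumes Java 5 or later, where Matcher.quoteReplacement() and String.replace(CharSequence, CharSequence) are available. The class and method names are made up.

```java
import java.util.regex.Matcher;

public class LiteralReplace {

    // Regex match, but the replacement text is quoted so that backslashes
    // and dollar signs in it are inserted literally.
    static String viaQuoteReplacement(String s) {
        return s.replaceAll("\\\\\\\\", Matcher.quoteReplacement("\\"));
    }

    // No regex at all: both arguments are plain character sequences,
    // so only the Java source-level escaping remains.
    static String viaReplace(String s) {
        return s.replace("\\\\", "\\");
    }

    public static void main(String[] args) {
        String s = "someText\\\\nsomeText2"; // contains two backslashes
        System.out.println(viaQuoteReplacement(s));
        System.out.println(viaReplace(s));
    }
}
```

Both methods turn two adjacent backslashes into one. The second avoids the regex engine entirely, which is probably the simpler choice when no pattern matching is needed.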


John Bollinger
(e-mail address removed)
 

Morten Alver

thanks for your fast replies. But unfortunately none of your solutions
solves my problem. I have written a second example. The String
"someText\\nsomeText2"
has to be replaced by "someText\nsomeText2" , so that someText2 is
written to a new line when I write the whole String into a file.
Note that the replacement shouldn't work with \n only, but with other
expressions like \t, too.

Here you are:

s = s.replaceAll("\\\\n", "\n");

This worked when I tested it. The problem is that "\n" is a single
character, so the "n" must be treated together with the "\".
 

John C. Bollinger

Stephan said:
thanks for your fast replies. But unfortunately none of your solutions
solves my problem. I have written a second example.

We understand your problem. You are mistaken about not having a
solution. Your test is buggy.
The String
"someText\\nsomeText2"
has to be replaced by "someText\nsomeText2" , so that someText2 is
written to a new line when I write the whole String into a file.
Note that the replacement shouldn't work with \n only, but with other
expressions like \t, too.

Thanks, Stephan

The code:
class XYZ {

public static void main(final String[] args) {

String s = "someText\\nsomeText2";

The above string literal represents a String containing only *one*
backslash, represented in the source code as "\\". Did you actually
look at the output of the println() below?
System.out.println(s); [...]
// s = s.replaceAll("\\\\\\\\", "\\\\"); // no match

That doesn't match because your test string does not meet your criteria.
Here is a modified code that demonstrates the above solution working
perfectly, as far as I understand your requirements:


public class ReplTest {

    public static void main(final String[] args) {
        String s = "someText\\\\nsomeText2";

        System.out.println(s);
        s = s.replaceAll("\\\\\\\\", "\\\\");
        System.out.println(s);
    }
}

And here is a transcript of one run of that program:

D:\temp\testdir>java -cp . ReplTest
someText\\nsomeText2
someText\nsomeText2

As I wrote before, backslashes are tricky.
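
A quick way to check how many backslashes a literal really contains is to count them at run time. A minimal sketch (the class and method names are made up):

```java
public class CountBackslashes {

    // Count the backslash characters actually present in the String.
    static int count(String s) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == '\\') {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        // Two backslashes in the source, one in the String.
        System.out.println(count("someText\\nsomeText2"));   // prints 1
        // Four backslashes in the source, two in the String.
        System.out.println(count("someText\\\\nsomeText2")); // prints 2
    }
}
```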


John Bollinger
(e-mail address removed)
 

Alan Moore

We understand your problem. You are mistaken about not having a
solution. Your test is buggy.

*I* didn't understand the problem, because he misstated it. He
doesn't want to replace two backslashes with one, he wants to replace
whitespace escapes with the characters they represent. For that
problem, there's no particular advantage to using regular expressions.
The appendXXX methods from the Matcher class can be useful (as I
demonstrate below), but not compellingly so.

Pattern p = Pattern.compile("\\\\.");
Matcher m = p.matcher(str);
StringBuffer sb = new StringBuffer();
while (m.find())
{
    // append the text before the match, dropping the matched escape itself
    m.appendReplacement(sb, "");
    char c = m.group().charAt(1);
    switch (c)
    {
        case 'n':
            sb.append('\n');
            break;
        case 't':
            sb.append('\t');
            break;
        // handle other escapes

        default:
            sb.append(m.group()); // or maybe drop the backslash
    }
}
m.appendTail(sb);


So you /can/ use the regex package for this problem, but just
iterating through the string with charAt() takes the same amount of
effort, and is easier to read and maintain.
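
For comparison, the charAt() version might look roughly like this. It is a sketch, not a drop-in equivalent of the regex code: the names are made up, it uses StringBuilder (which assumes Java 5 or later), and it treats a doubled backslash as an escaped backslash, which is one possible choice for the "other escapes".

```java
public class Unescape {

    // Convert backslash escapes such as \n and \t into the characters
    // they represent. Unknown escapes and a trailing lone backslash
    // are copied through unchanged.
    static String unescape(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                char next = s.charAt(++i); // consume the escaped character
                switch (next) {
                    case 'n':
                        sb.append('\n');
                        break;
                    case 't':
                        sb.append('\t');
                        break;
                    case '\\':
                        sb.append('\\');
                        break;
                    default:
                        sb.append(c).append(next); // keep unknown escapes as-is
                }
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The literal holds a backslash followed by 'n', not a newline.
        System.out.println(unescape("someText\\nsomeText2"));
    }
}
```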
 

Thomas G. Marshall

John C. Bollinger coughed up:

....[rip]...
A literal single backslash character in a
regex that appears in Java source as a String literal must therefore
be double escaped: "\\\\".

This thread reminds me of some horrible csh scripts I had the pleasure of
debugging once upon a time.

They were filled with scripts calling other scripts, sometimes "passing"
bits of awk script around. And awk, of course, has special meanings for
many of the special csh characters.

For example, keeping track of whether a backslash meant itself or its
neighbor, and just when a "$var" was meant to be expanded (versus sent to
another awk or csh script to be expanded there) was horribly confusing.

....[rip]...
 
