Thomas 'PointedEars' Lahn
kangax said:
Thomas said:
I don't see how this can be accomplished with using .source.split[1] `/(a|b|c)/` and `/(a|b\|c)/` produce `/(a|c)/` instead of a proper `a`
(/.../) since we don't have negative lookbehind (?<!) in ECMAScript
implementations with which you could exclude `\|' as a delimiter; so
it probably needs to be solved with RegExp-based string parsing.
Not sure about RegExp-based parsing, since escape sequences could be of
arbitrary length (I don't think it's possible to detect whether a
character is preceded by an odd number (`2n+1`) of `\` and so is escaped).
A character `x' can only be preceded by one `\' -- `\x' -- because `\\\x'
means that there is a literal `\' before the `\x' in the expression. So it
suffices to exclude cases where special characters like `|' are preceded by
one backslash.
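Both views can be tried on concrete input by counting the backslashes directly before a character; an odd count means the character is escaped, an even count means the backslashes only escape each other. A minimal sketch, where `isEscapedAt` is a hypothetical helper written for this illustration and not taken from either post:

```javascript
// Hypothetical helper: is the character at index `pos` of `s` escaped,
// i.e. preceded by an odd number of consecutive backslashes?
function isEscapedAt(s, pos) {
  var count = 0;
  for (var i = pos - 1; i >= 0 && s.charAt(i) === "\\"; i--) {
    count++;
  }
  return count % 2 === 1;
}

// Note that "\\" in a string literal is a single backslash character.
isEscapedAt("a\\|b", 2);     // a\|b   : one backslash before `|`, escaped
isEscapedAt("a\\\\|b", 3);   // a\\|b  : two backslashes, `|` is not escaped
isEscapedAt("a\\\\\\|b", 4); // a\\\|b : three backslashes, escaped
```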
A simple parser, on the other hand, seems to solve the problem nicely
(although I'm sure it can't compare in speed with a `split`-based approach)
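function split(string, separator) {
  var arr = string.split(''),
      result = [],
      IS_ESC = false, // true if the previous character was an unescaped backslash
      ESC_CHAR = '\\',
      chr,            // `char` is a reserved word, so use another name
      lastIdx = 0;
  for (var i = 0, len = arr.length; i < len; i++) {
    chr = arr[i];
    if (IS_ESC) {
      // an escaped character is never a separator; the escape ends here
      IS_ESC = false;
    }
    else if (chr === ESC_CHAR) {
      IS_ESC = true;
    }
    else if (chr === separator) {
      result.push(string.substring(lastIdx, i));
      lastIdx = i + 1;
    }
  }
  // the remainder after the last unescaped separator
  result.push(string.substring(lastIdx));
  return result;
}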
With RegExp-based parsing, it would be
function split(s, separator)
{
  var
    rx = new RegExp("[^\\\\]\\" + separator, "g"),
    m,
    a = [],
    i = 0;

  while ((m = rx.exec(s)))
  {
    a.push(s.substring(i, rx.lastIndex - 1));
    i = rx.lastIndex;
  }

  a.push(s.substring(i, s.length));

  return a;
}
That's just a quick hack, though. It doesn't work unchanged with arbitrary
separators.
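One possible way to lift that restriction is to escape the separator before building the expression, and to let the expression consume `\x' pairs as units so that only unescaped separators are left to split on. The following is only a sketch under assumptions: the separator is a non-empty string that does not contain a backslash, and `escapeForRegExp` and `splitUnescaped` are names made up for this example.

```javascript
// Hypothetical helper: escape characters that are special in a RegExp.
function escapeForRegExp(s) {
  return s.replace(/[\\^$.*+?()[\]{}|]/g, "\\$&");
}

// Split `s` on `separator`, ignoring separators escaped with a backslash.
// Assumes `separator` is non-empty and contains no backslash.
function splitUnescaped(s, separator) {
  // Matches either an escaped pair (`\` plus any character) or the separator.
  // The pair comes first, so `\|` and `\\` are consumed as units.
  var rx = new RegExp("\\\\[\\s\\S]|" + escapeForRegExp(separator), "g"),
      parts = [],
      start = 0,
      m;

  while ((m = rx.exec(s))) {
    if (m[0] === separator) {
      parts.push(s.substring(start, m.index));
      start = rx.lastIndex;
    }
    // otherwise m[0] is an escaped pair; it stays part of the current piece
  }

  parts.push(s.substring(start));
  return parts;
}

splitUnescaped("a|b\\|c|d", "|"); // pieces: a, b\|c, d
splitUnescaped("a.b\\.c", ".");   // pieces: a, b\.c
```

Because escaped pairs are matched before the separator, a doubled backslash in front of a separator (`\\|') is read as an escaped backslash followed by a real separator, without needing lookbehind.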
PointedEars