Extract JavaScript strings using regex

TonyV

Hey all, I've been trying to hammer away at this, and I just can't
figure it out. I'm hoping a regular expressions guru can help me out.

I'm trying to parse a retrieved javascript file to extract the
parameters out of a function call. Here's a contrived line that
represents what will be fetched:

foo('parameter 1', 'param with \'single\' quotes', 'param with\"double
\" quotes', 'this param, it has a comma', 'five');

The goal is to get an array with these elements:
parameter 1
param with 'single' quotes
param with "double" quotes
this param, it has a comma
five

There will always be five parameters, and the function name will
always be foo. Normally, I'm handy with regexes, but damn, those
escaped quotes and commas are killing me, and the data does have lots
of them in there.

I'm not lazy; I've been plugging away at this, trying look-behind
assertions, greedy matching, and so on, but I'm at an impasse and
can't extract what I want out of it. I've googled various regex
cookbooks (I even have access to O'Reilly's Safari), but I've come
up with bupkis.

Any ideas? I'd surely appreciate any help!
--TonyV
 
Uri Guttman

T> foo('parameter 1', 'param with \'single\' quotes', 'param with\"double
T> \" quotes', 'this param, it has a comma', 'five');

T> The goal is to get an array with these elements:
T> parameter 1
T> param with 'single' quotes
T> param with "double" quotes
T> this param, it has a comma
T> five

T> Any ideas? I'd surely appreciate any help!

Text::Balanced should be able to do that easily. it can parse matched
parens, quotes and other top-level tokenizing syntax.
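
a rough sketch of how that might look, assuming the call sits in a
single string and every argument is single quoted. the variable names
($js, $args, @params) and the comma-skipping prefix pattern are only
illustrative choices, not anything from the original post.
extract_delimited treats a backslash as the escape character by default.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_delimited);

# Sample call from the original post, joined onto one line.
my $js = q{foo('parameter 1', 'param with \'single\' quotes', 'param with \"double\" quotes', 'this param, it has a comma', 'five');};

# Grab everything between the outermost parentheses of the foo(...) call.
my ($args) = $js =~ /foo\s*\((.*)\)/s;
die "no foo() call found\n" unless defined $args;

my @params;
while (length $args) {
    # Prefix pattern (an assumption): skip whitespace and at most one comma.
    my ($quoted, $rest) = extract_delimited($args, q{'}, '\s*,?\s*');
    last unless defined $quoted && length $quoted;  # no further quoted string

    my $value = substr($quoted, 1, -1);  # drop the surrounding quotes
    $value =~ s/\\(.)/$1/gs;             # turn \' and \" into ' and "
    push @params, $value;
    $args = $rest;                       # continue with the remainder
}

print "$_\n" for @params;
```

in list context extract_delimited leaves its input untouched and hands
back the remainder, which is why the loop reassigns $args each time.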

uri
 
Ben Bullock

TonyV said:
I'm trying to parse a retrieved javascript file to extract the
parameters out of a function call. Here's a contrived line that
represents what will be fetched:

foo('parameter 1', 'param with \'single\' quotes', 'param with\"double
\" quotes', 'this param, it has a comma', 'five');

The goal is to get an array with these elements:
parameter 1
param with 'single' quotes
param with "double" quotes
this param, it has a comma
five

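#! perl
use warnings;
use strict;
# One single-quoted string; \' is allowed inside it.
my $parameter = qr/'(?:[^']|\\')+'/;
my $test = q/foo('parameter 1', 'param with \'single\' quotes', 'param with\"double\" quotes', 'this param, it has a comma', 'five')/;
# /x lets the pattern span two lines without the line break becoming part of it.
if ($test =~ /foo\s*\(\s*($parameter)\s*,\s*($parameter)\s*,
              \s*($parameter)\s*,\s*($parameter)\s*,\s*($parameter)\s*\)/sx) {
    print "Matched.\n";
    # Each capture still includes its surrounding quotes and backslashes.
    print "$1\n$2\n$3\n$4\n$5\n";
}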

You could also use

/foo\s*\(\s*(?:$parameter\s*,\s*){4}($parameter)\s*\)/

if you don't need the parameter values right away (e.g. match for them
using another regex later on). That would make the code tidier.
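
Here is a rough sketch of that two-step idea: match the whole call
first, then pull the individual strings out of the argument list with
a second regex. The names $string, $call and @params are just for
illustration, and the string pattern is a slightly stricter variant of
the one above (it treats any backslash as escaping the next character),
so take it as one possible way to do it rather than the only one.

```perl
use strict;
use warnings;

# A single-quoted string in which a backslash escapes the next character.
# (Hypothetical variant of $parameter from the earlier snippet.)
my $string = qr/'(?:[^'\\]|\\.)*'/s;

my $call = q{foo('parameter 1', 'param with \'single\' quotes', 'param with \"double\" quotes', 'this param, it has a comma', 'five');};

my @params;
# Step 1: check the overall shape and capture the whole argument list.
if ($call =~ /foo\s*\(\s*((?:$string\s*,\s*){4}$string)\s*\)/) {
    my $args = $1;
    # Step 2: pull out each quoted string from the argument list.
    @params = $args =~ /($string)/g;
    for (@params) {
        s/\A'|'\z//g;    # drop the surrounding quotes
        s/\\(.)/$1/gs;   # turn \' and \" into ' and "
    }
}

print "$_\n" for @params;
```

Commas inside a string are not a problem here, because the second regex
only ever starts a match at an opening quote and consumes the whole string.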

TonyV said:
There will always be five parameters, and the function name will
always be foo.

Are the parameters necessarily single quoted?
 
Jürgen Exner

TonyV said:
I'm trying to parse a retrieved javascript file to extract the
parameters out of a function call. Here's a contrived line that
represents what will be fetched:

foo('parameter 1', 'param with \'single\' quotes', 'param with\"double
\" quotes', 'this param, it has a comma', 'five');

The goal is to get an array with these elements:
parameter 1
param with 'single' quotes
param with "double" quotes
this param, it has a comma
five

I think Text::CSV::parse() should do the job just fine.

jue
 
