Quotes around words

Pat

Hi,

I have a big input file full of words, whitespace, newlines, punctuation,
and various other symbols. I want to surround every word with quotes,
UNLESS it already has quotes around it.

After some trial and error, I was seeing some unexpected results. The
closest I came to getting it right was this:

my $str = ' "these" "have" "quotes" these do not. ';
$str =~ s/([^"a-zA-Z0-9_])([a-zA-Z0-9_]+)([^"a-zA-Z0-9_])/$1"$2"$3/gs;

And the result is this:
"these" "have" "quotes" "these" do "not".

The only problem is that "do" is skipped. Is this expected? So how do I
get around this?

Thanks.
 
Gunnar Hjalmarsson

Pat said:
The only problem is that "do" is skipped. Is this expected?

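Yes. The problem is that you include the non-word characters before and
after each word in the match. The character that follows one word is
consumed by that match, so it cannot also serve as the character that
precedes the next word.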
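To see this happen, here is a small sketch (my own illustration, not part of Pat's script) that runs the original pattern as a plain match in a loop and records what each match captured. The variable names @words and @ends are only for the demonstration.

```perl
use strict;
use warnings;

my $str = ' "these" "have" "quotes" these do not. ';

# Pat's original pattern, unchanged: one leading non-word/non-quote
# character, the word, and one trailing non-word/non-quote character.
my $re = qr/([^"a-zA-Z0-9_])([a-zA-Z0-9_]+)([^"a-zA-Z0-9_])/;

my (@words, @ends);
while ($str =~ /$re/g) {
    push @words, $2;          # the word that was matched
    push @ends,  pos($str);   # offset where the next search starts
    print "matched '$2' (before: '$1', after: '$3'), next search starts at ", pos($str), "\n";
}
```

The space after "these" ends up in $3 of the first match, so the next search starts directly at the "d" of "do". A word character cannot match the leading [^"a-zA-Z0-9_], so "do" is never a candidate and the engine moves on to "not".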
So how do I get around this?

Please read the section "Extended Patterns" in "perldoc perlre". Example:

$str =~ s/(?<!")\b(\w+)\b(?!")/"$1"/g;
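The same substitution can be spelled out with the /x modifier so each piece is commented. This is only a sketch of the one-liner above; the variable $quoted is introduced here for illustration. The lookbehind and lookahead test the neighbouring characters without consuming them, which is why adjacent words no longer interfere with each other.

```perl
use strict;
use warnings;

my $str = ' "these" "have" "quotes" these do not. ';

# Copy $str, then run the substitution on the copy.
(my $quoted = $str) =~ s{
    (?<!")    # not preceded by a double quote (nothing is consumed)
    \b        # start of a word
    (\w+)     # the word itself, captured in $1
    \b        # end of the word
    (?!")     # not followed by a double quote (nothing is consumed)
}{"$1"}gx;

print "$quoted\n";
```

Note that \w is not always identical to [a-zA-Z0-9_]: on Unicode strings it can match more characters. For plain ASCII input like the sample, the two behave the same.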
 
John W. Krahn

Pat said:
The only problem is that "do" is skipped. Is this expected?
Yes.

So how do I get around this?

$ perl -le'
my $str = q[ "these" "have" "quotes" these do not. ];
print $str;
$str =~ s/(?<!")\b(\w+)\b(?!")/"$1"/g;
print $str;
'
"these" "have" "quotes" these do not.
"these" "have" "quotes" "these" "do" "not".



John
 
