mikko.n
I have recently been experimenting with the GNU C library regular
expression functions and noticed a problem with pattern matching. It
seems to recognize only the first match and ignore the rest.
An example:
mikko.c:
-----
#include <stdio.h>
#include <regex.h>
#include <sys/types.h>
int main(int argc, char *argv[]) {
    regex_t p;
    regmatch_t pm[2];   /* room for two regmatch_t entries */
    regcomp(&p, "k", 0);
    regexec(&p, "mikko", 2, pm, 0);
    /* regoff_t is cast to int so that it matches the %d conversion */
    printf("start=%d end=%d\n", (int)pm[0].rm_so, (int)pm[0].rm_eo);
    printf("start=%d end=%d\n", (int)pm[1].rm_so, (int)pm[1].rm_eo);
    regfree(&p);
    return 0;
}
-----
This is intended to match the regular expression 'k' against the string
'mikko' and return the start and end of the first two matches in the
array pm of regmatch_t structures. The output, however, is:
$ ./mikko
start=2 end=3
start=-1 end=-1
instead of the expected
start=2 end=3
start=3 end=4
Is this a bug in the GNU library, or have I overlooked something? I have
not found any examples on the Internet of multiple subexpression
matching with <regex.h> either.
With more complicated regular expressions it usually seems to return
only the first match, as here. With wildcards it returns the longest
match, but still only one of them.
Thanks,
Mikko Nummelin