The issue is that in computer languages, _ is not a word boundary
character, so the theory for SEO goes that "search-term" will be parsed
properly and "search_term" will not.
Outside of PCRE, and possibly other forms of regex, is a "word boundary"
ever defined?
Technically, in regex the word boundary is not a character; it is the
position between a non-word character and a word character. From the PCRE
man page:
A word boundary is a position in the subject string where the
current character and the previous character do not both match
\w or \W (i.e. one matches \w and the other matches \W), or
the start or end of the string if the first or last character
matches \w, respectively.
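
To make that definition concrete, here is a minimal sketch using Python's
re module rather than PCRE itself (an assumption on my part; both treat
the underscore as a \w character by default). The helper name
whole_word_match is made up for illustration.

```python
import re

def whole_word_match(term, text):
    # \b matches only at a position where one neighbour is \w and the
    # other is \W, or at a string edge next to a \w character.
    return re.search(r"\b" + re.escape(term) + r"\b", text) is not None

# '-' is \W, so there is a boundary between '-' and 't'.
hyphenated = whole_word_match("term", "search-term")

# '_' is \w, so there is no boundary between '_' and 't'.
underscored = whole_word_match("term", "search_term")

print(hyphenated, underscored)
```

With the default definition of \w, "term" is found as a whole word in
"search-term" but not in "search_term".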
Historically, the issue may be rooted in the fact that underscore has
been allowed in variable names for a long time, and when people developed
regex, one of their parsing requirements was computer code, so they
developed regex in which the underscore character was part of the "word"
characters rather than the "not word" characters.
I don't know whether "underscore as word character" is a compile-time
switch in PCRE, but even if it is not, I imagine that changing underscore
from a word character to a non-word character in whatever regex
implementation a particular search engine uses is as simple as editing
the relevant header file and recompiling the regex library object / DLL
file.
However, it is probably an even simpler exercise to change the definition
of a "word" in your regex from "\w+" (or "[[:word:]]+") to "[[:alnum:]]+"
if you want to include letters and digits but exclude underscores, or to
"[[:alpha:]]+" if you just want words consisting only of upper and lower
case letters.
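
As a rough illustration of the difference, here is a sketch in Python.
Python's re module does not support POSIX classes such as [[:alnum:]], so
the ASCII class [A-Za-z0-9] stands in for it here; that substitution, the
sample text, and the variable names are my assumptions, not anything a
search engine is known to use.

```python
import re

text = "search_term and search-term"

# Default "word" definition: underscore counts as a word character,
# so "search_term" stays a single token.
with_underscore = re.findall(r"\w+", text)

# Letters and digits only (ASCII stand-in for [[:alnum:]]+):
# the underscore now splits tokens just like the hyphen does.
without_underscore = re.findall(r"[A-Za-z0-9]+", text)

print(with_underscore)
print(without_underscore)
```

The first pattern keeps "search_term" together while splitting
"search-term" in two; the second splits both the same way.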