Ben Lee
Hi,
So I read the post from a while back about writing multi-byte UTF-8
characters as octal escapes:
r = Regexp.compile("ab\304\243cd", 0, "UTF-8")
or
r = Regexp.compile("ab#{[0x123].pack('U')}cd", 0, "UTF-8")
So this seems to be a way to list out individual multi-byte UTF-8
characters.
I was wondering whether there's also a convenient way to specify a range of
UTF-8 characters?
For instance, the darn characters in these ranges:
0x2002-0x2003
0x2013-0x2014
0x2018-0x201E
Thanks,
Ben
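
One possible way to do this is sketched below. It is not a confirmed answer from the thread. It extends the pack('U') trick from the question to build a single character class covering the three ranges. It assumes Ruby 1.9 or later, where the string returned by pack('U') is tagged as UTF-8, so no encoding argument is passed to Regexp.new. The names RANGES and utf8_range_class are made up for illustration.

```ruby
# Code point ranges from the question, as [first, last] pairs.
RANGES = [[0x2002, 0x2003], [0x2013, 0x2014], [0x2018, 0x201E]]

# Hypothetical helper: packs each endpoint into its UTF-8 character and
# joins the pairs as "X-Y" inside one bracketed character class.
def utf8_range_class(ranges)
  body = ranges.map { |lo, hi| "#{[lo].pack('U')}-#{[hi].pack('U')}" }.join
  Regexp.new("[#{body}]")
end

r = utf8_range_class(RANGES)

# Sample input: "ab", then U+2014 (inside the second range), then "cd".
sample = "ab#{[0x2014].pack('U')}cd"
puts "matched" if r =~ sample
```

On Ruby 1.9 and later the same class can also be written directly with \u escapes, as in /[\u2002-\u2003\u2013-\u2014\u2018-\u201E]/. On older interpreters the encoding would presumably have to be passed explicitly, as in the Regexp.compile calls above.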