mike blamires
I am having great difficulty using Unicode characters in a regular
expression. I am trying to match extended Unicode characters.
I want to split a large dump file (containing only JPEGs). I have used
a hex editor to manually extract one file, just to show it can be done, so I
know the input is intact.
Each JPEG starts with the Unicode characters \u00FF \u00D8 \u00FF \u00E1,
and there are plenty of these to be found within the file.
open(DUMPFILE, "/pathtodumpfile");
my $line;
while (<DUMPFILE>) {
    $line .= $_;
}
@files = split(/\x{00FF}\x{00D8}\x{00FF}\x{00E1}/, $line);
(As you may see from the above style, I am relatively inexperienced on the
Perl side of programming.)
I have tried inserting the Unicode characters in various ways (\xFF, \x{FF},
etc.), but it just doesn't seem to find the pattern. I am at a bit of a loss as
to whether it is my regexp that is wrong, my use of Unicode characters,
or my use of extended Unicode characters.
Many thanks for your help.
Cheers,
Mike
Apologies, incorrect newsgroup first time round. Please see above.
cheers
Mike