removing BOM prepended by codecs?

J. Bagg

I've checked the original files using od and they don't have BOMs.

I'll remove them in the servlet. The overhead is probably small enough
unless somebody is doing a massive search. We have a limit anyway to
prevent somebody stealing the entire set of data.
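Stripping a leading mark could also be done on the Python side before the data reaches the servlet. The following is only a minimal sketch, assuming Python 3 and UTF-8 output; `strip_utf8_bom` is a hypothetical helper name, not part of the search code discussed here.

```python
import codecs

def strip_utf8_bom(data):
    """Return data without a leading UTF-8 BOM; other input is unchanged."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

# "utf-8-sig" prepends the three-byte signature when encoding; "utf-8" does not.
with_bom = "%R 10C0203z-621".encode("utf-8-sig")
without_bom = "%R 10C0203z-621".encode("utf-8")

print(with_bom[:3] == codecs.BOM_UTF8)             # signature is present
print(strip_utf8_bom(with_bom) == without_bom)     # signature removed
print(strip_utf8_bom(without_bom) == without_bom)  # clean input untouched
```

The check is a single prefix comparison per record, so the cost should stay small even for large result sets.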

I started writing the Python search because the ancient C search had
started putting out BOMs. I'm actually mystified: our home Linux box
does not add BOMs even though it runs Python 2.7, but my work one does,
even though it has the same version. The only difference is Fedora 18
vs. Fedora 17.

The BOMs are certainly there:

<86> <AD><FB>%R 10C0203z-621
%A François-Xavier Le_Bourdonnec

0000000 206 255 373 % R 1 0 C 0 2 0 3 z -

Piet van Oostrum

J. Bagg said:
The BOMs are certainly there:

<86> <AD><FB>%R 10C0203z-621
%A François-Xavier Le_Bourdonnec

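0000000 206 255 373 % R 1 0 C 0 2 0 3 z -

That is not a BOM or UTF-8 signature. It isn't even valid UTF-8: a
UTF-8 BOM would be the bytes EF BB BF (octal 357 273 277), whereas the
dump starts with 86 AD FB, and 0x86 is a continuation byte that cannot
begin a UTF-8 sequence.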
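
This can be checked mechanically. Below is a rough sketch, assuming Python 3; `detect_bom` and `is_valid_utf8` are hypothetical helper names, and the test bytes are simply the octal values from the od dump followed by the start of the record.

```python
import codecs

# Known byte order marks from the standard library. The UTF-32 marks come
# first because the UTF-32 LE mark begins with the UTF-16 LE mark.
KNOWN_BOMS = [
    ("utf-32-le", codecs.BOM_UTF32_LE),
    ("utf-32-be", codecs.BOM_UTF32_BE),
    ("utf-8-sig", codecs.BOM_UTF8),
    ("utf-16-le", codecs.BOM_UTF16_LE),
    ("utf-16-be", codecs.BOM_UTF16_BE),
]

def detect_bom(data):
    """Return the name of the BOM that data starts with, or None."""
    for name, bom in KNOWN_BOMS:
        if data.startswith(bom):
            return name
    return None

def is_valid_utf8(data):
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

# Octal 206 255 373 from the od output, then the visible record text.
prefix = bytes([0o206, 0o255, 0o373]) + b"%R 10C0203z-621"

print(detect_bom(prefix))     # None: no known BOM matches
print(is_valid_utf8(prefix))  # False: 0x86 cannot start a UTF-8 sequence
```

If the three bytes are neither a BOM nor text, they are more likely stray data written ahead of the record than something a codec added.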
 
