EOL created by .write or .encode

Xah Lee · Apr 5, 2005

Why is it that some of my files written out by
outF.write(outtext.encode('utf-8'))
have ASCII 10 as EOL, while others have ASCII 13 as EOL?
The EOLs in both of these files were originally all ASCII 10.

If I remove the EOL after the tt in the replacement string below, then
this doesn't happen.

findreplace = [
(ur'</body>',
ur'''tt
</body>'''),
]

....

inF = open(filePath,'rb')
s=unicode(inF.read(),'utf-8')
inF.close()

for couple in findreplace:
    outtext=s.replace(couple[0],couple[1])
    s=outtext
outF = open(tempName,'wb')
outF.write(outtext.encode('utf-8'))
outF.close()

thanks.

Xah
(e-mail address removed)
∑ http://xahlee.org/PageTwo_dir/more.html ☄
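
One way to check which line endings a file really has on disk, independent of any editor, is to count the raw bytes. The following is a minimal sketch in Python 3 (the code in the post above is Python 2). The names count_line_endings and replace_in_file are hypothetical helpers, not part of the original script, and the sketch assumes the input is valid UTF-8.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF pairs, lone CR (ASCII 13) and lone LF (ASCII 10) in raw bytes."""
    crlf = data.count(b"\r\n")
    return {
        "crlf": crlf,
        "cr": data.count(b"\r") - crlf,  # CRs that are not part of a CRLF pair
        "lf": data.count(b"\n") - crlf,  # LFs that are not part of a CRLF pair
    }


def replace_in_file(src_path, dst_path, pairs):
    """Apply each (old, new) pair to a UTF-8 file, reading and writing in binary mode."""
    with open(src_path, "rb") as in_f:
        text = in_f.read().decode("utf-8")
    for old, new in pairs:
        text = text.replace(old, new)
    # Binary mode writes the encoded bytes unchanged, so any "\n" in the
    # replacement string stays a single ASCII 10 byte.
    with open(dst_path, "wb") as out_f:
        out_f.write(text.encode("utf-8"))
```

Running count_line_endings on the bytes of the input and output files shows directly whether any ASCII 13 bytes were introduced by the replacement step.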

Fredrik Lundh · Apr 9, 2005

Xah Lee said:
> Why is that some of my files written out by
> outF.write(outtext.encode('utf-8'))
> has ascii 10 as EOL, while others has ascii 13 as EOL?
>
> outF = open(tempName,'wb')
> outF.write(outtext.encode('utf-8'))
> outF.close()

UTF-8 is not a binary format. Get rid of the "b" flags, and things
will work as expected.

</F>
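
How the mode passed to open() affects the bytes written for each line ending can be checked directly. The sketch below is Python 3, not the Python 2 of the thread, and write_and_read_back is a hypothetical helper. In Python 3, binary mode writes bytes unchanged, while text mode translates each "\n" according to the newline argument (or to the platform default when newline is left unset).

```python
import os
import tempfile


def write_and_read_back(text, **open_kwargs):
    """Write text in text mode with the given open() arguments; return the raw bytes on disk."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w", encoding="utf-8", **open_kwargs) as f:
            f.write(text)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


text = "tt\n</body>\n"

# newline="\n" (or newline=""): no translation, LF stays ASCII 10.
print(write_and_read_back(text, newline="\n"))   # b'tt\n</body>\n'

# newline="\r": every "\n" written becomes a single ASCII 13.
print(write_and_read_back(text, newline="\r"))   # b'tt\r</body>\r'
```

With newline left at its default, the result depends on the platform (os.linesep), which is why the sketch passes it explicitly.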

Xah Lee · Apr 9, 2005

I found the problem now (after about an hour of debugging). Python
didn't have a problem. Emacs does.

If you open a file in Emacs, it will open fine regardless of whether the
EOL is ASCII 10 or 13 (Unix or Mac). This is a nice feature. However,
what-cursor-position, which is used to show the cursor position and the
char's ASCII code, says the EOL is ASCII 10 when it is in fact ASCII
13. **** the irresponsible fuckhead who is responsible for this.

http://xahlee.org/UnixResource_dir/writ/responsible_license.html

Xah
(e-mail address removed)
∑ http://xahlee.org/

Aidan Kehoe · Apr 10, 2005

On the ninth of April, Xah Lee wrote:

> If you open a file in emacs, it will open fine regardless whether the
> EOL is ascii 10 or 13. (unix or mac) This is a nice feature. However,
> the what-cursor-position which is used to show cursor position and the
> char's ascii code, says the EOL is ascii 10 when it is in fact ascii
> 13.

This _is_ the right thing to do--there’s no reason naive programs written in
Emacs Lisp should have to worry about different on-disk representations of
line-endings. If you want to open a file which uses \015 as its line
endings, and have those \015 characters appear in the buffer, open it using
a coding system ending in -unix. C-u C-x C-f /path/to/file RET
iso-8859-1-unix RET in XEmacs, something I don’t know but I’m certain exists
in GNU Emacs.

Xah Lee · Apr 10, 2005

Can any GNU person or Emacs coder answer this?

Specifically: why does what-cursor-position give an incorrect answer?

Xah
(e-mail address removed)
∑ http://xahlee.org/PageTwo_dir/more.html ☄

Alan Mackenzie · Apr 10, 2005

In comp.emacs.xemacs Xah Lee said:
> I found the problem now. (after some one hour debug time) Python
> didn't have problem. Emacs does.
>
> If you open a file in emacs, it will open fine regardless whether the
> EOL is ascii 10 or 13. (unix or mac) This is a nice feature. However,
> the what-cursor-position which is used to show cursor position and the
> char's ascii code, says the EOL is ascii 10 when it is in fact ascii
> 13.

The problem is that there are many ways (at least 3) of indicating where
one line of text ends and the next one begins. Emacs deals with this
problem by converting the file loaded from disk to an internal format,
and converting back again when the time comes to save it again. The
alternatives would have been worse: noting the line-end convention of
each file, and complicating many routines (and we're talking about more
than "at least 3") to take account of that.

The internal representation of an EOL is 0x0a. Now that you know this,
it shouldn't bother you again. Alternatively, you could write a patch
for `what-cursor-position' to fix the problem (if such it be) and submit
it to the mailing list ([email protected], or something like that).
However, it might introduce more problems than it would solve. I suspect
the developers would reject it.
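
The convert-on-load, convert-back-on-save idea can be sketched in a few lines. This is Python for illustration only, not how Emacs actually implements it. The function names detect_eol, load and save are hypothetical, and the sketch assumes the file uses a single line-ending convention throughout.

```python
def detect_eol(data: bytes) -> bytes:
    """Guess the on-disk convention from the first line break found (default: LF)."""
    for i, byte in enumerate(data):
        if byte == 0x0D:  # CR: either a lone CR or the start of CRLF
            return b"\r\n" if data[i + 1:i + 2] == b"\n" else b"\r"
        if byte == 0x0A:  # LF
            return b"\n"
    return b"\n"


def load(data: bytes):
    """Return (internal_text, eol): internal text always uses 0x0a for EOL."""
    eol = detect_eol(data)
    internal = data if eol == b"\n" else data.replace(eol, b"\n")
    return internal, eol


def save(internal: bytes, eol: bytes) -> bytes:
    """Convert the internal 0x0a line breaks back to the remembered convention."""
    return internal if eol == b"\n" else internal.replace(b"\n", eol)


on_disk = b"tt\r</body>\r"        # old Mac convention: EOL is ASCII 13
buffer_text, eol = load(on_disk)
print(buffer_text[2])             # 10: what code inspecting the buffer sees
print(save(buffer_text, eol) == on_disk)  # True: the file round-trips unchanged
```

Any routine that looks at buffer_text sees only 0x0a, which matches the behaviour reported for what-cursor-position. Files with mixed line endings would not round-trip in this sketch.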

> **** the irresponsible fuckhead who is responsible for this.

You having a bad day, or something? ;-) The fuckhead was probably RMS
(Richard Stallman, he of the Free Software Foundation), and he's been
fucked so many times that once more wouldn't achieve anything at all.
;-)
