Relative seeks on string IO

Pierre Quentel

Hi,

I am wondering why relative seeks fail on string IO in Python 3.2

Example :

from io import StringIO
txt = StringIO('Favourite Worst Nightmare')
txt.seek(8) # no problem with absolute seek

but

txt.seek(2,1) # 2 characters from current position

raises "IOError: Can't do nonzero cur-relative seeks" (tested with
Python3.2.2 on WindowsXP)

A seek relative to the end of the string IO raises the same IOError

However, it is not difficult to simulate a class that performs
relative seeks on strings :

====================
class FakeIO:

    def __init__(self,value):
        self.value = value
        self.pos = 0

    def read(self,nb=None):
        if nb is None:
            return self.value[self.pos:]
        else:
            return self.value[self.pos:self.pos+nb]

    def seek(self,offset,whence=0):
        if whence==0:
            self.pos = offset
        elif whence==1: # relative to current position
            self.pos += offset
        elif whence==2: # relative to end of string
            self.pos = len(self.value)+offset

txt = FakeIO('Favourite Worst Nightmare')
txt.seek(8)
txt.seek(2,1)
txt.seek(-8,2)
=====================
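
For comparison, the same idea can be sketched on top of io.StringIO itself instead of a separate class. This is only an illustration, not how the standard library behaves: the name RelativeStringIO is hypothetical, and the sketch assumes that tell() and len(getvalue()) count in the same units that an absolute seek() uses.

```python
import io


class RelativeStringIO(io.StringIO):
    """Hypothetical StringIO subclass that accepts relative seeks."""

    def seek(self, offset, whence=io.SEEK_SET):
        # Translate every request into an absolute seek, which StringIO supports.
        if whence == io.SEEK_CUR:
            offset += self.tell()
        elif whence == io.SEEK_END:
            offset += len(self.getvalue())
        elif whence != io.SEEK_SET:
            raise ValueError("invalid whence: %r" % (whence,))
        if offset < 0:
            raise ValueError("negative seek position: %r" % (offset,))
        return super().seek(offset, io.SEEK_SET)


txt = RelativeStringIO('Favourite Worst Nightmare')
txt.seek(8)
txt.seek(2, 1)        # 2 characters from current position
word = txt.read(5)    # 'Worst'
txt.seek(-9, 2)       # 9 characters before the end
tail = txt.read()     # 'Nightmare'
```

Unlike FakeIO, this version rejects a negative target position instead of silently storing it.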

Is there any reason why relative seeks on string IO are not allowed in
Python3.2, or is it a bug that could be fixed in a next version ?

- Pierre
 
Terry Reedy

> I am wondering why relative seeks fail on string IO in Python 3.2

Good question.

> from io import StringIO
> txt = StringIO('Favourite Worst Nightmare')
> txt.seek(8) # no problem with absolute seek

Please post code without non-code indents, like so:

from io import StringIO
txt = StringIO('Favourite Worst Nightmare')
txt.seek(8,0) # no problem with absolute seek
txt.seek(0,1) # 0 characters from current position ok, and useless
txt.seek(-2,2) # end-relative gives error message for cur-relative

so someone can copy and paste without deleting indents.
I verified with 3.2.2 on Win7. I am curious what 2.7 and 3.1 do.

What system are you using? Does it have a narrow or wide unicode build?
(IE, what is the value of sys.maxunicode?)

> txt.seek(2,1) # 2 characters from current position
>
> raises "IOError: Can't do nonzero cur-relative seeks" (tested with
> Python3.2.2 on WindowsXP)
>
> A seek relative to the end of the string IO raises the same IOError

> Is there any reason why relative seeks on string IO are not allowed in
> Python3.2, or is it a bug that could be fixed in a next version ?

Since StringIO seeks by fixed-size code units (depending on the build),
making seeking from the current position and end trivial, I consider
this a behavior bug. At minimum, it is a doc bug. I opened
http://bugs.python.org/issue12922

As noted there, I suspect the limitation is inherited from TextIOBase.
But I challenge that it should be.

I was somewhat surprised that seeking (from the start) is not limited to
the existing text. Seeking past the end fills in with nulls. (They are
typically a nuisance though.)

from io import StringIO
txt = StringIO('0123456789')
txt.seek(15,0) # no problem with absolute seek
txt.write('xxx')
s = txt.getvalue()
print(ord(s[12]))
# 0
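
A small sketch of one way to avoid the null padding: a zero-offset end-relative seek is accepted, so it can be used to position at the real end before writing. The helper name append_text is hypothetical and exists only for this illustration.

```python
from io import StringIO


def append_text(buf, text):
    """Hypothetical helper: write text at the end of buf, never past it."""
    buf.seek(0, 2)      # zero-offset end-relative seek is permitted
    return buf.write(text)


padded = StringIO('0123456789')
padded.seek(15, 0)      # absolute seek past the end of the text
padded.write('xxx')     # positions 10..14 are filled with '\0'

clean = StringIO('0123456789')
append_text(clean, 'xxx')

print(repr(padded.getvalue()))
print(repr(clean.getvalue()))
```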
 
Pierre Quentel

> Please post code without non-code indents, like so:

Sorry about that. After the line "Example :" I indented the next
block, out of habit ;-)

> What system are you using? Does it have a narrow or wide unicode build?
> (IE, what is the value of sys.maxunicode?)

I use Windows XP Pro, version 2002, SP3. sys.maxunicode is 65535

I have the same behaviour with 3.1.1 and with 2.7

I don't understand why variable sized code units would cause problems.
On text file objects, read(nb) reads nb characters, regardless of the
number of bytes used to encode them, and tell() returns a position in
the text stream just after the next (unicode) character read
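
The point about read(nb) and tell() on text files can be checked with a short sketch. The function name demo and the sample text are made up for the illustration, and the file is a temporary one. The value returned by tell() is treated as opaque here and is only handed back to seek().

```python
import os
import tempfile


def demo():
    """Hypothetical check: read(n) counts characters, tell() round-trips."""
    text = "abc\u00e9\u00e8def"        # 8 characters, 10 bytes in UTF-8
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open(path, "w", encoding="utf-8") as out:
            out.write(text)
        size_bytes = os.path.getsize(path)
        with open(path, encoding="utf-8") as f:
            first = f.read(4)          # 4 characters, although they span 5 bytes
            cookie = f.tell()          # opaque position after those characters
            rest = f.read()
            f.seek(cookie)             # seeking to a tell() value is supported
            again = f.read()
    finally:
        os.remove(path)
    return first, rest, again, size_bytes


first, rest, again, size_bytes = demo()
print(repr(first), repr(rest), rest == again, size_bytes)
```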

As for StringIO, a wrapper around file objects simulates a correct
behaviour for relative seeks :

====================
txt = "abcdef"
txt += "تخصيص هذه الطبعة"
txt += "머니투데이"
txt += "endof file"

out = open("test.txt","w",encoding="utf-8")
out.write(txt)
out.close()

fobj = open("test.txt",encoding="utf-8")
fobj.seek(3)
try:
    fobj.seek(2,1)
except IOError:
    print('raises IOError')

class _file:

    def __init__(self,file_obj):
        self.file_obj = file_obj

    def read(self,nb=None):
        if nb is None:
            return self.file_obj.read()
        else:
            return self.file_obj.read(nb)

    def seek(self,offset,whence=0):
        if whence==0:
            self.file_obj.seek(offset)
        else:
            if whence==2:
                # read till EOF
                while True:
                    buf = self.file_obj.read()
                    if not buf:
                        break
            self.file_obj.seek(self.file_obj.tell()+offset)

fobj = _file(open("test.txt",encoding="utf-8"))
fobj.seek(3)
fobj.seek(2,1)
fobj.seek(-5,2)
print(fobj.read(3))
==========================
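
One caveat with the wrapper above: the io documentation describes text-file positions from tell() as opaque and says that absolute seeks to anything other than zero or a tell() value are undefined, so tell()+offset may not count in characters when the text contains multi-byte characters. A slower but character-accurate variant can be sketched by keeping a character counter and re-reading from the start. The class name CharSeekFile is hypothetical, and the sketch assumes the file is small enough to re-read on every seek.

```python
import io


class CharSeekFile:
    """Hypothetical wrapper whose positions are counted in characters."""

    def __init__(self, file_obj):
        self.file_obj = file_obj
        self.file_obj.seek(0)
        self.pos = 0                    # current position, in characters

    def read(self, nb=None):
        if nb is None:
            data = self.file_obj.read()
        else:
            data = self.file_obj.read(nb)
        self.pos += len(data)
        return data

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_CUR:
            offset += self.pos
        elif whence == io.SEEK_END:
            self.file_obj.seek(0)
            offset += len(self.file_obj.read())   # total length in characters
        elif whence != io.SEEK_SET:
            raise ValueError("invalid whence: %r" % (whence,))
        if offset < 0:
            raise ValueError("negative seek position: %r" % (offset,))
        # Rewind and skip `offset` characters; only seek(0) is used on the file.
        self.file_obj.seek(0)
        self.pos = len(self.file_obj.read(offset))
        return self.pos
```

It would be used the same way as _file, for example CharSeekFile(open("test.txt",encoding="utf-8")), followed by seek(3), seek(2,1) and seek(-5,2).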

- Pierre
 
