A
Akihiro KAYAMA
Hi all.
I would like to ask how I can implement string-like class using tuple
or list. Does anyone know about some example codes of pure python
implementation of string-like class?
Because I am trying to use Python for a text processing which is
composed of a large character set. As the character set is wider than
UTF-16(U+10FFFF), I can't use Python's native unicode string class.
So I want to prepare my own string class, which provides convenience
string methods such as split, join, find and others like usual string
class, but it uses a sequence of integer as a internal representation
instead of a native string. Obviously, subclassing of str doesn't
help.
The implementation of each string methods in the Python source
tree(stringobject.c) is far from python code, so I have started from
scratch, like below:
def startswith(self, prefix, start=-1, end=-1):
assert start < 0, "not implemented"
assert end < 0, "not implemented"
if isinstance(prefix, (str, unicode)):
prefix = MyString(prefix)
n = len(prefix)
return self[0:n] == prefix
but I found it's not a trivial task for myself to achive correctness
and completeness. It smells "reinventing the wheel" also, though I
can't find any hints in google and/or Python cookbook.
I don't care efficiency as a starting point. Any comments are welcome.
Thanks.
-- kayama
I would like to ask how I can implement string-like class using tuple
or list. Does anyone know about some example codes of pure python
implementation of string-like class?
Because I am trying to use Python for a text processing which is
composed of a large character set. As the character set is wider than
UTF-16(U+10FFFF), I can't use Python's native unicode string class.
So I want to prepare my own string class, which provides convenience
string methods such as split, join, find and others like usual string
class, but it uses a sequence of integer as a internal representation
instead of a native string. Obviously, subclassing of str doesn't
help.
The implementation of each string methods in the Python source
tree(stringobject.c) is far from python code, so I have started from
scratch, like below:
def startswith(self, prefix, start=-1, end=-1):
assert start < 0, "not implemented"
assert end < 0, "not implemented"
if isinstance(prefix, (str, unicode)):
prefix = MyString(prefix)
n = len(prefix)
return self[0:n] == prefix
but I found it's not a trivial task for myself to achive correctness
and completeness. It smells "reinventing the wheel" also, though I
can't find any hints in google and/or Python cookbook.
I don't care efficiency as a starting point. Any comments are welcome.
Thanks.
-- kayama