-------- Original Message --------
Date: Tue, 10 Nov 2009 00:28:56 +0900
From: Robert Klemme
To: (e-mail address removed)
Subject: Re: FileString - request for comments
> I am still trying to wrap my head around the question whether hiding
> file IO behind a String API is a good idea. Basically the reason to
> create something like this is to be able to use a file in places which
> expect to be given a String instance.
No. At least that was not the idea (though you could use it that way).
The reason is that, for example, replacing part of a file is cumbersome with the plain IO API.
Compare:
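# IO API:
File.open(path, "r+b") do |fh|
  fh.seek(offset + length)
  rest = fh.read            # keep everything after the replaced span
  fh.seek(offset)
  fh.write(replacement)
  fh.write(rest)
  fh.truncate(fh.pos)       # drop leftover bytes if the replacement was shorter
end
# String API:
fs = FileString.new(path)
fs[offset, length] = replacement # done!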
Imagine how much more inconvenient it becomes when it's not offset & length but a Range, or when you have to accommodate negative offsets etc.
And there are other examples; just dive into FileString's source a bit.
The String API is *far* more convenient.
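To make the comparison concrete, here is a rough sketch of how a String-style `[]=` could wrap the IO calls above, including negative offsets. `MiniFileString` is a made-up stand-in for illustration, not FileString's actual code, and it only covers the offset/length form (no Range support).

```ruby
require "tmpdir"

# Hypothetical stand-in for illustration; NOT FileString's actual implementation.
class MiniFileString
  def initialize(path)
    @path = path
  end

  # Supports: fs[offset, length] = replacement
  # A negative offset counts back from the end of the file, as with String#[]=.
  def []=(offset, length, replacement)
    File.open(@path, "r+b") do |fh|
      size = fh.size
      offset += size if offset < 0
      if offset < 0 || offset > size || length < 0
        raise IndexError, "offset or length out of range"
      end
      length = [length, size - offset].min  # clamp to the end of the file
      fh.seek(offset + length)
      rest = fh.read                        # everything after the replaced span
      fh.seek(offset)
      fh.write(replacement)
      fh.write(rest)
      fh.truncate(fh.pos)                   # drop stale bytes if the file shrank
    end
  end

  # Reads the whole file into an ordinary in-memory String.
  def to_s
    File.binread(@path)
  end
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "demo.txt")
  File.binwrite(path, "hello world")
  fs = MiniFileString.new(path)
  fs[0, 5] = "goodbye"
  puts fs.to_s   # prints "goodbye world"
end
```

Note that every assignment still rewrites the whole tail of the file, so the convenience does not make the operation any cheaper.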
> However, code that uses String
> assumes fast access to arbitrary portions of the string. When those
> accesses are translated into random accesses to a file performance
> _might_ suffer dramatically.
Yes. If you run into that kind of problem, you can always use File.read instead of FileString#to_s (or to_str) to get an ordinary in-memory String.
> Put differently: hiding the fact that we
> are dealing with a file is convenient but may actually break your
> neck.
As with all high-level things. If you don't know what you're dealing with, you can easily kill performance. Consider e.g. ary.any? { |obj| other.include?(obj) }: there, you have just accidentally created an O(n^2) algorithm. It can happen anywhere, and it can look totally innocent.
That's not a problem specific to FileString; it applies to every abstraction.
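For illustration, the innocent-looking version and one common way around it might look like this. The sample arrays are made up, and the Set-based variant is just one possible fix, assuming the elements hash sensibly.

```ruby
require "set"

ary   = (1..5).to_a
other = (4..10).to_a

# Quadratic in the worst case: Array#include? scans `other` linearly
# for each element of `ary`.
slow = ary.any? { |obj| other.include?(obj) }

# Roughly linear: building the Set takes one pass over `other`,
# and each Set#include? is a hash lookup.
other_set = other.to_set
fast = ary.any? { |obj| other_set.include?(obj) }

puts slow == fast   # prints "true"
```

Both versions read almost the same, which is exactly why the cost is easy to overlook.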
> And although at a certain level of abstraction a file and a
> String are pretty much the same (sequence of chars / bytes) it may
> actually be a good thing to keep the API separate in order to treat
> both appropriately. Stefan, what's your experience?
As you see, I disagree.
However, what you say is of course correct. Using FileString means you have to keep in mind that you're dealing with a file.
But: if you know you're dealing with a file, it can even help you make things faster. For example, if you want to compare two files for equality, FileString#== will be faster and less memory-intensive than doing File.read(a) == File.read(b) yourself when the two files are big.
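A rough sketch of why a file-aware comparison can win: it can bail out on a size mismatch and otherwise compare chunk by chunk, so neither file is ever fully in memory. `same_content?` and its chunk size are made up for illustration and are not FileString#== itself. The standard library's FileUtils.compare_file does a similar job.

```ruby
# Hypothetical helper for illustration; NOT FileString#== itself.
def same_content?(path_a, path_b, chunk_size = 64 * 1024)
  # Files of different sizes cannot be equal, so no reading is needed.
  return false unless File.size(path_a) == File.size(path_b)

  File.open(path_a, "rb") do |a|
    File.open(path_b, "rb") do |b|
      # IO#read(n) returns nil at end of file, which ends the loop.
      while (chunk_a = a.read(chunk_size))
        return false unless chunk_a == b.read(chunk_size)
      end
    end
  end
  true
end
```

At most two chunks are held in memory at a time, and the comparison stops at the first differing chunk.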
Thanks for your thoughts, Robert, much appreciated.
regards
Stefan