I/O Confusion

Novice

I'm trying to wrap my head around the basic I/O classes, especially how to
choose the one I need for a given job.

I understand that Streams are for bytes and Readers and Writers are for
characters and I think I understand the distinction between bytes and
characters.

Let's say that I want to write characters, as opposed to bytes. How do I
decide which subclass of Writer I want to use? I'm not clear on when
BufferedWriter, say, is preferable to PrintWriter.

I also get very confused when I see Streams, Readers and Writers wrapped
within one another. For example, this snippet confuses me:

Writer out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream
("outfilename"), "UTF8"));

Why wrap FileOutputStream within OutputStreamWriter within BufferedWriter?
Why not just BufferedWriter, say, to do everything? What is actually
happening at execution time when you execute this code?

Also, when do I use this basic I/O and when is it better to use the NIO
classes?

I've looked at a few different tutorials but the penny hasn't dropped for
me yet. Any advice on how to answer these questions for myself?
 
Eric Sosman

I'm trying to wrap my head around the basic I/O classes, especially how to
choose the one I need for a given job.

I understand that Streams are for bytes and Readers and Writers are for
characters and I think I understand the distinction between bytes and
characters.

Let's say that I want to write characters, as opposed to bytes. How do I
decide which subclass of Writer I want to use? I'm not clear on when
BufferedWriter, say, is preferable to PrintWriter.

You choose the one with the behavior you want. BufferedWriter
is a fairly low-level object: It offers a few output methods that
accept things already rendered in character form, and writes them to
the destination. Its main promise is that it will make an attempt to
accumulate a batch of output characters before sending them onward,
which may improve efficiency by making N/B trips through the I/O stack
instead of N.
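
To make the "N/B trips instead of N" point concrete, here is a minimal sketch. CountingWriter and BufferingDemo are hypothetical names invented for this illustration (they are not part of java.io), and the buffer size of 1,000 is an arbitrary choice. The sketch pushes characters one at a time through a BufferedWriter and counts how often the destination is actually asked to write.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Writer;

// Hypothetical destination for this sketch: it discards the data and only
// counts how it was delivered.
class CountingWriter extends Writer {
    int writeCalls = 0;   // how many times the destination was asked to write
    long charsSeen = 0;   // how many characters arrived in total

    @Override
    public void write(char[] cbuf, int off, int len) {
        writeCalls++;
        charsSeen += len;
    }

    @Override public void flush() { }
    @Override public void close() { }
}

class BufferingDemo {
    // Writes n characters one at a time through a BufferedWriter that holds
    // bufferSize chars, and returns the destination so it can be inspected.
    static CountingWriter writeOneByOne(int n, int bufferSize) throws IOException {
        CountingWriter sink = new CountingWriter();
        try (BufferedWriter out = new BufferedWriter(sink, bufferSize)) {
            for (int i = 0; i < n; i++) {
                out.write('x');
            }
        } // close() hands over whatever is still sitting in the buffer
        return sink;
    }

    public static void main(String[] args) throws IOException {
        CountingWriter sink = writeOneByOne(10_000, 1_000);
        System.out.println(sink.charsSeen + " chars arrived in "
                + sink.writeCalls + " write calls");
    }
}
```

All 10,000 characters still arrive, but in a handful of batches rather than 10,000 separate calls. With a real file or socket underneath, each avoided call is an avoided trip through the I/O stack.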

PrintWriter offers a richer set of methods, many of which take
non-character inputs and handle the character rendering themselves
instead of requiring you to do it. There's even locale control, and
things like the printf() methods. Also, the PrintWriter methods don't
throw IOExceptions: Instead of guarding every output with try/catch,
you can blithely ignore all the possible troubles until you finally
call checkError() just before you're done. (Whether this is a Good
Thing or a Bad Thing depends on the program -- but it's your choice.)
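
A small sketch of both PrintWriter points, under stated assumptions: PrintWriterDemo and BrokenWriter are hypothetical names made up for this example, and the "disk full" failure is simulated. The first method lets printf() render an int and a double; the second prints to a destination that always fails and then asks checkError() what happened.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Locale;

class PrintWriterDemo {
    // PrintWriter renders the int and the double itself; no manual conversion.
    static String format(int files, double percent) {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer);
        // Locale.ROOT keeps the decimal separator the same on every machine.
        out.printf(Locale.ROOT, "%d files, %.1f%% done", files, percent);
        out.flush();
        return buffer.toString();
    }

    // Hypothetical destination that fails on every write.
    static class BrokenWriter extends Writer {
        @Override
        public void write(char[] cbuf, int off, int len) throws IOException {
            throw new IOException("disk full (simulated)");
        }
        @Override public void flush() { }
        @Override public void close() { }
    }

    // Returns true if PrintWriter recorded the failure instead of throwing it.
    static boolean printToBrokenDestination() {
        PrintWriter out = new PrintWriter(new BrokenWriter());
        out.println("this never arrives");   // no exception, no try/catch needed
        return out.checkError();             // flushes, then reports earlier trouble
    }

    public static void main(String[] args) {
        System.out.println(format(3, 42.5));   // prints "3 files, 42.5% done"
        System.out.println("error recorded: " + printToBrokenDestination());
    }
}
```

The trade-off is visible in printToBrokenDestination(): the failure is remembered, but nothing forces you to look at it, and the original exception message is gone by the time you ask.
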
I also get very confused when I see Streams, Readers and Writers wrapped
within one another. For example, this snippet confuses me:

Writer out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream
("outfilename"), "UTF8"));

Why wrap FileOutputStream within OutputStreamWriter within BufferedWriter?
Why not just BufferedWriter, say, to do everything? What is actually
happening at execution time when you execute this code?

I believe this is what the Patterns aficionados call "Decoration."
You start with a FileOutputStream, which is capable of only one thing:
Writing bytes to a file.

You "decorate" that by wrapping it inside an OutputStreamWriter,
whose job is to translate characters to sequences of bytes. Note that
the translation is independent of the ultimate destination, which could
be a FileOutputStream or a SocketOutputStream or an IScreamYouStream;
*any* kind of OutputStream is fair game. Separating the translation
from the delivery means that all the translations of OutputStreamWriter
are available for *all* the implementations of OutputStream: You'll
never be tripped up by "We can write TUE7-encoded characters to the
console or to disk, but not to sockets or ZIP files."

Finally, you "decorate" still further by putting a BufferedWriter
around the OutputStreamWriter. As above, the BufferedWriter's job is
to improve efficiency by gathering a batch of characters and delivering
them all at once instead of delivering them one by one. (You need a
gallon of milk, a jar of kosher pickles, and the New Rochelle Times:
do you make one trip or three to the convenience store, and why?)
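
Here is the same three-layer chain as a hedged sketch. DecorationDemo and writeText are hypothetical names, and I've used StandardCharsets.UTF_8 (available since Java 7) in place of the "UTF8" string from the question. The method accepts any OutputStream, which is the whole point of separating translation from delivery.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class DecorationDemo {
    // Builds the same chain around ANY OutputStream and writes the text to it.
    static void writeText(OutputStream destination, String text) throws IOException {
        // Layer 1: the destination only knows how to accept bytes.
        // Layer 2: OutputStreamWriter turns characters into UTF-8 bytes.
        Writer encoder = new OutputStreamWriter(destination, StandardCharsets.UTF_8);
        // Layer 3: BufferedWriter batches characters before handing them down.
        try (Writer out = new BufferedWriter(encoder)) {
            out.write(text);
        } // closing the outermost layer flushes and closes the layers beneath it
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        writeText(bytes, "h\u00e9llo");   // five characters, one of them e-acute
        System.out.println(bytes.size() + " bytes written");   // prints "6 bytes written"
    }
}
```

Swapping the in-memory stream for new FileOutputStream("outfilename") gives the snippet from the question; nothing else in writeText() has to change.
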
Also, when do I use this basic I/O and when is it better to use the NIO
classes?

As a non-NIO user I'm not a good person to answer this. My
impression from reading the docs is that NIO is mostly of interest
when you need extremely high performance and/or a lot of parallelism.
For ordinary I/O tasks -- only a few tens of sources and destinations,
only a few gigabytes of total transfer, no troublesome requirements
about latencies -- java.io suffices and is simpler. If you're using
thousands of sockets and moving terabytes of data that must go to the
output wire in no more than X microseconds, look at java.nio more
closely than I've found need for.

I've looked at a few different tutorials but the penny hasn't dropped for
me yet. Any advice on how to answer these questions for myself?

Roedy Green had a cute little helper applet he liked to call an
"amanuensis" at http://www.mindprod.com/applet/fileio.html. (I say
"had" because it seems to be down at the moment: all I get is
"java.lang.ClassNotFoundException: com.mindprod.fileio.FileIO".) The
usage pattern was: "Tell me what sort of data you want to read or write,
tell me what the data comes from or goes to, and I'll write a skeleton
of the code you'll need." Pretty sweet, when it works.
 
Silvio Bierman

File and network IO is all about moving bytes around so you basically
need the various stream classes for that. Whether you use the buffered
variants depends on whether you need to read/write directly to the
physical devices or whether an intermediate buffer would enhance
throughput. This depends heavily on exactly what you are doing.

Bytes MAY represent characters using some encoding scheme. The
InputStreamReader/OutputStreamWriter classes bridge this representation gap.
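
A minimal sketch of that bridge on the reading side, assuming nothing beyond the standard library. EncodingBridgeDemo and decode are hypothetical names for this illustration. The same two bytes are decoded twice, and the charset handed to InputStreamReader decides what characters come out.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingBridgeDemo {
    // Decodes raw bytes into a String using the given charset.
    static String decode(byte[] raw, Charset charset) throws IOException {
        StringBuilder text = new StringBuilder();
        try (Reader in = new InputStreamReader(new ByteArrayInputStream(raw), charset)) {
            int c;
            while ((c = in.read()) != -1) {   // read() returns one char, or -1 at the end
                text.append((char) c);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = { (byte) 0xC3, (byte) 0xA9 };   // UTF-8 encoding of e-acute
        // As UTF-8 the two bytes are one character; as ISO-8859-1 they are two.
        System.out.println(decode(raw, StandardCharsets.UTF_8).length());        // prints 1
        System.out.println(decode(raw, StandardCharsets.ISO_8859_1).length());   // prints 2
    }
}
```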

Utility classes like FileWriter/FileReader are merely convenient
wrappers around OutputStreamWriter+FileOutputStream/
InputStreamReader+FileInputStream.
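
To illustrate the "convenient wrapper" claim, here is a rough sketch. FileWriterDemo and sameBytes are hypothetical names, and the temp files exist only for the demonstration. It writes the same text once through FileWriter and once through the spelled-out OutputStreamWriter + FileOutputStream pair using the default charset, then compares the resulting files byte for byte.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.FileWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.util.Arrays;

class FileWriterDemo {
    // Returns true if the shortcut and the long form produce identical files.
    static boolean sameBytes(String text) throws IOException {
        File viaShortcut = File.createTempFile("shortcut", ".txt");
        File viaLongForm = File.createTempFile("longform", ".txt");
        try {
            // The convenience class: no charset argument, so the default is used.
            try (Writer out = new FileWriter(viaShortcut)) {
                out.write(text);
            }
            // The same thing assembled by hand, with the default made explicit.
            try (Writer out = new OutputStreamWriter(
                    new FileOutputStream(viaLongForm), Charset.defaultCharset())) {
                out.write(text);
            }
            return Arrays.equals(Files.readAllBytes(viaShortcut.toPath()),
                                 Files.readAllBytes(viaLongForm.toPath()));
        } finally {
            viaShortcut.delete();
            viaLongForm.delete();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("identical output: " + sameBytes("plain text, two ways"));
    }
}
```

The convenience has a cost: the one-argument FileWriter constructor gives you no say in the encoding, so when the encoding matters the long form is the safer choice.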

The NIO classes add asynchronous/non-blocking IO (amongst other things)
to Java. Best to start with the synchronous stuff and look back into NIO
when you master the basics.
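
For a first taste of what NIO code looks like, here is a small sketch. It shows only the channel-and-buffer style, on an ordinary blocking file channel; it does not demonstrate the non-blocking socket features mentioned above. ChannelDemo and writeViaChannel are hypothetical names, and FileChannel.open() with a Path requires Java 7 or later.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ChannelDemo {
    // Writes text to a temp file through a channel and returns the file's bytes.
    static byte[] writeViaChannel(String text) throws IOException {
        Path file = Files.createTempFile("channel", ".txt");
        try {
            // NIO works on buffers of bytes: encode the characters up front.
            ByteBuffer buffer = StandardCharsets.UTF_8.encode(text);
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.WRITE)) {
                // A single write() may not drain the buffer, hence the loop.
                while (buffer.hasRemaining()) {
                    channel.write(buffer);
                }
            }
            return Files.readAllBytes(file);
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(writeViaChannel("hello").length + " bytes written");
    }
}
```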


Cheers,

Silvio
 
markspace

I also get very confused when I see Streams, Readers and Writers wrapped
within one another. For example, this snippet confuses me:

Writer out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream
("outfilename"), "UTF8"));


This was confusing to me at first too, but it makes perfect sense if you
think about it. The problem tends to be that other languages shield you
from some of this, while Java lets you assemble the low level bits how
you want.

The thing to remember is everything on a disk is binary, until it gets
interpreted as something else. The data could represent integer
numbers, or an image, or characters. So first at the lowest level you
just want to write bytes. That's what X_OutputStream and X_InputStream
are for.

Next up the chain you might want to interpret the data as characters,
not binary. But there's a niggle: Readers and Writers only accept other
Readers and Writers -- they expect the translation to be already done.
So there are intermediate classes that do that translation.
InputStreamReader and OutputStreamWriter translate between raw bytes and
Java character data.

Why expose this detail to the user? I expect it was just to save the API
writers from having to overload every single Reader and Writer method
with a pair of methods: one that manipulates raw streams, and one that
manipulates raw streams for a specified encoding. It's just good software design,
although it is rather verbose.
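
The layering described above, seen from the reading side, might look like the sketch below. ReadChainDemo and readLines are hypothetical names, and UTF-8 is an assumed encoding chosen for the example. The raw stream supplies bytes, InputStreamReader does the translation, and BufferedReader (which only accepts another Reader) adds readLine().

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class ReadChainDemo {
    // Reads every line of text from any byte source, decoding it as UTF-8.
    static List<String> readLines(InputStream source) throws IOException {
        List<String> lines = new ArrayList<>();
        // bytes -> InputStreamReader (translation) -> BufferedReader (lines)
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(source, StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = "first\nsecond\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(readLines(new ByteArrayInputStream(raw)));   // prints [first, second]
    }
}
```

Passing a FileInputStream instead of the ByteArrayInputStream reads the lines from a file; the two Reader layers stay the same.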

So in summary, Java always translates character data to and from raw
streams.

Characters <=============================> Raw Binary

Where the translator is called an InputStreamReader/OutputStreamWriter.

Characters <=============================> Raw Binary
               I/O Stream Reader/Writer
 
Roedy Green

I'm trying to wrap my head around the basic I/O classes, especially how to
choose the one I need for a given job.

I wrote an applet to generate code for any given I/O task. Watch what
it does, and you can see which classes are used in which
circumstances. It does not yet do nio.

see http://mindprod.com/jgloss/applet/fileio.html
--
Roedy Green Canadian Mind Products
http://mindprod.com
It should not be considered an error when the user starts something
already started or stops something already stopped. This applies
to browsers, services, editors... It is inexcusable to
punish the user by requiring some elaborate sequence to atone,
e.g. open the task editor, find and kill some processes.
 
Nasser M. Abbasi

I wrote an applet to generate code for any given I/O task. Watch what
it does, and you can see which classes are used in which
circumstances. It does not yet do nio.

see http://mindprod.com/jgloss/applet/fileio.html

Hello Roedy;

Still there is an error trying to access the above applet:

"Not Found

The requested URL /jgloss/applet.html/fileio.html was not found on this server.
Apache/2.2.6 (Unix) mod_ssl/2.2.6 OpenSSL/0.9.8e DAV/2 mod_apreq2-20051231/2.6.0
mod_perl/2.0.3 Perl/v5.8.8 Server at www.mindprod.com Port 80"

--Nasser
 
Novice

Hello Roedy;

Still there is an error trying to access the above applet:

"Not Found

The requested URL /jgloss/applet.html/fileio.html was not found on
this server. Apache/2.2.6 (Unix) mod_ssl/2.2.6 OpenSSL/0.9.8e DAV/2
mod_apreq2-20051231/2.6.0 mod_perl/2.0.3 Perl/v5.8.8 Server at
www.mindprod.com Port 80"

--Nasser


This URL works better: http://mindprod.com/applet/fileio.html. In other
words, just omit the "jgloss" part of the path.
 
Novice

I wrote an applet to generate code for any given I/O task. Watch what
it does, and you can see which classes are used in which
circumstances. It does not yet do nio.

see http://mindprod.com/jgloss/applet/fileio.html


Thanks Roedy - and the others who replied to my question! - for your
answers. I'll mull over the information and get back to you with a new
thread if I have further questions.

I think you've got me going in the right direction now.
 
Roedy Green

Thanks Roedy - and the others who replied to my question! - for your
answers. I'll mull over the information and get back to you with a new
thread if I have further questions.

When I first encountered the IO classes it felt like learning
irregular verbs in some foreign language. I wrote that Applet for
myself because I could not remember. However, over time, after using the
Applet many times, the code it generated became familiar and
predictable. There are still some goofy things -- the way specified
encodings are so different from default encodings, and sorting out how
much buffering to put in the InputStream and how much the Reader. Most
of the time now I just clone some code I wrote earlier.
It is sort of like a Lego set with some very strangely shaped pieces,
and some pieces you would expect to exist missing.

If you look at the source code FileIO.java that generates the code, you
will appreciate how strangely irregular the whole thing is. It was
designed to make the job possible, not easy.

 
