Getting Java to input strange characters from a file like c/c++ does it

Guest · Jan 17, 2007

Thanks for reading.

I'm taking a class where I need to read in PGM files character by character.
We're given code in C but I'd much rather use Java, so I tried rewriting it.
I'm hitting a snag though: BufferedReader's read() method is returning
different values than it should.

For example, where the C version...

int ci2;
FILE *fp2;

fp2=fopen("inputimage","rb");

ci2 = getc(fp2);

printf("%d", ci2);

returns "150" for some point in the file. The Java version...

int ci2;

BufferedReader in2 = new BufferedReader(new FileReader("inputimage"));

ci2 = in2.read();

System.out.print(ci2);

returns "8211".

If I convert the input from both the C and Java versions back to chars and output
to files, the C version's output matches the original image exactly (which it should),
but the Java version's output is screwy.

I assume it has to do with the Java VM converting the character encoding of the input,
but I have no clue how to get the correct input.

Here is a simple example that just reads in the PGM (without the header info) and
outputs it to another file character by character...

import java.io.*;

public class copy {
    public static void main (String[] args) throws IOException {

        int i, j, input;

        BufferedReader fin = new BufferedReader(new FileReader(args[0]));
        BufferedWriter fout = new BufferedWriter(new FileWriter(args[1]));

        // .pgm header info for a 256x256 image
        fout.write("P5\n256 256\n255\n");

        for (i = 0; i < 256; i++) {
            for (j = 0; j < 256; j++) {
                input = fin.read();
                fout.write(input);
            }
        }
        fin.close();
        fout.close();
    }
}

Thanks for any help anyone can provide.

- Drew G.

Oliver Wong · Jan 17, 2007

Guest said:
I'm taking a class where I need to read in PGM files character by character.

PGM... isn't that an image file format? If so, you should be reading the
values as bytes, not as characters.

- Oliver

A. Bolmarcich · Jan 17, 2007

Guest said:
I'm taking a class where I need to read in PGM files character by character.
We're given code in C but I'd much rather use Java, so I tried rewriting it.
I'm hitting a snag though: BufferedReader's read() method is returning
different values than it should.

A BufferedReader is designed to read a character input stream, but you
are trying to read a byte input stream. Use a BufferedInputStream
instead.
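
A minimal sketch of that suggestion, assuming the goal is simply to copy the file one byte at a time. The class name ByteCopy and the helper copy() are made up for illustration; the stream classes are the standard java.io ones.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical class name; copies a file byte by byte with no charset decoding.
class ByteCopy {

    // Copies every byte from in to out and returns how many bytes were copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        long count = 0;
        int b;
        // read() returns 0..255 for a byte and -1 at end of stream, much like getc() in C.
        while ((b = in.read()) != -1) {
            out.write(b);
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the input file, args[1] the output file (same convention as the original post).
        try (InputStream in = new BufferedInputStream(new FileInputStream(args[0]));
             OutputStream out = new BufferedOutputStream(new FileOutputStream(args[1]))) {
            copy(in, out);
        }
    }
}
```

Looping until read() returns -1, rather than counting 256x256 iterations, means the copy does not depend on the image size.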

Guest said:
For example, where the C version...

[snip]

The "b" in the mode parameter indicates that the file is to be read
as a binary (rather than text) file. Without the "b" in the mode you
might have problems reading the file.

Although it is possible to use a BufferedReader as long as the
underlying Reader is using an encoding that converts each byte of
the input stream to a char with the same value, such as "ISO8859_1",
it would be better to avoid the byte to char conversion since you
want byte, not char, values.
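
To illustrate the ISO-8859-1 point, here is a small sketch that pushes all 256 byte values through a Reader with that encoding and checks that each char comes back with the same numeric value. The class and method names are hypothetical, and it reads from an in-memory array rather than a real PGM file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows that ISO-8859-1 maps byte value N to char value N.
class Latin1ReaderDemo {

    // Reads data through an ISO-8859-1 Reader and returns the char values as ints.
    static int[] readAsLatin1(byte[] data) throws IOException {
        int[] values = new int[data.length];
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.ISO_8859_1)) {
            // ISO-8859-1 decodes exactly one char per byte, so the lengths match.
            for (int i = 0; i < values.length; i++) {
                values[i] = reader.read();
            }
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        byte[] all = new byte[256];
        for (int i = 0; i < 256; i++) {
            all[i] = (byte) i;
        }
        int[] values = readAsLatin1(all);
        boolean identical = true;
        for (int i = 0; i < 256; i++) {
            if (values[i] != i) {
                identical = false;
            }
        }
        System.out.println("every byte kept its value: " + identical);
    }
}
```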

Tom Hawtin · Jan 17, 2007

Guest said:
returns "150" for some point in the file. The Java version...

returns "8211".

8211 is an en-dash. Presumably this is what 150 represents in your
machine's configured default character encoding. Are you sure it's a PGM
file?
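
A quick way to see the mapping, assuming the machine's default encoding is windows-1252 (that is a guess; the original post does not say which encoding was in effect). The class and method names are made up for the demo.

```java
import java.nio.charset.Charset;

// Hypothetical demo class: decodes a single byte under a named charset.
class EnDashDemo {

    // Returns the numeric value of the char that one byte decodes to.
    static int decodeByte(int value, String charsetName) {
        byte[] one = {(byte) value};
        String decoded = new String(one, Charset.forName(charsetName));
        return decoded.charAt(0);
    }

    public static void main(String[] args) {
        // Assumption: windows-1252 was the default encoding on the poster's machine.
        System.out.println(decodeByte(150, "windows-1252")); // prints 8211 (U+2013, en dash)
        System.out.println(decodeByte(150, "ISO-8859-1"));   // prints 150
    }
}
```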

Guest said:
I assume it has to do with the Java VM converting the character encoding of the input,
but I have no clue how to get the correct input.

Yup. FileWriter and FileReader are useless classes. The default
character encoding is almost never what you want for any non-trivial
program. And if it is what you want, you really should state so explicitly.

Instead use FileInputStream and InputStreamReader (and FileOutputStream
and OutputStreamWriter), with an explicit character set.

Alternatively, treat it as a binary file and drop the Readers and
Writers (but mind that byte is signed while InputStream.read() returns a
non-negative byte representation).
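
A short sketch of the signedness caveat: a Java byte holding the value from the example prints as a negative number unless it is masked back into the 0..255 range. The names here are hypothetical.

```java
// Hypothetical demo class: converts Java's signed bytes to the 0..255 values C's getc() reports.
class UnsignedByteDemo {

    // Masks off the sign extension so the result is always in 0..255.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte pixel = (byte) 150;               // stored as a signed 8-bit value
        System.out.println(pixel);             // prints -106
        System.out.println(toUnsigned(pixel)); // prints 150
    }
}
```

The mask matters mainly when reading into a byte[] with read(byte[]); the single-byte read() already returns the non-negative value as an int.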

Tom Hawtin

Tom Hawtin · Jan 17, 2007

Oliver said:
PGM... isn't that an image file format? If so, you should be reading the
values as bytes, not as characters.

Like xbm, it's an image file format, but is text-based. Looks like C.
Complete hack. Great for image processing in Perl.

Tom Hawtin
