Base64 Encoding with streams.

  • Thread starter andoni.oconchubhair

andoni.oconchubhair

Hi all,

I have spent my working day wrestling with a piece of code which
"should" work but of course doesn't! It took me all of 30 minutes to
write and I have spent the last 9 hours looking through to find the
bug!!! :-( It is a class which reads a binary file from a socket and
writes it to a text file in Base64 format using the Axis
encoder/decoder.

ws.apache.org/axis/java

If I make the block-size big enough that it does not have to loop it is
fine but I want to be able to process massive files passing through a
mail server with a big (but not massive) block size.

My thanks in advance for any help,
Andoni.


import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

import org.apache.axis.encoding.Base64;

/**
* Class for testing the use of Base64 encoders/decoders.
*
* @author Andoni OConchubhair
*
*/
public class TestBase64 {

private static final String BASE_64_FILE_SUFFIX = ".64";
private static final int BLOCK_SIZE = 128;

/**
* Convert a Base64 encoded file back to its original binary file.
*
* @param myFile The Base64 encoded file (as a text file).
* @throws IOException No need to handle exception here.
*/
private void reCreateFile(File myFile) throws IOException {
FileReader myFr = new FileReader(myFile);
FileOutputStream myFos = new FileOutputStream("new_" +
myFile.getName().substring(0, myFile.getName().lastIndexOf(".")));
char[] charArray = new char[BLOCK_SIZE];
final int OFFSET = 0;
int len = 0;
while(BLOCK_SIZE == (len = myFr.read(charArray, OFFSET, BLOCK_SIZE)))
{
System.out.print("<d");
Base64.decode(charArray, OFFSET, len, myFos);
}
System.out.println();
if(-1 < len) {
System.out.println("Len left is: " + len);
Base64.decode(charArray, OFFSET, len, myFos);
}

myFos.flush();
myFos.close();
}

/**
* Convert a binary file into a Base64 encoded text file.
*
* @param myFile The binary file to be encoded.
* @throws IOException No need to handle any exceptions.
*/
private void encodeFile(File myFile) throws IOException {
FileInputStream myFis = new FileInputStream(myFile);
FileWriter out = new FileWriter(myFile.getName() +
BASE_64_FILE_SUFFIX);
byte[] binArray = new byte[BLOCK_SIZE];
int len = 0;
final int OFFSET = 0;

while(BLOCK_SIZE == (len = myFis.read(binArray, OFFSET, BLOCK_SIZE)))
{
System.out.print("e>");
Base64.encode(binArray, OFFSET, len, out);
}

// Protect against the situation where there is no remainder (len == -1).
if(-1 < len) {
Base64.encode(binArray, OFFSET, len, out);
}

out.flush();
out.close();
}

/**
* @param args
*/
public static void main(String[] args) throws IOException {
if(null == args || args.length == 0) {
System.out.println("Usage: java TestBase64 <filename>");
System.exit(0);
}

String fileName = args[0];
File myFile = new File(fileName);

if(!myFile.exists()) {
System.out.println("The specified file does not exist!");
System.exit(0);
}

TestBase64 my64 = new TestBase64();
if(fileName.endsWith(BASE_64_FILE_SUFFIX)) {
my64.reCreateFile(myFile);
}
else {
// Encode file.
my64.encodeFile(myFile);
}
}
}
 

robert

(e-mail address removed)-intl.com wrote:
Hi all,

I have spent my working day wrestling with a piece of code which
"should" work but of course doesn't! It took me all of 30 minutes to
write and I have spent the last 9 hours looking through to find the
bug!!! :-( It is a class which reads a binary file from a socket and
writes it to a text file in Base64 format using the Axis
encoder/decoder.

Allow me to show you two pieces of code that should work together just
fine.

First, convert the file to a byte array:

public static byte[] getBytesFromFile(File file) throws
IOException {
InputStream is = new FileInputStream(file);

// Get the size of the file
long length = file.length();

// You cannot create an array using a long type.
// It needs to be an int type.
// Before converting to an int type, check
// to ensure that file is not larger than Integer.MAX_VALUE.
if (length > Integer.MAX_VALUE) {
// File is too large
throw new IOException("File exceeds max value: "
+ Integer.MAX_VALUE);
}

// Create the byte array to hold the data
byte[] bytes = new byte[(int) length];

// Read in the bytes
int offset = 0;
int numRead = 0;
while (offset < bytes.length
&& (numRead = is.read(
bytes, offset, bytes.length - offset)) >= 0) {
offset += numRead;
}

// Ensure all the bytes have been read in
if (offset < bytes.length) {
throw new IOException("Could not completely read file "
+ file.getName());
}

// Close the input stream and return bytes
is.close();
return bytes;
}

Then convert from bytes to base64:

String id = null;
try {
// Encode the raw bytes directly. Passing them through an
// ObjectOutputStream first would add Java serialization
// framing to the data, and that framing would end up in
// the Base64 output.
id = new String(Base64.encode(yourByteArray), "US-ASCII");
} catch(Exception ex) {
throw new SomeException(ex.getMessage());
}

This Base64 implementation uses apache/commons/httpclient/Base64.java
From there, just write the String as a file. That'd be my first shot.

HTH,
Robert
http://www.braziloutsource.com/
 

andoni.oconchubhair

What you have given is the usual answer, which works in most cases. I
have a quite special case though where I must write directly to the
file as the attachment may be too big to hold for long in memory.
Therefore I needed to stream this directly into a file on disk.

Anyway, I have found the answer to my problem. It is that the
BLOCK_SIZE that I was using needs to be a multiple of both 6 and 8,
both of which Base64 uses. I currently have it working well with 24 and
48 but I can't find a bigger one yet; 480 does not work, so it can't be
just any multiple.

I will look into the structure of Base64 and if I remember I will post
my findings here.

Thank you for your help,
Andoni.
 

Andrey Kuznetsov

What you have given is the usual answer, which works in most cases. I
have a quite special case though where I must write directly to the
file as the attachment may be too big to hold for long in memory.
Therefore I needed to stream this directly into a file on disk.

Then you may try the Base64 codec from Unified I/O.
It accepts a Reader/OutputStream or an InputStream/Writer pair as input/output.

What you need is not yet in released version (I extended Base64.java just
after I read your post).
You have to get it from CVS:
https://uio.dev.java.net/source/browse/uio/com/imagero/uio/io/Base64.java
 

Roedy Green

Anyway, I have found the answer to my problem. It is that the
BLOCK_SIZE that I was using needs to be a multiple of both 6 and 8,
both of which Base64 uses. I currently have it working well with 24 and
48 but I can't find a bigger one yet; 480 does not work, so it can't be
just any multiple.

If your data are too large to encode as one byte array you
need to encode in chunks, keeping in mind that Base64 is a
scheme where 3 bytes are concatenated, then split to form 4
groups of 6-bits each; and each 6-bits gets translated to an
encoded printable ASCII character, via a table lookup.

You must encode groups of 3 bytes together to get 4
chars and decode groups of 4 chars together to get three
bytes without a split over a buffer boundary.

One easy way of ensuring that is to make sure all buffers
are a multiple of 12 long.
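
The alignment rule above can be checked with a small sketch. It assumes java.util.Base64 from the JDK (Java 8 or later) rather than the Axis encoder used in the original post, and the class and method names are made up for illustration. Each chunk is encoded on its own, so the result matches a single-pass encoding only when every chunk except the last is a multiple of 3 bytes long.

```java
import java.util.Arrays;
import java.util.Base64;

public class ChunkedBase64Demo {

    /**
     * Encodes data in independent chunks of chunkSize bytes and
     * concatenates the results. (Illustrative helper, not part of any library.)
     */
    public static String encodeInChunks(byte[] data, int chunkSize) {
        Base64.Encoder encoder = Base64.getEncoder();
        StringBuilder sb = new StringBuilder();
        for (int off = 0; off < data.length; off += chunkSize) {
            int len = Math.min(chunkSize, data.length - off);
            // Each chunk is encoded separately, so a chunk whose length is
            // not a multiple of 3 gets '=' padding in the middle of the output.
            sb.append(encoder.encodeToString(Arrays.copyOfRange(data, off, off + len)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] data = new byte[1000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i * 31);
        }
        String whole = Base64.getEncoder().encodeToString(data);

        // 120 is a multiple of 3 (and of 12); 128 is not.
        System.out.println("120-byte chunks match: " + encodeInChunks(data, 120).equals(whole));
        System.out.println("128-byte chunks match: " + encodeInChunks(data, 128).equals(whole));
    }
}
```

The same reasoning applies in the other direction: a decoder fed chunk by chunk needs each chunk to hold a whole number of 4-character groups. A buffer length that is a multiple of 12 satisfies both constraints, which is why one constant can serve for encoding and decoding.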
 

Thomas Weidenfeller

char[] charArray = new char[BLOCK_SIZE];
final int OFFSET = 0;
int len = 0;
while(BLOCK_SIZE == (len = myFr.read(charArray, OFFSET, BLOCK_SIZE)))
{
System.out.print("<d");
Base64.decode(charArray, OFFSET, len, myFos);
}
System.out.println();
if(-1 < len) {
System.out.println("Len left is: " + len);
Base64.decode(charArray, OFFSET, len, myFos);
}

a) You assume a behavior of read() which is not guaranteed at all. read()
is allowed at any time to return fewer than the requested number of
characters. This does not indicate that you are at or near EOF. So
your whole read logic is faulty.

b) You rely on the platform's default character encoding. This might or
might not be what you want.

c) You assume that decode() can be called incrementally. This is not
specified in the API documentation, so maybe it is not supported.

private void encodeFile(File myFile) throws IOException {
FileInputStream myFis = new FileInputStream(myFile);
FileWriter out = new FileWriter(myFile.getName() +
BASE_64_FILE_SUFFIX);
byte[] binArray = new byte[BLOCK_SIZE];
int len = 0;
final int OFFSET = 0;

while(BLOCK_SIZE == (len = myFis.read(binArray, OFFSET, BLOCK_SIZE)))
{
System.out.print("e>");
Base64.encode(binArray, OFFSET, len, out);
}

Same faulty read logic. Whoever taught you that this is the way to do it
should be ashamed. Again, you use the platform's default character encoding.

/Thomas
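
Point (a) can be illustrated with a short sketch of a read loop that treats -1 as the only end-of-stream signal and keeps reading until the buffer is full. The class FillBufferDemo, the helper fill() and the TrickleStream class are hypothetical names invented for this example; TrickleStream only simulates a source that returns short reads, as a socket may.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class FillBufferDemo {

    /**
     * Reads until buf is full or the stream ends.
     * Returns the number of bytes stored; for a non-empty buf this is 0
     * only when the stream was already at its end.
     */
    public static int fill(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n < 0) {
                break; // -1 is the only reliable end-of-stream signal
            }
            total += n; // a short read just means "try again"
        }
        return total;
    }

    /** Hypothetical stream that hands out at most 5 bytes per read call. */
    public static class TrickleStream extends FilterInputStream {
        public TrickleStream(InputStream in) {
            super(in);
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            return super.read(b, off, Math.min(len, 5));
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new TrickleStream(new ByteArrayInputStream(new byte[100]));
        byte[] buf = new byte[48]; // a multiple of 3, so full chunks encode cleanly
        int len;
        while ((len = fill(in, buf)) > 0) {
            System.out.println("chunk of " + len + " bytes");
        }
    }
}
```

With a loop like this, only the final chunk can be shorter than the buffer, so a test of the form BLOCK_SIZE == read(...) is no longer needed to decide when to stop.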
 

andoni.oconchubhair

Hello Thomas,

I would be *very* interested to see your version of same! If I cannot
count on the number of characters that read is returning, are you
suggesting that I only rely on -1 to tell me when to stop? I have
always used this methodology when reading from sockets and it has
always worked for me. Though I know that is not proof that anything
works. Anyway, I'll be looking into this further and be trying to apply
your comments.

I don't appreciate the: "Whoever taught you ..." comment though :-(

Andoni.
 

Greg R. Broderick

(e-mail address removed)-intl.com wrote in
Hi all,

I have spent my working day wrestling with a piece of code which
"should" work but of course doesn't! It took me all of 30 minutes to
write and I have spent the last 9 hours looking through to find the
bug!!! :-( It is a class which reads a binary file from a socket and
writes it to a text file in Base64 format using the Axis
encoder/decoder.

ws.apache.org/axis/java

If I make the block-size big enough that it does not have to loop it
is fine but I want to be able to process massive files passing
through a mail server with a big (but not massive) block size.

My thanks in advance for any help,
Andoni.

I'd suggest that you take a look at javax.mail.internet.MimeUtility in
JavaMail. This class allows you to 'wrap' an output stream with an
encoder, and treat the resultant encoder as an OutputStream of its own,
that automagically encodes the data that you write to it in base64.


Cheers
GRB
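
The wrapping idea can be sketched without JavaMail. The example below swaps in Base64.Encoder.wrap(OutputStream) from java.util.Base64 (Java 8 or later), which is a different API from the MimeUtility class suggested above but follows the same pattern: the encoder is itself an OutputStream and carries partial 3-byte groups over between writes. The class name WrapStreamDemo and its encode() method are illustrative only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class WrapStreamDemo {

    /**
     * Copies in to out, Base64-encoding on the way.
     * The copy buffer can be any size because the wrapping stream keeps
     * leftover bytes between writes. Closing the wrapper writes the final
     * padding and also closes out.
     */
    public static void encode(InputStream in, OutputStream out) throws IOException {
        try (OutputStream b64 = Base64.getEncoder().wrap(out)) {
            byte[] buf = new byte[128]; // deliberately not a multiple of 3
            int n;
            while ((n = in.read(buf)) != -1) {
                b64.write(buf, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "streamed through a wrapping encoder".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        encode(new ByteArrayInputStream(data), sink);

        String streamed = new String(sink.toByteArray(), StandardCharsets.US_ASCII);
        String direct = Base64.getEncoder().encodeToString(data);
        System.out.println("same as one-shot encoding: " + streamed.equals(direct));
    }
}
```

For the file case in this thread, the two streams would be a FileInputStream (or the socket's input stream) and a FileOutputStream, so the file never has to fit in memory.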
 
