Replacing string in an existing file

ruds

Hi,
I have a .doc file in which I want to replace/modify a string,
e.g. Date: dd/mm/yyyy to Date: MM/DD/YYYY

Can I use regular file manipulation to achieve this?
Please advise.
 
GArlington

ruds said:
Hi,
I have a .doc file in which I want to replace/modify a string,
e.g. Date: dd/mm/yyyy to Date: MM/DD/YYYY

Can I use regular file manipulation to achieve this?
Please advise.

.doc means MS Word document...
File() type of access is likely to break it because .doc is a
proprietary BINARY (NOT text) format...
But you can give it a try just to check...
 
Daniel Pitts

GArlington said:
.doc means MS Word document...
File() type of access is likely to break it because .doc is a
proprietary BINARY (NOT text) format...
But you can give it a try just to check...

What are you talking about? java.io.File doesn't even give you access
to the content of the file, only (some of) the system attributes
associated with a path. Java is well capable of accessing binary data.

The only part of your message that makes any sense is that the .doc
format is proprietary, which is likely to make it difficult to read/write.
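
To illustrate the distinction drawn above, here is a minimal sketch, assuming a small temporary file created only for the demonstration (the class name and the sample bytes are hypothetical). java.io.File reports attributes of the path, while the raw bytes come from java.nio.file.Files.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class FileVersusContent {
    /** Returns the raw bytes of the file, with no charset involved. */
    static byte[] readRaw(Path path) throws IOException {
        return Files.readAllBytes(path);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample file, created here so the sketch runs anywhere.
        Path path = Files.createTempFile("sample", ".doc");
        Files.write(path, new byte[] {(byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0});

        // java.io.File only describes the path: name, size, timestamps, permissions.
        File file = path.toFile();
        System.out.println(file.getName() + " is " + file.length() + " bytes long");

        // The content itself is read through Files (or a stream), as plain bytes.
        byte[] content = readRaw(path);
        System.out.printf("first byte: 0x%02X%n", content[0] & 0xFF);

        Files.delete(path);
    }
}
```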
 
Lew

Daniel said:
What are you talking about? java.io.File doesn't even give you access
to the content of the file, only (some of) the system attributes
associated with a path. Java is well capable of accessing binary data.

The only part of your message that makes any sense is that the .doc
format is proprietary, which is likely to make it difficult to read/write.

What I guess they're talking about is that with a binary file format,
arbitrary writes can mess up structural invariants and render the file
useless. So most of their message makes sense to me.

OTOH, if the OP is very familiar with the .doc format, they can ensure that
those invariants are not violated, and then Java access will work just fine.
 
Roedy Green

ruds said:
I have a .doc file in which I want to replace/modify a string,
e.g. Date: dd/mm/yyyy to Date: MM/DD/YYYY

Can I use regular file manipulation to achieve this?

The easiest way is to read the entire file into a String, then create
a StringBuilder and build your new file image from substrings of the
original. Then write it out.

See http://mindprod.com/applet/fileio.html
http://mindprod.com/jgloss/stringbuilder.html
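
A minimal sketch of that String and StringBuilder approach, assuming the file is genuinely plain text and that its encoding is UTF-8 (both are assumptions, as are the class name and the default file name test.txt). It is not suitable for a binary Word file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class TextReplace {
    /** Rebuilds the text from substrings of the original, swapping each occurrence of target. */
    static String replaceAll(String original, String target, String replacement) {
        if (target.isEmpty()) {
            return original;
        }
        StringBuilder sb = new StringBuilder(original.length());
        int from = 0;
        int hit;
        while ((hit = original.indexOf(target, from)) >= 0) {
            sb.append(original, from, hit); // untouched stretch before the match
            sb.append(replacement);
            from = hit + target.length();
        }
        sb.append(original, from, original.length()); // tail after the last match
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default file name; pass a real path as the first argument.
        Path path = Paths.get(args.length > 0 ? args[0] : "test.txt");
        String text = new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        String updated = replaceAll(text, "dd/mm/yyyy", "MM/DD/YYYY");
        Files.write(path, updated.getBytes(StandardCharsets.UTF_8));
    }
}
```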
--
Roedy Green Canadian Mind Products
http://mindprod.com
PM Steven Harper is fixated on the costs of implementing Kyoto, estimated as high as 1% of GDP.
However, he refuses to consider the costs of not implementing Kyoto which the
famous economist Nicholas Stern estimated at 5 to 20% of GDP
 
Roedy Green

ruds said:
Hi,
I have a .doc file in which I want to replace/modify a string,
e.g. Date: dd/mm/yyyy to Date: MM/DD/YYYY

.doc can be many different formats, commonly the proprietary MS Word
format, but sometimes plain text.
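
One way to tell which kind of .doc you have is to look at its first bytes. Binary Word files live in an OLE2 compound file, which starts with the eight-byte signature D0 CF 11 E0 A1 B1 1A E1. The sketch below checks for it; the class and method names are hypothetical, and since the same container is used by other legacy Office formats, a match only says "binary container", not "definitely Word".

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

class DocSniffer {
    // Signature that opens every OLE2 compound file.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    /** True if the given leading bytes begin with the OLE2 signature. */
    static boolean startsWithOle2Magic(byte[] head) {
        if (head.length < OLE2_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < OLE2_MAGIC.length; i++) {
            if (head[i] != OLE2_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** Reads up to count bytes from the start of the file. */
    static byte[] readHead(Path path, int count) throws IOException {
        byte[] head = new byte[count];
        int filled = 0;
        try (InputStream in = Files.newInputStream(path)) {
            int n;
            while (filled < count && (n = in.read(head, filled, count - filled)) > 0) {
                filled += n;
            }
        }
        return Arrays.copyOf(head, filled);
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java DocSniffer <file>");
            return;
        }
        boolean binary = startsWithOle2Magic(readHead(Paths.get(args[0]), 8));
        System.out.println(binary ? "OLE2 binary container" : "not OLE2, possibly plain text");
    }
}
```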

There are several things you can do.

Use the POI library. http://mindprod.com/jgloss/poi.html

Export the file to text, HTML, RTF...

Convert the file to OpenOffice Format.
http://mindprod.com/jgloss/openoffice.html



 
Lew

Roedy said:
The easiest way is to read the entire [.doc] file into a String, then create
a StringBuilder and build your new file image from substrings of the
original. Then write it out.

This approach is fraught with peril for binary file formats.

It has similar risks to this approach employed with, say, .jpg, .wav
or .mp3 files.

It can be done, but one must really understand the file format and be
very, very careful. Also, the translation between binary formats and
String characters must be considered and managed.

Roedy's advice to use POI or the other techniques he mentions is
safer.
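
The translation problem can be shown with a few bytes. This is a minimal sketch with made-up sample bytes and a hypothetical class name: decoding arbitrary binary data as UTF-8 and encoding it again does not give back the original bytes, because invalid sequences are replaced during decoding. ISO-8859-1 happens to map every byte to one char and back, but that only preserves the bytes; it does nothing to protect the structure of the format.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class RoundTrip {
    /** Decodes the bytes to a String with the given charset, then encodes them again. */
    static byte[] throughString(byte[] raw, Charset charset) {
        return new String(raw, charset).getBytes(charset);
    }

    public static void main(String[] args) {
        // Made-up sample: bytes that are not valid UTF-8.
        byte[] raw = {(byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0};

        boolean utf8Intact = Arrays.equals(raw, throughString(raw, StandardCharsets.UTF_8));
        boolean latin1Intact = Arrays.equals(raw, throughString(raw, StandardCharsets.ISO_8859_1));

        System.out.println("UTF-8 round trip intact: " + utf8Intact);
        System.out.println("ISO-8859-1 round trip intact: " + latin1Intact);
    }
}
```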
 
Roedy Green

Lew said:
This approach is fraught with peril for binary file formats.

That is an understatement!

Lew said:
It has similar risks to this approach employed with, say, .jpg, .wav
or .mp3 files.


For binary formats, you could use SPLICE, a C++ utility.
See http://mindprod.com/products1.html#SPLICE

Or you could read a byte[], and agglutinate with a ByteArrayOutputStream
using bytes only, no chars. A ByteArrayOutputStream acts something like a
StringBuilder but for bytes. It grows a backing byte[] store as
needed.

See http://mindprod.com/applet/fileio.html

Make very sure you work with bytes with no encoding. As soon as you
turn on any sort of encoding all manner of queer things will happen,
though the code will work for text files.
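
A minimal sketch of the bytes-only approach, with hypothetical class and method names. It scans a byte[] for a target byte sequence and collects the result in a ByteArrayOutputStream, never converting the file content to chars. Two caveats that are assumptions about the file rather than facts from this thread: the target bytes must match however the document actually stores its text (which may not be one byte per character), and a replacement of the same length, as in the dd/mm/yyyy example, avoids shifting any later bytes the format may depend on.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class ByteReplace {
    /** Returns a copy of data with every occurrence of target replaced, working on bytes only. */
    static byte[] replaceAll(byte[] data, byte[] target, byte[] replacement) {
        if (target.length == 0) {
            return data.clone();
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream(data.length);
        int i = 0;
        while (i < data.length) {
            if (matchesAt(data, i, target)) {
                out.write(replacement, 0, replacement.length);
                i += target.length;
            } else {
                out.write(data[i]); // copy one unmatched byte as-is
                i++;
            }
        }
        return out.toByteArray();
    }

    /** True if target occurs in data starting exactly at offset. */
    static boolean matchesAt(byte[] data, int offset, byte[] target) {
        if (offset + target.length > data.length) {
            return false;
        }
        for (int j = 0; j < target.length; j++) {
            if (data[offset + j] != target[j]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The search strings are turned into bytes once; the file data itself is never decoded.
        byte[] target = "dd/mm/yyyy".getBytes(StandardCharsets.US_ASCII);
        byte[] replacement = "MM/DD/YYYY".getBytes(StandardCharsets.US_ASCII);
        byte[] data = {(byte) 0xFF, 'd', 'd', '/', 'm', 'm', '/', 'y', 'y', 'y', 'y', (byte) 0x80};
        byte[] result = replaceAll(data, target, replacement);
        System.out.println("length before: " + data.length + ", after: " + result.length);
    }
}
```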

On Windows,
COPY /B a + b c
will concatenate the binary files a and b into c.

 
sol

It is possible to modify a .DOC or .ODT file:

Code:
            FileInputStream fis = new FileInputStream(new File("test.odt"));
            FileOutputStream fos = new FileOutputStream(new File("c:/result.html"));

            officetools.OfficeFile f = new officetools.OfficeFile(fis, "localhost", "8100", false);

            f.replaceAll("originalString", "replaceString");

            f.convert(fos, "html");

officetools.jar is available on this website:
dancrintea.ro/doc-to-pdf/
 
