Generate Word and PowerPoint files

Daniel Parry

I'd like to generate Word and PowerPoint files on a Linux-based
system, populated with random words from a dictionary so that I can
create files of various sizes. I have this working for Excel, HTML,
JSON, ODT, PDF, RTF, Text, and XML, but I'm a bit stumped on doc
and ppt. Does anyone have suggestions for hacks that might make
these last two formats possible without starting up a Windows
instance somehow? (^_^)

Thanks and best wishes,

Daniel
 
smallpond

Please post a spec for those formats.
 
Ben Bullock

On Linux, you could create your file in HTML or some other format and then
have OpenOffice.org save it in Microsoft's .doc (Word) or .ppt (PowerPoint)
format.

I don't know how to automate OpenOffice.org, but I imagine it's possible.
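
One way the conversion step might be automated is through the office suite's headless command line. The sketch below is only an illustration under several assumptions: that a LibreOffice-style `soffice` binary supporting `--headless --convert-to` is on the PATH (whether a given OpenOffice.org build accepts these options is not confirmed here), that a word list lives at /usr/share/dict/words, and that the helper names (`random_text`, `build_convert_command`, `convert`) are hypothetical. For .ppt the input would probably need to be a presentation format such as ODP rather than plain text.

```python
import random
import shutil
import subprocess
from pathlib import Path


def random_text(words, n_words, seed=None):
    """Return n_words randomly chosen words joined by single spaces."""
    rng = random.Random(seed)
    return " ".join(rng.choice(words) for _ in range(n_words))


def build_convert_command(src, target_format, outdir, binary="soffice"):
    """Build the headless conversion command line without running it."""
    return [binary, "--headless", "--convert-to", target_format,
            "--outdir", str(outdir), str(src)]


def convert(src, target_format, outdir):
    """Run the conversion. Returns the expected output path,
    or None when no soffice binary is found on the PATH."""
    binary = shutil.which("soffice")
    if binary is None:
        return None
    subprocess.run(build_convert_command(src, target_format, outdir, binary),
                   check=True)
    return Path(outdir) / (Path(src).stem + "." + target_format)


if __name__ == "__main__":
    # Assumed dictionary location; adjust for your distribution.
    words = Path("/usr/share/dict/words").read_text(errors="ignore").split()
    src = Path("sample.txt")
    src.write_text(random_text(words, 5000))
    print(convert(src, "doc", "."))
```

Varying the word count passed to `random_text` gives files of different sizes; the conversion itself is left entirely to the office suite.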
 
Daniel Parry

> Please post a spec for those formats.

In essence, I'm after formats suitable for testing the text
extraction capabilities of Apache Jackrabbit, the Java JCR
implementation. I believe Jackrabbit is moving towards using
Apache Tika, so the formats are likely those listed here:

http://lucene.apache.org/tika/formats.html

I am particularly interested in Word and PowerPoint documents,
though, which likely means the OLE2 Compound Document format?

Best wishes,

Daniel
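
For what it's worth, a quick way to check whether a generated file really is an OLE2 compound document is to look at its first eight bytes, which carry a fixed signature. This is only a minimal sketch; the function name `is_ole2` is made up for illustration, and a matching signature says nothing about whether the contents are a valid Word or PowerPoint document.

```python
# The 8-byte signature that opens every OLE2 compound file.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def is_ole2(path):
    """Return True if the file starts with the OLE2 signature."""
    with open(path, "rb") as fh:
        return fh.read(8) == OLE2_MAGIC
```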
 
