Read Office Files from C++.

miztaken

Hi there,
I want to read my Office files (DOC, XLS and PPT) using ole32.dll
and C++ or C++.NET.
I have googled a bit and found that I can read the contents of a DOC
file by using Storage and Stream objects, that is, through ole32.dll.
But I don't know how to read other things like embedded attachments.
Actually, I want to dump all the embedded attachments of a DOC file to my
hard drive for further processing.

So can anyone please help me get the info I need?
I am unable to find any tutorial or reference on ole32.dll and how to
use it to extract embedded attachments from a DOC file.
Please help me, I would be very grateful.


Thank you
miztaken
 
Pascal J. Bourguignon

Victor Bazarov said:
miztaken said:
I want to read my Office files (DOC, XLS and PPT) using ole32.dll
and C++ or C++.NET.
[..]
So can any one please help me get the info i need.
[..]

What you need is to post to the right newsgroup. Please look for any
newsgroup with .ole. in its name, preferably also with 'microsoft' in
it. Your problem has really nothing to do with the C++ *language* and
everything to do with the way Microsoft organizes its document files.

Well, clc++ is ok.
Just start with:

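#include <fstream>

std::ifstream officeFile;
// Binary mode, so that no byte of the document is translated on the way in.
officeFile.open("test.doc", std::ios::in | std::ios::binary);
while (officeFile.good()) {
    doSomethingWithNextMSWordByte(officeFile.get());
}
officeFile.close();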


Then we could discuss how we could represent with C++ classes a
structured document, etc..
 
Stefan Ram

Victor Bazarov said:
of the MS Office-specific formats. (not to doubt that it's all possible
to do in plain C++ if the format layout is known and available)

Yes. One creates COM objects (possibly, IIRC, the call is
something like »CoInitialize« and then »CoCreateInstance«),
then obtains interfaces (»QueryInterface«?) and then sends
commands to them. This can be done in Assembler, C, and C++.
Some of these objects even have an inbuilt documentation
(IIRC, a »typelib«).

But possibly, one needs to have Office installed to do this.

I always liked COM because it puts emphasis on interfaces and
allows binary code from multiple languages to interoperate.

A problem with it seems to have been the reference counting
for garbage detection. But still, I would have hoped that one
had tried to improve this aspect instead of dropping it. (I believe
that today, for new projects, dotnet is encouraged instead of COM
by Microsoft, Inc.)
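
To make the interface and reference-counting idea concrete, here is a minimal sketch in portable C++. It is not the real COM API: `IUnknownLike`, `IDocumentLike`, `FakeDocument`, `InterfaceId` and `createFakeDocument` are hypothetical names invented for this illustration, and real COM identifies interfaces by GUID and reports errors through HRESULT values. The sketch only assumes the general pattern described above: a factory hands out an object, the caller asks it for an interface, and the object frees itself when the last reference is released.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical, simplified stand-ins for COM ideas. None of these names are
// the real Windows API (real COM uses IUnknown, GUIDs and HRESULT codes).
enum class InterfaceId { Unknown, Document, Spreadsheet };

struct IUnknownLike {
    virtual bool queryInterface(InterfaceId id, void** out) = 0;
    virtual unsigned addRef() = 0;
    virtual unsigned release() = 0;
protected:
    ~IUnknownLike() = default;  // objects are destroyed only via release()
};

struct IDocumentLike : IUnknownLike {
    virtual std::string title() const = 0;
protected:
    ~IDocumentLike() = default;
};

class FakeDocument final : public IDocumentLike {
public:
    static int liveObjects;  // lets a caller observe when objects are freed

    explicit FakeDocument(std::string title) : title_(std::move(title)) {
        ++liveObjects;
    }

    bool queryInterface(InterfaceId id, void** out) override {
        assert(out != nullptr);
        if (id == InterfaceId::Unknown) {
            *out = static_cast<IUnknownLike*>(this);
        } else if (id == InterfaceId::Document) {
            *out = static_cast<IDocumentLike*>(this);
        } else {
            *out = nullptr;  // interface not supported by this object
            return false;
        }
        addRef();  // every interface pointer handed out owns one reference
        return true;
    }

    unsigned addRef() override { return ++refs_; }

    unsigned release() override {
        const unsigned remaining = --refs_;
        if (remaining == 0) delete this;  // last reference gone
        return remaining;
    }

    std::string title() const override { return title_; }

private:
    ~FakeDocument() { --liveObjects; }

    unsigned refs_ = 1;  // the creator holds the first reference
    std::string title_;
};

int FakeDocument::liveObjects = 0;

// Plays the role of a factory call: returns an object with one reference.
IUnknownLike* createFakeDocument(const std::string& title) {
    return new FakeDocument(title);
}
```

A caller would obtain the object from `createFakeDocument`, call `queryInterface` with `InterfaceId::Document`, cast the returned pointer to `IDocumentLike*`, and call `release()` once on each pointer it holds. Forgetting one of those calls leaks the object, which is the kind of trouble with reference counting mentioned above.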
 
Stefan Ram

But possibly, one needs to have Office installed to do this.

One also can access the files directly, but reading and
writing their format correctly can be a lot of effort.
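
As a first step towards direct access, the following sketch checks whether a buffer starts with a compound file header and pulls a few fields out of it, without ole32.dll. The offsets follow the published compound file binary format (signature at 0, major version at 26, byte-order mark at 28, sector shift at 30, mini sector shift at 32, first directory sector at 48). The names `CfbHeaderInfo`, `readU16`, `readU32` and `parseCfbHeader` are my own, and the shift sanity check is an assumption of this sketch, not a rule taken from the specification.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// A few fields from the 512-byte compound file header (names are my own).
struct CfbHeaderInfo {
    std::uint16_t majorVersion = 0;
    std::uint32_t sectorSize = 0;
    std::uint32_t miniSectorSize = 0;
    std::uint32_t firstDirectorySector = 0;
};

// Little-endian readers; the caller guarantees the bytes are in range.
std::uint16_t readU16(const std::vector<unsigned char>& b, std::size_t off) {
    assert(off + 2 <= b.size());
    return static_cast<std::uint16_t>(b[off] | (b[off + 1] << 8));
}

std::uint32_t readU32(const std::vector<unsigned char>& b, std::size_t off) {
    assert(off + 4 <= b.size());
    return static_cast<std::uint32_t>(b[off]) |
           (static_cast<std::uint32_t>(b[off + 1]) << 8) |
           (static_cast<std::uint32_t>(b[off + 2]) << 16) |
           (static_cast<std::uint32_t>(b[off + 3]) << 24);
}

// Returns true and fills `out` if `bytes` starts with a plausible header.
bool parseCfbHeader(const std::vector<unsigned char>& bytes,
                    CfbHeaderInfo& out) {
    static const unsigned char kSignature[8] =
        {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};
    if (bytes.size() < 512) return false;
    if (std::memcmp(bytes.data(), kSignature, 8) != 0) return false;
    if (readU16(bytes, 28) != 0xFFFE) return false;  // byte-order mark

    const std::uint16_t sectorShift = readU16(bytes, 30);
    const std::uint16_t miniShift = readU16(bytes, 32);
    // Sanity check chosen for this sketch, to avoid absurd shift values.
    if (sectorShift > 16 || miniShift >= sectorShift) return false;

    out.majorVersion = readU16(bytes, 26);
    out.sectorSize = 1u << sectorShift;
    out.miniSectorSize = 1u << miniShift;
    out.firstDirectorySector = readU32(bytes, 48);
    return true;
}
```

With the header parsed, sector n of the file begins at byte offset (n + 1) * sectorSize, so `firstDirectorySector` tells you where the directory of storages and streams starts. Following the allocation tables from there is the part that takes the real effort.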
 
miztaken

So how do I start?
Could anyone provide me with the structure definition of a DOC file
(Office file)?
I am able to get the content of a DOC file using the IStream and IStorage
objects of ole32.dll.
But when parsing, there are different object types and I don't know
their offsets.
So can anyone guide me on this?

Your help is greatly appreciated.

Thank You
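
On the question of object types and offsets: at the container level, each storage or stream is described by a 128-byte directory entry, and the sketch below decodes one from a raw directory buffer. The offsets are those of the published compound file binary format (UTF-16 name at 0, name length in bytes at 64, object type at 66, child ID at 76, starting sector at 116, stream size at 120). `CfbDirEntry`, `readLe` and `parseDirEntry` are hypothetical names for this illustration. As far as I recall, Word keeps embedded OLE objects under a storage named "ObjectPool", but treat that as an assumption to verify against your own files. The layout inside a stream such as "WordDocument" is a separate, Word-specific matter that this sketch does not cover.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// One decoded directory entry (names are my own).
struct CfbDirEntry {
    std::u16string name;
    std::uint8_t objectType = 0;  // 0 unused, 1 storage, 2 stream, 5 root
    std::uint32_t childId = 0;    // first entry inside a storage
    std::uint32_t startSector = 0;
    std::uint64_t streamSize = 0;
};

// Reads `count` little-endian bytes starting at `off`.
std::uint64_t readLe(const std::vector<unsigned char>& b, std::size_t off,
                     std::size_t count) {
    assert(count >= 1 && count <= 8 && off + count <= b.size());
    std::uint64_t value = 0;
    for (std::size_t i = count; i-- > 0;) value = (value << 8) | b[off + i];
    return value;
}

// Decodes entry number `index` from a buffer holding directory sectors.
bool parseDirEntry(const std::vector<unsigned char>& dir, std::size_t index,
                   CfbDirEntry& out) {
    const std::size_t base = index * 128;
    if (base + 128 > dir.size()) return false;

    // Name length is in bytes and includes the 2-byte terminator.
    const std::size_t nameBytes =
        static_cast<std::size_t>(readLe(dir, base + 64, 2));
    if (nameBytes > 64 || nameBytes % 2 != 0) return false;

    out.name.clear();
    for (std::size_t i = 0; i + 2 < nameBytes; i += 2) {
        out.name.push_back(static_cast<char16_t>(readLe(dir, base + i, 2)));
    }
    out.objectType = dir[base + 66];
    out.childId = static_cast<std::uint32_t>(readLe(dir, base + 76, 4));
    out.startSector = static_cast<std::uint32_t>(readLe(dir, base + 116, 4));
    out.streamSize = readLe(dir, base + 120, 8);
    return true;
}
```

Walking all entries and checking `objectType` separates storages (containers) from streams (data), which is roughly what enumerating an IStorage gives you through ole32.dll.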
 
miztaken

Actually, I have posted there as well, but that group seems to have
very little activity.
Thanks anyways for your suggestion.
 
miztaken

OK, here is the thing:

1. Before I knew this wasn't the group for my question, I had already
posted my question, and after that I posted it on a few more groups as well.
So where does that electric oven fit in here?

2. And about the light thing:
Since we can use ole32.dll from C++, it seems totally sensible to
me to hope that someone here has done this before, and if they have,
then it's no longer about Office files but about compound
files.

What do you think?
 
ManicQin

miztaken said:
OK, here is the thing:

1. Before I knew this wasn't the group for my question, I had already
posted my question, and after that I posted it on a few more groups as well.
So where does that electric oven fit in here?

2. And about the light thing:
Since we can use ole32.dll from C++, it seems totally sensible to
me to hope that someone here has done this before, and if they have,
then it's no longer about Office files but about compound
files.

What do you think?

Miztaken, mon ami, don't bark at moderators, they bite back. ;) (No
offence, Victor.)
You can debate the relevance of your post as much as you want,
but it won't help you; as Victor said, this is a C++ *language* group
(half of the people here don't even know what Office is :) ).
Feel free to post questions about the standard, design issues, UB, etc.
Even though it seems like a "narrow region of interest", the group's
hands are always full.

Try:
comp.os.ms-windows.programmer.win32
http://groups.google.com/group/comp.os.ms-windows.programmer.win32/topics?hl=en
Search codeproject.com or even MSDN (my favourite for this kind of question
is the vintage 10/01 edition, not the recent ones),
or try e-mailing that nice Stefan Ram.

In conclusion: if you keep on with this post, you will get inadequate
answers and a lot of ranting.
 
Frank Birbacher

Hi!
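Pascal J. Bourguignon said:
Just start with:

#include <fstream>

std::ifstream officeFile;
// Binary mode, so that no byte of the document is translated on the way in.
officeFile.open("test.doc", std::ios::in | std::ios::binary);
while (officeFile.good()) {
    doSomethingWithNextMSWordByte(officeFile.get());
}
officeFile.close();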

Which is actually a bad example, because .good() does not indicate that
.get() will succeed on the next call. But if
doSomethingWithNextMSWordByte can cope with the EOF value, everything is
fine, I think.

Regards, Frank
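
For comparison, here is a sketch of a loop that tests the read itself instead of the stream state before the read. It uses the `get(char&)` overload, which returns the stream, so the loop body only runs for bytes that were actually read and a 0xFF byte cannot be mistaken for the EOF value. `readAllBytes` is a hypothetical helper name, and collecting the bytes into a vector is just one choice for this illustration.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// Reads the whole file into `out`. Returns false if the file cannot be
// opened or if reading stopped for a reason other than end of file.
bool readAllBytes(const std::string& path, std::vector<unsigned char>& out) {
    assert(!path.empty());  // precondition: the caller supplies a file name
    out.clear();

    std::ifstream in(path, std::ios::in | std::ios::binary);
    if (!in) return false;

    char c;
    while (in.get(c)) {  // true only when a byte was really extracted
        out.push_back(static_cast<unsigned char>(c));
    }
    return in.eof();
}
```

Calling `readAllBytes("test.doc", bytes)` and then handing each element of `bytes` to the per-byte function gives the same effect as the quoted loop, minus the extra call with the EOF value.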
 
Jim Langston

miztaken said:
So how do I start?
Could anyone provide me with the structure definition of a DOC file
(Office file)?
I am able to get the content of a DOC file using the IStream and IStorage
objects of ole32.dll.
But when parsing, there are different object types and I don't know
their offsets.
So can anyone guide me on this?

This is OT here, and you should be directed to microsoft.*, but I will give a
pointer, because in microsoft.* you will be directed to Microsoft-specific
solutions.

Look at OpenOffice. It's open source, freely downloadable, and will open Office
documents (.doc, .xls, etc.). I'm not sure whether they tie into OLE or not, but
you can check them out. If you don't get an answer there,
microsoft.public.vc.language would be the way to go for the Microsoft-specific
answers.
 
