Search inside PDF and CHM files

Ya Ya

I have a folder with a lot of PDF and CHM files.
I would like to develop an ASP.NET application that lets users
search the content of those files.
How do I search inside these types of files?

Thanks for your time

 
Mark Fitzpatrick

Adobe provides an add-on for Indexing Service in the form of a DLL (the PDF
IFilter) that enables Indexing Service to search PDF files. That would let you
use Indexing Service to search those files easily. CHM files are a whole other
matter, though, and I'm not sure what the best way to search them is.

Hope this helps,
Mark Fitzpatrick
Microsoft MVP - FrontPage
 
pdavis68

For .CHM files, there are several sites that describe the file format. The
format is officially undocumented, so use this information at your own
risk:

http://bonedaddy.net/pabs3/hhm/
http://www.speakeasy.org/~russotto/chm/
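As a rough illustration of what those notes describe, here is a minimal Python sketch that checks the "ITSF" signature at the start of a CHM file and reads a few header fields. The field layout (version, header length, language ID as little-endian 32-bit values after the signature) is an assumption taken from the unofficial notes, and the names ItsfHeader and read_itsf_header are made up for this example. It does not touch the actual content, which is stored LZX-compressed.

```python
import struct
from typing import NamedTuple


class ItsfHeader(NamedTuple):
    """A few fields from the start of a CHM file (hypothetical helper type)."""
    version: int
    header_length: int
    language_id: int


def read_itsf_header(data: bytes) -> ItsfHeader:
    """Parse the first 24 bytes of a CHM file.

    Assumed layout, per the unofficial format notes:
      offset 0x00: b"ITSF" signature
      offset 0x04: version (DWORD)
      offset 0x08: total header length (DWORD)
      offset 0x0C: unknown (DWORD)
      offset 0x10: timestamp (DWORD, ignored here)
      offset 0x14: Windows language ID (DWORD)
    """
    if len(data) < 24 or data[:4] != b"ITSF":
        raise ValueError("not a CHM file: missing ITSF signature")
    version, header_length, _unknown, _timestamp, language_id = struct.unpack_from(
        "<5I", data, 4
    )
    return ItsfHeader(version, header_length, language_id)
```

In practice you would pass in the first bytes of each file, for example open(path, "rb").read(24), and skip anything that raises ValueError.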

As for PDF files, Adobe documents the format. I'm not sure exactly where the
specification is, but it's out there. You can also take a look at the open
source project Xpdf. It is a PDF viewer for the X Window System on Unix, and
you should be able to learn quite a bit from it.

Both of these formats are fairly complex, so there's no simple way to get at
what you want going this route.
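To give a feel for why parsing PDF by hand is not simple, here is a deliberately naive Python sketch. It assumes the text lives in Flate-compressed content streams and is drawn with plain "(...) Tj" operators. Real PDFs often break those assumptions (TJ arrays, custom font encodings, object streams, encryption), so treat this as a demonstration of the idea, not a working extractor. The function names are made up for this example.

```python
import re
import zlib

# Bytes between "stream" and "endstream" keywords.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)
# A literal string followed by the Tj (show text) operator.
# Handles backslash escapes but not nested parentheses.
TEXT_RE = re.compile(rb"\(((?:\\.|[^\\()])*)\)\s*Tj")


def naive_pdf_text(data: bytes) -> str:
    """Collect the literal strings shown with Tj in each stream (naive)."""
    chunks = []
    for match in STREAM_RE.finditer(data):
        raw = match.group(1)
        try:
            # decompressobj tolerates a stream whose tail was trimmed by the regex.
            content = zlib.decompressobj().decompress(raw)
        except zlib.error:
            content = raw  # not Flate data; scan the bytes as-is
        for text in TEXT_RE.findall(content):
            # Escape sequences are left undecoded in this sketch.
            chunks.append(text.decode("latin-1"))
    return " ".join(chunks)


def pdf_contains(data: bytes, term: str) -> bool:
    """Case-insensitive search over the naively extracted text."""
    return term.lower() in naive_pdf_text(data).lower()
```

For anything beyond an experiment, an existing extractor such as the Xpdf tools or an IFilter is a much safer choice than extending this by hand.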

Pete Davis
 
