Extract images from PDF files

writeson

Hi all,

I've searched with Google quite a bit, but haven't found anything
like what I'm looking for. Is there a Python library that will extract
images from PDF files? My ultimate goal is to pull the images out, use
the PIL library to reduce their size, and rebuild another PDF file
that's essentially a "thumbnail" version of the original: the same
content, but smaller in file size.
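
A minimal sketch of the resizing step only, assuming the images have
already been extracted to ordinary image files and that Pillow (the
maintained fork of PIL) is installed. The names scaled_size and
shrink_image and the 200-pixel limit are hypothetical, chosen for
illustration; extraction and rebuilding the PDF are not covered here.

```python
def scaled_size(width, height, max_side):
    """Return (w, h) scaled so the longer side is at most max_side.

    The aspect ratio is kept, and images that already fit are not enlarged.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    # Clamp to 1 so extremely thin images never end up with a zero dimension.
    return max(1, round(width * scale)), max(1, round(height * scale))


def shrink_image(src, dst, max_side=200):
    """Write a reduced copy of the image file at src to dst.

    Hypothetical helper: max_side=200 is an arbitrary illustrative default.
    """
    # Imported here so scaled_size can be used without Pillow installed.
    from PIL import Image

    with Image.open(src) as img:
        new_size = scaled_size(img.width, img.height, max_side)
        # LANCZOS is a high-quality downsampling filter.
        img.resize(new_size, Image.LANCZOS).save(dst)
```

Calling shrink_image("page1.jpg", "page1_small.jpg") would write a copy
whose longer side is at most 200 pixels; both file names are placeholders.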

We've been using imagick to extract the images, but it's difficult to
script and slow to process the input PDF. Can someone suggest
something better?

Thanks in advance,
Doug
 
writeson

David,

Thanks for your reply, I'll take a look at pdftohtml and see if it
suits my needs.

Thanks!
Doug
 
