Working with PDFs?

Guest · Aug 15, 2010

Does anyone know whether it's possible to work with PDF documents
in Python? I'd like to do the following:

- Pull out text from each PDF page (to search for specific words)
- Combine separate PDF documents into one document
- Add bookmarks (with destination settings)

A few packages I've been looking at are pdfminer and pyPDF, among
others from this PyPI search:
http://pypi.python.org/pypi?:action=search&term=pdf&submit=search

Originally, I was using AppleScript and JavaScript to do this in Acrobat.
But now Acrobat 9 has broken this process and I can't seem to make it
work. I'd like to find other workarounds instead of having to rely on
Adobe.

Thanks for your help.

Jay

Anssi Saari · Aug 18, 2010

> - Pull out text from each PDF page (to search for specific words)
> - Combine separate pdf documents into one document
> - Add bookmarks (with destination settings)

PDF Shuffler is a Python app which does PDF merging and splitting very
well. I don't think it does anything else, but maybe that's where your
own code comes in?
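
For the parts a GUI tool won't cover, here is a rough sketch of how the three tasks from the question might look in Python. It assumes the third-party pypdf package (the present-day successor of the pyPDF library mentioned above) is installed; the function names and any file names passed in are made up for illustration, and the bookmarks only point at a page, without finer destination settings such as zoom. Treat it as a starting point rather than a tested solution.

```python
def pages_matching(page_texts, words):
    """Return 1-based page numbers whose text contains any of the words.

    Pure Python: works on a list of already-extracted page strings.
    The match is a case-insensitive substring test.
    """
    wanted = [w.lower() for w in words]
    hits = []
    for number, text in enumerate(page_texts, start=1):
        lowered = (text or "").lower()  # tolerate pages with no text
        if any(w in lowered for w in wanted):
            hits.append(number)
    return hits


def extract_page_texts(path):
    """Return a list holding the extracted text of each page of one PDF."""
    # Assumption: pypdf is installed (pip install pypdf).
    from pypdf import PdfReader

    reader = PdfReader(path)
    return [page.extract_text() or "" for page in reader.pages]


def merge_with_bookmarks(paths, output_path):
    """Concatenate PDFs, adding one bookmark per input file.

    Each bookmark is titled with the input path (an arbitrary choice)
    and points at the first page that file contributed.
    """
    # Assumption: pypdf is installed (pip install pypdf).
    from pypdf import PdfReader, PdfWriter

    writer = PdfWriter()
    for path in paths:
        first_page_index = len(writer.pages)  # 0-based index in the output
        pages = PdfReader(path).pages
        if len(pages) == 0:
            continue  # nothing to bookmark for an empty document
        for page in pages:
            writer.add_page(page)
        writer.add_outline_item(path, first_page_index)
    with open(output_path, "wb") as handle:
        writer.write(handle)
```

Something like pages_matching(extract_page_texts("report.pdf"), ["invoice"]) would then list the pages to look at, and merge_with_bookmarks(["a.pdf", "b.pdf"], "combined.pdf") would write the merged file. Text extraction quality varies a lot between PDFs, so the search step may miss words in scanned or oddly encoded documents.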
