January Weiner
Hello,
I have this cool little script for content analysis of some documents.
Currently, it takes plain text, finds certain phrases, and then creates an
HTML page that shows each phrase in bold, in context.
However, the documents are all originally PDFs. Right now I just run
pdftotext first and proceed from there. Of course, all images, formatting,
etc. are lost, and if you want to trace precisely what occurs where,
you need to go back to the original document (preferably with a printout
and a pencil).
What I would like to do is take the PDF file directly and modify it so
that the phrases are shown in red.
Can this be done? If so, could you point me to which of the numerous PDF
Perl modules I should use? Or, even better, give me examples of how this
can be done?
Regards,
January
--