Problem when searching for PDFs with Indexing Service in ASP-solution.

Martin Emanuelsson

Hello,

I have a problem with a small ASP solution that searches for PDF documents
with Indexing Service.

For some files in the search result I get gibberish returned, such as
******************************************************
I$OYDURSURGXFWVSURGXFHGLQ0H[LFR/DERUDWRU\5HSRUW/DERUDWRU\2UGHUHU5HVSRQVLEOH6
WDWXV)HPLQLQH*27-RKDQVVRQ6XVDQQH)LQDO'DWH)LQDO3URMHFW3URMHFW1DPH&RVWSODFH9HU
1R$9$523'36XPPDU\7KHUHVXOWV5XQ2II7KHSURGXFWVWKDWZHUHSURGXFHGZHUHEDG7KHVXUIDF
HPDWHULDOZDVK\GURSKRELFDQGDOOWKHSURGXFWVKDGUXQRII6HHSLFWXUH7KHSURGXFWVWKDWSU
RGXFHGZHUHJRRG,WZDVWKHVDPHSURGXFWVWKDWSURGXFHGEXWZLWKVSXQERQG%XURSHVXUIDFHPD
WHULDO7KHSURGXFWVKDGIDVWLQOHWJRRGVSUHDGLQJLQWKHFRUHDQGQRUXQRII'RVLPDW7KHSURG
XFWSURGXFHGZDVEDG6HYHUDORIWKHSURGX
******************************************************

while other files return "good text" like this:
******************************************************
Feminine 865106-Date Final Projectname Orderer 2004-06-02 ALVARO PDP
Johansson Susanne Distributed to: Internal test Alvaro v. 20-21 Summary
Mission Background Comments Conclusion Test methods Test objects Sample No:
20040527-001-01 Alvaro Labrep 2_2.rep SEBJOIS 2004-03-17 Printed by:
labreporter 2004-06-02 15:51:51Laboratory Report No:20040527-001 Rev: 1
Status:Final Brand /Name SABA Ultr
******************************************************

The only difference I can find between these files is that they seem to have
been saved with different PDF versions (looking in File --> Document
Properties of the files).

The "bad" file has the following information there:
Creator: Windows NT 4.0
Producer: Acrobat Distiller Daemon 3.01 for HP-UX A.09.01 and later (HPPA)
PDF version: 1.1 (Acrobat 2.x)

The "good" file has the following information:
Creator: AdobePS5.dll Version 5.1.2
Producer: Acrobat Distiller 4.0 for Windows
PDF version: 1.3 (Acrobat 4.x)

A small part of the code looks like this:
******************************************************
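Set objConnection = Server.CreateObject("ADODB.Connection")
Set objIndex = Server.CreateObject("ADODB.Recordset")
objConnection.ConnectionString = "Provider=MSIDXS;"
objConnection.Open

' The search condition that follows WHERE is not part of this excerpt
strSQL = "SELECT Characterization, Filename, Path " & _
         "FROM se_got_data.limspdf..SCOPE() WHERE "

objIndex.Open strSQL, objConnection

' Write out the abstract that Indexing Service stored for each hit
Do Until objIndex.EOF
    Response.Write objIndex("Characterization")
    objIndex.MoveNext
Loop

' Release the recordset before the connection
objIndex.Close
Set objIndex = Nothing
objConnection.Close
Set objConnection = Nothing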
******************************************************

The problem seems to lie in the Characterization value returned for PDFs of
the earlier version. Has anyone experienced anything like this before?

Best regards
Martin Emanuelsson
Gothenburg, Sweden
 
me

You need to install the Plugin. Acquire it from Adobe.
 
Hilary Cotter

Could you post these problem docs here?

--
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html


 
Martin Emanuelsson

If you mean the Adobe PDF IFilter 5.0 plugin, it is already installed on
the server, so that is not the problem. Unless there is some sort of
setting that needs to be configured for the plugin?

/Martin


 
Martin Emanuelsson

I have to check with the business people where I work whether that is OK or
whether it's all confidential. I'll get back to you as soon as possible.

/Martin


 
Martin Emanuelsson

I tried posting two test files to this newsgroup but got an error message
saying the message was too big (two attachments with a total size of about
130 KB).

I could send them directly to you if that's OK, Hilary? And to anyone else
interested, for that matter.

/Martin



 
Hilary Cotter

Sure.
 
Hilary Cotter

The gibberish is in these docs themselves.

There isn't a whole lot you can do, other than convert all the docs to the
"good" format. You might want to talk to Adobe.

--
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
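
A note on the "bad" sample from the first post: the text does not look random. Adding 29 to each character code turns `$OYDUR` into `Alvaro` and `0H[LFR` into `Mexico`. That suggests the older Distiller files may store glyph codes at a fixed offset from ASCII, and that the filter passes them through without mapping them back. That cause is a guess and is not confirmed anywhere in this thread. Below is a minimal Python sketch, meant to be run outside the ASP page, for checking whether other abstracts follow the same pattern. The function name `decode_shifted` and the offset of 29 are assumptions drawn only from the sample posted here.

```python
def decode_shifted(text: str, offset: int = 29) -> str:
    """Shift every character code up by `offset`.

    The default of 29 is inferred from the single sample in this thread;
    other files may use a different mapping, or none at all.
    """
    return "".join(chr(ord(ch) + offset) for ch in text)


if __name__ == "__main__":
    # Start of the "bad" abstract from the first post, without its leading "I"
    sample = "$OYDURSURGXFWVSURGXFHGLQ0H[LFR"
    print(decode_shifted(sample))  # prints AlvaroproductsproducedinMexico
```

The words run together because a space (code 32) would sit at code 3 in the shifted text, a control character that appears to be dropped before the abstract is stored. Even if the pattern holds for every old file, converting them to the "good" format is probably still the cleaner fix, since the indexed words themselves would presumably be shifted as well.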


 
