Search text inside pdf files [Archive]

kalkito

08-28-2012, 03:22 AM

How can I do this? I tried everything but search only returns the name of the file.
Thanks!

$bill

08-28-2012, 05:31 AM

Could this be the reason?

"Parsing text from .pdf files available in the professional edition only."

kalkito

08-28-2012, 05:55 AM

I have the pro version.

$bill

08-28-2012, 06:31 AM

If I search for text that exists within a pdf file, then UR returns a list of Items- including the pdf containing the text. (Of course, this assumes that the word is not just stored as an image.)

I don't have UR configured to display or edit pdf files within UR (the default), so if I want to go to the text's location within the Item, I must open the pdf externally and search within the document.

To edit or display documents internally, see:
http://kinook.com/UltraRecall/Manual/internaloleedit.htm

Am I any closer to being helpful, or do I misunderstand what you are trying to accomplish?

kalkito

08-28-2012, 06:55 AM

Thank you for your help.

"If I search for text that exists within a pdf file, then UR returns a list of Items- including the pdf containing the text."

It's just not working for me. I tried many pdf files (mostly articles, which are definitely text).

For example, I drag a file into the notes and choose "store contents". Then I press Ctrl+Shift+S to search and enter a word that is present in the pdf. The word is "demand". Result: "0 matching items".

My settings are set to open pdf externally and none of the boxes in quick search is checked.

Apologies for my poor English.
thx

kinook

08-28-2012, 07:05 AM

Does it work for this PDF document?

http://www.inkwelleditorial.com/pdfSample.pdf

After importing that file, I can search on terms like freelancing, staffing, etc., which are in the document text but not the title.

If you select the parent item of the PDF file you imported and display the Item Text column (F9), that column will show the text (if any) that UR found in the document.

Note that some PDF files aren't parseable for text content. One PDF text parser vendor indicated, "Some PDFs will simply never parse the way you would expect them to for various reasons. There is NO PDF to text converter in the world that can work with every PDF file ever created. Even Adobe itself cannot convert all PDFs to text properly." The PDF parser we use works with most files we have tested, but I believe that if the text in the PDF file is encrypted or stored in a non-standard format, most tools can't parse text from them.
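For anyone who wants a quick check outside UR of whether a file might fall into the encrypted category, here is a rough sketch. It is only a heuristic and has nothing to do with the parser UR actually uses: it assumes that an encrypted PDF carries an `/Encrypt` entry in its trailer, and simply looks for those bytes. The function name `looks_encrypted` is made up for this example. A file that passes the check can still be unparseable for other reasons (for example, scanned pages stored as images).

```python
def looks_encrypted(pdf_bytes: bytes) -> bool:
    """Heuristic only: report whether the raw bytes contain an /Encrypt entry.

    Assumption: encrypted PDFs reference an encryption dictionary via
    /Encrypt in the trailer. False positives are possible if the same
    bytes happen to appear elsewhere in the file.
    """
    # The header is normally at the very start; be lenient about leading junk.
    if b"%PDF-" not in pdf_bytes[:1024]:
        raise ValueError("this does not look like a PDF file")
    return b"/Encrypt" in pdf_bytes


def file_looks_encrypted(path: str) -> bool:
    """Convenience wrapper that reads the file from disk first."""
    with open(path, "rb") as handle:
        return looks_encrypted(handle.read())
```

To try it, call `file_looks_encrypted` with the path of the PDF that returns no search results.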

kalkito

08-28-2012, 07:11 AM

Still no results. I must be doing something wrong.

kinook

08-28-2012, 07:44 AM

Please send the info requested at http://www.kinook.com/Forum/showthread.php?t=3038

kinook

08-28-2012, 05:25 PM

Make sure you have .pdf in Tools | Options | Import | File extensions to parse text.

Chaz

01-30-2013, 04:22 PM

This seems related to this thread, though it's not identical.

If I do a search for specific words, e.g. "working directory", that I know exist in a PDF, it returns "0 found". If I click on the PDF, the words are highlighted. How does it know to highlight the words, yet it can't find the document?

kinook

01-30-2013, 04:26 PM

Not sure, maybe .pdf wasn't in Tools | Options | Import | File extensions to parse text when the document was imported? Please send the info requested at http://www.kinook.com/Forum/showthread.php?t=3038
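To illustrate why the import-time setting could matter, here is a toy model. It is not UR's code, and it assumes (as the reply above only suggests) that document text is parsed once, at import, and only for extensions in the parse list at that moment. The class `ToyStore` and all of its names are hypothetical.

```python
import os


class ToyStore:
    """Toy model of a store that parses document text only at import time."""

    def __init__(self, parse_extensions):
        # Stands in for the "File extensions to parse text" setting.
        self.parse_extensions = set(parse_extensions)
        self.items = {}  # file name -> text captured at import time

    def import_file(self, filename, document_text):
        ext = os.path.splitext(filename)[1].lower()
        # Text is kept only if the extension is in the list right now.
        self.items[filename] = document_text if ext in self.parse_extensions else ""

    def search(self, word):
        word = word.lower()
        return [name for name, text in self.items.items()
                if word in name.lower() or word in text.lower()]


store = ToyStore({".txt"})
store.import_file("article.pdf", "supply and demand")
print(store.search("demand"))       # nothing found: no text was stored

store.parse_extensions.add(".pdf")  # changing the setting afterwards...
print(store.search("demand"))       # ...does not help the existing item

store.import_file("article.pdf", "supply and demand")  # import again
print(store.search("demand"))       # now the item is found
```

If UR behaves like this model, a document imported before .pdf was added to the list would need to be imported again before its text becomes searchable.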