Search text inside pdf files
kalkito
08-28-2012, 03:22 AM
How can I do this? I tried everything but search only returns the name of the file.
Thanks!
$bill
08-28-2012, 05:31 AM
Could this be the reason?
"Parsing text from .pdf files available in the professional edition only."
kalkito
08-28-2012, 05:55 AM
I have the pro version.
$bill
08-28-2012, 06:31 AM
If I search for text that exists within a pdf file, then UR returns a list of Items- including the pdf containing the text. (Of course, this assumes that the word is not just stored as an image.)
I don't have UR configured to display or edit pdf files within UR (the default), so if I want to go to the text's location within the Item, I must open the pdf externally and search within the document.
To edit or display pdf files internally, see:
http://kinook.com/UltraRecall/Manual/internaloleedit.htm
Am I any closer to being helpful, or do I misunderstand what you are trying to accomplish?
kalkito
08-28-2012, 06:55 AM
Thank you for your help.
"If I search for text that exists within a pdf file, then UR returns a list of Items- including the pdf containing the text."
It's just not working for me. I tried many pdf files (mostly articles, which are text, no doubt about that).
For example, I drag a file to notes and choose "store contents". Then I press Ctrl+Shift+S to search and enter a word that is present in the pdf. The word is "demand". Result: "0 matching items".
My settings are set to open pdf externally and none of the boxes in quick search is checked.
Apologies for my poor English.
thx
kinook
08-28-2012, 07:05 AM
Does it work for this PDF document?
http://www.inkwelleditorial.com/pdfSample.pdf
After importing that file, I can search on terms like freelancing, staffing, etc., which are in the document text but not the title.
If you select the parent item of the PDF file you imported and display the Item Text column (F9), that column will show the text (if any) that UR found in the document.
Note that some PDF files aren't parseable for text content. One PDF text parser vendor indicated, "Some PDFs will simply never parse the way you would expect them to for various reasons. There is NO PDF to text converter in the world that can work with every PDF file ever created. Even Adobe itself cannot convert all PDFs to text properly." The PDF parser we use works with most files we have tested, but I believe that if the text in the PDF file is encrypted or stored in a non-standard format, most tools can't parse text from them.
kalkito
08-28-2012, 07:11 AM
Still no results. I must be doing something wrong.
kinook
08-28-2012, 07:44 AM
Please send the info requested at http://www.kinook.com/Forum/showthread.php?t=3038
kinook
08-28-2012, 05:25 PM
Make sure you have .pdf in Tools | Options | Import | File extensions to parse text.
This seems to go along with this chat a bit, but is not identical.
If I do a search for specific words, e.g. "working directory", that I know exist in a PDF, it returns "0 found". If I click on the PDF, the words are highlighted. How does it know to highlight the words, yet it can't find the document?
kinook
01-30-2013, 04:26 PM
Not sure, maybe .pdf wasn't in Tools | Options | Import | File extensions to parse text when the document was imported? Please send the info requested at http://www.kinook.com/Forum/showthread.php?t=3038