Kinook Software Forum


pereh 12-20-2008 10:38 AM

German language / Umlaute
 
Hello,

I am new to UR and just taking my first steps, and I have a question that I could not find an answer to in the documentation or the forum.
Most of the documents I use are in German, so they contain a lot of special characters like Ä, ä, Ö, etc. (umlauts). Not only do these characters not appear in the normal text display of the items, but words containing them are also ignored when the automatic keywords are built. As a result, searching can sometimes become pointless.
Is there a way to tell UR to handle these characters like normal ones?

Thanks for your help,
Peter.

hartmut 12-21-2008 02:43 AM

I have now tested this and found that there is no problem with German umlauts in item titles or in plain text files, HTM files, or Word documents imported into Ultra Recall.
The umlauts are displayed in PDF files but cannot be searched for, not even with wildcards like "?" and "*".
Searching within the PDF document using the PDF reader's own search function works well.
In the Item Keyword window, the umlauts are missing from every word in the keyword list that should contain them, but as far as I have seen only for PDF documents and not for the other kinds of documents mentioned above.
For example, "Küche" is found with the search value "Küche" in all documents except PDFs. In a PDF it is found with "Kche" and not with "K?che".

Hartmut
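
The "Kche" versus "K?che" result is what one would expect if the PDF keyworder drops characters it cannot decode and the "?" wildcard has to match exactly one character. Here is a minimal Python sketch of that reasoning. The `drop_non_ascii` helper is hypothetical and only imitates the observed symptom; it is not how Ultra Recall builds keywords. `fnmatchcase` stands in for UR's wildcard matching, on the assumption that "?" means exactly one character.

```python
from fnmatch import fnmatchcase


def drop_non_ascii(word: str) -> str:
    """Hypothetical stand-in for a keyworder that loses non-ASCII characters."""
    return "".join(ch for ch in word if ch.isascii())


# The keyword that would be stored for a PDF containing "Küche".
keyword = drop_non_ascii("Küche")
print(keyword)

# The literal search value "Kche" matches the stored keyword.
print(fnmatchcase(keyword, "Kche"))

# "K?che" needs five characters, but the stored keyword has only four.
print(fnmatchcase(keyword, "K?che"))

# Had the umlaut been kept, the wildcard search would have matched.
print(fnmatchcase("Küche", "K?che"))
```

Under these assumptions the wildcard search fails because there is no character left in the keyword for "?" to match, not because the wildcard itself is broken.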

pereh 12-22-2008 08:37 AM

Hello Hartmut,

Thanks for your reply. You are right; I will have to search without umlauts.

kinook 12-22-2008 09:11 AM

There did seem to be a problem with capturing accented Latin characters when keywording PDF documents. The main download at http://www.kinook.com/Download/UltraRecallProEval.exe has been updated with a fix for this problem (UltraRecall.exe 3.5.3.1 in Help | About | Install Info after installing). You will need to re-import or synchronize (Item | Synchronize on the menu) PDF documents after installing to re-keyword them.

hartmut 12-22-2008 09:33 AM

Thank you for your prompt attention.

Hartmut

pereh 12-22-2008 11:05 AM

Hello to all at Kinook,

This was in fact the fastest answer I have ever received for any problem I have had with any kind of software! Great!
I installed the new version, and it is shown in 'About | Install Info', but now, instead of being left out, these special characters are replaced with even stranger ones:
"AbkŸrzungen" instead of 'Abkürzungen', "PortrŠt" instead of 'Porträt', "der Gro§e" instead of 'der Große'. Maybe I get these results because my XP is running with the locale setting "German / Germany"?

Best regards,
Peter.
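
The substitutions quoted in this post follow a recognizable pattern: ü, ä and ß have the byte values 0x9F, 0x8A and 0xA7 in the Mac OS Roman encoding, and those same bytes display as Ÿ, Š and § in Windows-1252. That suggests, though it does not prove, that text extracted as Mac OS Roman is being read as Windows-1252 somewhere in the PDF keywording path. Below is a small Python sketch of the round trip, using only the standard `cp1252` and `mac_roman` codecs. The `undo_mojibake` helper name is made up for illustration and is not part of UR.

```python
def undo_mojibake(text: str) -> str:
    """Re-encode as Windows-1252, then decode the same bytes as Mac OS Roman."""
    return text.encode("cp1252").decode("mac_roman")


# The three garbled keywords reported in the post.
for garbled in ["AbkŸrzungen", "PortrŠt", "der Gro§e"]:
    print(garbled, "->", undo_mojibake(garbled))
```

All three garbled words map back to the intended German spelling, which is consistent with an encoding mix-up rather than lost data.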

kinook 12-22-2008 12:08 PM

1 Attachment(s)
Apparently. It works ok in our testing here on English XP when configured for German locale, but we don't have a German XP to test with. You might be able to temporarily change to English locale when importing. Or you can get back the old behavior by unzipping and double-clicking the .reg file in the attached zip file and restarting UR.

pereh 12-22-2008 12:20 PM

OK, I am back to the old behaviour. Will there be a fix for this?

kinook 12-22-2008 01:10 PM

We will report the problem to the vendor of the PDF component.

And please ZIP and send a couple of problem PDF files to support@kinook.com so we can verify whether the problem is specific to your files. Thanks.

hartmut 12-22-2008 03:42 PM

I have the German XP and, as far as I can see, don't have a problem with PDFs.
Peter, did you follow these instructions from Kinook:
"You will need to re-import or synchronize (Item | Synchronize on the menu) PDF documents after installing to re-keyword."

I searched for PDF, selected everything in the search results window, and chose "Item | Synchronize".
Hartmut

pereh 12-23-2008 03:36 AM

1 Attachment(s)
Hello Hartmut,

I have tried both re-importing and synchronizing. I have now also reinstalled UR (the version mentioned above), but the problem is still there. Please find a page attached for testing.

Best regards,
Peter.

pereh 12-23-2008 04:50 AM

I just tested with PDF2TXT V3.2 and it worked fine.

pereh 12-23-2008 06:52 AM

I have now found a few PDFs in which the keywording is sometimes correct ("möglich") and sometimes wrong ("mglich") within the same document. I now suspect that it has something to do with the fonts. For files that only use fonts that Reader lists as 'Type 1' (embedded), the keywording is always wrong. For files that additionally use fonts listed as 'TrueType', the results are mixed. Maybe this is the right track for finding the error?

kinook 12-23-2008 09:17 AM

Please ZIP and send a .urd file containing all problem PDFs imported (stored) to support@kinook.com. Thanks.

kinook 12-23-2008 12:24 PM

It seems that our licensed version of the PDF2TXT component has some issues. We are trying to get a working version of the licensed component from the vendor.


All times are GMT -5.


Copyright © 1999-2023 Kinook Software, Inc.