German language / Umlaute
pereh
12-20-2008, 11:38 AM
Hello,
I am new to UR and just taking my first steps, and I have a question that I could not find an answer to in the documentation or the forum.
Most of the documents I use are in German, so they contain a lot of special characters like Ä, ä, Ö, etc. (umlauts). Not only are these characters missing from the normal text display of the items, but words containing them are also ignored when the automated keywords are built. As a result, searching can become pointless.
Is there a way to tell UR to handle these characters like normal ones?
Thanks for your help,
Peter.
hartmut
12-21-2008, 03:43 AM
I have tested this now and found that there is no problem with German umlauts in item titles or in plain text files, HTM files, or Word documents imported into Ultra Recall.
The umlauts are shown in PDF files but cannot be searched, not even with wildcards like "?" and "*".
Searching within the PDF document with the search function of the PDF reader works well.
In the Item Keyword window, the umlauts are missing from every word that contains them, but as far as I have seen only for PDF documents and not for the other kinds of documents mentioned above.
For example, "Küche" is found with the search value "Küche" in all documents except PDFs. In a PDF it is found with "Kche" but not with "K?che".
Hartmut
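The behaviour described above is consistent with a keyworder that silently drops non-ASCII characters before indexing. The following is only a minimal sketch of that idea, not Ultra Recall's actual code: the function names are hypothetical, and Python's `fnmatch` stands in for the search wildcards, whose exact semantics in UR may differ.

```python
import re
from fnmatch import fnmatchcase


def ascii_only_keywords(text):
    """Hypothetical keyworder that drops every non-ASCII character first."""
    stripped = "".join(ch for ch in text if ord(ch) < 128)
    return re.findall(r"[A-Za-z]+", stripped)


def unicode_keywords(text):
    """Unicode-aware variant: keeps any run of letters, including umlauts."""
    return re.findall(r"[^\W\d_]+", text)


sentence = "Die Küche ist groß"
print(ascii_only_keywords(sentence))  # the umlaut word is stored without its ü
print(unicode_keywords(sentence))     # the umlaut word is stored intact

# "?" must match exactly one character, so it cannot match a keyword
# from which the character has been removed entirely.
print(fnmatchcase("Kche", "K?che"))
print(fnmatchcase("Küche", "K?che"))
```

Under this assumption, the stored keyword is one character shorter than the real word, which would explain why "Kche" matches and "K?che" does not.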
pereh
12-22-2008, 09:37 AM
Hello Hartmut,
thanks for your reply. You are right; I will have to use the search without Umlaute.
kinook
12-22-2008, 10:11 AM
There did seem to be a problem with capturing accented Latin characters when keywording PDF documents. The main download at http://www.kinook.com/Download/UltraRecallProEval.exe has been updated with a fix for this problem (UltraRecall.exe 3.5.3.1 in Help | About | Install Info after installing). You will need to re-import or synchronize (Item | Synchronize on the menu) PDF documents after installing to re-keyword.
hartmut
12-22-2008, 10:33 AM
Thank you for your prompt attention.
Hartmut
pereh
12-22-2008, 12:05 PM
Hello to all at Kinook,
this was in fact the fastest answer I ever got for any problem I ever had with any kind of software! Great!
I installed the new version, and it is also displayed in 'About | Install Info', but now, instead of being left out, these special characters are replaced with even more special ones:
"AbkŸrzungen" instead of 'Abkürzungen', "PortrŠt" instead of 'Porträt', "der Gro§e" instead of 'der Große'. Maybe I get these results because my XP is running with the scheme "German / Germany"?
Best regards,
Peter.
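The substitutions quoted above for ü and ß are exactly what results when text written as Mac OS Roman bytes is read as Windows-1252. Whether the PDF component actually does this is an assumption; the thread does not confirm it. A minimal sketch using Python's standard `mac_roman` and `cp1252` codecs (the function names are made up for illustration):

```python
def misdecode(text, written_as="mac_roman", read_as="cp1252"):
    """Encode text in one code page and decode the bytes in another."""
    return text.encode(written_as).decode(read_as)


def repair(text, read_as="cp1252", written_as="mac_roman"):
    """Reverse the mix-up, assuming the guess about the code pages is right."""
    return text.encode(read_as).decode(written_as)


for word in ("Abkürzungen", "Porträt", "der Große"):
    garbled = misdecode(word)
    print(word, "->", garbled, "->", repair(garbled))
```

The same mapping turns ä into Š, ü into Ÿ, and ß into §. If the guess holds, the garbling is reversible, but the reliable fix is for the text extractor to use the correct encoding in the first place.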
kinook
12-22-2008, 01:08 PM
Apparently. It works ok in our testing here on English XP when configured for German locale, but we don't have a German XP to test with. You might be able to temporarily change to English locale when importing. Or you can get back the old behavior by unzipping and double-clicking the .reg file in the attached zip file and restarting UR.
pereh
12-22-2008, 01:20 PM
OK, I am back to the old behaviour. Will there be a fix for this?
kinook
12-22-2008, 02:10 PM
We will report the problem to the vendor of the PDF component.
And please ZIP and send a couple of problem PDF files to support@kinook.com so we can verify whether the problem is specific to your files. Thanks.
hartmut
12-22-2008, 04:42 PM
I have the German XP and don't have a problem with PDFs as far as I can see.
Peter, did you follow these instructions from Kinook:
"You will need to re-import or synchronize (Item | Synchronize on the menu) PDF documents after installing to re-keyword."
I searched for PDF, selected all items in the search results window, and ran Item | Synchronize.
Hartmut
pereh
12-23-2008, 04:36 AM
Originally posted by hartmut
I have the German XP and don't have a problem with PDFs as far as I can see.
Peter, did you follow these instructions from Kinook:
"You will need to re-import or synchronize (Item | Synchronize on the menu) PDF documents after installing to re-keyword."
I searched for PDF, selected all items in the search results window, and ran Item | Synchronize.
Hartmut
Hello Hartmut,
I have tried re-import and synchronize. Now I have reinstalled UR (the version mentioned above), but the problem is still there. Please find a page attached for testing.
Best regards,
Peter.
pereh
12-23-2008, 05:50 AM
I just tested with PDF2TXT V3.2 and it worked fine.
pereh
12-23-2008, 07:52 AM
Now I have found a few PDFs in which the keywording is sometimes correct ("möglich") and sometimes wrong ("mglich") within the same document. I now suspect that it has something to do with the fonts. For files that only use fonts that Reader lists as Type 1 (embedded), the keywording is always wrong. For files that additionally use fonts listed as TrueType, the results are mixed. Maybe this is the right track for finding the error?
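Mixed results like "möglich" and "mglich" in the same document can be spotted mechanically if the keyword list is available as plain text. The following is a rough diagnostic sketch only; the function name and the idea of exporting keywords to a Python list are assumptions, not a UR feature.

```python
def find_stripped_variants(keywords):
    """Return (intact, stripped) pairs where both forms occur in the list.

    A keyword counts as a stripped variant if it equals another keyword
    with all non-ASCII characters removed.
    """
    known = set(keywords)
    pairs = []
    for word in sorted(known):
        stripped = "".join(ch for ch in word if ord(ch) < 128)
        if stripped != word and stripped in known:
            pairs.append((word, stripped))
    return pairs


print(find_stripped_variants(["möglich", "mglich", "Haus", "Küche"]))
```

Documents that produce such pairs could then be checked for the fonts they use.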
kinook
12-23-2008, 10:17 AM
Please ZIP and send a .urd file containing all problem PDFs imported (stored) to support@kinook.com. Thanks.
kinook
12-23-2008, 01:24 PM
It seems that our licensed version of the PDF2TXT component has some issues. We are trying to get a working version of the licensed component from the vendor.
kinook
12-31-2008, 10:39 AM
This should now be fixed in 3.5d.
pereh
12-31-2008, 11:11 AM
Originally posted by kinook
This should now be fixed in 3.5d.
Item text display and keywords are fine now. Thanks.
Jon Polish
12-31-2008, 02:41 PM
Yes, it seems fixed here too. Thank you and have a happy new year.
Jon