Kinook Software Forum

  #1  
02-19-2010, 05:00 AM
hartmut
Registered User
 
Join Date: 06-12-2005
Posts: 70
Strange behaviour with special French characters

Problem:
When I copy a French web page from Firefox or IE directly to UR, the French characters are not shown correctly.
When I save the same web page to ScrapBook, export it to a folder, and then import it into UR via file import, the characters are displayed correctly.

Please see the following example:

direct copy:
Décoration et loisirs créatifs

via scrapbook:
Décoration et loisirs créatifs

I am using UR 4.1b, Firefox 3.6 and Windows XP.


Hartmut
  #2  
02-19-2010, 09:55 AM
kinook
Administrator
 
Join Date: 03-06-2001
Location: Colorado
Posts: 6,034
That works ok in our tests. The first Google result for "Décoration et loisirs créatifs" was http://www.creamalice.com/. After importing that page from Firefox 3.6 into UR 4.1b using the UR Firefox extension 'Copy to Ultra Recall' button, the item text displays correctly in UR (see attached .urd file and screen shot).
Attached Files
File Type: zip files.zip (691.6 KB, 1619 views)
  #3  
02-20-2010, 02:07 AM
hartmut
Registered User
 
Join Date: 06-12-2005
Posts: 70
Thank you. I tried the site you mentioned and it works fine.
I suppose it is a problem with the site where I downloaded this page, as all the pages of this site have the same problem.

The original page was:

http://www.tourisme-hautemarne.com/t...810,1283.html?


Hartmut
  #4  
02-22-2010, 09:41 AM
kinook
Administrator
 
Join Date: 03-06-2001
Location: Colorado
Posts: 6,034
It appears that the web page text is UTF-8 encoded but has no BOM (byte order mark), while the page itself declares its content to be iso-8859-1 (an encoding for Western European text), which is inconsistent with the actual encoding. UR imports the data correctly, but when displaying the page, the embedded IE browser assumes it is iso-8859-1 rather than UTF-8, so the accented characters display incorrectly.

My guess is that ScrapBook converts everything to the current code page or to UTF-8 (adding a BOM), but UR doesn't do this. Even Firefox's Save Page As behaves much like UR in this respect, except that it doesn't capture images.

http://en.wikipedia.org/wiki/UTF-8

http://en.wikipedia.org/wiki/Byte-order_mark

http://en.wikipedia.org/wiki/ISO/IEC_8859-1
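
To illustrate the mismatch, here is a minimal Python sketch (not anything UR or IE actually runs) that encodes the phrase from the first post as UTF-8 and then decodes those bytes as iso-8859-1, the way a browser would if it trusted the page's declaration. The variable names are just for illustration.

```python
# The phrase from the first post, as correct text.
text = "Décoration et loisirs créatifs"

# What the site actually sends: UTF-8 bytes, without a BOM.
raw = text.encode("utf-8")

# What a browser shows if it believes the iso-8859-1 declaration:
# each byte of a two-byte UTF-8 sequence becomes its own character.
misread = raw.decode("iso-8859-1")
print(misread)  # DÃ©coration et loisirs crÃ©atifs

# Decoding the same bytes with the real encoding gives the right text.
print(raw.decode("utf-8"))  # Décoration et loisirs créatifs

# The garbling is reversible as long as no bytes were lost.
repaired = misread.encode("iso-8859-1").decode("utf-8")
```

The garbled output matches the "direct copy" example in the first post, which is consistent with the UTF-8-read-as-iso-8859-1 explanation.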

One workaround is to select the page content (Ctrl+A) in the browser before importing into UR -- the HTML clipboard data has consistent encodings and is handled correctly.
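
For pages already saved to disk, another option might be to add the BOM yourself before importing, since browsers generally give a BOM precedence over the charset declared inside the page. The following is only a sketch under that assumption: `add_utf8_bom` is a hypothetical helper, not part of UR, and I have not tested the result in UR's embedded browser.

```python
import codecs


def add_utf8_bom(data: bytes) -> bytes:
    """Prefix data with a UTF-8 BOM if it is valid UTF-8 and has none.

    Hypothetical helper for illustration only; not part of UR.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data  # already marked as UTF-8
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return data  # not valid UTF-8 (e.g. real iso-8859-1), leave untouched
    # Note: pure ASCII is also valid UTF-8, so it gets a BOM too (harmless).
    return codecs.BOM_UTF8 + data


# A page that declares iso-8859-1 but is really UTF-8, as in this thread.
page = '<meta charset="iso-8859-1">Décoration'.encode("utf-8")
fixed = add_utf8_bom(page)
```

To apply it to a file, read the file in binary mode, pass the bytes through the function, and write the result back in binary mode.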

All times are GMT -5.


Copyright © 1999-2023 Kinook Software, Inc.