View Full Version : Tiny import bug


Spliff
05-14-2022, 05:37 AM
Import - Select Source - A text or rich text file - Delimiter Info - Blank Lines - 1 does not work (I don't remember whether the same fault is present with multiple blank lines, but I think it is). I tried with `n, `r`n and `r. The same steps as above, but with "A String" and then e.g. "|" as the delimiter, do work, though.

EDIT: Just to clarify: I tried with DOUBLE `n, `r`n, `r, of course.

kinook
05-15-2022, 10:38 PM
It works as expected in my test. Using attached file with Blank Lines - 1, I got the attached .urd file.

Spliff
05-17-2022, 10:25 AM
Thank you for checking. For me, though, it didn't and doesn't work, neither for 1 blank line nor for 2 or more: blank lines are not recognized as a divider. I have now created a new file in an editor, with no blank lines, and then put RETURNs before the digits in order to get 6 sub-items under my UR target "Importtrial". I get just 1 sub-item containing the full text, instead of 6 sub-items with one chunk each.

File-Import
Text or rich text RTF
file to import: path, Delimiter: Blank Lines 1
Where: Selected item > Finish

I searched the attached file in EmEditor for \r\n\r\n ("Escape Sequence yes") and got the blank lines as search hits. As said, the file is in exactly the form in which I imported it into UR, and the result was 1 sub-item, not 6.
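
For anyone without EmEditor, the same check can be approximated with a few lines of Python. This is only a sketch: the function name `count_line_endings` is made up for illustration, and it looks at raw bytes, so it says nothing about how UR itself reads the file.

```python
def count_line_endings(data: bytes) -> dict[str, int]:
    """Count line-ending styles and CRLF blank-line separators in raw bytes."""
    crlf = data.count(b"\r\n")
    return {
        "crlf": crlf,
        # Bare \r and bare \n are whatever is left after removing the CRLF pairs.
        "cr": data.count(b"\r") - crlf,
        "lf": data.count(b"\n") - crlf,
        # Non-overlapping occurrences of \r\n\r\n, i.e. the search from the post.
        "crlf_blank_lines": data.count(b"\r\n\r\n"),
    }


# In-memory sample; for a real file, pass the result of open(path, "rb").read().
sample = b"1 first\r\ncontent\r\n\r\n2 second\r\ncontent"
print(count_line_endings(sample))
```

If "crlf_blank_lines" is greater than zero while "cr" and "lf" are zero, the file uses plain Windows line endings and does contain blank-line separators.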


EDIT:

Your file imported correctly into my UR file / item, whilst mine didn't and doesn't. Mine is a realistic one, though, with "content" between the titles; yours is just some titles separated by blank lines. In EmEditor, your blank lines are identified as \r\n\r\n, just like in my text.

Please tell me if my text imported correctly into your UR file; if not, why might that be?

(Always speaking of the above import into a default, native ("factory") "Text" item (i.e. plain-text into an rtf content field).)

kinook
05-17-2022, 09:36 PM
So there is an issue with using the Blank lines option with multi-line paragraphs (with newlines):

text here<newline>
more text here<newline>
<newline>
abc<newline>
def<newline>
xyz<newline>
<newline>
my test<newline>
goes here

vs.

one line<newline>
<newline>
second<newline>
<newline>
third

As an alternative, you can use A string and press Enter twice for the delimiter.
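
For reference, the result one would expect from a blank-line delimiter can be sketched outside UR in Python. This is not UR's import code; `split_on_blank_lines` is a hypothetical helper, and treating "N blank lines" as "N or more" is an assumption made for this sketch.

```python
import re


def split_on_blank_lines(text: str, blank_lines: int = 1) -> list[str]:
    """Split text into chunks separated by at least `blank_lines` blank lines."""
    # Normalize \r\n and bare \r to \n so all three line-ending styles behave alike.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    # N blank lines correspond to N + 1 consecutive newline characters.
    pattern = r"\n{%d,}" % (blank_lines + 1)
    # Drop empty pieces caused by leading or trailing separators.
    return [chunk for chunk in re.split(pattern, normalized) if chunk.strip()]


# The two samples from the post, written with Windows line endings.
multi_line = ("text here\r\nmore text here\r\n\r\n"
              "abc\r\ndef\r\nxyz\r\n\r\nmy test\r\ngoes here")
single_line = "one line\r\n\r\nsecond\r\n\r\nthird"

for sample in (multi_line, single_line):
    print(len(split_on_blank_lines(sample)), "chunks")
```

Both samples come out as three chunks here, which is the result the Blank lines option would be expected to give for the first sample as well.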

Spliff
05-18-2022, 01:33 AM
Thank you, Kyle, for confirming. As I had said, just using a special character instead, e.g. "|", works fine. For this, I have to replace the blank lines between my (real data) chunks (but not the ones within them) with a "|" each. This is automated, but it then defeats visual checking of my data, since on screen the chunks are not separated anymore. Instead, I can of course do both and replace the blank lines between chunks not with "|" but with a blank line plus "|", so there is no real problem; I just wanted to mention that it didn't work as expected.

My real data doesn't consist of multiple short lines with `r`n, but of hundreds of chunks, each with just 5, 6 or 8 paragraphs separated by single blank lines, with the chunks themselves separated by 2 blank lines each. The realistic divider for the UR import is therefore TWO blank lines, with the single blank lines within the chunks being preserved, and this hadn't worked.
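
The "blank line plus |" replacement for data of this shape could look roughly like the following Python sketch. The function name `mark_chunk_boundaries` is invented for illustration, and placing the marker directly in front of the next chunk's title line is an assumption; how UR treats the blank line left at the end of each chunk has not been checked.

```python
import re


def mark_chunk_boundaries(text: str, marker: str = "|") -> str:
    """Replace each run of two or more blank lines with one blank line plus a marker."""
    # Normalize \r\n and bare \r to \n; the result uses \n throughout.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    # Two blank lines are three consecutive newlines. Single blank lines inside
    # a chunk (two newlines) do not match and are therefore left untouched.
    return re.sub(r"\n{3,}", lambda _match: "\n\n" + marker, normalized)


sample = "Title 1\r\npara a\r\n\r\npara b\r\n\r\n\r\nTitle 2\r\npara c"
print(mark_chunk_boundaries(sample))
```

The output keeps a visible blank line between chunks for checking on screen, and it can then be imported with "A String" and "|" as the delimiter.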

Thus I had then tried, after removing the single blank lines within the chunks beforehand, to import multiple chunks into UR with single blank lines as the separator; this failed again. Apparently, when there are also single `r`n (or `r or `n) within the chunks, the UR import no longer recognizes `r`n`r`n etc., i.e. multiple consecutive occurrences of those, as a separator. This should be a very simple error in the code.

So my example file doesn't LOOK that realistic indeed, since all the lines = paragraphs are very short, but that was just my typing, in fact:

Whenever there are newlines (of any kind) within the chunks (i.e. "content", beyond just the line which will become the chunk's title), blank lines (single or multiple) are not recognized as chunk separator anymore.

( So you found the exception, but the rule is wider than my quick example (without any blank lines left within the chunks) seems to indicate: The problem is not multi-line paragraphs, but ANY paragraphs, even "regular" ones, separated by blank lines, within the chunks. ;-) )