A Proposal for Bidi Isolates in Unicode
each of the values in sequence before continuing to the next rule". This is because the details of
the order in which the rules are applied are given in later sections, and the formulation in the
sentence above is potentially problematic for isolates.
Section 3.1 (Definitions): Add isolate-related definitions
BD2. Replace “Embedding levels are explicitly set by both override format codes and by
embedding format codes” with: “Embedding levels are explicitly set by embedding format codes,
isolate format codes and override format codes”.
After the example following BD7 add:
BD8. The matching PDI for a given FSI, LRI, or RLI is the one determined by the following
algorithm:
● Initialize a counter to zero.
● Scan the text following the FSI, LRI, or RLI to the end of the paragraph while
incrementing the counter at every FSI, LRI, or RLI, and decrementing it at every PDI.
● Stop at the first PDI, if any, before which the counter is already zero.
● If such a PDI was found, it is the matching PDI for the FSI, LRI, or RLI. Otherwise, there
is no matching PDI for it.
Note that LRE, RLE, LRO, RLO and PDF characters are ignored when finding the matching PDI.
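The BD8 procedure can be sketched in a few lines of Python. This is only an illustration, not part of the proposed text: it assumes the paragraph has already been split into a list of tokens, with the format codes spelled as the strings "FSI", "LRI", "RLI", and "PDI" rather than as the actual characters. The function name find_matching_pdi and the token representation are hypothetical.

```python
# Hypothetical token spellings for the isolate format codes.
ISOLATE_INITIATORS = {"FSI", "LRI", "RLI"}
PDI = "PDI"


def find_matching_pdi(tokens, start):
    """Return the index of the PDI matching the isolate initiator at
    tokens[start], or None if it has no matching PDI (BD8).

    `tokens` is assumed to hold exactly one paragraph.
    """
    if tokens[start] not in ISOLATE_INITIATORS:
        raise ValueError("tokens[start] must be FSI, LRI, or RLI")

    counter = 0  # number of nested isolates currently open
    for i in range(start + 1, len(tokens)):
        tok = tokens[i]
        if tok in ISOLATE_INITIATORS:
            counter += 1
        elif tok == PDI:
            if counter == 0:
                return i  # first PDI reached with the counter already zero
            counter -= 1
        # All other tokens, including LRE, RLE, LRO, RLO and PDF, are ignored.
    return None


tokens = ["a", "LRI", "b", "RLI", "c", "PDI", "d", "PDI", "e"]
outer = find_matching_pdi(tokens, 1)  # the second PDI closes the outer LRI
inner = find_matching_pdi(tokens, 3)  # the first PDI closes the inner RLI
```

Because embedding and override codes never touch the counter, a sequence such as LRI, RLE, PDF with no PDI yields no match for the LRI.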
Add:<br />
BD9. An isolating run sequence is an ordered set of level runs where:
● For every level run except the last one in the sequence, the last character in the level run
is an FSI, LRI or RLI.
● For every level run except the first one in the sequence, the first character in the level run
is the matching PDI of the FSI, LRI, or RLI at the end of the preceding level run in the
sequence.
● If the first character of the first level run in the sequence is a PDI, it is not the matching
PDI for any FSI, LRI, or RLI.
● If the last character of the last level run in the sequence is an FSI, LRI or RLI, it has no
matching PDI.
As we will see, all the level runs in an isolating run sequence have the same embedding level.
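One way to picture BD9 is as a chaining step over level runs that have already been computed. The Python sketch below is an illustration under stated assumptions, not the proposal's own procedure: the caller supplies the level runs as half-open (start, end) index ranges over a token list, those runs are assumed to come from correctly resolved embedding levels, and the names matching_pdis and isolating_run_sequences are hypothetical. Matching PDIs are found with a stack, which gives the same pairing as the counter-based scan in BD8.

```python
INITIATORS = {"FSI", "LRI", "RLI"}  # hypothetical token spellings


def matching_pdis(tokens):
    """Map the index of each isolate initiator to its matching PDI index."""
    stack, match = [], {}
    for i, tok in enumerate(tokens):
        if tok in INITIATORS:
            stack.append(i)
        elif tok == "PDI" and stack:
            match[stack.pop()] = i  # closes the most recently opened isolate
    return match


def isolating_run_sequences(tokens, level_runs):
    """Chain level runs into isolating run sequences (BD9).

    `level_runs` lists (start, end) half-open ranges in text order and is
    assumed to be consistent with the resolved embedding levels.
    """
    match = matching_pdis(tokens)
    matched_pdis = set(match.values())
    run_starting_at = {start: (start, end) for start, end in level_runs}

    sequences = []
    for run in level_runs:
        if run[0] in matched_pdis:
            continue  # this run continues a sequence started earlier
        seq = [run]
        last = run[1] - 1
        # Follow initiator -> matching PDI links while the run ends in one.
        while last in match:
            seq.append(run_starting_at[match[last]])
            last = seq[-1][1] - 1
        sequences.append(seq)
    return sequences


# "a LRI" and "PDI c" share one level; "b" sits at a higher level.
tokens = ["a", "LRI", "b", "PDI", "c"]
runs = [(0, 2), (2, 3), (3, 5)]
result = isolating_run_sequences(tokens, runs)
# result == [[(0, 2), (3, 5)], [(2, 3)]]
```

In this small case the level run ending in LRI is joined to the run beginning with its matching PDI, while the isolated content "b" forms a sequence of its own.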
Let’s take an example, writing the format codes in subscript to make it easier to read. Assuming
that no text segment (text1 through text6) contains format codes or paragraph separators, the paragraph
“text1‧FSI‧text2‧LRI‧text3‧PDI‧text4‧PDI‧RLI‧text5‧PDI‧text6” will contain the following isolating run sequences:
● “text1‧FSI”, “PDI‧RLI”, “PDI‧text6”