01.11.2014 Views

A Proposal for Bidi Isolates in Unicode


each of the values in sequence before continuing to the next rule". This is because the details of the order in which the rules are applied are given in later sections, and the formulation in the sentence above is potentially problematic for isolates.

Section 3.1 (Definitions): Add isolate-related definitions

BD2. Replace “Embedding levels are explicitly set by both override format codes and by embedding format codes” with: “Embedding levels are explicitly set by embedding format codes, isolate format codes and override format codes”.

After the example following BD7 add:

BD8. The matching PDI for a given FSI, LRI, or RLI is the one determined by the following algorithm:

● Initialize a counter to zero.

● Scan the text following the FSI, LRI, or RLI to the end of the paragraph while incrementing the counter at every FSI, LRI, or RLI, and decrementing it at every PDI.

● Stop at the first PDI, if any, before which the counter is already zero.

● If such a PDI was found, it is the matching PDI for the FSI, LRI, or RLI. Otherwise, there is no matching PDI for it.

Note that LRE, RLE, LRO, RLO and PDF characters are ignored when finding the matching PDI.
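The scan in BD8 can be sketched in a few lines of Python. This is only an illustration, not part of the proposed text: it assumes the paragraph has already been split into a list of tokens, with the format codes written as the strings "FSI", "LRI", "RLI" and "PDI", and the function name `find_matching_pdi` is made up for this sketch.

```python
# Illustrative sketch of BD8. Assumption: one paragraph, pre-tokenized,
# with format codes spelled out as strings rather than real characters.
ISOLATE_INITIATORS = {"FSI", "LRI", "RLI"}


def find_matching_pdi(tokens, start):
    """Return the index of the matching PDI for the isolate initiator at
    tokens[start], or None if it has no matching PDI."""
    if tokens[start] not in ISOLATE_INITIATORS:
        raise ValueError("tokens[start] must be FSI, LRI or RLI")
    counter = 0  # number of nested isolates currently open
    for i in range(start + 1, len(tokens)):
        token = tokens[i]
        if token in ISOLATE_INITIATORS:
            counter += 1
        elif token == "PDI":
            if counter == 0:
                return i  # first PDI reached while the counter is zero
            counter -= 1
        # Everything else, including LRE, RLE, LRO, RLO and PDF, is ignored.
    return None


paragraph = ["a", "RLI", "b", "LRI", "c", "PDI", "d", "PDI", "e"]
print(find_matching_pdi(paragraph, 1))  # 7: the outer RLI skips the nested pair
print(find_matching_pdi(paragraph, 3))  # 5: the nested LRI closes at the first PDI
```

In the sketch the counter is checked before it is decremented, which corresponds to stopping at the first PDI "before which the counter is already zero".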

Add:<br />

BD9. An isolating run sequence is an ordered set of level runs where:

● For every level run except the last one in the sequence, the last character in the level run is an FSI, LRI or RLI.

● For every level run except the first one in the sequence, the first character in the level run is the matching PDI of the FSI, LRI, or RLI at the end of the preceding level run in the sequence.

● If the first character of the first level run in the sequence is a PDI, it is not the matching PDI for any FSI, LRI, or RLI.

● If the last character of the last level run in the sequence is an FSI, LRI or RLI, it has no matching PDI.

As we will see, all the level runs in an isolating run sequence have the same embedding level.
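One way to read BD9 operationally is to start a sequence at every level run that does not begin with a matched PDI, and then keep following the link from an FSI, LRI or RLI at the end of a run to the run that begins with its matching PDI. The Python sketch below does this under several assumptions that are not part of the proposal: the paragraph is a list of string tokens, the embedding level of each token is supplied by the caller (here assigned by hand), those levels are consistent enough that a matching PDI always begins a level run whenever its initiator ends one, and all function names are hypothetical.

```python
# Illustrative sketch of BD9. Assumptions: one pre-tokenized paragraph and
# caller-supplied embedding levels; all names here are hypothetical.
ISOLATE_INITIATORS = {"FSI", "LRI", "RLI"}


def find_matching_pdi(tokens, start):
    """BD8 scan: index of the matching PDI for tokens[start], or None."""
    counter = 0
    for i in range(start + 1, len(tokens)):
        if tokens[i] in ISOLATE_INITIATORS:
            counter += 1
        elif tokens[i] == "PDI":
            if counter == 0:
                return i
            counter -= 1
    return None


def level_runs(levels):
    """Split the indices 0..n-1 into maximal runs of equal embedding level."""
    runs = []
    for i, level in enumerate(levels):
        if runs and levels[runs[-1][-1]] == level:
            runs[-1].append(i)
        else:
            runs.append([i])
    return runs


def isolating_run_sequences(tokens, levels):
    """Group level runs (lists of token indices) into isolating run sequences."""
    runs = level_runs(levels)
    run_starting_at = {run[0]: run for run in runs}
    # Map each initiator that has a matching PDI to the index of that PDI.
    match = {}
    for i, token in enumerate(tokens):
        if token in ISOLATE_INITIATORS:
            j = find_matching_pdi(tokens, i)
            if j is not None:
                match[i] = j
    matched_pdis = set(match.values())

    sequences = []
    for run in runs:
        if run[0] in matched_pdis:
            continue  # begins with a matched PDI: continues an earlier sequence
        sequence = [run]
        # Follow initiator -> matching PDI links while the last run ends in one.
        while sequence[-1][-1] in match:
            sequence.append(run_starting_at[match[sequence[-1][-1]]])
        sequences.append(sequence)
    return sequences


tokens = ["a", "RLI", "b", "LRI", "c", "PDI", "d", "PDI", "e"]
levels = [0, 0, 1, 1, 2, 1, 1, 0, 0]  # hand-assigned for illustration only
for sequence in isolating_run_sequences(tokens, levels):
    print([[tokens[i] for i in run] for run in sequence])
```

With the hand-assigned levels above, the runs “a RLI” and “PDI e” end up in one sequence, “b LRI” and “PDI d” in another, and “c” forms a sequence on its own; within each sequence all runs share one level.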

Let’s take an example, writing the format codes in subscript to make it easier to read. Assuming that none of text1 through text6 contains format codes or paragraph separators, the paragraph “text1 FSI text2 LRI text3 PDI text4 PDI‧RLI text5 PDI text6” will contain the following isolating run sequences:

● “text1 FSI”, “PDI‧RLI”, “PDI text6”
