A Proposal for Bidi Isolates in Unicode
eviews) פיצה סגולה (3
(which is how it would be displayed if using RLE and PDF or no formatting).
Isolates are skipped over when determining the base direction of a paragraph or a first-strong isolate. This is part of making an isolate behave as a neutral character for the purposes of the visual ordering of the content surrounding it.
Thus, the paragraph “LRI HTML PDI (ראשי תיבות של LRI Hyper Text Markup Language PDI) היא שפת תגיות (LRI markup language PDI) ליצירה ועיצוב דפי אינטרנט.” (from the Hebrew Wikipedia article on HTML) would be interpreted as RTL even in the absence of a higher protocol for paragraph direction despite beginning with LTR text, and thus (correctly) displayed as:
markup) היא שפת תגיות (Hyper Text Markup Language תיבות של (ראשי HTML
(language ליצירה ועיצוב דפי אינטרנט.
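The effect of skipping isolates when determining the base direction can be illustrated with a minimal Python sketch. It is not the UBA's normative wording: the function name `base_direction` and its `default` parameter are assumptions made for this illustration, and the code only performs the first-strong scan, not the full reordering.

```python
import unicodedata

# Isolate formatting characters (code points as assigned in Unicode 6.3).
LRI, RLI, FSI, PDI = "\u2066", "\u2067", "\u2068", "\u2069"
ISOLATE_INITIATORS = {LRI, RLI, FSI}


def base_direction(text, default="ltr"):
    """First-strong paragraph direction, ignoring everything inside isolates.

    Hypothetical helper for illustration only.
    """
    depth = 0  # number of isolates currently open
    for ch in text:
        if ch in ISOLATE_INITIATORS:
            depth += 1
        elif ch == PDI:
            if depth > 0:  # an unmatched PDI is simply ignored
                depth -= 1
        elif depth == 0:
            bidi_class = unicodedata.bidirectional(ch)
            if bidi_class == "L":
                return "ltr"
            if bidi_class in ("R", "AL"):
                return "rtl"
    # No strong character found outside isolates.
    return default


# Start of the example paragraph: an isolated "HTML" followed by Hebrew text
# (the Hebrew letters are written as escapes to keep the source readable).
hebrew_word = "\u05E8\u05D0\u05E9\u05D9"
with_isolate = LRI + "HTML" + PDI + " (" + hebrew_word
without_isolate = "HTML (" + hebrew_word

print(base_direction(with_isolate))     # rtl
print(base_direction(without_isolate))  # ltr
```

An isolate initiator with no matching PDI leaves `depth` above zero for the rest of the paragraph, so its contents are skipped to the end of the text.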
Nesting is allowed. Embeddings and overrides are allowed to nest within isolates (and embeddings and overrides), and isolates are allowed to nest within isolates, embeddings and overrides. Just like embeddings and overrides, isolates count toward the existing nesting limit beyond which explicit formatting characters are considered invalid. The new PDI codepoint is used to terminate isolates instead of the existing PDF. As suggested by Martin J. Dürst, this makes sure that older applications that implement the UBA without isolate support (and thus ignore isolate formatting characters) do not misinterpret the end of an isolate as the end of an embedding or override in which the isolate is nested.
When the embeddings or overrides and isolates in a paragraph are not properly nested, we define the isolates to be “stronger”. That is, the algorithm will ignore a PDF when a PDI is expected, and will have a PDI close all embeddings/overrides opened between the PDI and the FSI/LRI/RLI it closes.
Thus, in “LRE … RLE … FSI … PDF ?! PDI”, the PDF is ignored, so the isolate continues to the PDI and includes the “?!”.
And in “LRE … FSI … RLE … PDI ?! PDF”, the PDI ends the scope of the RLE as well as the isolate, so the “?!” is outside the isolate. (The PDF winds up matching the LRE.)
When embeddings and isolates are properly nested, isolates being “stronger” makes no difference.
Thus, in “RLI … LRE … PDF … PDI” and “RLE … LRI … PDI … PDF”, the PDF closes the embedding, and the PDI closes the isolate. Neither is ignored.
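The matching behavior in the four examples above can be sketched with a simple stack. This is an illustrative model under stated assumptions, not the proposal's normative algorithm: formatting characters are represented by their names as strings, the function `match_terminators` is hypothetical, and the nesting limit is deliberately left out.

```python
ISOLATE_INITIATORS = {"LRI", "RLI", "FSI"}
EMBEDDING_INITIATORS = {"LRE", "RLE", "LRO", "RLO"}


def match_terminators(tokens):
    """Map the index of each PDF/PDI to the index of the initiator it closes.

    A terminator that is ignored maps to None. Hypothetical helper; the
    nesting limit is not modeled.
    """
    stack = []   # currently open initiators as (index, name)
    result = {}
    for i, tok in enumerate(tokens):
        if tok in ISOLATE_INITIATORS or tok in EMBEDDING_INITIATORS:
            stack.append((i, tok))
        elif tok == "PDF":
            # A PDF closes only an embedding/override on top of the stack.
            # If an isolate is on top, a PDI is expected and the PDF is ignored.
            if stack and stack[-1][1] in EMBEDDING_INITIATORS:
                result[i] = stack.pop()[0]
            else:
                result[i] = None
        elif tok == "PDI":
            # Find the innermost open isolate, if any.
            pos = None
            for k in range(len(stack) - 1, -1, -1):
                if stack[k][1] in ISOLATE_INITIATORS:
                    pos = k
                    break
            if pos is None:
                result[i] = None
            else:
                result[i] = stack[pos][0]
                # Close the isolate and every embedding/override opened in it.
                del stack[pos:]
    return result


# Improperly nested: the PDF (index 6) is ignored, the PDI closes the FSI.
print(match_terminators(["LRE", "…", "RLE", "…", "FSI", "…", "PDF", "?!", "PDI"]))
# {6: None, 8: 4}

# Improperly nested: the PDI closes the FSI and the RLE; the PDF matches the LRE.
print(match_terminators(["LRE", "…", "FSI", "…", "RLE", "…", "PDI", "?!", "PDF"]))
# {6: 2, 8: 0}

# Properly nested: neither terminator is ignored.
print(match_terminators(["RLI", "…", "LRE", "…", "PDF", "…", "PDI"]))
# {4: 2, 6: 0}
```

Truncating the stack at the matched isolate is what makes isolates “stronger” in this sketch: an embedding left open inside the isolate cannot survive past the PDI.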
We make isolates “stronger” than embeddings and overrides so that they can be used to isolate