25.07.2014 Views

VDM-10 Language Manual


Appendix B

Lexical Specification

B.1 Characters

The characters that comprise a valid VDM specification are defined in terms of Unicode codepoints. The actual character encoding of a VDM source file (for example UTF-8, ISO-Latin-1 or Shift-JIS) is not defined, and the tool support is responsible for converting whatever encoding is used into Unicode during the parse of the file.
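As a minimal sketch of what this conversion means in practice, the Python snippet below decodes the same text from two different byte encodings and checks that both yield identical Unicode codepoints. The helper name `to_codepoints` is hypothetical and is not part of any VDM tool; it only illustrates the idea that the lexical rules apply after decoding.

```python
def to_codepoints(raw, encoding):
    """Decode raw bytes with the given encoding and return the codepoints.

    Hypothetical helper for illustration; a real tool would decode the
    whole source file before lexical analysis.
    """
    return [ord(ch) for ch in raw.decode(encoding)]


name = "日本"  # two codepoints: U+65E5, U+672C

utf8_bytes = name.encode("utf-8")
sjis_bytes = name.encode("shift_jis")

# The byte sequences differ between encodings...
assert utf8_bytes != sjis_bytes

# ...but once decoded, both give the same Unicode codepoints.
assert to_codepoints(utf8_bytes, "utf-8") == [0x65E5, 0x672C]
assert to_codepoints(sjis_bytes, "shift_jis") == [0x65E5, 0x672C]
```

The lexical rules in this appendix therefore never need to mention bytes: they are stated over codepoints only.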

All VDM keywords and delimiter tokens are composed of characters from the Basic Latin block (“ASCII” codepoints less than U+0080). On the other hand, user identifiers (variable names, function names and so on) can be composed of a rich variety of Unicode codepoints, reflecting the need for fully internationalized specifications.

All Unicode codepoints have a “category”. Certain categories are entirely excluded from the set of codepoints that are permitted in identifiers. This prevents, say, punctuation characters from being used. On the other hand, to provide a degree of compatibility with the original VDM ISO standard, and for backward compatibility, there are different rules for the formation of user identifiers that only use ASCII characters. For example, the underscore (U+005F) is permitted in identifiers, even though it is in the connecting punctuation category, which would not normally be allowed.

See http://www.fileformat.info/info/unicode/category/index.htm for more information about categories.
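The category of a codepoint can be inspected with Python's standard `unicodedata` module, which makes the underscore exception easy to see. The sketch below is illustrative only: the set of excluded categories in `EXCLUDED_PREFIXES` is an assumption made for this example (punctuation, separators, symbols and control/other), not the list defined by the VDM-10 lexical rules, and `allowed_in_identifier` is a hypothetical name.

```python
import unicodedata

# ASSUMPTION: an illustrative choice of excluded general-category groups.
# The actual excluded categories are defined by the language manual and
# may differ from this set.
EXCLUDED_PREFIXES = ("P", "Z", "S", "C")


def allowed_in_identifier(ch):
    """Hypothetical per-character check based on Unicode category."""
    if ch == "_":
        # U+005F is category Pc (connector punctuation) but is permitted
        # as a special case for ASCII identifiers.
        return True
    return not unicodedata.category(ch).startswith(EXCLUDED_PREFIXES)


# Categories of a few sample characters.
assert unicodedata.category("_") == "Pc"   # connector punctuation
assert unicodedata.category(",") == "Po"   # other punctuation
assert unicodedata.category("a") == "Ll"   # lowercase letter
assert unicodedata.category("日") == "Lo"  # other letter

assert allowed_in_identifier("_")
assert allowed_in_identifier("日")
assert not allowed_in_identifier(",")
```

A full identifier check would also need the separate rules for ASCII-only identifiers and for the first character, which this sketch does not attempt to cover.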
