13.07.2015 Views

I/O Fundamentals



Java Input and Output

...standard that allows characters from character sets throughout the world to be represented in two bytes (for details, see http://www.unicode.org).

Characters 0-127 of the UNICODE standard map directly to the ASCII standard. The rest of the character set is composed of "pages" that represent other character sets. There are pages that map to characters from many different languages, "Dingbats" (symbols that can be used as characters), currency symbols, mathematical symbols and many others.

The trick, of course, is that each platform has its own native character set, which usually has some mapping to the UNICODE standard. Java needs some way to map the native character set to UNICODE.

Java Translation

Java's text input and output classes translate the native characters to and from UNICODE. For each delivered JDK, there is a "default mapping" that is used for most translations. You can also specify the encoding.

For example, if you are on a machine that uses an ASCII encoding, Java will map the ASCII to UNICODE by padding the ASCII characters with extra 0 bits to create two-byte characters. Other languages have a more complex mapping.

When reading files, Java translates from native format to UNICODE. When writing files, Java translates from UNICODE to the native format.

Java Internal Formats

Many people raise concerns about Java efficiency due to the two-byte character representation of UNICODE.

Java uses the UTF-8 encoding format to store Strings in class files. UTF-8 is a simple encoding of UNICODE characters and strings that is optimized for the ASCII characters. In each byte of the encoding, the high bit determines if more bytes follow. A high bit of zero means that the byte has enough information to fully represent a character; ASCII characters require only a single byte. Many non-ASCII UNICODE characters still need only two bytes, but some may require three to represent in this format.

The Two Types of I/O

Consider the UNICODE translation for a moment. Anytime we need to read text,

© 1996-2003 jGuru.com. All Rights Reserved.
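
As a rough illustration of the translation and UTF-8 sizes described in the sections above, here is a minimal sketch. It assumes a JDK recent enough to provide java.nio.charset.StandardCharsets (Java 7 or later, which postdates this tutorial). The class name EncodingSketch and its helper methods are hypothetical and exist only for this example. It uses in-memory byte arrays in place of files, with InputStreamReader and OutputStreamWriter doing the translation in each direction.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; names are illustrative, not from the tutorial.
public class EncodingSketch {

    // Reading: translate bytes in the given encoding into UNICODE chars.
    static String decode(byte[] encoded, Charset charset) {
        try (Reader in = new InputStreamReader(
                new ByteArrayInputStream(encoded), charset)) {
            StringBuilder text = new StringBuilder();
            int c;
            while ((c = in.read()) != -1) {
                text.append((char) c);
            }
            return text.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writing: translate UNICODE chars into bytes in the given encoding.
    static byte[] encode(String text, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(bytes, charset)) {
            out.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The writer is closed (and therefore flushed) at this point.
        return bytes.toByteArray();
    }

    // Number of bytes the text occupies when encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // The "default mapping" on this machine; the value varies by platform.
        System.out.println(Charset.defaultCharset());

        // ASCII bytes are widened to two-byte chars with the same numeric value.
        byte[] ascii = {72, 105}; // 'H', 'i'
        String text = decode(ascii, StandardCharsets.US_ASCII);
        System.out.println(text);                  // Hi
        System.out.println((int) text.charAt(0));  // 72

        // Writing goes the other way: chars back to encoded bytes.
        System.out.println(encode(text, StandardCharsets.US_ASCII).length); // 2

        // UTF-8 sizes: ASCII takes one byte, other characters two or three.
        System.out.println(utf8Length("A"));       // 1
        System.out.println(utf8Length("\u00e9"));  // 2 (e with acute accent)
        System.out.println(utf8Length("\u20ac"));  // 3 (euro sign)
    }
}
```

Passing the Charset explicitly corresponds to "specifying the encoding"; leaving it out of the InputStreamReader or OutputStreamWriter constructor would fall back to the platform default mapping instead.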
