Java IO.pdf - Nguyen Dang Binh

Java I/O

Chapter 14. Multilingual Character Sets and Unicode

We live on a planet on which many languages are spoken. I can walk out my front door in Brooklyn on any given day and hear people conversing in French, Creole, Hebrew, Arabic, Spanish, and languages I don't even recognize. And the Internet is even more diverse than Brooklyn. A local doctor's office that sets up a storefront on the Web to sell vitamins may soon find itself shipping to customers whose native language is Chinese, Gujarati, Turkish, German, Portuguese, or something else. There's no such thing as a local business on the Internet.

However, the first computers and the first programming languages were mostly designed by English-speaking programmers in countries where English was the native language. These programmers designed character sets that worked well for English text, though not much else. The preeminent such set is ASCII. Since ASCII is a seven-bit character set, each ASCII character can easily be represented as a single byte, signed or unsigned. Thus, it's natural for ASCII-based programming languages to equate the character data type with the byte data type. In these languages, such as C, the same operations that read and write bytes also read and write characters.

Unfortunately, ASCII is inadequate for almost all non-English languages. It contains no cedillas, umlauts, betas, thorns, or any of the other thousands of non-English characters that are used to read and write text around the world. Fairly shortly after the development of ASCII, there was an explosion of extended character sets around the world, each of which encoded the basic ASCII characters as well as the additional characters needed for another language like Greek, Turkish, Arabic, Chinese, Japanese, or Russian. Many of these character sets are still used today, and much existing data is encoded in them. However, these character sets are still inadequate for many needs.
For one thing, most assume that you only want to encode English plus one other language. This makes it difficult for a Russian classicist to write a commentary on an ancient Greek text, for example. Furthermore, documents are limited by their character sets. Email sent from Morocco may become illegible in India if the sender is using an Arabic character set but the recipient is using Devanagari.

Unicode is an international effort to provide a single character set that everyone can use. Unicode supports the characters needed for English, Arabic, Cyrillic, Greek, Devanagari, and many others. Unicode isn't perfect. There are some omissions, especially in the ideographic character sets for Chinese and Japanese, but it is the most comprehensive character set yet devised for all the languages of planet Earth.

Java is one of the first programming languages to explicitly address the need for non-English text. It does this by adopting Unicode as its native character set. All Java chars and strings are given in Unicode. However, since there's also a lot of non-Unicode legacy text in the world, in a dizzying array of encodings, Java also provides the classes you need to read and write text in these encodings as well.

14.1 Unicode

Unicode is Java's native character set. Each Unicode character is a two-byte, unsigned number with a value between 0 and 65,535. This provides enough space for characters from all the
world's alphabetic scripts and the most common characters from the ideographic scripts of Chinese and Japanese. The current version of Unicode (2.1) defines 38,887 different characters from many languages, including English, Russian, Arabic, Hebrew, Greek, Thai, Korean, and Sanskrit. The most common ideographic characters from Japanese and Chinese are also included. However, Chinese alone contains over 80,000 different ideograms, so it's impossible to include them all in a two-byte set. A four-byte Universal Character Set (UCS) that will include the full Chinese and Japanese scripts is under development. Java does not yet support UCS.

The first 128 Unicode characters (characters 0 through 127) are identical to the ASCII character set. 32 is the ASCII space; therefore, 32 is the Unicode space. 33 is the ASCII exclamation point, so 33 is the Unicode exclamation point, and so on. Table B.1, in Appendix B, shows this character set.

The next 128 Unicode characters (characters 128 through 255) have the same values as the equivalent characters in the Latin-1 character set defined by ISO standard 8859-1. Latin-1, a slight variation of which is used by Windows, adds the various accented characters, umlauts, cedillas, upside-down question marks, and other characters needed to write text in most Western European languages. Table B.2 shows these characters. The first 128 characters in Latin-1 are identical to the ASCII character set.

Values beyond 255 encode characters from various other character sets. Where possible, character blocks describing a particular group of characters map onto established encodings for that set of characters by simple transposition. For instance, Unicode characters 880 through 1023 form the Greek block, which encodes the Greek alphabet and associated characters like the Greek question mark (;). [1] The Greek letters in this block are a direct transposition by 720 of the corresponding characters in the upper half (128 through 255) of the ISO 8859-7 character set, which is in turn based on the Greek national standard ELOT 928. For example, the small letter delta, δ, ISO 8859-7 character 228, is Unicode character 948. A small epsilon, ε, ISO 8859-7 character 229, is Unicode character 949. In general, the Unicode value for a Greek letter equals the ISO 8859-7 value for the letter plus 720. Other character sets are included in Unicode in a similar fashion whenever possible. [2]

NextStep, BeOS, MacOS X Server, Bell Labs' Plan 9, and Windows NT 4.0 all support Unicode to some extent. Unicode support in MacOS and Windows 98 is more nascent, but it's coming. Application software is a little slower to appear, but Microsoft Word 97 and 98, Netscape Navigator 4.0, and Internet Explorer 4.0 all support Unicode. The big hold-up on most systems is fonts and input methods. Windows NT 5.0 will include fonts covering most of the defined Unicode characters as well as input methods for most major languages.

14.2 Displaying Unicode Text

Although internally Java can handle full Unicode data (it's just numbers, after all), not all Java environments can display all Unicode characters. In fact, I'll go so far as to say none of the current Java environments, whether standalone virtual machines or web browsers, can display all Unicode characters.

Unicode is divided into blocks. For example, characters 0 through 127 are the Basic Latin block and contain ASCII. Characters 128 through 255 are the Latin-1 Supplement block and contain the upper 128 characters of the Latin-1 character set.

[1] Indeed, the Greek question mark is nearly identical to a Latin semicolon; this is not a mistranslation of the character.

[2] As much as I'd like to include complete tables for all Unicode characters, if I did so, this book would be little more than that table. For complete lists of all the Unicode characters and associated glyphs, the canonical reference is The Unicode Standard, Version 2.0, by the Unicode Consortium, ISBN 0-201-48345-9. Online versions of the character tables can be found at http://unicode.org/charts/.
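The numeric relationships described in Section 14.1, and the block names mentioned in Section 14.2, can be checked on a running JVM. The following is only a minimal sketch and is not taken from the book: the class name UnicodeChecks and its helper methods are hypothetical, it uses the java.nio.charset API from later Java releases than the one the chapter describes, and it assumes (but tests for) the optional ISO-8859-7 charset. It decodes single byte values in ASCII-compatible encodings, prints the resulting Unicode values, computes the difference between the ISO 8859-7 and Unicode values of the small letter delta, and reports which Unicode block a character falls in.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of the book's examples.
final class UnicodeChecks {

    // Decodes a single byte value (0-255) in the given charset and returns
    // the numeric value of the resulting Java char.
    static int decodeByte(int value, Charset charset) {
        String s = new String(new byte[] {(byte) value}, charset);
        return s.charAt(0);
    }

    // Returns the name of the Unicode block that contains the character.
    static String blockOf(char c) {
        return String.valueOf(Character.UnicodeBlock.of(c));
    }

    public static void main(String[] args) {
        // A Java char is an unsigned 16-bit number.
        System.out.println((int) Character.MIN_VALUE + " to " + (int) Character.MAX_VALUE);

        // ASCII values carry over unchanged: the space is 32, '!' is 33.
        System.out.println((int) ' ' + " " + (int) '!');

        // Latin-1 byte 233 (e with an acute accent) keeps the value 233.
        System.out.println(decodeByte(233, StandardCharsets.ISO_8859_1));

        // ISO 8859-7 is not a charset every JVM must provide, so check first.
        if (Charset.isSupported("ISO-8859-7")) {
            Charset greek = Charset.forName("ISO-8859-7");
            int delta = decodeByte(228, greek);
            System.out.println("delta: " + delta + ", offset: " + (delta - 228));
        }

        // Block lookups for an ASCII letter, a Latin-1 letter, and a Greek letter.
        System.out.println(blockOf('A'));
        System.out.println(blockOf('\u00e9'));
        System.out.println(blockOf('\u03b4'));
    }
}
```

Computing the offset from a decoded byte, rather than hard-coding it, keeps the sketch honest: the number it prints comes from the charset tables shipped with the JVM, not from the text.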
