13.07.2015 Views

Applied XML Programming for Microsoft .NET.pdf - Csbdu.in


The NameTable Object

One of the secrets behind the XML readers' great performance is the NameTable class, a helper class that works as a quickly accessible table of string objects. Several .NET classes, including, but not limited to, XmlDocument and XmlTextReader, make internal use of a NameTable object. User applications can also use a NameTable object to store potentially duplicated strings more efficiently. When stored in a name table, a string is said to be an atomized string.

The net effect of atomized strings is that XML readers can manage element and attribute names as references rather than values and can therefore function more efficiently, especially in terms of memory occupation and speed of comparison. Comparing two object references is much faster than comparing all the characters that form a string.

The NameTable class, which inherits from the abstract class XmlNameTable, has a relatively simple programming interface and provides methods to add new items and to read them back. You add a new item to a name table using the Add method.

    NameTable table = new NameTable();
    string name = table.Add("Author");

You get the atomized string with the specified value from the table using the Get method, which returns null if the string has not been added to the table.

    string name = table.Get("Author");

XML reader classes make internal use of name tables. The reader's name table can be accessed through the NameTable property.
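
As a minimal sketch of how atomized strings might be used in practice, the following C# program adds the same name twice, checks that both calls return the same object, and then counts elements by comparing references against an atom stored in the reader's own name table. The class name AtomDemo, the method CountElements, and the sample XML are illustrative assumptions, not code from the book.

```csharp
using System;
using System.IO;
using System.Xml;

public static class AtomDemo
{
    // Counts elements with the given local name by comparing references only.
    // (Hypothetical helper written for this illustration.)
    public static int CountElements(string xml, string localName)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        try
        {
            // Atomize the name in the reader's own table before parsing.
            object atom = reader.NameTable.Add(localName);
            int count = 0;
            while (reader.Read())
            {
                // The reader returns names from its name table, so a
                // reference comparison is enough here.
                if (reader.NodeType == XmlNodeType.Element &&
                    Object.ReferenceEquals(reader.LocalName, atom))
                {
                    count++;
                }
            }
            return count;
        }
        finally
        {
            reader.Close();
        }
    }

    public static void Main()
    {
        NameTable table = new NameTable();
        string first = table.Add("Author");

        // Build an equal string at run time so that it is a distinct object.
        string copy = new string("Author".ToCharArray());
        string second = table.Add(copy);

        // Both calls return the same atom.
        Console.WriteLine(Object.ReferenceEquals(first, second));

        // Get returns null for a string that was never added.
        Console.WriteLine(table.Get("Title") == null);

        Console.WriteLine(
            CountElements("<books><author/><author/><title/></books>", "author"));
    }
}
```

The reference comparison is only safe because the name being searched for was added to the same name table the reader uses; a string atomized in a different table would have to be compared by value.
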
The reader's name table contains an atom (a reference to the string object) for each distinct element or attribute name, along with namespace information for uniqueness. If the XML document being processed contains, say, 1000 nodes with the same name, only one atomized entry is created in the name table. Don't mistake the NameTable object for a worker table in which the reader stores all the document's nodes. Instead, the NameTable object is just a worker collection of unique names stored in a way that allows for more effective storage, retrieval, and comparison.

The NameTable object is internally implemented using an array of structures that mimics a hash table. Like a hash table, the array manages strings using hash codes. When a string is added to the table, its hash code is computed and compared to those already in the array. If an equal string with that hash code already exists in the table, a reference to the existing atom is returned; otherwise, a new entry is created and the corresponding reference (atom) is returned.
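
The lookup-or-insert behavior just described can be sketched roughly as follows. This is a simplified, hypothetical atom table written for illustration; the class SimpleAtomTable, its Entry type, and the initial size of 4 are assumptions and do not reproduce the actual NameTable source. Growth by doubling follows the policy the text gives for overflow.

```csharp
using System;

// Hypothetical, simplified atom table; not the real NameTable implementation.
public class SimpleAtomTable
{
    private class Entry
    {
        public string Str;
        public int HashCode;
        public Entry Next;   // chains entries that fall into the same slot
    }

    private Entry[] buckets = new Entry[4];
    private int count;

    public int Count
    {
        get { return count; }
    }

    // Returns the existing atom for an equal string, or stores key as a new atom.
    public string Add(string key)
    {
        if (key == null) throw new ArgumentNullException("key");

        int hash = key.GetHashCode();
        int index = (hash & 0x7FFFFFFF) % buckets.Length;

        for (Entry e = buckets[index]; e != null; e = e.Next)
        {
            // Hash codes can collide, so the characters are compared as well.
            if (e.HashCode == hash && e.Str.Equals(key))
                return e.Str;
        }

        if (count == buckets.Length)
        {
            Grow();
            index = (hash & 0x7FFFFFFF) % buckets.Length;
        }

        Entry entry = new Entry();
        entry.Str = key;
        entry.HashCode = hash;
        entry.Next = buckets[index];
        buckets[index] = entry;
        count++;
        return key;
    }

    // Doubles the array and redistributes the existing entries.
    private void Grow()
    {
        Entry[] old = buckets;
        buckets = new Entry[old.Length * 2];
        foreach (Entry head in old)
        {
            Entry e = head;
            while (e != null)
            {
                Entry next = e.Next;
                int index = (e.HashCode & 0x7FFFFFFF) % buckets.Length;
                e.Next = buckets[index];
                buckets[index] = e;
                e = next;
            }
        }
    }

    public static void Main()
    {
        SimpleAtomTable table = new SimpleAtomTable();
        string a = table.Add("author");
        string b = table.Add(new string("author".ToCharArray()));
        Console.WriteLine(Object.ReferenceEquals(a, b));   // same atom returned

        foreach (string s in new string[] { "title", "price", "isbn", "year", "publisher" })
            table.Add(s);
        Console.WriteLine(table.Count);                    // six distinct names stored
    }
}
```

Storing the hash code in each entry lets most mismatches be rejected with an integer comparison before any characters are examined.
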
When the array overflows, its size is doubled.

The NameTable object uses a homemade hash table rather than the official .NET Hashtable object because the Hashtable object is not as simple and compact as required in this context.

When creating a new instance of the XmlTextReader class, you can also indicate the specific NameTable object to use.

Designing a SAX Parser with .NET Tools

As mentioned in Chapter 1, significant differences exist between .NET XML readers (a kind of cursor-like parser) and Simple API for XML (SAX) parsers. All of these differences can be traced, directly or indirectly, to the differences between the push model, which is typical of SAX, and the pull model on which readers are based.

A SAX parser takes full control over the parsing process, extracts each predefined piece of XML code, duplicates it into local buffers, and finally pushes that data down to the calling application. The interaction between the parser and the application takes place through application-defined classes that, in turn, implement SAX-defined interfaces.
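
To make the push/pull contrast concrete, here is a rough sketch of a push-style layer built on top of a pull-model reader. The interface ISaxLikeHandler, the PushParser class, and the TraceHandler class are hypothetical names invented for this illustration; they are loosely modeled on the idea of SAX content handlers and are not the SAX interfaces themselves, nor necessarily the design the book goes on to develop.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical handler interface that the application implements.
public interface ISaxLikeHandler
{
    void StartElement(string name);
    void EndElement(string name);
    void Characters(string text);
}

// Drives the pull-model reader and pushes events to the handler.
public class PushParser
{
    public static void Parse(TextReader input, ISaxLikeHandler handler)
    {
        XmlTextReader reader = new XmlTextReader(input);
        try
        {
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                        handler.StartElement(reader.Name);
                        // An empty element such as <c/> produces no EndElement
                        // node, so the end event is raised here.
                        if (reader.IsEmptyElement)
                            handler.EndElement(reader.Name);
                        break;
                    case XmlNodeType.EndElement:
                        handler.EndElement(reader.Name);
                        break;
                    case XmlNodeType.Text:
                    case XmlNodeType.CDATA:
                        handler.Characters(reader.Value);
                        break;
                }
            }
        }
        finally
        {
            reader.Close();
        }
    }
}

// Sample application-defined handler that records the events it receives.
public class TraceHandler : ISaxLikeHandler
{
    public readonly StringBuilder Log = new StringBuilder();

    public void StartElement(string name) { Log.Append("<" + name + ">"); }
    public void EndElement(string name) { Log.Append("</" + name + ">"); }
    public void Characters(string text) { Log.Append(text); }

    public static void Main()
    {
        TraceHandler handler = new TraceHandler();
        PushParser.Parse(new StringReader("<a><b>hi</b><c/></a>"), handler);
        Console.WriteLine(handler.Log.ToString());
    }
}
```

In this arrangement the application never calls Read itself: the parser owns the loop and the application only reacts to callbacks, which is the essence of the push model.
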
