13.07.2015 Views

C# in Depth

C# in Depth

C# in Depth

SHOW MORE
SHOW LESS
  • No tags were found...

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

<strong>C#</strong> <strong>in</strong> <strong>Depth</strong>Licensed to Rhona Hadida


Licensed to Rhona Hadida


<strong>C#</strong> <strong>in</strong> <strong>Depth</strong>JON SKEETMANNINGGreenwich(74° w. long.)Licensed to Rhona Hadida


For family, friends, colleagues,and all those who love <strong>C#</strong>Licensed to Rhona Hadida


Licensed to Rhona Hadida


ief contentsPART 1 PREPARING FOR THE JOURNEY ......................................11 ■ The chang<strong>in</strong>g face of <strong>C#</strong> development 32 ■ Core foundations: build<strong>in</strong>g on <strong>C#</strong> 1 32PART 2 <strong>C#</strong> 2: SOLVING THE ISSUES OF <strong>C#</strong> 1........................... 613 ■ Parameterized typ<strong>in</strong>g with generics 634 ■ Say<strong>in</strong>g noth<strong>in</strong>g with nullable types 1125 ■ Fast-tracked delegates 1376 ■ Implement<strong>in</strong>g iterators the easy way 1617 ■ Conclud<strong>in</strong>g <strong>C#</strong> 2: the f<strong>in</strong>al features 183PART 3 <strong>C#</strong> 3—REVOLUTIONIZING HOW WE CODE ..................2058 ■ Cutt<strong>in</strong>g fluff with a smart compiler 2079 ■ Lambda expressions and expression trees 23010 ■ Extension methods 25511 ■ Query expressions and LINQ to Objects 27512 ■ LINQ beyond collections 31413 ■ Elegant code <strong>in</strong> the new era 352viiLicensed to Rhona Hadida


Licensed to Rhona Hadida


contentsforeword xviipreface xixacknowledgments xxiabout this book xxiiiabout the cover illustration xxviiicomments from the tech review xxixPART 1 PREPARING FOR THE JOURNEY ...........................11The chang<strong>in</strong>g face of <strong>C#</strong> development 31.1 Evolution <strong>in</strong> action: examples of code change 4Def<strong>in</strong><strong>in</strong>g the Product type 5 ■ Sort<strong>in</strong>g products byname 8 ■ Query<strong>in</strong>g collections 11 ■ Represent<strong>in</strong>gan unknown price 13 ■ LINQ and query expressions 141.2 A brief history of <strong>C#</strong> (and related technologies) 18The world before <strong>C#</strong> 18 ■ <strong>C#</strong> and .NET are born 19M<strong>in</strong>or updates with .NET 1.1 and the first major step:.NET 2.0 20 ■ “Next generation” products 21Historical perspective and the fight for developersupport 22ixLicensed to Rhona Hadida


xCONTENTS1.3 The .NET platform 24Dist<strong>in</strong>guish<strong>in</strong>g between language, runtime, andlibraries 25 ■ Untangl<strong>in</strong>g version number chaos 261.4 Fully functional code <strong>in</strong> snippet form 28Snippets and their expansions 28 ■ Introduc<strong>in</strong>g Snippy 301.5 Summary 312Core foundations: build<strong>in</strong>g on <strong>C#</strong> 1 322.1 Delegates 33A recipe for simple delegates 34 ■ Comb<strong>in</strong><strong>in</strong>g and remov<strong>in</strong>gdelegates 38 ■ A brief diversion <strong>in</strong>to events 40 ■ Summaryof delegates 412.2 Type system characteristics 42<strong>C#</strong>’s place <strong>in</strong> the world of type systems 42 ■ When is<strong>C#</strong> 1’s type system not rich enough? 45 ■ When does <strong>C#</strong> 1’stype system get <strong>in</strong> the way? 47 ■ Summary of type systemcharacteristics 482.3 Value types and reference types 48Values and references <strong>in</strong> the real world 49 ■ Value and referencetype fundamentals 50 ■ Dispell<strong>in</strong>g myths 51 ■ Box<strong>in</strong>g andunbox<strong>in</strong>g 53 ■ Summary of value types and reference types 542.4 <strong>C#</strong> 2 and 3: new features on a solid base 54Features related to delegates 54 ■ Features related to the typesystem 56 ■ Features related to value types 582.5 Summary 59PART 2 <strong>C#</strong> 2: SOLVING THE ISSUES OF <strong>C#</strong> 1 ............... 613Parameterized typ<strong>in</strong>g with generics 633.1 Why generics are necessary 643.2 Simple generics for everyday use 65Learn<strong>in</strong>g by example: a generic dictionary 66 ■ Generictypes and type parameters 67 ■ Generic methods andread<strong>in</strong>g generic declarations 713.3 Beyond the basics 74Type constra<strong>in</strong>ts 75 ■ Type <strong>in</strong>ference for type argumentsof generic methods 79 ■ Implement<strong>in</strong>g generics 81Licensed to Rhona Hadida


CONTENTSxi3.4 Advanced generics 85Static fields and static constructors 86 ■ How the JIT compilerhandles generics 88 ■ Generic iteration 90 ■ Reflection andgenerics 923.5 Generic collection classes <strong>in</strong> .NET 2.0 96List 96 ■ Dictionary 99Queue and Stack 100 ■ SortedList and SortedDictionary 101L<strong>in</strong>kedList 1013.6 Limitations of generics <strong>in</strong> <strong>C#</strong> and other languages 102Lack of covariance and contravariance 103 ■ Lackof operator constra<strong>in</strong>ts or a “numeric” constra<strong>in</strong>t 106Lack of generic properties, <strong>in</strong>dexers, and other membertypes 108 ■ Comparison with C++ templates 108Comparison with Java generics 1103.7 Summary 11145Say<strong>in</strong>g noth<strong>in</strong>g with nullable types 1124.1 What do you do when you just don’t have a value? 113Why value type variables can’t be null 113 ■ Patternsfor represent<strong>in</strong>g null values <strong>in</strong> <strong>C#</strong> 1 1144.2 System.Nullable and System.Nullable 115Introduc<strong>in</strong>g Nullable 116 ■ Box<strong>in</strong>g andunbox<strong>in</strong>g 118 ■ Equality of Nullable<strong>in</strong>stances 119 ■ Support from the nongenericNullable class 1194.3 <strong>C#</strong> 2’s syntactic sugar for nullable types 120The ? modifier 121 ■ Assign<strong>in</strong>g and compar<strong>in</strong>g withnull 122 ■ Nullable conversions and operators 124Nullable logic 127 ■ The null coalesc<strong>in</strong>g operator 1284.4 Novel uses of nullable types 131Try<strong>in</strong>g an operation without us<strong>in</strong>g outputparameters 131 ■ Pa<strong>in</strong>less comparisons with the nullcoalesc<strong>in</strong>g operator 133 ■ Summary 136Fast-tracked delegates 1375.1 Say<strong>in</strong>g goodbye to awkward delegate syntax 1385.2 Method group conversions 1405.3 Covariance and contravariance 141Licensed to Rhona Hadida


xiiCONTENTS5.4 Inl<strong>in</strong>e delegate actions with anonymous methods 144Start<strong>in</strong>g simply: act<strong>in</strong>g on a parameter 145 ■ Return<strong>in</strong>gvalues from anonymous methods 147 ■ Ignor<strong>in</strong>g delegateparameters 1495.5 Captur<strong>in</strong>g variables <strong>in</strong> anonymous methods 150Def<strong>in</strong><strong>in</strong>g closures and different types of variables 151Exam<strong>in</strong><strong>in</strong>g the behavior of captured variables 152 ■ What’s thepo<strong>in</strong>t of captured variables? 153 ■ The extended lifetime ofcaptured variables 154 ■ Local variable <strong>in</strong>stantiations 155Mixtures of shared and dist<strong>in</strong>ct variables 157 ■ Capturedvariable guidel<strong>in</strong>es and summary 1585.6 Summary 16067Implement<strong>in</strong>g iterators the easy way 1616.1 <strong>C#</strong> 1: the pa<strong>in</strong> of handwritten iterators 1626.2 <strong>C#</strong> 2: simple iterators with yield statements 165Introduc<strong>in</strong>g iterator blocks and yield return 165 ■ Visualiz<strong>in</strong>gan iterator’s workflow 167 ■ Advanced iterator executionflow 169 ■ Quirks <strong>in</strong> the implementation 1726.3 Real-life example: iterat<strong>in</strong>g over ranges 173Iterat<strong>in</strong>g over the dates <strong>in</strong> a timetable 173 ■ Scop<strong>in</strong>g the Rangeclass 174 ■ Implementation us<strong>in</strong>g iterator blocks 1756.4 Pseudo-synchronous code with the Concurrency andCoord<strong>in</strong>ation Runtime 1786.5 Summary 181Conclud<strong>in</strong>g <strong>C#</strong> 2: the f<strong>in</strong>al features 1837.1 Partial types 184Creat<strong>in</strong>g a type with multiple files 185 ■ Uses of partialtypes 186 ■ Partial methods—<strong>C#</strong> 3 only! 1887.2 Static classes 1907.3 Separate getter/setter property access 1927.4 Namespace aliases 193Qualify<strong>in</strong>g namespace aliases 194 ■ The global namespacealias 195 ■ Extern aliases 1967.5 Pragma directives 197Warn<strong>in</strong>g pragmas 197 ■ Checksum pragmas 1987.6 Fixed-size buffers <strong>in</strong> unsafe code 199Licensed to Rhona Hadida


CONTENTSxiii7.7 Expos<strong>in</strong>g <strong>in</strong>ternal members to selected assemblies 201Friend assemblies <strong>in</strong> the simple case 201 ■ Why useInternalsVisibleTo? 202 ■ InternalsVisibleTo and signedassemblies 2037.8 Summary 204PART 3 <strong>C#</strong> 3—REVOLUTIONIZING HOW WE CODE .......20589Cutt<strong>in</strong>g fluff with a smart compiler 2078.1 Automatically implemented properties 2088.2 Implicit typ<strong>in</strong>g of local variables 210Us<strong>in</strong>g var to declare a local variable 211 ■ Restrictionson implicit typ<strong>in</strong>g 212 ■ Pros and cons of implicittyp<strong>in</strong>g 213 ■ Recommendations 2148.3 Simplified <strong>in</strong>itialization 215Def<strong>in</strong><strong>in</strong>g our sample types 215 ■ Sett<strong>in</strong>g simple properties 216Sett<strong>in</strong>g properties on embedded objects 218 ■ Collection<strong>in</strong>itializers 218 ■ Uses of <strong>in</strong>itialization features 2218.4 Implicitly typed arrays 2238.5 Anonymous types 224First encounters of the anonymous k<strong>in</strong>d 224 ■ Membersof anonymous types 226 ■ Projection <strong>in</strong>itializers 226What’s the po<strong>in</strong>t? 2278.6 Summary 228Lambda expressions and expression trees 2309.1 Lambda expressions as delegates 232Prelim<strong>in</strong>aries: <strong>in</strong>troduc<strong>in</strong>g the Func delegate types 232First transformation to a lambda expression 232 ■ Us<strong>in</strong>g as<strong>in</strong>gle expression as the body 234 ■ Implicitly typed parameterlists 234 ■ Shortcut for a s<strong>in</strong>gle parameter 2349.2 Simple examples us<strong>in</strong>g List and events 235Filter<strong>in</strong>g, sort<strong>in</strong>g, and actions on lists 236 ■ Logg<strong>in</strong>g <strong>in</strong>an event handler 2379.3 Expression trees 238Build<strong>in</strong>g expression trees programmatically 239 ■ Compil<strong>in</strong>g expressiontrees <strong>in</strong>to delegates 240 ■ Convert<strong>in</strong>g <strong>C#</strong> lambda expressions toexpression trees 241 ■ Expression trees at the heart of LINQ 244Licensed to Rhona Hadida


xivCONTENTS9.4 Changes to type <strong>in</strong>ference and overload resolution 245Reasons for change: streaml<strong>in</strong><strong>in</strong>g generic methodcalls 246 ■ Inferred return types of anonymousfunctions 247 ■ Two-phase type <strong>in</strong>ference 248Pick<strong>in</strong>g the right overloaded method 251Wrapp<strong>in</strong>g up type <strong>in</strong>ference and overload resolution 2539.5 Summary 2531011Extension methods 25510.1 Life before extension methods 25610.2 Extension method syntax 258Declar<strong>in</strong>g extension methods 258 ■ Call<strong>in</strong>gextension methods 259 ■ How extension methodsare found 261 ■ Call<strong>in</strong>g a method on a nullreference 26210.3 Extension methods <strong>in</strong> .NET 3.5 263First steps with Enumerable 263 ■ Filter<strong>in</strong>g withWhere, and cha<strong>in</strong><strong>in</strong>g method calls together 265 ■ Projectionsus<strong>in</strong>g the Select method and anonymous types 266 ■ Sort<strong>in</strong>gus<strong>in</strong>g the OrderBy method 267 ■ Bus<strong>in</strong>ess examples<strong>in</strong>volv<strong>in</strong>g cha<strong>in</strong><strong>in</strong>g 26910.4 Usage ideas and guidel<strong>in</strong>es 270“Extend<strong>in</strong>g the world” and mak<strong>in</strong>g <strong>in</strong>terfaces richer 270Fluent <strong>in</strong>terfaces 271 ■ Us<strong>in</strong>g extension methodssensibly 27210.5 Summary 274Query expressions and LINQ to Objects 27511.1 Introduc<strong>in</strong>g LINQ 276What’s <strong>in</strong> a name? 276 ■ Fundamental concepts <strong>in</strong>LINQ 277 ■ Def<strong>in</strong><strong>in</strong>g the sample data model 28211.2 Simple beg<strong>in</strong>n<strong>in</strong>gs: select<strong>in</strong>g elements 283Start<strong>in</strong>g with a source and end<strong>in</strong>g with a selection 283Compiler translations as the basis of query expressions 284Range variables and nontrivial projections 287Cast, OfType, and explicitly typed range variables 28911.3 Filter<strong>in</strong>g and order<strong>in</strong>g a sequence 290Filter<strong>in</strong>g us<strong>in</strong>g a where clause 290 ■ Degenerate queryexpressions 291 ■ Order<strong>in</strong>g us<strong>in</strong>g an orderby clause 292Licensed to Rhona Hadida


CONTENTSxv11.4 Let clauses and transparent identifiers 294Introduc<strong>in</strong>g an <strong>in</strong>termediate computation with let 295Transparent identifiers 29611.5 Jo<strong>in</strong>s 297Inner jo<strong>in</strong>s us<strong>in</strong>g jo<strong>in</strong> clauses 297 ■ Group jo<strong>in</strong>s withjo<strong>in</strong> … <strong>in</strong>to clauses 301 ■ Cross jo<strong>in</strong>s us<strong>in</strong>g multiplefrom clauses 30311.6 Group<strong>in</strong>gs and cont<strong>in</strong>uations 307Group<strong>in</strong>g with the group … by clause 307 ■ Querycont<strong>in</strong>uations 31011.7 Summary 3131213LINQ beyond collections 31412.1 LINQ to SQL 315Creat<strong>in</strong>g the defect database and entities 315 ■ Populat<strong>in</strong>g thedatabase with sample data 318 ■ Access<strong>in</strong>g the database withquery expressions 319 ■ Updat<strong>in</strong>g the database 324 ■ LINQ toSQL summary 32512.2 Translations us<strong>in</strong>g IQueryable and IQueryProvider 326Introduc<strong>in</strong>g IQueryable and related <strong>in</strong>terfaces 326 ■ Fak<strong>in</strong>git: <strong>in</strong>terface implementations to log calls 328 ■ Glu<strong>in</strong>g expressionstogether: the Queryable extension methods 330 ■ The fake queryprovider <strong>in</strong> action 332 ■ Wrapp<strong>in</strong>g up IQueryable 33312.3 LINQ to DataSet 334Work<strong>in</strong>g with untyped datasets 334 ■ Work<strong>in</strong>g with typeddatasets 33512.4 LINQ to XML 338XElement and XAttribute 338 ■ Convert<strong>in</strong>g sample defect data<strong>in</strong>to XML 340 ■ Queries <strong>in</strong> LINQ to XML 341 ■ LINQ toXML summary 34312.5 LINQ beyond .NET 3.5 344Third-party LINQ 344 ■ Future Microsoft LINQ technologies 34812.6 Summary 350Elegant code <strong>in</strong> the new era 35213.1 The chang<strong>in</strong>g nature of language preferences 353A more functional emphasis 353 ■ Static, dynamic, implicit,explicit, or a mixture? 354Licensed to Rhona Hadida


xviCONTENTS13.2 Delegation as the new <strong>in</strong>heritance 35513.3 Readability of results over implementation 35613.4 Life <strong>in</strong> a parallel universe 35713.5 Farewell 358appendix LINQ standard query operators 359<strong>in</strong>dex 371Licensed to Rhona Hadida


forewordThere are two k<strong>in</strong>ds of pianists.There are some pianists who play not because they enjoy it, but because their parentsforce them to take lessons. Then there are those who play the piano because itpleases them to create music. They don’t need to be forced; on the contrary, theysometimes don’t know when to stop.Of the latter k<strong>in</strong>d, there are some who play the piano as a hobby. Then there arethose who play for a liv<strong>in</strong>g. That requires a whole new level of dedication, skill, and talent.They may have some degree of freedom about what genre of music they play andthe stylistic choices they make <strong>in</strong> play<strong>in</strong>g it, but fundamentally those choices aredriven by the needs of the employer or the tastes of the audience.Of the latter k<strong>in</strong>d, there are some who do it primarily for the money. Then thereare those professionals who would want to play the piano <strong>in</strong> public even if they weren’tbe<strong>in</strong>g paid. They enjoy us<strong>in</strong>g their skills and talents to make music for others. Thatthey can have fun and get paid for it is so much the better.Of the latter k<strong>in</strong>d, there are some who are self-taught, who “play by ear,” whomight have great talent and ability but cannot communicate that <strong>in</strong>tuitive understand<strong>in</strong>gto others except through the music itself. Then there are those who have formaltra<strong>in</strong><strong>in</strong>g <strong>in</strong> both theory and practice. They can expla<strong>in</strong> what techniques the composerused to achieve the <strong>in</strong>tended emotional effect, and use that knowledge to shape their<strong>in</strong>terpretation of the piece.Of the latter k<strong>in</strong>d, there are some who have never looked <strong>in</strong>side their pianos. Thenthere are those who are fasc<strong>in</strong>ated by the clever escapements that lift the damper feltsa fraction of a second before the hammers strike the str<strong>in</strong>gs. They own key levelersand capstan wrenches. They take delight and pride <strong>in</strong> be<strong>in</strong>g able to understand themechanisms of an <strong>in</strong>strument that has five to ten thousand mov<strong>in</strong>g parts.xviiLicensed to Rhona Hadida


xviiiFOREWORDOf the latter k<strong>in</strong>d, there are some who are content to master their craft and exercisetheir talents for the pleasure and profit it br<strong>in</strong>gs. Then there are those who arenot just artists, theorists, and technicians; somehow they f<strong>in</strong>d the time to pass thatknowledge on to others as mentors.I have no idea if Jon Skeet is any k<strong>in</strong>d of pianist. But from my email conversationswith him as one of the <strong>C#</strong> team’s Most Valuable Professionals over the years, fromread<strong>in</strong>g his blog and from read<strong>in</strong>g every word of this book at least three times, it hasbecome clear to me that Jon is that latter k<strong>in</strong>d of software developer: enthusiastic,knowledgeable, talented, curious and analytical—a teacher of others.<strong>C#</strong> is a highly pragmatic and rapidly evolv<strong>in</strong>g language. Through the addition ofquery comprehensions, richer type <strong>in</strong>ference, a compact syntax for anonymous functions,and so on, I hope that we have enabled a whole new style of programm<strong>in</strong>g whilestill stay<strong>in</strong>g true to the statically typed, component-oriented approach that has made<strong>C#</strong> a success.Many of these new stylistic elements have the paradoxical quality of feel<strong>in</strong>g veryold (lambda expressions go back to the foundations of computer science <strong>in</strong> the firsthalf of the twentieth century) and yet at the same time feel<strong>in</strong>g new and unfamiliar todevelopers used to a more modern object-oriented approach.Jon gets all that. This book is ideal for professional developers who have a need tounderstand the “what” and “how” of the latest revision to <strong>C#</strong>. But it is also for thosedevelopers whose understand<strong>in</strong>g is enriched by explor<strong>in</strong>g the “why” of the language’sdesign pr<strong>in</strong>ciples.Be<strong>in</strong>g able to take advantage of all that new power will require new ways of th<strong>in</strong>k<strong>in</strong>gabout data, functions, and the relationship between them. It’s not unlike try<strong>in</strong>g toplay jazz after years of classical tra<strong>in</strong><strong>in</strong>g—or vice versa. Either way, I am look<strong>in</strong>g forwardto f<strong>in</strong>d<strong>in</strong>g out what sorts of functional compositions the next generation of <strong>C#</strong>programmers come up with. Happy compos<strong>in</strong>g, and thanks for choos<strong>in</strong>g the key of<strong>C#</strong> to do it <strong>in</strong>.ERIC LIPPERTSenior Software Eng<strong>in</strong>eer, MicrosoftLicensed to Rhona Hadida


prefaceI have a sneak<strong>in</strong>g suspicion that many authors have pretty much stumbled <strong>in</strong>to writ<strong>in</strong>gbooks. That’s certa<strong>in</strong>ly true <strong>in</strong> my case. I’ve been writ<strong>in</strong>g about Java and <strong>C#</strong> on theWeb and <strong>in</strong> newsgroups for a long time, but the leap from that to the pr<strong>in</strong>ted page isquite a large one. From my perspective, it’s been an “anti-Lemony Snicket”—a seriesof fortunate events.I’ve been review<strong>in</strong>g books for various publishers, <strong>in</strong>clud<strong>in</strong>g Mann<strong>in</strong>g, for a while. InApril 2006 I asked whether it would be OK to write a blog entry on a book that lookedparticularly promis<strong>in</strong>g: PowerShell <strong>in</strong> Action. In the course of the ensu<strong>in</strong>g conversation,I somehow managed to end up on the author team for Groovy <strong>in</strong> Action. I owe a hugedebt of thanks to my wife for even allow<strong>in</strong>g me to agree to this—which makes hersound like a control freak until you understand we were expect<strong>in</strong>g tw<strong>in</strong>s at the time,and she had just gone <strong>in</strong>to the hospital. It wasn’t an ideal time to take on extra work,but Holly was as supportive as she’s always been.Contribut<strong>in</strong>g to the Groovy book took a lot of hard work, but the writ<strong>in</strong>g bugfirmly hit me dur<strong>in</strong>g the process. When talk<strong>in</strong>g with the pr<strong>in</strong>cipal author, DierkKönig, I realized that I wanted to take on that role myself one day. So, when I heardlater that Mann<strong>in</strong>g was <strong>in</strong>terested <strong>in</strong> publish<strong>in</strong>g a book about <strong>C#</strong>3, I started writ<strong>in</strong>g aproposal right away.My relationship with <strong>C#</strong> itself goes further back. I started us<strong>in</strong>g it <strong>in</strong> 2002, andhave kept up with it ever s<strong>in</strong>ce. I haven’t been us<strong>in</strong>g it professionally for all thattime—I’ve been flitt<strong>in</strong>g back and forth between <strong>C#</strong> and Java, depend<strong>in</strong>g on what myemployers wanted for the projects I was work<strong>in</strong>g on. However, I’ve never let my <strong>in</strong>terest<strong>in</strong> it drop, post<strong>in</strong>g on the newsgroups and develop<strong>in</strong>g code at home. Although IxixLicensed to Rhona Hadida


xxPREFACEdidn’t start us<strong>in</strong>g <strong>C#</strong>2 until Visual Studio 2005 was close to release, I’ve tracked <strong>C#</strong>3more closely.While watch<strong>in</strong>g the gradual emergence of <strong>C#</strong>3, I’ve also been watch<strong>in</strong>g the developerreaction to <strong>C#</strong>2—and I th<strong>in</strong>k people are miss<strong>in</strong>g out. The adoption rate of <strong>C#</strong>2has been quite slow for various reasons, but where it is be<strong>in</strong>g used, I believe its fullpotential isn’t be<strong>in</strong>g realized. I want to fix that.The proposal I made to Mann<strong>in</strong>g was narrow <strong>in</strong> focus, but deep <strong>in</strong> scope. My missionis a simple one: to take exist<strong>in</strong>g <strong>C#</strong>1 developers and turn them <strong>in</strong>to confidentand competent <strong>C#</strong>2 and 3 developers who understand the language at a deep level. Atthe time of this writ<strong>in</strong>g, I don’t know of any other books that have a similar explicitaim. I’m immensely grateful to Mann<strong>in</strong>g for allow<strong>in</strong>g me to write the book I reallywanted to, without <strong>in</strong>terfer<strong>in</strong>g and forc<strong>in</strong>g it down a more conventional path. At thesame time, the folks at Mann<strong>in</strong>g expertly guided the book and made it much moreuseful than it would have been otherwise.I tried to write the book that I would want to read when learn<strong>in</strong>g <strong>C#</strong>2 and 3. To thatextent, I th<strong>in</strong>k I’ve succeeded. Whether that means it’s a book that anyone else will wantto read rema<strong>in</strong>s to be seen—but if you, the reader, have even a fraction of the enjoymentwhen read<strong>in</strong>g it that I’ve had writ<strong>in</strong>g it, you’re <strong>in</strong> for a good time.Licensed to Rhona Hadida


acknowledgmentsI had a wonderful time writ<strong>in</strong>g this book, which is largely due to the skill and dedicationof the people who worked with me on it and the patience of the people who putup with me. Those who fall <strong>in</strong>to both camps are virtuous beyond words.My family has borne the brunt of the time I spent writ<strong>in</strong>g when I should have beenplay<strong>in</strong>g, sleep<strong>in</strong>g, or just be<strong>in</strong>g more sociable. My wife Holly has been more supportivethan I have any right to expect, and I’m both proud and slightly disturbed that theeyes of my sons Tom, William, and Rob<strong>in</strong> light up as soon as a laptop is opened.As well as the formal peer reviewers listed <strong>in</strong> a moment, I would like to mentionEmma Middlebrook and Douglas Leeder, both of whom read significant portions ofthe orig<strong>in</strong>al manuscript before it even reached my editor. Emma has the remarkablebad luck of hav<strong>in</strong>g had to cope not only with review<strong>in</strong>g my writ<strong>in</strong>g but also with work<strong>in</strong>gwith me on a daily basis. Douglas freed himself from that drudgery a while ago,but was still k<strong>in</strong>d enough to suggest improvements to the book without even know<strong>in</strong>g<strong>C#</strong>. You’re both dear friends, and I thank you for that even more than for your effortson the book.The folks at Mann<strong>in</strong>g have been supportive right from the start—particularlyJackie Carter’s orig<strong>in</strong>al work with me, as well as Mike Stephens’ help at a high levelthroughout the process and Jeff Bleiel’s constant support and diligent edit<strong>in</strong>g. LizWelch performed the near-impossible trick of mak<strong>in</strong>g the copyedit<strong>in</strong>g process fun forboth of us, and Karen Tegtmeyer kept me relatively sane dur<strong>in</strong>g nail-bit<strong>in</strong>g peerreviews. These are only the people I got to know reasonably well dur<strong>in</strong>g the last year;many other people were <strong>in</strong>volved <strong>in</strong> the project, do<strong>in</strong>g marvelous th<strong>in</strong>gs: there’s theproduction team of Gordan Sal<strong>in</strong>ovic, Dottie Marsico, Tiffany Taylor, Katie Tennant,and Mary Piergies; cover designer Leslie Haimes; webmaster Gabriel Dobrescu, market<strong>in</strong>gdirector Ron Tomich, and last but not least, publisher Marjan Bace.xxiLicensed to Rhona Hadida


xxiiACKNOWLEDGMENTSThe aforementioned peer reviewers are a picky bunch. There’s no doubt <strong>in</strong> mym<strong>in</strong>d that my life would have been easier without them—it’s really you, the reader,who should thank them for a much better end result <strong>in</strong> the book that you are nowread<strong>in</strong>g. They pushed, prodded, pulled, commented, questioned, and generally madethorough nuisances of themselves—which is exactly what they were asked to do—asthey read the manuscript <strong>in</strong> its many iterations. So on your behalf I thank DaveCorun, Anil Radhakrishna, Riccardo Audano, Will Morse, Christopher Haupt, Rob<strong>in</strong>Shahan, Mark Seeman, Peter A. Bromberg, Fabio Angius, Massimo Perga, Chris Mull<strong>in</strong>s,JD Conley, Marc Gravell, Ross Bradbury, Andrew Seimer, Alex Thissen, Keith J.Farmer, Fabrice Marguerie, Josh Cronemeyer, and Anthony Jones.I’ve been very fortunate over the years to make acqua<strong>in</strong>tances (both onl<strong>in</strong>e and <strong>in</strong>person) with many people who are far smarter than I am, and I consulted some ofthem for pieces of <strong>in</strong>formation <strong>in</strong> this book. For keep<strong>in</strong>g my facts straight, I’d like tothank Jon Jagger, Nigel Perry, Prashant Sridharan, Dan Fernandez, Tony Goodhew,Simon Tatham, Ben Voigt, and George Chrysanthakopoulos.F<strong>in</strong>ally, I need to thank the <strong>C#</strong> team. As well as creat<strong>in</strong>g such a fantastic languageto write about, they were <strong>in</strong>credibly responsive to my questions. I don’t know what’scom<strong>in</strong>g <strong>in</strong> <strong>C#</strong>4 (beyond the h<strong>in</strong>ts on blogs about possible features) or even a veryrough timescale for it, but I wish the team the very best of luck.I’d like to pay special tribute to Eric Lippert, first for be<strong>in</strong>g will<strong>in</strong>g to enter <strong>in</strong>todetailed conversations about all k<strong>in</strong>ds of topics while I was writ<strong>in</strong>g the <strong>in</strong>itial manuscript,and then for act<strong>in</strong>g as the technical reviewer for the book, and f<strong>in</strong>ally for agree<strong>in</strong>gto write the foreword. I couldn’t have asked for a more knowledgable andthorough reviewer—Eric is about as close to be<strong>in</strong>g the <strong>C#</strong> specification <strong>in</strong> humanform as you can get! He’s also a great source of trivia and recommendations on cod<strong>in</strong>gstyle. Many of the comments Eric made when review<strong>in</strong>g the book didn’t impactthe text directly, but are pure gold nonetheless. Rather than keep<strong>in</strong>g them to myself,I’ve <strong>in</strong>corporated them <strong>in</strong>to the notes on the book’s website. Th<strong>in</strong>k of them as the literaryequivalent of a DVD commentary track.Licensed to Rhona Hadida


about this bookThis is a book about <strong>C#</strong>2 and 3—it’s as simple as that. I barely cover <strong>C#</strong>1, and only coverthe .NET Framework libraries and Common Language Runtime (CLR) when they’rerelated to the language. This is a deliberate decision, and the result is quite a differentbook from most of the <strong>C#</strong> and .NET books I’ve seen.By assum<strong>in</strong>g a reasonable amount of knowledge of <strong>C#</strong>1, I avoided spend<strong>in</strong>g hundredsof pages cover<strong>in</strong>g material that I th<strong>in</strong>k most people already understand. Thatgave me much more room to expand on the details of <strong>C#</strong>2 and 3, which is what I hopeyou’re read<strong>in</strong>g the book for.I believe that many developers would be less frustrated with their work if they hada deeper connection with the language they’re writ<strong>in</strong>g <strong>in</strong>. I know it sounds geeky <strong>in</strong> theextreme to talk about hav<strong>in</strong>g a “relationship” with a programm<strong>in</strong>g language, but that’sthe best way I can describe it. This book is my attempt to help you achieve that sort ofunderstand<strong>in</strong>g, or deepen it further. It won’t be enough on its own—it should be acompanion to your cod<strong>in</strong>g, guid<strong>in</strong>g you and suggest<strong>in</strong>g some <strong>in</strong>terest<strong>in</strong>g avenues toexplore, as well as expla<strong>in</strong><strong>in</strong>g why your code behaves the way it does.Who should read this book?Dur<strong>in</strong>g the course of the multiple rounds of review<strong>in</strong>g this book underwent as I waswrit<strong>in</strong>g it, one comment worried me more than most: “This is a book for <strong>C#</strong> experts.”That was never the <strong>in</strong>tention, and I hope that (partly thanks to that honest feedback)it’s not an accurate reflection of who will get the most out of this book.I don’t particularly want to write for experts. Aside from anyth<strong>in</strong>g else, I’ve got lessto offer experts than I have “<strong>in</strong>termediate” developers. I want to write for people whoxxiiiLicensed to Rhona Hadida


xxivABOUT THIS BOOKwant to become experts. That’s what this book is about. If you feel passionately aboutcomput<strong>in</strong>g, but happen not to have studied <strong>C#</strong>2 or 3 <strong>in</strong> much detail, this book isaimed squarely at you. If you want to immerse yourself <strong>in</strong> <strong>C#</strong> until it’s part of yourbloodstream, then I’d feel honored to be the one to push you under. If you feel frustratedwhen you arrive at work<strong>in</strong>g code, but don’t quite know why it works, I want tohelp you to understand.Hav<strong>in</strong>g said all that, this book is not meant for complete beg<strong>in</strong>ners. If you haven’tused <strong>C#</strong>1 before, you’ll f<strong>in</strong>d this book very hard work. That doesn’t mean it won’t beuseful to you—but please go and f<strong>in</strong>d a book (or at least a tutorial) on <strong>C#</strong>1 before yougo much further. The first chapter will tease you with the joys of <strong>C#</strong>2 and 3, but youwon’t be able to appreciate them if you’re worry<strong>in</strong>g about how variables are declaredand where the semicolons go.I’m not go<strong>in</strong>g to claim that read<strong>in</strong>g this book will make you a fabulous coder.There’s so much more to software eng<strong>in</strong>eer<strong>in</strong>g than know<strong>in</strong>g the syntax of the languageyou happen to be us<strong>in</strong>g. I give some words of guidance here and there, but ultimatelythere’s a lot more gut <strong>in</strong>st<strong>in</strong>ct <strong>in</strong> development than most of us would like toadmit. What I will claim is that if you read and understand this book, you should feelcomfortable with <strong>C#</strong>2 and 3, and free to follow your <strong>in</strong>st<strong>in</strong>cts without too much apprehension.It’s not about be<strong>in</strong>g able to write code that no one else will understandbecause it uses unknown corners of the language: it’s about be<strong>in</strong>g confident that youknow the options available to you, and know which path the <strong>C#</strong> idioms are encourag<strong>in</strong>gyou to follow.RoadmapThe book’s structure is simple. There are three parts and a s<strong>in</strong>gle appendix. The firstpart serves as an <strong>in</strong>troduction, <strong>in</strong>clud<strong>in</strong>g a refresher on topics <strong>in</strong> <strong>C#</strong>1 that are importantfor understand<strong>in</strong>g <strong>C#</strong>2 and 3, and that are often misunderstood. The second partcovers the new features <strong>in</strong> <strong>C#</strong>2. The third part covers the new features <strong>in</strong> <strong>C#</strong>3.There are occasions where organiz<strong>in</strong>g the material this way means we come backto a topic a couple of times—<strong>in</strong> particular delegates are improved <strong>in</strong> <strong>C#</strong>2 and thenaga<strong>in</strong> <strong>in</strong> <strong>C#</strong>3—but there is method <strong>in</strong> my madness. I anticipate that a number of readerswill <strong>in</strong>itially only be us<strong>in</strong>g <strong>C#</strong>2 for the bulk of their professional work, but with an<strong>in</strong>terest <strong>in</strong> learn<strong>in</strong>g <strong>C#</strong>3 for new projects and hobbies. That means that it is useful toclarify what is <strong>in</strong> which version. It also provides a feel<strong>in</strong>g of context and evolution—itshows how the language has developed over time.Chapter 1 sets the scene by tak<strong>in</strong>g a simple piece of <strong>C#</strong>1 code and evolv<strong>in</strong>g it, see<strong>in</strong>ghow <strong>C#</strong>2 and 3 allow the source to become more readable and powerful. We lookat the historical context <strong>in</strong> which <strong>C#</strong> has grown, and the technical context <strong>in</strong> which itoperates as part of a complete platform: <strong>C#</strong> as a language builds on framework librariesand a powerful runtime to turn abstraction <strong>in</strong>to reality.Chapter 2 looks back at <strong>C#</strong>1, and three specific aspects: delegates, the type systemcharacteristics, and the differences between value types and reference types. TheseLicensed to Rhona Hadida


ABOUT THIS BOOKxxvtopics are often understood “just well enough” by <strong>C#</strong>1 developers, but <strong>C#</strong>2 and 3develop them significantly, so a solid ground<strong>in</strong>g is required <strong>in</strong> order to make the mostof the new features.Chapter 3 tackles the biggest feature of <strong>C#</strong>2, and potentially the hardest to grasp:generics. Methods and types can be written generically, with type parameters stand<strong>in</strong>g<strong>in</strong> for real types which are specified <strong>in</strong> the call<strong>in</strong>g code. Initially it’s as confus<strong>in</strong>g as thisdescription makes it sound, but once you understand generics you’ll wonder how yousurvived without them.If you’ve ever wanted to represent a null <strong>in</strong>teger, chapter 4 is for you. It <strong>in</strong>troducesnullable types, a feature built on generics and tak<strong>in</strong>g advantage of support <strong>in</strong> the language,runtime, and framework.Chapter 5 shows the improvements to delegates <strong>in</strong> <strong>C#</strong>2. You may have only useddelegates for handl<strong>in</strong>g events such as button clicks before now. <strong>C#</strong>2 makes it easier tocreate delegates, and library support makes them more useful for situations otherthan events.In chapter 6 we exam<strong>in</strong>e iterators, and the easy way to implement them <strong>in</strong> <strong>C#</strong>2.Few developers use iterator blocks, but as LINQ to Objects is built on iterators, theywill become more and more important. The lazy nature of their execution is also a keypart of LINQ.Chapter 7 shows a number of smaller features <strong>in</strong>troduced <strong>in</strong> <strong>C#</strong>2, each mak<strong>in</strong>g lifea little more pleasant. The language designers have smoothed over a few rough places<strong>in</strong> <strong>C#</strong>1, allow<strong>in</strong>g more flexible <strong>in</strong>teraction with code generators, better support forutility classes, more granular access to properties, and more.Chapter 8 once aga<strong>in</strong> looks at a few relatively simple features—but this time <strong>in</strong><strong>C#</strong>3. Almost all the new syntax is geared toward the common goal of LINQ but thebuild<strong>in</strong>g blocks are also useful <strong>in</strong> their own right. With anonymous types, automaticallyimplemented properties, implicitly typed local variables, and greatly enhanced<strong>in</strong>itialization support, <strong>C#</strong>3 gives a far richer language with which your code canexpress its behavior.Chapter 9 looks at the first major topic of <strong>C#</strong>3—lambda expressions. Not contentwith the reasonably concise syntax we saw <strong>in</strong> chapter 5, the language designers havemade delegates even easier to create than <strong>in</strong> <strong>C#</strong>2. Lambdas are capable of more—theycan be converted <strong>in</strong>to expression trees: a powerful way of represent<strong>in</strong>g code as data.In chapter 10 we exam<strong>in</strong>e extension methods, which provide a way of fool<strong>in</strong>g thecompiler <strong>in</strong>to believ<strong>in</strong>g that methods declared <strong>in</strong> one type actually belong to another.At first glance this appears to be a readability nightmare, but with careful considerationit can be an extremely powerful feature—and one which is vital to LINQ.Chapter 11 comb<strong>in</strong>es the previous three chapters <strong>in</strong> the form of query expressions,a concise but powerful way of query<strong>in</strong>g data. Initially we concentrate on LINQ toObjects, but see how the query expression pattern is applied <strong>in</strong> a way which allowsother data providers to plug <strong>in</strong> seamlessly.Chapter 12 reaps the rewards of query expressions comb<strong>in</strong>ed with expressiontrees: it shows how LINQ to SQL is able to convert what appears to be normal <strong>C#</strong> <strong>in</strong>toLicensed to Rhona Hadida


xxviABOUT THIS BOOKSQL statements. We also take a speedy look at some other LINQ providers—those <strong>in</strong>the .NET Framework, some third-party ones which are gradually appear<strong>in</strong>g, and asneak peek at what Microsoft has <strong>in</strong> store.We w<strong>in</strong>d down <strong>in</strong> chapter 13 with a speculative glance at what the future mighthold. We exam<strong>in</strong>e how the changes to <strong>C#</strong> will affect the flavor and texture of the codewe write <strong>in</strong> it, as well as look<strong>in</strong>g at some future trends <strong>in</strong> technology.The appendix is a straightforward piece of reference material: the LINQ standardquery operators, with some examples. Strictly speak<strong>in</strong>g, this is not part of <strong>C#</strong>3—it’slibrary material—but I believe it’s so useful that readers will welcome its <strong>in</strong>clusion.Term<strong>in</strong>ology and typographyMost of the term<strong>in</strong>ology of the book is expla<strong>in</strong>ed as it goes along, but there are a fewdef<strong>in</strong>itions that are worth highlight<strong>in</strong>g here. I use <strong>C#</strong>1, <strong>C#</strong>2, and <strong>C#</strong>3 <strong>in</strong> a reasonablyobvious manner—but you may see other books and websites referr<strong>in</strong>g to <strong>C#</strong>1.0, <strong>C#</strong>2.0,and <strong>C#</strong>3.0. The extra “.0” seems redundant to me, which is why I’ve omitted it—I hopethe mean<strong>in</strong>g is clear.I’ve appropriated a pair of terms from a <strong>C#</strong> book by Mark Michaelis. To avoid theconfusion between “runtime” be<strong>in</strong>g an execution environment (as <strong>in</strong> “the CommonLanguage Runtime”) and a po<strong>in</strong>t <strong>in</strong> time (as <strong>in</strong> “overrid<strong>in</strong>g occurs at runtime”), Markuses “execution time” for the latter concept, usually <strong>in</strong> comparison with “compiletime.” This seems to me to be a thoroughly sensible idea, and one that I hope catcheson <strong>in</strong> the wider community. I’m do<strong>in</strong>g my bit by follow<strong>in</strong>g his example <strong>in</strong> this book.I frequently refer to “the language specification” or just “the specification”—unlessI <strong>in</strong>dicate otherwise, this means “the <strong>C#</strong> language specification.” However, multiple versionsof the specification are available, partly due to different versions of the languageitself and partly due to the standardization process. Rather than clutter up the bookwith specific “chapter and verse” references, there’s a page on the book’s website thatallows you to pick which version of the specification you’re <strong>in</strong>terested <strong>in</strong> and then seewhich part of the book refers to which area of the specification.This book conta<strong>in</strong>s numerous pieces of code, which appear <strong>in</strong> a fixed-width font;output from the list<strong>in</strong>gs appears <strong>in</strong> the same way. Code annotations accompany somelist<strong>in</strong>gs, and at other times particular sections of the code are highlighted <strong>in</strong> bold.Almost all of the code appears <strong>in</strong> “snippet” form, allow<strong>in</strong>g it to stay compact butstill runnable—with<strong>in</strong> the right environment. That environment is Snippy, a customtool that is <strong>in</strong>troduced <strong>in</strong> section 1.4. Snippy is available for download, along with allof the code from the book (<strong>in</strong> the form of snippets, full Visual Studio solutions, ormore often both) from the book’s website at www.mann<strong>in</strong>g.com/CSharpIn<strong>Depth</strong>.Author Onl<strong>in</strong>e and the <strong>C#</strong> <strong>in</strong> <strong>Depth</strong> websitePurchase of <strong>C#</strong> <strong>in</strong> <strong>Depth</strong> <strong>in</strong>cludes free access to a private web forum run by Mann<strong>in</strong>g Publicationswhere you can make comments about the book, ask technical questions, andreceive help from the author and other users. To access the forum and subscribe to it,po<strong>in</strong>t your web browser to www.mann<strong>in</strong>g.com/CSharpIn<strong>Depth</strong> or www.mann<strong>in</strong>g.com/skeet. This page provides <strong>in</strong>formation on how to get on the forum once you are registered,what k<strong>in</strong>d of help is available, and the rules of conduct on the forum.Licensed to Rhona Hadida


ABOUT THIS BOOKxxviiThe Author Onl<strong>in</strong>e forum and the archives of previous discussions will be accessiblefrom the publisher’s website as long as the book is <strong>in</strong> pr<strong>in</strong>t.In addition to Mann<strong>in</strong>g’s own website, I have set up a companion website for thebook at www.csharp<strong>in</strong>depth.com, conta<strong>in</strong><strong>in</strong>g <strong>in</strong>formation that didn’t quite fit <strong>in</strong>tothe book, as well as downloadable source code for all the list<strong>in</strong>gs <strong>in</strong> the book and furtherexamples.About the authorIn many books, you will f<strong>in</strong>d a very impressive list of bus<strong>in</strong>ess and technical achievementsaccomplished by the author(s). Sadly, I have little to boast of on that front.Microsoft has been k<strong>in</strong>d enough to award me MVP (Most Valuable Professional) statuss<strong>in</strong>ce 2003 for my “work” <strong>in</strong> the <strong>C#</strong> newsgroups, but I have to put “work” <strong>in</strong> quotes as it’sbeen such a fun ride. Beyond that, I run a modest website with some articles about <strong>C#</strong>and .NET, and a blog with some random thoughts about software development. I’m notthe CTO of a wildly successful startup. I haven’t given sell-out lecture tours across multiplecont<strong>in</strong>ents with webcasts that brought the Internet to its knees. Instead, I’ve spentmy time work<strong>in</strong>g as a developer, listen<strong>in</strong>g to the problems of other developers, and try<strong>in</strong>gto gradually learn the best way to write code and design solutions.I’d like to th<strong>in</strong>k that <strong>in</strong> some ways that makes me the right person to write a bookabout <strong>C#</strong>—because it’s what I live and breathe from day to day, and it’s what I lovehelp<strong>in</strong>g people with. I’m passionate about <strong>C#</strong> <strong>in</strong> a way which my wife has learned totolerate, and I hope that passion comes through <strong>in</strong> this book. I thought I was madabout it before I started writ<strong>in</strong>g, and my appreciation has only grown as I’ve becomemore <strong>in</strong>timately familiar with the details.I’m not so much <strong>in</strong> love with <strong>C#</strong> that I can’t see any flaws—aga<strong>in</strong>, I hope thatcomes across <strong>in</strong> my writ<strong>in</strong>g. I’ve never met a language yet that didn’t have its hiddentraps: <strong>C#</strong> is better than most <strong>in</strong> that respect, but it’s not perfect. When I see areas thathave caused problems, either for me or for other developers who have posted <strong>in</strong> newsgroupsor emailed me, I’m more than will<strong>in</strong>g to po<strong>in</strong>t them out. I hope the designerswill forgive the implied criticisms, and understand that I hold them <strong>in</strong> the highestregard for the beautiful and elegant language they created.Licensed to Rhona Hadida


about the cover illustrationThe caption on the illustration on the cover of <strong>C#</strong> <strong>in</strong> <strong>Depth</strong> is a “Musician.” The illustrationis taken from a collection of costumes of the Ottoman Empire published onJanuary 1, 1802, by William Miller of Old Bond Street, London. The title page is miss<strong>in</strong>gfrom the collection and we have been unable to track it down to date. The book’stable of contents identifies the figures <strong>in</strong> both English and French, and each illustrationbears the names of two artists who worked on it, both of whom would no doubtbe surprised to f<strong>in</strong>d their art grac<strong>in</strong>g the front cover of a computer programm<strong>in</strong>gbook...two hundred years later.The collection was purchased by a Mann<strong>in</strong>g editor at an antiquarian flea market <strong>in</strong>the “Garage” on West 26th Street <strong>in</strong> Manhattan. The seller was an American based <strong>in</strong>Ankara, Turkey, and the transaction took place just as he was pack<strong>in</strong>g up his stand forthe day. The Mann<strong>in</strong>g editor did not have on his person the substantial amount ofcash that was required for the purchase and a credit card and check were both politelyturned down. With the seller fly<strong>in</strong>g back to Ankara that even<strong>in</strong>g the situation was gett<strong>in</strong>ghopeless. What was the solution? It turned out to be noth<strong>in</strong>g more than an oldfashionedverbal agreement sealed with a handshake. The seller simply proposed thatthe money be transferred to him by wire and the editor walked out with the bank<strong>in</strong>formation on a piece of paper and the portfolio of images under his arm. Needlessto say, we transferred the funds the next day, and we rema<strong>in</strong> grateful and impressed bythis unknown person’s trust <strong>in</strong> one of us. It recalls someth<strong>in</strong>g that might have happeneda long time ago.We at Mann<strong>in</strong>g celebrate the <strong>in</strong>ventiveness, the <strong>in</strong>itiative, and, yes, the fun of thecomputer bus<strong>in</strong>ess with book covers based on the rich diversity of regional life of twocenturies ago‚ brought back to life by the pictures from this collection.xxviiiLicensed to Rhona Hadida


comments from the tech reviewTechnical proofreaders are a vital part of the book-publish<strong>in</strong>g process. They readthe f<strong>in</strong>al manuscript for technical accuracy and test all the code shortly before thebook goes to typesett<strong>in</strong>g.A good proofreader will f<strong>in</strong>d and help correct all technical errors. A reallygood one will pick up <strong>in</strong>correct implications and unhelpful nuances. A great technicalproofreader will provide guidance <strong>in</strong> terms of style as well as content, help<strong>in</strong>gto hone the manuscript and the author’s po<strong>in</strong>t of view. In Eric Lippert wefound such a person. As a member of the <strong>C#</strong> team, we knew from the start that hewould provide accurate <strong>in</strong>formation—but he was also able to give Jon’s excellentmanuscript an extra layer of polish.We always ask our technical proofreaders for feedback once they’ve completedthe review, and for this book we wanted to share some of those comments with you:This is a gem of a book, both <strong>in</strong> its details and its overall organization. Every bitof jargon from the specification is used correctly and <strong>in</strong> context; when Jon needsnew terms he makes up good ones…Where it needs to be simple it is simple, but never simplistic. The majority ofmy comments were not corrections; rather, they expanded on the history beh<strong>in</strong>da particular design decision, giv<strong>in</strong>g ideas for further explorations, and so on.Jon takes a sensible approach to present<strong>in</strong>g complex material. The bookbeg<strong>in</strong>s with an “ontogenic” approach, describ<strong>in</strong>g the evolution of the languageover time. In the section on <strong>C#</strong>3, he switches to a more “constructivist” approach,describ<strong>in</strong>g how we built more complex features (such as query comprehensions)out of more basic features (such as extension methods and lambda expressions).xxixLicensed to Rhona Hadida


xxxCOMMENTS FROM THE TECH REVIEWThis choice of book organization is particularly well-suited to high-end users,like Jon himself, who are primarily look<strong>in</strong>g to use the language, but who can doso better when they understand the parts from which it was built…If I had time to write another book, this is the k<strong>in</strong>d of book I would hope towrite. Now I don’t have to, and thank goodness for that!To see more comments and marg<strong>in</strong> notes made by Eric dur<strong>in</strong>g the technicalreview process, as well as responses from Jon, please visit the book’s web page atwww.mann<strong>in</strong>g.com/CSharpIn<strong>Depth</strong> or www.csharp<strong>in</strong>depth.com/Notes.aspx.Licensed to Rhona Hadida


Part 1Prepar<strong>in</strong>g for the journeyEvery reader will come to this book with a different set of expectations anda different level of experience. Are you an expert look<strong>in</strong>g to fill some holes, howeversmall, <strong>in</strong> your present knowledge? Perhaps you consider yourself an “average”developer, beg<strong>in</strong>n<strong>in</strong>g to migrate projects to .NET 2.0 but with an eye to the future.Maybe you’re reasonably confident with <strong>C#</strong> 2 but have no <strong>C#</strong> 3 experience.As an author, I can’t make every reader the same—and I wouldn’t want toeven if I could. I hope that all readers have two th<strong>in</strong>gs <strong>in</strong> common, however: thedesire for a deeper relationship with <strong>C#</strong> as a language, and at least a basic knowledgeof <strong>C#</strong> 1. If you can br<strong>in</strong>g those elements to the party, I’ll provide the rest.The potentially huge range of skill levels is the ma<strong>in</strong> reason for this partof the book exist<strong>in</strong>g. You may already know what to expect from <strong>C#</strong> 2 and 3—or it could all be brand new to you. You could have a rock-solid understand<strong>in</strong>gof <strong>C#</strong> 1, or you might be rusty on some of the details that didn’t matter muchbefore but that will become <strong>in</strong>creas<strong>in</strong>gly important as you learn <strong>C#</strong> 2 andthen 3. By the end of part 1, I won’t have leveled the play<strong>in</strong>g field entirely, butyou should be able to approach the rest of the book with confidence and anidea of what’s com<strong>in</strong>g later.For the first two chapters, we will be look<strong>in</strong>g both forward and back. One ofthe key themes of the book is evolution. Before <strong>in</strong>troduc<strong>in</strong>g any feature <strong>in</strong>to thelanguage, the design team carefully considers that feature <strong>in</strong> the context ofwhat’s already present and the general aims of the future. This br<strong>in</strong>gs a feel<strong>in</strong>gof consistency to the language even <strong>in</strong> the midst of change. To understand howand why the language is evolv<strong>in</strong>g, we need to see where we’ve come from andwhere we’re go<strong>in</strong>g to.Licensed to Rhona Hadida


Chapter 1 presents a bird’s-eye view of the rest of the book, tak<strong>in</strong>g a brief look atsome of the biggest features of both <strong>C#</strong> 2 and <strong>C#</strong> 3 and show<strong>in</strong>g a progression of codefrom <strong>C#</strong> 1 onward. To br<strong>in</strong>g more perspective and context to the new features, we’ll alsotake a look at nearly 12 years of development history, from the first release of Java <strong>in</strong> January1996 to the birth of <strong>C#</strong> 3 and .NET 3.5 <strong>in</strong> November 2007.Chapter 2 is heavily focused on <strong>C#</strong> 1. If you’re an expert <strong>in</strong> <strong>C#</strong> 1 you can skip thischapter, but it does tackle some of the areas of <strong>C#</strong> 1 that tend to be misunderstood.Rather than try to expla<strong>in</strong> the whole of the language, the chapter concentrates on featuresthat are fundamental to the later versions of <strong>C#</strong>. From this solid base, we canmove on and look at <strong>C#</strong> 2 <strong>in</strong> part 2 of this book.Licensed to Rhona Hadida


The chang<strong>in</strong>g faceof <strong>C#</strong> developmentThis chapter covers■■■■An evolv<strong>in</strong>g example<strong>C#</strong>’s historical contextThe composition of .NETSnippy, the snippet compilerThe world is chang<strong>in</strong>g at a pace that is sometimes terrify<strong>in</strong>g, and technology is oneof the fastest-mov<strong>in</strong>g areas of that change. Comput<strong>in</strong>g <strong>in</strong> particular seems to pushitself constantly, both <strong>in</strong> hardware and <strong>in</strong> software. Although many older computerlanguages are like bedrocks, rarely chang<strong>in</strong>g beyond be<strong>in</strong>g consolidated <strong>in</strong> terms ofstandardization, newer ones are still evolv<strong>in</strong>g. <strong>C#</strong> falls <strong>in</strong>to the latter category, andthe implications of this are double-edged. On the one hand, there’s always more tolearn—the feel<strong>in</strong>g of hav<strong>in</strong>g mastered the language is unlikely to last for long, witha “V next” always loom<strong>in</strong>g. However, the upside is that if you embrace the new featuresand you’re will<strong>in</strong>g to change your cod<strong>in</strong>g style to adopt the new idioms, you’lldiscover a more expressive, powerful way of develop<strong>in</strong>g software.3Licensed to Rhona Hadida


4 CHAPTER 1 The chang<strong>in</strong>g face of <strong>C#</strong> developmentTo get the most out of any new language feature, you need to understand it thoroughly.That’s the po<strong>in</strong>t of this book—to delve <strong>in</strong>to the very heart of <strong>C#</strong>, so you understandit at a deep level rather than just enough to get by. Without wish<strong>in</strong>g to soundmelodramatic or overly emotional, I want to put you <strong>in</strong> harmony with the language.If you’re anxious to get cod<strong>in</strong>g straight away, and if you’re confident <strong>in</strong> your understand<strong>in</strong>gof <strong>C#</strong> 1, feel free to skip to part 2 and dive <strong>in</strong>. However, there’s always moreto cod<strong>in</strong>g than just the technicalities, and <strong>in</strong> this part I will be provid<strong>in</strong>g backgroundto the bigger picture: the reasons why both the <strong>C#</strong> language and the .NET Frameworkhave changed <strong>in</strong> the ways that they have.In this chapter, we’ll have a sneak peek at a few of the features the rest of the bookwill cover. We’ll see that while <strong>C#</strong> 2 fixed a lot of the issues people encountered whenus<strong>in</strong>g <strong>C#</strong> 1, the ideas <strong>in</strong> <strong>C#</strong> 3 could significantly change the way we write and even th<strong>in</strong>kabout code. I’ll put the changes <strong>in</strong>to historical context, guide you through the maze ofterm<strong>in</strong>ology and version numbers, then talk about how the book is presented <strong>in</strong> orderto help you get as much out of it as possible. Let’s start by look<strong>in</strong>g at how some codemight evolve over time, tak<strong>in</strong>g advantage of new features as they become available.1.1 Evolution <strong>in</strong> action: examples of code changeI’ve always dreamed of do<strong>in</strong>g magic tricks, and for this one section I get to live thatdream. This is the only time that I won’t expla<strong>in</strong> how th<strong>in</strong>gs work, or try to go one stepat a time. Quite the opposite, <strong>in</strong> fact—the plan is to impress rather than educate. If youread this entire section without gett<strong>in</strong>g at least a little excited about what <strong>C#</strong> 2 and 3 cando, maybe this book isn’t for you. With any luck, though, you’ll be eager to get to thedetails of how the tricks work—to slow down the sleight of hand until it’s obvious what’sgo<strong>in</strong>g on—and that’s what the rest of the book is for.I should warn you that the example is very contrived—clearly designed to pack asmany new features <strong>in</strong>to as short a piece of code as possible. From <strong>C#</strong> 2, we’ll see generics,properties with different access modifiers for getters and setters, nullable types, andanonymous methods. From <strong>C#</strong> 3, we’ll see automatically implemented properties,enhanced collection <strong>in</strong>itializers, enhanced object <strong>in</strong>itializers, lambda expressions,extension methods, implicit typ<strong>in</strong>g, and LINQ query expressions. There are, of course,many other new features, but it would be impossible to demonstrate them all together<strong>in</strong> a mean<strong>in</strong>gful way. Even though you usually wouldn’t use even this select set of features<strong>in</strong> such a compact space, I’m sure you’ll recognize the general tasks as ones thatdo crop up frequently <strong>in</strong> real-life code.As well as be<strong>in</strong>g contrived, the example is also clichéd—but at least that makes itfamiliar. Yes, it’s a product/name/price example, the e-commerce virtual child of“hello, world.”To keep th<strong>in</strong>gs simple, I’ve split the code up <strong>in</strong>to sections. Here’s what we want to do:■■Def<strong>in</strong>e a Product type with a name and a price <strong>in</strong> dollars, along with a way ofretriev<strong>in</strong>g a hard-coded list of productsPr<strong>in</strong>t out the products <strong>in</strong> alphabetical orderLicensed to Rhona Hadida


Evolution <strong>in</strong> action: examples of code change5■ Pr<strong>in</strong>t out all the products cost<strong>in</strong>g more than $10■Consider what would be required to represent products with unknown pricesWe’ll look at each of these areas separately, and see how as we move forward <strong>in</strong> versionsof <strong>C#</strong>, we can accomplish the same tasks more simply and elegantly than before. Ineach case, the changes to the code will be <strong>in</strong> a bold font. Let’s start with the Producttype itself.1.1.1 Def<strong>in</strong><strong>in</strong>g the Product typeWe’re not look<strong>in</strong>g for anyth<strong>in</strong>g particularly impressive from the Product type—justencapsulation of a couple of properties. To make life simpler for demonstration purposes,this is also where we create a list of predef<strong>in</strong>ed products. We override ToStr<strong>in</strong>gso that when we pr<strong>in</strong>t out the products elsewhere, they show useful values. List<strong>in</strong>g 1.1shows the type as it might be written <strong>in</strong> <strong>C#</strong> 1. We’ll then move on to see how the sameeffect can be achieved <strong>in</strong> <strong>C#</strong> 2, then <strong>C#</strong> 3. This is the pattern we’ll follow for each ofthe other pieces of code.List<strong>in</strong>g 1.1 The Product type (<strong>C#</strong> 1)us<strong>in</strong>g System.Collections;public class Product{str<strong>in</strong>g name;public str<strong>in</strong>g Name{get { return name; }}decimal price;public decimal Price{get { return price; }}public Product(str<strong>in</strong>g name, decimal price){this.name = name;this.price = price;}public static ArrayList GetSampleProducts(){ArrayList list = new ArrayList();list.Add(new Product("Company", 9.99m));list.Add(new Product("Assass<strong>in</strong>s", 14.99m));list.Add(new Product("Frogs", 13.99m));list.Add(new Product("Sweeney Todd", 10.99m));return list;}public override str<strong>in</strong>g ToStr<strong>in</strong>g(){Licensed to Rhona Hadida


6 CHAPTER 1 The chang<strong>in</strong>g face of <strong>C#</strong> development}}return str<strong>in</strong>g.Format("{0}: {1}", name, price);Noth<strong>in</strong>g <strong>in</strong> list<strong>in</strong>g 1.1 should be hard to understand—it’s just <strong>C#</strong> 1 code, after all.There are four limitations that it demonstrates, however:■■■■An ArrayList has no compile-time <strong>in</strong>formation about what’s <strong>in</strong> it. We couldhave accidentally added a str<strong>in</strong>g to the list created <strong>in</strong> GetSampleProducts andthe compiler wouldn’t have batted an eyelid.We’ve provided public “getter” properties, which means that if we wantedmatch<strong>in</strong>g “setters,” they would have to be public too. In this case it’s not toomuch of a problem to use the fields directly, but it would be if we had validationthat ought to be applied every time a value was set. A property setterwould be natural, but we may not want to expose it to the outside world. We’dhave to create a private SetPrice method or someth<strong>in</strong>g similar, and that asymmetryis ugly.The variables themselves are available to the rest of the class. They’re private,but it would be nice to encapsulate them with<strong>in</strong> the properties, to make surethey’re not tampered with other than through those properties.There’s quite a lot of fluff <strong>in</strong>volved <strong>in</strong> creat<strong>in</strong>g the properties and variables—code that complicates the simple task of encapsulat<strong>in</strong>g a str<strong>in</strong>g and a decimal.Let’s see what <strong>C#</strong> 2 can do to improve matters (see list<strong>in</strong>g 1.2; changes are <strong>in</strong> bold).List<strong>in</strong>g 1.2 Strongly typed collections and private setters (<strong>C#</strong> 2)us<strong>in</strong>g System.Collections.Generic;public class Product{str<strong>in</strong>g name;public str<strong>in</strong>g Name{get { return name; }private set { name = value; }}decimal price;public decimal Price{get { return price; }private set { price = value; }}public Product(str<strong>in</strong>g name, decimal price){Name = name;Price = price;}public static List GetSampleProducts()Licensed to Rhona Hadida


Evolution <strong>in</strong> action: examples of code change7{}List list = new List();list.Add(new Product("Company", 9.99m));list.Add(new Product("Assass<strong>in</strong>s", 14.99m));list.Add(new Product("Frogs", 13.99m));list.Add(new Product("Sweeney Todd", 10.99m));return list;}public override str<strong>in</strong>g ToStr<strong>in</strong>g(){return str<strong>in</strong>g.Format("{0}: {1}", name, price);}The code hasn’t changed much, but we’ve addressed two of the problems. We nowhave properties with private setters (which we use <strong>in</strong> the constructor), and it doesn’ttake a genius to guess that List is tell<strong>in</strong>g the compiler that the list conta<strong>in</strong>sproducts. Attempt<strong>in</strong>g to add a different type to the list would result <strong>in</strong> a compilererror. The change to <strong>C#</strong> 2 leaves only two of the orig<strong>in</strong>al four difficulties unanswered.List<strong>in</strong>g 1.3 shows how <strong>C#</strong> 3 tackles these.List<strong>in</strong>g 1.3 Automatically implemented properties and simpler <strong>in</strong>itialization (<strong>C#</strong> 3)us<strong>in</strong>g System.Collections.Generic;class Product{public str<strong>in</strong>g Name { get; private set; }public decimal Price { get; private set; }public Product(str<strong>in</strong>g name, decimal price){Name = name;Price = price;}Product(){}public static List GetSampleProducts(){return new List{new Product { Name="Company", Price = 9.99m },new Product { Name="Assass<strong>in</strong>s", Price=14.99m },new Product { Name="Frogs", Price=13.99m },new Product { Name="Sweeney Todd", Price=10.99m}};}}public override str<strong>in</strong>g ToStr<strong>in</strong>g(){return str<strong>in</strong>g.Format("{0}: {1}", Name, Price);}Licensed to Rhona Hadida


8 CHAPTER 1 The chang<strong>in</strong>g face of <strong>C#</strong> developmentThe properties now don’t have any code (or visible variables!) associated with them, andwe’re build<strong>in</strong>g the hard-coded list <strong>in</strong> a very different way. With no “name” and “price”variables to access, we’re forced to use the properties everywhere <strong>in</strong> the class, improv<strong>in</strong>gconsistency. We now have a private parameterless constructor for the sake of the newproperty-based <strong>in</strong>itialization. In this example, we could actually have removed the publicconstructor completely, but it would make the class less useful <strong>in</strong> the real world.Figure 1.1 shows a summary of how our Product type has evolved so far. I’ll <strong>in</strong>cludea similar diagram after each task, so you can see the pattern of how <strong>C#</strong> 2 and 3 improvethe code.<strong>C#</strong> 1Read-only propertiesWeakly typed collections<strong>C#</strong> 2Private property "setters"Strongly typed collections<strong>C#</strong> 3Automatically implementedpropertiesEnhanced collection andobject <strong>in</strong>itializationFigure 1.1 Evolution of the Product type, show<strong>in</strong>g greater encapsulation, stronger typ<strong>in</strong>g, and easeof <strong>in</strong>itialization over timeSo far the changes are relatively m<strong>in</strong>imal. In fact, the addition of generics (theList syntax) is probably the most important part of <strong>C#</strong>2, but we’ve onlyseen part of its usefulness so far. There’s noth<strong>in</strong>g to get the heart rac<strong>in</strong>g yet, but we’veonly just started. Our next task is to pr<strong>in</strong>t out the list of products <strong>in</strong> alphabetical order.That shouldn’t be too hard…1.1.2 Sort<strong>in</strong>g products by nameThe easiest way of display<strong>in</strong>g a list <strong>in</strong> a particular order is to sort the list and then runthrough it display<strong>in</strong>g items. In .NET 1.1, this <strong>in</strong>volved us<strong>in</strong>g ArrayList.Sort, and <strong>in</strong>our case provid<strong>in</strong>g an IComparer implementation. We could have made the Producttype implement IComparable, but we could only def<strong>in</strong>e one sort order that way, andit’s not a huge stretch to imag<strong>in</strong>e that we might want to sort by price at some stage aswell as by name. List<strong>in</strong>g 1.4 implements IComparer, then sorts the list and displays it.List<strong>in</strong>g 1.4 Sort<strong>in</strong>g an ArrayList us<strong>in</strong>g IComparer (<strong>C#</strong> 1)class ProductNameComparer : IComparer{public <strong>in</strong>t Compare(object x, object y){Product first = (Product)x;Product second = (Product)y;return first.Name.CompareTo(second.Name);}}...ArrayList products = Product.GetSampleProducts();products.Sort(new ProductNameComparer());Licensed to Rhona Hadida


Evolution <strong>in</strong> action: examples of code change9foreach (Product product <strong>in</strong> products){Console.WriteL<strong>in</strong>e (product);}The first th<strong>in</strong>g to spot <strong>in</strong> list<strong>in</strong>g 1.4 is that we’ve had to <strong>in</strong>troduce an extra type to helpus with the sort<strong>in</strong>g. That’s not a disaster, but it’s a lot of code if we only want to sort byname <strong>in</strong> one place. Next, we see the casts <strong>in</strong> the Compare method. Casts are a way oftell<strong>in</strong>g the compiler that we know more <strong>in</strong>formation than it does—and that usuallymeans there’s a chance we’re wrong. If the ArrayList we returned from GetSample-Products had conta<strong>in</strong>ed a str<strong>in</strong>g, that’s where the code would go bang—where thecomparison tries to cast the str<strong>in</strong>g to a Product.We’ve also got a cast <strong>in</strong> the code that displays the sorted list. It’s not obvious,because the compiler puts it <strong>in</strong> automatically, but the foreach loop implicitly castseach element of the list to Product. Aga<strong>in</strong>, that’s a cast we’d ideally like to get rid of,and once more generics come to the rescue <strong>in</strong> <strong>C#</strong> 2. List<strong>in</strong>g 1.5 shows the earlier codewith the use of generics as the only change.List<strong>in</strong>g 1.5 Sort<strong>in</strong>g a List us<strong>in</strong>g IComparer (<strong>C#</strong> 2)class ProductNameComparer : IComparer{public <strong>in</strong>t Compare(Product first, Product second){return first.Name.CompareTo(second.Name);}}...List products = Product.GetSampleProducts();products.Sort(new ProductNameComparer());foreach (Product product <strong>in</strong> products){Console.WriteL<strong>in</strong>e(product);}The code for the comparer <strong>in</strong> list<strong>in</strong>g 1.5 is simpler because we’re given products tostart with. No cast<strong>in</strong>g necessary. Similarly, the <strong>in</strong>visible cast <strong>in</strong> the foreach loop isgone. It’s hard to tell the difference, given that it’s <strong>in</strong>visible, but it really is gone. Honest.I wouldn’t lie to you. At least, not <strong>in</strong> chapter 1…That’s an improvement, but it would be nice to be able to sort the products by simplyspecify<strong>in</strong>g the comparison to make, without need<strong>in</strong>g to implement an <strong>in</strong>terface todo so. List<strong>in</strong>g 1.6 shows how to do precisely this, tell<strong>in</strong>g the Sort method how to comparetwo products us<strong>in</strong>g a delegate.List<strong>in</strong>g 1.6 Sort<strong>in</strong>g a List us<strong>in</strong>g Comparison (<strong>C#</strong> 2)List products = Product.GetSampleProducts();products.Sort(delegate(Product first, Product second){ return first.Name.CompareTo(second.Name); });Licensed to Rhona Hadida


10 CHAPTER 1 The chang<strong>in</strong>g face of <strong>C#</strong> developmentforeach (Product product <strong>in</strong> products){Console.WriteL<strong>in</strong>e(product);}Behold the lack of the ProductNameComparer type. The statement <strong>in</strong> bold actually createsa delegate <strong>in</strong>stance, which we provide to the Sort method <strong>in</strong> order to performthe comparisons. More on that feature (anonymous methods) <strong>in</strong> chapter 5. We’ve nowfixed all the th<strong>in</strong>gs we didn’t like about the <strong>C#</strong> 1 version. That doesn’t mean that <strong>C#</strong> 3can’t do better, though. First we’ll just replace the anonymous method with an evenmore compact way of creat<strong>in</strong>g a delegate <strong>in</strong>stance, as shown <strong>in</strong> list<strong>in</strong>g 1.7.List<strong>in</strong>g 1.7 Sort<strong>in</strong>g us<strong>in</strong>g Comparison from a lambda expression (<strong>C#</strong> 3)List products = Product.GetSampleProducts();products.Sort((first, second) => first.Name.CompareTo(second.Name));foreach (Product product <strong>in</strong> products){Console.WriteL<strong>in</strong>e(product);}We’ve ga<strong>in</strong>ed even more strange syntax (a lambda expression), which still creates aComparison delegate just the same as list<strong>in</strong>g 1.6 did but this time with lessfuss. We haven’t had to use the delegate keyword to <strong>in</strong>troduce it, or even specify thetypes of the parameters. There’s more, though: with <strong>C#</strong> 3 we can easily pr<strong>in</strong>t thenames out <strong>in</strong> order without modify<strong>in</strong>g the orig<strong>in</strong>al list of products. List<strong>in</strong>g 1.8 showsthis us<strong>in</strong>g the OrderBy method.List<strong>in</strong>g 1.8 Order<strong>in</strong>g a List us<strong>in</strong>g an extension method (<strong>C#</strong> 3)List products = Product.GetSampleProducts();foreach (Product product <strong>in</strong> products.OrderBy(p => p.Name)){Console.WriteL<strong>in</strong>e (product);}We appear to be call<strong>in</strong>g an OrderBy method, but if you look <strong>in</strong> MSDN you’ll see that itdoesn’t even exist <strong>in</strong> List. We’re able to call it due to the presence of anextension method, which we’ll see <strong>in</strong> more detail <strong>in</strong> chapter 10. We’re not actually sort<strong>in</strong>gthe list “<strong>in</strong> place” anymore, but just retriev<strong>in</strong>g the contents of the list <strong>in</strong> a particularorder. Sometimes you’ll need to change the actual list; sometimes an order<strong>in</strong>g withoutany other side effects is better. The important po<strong>in</strong>t is that it’s much more compact andreadable (once you understand the syntax, of course). We wanted to order the list byname, and that’s exactly what the code says. It doesn’t say to sort by compar<strong>in</strong>g thename of one product with the name of another, like the <strong>C#</strong> 2 code did, or to sort byus<strong>in</strong>g an <strong>in</strong>stance of another type that knows how to compare one product withanother. It just says to order by name. This simplicity of expression is one of the keyLicensed to Rhona Hadida


Evolution <strong>in</strong> action: examples of code change11benefits of <strong>C#</strong> 3. When the <strong>in</strong>dividual pieces of data query<strong>in</strong>g and manipulation are sosimple, larger transformations can still rema<strong>in</strong> compact and readable <strong>in</strong> one piece ofcode. That <strong>in</strong> turn encourages a more “data-centric” way of look<strong>in</strong>g at the world.We’ve seen a bit more of the power of <strong>C#</strong> 2 and 3 <strong>in</strong> this section, with quite a lot of(as yet) unexpla<strong>in</strong>ed syntax, but even without understand<strong>in</strong>g the details we can seethe progress toward clearer, simpler code. Figure 1.2 shows that evolution.<strong>C#</strong> 1Weakly typed comparatorNo delegate sort<strong>in</strong>g option<strong>C#</strong> 2Strongly typed comparatorDelegate comparisonsAnonymous methods<strong>C#</strong> 3Lambda expressionsExtension methodsOption of leav<strong>in</strong>g list unsortedFigure 1.2 Features <strong>in</strong>volved <strong>in</strong> mak<strong>in</strong>g sort<strong>in</strong>g easier <strong>in</strong> <strong>C#</strong> 2 and 3That’s it for sort<strong>in</strong>g. Let’s do a different form of data manipulation now—query<strong>in</strong>g.1.1.3 Query<strong>in</strong>g collectionsOur next task is to f<strong>in</strong>d all the elements of the list that match a certa<strong>in</strong> criterion—<strong>in</strong> particular,those with a price greater than $10. In <strong>C#</strong> 1, we need to loop around, test<strong>in</strong>g eachelement and pr<strong>in</strong>t<strong>in</strong>g it out where appropriate (see list<strong>in</strong>g 1.9).List<strong>in</strong>g 1.9 Loop<strong>in</strong>g, test<strong>in</strong>g, pr<strong>in</strong>t<strong>in</strong>g out (<strong>C#</strong> 1)ArrayList products = Product.GetSampleProducts();foreach (Product product <strong>in</strong> products){if (product.Price > 10m){Console.WriteL<strong>in</strong>e(product);}}OK, this is not difficult code to understand. However, it’s worth bear<strong>in</strong>g <strong>in</strong> m<strong>in</strong>d how<strong>in</strong>tertw<strong>in</strong>ed the three tasks are—loop<strong>in</strong>g with foreach, test<strong>in</strong>g the criterion with if,then display<strong>in</strong>g the product with Console.WriteL<strong>in</strong>e. The dependency is obviousbecause of the nest<strong>in</strong>g. <strong>C#</strong> 2 lets us flatten th<strong>in</strong>gs out a bit (see list<strong>in</strong>g 1.10).List<strong>in</strong>g 1.10 Separat<strong>in</strong>g test<strong>in</strong>g from pr<strong>in</strong>t<strong>in</strong>g (<strong>C#</strong> 2)List products = Product.GetSampleProducts();Predicate test = delegate(Product p){ return p.Price > 10m; };List matches = products.F<strong>in</strong>dAll(test);Action pr<strong>in</strong>t = delegate(Product p){ Console.WriteL<strong>in</strong>e (p); };matches.ForEach (pr<strong>in</strong>t);Licensed to Rhona Hadida


12 CHAPTER 1 The chang<strong>in</strong>g face of <strong>C#</strong> developmentI’m not go<strong>in</strong>g to claim this code is simpler than the <strong>C#</strong> 1 code—but it is a lot morepowerful. 1 In particular, it makes it very easy to change the condition we’re test<strong>in</strong>g forand the action we take on each of the matches <strong>in</strong>dependently. The delegate variables<strong>in</strong>volved (test and pr<strong>in</strong>t) could be passed <strong>in</strong>to a method—that same method couldend up test<strong>in</strong>g radically different conditions and tak<strong>in</strong>g radically different actions. Ofcourse, we could have put all the test<strong>in</strong>g and pr<strong>in</strong>t<strong>in</strong>g <strong>in</strong>to one statement, as shown <strong>in</strong>list<strong>in</strong>g 1.11.List<strong>in</strong>g 1.11 Separat<strong>in</strong>g test<strong>in</strong>g from pr<strong>in</strong>t<strong>in</strong>g redux (<strong>C#</strong> 2)List products = Product.GetSampleProducts();products.F<strong>in</strong>dAll (delegate(Product p) { return p.Price > 10;}).ForEach (delegate(Product p) { Console.WriteL<strong>in</strong>e(p); });That’s a bit better, but the delegate(Product p) is gett<strong>in</strong>g <strong>in</strong> the way, as are the braces.They’re add<strong>in</strong>g noise to the code, which hurts readability. I still prefer the <strong>C#</strong> 1 version,<strong>in</strong> the case where we only ever want to use the same test and perform the same action.(It may sound obvious, but it’s worth remember<strong>in</strong>g that there’s noth<strong>in</strong>g stopp<strong>in</strong>g usfrom us<strong>in</strong>g the <strong>C#</strong> 1 version when us<strong>in</strong>g <strong>C#</strong> 2 or 3. You wouldn’t use a bulldozer to planttulip bulbs, which is the k<strong>in</strong>d of overkill we’re us<strong>in</strong>g here.) <strong>C#</strong> 3 improves matters dramaticallyby remov<strong>in</strong>g a lot of the fluff surround<strong>in</strong>g the actual logic of the delegate (seelist<strong>in</strong>g 1.12).List<strong>in</strong>g 1.12 Test<strong>in</strong>g with a lambda expression (<strong>C#</strong> 3)List products = Product.GetSampleProducts();foreach (Product product <strong>in</strong> products.Where(p => p.Price > 10)){Console.WriteL<strong>in</strong>e(product);}The comb<strong>in</strong>ation of the lambda expression putt<strong>in</strong>g the test <strong>in</strong> just the right place anda well-named method means we can almost read the code out loud and understand itwithout even th<strong>in</strong>k<strong>in</strong>g. We still have the flexibility of <strong>C#</strong> 2—the argument to Wherecould come from a variable, and we could use an Action <strong>in</strong>stead of thehard-coded Console.WriteL<strong>in</strong>e call if we wanted to.This task has emphasized what we already knew from sort<strong>in</strong>g—anonymous methodsmake writ<strong>in</strong>g a delegate simple, and lambda expressions are even more concise. Inboth cases, that brevity means that we can <strong>in</strong>clude the query or sort operation <strong>in</strong>sidethe first part of the foreach loop without los<strong>in</strong>g clarity. Figure 1.3 summarizes thechanges we’ve just seen.So, now that we’ve displayed the filtered list, let’s consider a change to our <strong>in</strong>itialassumptions about the data. What happens if we don’t always know the price for aproduct? How can we cope with that with<strong>in</strong> our Product class?1In some ways, this is cheat<strong>in</strong>g. We could have def<strong>in</strong>ed appropriate delegates <strong>in</strong> <strong>C#</strong> 1 and called them with<strong>in</strong>the loop. The F<strong>in</strong>dAll and ForEach methods <strong>in</strong> .NET 2.0 just help to encourage you to consider separationof concerns.Licensed to Rhona Hadida


Evolution <strong>in</strong> action: examples of code change13<strong>C#</strong> 1Strong coupl<strong>in</strong>g betweencondition and action.Both are hard-coded.<strong>C#</strong> 2Separate condition fromaction <strong>in</strong>voked.Anonymous methodsmake delegates simple.<strong>C#</strong> 3Lambda expressionsmake the conditioneven easier to read.Figure 1.3 Anonymous methods and lambda expressions aid separation of concerns and readabilityfor <strong>C#</strong> 2 and 3.1.1.4 Represent<strong>in</strong>g an unknown priceI’m not go<strong>in</strong>g to present much code this time, but I’m sure it will be a familiar problemto you, especially if you’ve done a lot of work with databases. Let’s imag<strong>in</strong>e our listof products conta<strong>in</strong>s not just products on sale right now but ones that aren’t availableyet. In some cases, we may not know the price. If decimal were a reference type, wecould just use null to represent the unknown price—but as it’s a value type, we can’t.How would you represent this <strong>in</strong> <strong>C#</strong> 1? There are three common alternatives:■ Create a reference type wrapper around decimal■ Ma<strong>in</strong>ta<strong>in</strong> a separate Boolean flag <strong>in</strong>dicat<strong>in</strong>g whether the price is known■ Use a “magic value” (decimal.M<strong>in</strong>Value, for example) to represent the unknownpriceI hope you’ll agree that none of these holds much appeal. Time for a little magic: wecan solve the problem with the addition of a s<strong>in</strong>gle extra character <strong>in</strong> the variableand property declarations. <strong>C#</strong> 2 makes matters a lot simpler by <strong>in</strong>troduc<strong>in</strong>g theNullable structure and some syntactic sugar for it that lets us change the propertydeclaration todecimal? price;public decimal? Price{get { return price; }private set { price = value; }}The constructor parameter changes to decimal? as well, and then we can pass <strong>in</strong> nullas the argument, or say Price = null; with<strong>in</strong> the class. That’s a lot more expressivethan any of the other solutions. The rest of the code just works as is—a product withan unknown price will be considered to be less expensive than $10, which is probablywhat we’d want. To check whether or not a price is known, we can compare it withnull or use the HasValue property—so to show all the products with unknown prices<strong>in</strong> <strong>C#</strong> 3, we’d write the code <strong>in</strong> list<strong>in</strong>g 1.13.List<strong>in</strong>g 1.13 Display<strong>in</strong>g products with an unknown price (<strong>C#</strong> 2 and 3)List products = Product.GetSampleProducts();foreach (Product product <strong>in</strong> products.Where(p => p.Price==null))Licensed to Rhona Hadida


14 CHAPTER 1 The chang<strong>in</strong>g face of <strong>C#</strong> development{}Console.WriteL<strong>in</strong>e(product.Name);The <strong>C#</strong> 2 code would be similar to list<strong>in</strong>g 1.11 but us<strong>in</strong>g return p.Price == null; asthe body for the anonymous method. There’s no difference between <strong>C#</strong> 2 and 3 <strong>in</strong>terms of nullable types, so figure 1.4 represents the improvements with just two boxes.<strong>C#</strong> 1Choice between extra workma<strong>in</strong>ta<strong>in</strong><strong>in</strong>g a flag, chang<strong>in</strong>gto reference type semantics,or the hack of a magic value.<strong>C#</strong> 2 / 3Nullable types make the"extra work" option simpleand syntactic sugar improvesmatters even further.Figure 1.4 The options available for work<strong>in</strong>g around the lack of nullable types <strong>in</strong> <strong>C#</strong> 1,and the benefits of <strong>C#</strong> 2 and 3So, is that it? Everyth<strong>in</strong>g we’ve seen so far is useful and important (particularlygenerics), but I’m not sure it really counts as excit<strong>in</strong>g. There are some cool th<strong>in</strong>gsyou can do with these features occasionally, but for the most part they’re “just” mak<strong>in</strong>gcode a bit simpler, more reliable, and more expressive. I value these th<strong>in</strong>gsimmensely, but they rarely impress me enough to call colleagues over to show howmuch can be done so simply. If you’ve seen any <strong>C#</strong> 3 code already, you were probablyexpect<strong>in</strong>g to see someth<strong>in</strong>g rather different—namely LINQ. This is where thefireworks start.1.1.5 LINQ and query expressionsLINQ (Language Integrated Query) is what <strong>C#</strong> 3 is all about at its heart. Whereas thefeatures <strong>in</strong> <strong>C#</strong> 2 are arguably more about fix<strong>in</strong>g annoyances <strong>in</strong> <strong>C#</strong> 1 than sett<strong>in</strong>g theworld on fire, <strong>C#</strong> 3 is rather special. In particular, it conta<strong>in</strong>s query expressions that allowa declarative style for creat<strong>in</strong>g queries on various data sources. The reason none of theexamples so far have used them is that they’ve all actually been simpler without us<strong>in</strong>gthe extra syntax. That’s not to say we couldn’t use it anyway, of course—list<strong>in</strong>g 1.12, forexample, is equivalent to list<strong>in</strong>g 1.14.List<strong>in</strong>g 1.14First steps with query expressions: filter<strong>in</strong>g a collectionList products = Product.GetSampleProducts();var filtered = from Product p <strong>in</strong> productswhere p.Price > 10select p;foreach (Product product <strong>in</strong> filtered){Console.WriteL<strong>in</strong>e(product);}Licensed to Rhona Hadida


Evolution <strong>in</strong> action: examples of code change15Personally, I f<strong>in</strong>d the earlier list<strong>in</strong>g easier to read—the only benefit to the queryexpression version is that the where clause is simpler.So if query expressions are no good, why is everyone mak<strong>in</strong>g such a fuss aboutthem, and about LINQ <strong>in</strong> general? The first answer is that while query expressions arenot particularly suitable for simple tasks, they’re very, very good for more complicatedsituations that would be hard to read if written out <strong>in</strong> the equivalent method calls(and fiendish <strong>in</strong> <strong>C#</strong> 1 or 2). Let’s make th<strong>in</strong>gs just a little harder by <strong>in</strong>troduc<strong>in</strong>ganother type—Supplier. I haven’t <strong>in</strong>cluded the whole code here, but complete readyto-compilecode is provided on the book’s website (www.csharp<strong>in</strong>depth.com). We’llconcentrate on the fun stuff.Each supplier has a Name (str<strong>in</strong>g) and a SupplierID (<strong>in</strong>t). I’ve also addedSupplierID as a property <strong>in</strong> Product and adapted the sample data appropriately.Admittedly that’s not a very object-oriented way of giv<strong>in</strong>g each product a supplier—it’s much closer to how the data would be represented <strong>in</strong> a database. It makes thisparticular feature easier to demonstrate for now, but we’ll see <strong>in</strong> chapter 12 thatLINQ allows us to use a more natural model too.Now let’s look at the code (list<strong>in</strong>g 1.15) to jo<strong>in</strong> the sample products with the samplesuppliers (obviously based on the supplier ID), apply the same price filter asbefore to the products, sort by supplier name and then product name, and pr<strong>in</strong>t outthe name of both supplier and product for each match. That was a mouthful (f<strong>in</strong>gerful?)to type, and <strong>in</strong> earlier versions of <strong>C#</strong> it would have been a nightmare to implement.In LINQ, it’s almost trivial.List<strong>in</strong>g 1.15Jo<strong>in</strong><strong>in</strong>g, filter<strong>in</strong>g, order<strong>in</strong>g, and project<strong>in</strong>gList products = Product.GetSampleProducts();List suppliers = Supplier.GetSampleSuppliers();var filtered = from p <strong>in</strong> productsjo<strong>in</strong> s <strong>in</strong> supplierson p.SupplierID equals s.SupplierIDwhere p.Price > 10orderby s.Name, p.Nameselect new {SupplierName=s.Name,ProductName=p.Name};foreach (var v <strong>in</strong> filtered){Console.WriteL<strong>in</strong>e("Supplier={0}; Product={1}",v.SupplierName, v.ProductName);}The more astute among you will have noticed that it looks remarkably like SQL. 2Indeed, the reaction of many people on first hear<strong>in</strong>g about LINQ (but before exam<strong>in</strong><strong>in</strong>git closely) is to reject it as merely try<strong>in</strong>g to put SQL <strong>in</strong>to the language for the sakeof talk<strong>in</strong>g to databases. Fortunately, LINQ has borrowed the syntax and some ideasfrom SQL, but as we’ve seen, you needn’t be anywhere near a database <strong>in</strong> order to use2If you’ve ever worked with SQL <strong>in</strong> any form whatsoever but didn’t notice the resemblance, I’m shocked.Licensed to Rhona Hadida


16 CHAPTER 1 The chang<strong>in</strong>g face of <strong>C#</strong> developmentit—none of the code we’ve run so far has touched a database at all. Indeed, we couldbe gett<strong>in</strong>g data from any number of sources: XML, for example. Suppose that <strong>in</strong>steadof hard-cod<strong>in</strong>g our suppliers and products, we’d used the follow<strong>in</strong>g XML file:Well, the file is simple enough, but what’s the best way of extract<strong>in</strong>g the data from it?How do we query it? Jo<strong>in</strong> on it? Surely it’s go<strong>in</strong>g to be somewhat harder than list<strong>in</strong>g 1.14,right? List<strong>in</strong>g 1.16 shows how much work we have to do <strong>in</strong> LINQ to XML.List<strong>in</strong>g 1.16Complex process<strong>in</strong>g of an XML file with LINQ to XMLXDocument doc = XDocument.Load("data.xml");var filtered = from p <strong>in</strong> doc.Descendants("Product")jo<strong>in</strong> s <strong>in</strong> doc.Descendants("Supplier")on (<strong>in</strong>t)p.Attribute("SupplierID")equals (<strong>in</strong>t)s.Attribute("SupplierID")where (decimal)p.Attribute("Price") > 10orderby (str<strong>in</strong>g)s.Attribute("Name"),(str<strong>in</strong>g)p.Attribute("Name")select new{SupplierName = (str<strong>in</strong>g)s.Attribute("Name"),ProductName = (str<strong>in</strong>g)p.Attribute("Name")};foreach (var v <strong>in</strong> filtered){Console.WriteL<strong>in</strong>e("Supplier={0}; Product={1}",v.SupplierName, v.ProductName);}Well, it’s not quite as straightforward, because we need to tell the system how it shouldunderstand the data (<strong>in</strong> terms of what attributes should be used as what types)—butit’s not far off. In particular, there’s an obvious relationship between each part of thetwo list<strong>in</strong>gs. If it weren’t for the l<strong>in</strong>e length limitations of books, you’d see an exactl<strong>in</strong>e-by-l<strong>in</strong>e correspondence between the two queries.Impressed yet? Not quite conv<strong>in</strong>ced? Let’s put the data where it’s much more likelyto be—<strong>in</strong> a database. There’s some work (much of which can be automated) to letLicensed to Rhona Hadida


Evolution <strong>in</strong> action: examples of code change17LINQ to SQL know about what to expect <strong>in</strong> what table, but it’s all fairly straightforward.List<strong>in</strong>g 1.17 shows the query<strong>in</strong>g code.List<strong>in</strong>g 1.17Apply<strong>in</strong>g a query expression to a SQL databaseus<strong>in</strong>g (L<strong>in</strong>qDemoDataContext db = new L<strong>in</strong>qDemoDataContext()){var filtered = from p <strong>in</strong> db.Productsjo<strong>in</strong> s <strong>in</strong> db.Supplierson p.SupplierID equals s.SupplierIDwhere p.Price > 10orderby s.Name, p.Nameselect new{SupplierName = s.Name,ProductName = p.Name};foreach (var v <strong>in</strong> filtered){Console.WriteL<strong>in</strong>e("Supplier={0}; Product={1}",v.SupplierName, v.ProductName);}}By now, this should be look<strong>in</strong>g <strong>in</strong>credibly familiar. Everyth<strong>in</strong>g below the“jo<strong>in</strong>” l<strong>in</strong>e is cut and pasted directly from list<strong>in</strong>g 1.14 with no changes.That’s impressive enough, but if you’re performance conscious you maybe wonder<strong>in</strong>g why we would want to pull down all the data from the databaseand then apply these .NET queries and order<strong>in</strong>gs. Why not get thedatabase to do it? That’s what it’s good at, isn’t it? Well, <strong>in</strong>deed—andthat’s exactly what LINQ to SQL does. The code <strong>in</strong> list<strong>in</strong>g 1.17 issues adatabase request, which is basically the query translated <strong>in</strong>to SQL. Eventhough we’ve expressed the query <strong>in</strong> <strong>C#</strong> code, it’s been executed as SQL.We’ll see later that the way this query jo<strong>in</strong>s isn’t how we’d normally use LINQ to SQL—there’s a more relation-oriented way of approach<strong>in</strong>g it when the schema and the entitiesknow about the relationship between suppliers and products. The result is thesame, however, and it shows just how similar LINQ to Objects (the <strong>in</strong>-memory LINQoperat<strong>in</strong>g on collections) and LINQ to SQL can be.It’s important to understand that LINQ is flexible, too: you can write your ownquery translators. It’s not easy, but it can be well worth it. For <strong>in</strong>stance, here’s an exampleus<strong>in</strong>g Amazon’s web service to query its available books:Query iswritten <strong>in</strong> <strong>C#</strong>,but executesas SQLvar query =from book <strong>in</strong> new L<strong>in</strong>qToAmazon.AmazonBookSearch()wherebook.Title.Conta<strong>in</strong>s("ajax") &&(book.Publisher == "Mann<strong>in</strong>g") &&(book.Price


18 CHAPTER 1 The chang<strong>in</strong>g face of <strong>C#</strong> developmentThis example was taken from the <strong>in</strong>troduction 3 to “LINQ to Amazon,” which is a LINQprovider written as an example for the LINQ <strong>in</strong> Action book (Mann<strong>in</strong>g, 2008). The queryis easy to understand, and written <strong>in</strong> what appears to be “normal” <strong>C#</strong> 3—but the provideris translat<strong>in</strong>g it <strong>in</strong>to a web service call. How cool is that?Hopefully by now your jaw is suitably close to the floor—m<strong>in</strong>e certa<strong>in</strong>ly was thefirst time I tried an exercise like the database one we’ve just seen, when it workedpretty much the first time. Now that we’ve seen a little bit of the evolution of the <strong>C#</strong>language, it’s worth tak<strong>in</strong>g a little history lesson to see how other products and technologieshave progressed <strong>in</strong> the same timeframe.1.2 A brief history of <strong>C#</strong> (and related technologies)When I was learn<strong>in</strong>g French and German at school, the teachers always told me that Iwould never be proficient <strong>in</strong> those languages until I started th<strong>in</strong>k<strong>in</strong>g <strong>in</strong> them. UnfortunatelyI never achieved that goal, but I do th<strong>in</strong>k <strong>in</strong> <strong>C#</strong> (and a few other languages). 4There are people who are quite capable of programm<strong>in</strong>g reasonably reliably <strong>in</strong> a computerlanguage without ever gett<strong>in</strong>g comfortable (or even <strong>in</strong>timate) with it. They willalways write their code with an accent, usually one rem<strong>in</strong>iscent of whatever languagethey are comfortable <strong>in</strong>.While you can learn the mechanics of <strong>C#</strong> without know<strong>in</strong>g anyth<strong>in</strong>g about the context<strong>in</strong> which it was designed, you’ll have a closer relationship with it if you understandwhy it looks the way it does—its ancestry, effectively. The technological landscape andits evolution have a significant impact on how both languages and libraries evolve, solet’s take a brief walk through <strong>C#</strong>’s history, see<strong>in</strong>g how it fits <strong>in</strong> with the stories of othertechnologies, both those from Microsoft and those developed elsewhere. This is by nomeans a comprehensive history of comput<strong>in</strong>g at the end of the twentieth century andthe start of the twenty-first—any attempt at such a history would take a whole (large)book <strong>in</strong> itself. However, I’ve <strong>in</strong>cluded the products and technologies that I believehave most strongly <strong>in</strong>fluenced .NET and <strong>C#</strong> <strong>in</strong> particular.1.2.1 The world before <strong>C#</strong>We’re actually go<strong>in</strong>g to start with Java. Although it would be a stretch to claim that<strong>C#</strong> and .NET def<strong>in</strong>itely wouldn’t have come <strong>in</strong>to be<strong>in</strong>g without Java, it would also behard to argue that it had no effect. Java 1.0 was released <strong>in</strong> January 1996 and theworld went applet mad. Briefly. Java was very slow (at the time it was 100 percent<strong>in</strong>terpreted) and most of the applets on the Web were fairly useless. The speed graduallyimproved as just-<strong>in</strong>-time compilers (JITs) were <strong>in</strong>troduced, and developersstarted look<strong>in</strong>g at us<strong>in</strong>g Java on the server side <strong>in</strong>stead of on the client. Java 1.2 (orJava 2, depend<strong>in</strong>g on whether you talk developer version numbers or market<strong>in</strong>g versionnumbers) overhauled the core libraries significantly, the servlet API and JavaServerPages took off, and Sun’s Hotspot eng<strong>in</strong>e boosted the performance significantly.3http://l<strong>in</strong>q<strong>in</strong>action.net/blogs/ma<strong>in</strong>/archive/2006/06/26/Introduc<strong>in</strong>g-L<strong>in</strong>q-to-Amazon.aspx4Not all the time, I hasten to add. Only when I’m cod<strong>in</strong>g.Licensed to Rhona Hadida


A brief history of <strong>C#</strong> (and related technologies)19Java is reasonably portable, despite the “write once, debug everywhere” skit on Sun’scatchphrase of “write once, run anywhere.” The idea of lett<strong>in</strong>g coders develop enterpriseJava applications on W<strong>in</strong>dows with friendly IDEs and then deploy (without evenrecompil<strong>in</strong>g) to powerful Unix servers was a compell<strong>in</strong>g proposition—and clearlysometh<strong>in</strong>g of a threat to Microsoft.Microsoft created their own Java Virtual Mach<strong>in</strong>e (JVM), which had reasonableperformance and a very fast startup time, and even released an IDE for it, named J++.However, they <strong>in</strong>troduced <strong>in</strong>compatible extensions <strong>in</strong>to their platform, and Sun suedMicrosoft for violat<strong>in</strong>g licens<strong>in</strong>g terms, start<strong>in</strong>g a very long (and frankly tedious) legalbattle. The ma<strong>in</strong> impact of this legal battle was felt long before the case was concluded—whilethe rest of the world moved on to Java 1.2 and beyond, Microsoft’s versionof Java stayed at 1.1, which made it effectively obsolete pretty rapidly. It was clearthat whatever Microsoft’s vision of the future was, Java itself was unlikely to be a majorpart of it.In the same period, Microsoft’s Active Server Pages (ASP) ga<strong>in</strong>ed popularity too.After an <strong>in</strong>itial launch <strong>in</strong> December 1996, two further versions were released <strong>in</strong> 1997and 2000. ASP made dynamic web development much simpler for developers onMicrosoft servers, and eventually third parties ported it to non-W<strong>in</strong>dows platforms.Despite be<strong>in</strong>g a great step forward <strong>in</strong> the W<strong>in</strong>dows world, ASP didn’t tend to promotethe separation of presentation logic, bus<strong>in</strong>ess logic, and data persistence, which mostof the vast array of Java web frameworks encouraged.1.2.2 <strong>C#</strong> and .NET are born<strong>C#</strong> and .NET were properly unveiled at the Professional Developers Conference(PDC) <strong>in</strong> July 2000, although some elements had been preannounced before then,and there had been talk about the same technologies under different names(<strong>in</strong>clud<strong>in</strong>g COOL, COM3, and Lightn<strong>in</strong>g) for a long time. Not that Microsoft hadn’tbeen busy with other th<strong>in</strong>gs, of course—that year also saw both W<strong>in</strong>dows Me andW<strong>in</strong>dows 2000 be<strong>in</strong>g released, with the latter be<strong>in</strong>g wildly successful compared withthe former.Microsoft didn’t “go it alone” with <strong>C#</strong> and .NET, and <strong>in</strong>deed when the specificationsfor <strong>C#</strong> and the Common Language Infrastructure (CLI) were submitted toECMA (an <strong>in</strong>ternational standards body), they were co-sponsored by Hewlett-Packardand Intel along with Microsoft. ECMA ratified the specification (with some modifications),and later versions of <strong>C#</strong> and the CLI have gone through the same process. <strong>C#</strong>and Java are “open” <strong>in</strong> different ways, with Microsoft favor<strong>in</strong>g the standardizationpath and Sun gradually open sourc<strong>in</strong>g Java and allow<strong>in</strong>g or even encourag<strong>in</strong>g otherJava runtime environments. There are alternative CLI and <strong>C#</strong> implementations, themost visible be<strong>in</strong>g the Mono project, 5 but they don’t generally implement the wholeof what we th<strong>in</strong>k of as the .NET Framework. Commercial reliance on and support5http://www.mono-project.comLicensed to Rhona Hadida


20 CHAPTER 1 The chang<strong>in</strong>g face of <strong>C#</strong> developmentof non-Microsoft implementations is small, outside of Novell, which sponsors theMono project.Although <strong>C#</strong> and .NET weren’t released until 2002 (along with Visual Studio.NET 2002), betas were available long before then, and by the time everyth<strong>in</strong>g was official,<strong>C#</strong> was already a popular language. ASP.NET was launched as part of .NET 1.0, andit was clear that Microsoft had no plans to do anyth<strong>in</strong>g more with either “ASP Classic”or “VB Classic”—much to the annoyance of many VB6 developers. While VB.NET lookssimilar to VB6, there are enough differences to make the transition a nontrivial one—not least of which is learn<strong>in</strong>g the .NET Framework. Many developers have decided to gostraight from VB6 to <strong>C#</strong>, for various reasons.1.2.3 M<strong>in</strong>or updates with .NET 1.1 and the first major step: .NET 2.0As is often the case, the 1.0 release was fairly quickly followed by .NET 1.1, whichlaunched with Visual Studio .NET 2003 and <strong>in</strong>cluded <strong>C#</strong> 1.2. There were few significantchanges to either the language or the framework libraries—<strong>in</strong> a sense, it wasmore of a service pack than a truly new release. Despite the small number of changes,it’s rare to see anyone us<strong>in</strong>g .NET 1.0 at the time of this writ<strong>in</strong>g, although 1.1 is stillvery much alive and kick<strong>in</strong>g, partly due to the OS requirements of 2.0.While Microsoft was busy br<strong>in</strong>g<strong>in</strong>g its new platform to the world, Sun (andits other significant partners, <strong>in</strong>clud<strong>in</strong>g IBM) hadn’t left Java stagnat<strong>in</strong>g.Not quite, anyway. Java 1.5 (Java 5 for the market<strong>in</strong>g folk among you) waslaunched <strong>in</strong> September 2004, with easily the largest set of languageenhancements <strong>in</strong> any Java release, <strong>in</strong>clud<strong>in</strong>g generics, enums (supported<strong>in</strong> a very cool way—far more object-oriented than the “named numbers”that <strong>C#</strong> provides), an enhanced for loop (foreach to you and me), annotations(read: attributes), “varargs” (broadly equivalent to parameterarrays of <strong>C#</strong>—the params modifier), and automatic box<strong>in</strong>g/unbox<strong>in</strong>g. It would be foolishto suggest that all of these enhancements were due to <strong>C#</strong> hav<strong>in</strong>g taken off (after all,putt<strong>in</strong>g generics <strong>in</strong>to the language had been talked about s<strong>in</strong>ce 1997), but it’s alsoworth acknowledg<strong>in</strong>g the competition for the m<strong>in</strong>dshare of developers. For Sun,Microsoft, and other players, it’s not just about com<strong>in</strong>g up with a great language: it’sabout persuad<strong>in</strong>g developers to write software for their platform.<strong>C#</strong> and Java have both been cautious when it comes to <strong>in</strong>troduc<strong>in</strong>g powerful featuressuch as templates and macros from C++. Every new feature has to earn its place<strong>in</strong> the language <strong>in</strong> terms of not just power, but also ease of use and readability—andsometimes that can take time. For example, both Java and <strong>C#</strong> shipped without anyth<strong>in</strong>glike C++ templates to start with, and then worked out ways of provid<strong>in</strong>g much oftheir value with as few risks and drawbacks as possible. We’ll see <strong>in</strong> chapter 3 thatalthough Java and <strong>C#</strong> generics look quite similar on the most superficial level, theydiffer significantly under the surface.Imitation isthe s<strong>in</strong>cerestform offlatteryLicensed to Rhona Hadida


A brief history of <strong>C#</strong> (and related technologies)21NOTEThe pioneer<strong>in</strong>g role of Microsoft Research—Microsoft Research is responsiblefor some of the new directions for .NET and <strong>C#</strong>. They published a paperon .NET generics as early as May 2001 (yes, even before .NET 1.0 hadbeen released!) and worked on an extension called Cω (pronounced Comega), which <strong>in</strong>cluded—among other th<strong>in</strong>gs—some of the ideas whichlater formed LINQ. Another <strong>C#</strong> extension, Spec#, adds contracts to <strong>C#</strong>,allow<strong>in</strong>g the compiler to do more verification automatically. 6 We willhave to wait and see whether any or all of the ideas of Spec# eventuallybecome part of <strong>C#</strong> itself.<strong>C#</strong> 2 was released <strong>in</strong> November 2005, as part of .NET 2.0 and alongside Visual Studio2005 and VB8. Visual Studio became more productive to work with as an IDE—particularlynow that refactor<strong>in</strong>g was f<strong>in</strong>ally <strong>in</strong>cluded—and the significant improvementsto both the language and the platform were warmly welcomed by most developers.As a sign of just how quickly the world is mov<strong>in</strong>g on—and of how long it takes toactually br<strong>in</strong>g a product to market—it’s worth not<strong>in</strong>g that the first announcementsabout <strong>C#</strong> 3 were made at the PDC <strong>in</strong> September 2005, which was two months before<strong>C#</strong> 2 was released. The sad part is that while it seems to take two years to br<strong>in</strong>g a productfrom announcement to market, it appears that the <strong>in</strong>dustry takes another year ortwo—at least—to start widely embrac<strong>in</strong>g it. As mentioned earlier, many companies areonly now transition<strong>in</strong>g from .NET 1.1 to 2.0. We can only hope that it will be a shorterpath to widespread adoption of .NET 3.0 and 3.5. (<strong>C#</strong> 3 comes with .NET 3.5, althoughyou can use many <strong>C#</strong> 3 features while still target<strong>in</strong>g .NET 2.0. I’ll talk about the versionnumbers shortly.)One of the reasons .NET 2.0 took so long to come out is that it was be<strong>in</strong>g embeddedwith<strong>in</strong> SQL Server 2005, with the obvious robustness and reliability concerns thatgo hand <strong>in</strong> hand with such a system. This allows .NET code to execute right <strong>in</strong>side thedatabase, with potential for much richer logic to sit so close to the data. Database folktend to be rather cautious, and only time will tell how widely this ability is used—butit’s a powerful tool to have available if you f<strong>in</strong>d you need it.1.2.4 “Next generation” productsIn November 2006 (a year after .NET 2.0 was released), Microsoft launched W<strong>in</strong>dowsVista, Office 2007, and Exchange Server 2007. This <strong>in</strong>cluded launch<strong>in</strong>g .NET 3.0,which comes pre<strong>in</strong>stalled on Vista. Over time, this is likely to aid adoption of .NET clientapplications for two reasons. First, the old “.NET isn’t <strong>in</strong>stalled on all computers”objection will become less relevant—you can safely assume that if the user is runn<strong>in</strong>gVista, they’ll be able to run a .NET application. Second, W<strong>in</strong>dows Presentation Foundation(WPF) is now the rich client platform of choice for developers <strong>in</strong> Microsoft’sview—and it’s only available from .NET.6http://research.microsoft.com/specsharp/Licensed to Rhona Hadida


22 CHAPTER 1 The chang<strong>in</strong>g face of <strong>C#</strong> developmentAga<strong>in</strong>, while Microsoft was busy with Vista and other products, the rest of the worldwas <strong>in</strong>novat<strong>in</strong>g too. Lightweight frameworks have been ga<strong>in</strong><strong>in</strong>g momentum, andObject Relational Mapp<strong>in</strong>g (ORM) now has a significant developer m<strong>in</strong>dshare, partlydue to high-quality free frameworks such as Hibernate. The SQL aspect of LINQ ismuch more than just the query<strong>in</strong>g side we’ve seen so far, and marks a more def<strong>in</strong>itestep from Microsoft than its previous lukewarm ventures <strong>in</strong>to this area, such asObjectSpaces. Only time will tell whether LINQ to SQL or perhaps its cous<strong>in</strong> theADO.NET Entity Framework hits the elusive sweet spot of mak<strong>in</strong>g database access trulysimple—they’re certa<strong>in</strong>ly very promis<strong>in</strong>g.Visual Studio 2008 was released <strong>in</strong> November 2007, <strong>in</strong>clud<strong>in</strong>g .NET 3.5, <strong>C#</strong> 3, andVB9. It conta<strong>in</strong>s built-<strong>in</strong> support for many features that were previously only available asextensions to Visual Studio 2005, as well as the new language and framework features.Cont<strong>in</strong>u<strong>in</strong>g the trend from Visual Studio 2005, a free Express edition is available foreach language. With the ability to target multiple versions of the .NET Framework andonly m<strong>in</strong>imal solution and project changes when migrat<strong>in</strong>g exist<strong>in</strong>g code, there is littlereason not to upgrade to Visual Studio 2008—I expect its adoption rate to be far fasterthan that of Visual Studio 2005.Dynamic languages have become <strong>in</strong>creas<strong>in</strong>gly important, with many options vy<strong>in</strong>gfor developers’ attention. Ruby—and particularly the Ruby on Rails framework—hashad a large impact (with ports for Java and .NET), and other projects such as Groovy onthe Java platform and IronRuby and IronPython on .NET are ga<strong>in</strong><strong>in</strong>g support. As partof Silverlight 2.0, Microsoft will release the Dynamic Language Runtime (DLR), whichis a layer on top of the CLR to make it more amenable to dynamic languages. Silverlightis part of another battleground, but this time for rich Internet applications (RIAs),where Microsoft is compet<strong>in</strong>g with Adobe Flex and Sun’s JavaFX. Silverlight 1.0 wasreleased <strong>in</strong> September 2007, but this version was based on JavaScript. At the time of thiswrit<strong>in</strong>g, many developers are currently await<strong>in</strong>g 1.1, which will ship with a “m<strong>in</strong>i-CLR”and cater for multiple platforms.1.2.5 Historical perspective and the fight for developer supportIt’s hard to describe all of these strands <strong>in</strong>terweav<strong>in</strong>g through history and yet keep abird’s-eye view of the period. Figure 1.5 shows a collection of timel<strong>in</strong>es with some ofthe major milestones described earlier, with<strong>in</strong> different technological areas. The list isnot comprehensive, of course, but it gives some <strong>in</strong>dication of which product versionswere compet<strong>in</strong>g at different times.There are many ways to look at technological histories, and many untold stories<strong>in</strong>fluenc<strong>in</strong>g events beh<strong>in</strong>d the scenes. It’s possible that this retrospective overemphasizesthe <strong>in</strong>fluence of Java on the development of .NET and <strong>C#</strong>, and that maywell partly be due to my mixed allegiances to both technologies. However, it seemsto me that the large wars for developer support are tak<strong>in</strong>g place among the follow<strong>in</strong>gcamps.Licensed to Rhona Hadida


A brief history of <strong>C#</strong> (and related technologies)23Java .NETOther MS OthersJDK 1.01996JDK 1.1VB6 1996ASP 1.0Ruby 1.019971997ASP 2.0W<strong>in</strong>dows 98PHP 3.019981998J2SE 1.219991999J2EE 1.2W<strong>in</strong>dows 20002000J2SE 1.3.NETunveiledASP 3.0PHP 4.020002001J2EE 1.3J2SE 1.4.NET 1.0,<strong>C#</strong> 1.0,VS.NET 2002W<strong>in</strong>dows XPMonoannounced20012002Hibernate 1.020022003.NET 1.1,<strong>C#</strong> 1.2,VS.NET 2003W<strong>in</strong>dowsServer 2003Hibernate 2.02003J2EE 1.42004J2SE 5.0Mono 1.0,PHP 5.02004Hibernate 3.02005.NET 2.0,<strong>C#</strong> 2.0,VS 2005SQL Server2005Ruby On Rails1.02005Java EE 52006J2SE 6.0.NET 3.0W<strong>in</strong>dows Vista,Exchange 2007Groovy 1.020062007.NET 3.5,<strong>C#</strong> 3.0,VS 2008Silverlight1.0Ruby On Rails2.02007Figure 1.5Timel<strong>in</strong>e of releases for <strong>C#</strong>, .NET, and related technologiesLicensed to Rhona Hadida


24 CHAPTER 1 The chang<strong>in</strong>g face of <strong>C#</strong> development■ Native code (primarily C and C++) developers, who will have to be conv<strong>in</strong>cedabout the reliability and performance of managed code before chang<strong>in</strong>g theirhabits. C++/CLI is the obvious way of dipp<strong>in</strong>g a toe <strong>in</strong> the water here, but itspopularity may not be all that Microsoft had hoped for.■ VB6 developers who may have antipathy toward Microsoft for abandon<strong>in</strong>g theirpreferred platform, but will need to decide which way to jump sooner or later—and .NET is the most obvious choice for most people at this stage. Some maycross straight to <strong>C#</strong>, with others mak<strong>in</strong>g the smaller move to VB.NET.■ Script<strong>in</strong>g and dynamic language developers who value the immediacy ofchanges. Familiar languages runn<strong>in</strong>g on managed platforms can act as Trojanhorses here, encourag<strong>in</strong>g developers to learn the associated frameworks for use<strong>in</strong> their dynamic code, which then lowers the barrier to entry for learn<strong>in</strong>g thetraditional object-oriented languages for the relevant platform. The IronPythonprogrammer of today may well become the <strong>C#</strong> programmer of tomorrow.■ “Traditional” managed developers, primarily writ<strong>in</strong>g <strong>C#</strong>, VB.NET, or Java. Herethe war is not about whether or not runn<strong>in</strong>g under some sort of managed environmentis a good th<strong>in</strong>g, but which managed environment to use. The battlegroundsare primarily <strong>in</strong> tools, portability, performance, and libraries, all ofwhich have come on <strong>in</strong> leaps and bounds. Competition between different .NETlanguages is partly <strong>in</strong>ternal to Microsoft, with each team want<strong>in</strong>g its own languageto have the best support—and features developed primarily for onelanguage can often be used by another <strong>in</strong> the fullness of time.■ Web developers who have already had to move from static HTML, to dynamicallygenerated content, to a nicer user experience with Ajax. Now the age ofRIAs is upon us, with three very significant contenders <strong>in</strong> Microsoft, Adobe, andSun. At the time of this writ<strong>in</strong>g, it’s too early to tell whether there will be a clearw<strong>in</strong>ner here or whether the three can all garner enough support to make themviable for a long time to come. Although it’s possible to use a .NET-based RIAsolution with a Java-based server to some extent, the development process is significantlyeasier when technologies are aligned, so captur<strong>in</strong>g the market here isimportant for all parties.One th<strong>in</strong>g is clear from all of this—it’s a good time to be a developer. Companies are<strong>in</strong>vest<strong>in</strong>g a lot of time and money <strong>in</strong> mak<strong>in</strong>g software development a fun and profitable<strong>in</strong>dustry to be <strong>in</strong>. Given the changes we’ve seen over the last decade or so, it’s difficultto predict what programm<strong>in</strong>g will look like <strong>in</strong> another decade, but it’ll be afantastic journey gett<strong>in</strong>g there.I mentioned earlier that <strong>C#</strong> 3 is effectively part of .NET 3.5. It’s worth tak<strong>in</strong>g a bit oftime to look at the different aspects that together make up .NET.1.3 The .NET platformWhen it was orig<strong>in</strong>ally <strong>in</strong>troduced, “.NET” was used as a catchall term for a vast rangeof technologies com<strong>in</strong>g from Microsoft. For <strong>in</strong>stance, W<strong>in</strong>dows Live ID was calledLicensed to Rhona Hadida


The .NET platform25.NET Passport despite there be<strong>in</strong>g no clear relationship between that and what we currentlyknow as .NET. Fortunately th<strong>in</strong>gs have calmed down somewhat s<strong>in</strong>ce then. Inthis section we’ll look at the various parts of .NET (at least the ones we’re <strong>in</strong>terested<strong>in</strong>) and how they have been separately versioned.1.3.1 Dist<strong>in</strong>guish<strong>in</strong>g between language, runtime, and librariesIn several places <strong>in</strong> this book, I’ll refer to three different k<strong>in</strong>ds of features: features of<strong>C#</strong> as a language, features of the runtime that provides the “eng<strong>in</strong>e” if you will, and featuresof the .NET framework libraries. In particular, this book is heavily focused on thelanguage of <strong>C#</strong>, only expla<strong>in</strong><strong>in</strong>g runtime and framework features when they relate tofeatures of <strong>C#</strong> itself. This only makes sense if there is a clear dist<strong>in</strong>ction between thethree. Often features will overlap, but it’s important to understand the pr<strong>in</strong>ciple ofthe matter.LANGUAGEThe language of <strong>C#</strong> is def<strong>in</strong>ed by its specification, which describes the format of <strong>C#</strong>source code, <strong>in</strong>clud<strong>in</strong>g both syntax and behavior. It does not describe the platform thatthe compiler output will run on, beyond a few key po<strong>in</strong>ts at which the two <strong>in</strong>teract. For<strong>in</strong>stance, the <strong>C#</strong> language requires a type called System.IDisposable, which conta<strong>in</strong>sa method called Dispose. These are required <strong>in</strong> order to def<strong>in</strong>e the us<strong>in</strong>g statement.Likewise, the platform needs to be able to support (<strong>in</strong> one form or other) both valuetypes and reference types, along with garbage collection.In theory, any platform that supports the required features could have a <strong>C#</strong> compilertarget<strong>in</strong>g it. For example, a <strong>C#</strong> compiler could legitimately produce output <strong>in</strong> a formother than the Intermediate Language (IL), which is the typical output at the time ofthis writ<strong>in</strong>g. A runtime could legitimately <strong>in</strong>terpret the output of a <strong>C#</strong> compiler ratherthan JIT-compil<strong>in</strong>g it. In practice, although <strong>in</strong>terpret<strong>in</strong>g IL is possible (and <strong>in</strong>deed supportedby Mono), we are unlikely to see widespread use of <strong>C#</strong> on platforms that are verydifferent from .NET.RUNTIMEThe runtime aspect of the .NET platform is the relatively small amount of code that isresponsible for mak<strong>in</strong>g sure that programs written <strong>in</strong> IL execute accord<strong>in</strong>g to the CLIspecification, partitions I to III. The runtime part of the CLI is called the CommonLanguage Runtime (CLR). When I refer to the CLR <strong>in</strong> the rest of the book, I meanMicrosoft’s implementation.Some elements of language never appear at the runtime level, but others cross thedivide. For <strong>in</strong>stance, enumerators aren’t def<strong>in</strong>ed at a runtime level, and neither is anyparticular mean<strong>in</strong>g attached to the IDisposable <strong>in</strong>terface—but arrays and delegatesare important to the runtime.FRAMEWORK LIBRARIESLibraries provide code that is available to our programs. The framework libraries <strong>in</strong>.NET are largely built as IL themselves, with native code used only where necessary. Thisis a mark of the strength of the runtime: your own code isn’t expected to be a secondclasscitizen—it can provide the same k<strong>in</strong>d of power and performance as the librariesLicensed to Rhona Hadida


26 CHAPTER 1 The chang<strong>in</strong>g face of <strong>C#</strong> developmentit utilizes. The amount of code <strong>in</strong> the library is much larger than that of the runtime,<strong>in</strong> the same way that there’s much more to a car than the eng<strong>in</strong>e.The .NET libraries are partially standardized. Partition IV of the CLI specificationprovides a number of different profiles (compact and kernel) and libraries. Partition IVcomes <strong>in</strong> two parts—a general textual description of the libraries, <strong>in</strong>clud<strong>in</strong>g whichlibraries are required with<strong>in</strong> which profiles, and another part conta<strong>in</strong><strong>in</strong>g the details ofthe libraries themselves <strong>in</strong> XML format. This is the same form of documentation producedwhen you use XML comments with<strong>in</strong> <strong>C#</strong>.There is much with<strong>in</strong> .NET that is not with<strong>in</strong> the base libraries. If you write a programthat only uses libraries from the specification, and only uses them correctly, youshould f<strong>in</strong>d your code works flawlessly on any implementation—Mono, .NET, or anyth<strong>in</strong>gelse. In practice, almost any program of any size will use libraries that aren’tstandardized—W<strong>in</strong>dows Forms or ASP.NET, for <strong>in</strong>stance. The Mono project has itsown libraries that are not part of .NET as well, of course, such as GTK#, <strong>in</strong> addition toimplement<strong>in</strong>g many of the nonstandardized libraries.The term .NET refers to the comb<strong>in</strong>ation of the runtime and libraries provided byMicrosoft, and it also <strong>in</strong>cludes compilers for <strong>C#</strong> and VB.NET. It can be seen as a wholedevelopment platform built on top of W<strong>in</strong>dows.Now that we know what term means what, we can look at different versions availableof each. The subject of the version numbers chosen by Microsoft and what’s <strong>in</strong>which version is a slightly convoluted one, but it’s important that we all agree on whatwe mean when we talk about a particular version.1.3.2 Untangl<strong>in</strong>g version number chaosA newcomer to the <strong>in</strong>dustry might th<strong>in</strong>k that com<strong>in</strong>g up with version numbers wouldbe easy. You start with 1, then move on to 2, then 3 <strong>in</strong> a logical progression, right? If onlythat were the case… Software products and projects of all natures like to keep m<strong>in</strong>orversion changes dist<strong>in</strong>ct from major ones, and then there are patch levels, service packs,build numbers, and so forth. In addition, there are the codenames, which are widely usedand then abandoned, much to the frustration of “bleed<strong>in</strong>g edge” book authors andpublishers. Fortunately from the po<strong>in</strong>t of view of <strong>C#</strong> as a language we can make life reasonablystraightforward.NOTEKeep<strong>in</strong>g it simple: <strong>C#</strong> 1, <strong>C#</strong> 2, and <strong>C#</strong> 3—Throughout this book, I’ll refer to<strong>C#</strong> versions as just 1, 2, and 3. There’s little po<strong>in</strong>t <strong>in</strong> dist<strong>in</strong>guish<strong>in</strong>gbetween the two 1.x versions, and no po<strong>in</strong>t <strong>in</strong> add<strong>in</strong>g a cumbersomeextra “.0” every time I refer to the different versions—which of course I’llbe do<strong>in</strong>g quite a lot.We don’t just need to keep track of the language, unfortunately. There are five th<strong>in</strong>gswe’re <strong>in</strong>terested <strong>in</strong>, when it comes to version<strong>in</strong>g.■■■The .NET FrameworkFramework librariesThe CLRLicensed to Rhona Hadida


The .NET platform27■■<strong>C#</strong> (the version of the compiler that comes with the framework)Visual Studio—version number and codenameJust for kicks, we’ll throw <strong>in</strong> the Visual Basic number<strong>in</strong>g and nam<strong>in</strong>g too. (Visual Studiois abbreviated to VS and Visual Basic is abbreviated to VB for reasons of space.)Table 1.1 shows the different version numbers.Table 1.1Cross-reference table for versions of different products and technologies.NETFrameworklibraries (max)CLR <strong>C#</strong> Visual Studio Visual Basic1.0 1.0 1.0 1.0 VS .NET 2002 (no codename) VB.NET 7.01.1 1.1 1.1 1.2 aVS .NET 2003 (Everett) VB.NET 7.12.0 2.0 2.0 2.0 VS 2005 (Whidbey) VB 8.03.0 3.0 2.0 2.0 VS 2005 (extension previews),VS 2008 (full support)VB 8.03.5 3.5 2.0 3.0 VS 2008 (Orcas) VB 9.0a. I’ve no idea why this isn’t 1.1. I only discovered that it was 1.2 while research<strong>in</strong>g this book. That’s thenumber<strong>in</strong>g accord<strong>in</strong>g to Microsoft’s version of the specification, at least. I decided not to confuse mattersfurther by also <strong>in</strong>clud<strong>in</strong>g the ECMA-334 edition number here, although that’s another story <strong>in</strong> its own right.Note how both Visual Studio and Visual Basic lost the “.NET” moniker between 2003and 2005, <strong>in</strong>dicat<strong>in</strong>g Microsoft’s emphasis on this be<strong>in</strong>g the tool for W<strong>in</strong>dows development,as far as they’re concerned.As you can see, so far the version of the overall framework has followed the librariesexactly. However, it would be possible for a new version of the CLR with more capabilitiesto still be released with the exist<strong>in</strong>g libraries, so we could (for <strong>in</strong>stance) have.NET 4.0 with libraries from 3.5, a CLR 3.0, and a <strong>C#</strong> 3 compiler. Let’s hope it doesn’tcome to that. As it is, Microsoft has already confounded developers somewhat with thelast two l<strong>in</strong>es of the table..NET 3.0 is really just the addition of four libraries: W<strong>in</strong>dows Presentation Foundation(WPF), W<strong>in</strong>dows Communication Foundation (WCF), W<strong>in</strong>dows WorkflowFoundation (WF 7 ), and W<strong>in</strong>dows CardSpace. None of the exist<strong>in</strong>g library classeswere changed, and neither was the CLR, nor any of the languages target<strong>in</strong>g the CLR,so creat<strong>in</strong>g a whole new major version number for this feels a bit over the top.Next comes .NET 3.5. This time, along with completely new classes (notably LINQ)there are many enhancements to the base class libraries (BCL—types with<strong>in</strong> thenamespaces such as System, System.IO; the core of the framework libraries). There’s anew version of <strong>C#</strong>, without which this book would be considerably shorter, and a new versionof Visual Studio to support that and VB 9.0. Apparently all of that isn’t worth a majorversion number change, though. There are service packs for both .NET 2.0 and 3.0, and7Not WWF due to wrestl<strong>in</strong>g and wildlife conflicts.Licensed to Rhona Hadida


28 CHAPTER 1 The chang<strong>in</strong>g face of <strong>C#</strong> developmentboth service packs ship with Visual Studio 2008—so while you can target .NET 2.0and 3.0 with the latest and greatest IDE (as well as 3.5, of course) you should be awarethat what you’ll really be compil<strong>in</strong>g and runn<strong>in</strong>g aga<strong>in</strong>st is 2.0SP1, 3.0SP1 or 3.5.OK, rant over. It’s only version numbers, after all—but it is important to understandwhat each version means, if for no other reason than communication. If someonesays they’re us<strong>in</strong>g “3.0” you need to check whether they mean <strong>C#</strong> 3 or .NET 3.0.If all this talk of history and version<strong>in</strong>g is mak<strong>in</strong>g you want to get back onto thefamiliar ground of actual programm<strong>in</strong>g, don’t worry—we’re nearly there. Indeed, ifyou fancy writ<strong>in</strong>g some code right now, the next section <strong>in</strong>vites you to do just that, as I<strong>in</strong>troduce the style I’ll be us<strong>in</strong>g for most of the examples <strong>in</strong> this book.1.4 Fully functional code <strong>in</strong> snippet formOne of the challenges when writ<strong>in</strong>g a book about a computer language (other thanscript<strong>in</strong>g languages) is that complete programs—ones that the reader can compileand run with no source code other than what’s presented—get pretty long prettyquickly. I wanted to get around this, to provide you with code that you could easilytype <strong>in</strong> and experiment with: I believe that actually try<strong>in</strong>g someth<strong>in</strong>g is a much betterway of learn<strong>in</strong>g about it than just read<strong>in</strong>g.The solution I’ve come up with isn’t applicable to all situations, but it will serve uswell for most of the example code. It would be awful to use for “real” development,but it’s specifically tailored to the context we’re work<strong>in</strong>g <strong>in</strong>: present<strong>in</strong>g and play<strong>in</strong>gwith code that can be compiled and run with the m<strong>in</strong>imal amount of fuss. That’s notto say you should only use it for experimentation when read<strong>in</strong>g this book—I’ve foundit useful as a general way of test<strong>in</strong>g the behavior of small pieces of code.1.4.1 Snippets and their expansionsWith the right assembly references and the right us<strong>in</strong>g directives, you can accomplishquite a lot <strong>in</strong> a fairly short amount of <strong>C#</strong> code—but the killer is the fluff <strong>in</strong>volved <strong>in</strong> writ<strong>in</strong>gthose us<strong>in</strong>g directives, then declar<strong>in</strong>g a class, then declar<strong>in</strong>g a Ma<strong>in</strong> method beforeyou’ve even written the first l<strong>in</strong>e of useful code. My examples are mostly <strong>in</strong> the form ofsnippets, which ignore the fluff that gets <strong>in</strong> the way of simple programs, concentrat<strong>in</strong>gon the important part. So, for example, suppose I presented the snippet <strong>in</strong> list<strong>in</strong>g 1.18.List<strong>in</strong>g 1.18The first snippet, which simply displays two words on separate l<strong>in</strong>esforeach (str<strong>in</strong>g x <strong>in</strong> new str<strong>in</strong>g[] {"Hello", "There"}){Console.WriteL<strong>in</strong>e (x);}This code clearly won’t compile on its own—there’s no class declaration, for a start. Thecode from list<strong>in</strong>g 1.18 corresponds to the full program shown <strong>in</strong> list<strong>in</strong>g 1.19.List<strong>in</strong>g 1.19Expanded form of list<strong>in</strong>g 1.18, creat<strong>in</strong>g a complete programus<strong>in</strong>g System;public class SnippetLicensed to Rhona Hadida


Fully functional code <strong>in</strong> snippet form29{}[STAThread]static void Ma<strong>in</strong>(str<strong>in</strong>g[] args){foreach (str<strong>in</strong>g x <strong>in</strong> new str<strong>in</strong>g[] {"Hello", "There"}){Console.WriteL<strong>in</strong>e (x);}}Occasionally extra methods or even types are required, with a bit of code <strong>in</strong> the Ma<strong>in</strong>method to access them. I <strong>in</strong>dicate this by list<strong>in</strong>g the non-Ma<strong>in</strong> code, then an ellipsis(...) and then the Ma<strong>in</strong> code. So the code <strong>in</strong> list<strong>in</strong>g 1.20 would turn <strong>in</strong>to list<strong>in</strong>g 1.21.List<strong>in</strong>g 1.20A code snippet with an extra method, called with<strong>in</strong> the Ma<strong>in</strong> methodstatic str<strong>in</strong>g[] GetGreet<strong>in</strong>gWords(){return new str<strong>in</strong>g[] {"Hello", "There"};}...foreach (str<strong>in</strong>g x <strong>in</strong> GetGreet<strong>in</strong>gWords()){Console.WriteL<strong>in</strong>e (x);}List<strong>in</strong>g 1.21 Expanded form of list<strong>in</strong>g 1.20us<strong>in</strong>g System;public class Snippet{static str<strong>in</strong>g[] GetGreet<strong>in</strong>gWords(){return new str<strong>in</strong>g[] {"Hello", "There"};}}[STAThread]static void Ma<strong>in</strong>(str<strong>in</strong>g[] args){foreach (str<strong>in</strong>g x <strong>in</strong> GetGreet<strong>in</strong>gWords()){Console.WriteL<strong>in</strong>e (x);}}Types declared <strong>in</strong> snippets will be nested with<strong>in</strong> the Snippet class, but that’s veryrarely a problem.Now that we understand what snippets are and what they look like when they’reexpanded, let’s make them a bit more user friendly.Licensed to Rhona Hadida


30 CHAPTER 1 The chang<strong>in</strong>g face of <strong>C#</strong> development1.4.2 Introduc<strong>in</strong>g SnippyJust know<strong>in</strong>g what the code would look like isn’t terribly helpful, so I’ve written a smalltool that you can download from the book’s website. It’s written <strong>in</strong> WPF, so you’ll needto have .NET 3.0 or higher <strong>in</strong>stalled <strong>in</strong> order to run it. Figure 1.6 shows a screenshot ofit <strong>in</strong> action.It’s not a visual masterpiece, but it does the job. You can edit the code, compile it,and run it. There are different options available to use different us<strong>in</strong>g directives andreferences depend<strong>in</strong>g on which part of the book you are look<strong>in</strong>g at, although thechoice of “.NET 3.5” will compile anyth<strong>in</strong>g that doesn’t require extra custom references.Snippy doesn’t try to work out which us<strong>in</strong>g directives are actually required by thecode, so the full code is rather longer than the examples <strong>in</strong> the previous section, buthav<strong>in</strong>g extra us<strong>in</strong>g directives is harmless.Aside from the WPF requirement to run Snippy, everyth<strong>in</strong>g <strong>in</strong> the <strong>C#</strong> 2 section ofthe book compiles and runs with only .NET 2.0 <strong>in</strong>stalled, and all the snippets compileand run with .NET 3.5 <strong>in</strong>stalled. There’s a s<strong>in</strong>gle button to compile and run, as you’reunlikely to want to do anyth<strong>in</strong>g after a successful compilation other than runn<strong>in</strong>gthe code.As I mentioned earlier, not all examples work this way—the examples <strong>in</strong> this chapter,for <strong>in</strong>stance, all require the Product type, which isn’t <strong>in</strong>cluded <strong>in</strong> every snippet. Fromthis po<strong>in</strong>t on, however, I will give fair warn<strong>in</strong>g whenever a list<strong>in</strong>g isn’t a snippet—sounless you hear otherwise, you should be able to type it <strong>in</strong> and play around with it.Of course, if you don’t like manually typ<strong>in</strong>g <strong>in</strong> code from books, you can downloadall of the code from the book’s website, <strong>in</strong>clud<strong>in</strong>g extra examples that don’tappear directly <strong>in</strong> the text. All the code works <strong>in</strong> the Express editions of Visual<strong>C#</strong> 2005 and 2008, although of course the examples that are specific to <strong>C#</strong> 3 don’trun <strong>in</strong> Visual <strong>C#</strong> 2005. 8Figure 1.6 Snippy <strong>in</strong> action.The code <strong>in</strong> the top area isconverted <strong>in</strong>to a full program,then run. Its output is shown<strong>in</strong> the bottom area.8Some of them may run <strong>in</strong> Visual Studio 2005 with the <strong>C#</strong> 3 extension Community Technology Preview (CTP)<strong>in</strong>stalled, but I make no guarantees. The language has changed <strong>in</strong> a few ways s<strong>in</strong>ce the f<strong>in</strong>al CTP was released,and I haven’t tested any of the code <strong>in</strong> this environment. Visual <strong>C#</strong> 2008 Express is free, though, so why notgive it a try?Licensed to Rhona Hadida


Summary311.5 SummaryIn this chapter, I’ve shown (but not expla<strong>in</strong>ed) some of the features that are tackled <strong>in</strong>depth <strong>in</strong> the rest of the book. There are plenty more that haven’t been shown here,and all the features we’ve seen so far have further “subfeatures” associated with them.Hopefully what you’ve seen here has whetted your appetite for the rest of the book.After look<strong>in</strong>g through some actual code, we took a step back to consider the historyof <strong>C#</strong> and the .NET Framework. No technology is developed <strong>in</strong> a vacuum, andwhen it’s commercial (whether or not it directly comes with a price tag) you can guaranteethat the fund<strong>in</strong>g body sees a bus<strong>in</strong>ess opportunity <strong>in</strong> that development. I’ve notbeen through Microsoft’s <strong>in</strong>ternal company memos, nor have I <strong>in</strong>terviewed Bill Gates,but I’ve given my view on the reasons Microsoft has <strong>in</strong>vested so much <strong>in</strong> .NET, andwhat the rest of the world has been do<strong>in</strong>g <strong>in</strong> the same period. By talk<strong>in</strong>g around thelanguage, I hope I’ve made you more comfortable <strong>in</strong> the language, and what it’s try<strong>in</strong>gto achieve.We then performed a little detour by way of version numbers. This was ma<strong>in</strong>ly tomake sure that you’ll understand what I mean when I refer to particular .NET and <strong>C#</strong>version numbers (and how different those two can be!), but it might also help whentalk<strong>in</strong>g with other people who may not have quite as clear a grasp on the matter as younow do. It’s important to be able to get to the bottom of what people actually meanwhen they talk about a particular version, and with the <strong>in</strong>formation <strong>in</strong> this chapter youshould be able to ask appropriate questions to get an accurate picture. This could beparticularly useful if you ever talk to other developers <strong>in</strong> a support role—establish<strong>in</strong>gthe operat<strong>in</strong>g environment is always critical.F<strong>in</strong>ally, I described how code will be presented <strong>in</strong> this book, and <strong>in</strong>troducedSnippy, the application you can use to run the code quickly if you don’t want to downloadthe full set of complete samples from the book’s website. This system of codesnippets is designed to pack the book with the really <strong>in</strong>terest<strong>in</strong>g parts of code samples—the bits that demonstrate the language features I’ll be expla<strong>in</strong><strong>in</strong>g—without remov<strong>in</strong>gthe possibility of actually runn<strong>in</strong>g the code yourself.There’s one more area we need to cover before we dive <strong>in</strong>to the features of <strong>C#</strong> 2, andthat’s <strong>C#</strong> 1. Obviously as an author I have no idea how knowledgeable you are about<strong>C#</strong> 1, but I do have some understand<strong>in</strong>g of which areas of <strong>C#</strong> 1 are typically understoodfairly vaguely. Some of these areas are critical to gett<strong>in</strong>g the most out of <strong>C#</strong> 2 and 3, so<strong>in</strong> the next chapter I’ll go over them <strong>in</strong> some detail.Licensed to Rhona Hadida


Core foundations:build<strong>in</strong>g on <strong>C#</strong> 1This chapter covers■■■DelegatesType system characteristicsValue/reference typesThis is not a refresher on the whole of <strong>C#</strong> 1. Let’s get that out of the way immediately.I couldn’t do justice to any topic <strong>in</strong> <strong>C#</strong> if I had to cover the whole of the firstversion <strong>in</strong> a s<strong>in</strong>gle chapter. I’ve written this book assum<strong>in</strong>g that all my readers are atleast reasonably competent <strong>in</strong> <strong>C#</strong> 1. What counts as “reasonably competent” is, ofcourse, a somewhat subjective matter, but I’ll assume you would at least be happy towalk <strong>in</strong>to an <strong>in</strong>terview for a junior <strong>C#</strong> developer role and answer technical questionsappropriate to that job. My expectation is that many readers will have moreexperience, but that’s the level of knowledge I’m assum<strong>in</strong>g.In this chapter we’re go<strong>in</strong>g to focus on three areas of <strong>C#</strong> 1 that are particularlyimportant for <strong>C#</strong> 2 and 3. This should raise the “lowest common denom<strong>in</strong>ator” a little,so that I can make slightly greater assumptions later on <strong>in</strong> the book. Given thatit is a lowest common denom<strong>in</strong>ator, you may well f<strong>in</strong>d you already have a perfect32Licensed to Rhona Hadida


Delegates33understand<strong>in</strong>g of all the concepts <strong>in</strong> this chapter. If you believe that’s the case withouteven read<strong>in</strong>g the chapter, then feel free to skip it. You can always come back later if itturns out someth<strong>in</strong>g wasn’t as simple as you thought. You might want to at least look atthe summary at the end of each section, which highlights the important po<strong>in</strong>ts—if anyof those sound unfamiliar, it’s worth read<strong>in</strong>g that section <strong>in</strong> detail.You may be wonder<strong>in</strong>g why I’ve <strong>in</strong>cluded this chapter at all, if I’ve already assumedyou know <strong>C#</strong>1. Well, my experience is that some of the fundamental aspects of <strong>C#</strong>tend to be fudged over, both <strong>in</strong> books and tutorials. As a result, there’s a lot of somewhathazy understand<strong>in</strong>g of these concepts among developers—creat<strong>in</strong>g a lot of questions(and occasional well-<strong>in</strong>tentioned but ill-<strong>in</strong>formed answers) <strong>in</strong> the <strong>C#</strong> newsgroup,for <strong>in</strong>stance.The misunderstood concepts tend to be about the type system used by <strong>C#</strong> and.NET, and the details around it. If those concepts weren’t important when learn<strong>in</strong>gabout <strong>C#</strong> 2 and 3, I wouldn’t have <strong>in</strong>cluded this chapter, but as it happens they’reabsolutely crucial to the rest of the book. It’s hard to come to grips with generics if youdon’t understand static typ<strong>in</strong>g, for <strong>in</strong>stance, or to understand the problem solved bynullable types if the difference between value types and reference types is a bit of ablur. There’s no shame <strong>in</strong> hav<strong>in</strong>g an <strong>in</strong>complete understand<strong>in</strong>g of these concepts—often the details and differences are only important <strong>in</strong> certa<strong>in</strong> rare situations or whendiscuss<strong>in</strong>g technicalities. Given that, a more thorough understand<strong>in</strong>g of the language<strong>in</strong> which you’re work<strong>in</strong>g is always a good th<strong>in</strong>g.In this chapter we’ll be look<strong>in</strong>g at three high-level concepts:■ Delegates■ Type system characteristics■ Value types and reference typesIn each case I’ll describe the ideas and behavior, as well as take the opportunity todef<strong>in</strong>e terms so that I can use them later on. After we’ve looked at how <strong>C#</strong> 1 works, I’llshow you a quick preview of how many of the new features <strong>in</strong> <strong>C#</strong> 2 and 3 relate to thetopics exam<strong>in</strong>ed <strong>in</strong> this chapter.2.1 DelegatesI’m sure you already have an <strong>in</strong>st<strong>in</strong>ctive idea about the concept of a delegate, hard asit can be to articulate. If you’re familiar with C and had to describe delegates toanother C programmer, the term “function po<strong>in</strong>ter” would no doubt crop up. Essentially,delegates provide a way of giv<strong>in</strong>g a level of <strong>in</strong>direction, so that <strong>in</strong>stead of specify<strong>in</strong>gbehavior directly, it can be <strong>in</strong> some way “conta<strong>in</strong>ed” <strong>in</strong> an object, which can beused like any other object, where one available option is to execute the encapsulatedbehavior. Alternatively, you can th<strong>in</strong>k of a delegate type as a s<strong>in</strong>gle-method <strong>in</strong>terface,and a delegate <strong>in</strong>stance as an object implement<strong>in</strong>g that <strong>in</strong>terface.If that’s just a lot of gobbledygook to you, maybe an example will help. It’s slightlymorbid, but it does capture what delegates are all about. Consider your will—that is,Licensed to Rhona Hadida


34 CHAPTER 2 Core foundations: build<strong>in</strong>g on <strong>C#</strong> 1your last will and testament. It is a set of <strong>in</strong>structions—“pay the bills, make a donationto charity, leave the rest of my estate to the cat,” for <strong>in</strong>stance. You write it before yourdeath, and leave it <strong>in</strong> an appropriately safe place. After your death, your attorney will(you hope!) act on those <strong>in</strong>structions.A delegate <strong>in</strong> <strong>C#</strong> acts like your will does <strong>in</strong> the real world—as a sequence of actionsto be executed at the appropriate time. Delegates are typically used when the code thatwants to execute the actions doesn’t know the details of what that action should be. For<strong>in</strong>stance, the only reason that the Thread class knows what to run <strong>in</strong> a new thread whenyou start it is because you provide it with a ThreadStart delegate <strong>in</strong>stance.We’ll start our tour of delegates with the four absolute basics, without which noneof the rest would make sense.2.1.1 A recipe for simple delegatesIn order for delegates to do anyth<strong>in</strong>g, four th<strong>in</strong>gs need to happen:■ The delegate type needs to be declared.■ There must be a method conta<strong>in</strong><strong>in</strong>g the code to execute.■ A delegate <strong>in</strong>stance must be created.■ The delegate <strong>in</strong>stance must be <strong>in</strong>voked.Let’s take each of the steps of this recipe <strong>in</strong> turn.DECLARING THE DELEGATE TYPEA delegate type is effectively just a list of parameter types and a return type. It specifieswhat k<strong>in</strong>d of action can be represented by <strong>in</strong>stances of the type. For <strong>in</strong>stance, considera delegate type declared like this:delegate void Str<strong>in</strong>gProcessor (str<strong>in</strong>g <strong>in</strong>put);The code says that if we want to create an <strong>in</strong>stance of Str<strong>in</strong>gProcessor, we’re go<strong>in</strong>g toneed a method with one parameter (a str<strong>in</strong>g) and a void return type (the methoddoesn’t return anyth<strong>in</strong>g). It’s important to understand that Str<strong>in</strong>gProcessor really isa type. It has methods, you can create <strong>in</strong>stances of it, pass around references to <strong>in</strong>stances,the whole works. There are obviously a few “special features,” but if you’re ever stuckwonder<strong>in</strong>g what will happen <strong>in</strong> a particular situation, first th<strong>in</strong>k about what would happenif you were just us<strong>in</strong>g a “normal” reference type.NOTESource of confusion: the ambiguous term “delegate”—Delegates are often misunderstoodbecause the word “delegate” is used to describe both a “delegatetype” and a “delegate <strong>in</strong>stance.” The dist<strong>in</strong>ction between these two isexactly the same as the one that exists between any other type and<strong>in</strong>stances of that type—the str<strong>in</strong>g type itself is different from a particularsequence of characters, for example. I’ve used the terms “delegate type”and “delegate <strong>in</strong>stance” throughout this chapter to try to keep it clearexactly what I’m talk<strong>in</strong>g about at any po<strong>in</strong>t.We’ll use the Str<strong>in</strong>gProcessor delegate type when we consider the next <strong>in</strong>gredient.Licensed to Rhona Hadida


Delegates35FINDING AN APPROPRIATE METHOD FOR THE DELEGATE INSTANCE’S ACTIONIn .NET, delegate <strong>in</strong>stances always refer to methods. Our next <strong>in</strong>gredient is to f<strong>in</strong>d (orwrite, of course) a method that does what we want and has the same signature as the delegatetype we’re us<strong>in</strong>g. The idea is to make sure that when we try to <strong>in</strong>voke a delegate<strong>in</strong>stance, the parameters we use will all match up and we’ll be able to use the return value(if any) <strong>in</strong> the way we expect—just like a normal method call.Now consider these five method signatures as candidates to be used for a Str<strong>in</strong>g-Processor <strong>in</strong>stance:void Pr<strong>in</strong>tStr<strong>in</strong>g (str<strong>in</strong>g x)void Pr<strong>in</strong>tInteger (<strong>in</strong>t x)void Pr<strong>in</strong>tTwoStr<strong>in</strong>gs (str<strong>in</strong>g x, str<strong>in</strong>g y)<strong>in</strong>t GetStr<strong>in</strong>gLength (str<strong>in</strong>g x)void Pr<strong>in</strong>tObject (object x)The first method has everyth<strong>in</strong>g right, so we can use it to create a delegate <strong>in</strong>stance.The second method takes one parameter, but it’s not str<strong>in</strong>g, so it’s <strong>in</strong>compatible withStr<strong>in</strong>gProcessor. The third method has the correct first parameter type, but it hasanother parameter as well, so it’s still <strong>in</strong>compatible.The fourth method has the right parameter list but a nonvoid return type. (If ourdelegate type had a return type, the return type of the method would have to matchthat too.) The fifth method is <strong>in</strong>terest<strong>in</strong>g—any time we <strong>in</strong>voke a Str<strong>in</strong>gProcessor<strong>in</strong>stance we could call the Pr<strong>in</strong>tObject method with the same arguments, becausestr<strong>in</strong>g derives from object. It would make sense to be able to use it for an <strong>in</strong>stance ofStr<strong>in</strong>gProcessor, but <strong>C#</strong> 1 limits the delegate to have exactly the same parametertypes. 1 <strong>C#</strong> 2 changes this situation—see chapter 5 for more details. In some ways thefourth method is similar, as you could always ignore the unwanted return value. However,void and nonvoid return types are currently always deemed to be <strong>in</strong>compatible.Let’s assume we’ve got a method body for the compatible signature (Pr<strong>in</strong>tStr<strong>in</strong>g)and move on to our next <strong>in</strong>gredient—the delegate <strong>in</strong>stance itself.CREATING A DELEGATE INSTANCENow that we’ve got a delegate type and a method with the right signature, we can createan <strong>in</strong>stance of that delegate type, specify<strong>in</strong>g that this method be executed when the delegate<strong>in</strong>stance is <strong>in</strong>voked. There’s no good official term<strong>in</strong>ology def<strong>in</strong>ed for this, but forthis book I will call it the action of the delegate <strong>in</strong>stance. The exact form of the expressionused to create the delegate <strong>in</strong>stance depends on whether the action uses an <strong>in</strong>stancemethod or a static method. Suppose Pr<strong>in</strong>tStr<strong>in</strong>g is a static method <strong>in</strong> a type calledStaticMethods and an <strong>in</strong>stance method <strong>in</strong> a type called InstanceMethods. Here are twoexamples of creat<strong>in</strong>g an <strong>in</strong>stance of Str<strong>in</strong>gProcessor:Str<strong>in</strong>gProcessor proc1, proc2;proc1 = new Str<strong>in</strong>gProcessor(StaticMethods.Pr<strong>in</strong>tStr<strong>in</strong>g);InstanceMethods <strong>in</strong>stance = new InstanceMethods();proc2 = new Str<strong>in</strong>gProcessor(<strong>in</strong>stance.Pr<strong>in</strong>tStr<strong>in</strong>g);1As well as the parameter types, you have to match whether the parameter is <strong>in</strong> (the default), out, or ref. It’sreasonably rare to use out/ref parameters with delegates, though.Licensed to Rhona Hadida


36 CHAPTER 2 Core foundations: build<strong>in</strong>g on <strong>C#</strong> 1When the action is a static method, you only need to specify the type name. When theaction is an <strong>in</strong>stance method, you need an <strong>in</strong>stance of the type (or a derived type)—just as if you were call<strong>in</strong>g the method <strong>in</strong> the normal way. This object is called the targetof the action, and when the delegate <strong>in</strong>stance is <strong>in</strong>voked, the method will be called onthat object. If the action is with<strong>in</strong> the same class (as it often is, particularly whenyou’re writ<strong>in</strong>g event handlers <strong>in</strong> UI code), you don’t need to qualify it either way—thethis reference is used implicitly for <strong>in</strong>stance methods. 2 Aga<strong>in</strong>, these rules act just as ifyou were call<strong>in</strong>g the method directly.NOTEUtter garbage! (Or not, as the case may be…)—It’s worth be<strong>in</strong>g aware that adelegate <strong>in</strong>stance will prevent its target from be<strong>in</strong>g garbage collected, ifthe delegate <strong>in</strong>stance itself can’t be collected. This can result <strong>in</strong> apparentmemory leaks, particularly when a “short-lived” object subscribes toan event <strong>in</strong> a “long-lived” object, us<strong>in</strong>g itself as the target. The long-livedobject <strong>in</strong>directly holds a reference to the short-lived one, prolong<strong>in</strong>gits lifetime.There’s not much po<strong>in</strong>t <strong>in</strong> creat<strong>in</strong>g a delegate <strong>in</strong>stance if it doesn’t get <strong>in</strong>voked atsome po<strong>in</strong>t. Let’s look at our last step—the <strong>in</strong>vocation.INVOKING A DELEGATE INSTANCEThis is the really easy bit 3 —it’s just a case of call<strong>in</strong>g a method on the delegate <strong>in</strong>stance.The method itself is called Invoke, and it’s always present <strong>in</strong> a delegate type with thesame list of parameters and return type that the delegate type declaration specifies. So<strong>in</strong> our case, there’s a method like this:void Invoke (str<strong>in</strong>g <strong>in</strong>put)Call<strong>in</strong>g Invoke will execute the action of the delegate<strong>in</strong>stance, pass<strong>in</strong>g on whatever parameters you’vespecified <strong>in</strong> the call to Invoke, and (if the return typeisn’t void) return<strong>in</strong>g the return value of the action.Simple as this is, <strong>C#</strong> makes it even easier—if youhave a variable 4 whose type is a delegate type, you cantreat it as if it were a method itself. It’s easiest to see thishappen<strong>in</strong>g as a cha<strong>in</strong> of events occurr<strong>in</strong>g at differenttimes, as shown <strong>in</strong> figure 2.1.So, that’s simple too. All our <strong>in</strong>gredients are <strong>in</strong>place, so we can preheat our CLR to 220°F, stir everyth<strong>in</strong>gtogether, and see what happens.Compiles to...Which atexecutiontime <strong>in</strong>vokes...proc1("Hello");proc1.Invoke("Hello");Pr<strong>in</strong>tStr<strong>in</strong>g("Hello");Figure 2.1 Process<strong>in</strong>g a call to adelegate <strong>in</strong>stance that uses the <strong>C#</strong>shorthand syntax2Of course, if the action is an <strong>in</strong>stance method and you’re try<strong>in</strong>g to create a delegate <strong>in</strong>stance from with<strong>in</strong> astatic method, you’ll still need to provide a reference to be the target.3For synchronous <strong>in</strong>vocation, anyway. You can use Beg<strong>in</strong>Invoke and EndInvoke to <strong>in</strong>voke a delegate <strong>in</strong>stanceasynchronously, but that’s beyond the scope of this chapter.4Or any other k<strong>in</strong>d of expression—but it’s usually a variable.Licensed to Rhona Hadida


Delegates37A COMPLETE EXAMPLE AND SOME MOTIVATIONIt’s easiest to see all this <strong>in</strong> action <strong>in</strong> a complete example—f<strong>in</strong>ally, someth<strong>in</strong>g we canactually run! As there are lots of bits and pieces go<strong>in</strong>g on, I’ve <strong>in</strong>cluded the wholesource code this time rather than us<strong>in</strong>g snippets. There’s noth<strong>in</strong>g m<strong>in</strong>d-blow<strong>in</strong>g <strong>in</strong> list<strong>in</strong>g2.1, so don’t expect to be amazed—it’s just useful to have concrete code to discuss.List<strong>in</strong>g 2.1Us<strong>in</strong>g delegates <strong>in</strong> a variety of simple waysus<strong>in</strong>g System;delegate void Str<strong>in</strong>gProcessor(str<strong>in</strong>g <strong>in</strong>put);BDeclaresdelegate typeclass Person{}str<strong>in</strong>g name;public Person(str<strong>in</strong>g name){this.name = name;}public void Say(str<strong>in</strong>g message){Console.WriteL<strong>in</strong>e ("{0} says: {1}", name, message);}CDeclares compatible<strong>in</strong>stance methodclass Background{public static void Note(str<strong>in</strong>g note){Console.WriteL<strong>in</strong>e ("({0})", note);}DDeclares compatiblestatic method}class SimpleDelegateUse{static void Ma<strong>in</strong>(){Person jon = new Person("Jon");Person tom = new Person("Tom");Str<strong>in</strong>gProcessor jonsVoice, tomsVoice, background;jonsVoice = new Str<strong>in</strong>gProcessor(jon.Say);tomsVoice = new Str<strong>in</strong>gProcessor(tom.Say);background = new Str<strong>in</strong>gProcessor(Background.Note);E Createsthreedelegate<strong>in</strong>stances}}jonsVoice("Hello, son.");tomsVoice.Invoke("Hello, Daddy!");background("An airplane flies past.");Invokes delegate<strong>in</strong>stancesTo start with, we declare the delegate type B. Next, we create two methods (Cand D) that are both compatible with the delegate type. We’ve got one <strong>in</strong>stancemethod (Person.Say) and one static method (Background.Note) so that we can seeFLicensed to Rhona Hadida


38 CHAPTER 2 Core foundations: build<strong>in</strong>g on <strong>C#</strong> 1how they’re used differently when we create the delegate <strong>in</strong>stances E. We’ve createdtwo <strong>in</strong>stances of the Person class so that we can see the difference that the targetof a delegate makes. When jonsVoice is <strong>in</strong>voked F, it calls the Say method onthe Person object with the name Jon; likewise, when tomsVoice is <strong>in</strong>voked, it usesthe object with the name Tom. I’ve <strong>in</strong>cluded both the ways we’ve seen of <strong>in</strong>vok<strong>in</strong>gdelegate <strong>in</strong>stances—call<strong>in</strong>g Invoke explicitly and us<strong>in</strong>g the <strong>C#</strong> shorthand—just for<strong>in</strong>terest’s sake. Normally you’d just use the shorthand. The output for list<strong>in</strong>g 2.1 isfairly obvious:Jon says: Hello, son.Tom says: Hello, Daddy!(An airplane flies past.)Frankly, there’s an awful lot of code <strong>in</strong> list<strong>in</strong>g 2.1 to display three l<strong>in</strong>es of output. Evenif we wanted to use the Person class and the Background class, there’s no real need touse delegates here. So what’s the po<strong>in</strong>t? Why can’t we just call methods directly? Theanswer lies <strong>in</strong> our orig<strong>in</strong>al example of an attorney execut<strong>in</strong>g a will—just because youwant someth<strong>in</strong>g to happen, that doesn’t mean you’re always there at the right timeand place to make it happen yourself. Sometimes you need to give <strong>in</strong>structions—todelegate responsibility, as it were.I should stress that back <strong>in</strong> the world of software, this isn’t a matter ofobjects leav<strong>in</strong>g dy<strong>in</strong>g wishes. Often the object that first creates a delegate<strong>in</strong>stance is still alive and well when the delegate <strong>in</strong>stance is<strong>in</strong>voked. Instead, it’s about specify<strong>in</strong>g some code to be executed at aparticular time, when you may not be able to (or may not want to)change the code that is runn<strong>in</strong>g at that po<strong>in</strong>t. If I want someth<strong>in</strong>g tohappen when a button is clicked, I don’t want to have to change thecode of the button—I just want to tell the button to call one of my methodsthat will take the appropriate action. It’s a matter of add<strong>in</strong>g a level of<strong>in</strong>direction—as so much of object-oriented programm<strong>in</strong>g is. As we’ve seen, this addscomplexity (look at how many l<strong>in</strong>es of code it took to produce so little output!) butalso flexibility.Now that we understand a bit more about simple delegates, we’ll take a brief look atcomb<strong>in</strong><strong>in</strong>g delegates together to execute a whole bunch of actions <strong>in</strong>stead of just one.Reasons fordelegatesexist<strong>in</strong>g2.1.2 Comb<strong>in</strong><strong>in</strong>g and remov<strong>in</strong>g delegatesSo far, all the delegate <strong>in</strong>stances we’ve looked at have had a s<strong>in</strong>gle action. The truth isa little bit more complicated: a delegate <strong>in</strong>stance actually has a list of actions associatedwith it. This is called the <strong>in</strong>vocation list of the delegate <strong>in</strong>stance. The static Comb<strong>in</strong>e andRemove methods of the System.Delegate type are responsible for creat<strong>in</strong>g new delegate<strong>in</strong>stances by respectively splic<strong>in</strong>g together the <strong>in</strong>vocation lists of two delegate<strong>in</strong>stances or remov<strong>in</strong>g the <strong>in</strong>vocation list of one delegate <strong>in</strong>stance from another.Before we look at the details, it’s important to note that delegate <strong>in</strong>stances areimmutable. Once you’ve created a delegate <strong>in</strong>stance, noth<strong>in</strong>g about it can be changed.This makes it safe to pass around delegate <strong>in</strong>stances and comb<strong>in</strong>e them with othersLicensed to Rhona Hadida


Delegates39without worry<strong>in</strong>g about consistency, thread safety, or anyone try<strong>in</strong>g to change theiractions. This is just the same with delegate <strong>in</strong>stances as it is with str<strong>in</strong>gs, which arealso immutable. The reason for mention<strong>in</strong>g this is that Delegate.Comb<strong>in</strong>e is just likeStr<strong>in</strong>g.Concat—they both comb<strong>in</strong>e exist<strong>in</strong>g <strong>in</strong>stances together to form a new onewithout chang<strong>in</strong>g the orig<strong>in</strong>al objects at all. In the case of delegate <strong>in</strong>stances, theorig<strong>in</strong>al <strong>in</strong>vocation lists are concatenated together. Note that if you ever try to comb<strong>in</strong>enull with a delegate <strong>in</strong>stance, the null is treated as if it were a delegate<strong>in</strong>stance with an empty <strong>in</strong>vocation list.You’ll rarely see an explicit call to Delegate.x += y;x = x+y;x = Delegate.Comb<strong>in</strong>e(x, y);Figure 2.2 The transformation processused for the <strong>C#</strong> shorthand syntax forcomb<strong>in</strong><strong>in</strong>g delegate <strong>in</strong>stancesComb<strong>in</strong>e <strong>in</strong> <strong>C#</strong> code—usually the + and += operatorsare used. Figure 2.2 shows the translationprocess, where x and y are both variables of thesame (or compatible) delegate types. All of this isdone by the <strong>C#</strong> compiler.As you can see, it’s a straightforward transformation,but it does make the code a lot neater.Just as you can comb<strong>in</strong>e delegate <strong>in</strong>stances,you can remove one from another with theDelegate.Remove method, and <strong>C#</strong> uses the shorthandof the - and -= operators <strong>in</strong> the obvious way. Delegate.Remove(source, value)creates a new delegate whose <strong>in</strong>vocation list is the one from source, with the list fromvalue hav<strong>in</strong>g been removed. If the result would have an empty <strong>in</strong>vocation list, nullis returned.Table 2.1 shows some examples of the results of comb<strong>in</strong><strong>in</strong>g and remov<strong>in</strong>g delegate<strong>in</strong>stances. I’ve used the notation [a, b, c] to <strong>in</strong>dicate a delegate with an <strong>in</strong>vocationlist of actions a, b, and c (whatever they may happen to be).Table 2.1A selection of examples of comb<strong>in</strong><strong>in</strong>g and remov<strong>in</strong>g delegates, show<strong>in</strong>g what the resultis and whyOperation Result Notes[a] + [b] [a, b] --[a] + null [a] null counts as an empty <strong>in</strong>vocation list.null + [a] [a] --[a] + [b, c] [a, b, c] --[a, b] + [b, c] [a, b, b, c] Duplicates are allowed.[a, b, c] - [b] [a, c] The removal list doesn’t have to be at the end…[a, b, c, d, b, c] - [b, c] [a, b, c, d] … but the last occurrence of the removal list isremoved.[a] - [b] [a] No-op removal of nonexistent list.Licensed to Rhona Hadida


40 CHAPTER 2 Core foundations: build<strong>in</strong>g on <strong>C#</strong> 1Table 2.1A selection of examples of comb<strong>in</strong><strong>in</strong>g and remov<strong>in</strong>g delegates, show<strong>in</strong>g what the resultis and why (cont<strong>in</strong>ued)Operation Result Notes[a, b, c, d, e] - [a, c, e] [a, b, c, d, e] The removal list must be present as a sublist;it’s not just remov<strong>in</strong>g each element of theremoval list.[a, b] - [a, b] null null is returned for an empty <strong>in</strong>vocation list.When a delegate <strong>in</strong>stance is <strong>in</strong>voked, all its actions are executed <strong>in</strong> order. If the delegate’ssignature has a nonvoid return type, the value returned by Invoke is the valuereturned by the last action executed. It’s quite rare to see a nonvoid delegate <strong>in</strong>stancewith more than one action <strong>in</strong> its <strong>in</strong>vocation list, because it means the return values ofall the other actions are never seen.If any of the actions <strong>in</strong> the <strong>in</strong>vocation list throws an exception, that prevents any ofthe subsequent actions from be<strong>in</strong>g executed. For example, if a delegate <strong>in</strong>stance withan action list [a, b, c] is <strong>in</strong>voked, and action b throws an exception, then the exceptionwill be propagated immediately and action c won’t be executed.Comb<strong>in</strong><strong>in</strong>g and remov<strong>in</strong>g delegate <strong>in</strong>stances is particularly useful when it comes toevents. Now that we understand what comb<strong>in</strong><strong>in</strong>g and remov<strong>in</strong>g <strong>in</strong>volves, we can sensiblytalk about what events are.2.1.3 A brief diversion <strong>in</strong>to eventsYou probably have an <strong>in</strong>st<strong>in</strong>ctive idea about the overall po<strong>in</strong>t of events—particularly ifyou’ve written any UIs. The idea is that an event allows code to react when someth<strong>in</strong>ghappens—sav<strong>in</strong>g a file when the appropriate button is clicked, for example. In thiscase, the event is the button be<strong>in</strong>g clicked, and the action is the sav<strong>in</strong>g of the file.Understand<strong>in</strong>g the reason for the concept isn’t the same as understand<strong>in</strong>g how <strong>C#</strong>def<strong>in</strong>es events <strong>in</strong> language terms, however.Developers often get confused between events and delegate <strong>in</strong>stances, or betweenevents and delegate type fields. The difference is important: events aren’t delegate typefields. The reason for the confusion is that yet aga<strong>in</strong>, <strong>C#</strong> provides a shorthand, <strong>in</strong> theform of field-like events. We’ll come to those <strong>in</strong> a m<strong>in</strong>ute, but first let’s consider whatevents consist of as far as the <strong>C#</strong> compiler is concerned.I th<strong>in</strong>k it’s helpful to th<strong>in</strong>k of events as be<strong>in</strong>g very similar to properties.To start with, both of them are declared to be of a certa<strong>in</strong> type, which <strong>in</strong>the case of an event is forced to be a delegate type. When you use properties,it looks like you’re fetch<strong>in</strong>g or assign<strong>in</strong>g values directly to fields, butyou’re actually call<strong>in</strong>g methods (getters and setters). The property implementationcan do what it likes with<strong>in</strong> those methods—it just happensthat most properties are implemented with simple fields back<strong>in</strong>g them,Eventsare likeproperties—bothencapsulatedatasometimes with some validation <strong>in</strong> the setter and sometimes with some thread safetythrown <strong>in</strong> for good measure.Licensed to Rhona Hadida


Delegates41Likewise, when you subscribe to or unsubscribe from an event, it looks like you’reus<strong>in</strong>g a field whose type is a delegate type, with the += and -= operators. Aga<strong>in</strong>,though, you’re actually call<strong>in</strong>g methods (add and remove 5 ). That’s all you can do witha pure event—subscribe to it (add an event handler) or unsubscribe from it (removean event handler). It’s up to the event methods to do someth<strong>in</strong>g useful—such as tak<strong>in</strong>gnotice of the event handlers you’re try<strong>in</strong>g to add and remove, and mak<strong>in</strong>g themavailable elsewhere with<strong>in</strong> the class.The reason for hav<strong>in</strong>g events <strong>in</strong> the first place is very much like the reason forhav<strong>in</strong>g properties—they add a layer of encapsulation. Just as you don’t want othercode to be able to set field values without the owner at least hav<strong>in</strong>g the option of validat<strong>in</strong>gthe new value, you often don’t want code outside a class to be able to arbitrarilychange (or call) the handlers for an event. Of course, a class can add methodsto give extra access—for <strong>in</strong>stance, to reset the list of handlers for an event, or toraise the event (<strong>in</strong> other words, call its event handlers). For example, Background-Worker.OnProgressChanged just calls the ProgressChanged event handlers. However,if you only expose the event itself, code outside the class only has the ability toadd and remove handlers.Field-like events make the implementation of all of this much simpler to look at—as<strong>in</strong>gle declaration and you’re done. The compiler turns the declaration <strong>in</strong>to both anevent with default add/remove implementations, and a private delegate type field.Code <strong>in</strong>side the class sees the field; code outside the class only sees the event. Thismakes it look as if you can <strong>in</strong>voke an event—but what you actually do to call the eventhandlers is <strong>in</strong>voke the delegate <strong>in</strong>stance stored <strong>in</strong> the field.The details of events are outside the scope of this chapter—events themselves haven’tchanged significantly <strong>in</strong> <strong>C#</strong> 2 or 3—but I wanted to draw attention to the differencebetween delegate <strong>in</strong>stances and events now, to prevent it caus<strong>in</strong>g confusion later on.2.1.4 Summary of delegatesSo, to summarize what we’ve covered on delegates:■■■■■■■Delegates allow behavior to be encapsulated.The declaration of a delegate type controls which methods can be used to createdelegate <strong>in</strong>stances.Creat<strong>in</strong>g a delegate <strong>in</strong>stance requires a method and (for <strong>in</strong>stance methods) atarget to call the method on.Delegate <strong>in</strong>stances are immutable.Delegate <strong>in</strong>stances each conta<strong>in</strong> an <strong>in</strong>vocation list—a list of actions.Delegate <strong>in</strong>stances can be comb<strong>in</strong>ed together and removed from each other.Events are not delegate <strong>in</strong>stances—they’re just add/remove method pairs(th<strong>in</strong>k property getters/setters).5These aren’t their names <strong>in</strong> the compiled code; otherwise you could only have one event per type. The compilercreates two methods with names that aren’t used elsewhere, and a special piece of metadata to let othertypes know that there’s an event with the given name, and what its add/remove methods are called.Licensed to Rhona Hadida


42 CHAPTER 2 Core foundations: build<strong>in</strong>g on <strong>C#</strong> 1Delegates are one very specific feature of <strong>C#</strong> and .NET—a detail, <strong>in</strong> the grand schemeof th<strong>in</strong>gs. Both of the other “rem<strong>in</strong>der” sections <strong>in</strong> this chapter deal with muchbroader topics. First, we will consider what it means to talk about <strong>C#</strong> be<strong>in</strong>g a staticallytyped language and the implications that has.2.2 Type system characteristicsAlmost every programm<strong>in</strong>g language has a type system of some k<strong>in</strong>d. Over time, thesehave been classified as strong/weak, safe/unsafe, static/dynamic, and no doubt somemore esoteric variations. It’s obviously important to understand the type system withwhich one is work<strong>in</strong>g, and it’s reasonable to expect that know<strong>in</strong>g the categories <strong>in</strong>towhich a language falls would give a lot of <strong>in</strong>formation to help on that front. However,because the terms are used to mean somewhat different th<strong>in</strong>gs by different people,miscommunication is almost <strong>in</strong>evitable. I’ll try to say exactly what I mean by each termto avoid confusion as much as possible.One important th<strong>in</strong>g to note is that this section is only applicable to “safe” code—which means all <strong>C#</strong> code that isn’t explicitly with<strong>in</strong> an unsafe context. As you mightjudge from the name, code with<strong>in</strong> an unsafe context can do various th<strong>in</strong>gs that safecode can’t, and that may violate some aspects of normal type safety. Most developersare unlikely ever to need to write unsafe code, and the characteristics of the type systemare far simpler to describe and understand when only safe code is considered.This section shows what restrictions are and aren’t enforced <strong>in</strong> <strong>C#</strong> 1 while def<strong>in</strong><strong>in</strong>gsome terms to describe that behavior. We’ll then see a few th<strong>in</strong>gs we can’t dowith <strong>C#</strong> 1—first from the po<strong>in</strong>t of view of what we can’t tell the compiler, and thenfrom the po<strong>in</strong>t of view of what we wish we didn’t have to tell the compiler.Let’s start off with what <strong>C#</strong> 1 does, and what term<strong>in</strong>ology is usually used to describethat k<strong>in</strong>d of behavior.2.2.1 <strong>C#</strong>’s place <strong>in</strong> the world of type systemsIt’s easiest to beg<strong>in</strong> by mak<strong>in</strong>g a statement, and then clarify what it actually means andwhat the alternatives might be:<strong>C#</strong> 1’s type system is static, explicit, and safe.You might well have expected the word strong to appear <strong>in</strong> the list, and I had half am<strong>in</strong>d to <strong>in</strong>clude it. However, while most people can reasonably agree on whether alanguage has the listed characteristics, decid<strong>in</strong>g whether or not a language is stronglytyped can cause heated debate because the def<strong>in</strong>itions vary so wildly. Some mean<strong>in</strong>gs(those prevent<strong>in</strong>g any conversions, explicit or implicit) would clearly rule <strong>C#</strong> out—whereas others are quite close to (or even the same as) statically typed, which would<strong>in</strong>clude <strong>C#</strong>. Most of the articles and books I’ve read that describe <strong>C#</strong> as a stronglytyped language are effectively us<strong>in</strong>g it to mean statically typed. I’ve used the word statichere to try to m<strong>in</strong>imize the potential for confusion—although it should be noted thatit has little <strong>in</strong> common with the common understand<strong>in</strong>g of the keyword static usedwith<strong>in</strong> code itself as related to the type rather than a particular <strong>in</strong>stance.Let’s take the terms <strong>in</strong> the def<strong>in</strong>ition one at a time and shed some light on them.Licensed to Rhona Hadida


Type system characteristics43STATIC TYPING VS. DYNAMIC TYPING<strong>C#</strong> is statically typed: each variable 6 is of a particular type, and that type is known atcompile time. Only operations that are known about for that type are allowed, andthis is enforced by the compiler. Consider this example of enforcement:object o = "hello";Console.WriteL<strong>in</strong>e (o.Length);Now as developers look<strong>in</strong>g at the code, we obviously know that the value of o refers toa str<strong>in</strong>g, and that the str<strong>in</strong>g type has a Length property, but the compiler only th<strong>in</strong>ksof o as be<strong>in</strong>g of type object. If we want to get to the Length property, we have to tellthe compiler that the value of o actually refers to a str<strong>in</strong>g:object o = "hello";Console.WriteL<strong>in</strong>e (((str<strong>in</strong>g)o).Length);The alternative to static typ<strong>in</strong>g is dynamic typ<strong>in</strong>g, which can take a variety of guises. Theessence of dynamic typ<strong>in</strong>g is to say that variables just have values—they aren’trestricted to particular types, so the compiler can’t perform the same sort of checks.Instead, the execution environment attempts to understand any given expression <strong>in</strong>an appropriate manner for the value <strong>in</strong>volved. For example, if <strong>C#</strong> were dynamicallytyped, we could do this:o = "hello";Console.WriteL<strong>in</strong>e (o.Length);o = new str<strong>in</strong>g[] {"hi", "there"};Console.WriteL<strong>in</strong>e (o.Length);This would be us<strong>in</strong>g two completely unrelated Length properties—Str<strong>in</strong>g.Length andArray.Length—by exam<strong>in</strong><strong>in</strong>g the types dynamically at execution time. Like many areasof def<strong>in</strong><strong>in</strong>g type systems, there are different levels of dynamic typ<strong>in</strong>g. Some languagesallow you to specify types where you want to—possibly still treat<strong>in</strong>g them dynamicallyapart from assignment—but let you use untyped variables elsewhere.EXPLICIT TYPING VS. IMPLICIT TYPINGThe dist<strong>in</strong>ction between explicit typ<strong>in</strong>g and implicit typ<strong>in</strong>g is only relevant <strong>in</strong> statically typedlanguages. With explicit typ<strong>in</strong>g, the type of every variable must be explicitly stated <strong>in</strong> thedeclaration. Implicit typ<strong>in</strong>g allows the compiler to <strong>in</strong>fer the type of the variable basedon its use. For example, the language could dictate that the type of the variable is thetype of the expression used to assign the <strong>in</strong>itial value.Consider a hypothetical language that uses the keyword var to <strong>in</strong>dicate type <strong>in</strong>ference.7 Table 2.2 shows how code <strong>in</strong> such a language could be written <strong>in</strong> <strong>C#</strong> 1. Thecode <strong>in</strong> the left column is not allowed <strong>in</strong> <strong>C#</strong> 1, but the code <strong>in</strong> the right column is theequivalent valid code.6This applies to most expressions too, but not quite all of them. There are certa<strong>in</strong> expressions which don’t havea type, such as void method <strong>in</strong>vocations, but this doesn’t affect <strong>C#</strong>’s status of be<strong>in</strong>g statically typed. I’ve used theword variable throughout this section to avoid unnecessary bra<strong>in</strong> stra<strong>in</strong>.7OK, not so hypothetical. See section 8.2 for <strong>C#</strong> 3’s implicitly typed local variable capabilities.Licensed to Rhona Hadida


44 CHAPTER 2 Core foundations: build<strong>in</strong>g on <strong>C#</strong> 1Invalid <strong>C#</strong> 1—implicit typ<strong>in</strong>gvar s = "hello";var x = s.Length;var twiceX = x*2;Valid <strong>C#</strong> 1—explicit typ<strong>in</strong>gstr<strong>in</strong>g s = "hello";<strong>in</strong>t x = s.Length;<strong>in</strong>t twiceX = x*2;Table 2.2 An example show<strong>in</strong>g thedifferences between implicit andexplicit typ<strong>in</strong>gHopefully it’s clear why this is only relevant for statically typed situations: for bothimplicit and explicit typ<strong>in</strong>g, the type of the variable is known at compile time, even ifit’s not explicitly stated. In a dynamic context, the variable doesn’t even have a type tostate or <strong>in</strong>fer.TYPE-SAFE VS. TYPE-UNSAFEThe easiest way of describ<strong>in</strong>g a type-safe system is to describe its opposite. Some languages(I’m th<strong>in</strong>k<strong>in</strong>g particularly of C and C++) allow you to do some really deviousth<strong>in</strong>gs. They’re potentially powerful <strong>in</strong> the right situations, but with great powercomes a free pack of donuts, or however the expression goes—and the right situationsare relatively rare. Some of these devious th<strong>in</strong>gs can shoot you <strong>in</strong> the foot if you getthem wrong. Abus<strong>in</strong>g the type system is one of them.With the right voodoo rituals, you can persuade these languages to treat a value ofone type as if it were a value of a completely different type without apply<strong>in</strong>g any conversions.I don’t just mean call<strong>in</strong>g a method that happens to be called the same th<strong>in</strong>g, as<strong>in</strong> our dynamic typ<strong>in</strong>g example earlier. I mean some code that looks at the raw byteswith<strong>in</strong> a value and <strong>in</strong>terprets them <strong>in</strong> the “wrong” way. List<strong>in</strong>g 2.2 gives a simple Cexample of what I mean.List<strong>in</strong>g 2.2Demonstrat<strong>in</strong>g a type-unsafe system with C code#<strong>in</strong>clude <strong>in</strong>t ma<strong>in</strong>(<strong>in</strong>t argc, char **argv){char *first_arg = argv[1];<strong>in</strong>t *first_arg_as_<strong>in</strong>t = (<strong>in</strong>t *)first_arg;pr<strong>in</strong>tf ("%d", *first_arg_as_<strong>in</strong>t);}If you compile list<strong>in</strong>g 2.2 and run it with a simple argument of "hello", you will seea value of 1819043176—at least on a little-endian architecture with a compiler treat<strong>in</strong>g<strong>in</strong>t as 32 bits and char as 8 bits, and where text is represented <strong>in</strong> ASCII or UTF-8.The code is treat<strong>in</strong>g the char po<strong>in</strong>ter as an <strong>in</strong>t po<strong>in</strong>ter, so dereferenc<strong>in</strong>g it returnsthe first 4 bytes of text, treat<strong>in</strong>g them as a number.In fact, this t<strong>in</strong>y example is quite tame compared with other potential abuses—cast<strong>in</strong>g between completely unrelated structs can easily result <strong>in</strong> total mayhem. It’s notthat this actually happens <strong>in</strong> real life very often, but some elements of the C typ<strong>in</strong>g systemoften mean you’ll have to tell the compiler what to do, leav<strong>in</strong>g it no option but totrust you even at execution time.Fortunately, none of this occurs <strong>in</strong> <strong>C#</strong>. Yes, there are plenty of conversions available,but you can’t pretend that data for one particular type of object is actually dataLicensed to Rhona Hadida


Type system characteristics45for a different type. You can try by add<strong>in</strong>g a cast to give the compiler this extra (and<strong>in</strong>correct) <strong>in</strong>formation, but if the compiler spots that it’s actually impossible for thatcast to work, it will trigger a compilation error—and if it’s theoretically allowed butactually <strong>in</strong>correct at execution time, the CLR will throw an exception.Now that we know a little about how <strong>C#</strong> 1 fits <strong>in</strong>to the bigger picture of type systems,I’d like to mention a few downsides of its choices. That’s not to say the choicesare wrong—just limit<strong>in</strong>g <strong>in</strong> some ways. Often language designers have to choosebetween different paths that add different limitations or have other undesirable consequences.I’ll start with the case where you want to tell the compiler more <strong>in</strong>formation,but there’s no way of do<strong>in</strong>g so.2.2.2 When is <strong>C#</strong> 1’s type system not rich enough?There are two common situations where you might want to expose more <strong>in</strong>formationto the caller of a method, or perhaps force the caller to limit what they provide <strong>in</strong>their arguments. The first <strong>in</strong>volves collections, and the second <strong>in</strong>volves <strong>in</strong>heritanceand overrid<strong>in</strong>g methods or implement<strong>in</strong>g <strong>in</strong>terfaces. We’ll exam<strong>in</strong>e each <strong>in</strong> turn.COLLECTIONS, STRONG AND WEAKHav<strong>in</strong>g avoided the terms strong and weak for the <strong>C#</strong> type system <strong>in</strong> general, I’ll usethem when talk<strong>in</strong>g about collections. They’re used almost everywhere <strong>in</strong> this context,with little room for ambiguity. Broadly speak<strong>in</strong>g, three k<strong>in</strong>ds of collection types arebuilt <strong>in</strong>to .NET 1.1:■■■Arrays—strongly typed—which are built <strong>in</strong>to both the language and the runtimeThe weakly typed collections <strong>in</strong> the System.Collections namespaceThe strongly typed collections <strong>in</strong> the System.Collections.SpecializednamespaceArrays are strongly typed, 8 so at compile time you can’t set an element of a str<strong>in</strong>g[] tobe a FileStream, for <strong>in</strong>stance. However, reference type arrays also support covariance,which provides an implicit conversion from one type of array to another, so long asthere’s a conversion between the element types. Checks occur at execution time tomake sure that the wrong type of reference isn’t actually stored, as shown <strong>in</strong> list<strong>in</strong>g 2.3.List<strong>in</strong>g 2.3Demonstration of the covariance of arrays, and execution time type check<strong>in</strong>gstr<strong>in</strong>g[] str<strong>in</strong>gs = new str<strong>in</strong>g[5];object[] objects = str<strong>in</strong>gs;objects[0] = new object();BCApplies covariantconversionAttempts to storea pla<strong>in</strong> “object”If you run list<strong>in</strong>g 2.3, you will see an ArrayTypeMismatchException is thrown C. Thisis because the conversion from str<strong>in</strong>g[] to object[] B returns the orig<strong>in</strong>al reference—bothstr<strong>in</strong>gs and objects refer to the same array. The array itself “knows” it is8At least, the language allows them to be. You can use the Array type for weakly typed access to arrays, though.Licensed to Rhona Hadida


46 CHAPTER 2 Core foundations: build<strong>in</strong>g on <strong>C#</strong> 1a str<strong>in</strong>g array, and will reject attempts to store references to nonstr<strong>in</strong>gs. Covariance isoften useful, but comes at the cost of some of the type safety be<strong>in</strong>g implemented atexecution time <strong>in</strong>stead of compile time.Let’s compare this with the situation that the weakly typed collections such as Array-List and Hashtable put us <strong>in</strong>. The API of these collections uses object as the type ofkeys and values. When you are writ<strong>in</strong>g a method that takes an ArrayList, for example,there is no way of mak<strong>in</strong>g sure at compile time that the caller will pass <strong>in</strong> a list of str<strong>in</strong>gs.You can document it, and the type safety of the runtime will enforce it if you cast eachelement of the list to str<strong>in</strong>g, but you don’t get compile-time type safety. Likewise, if youreturn an ArrayList, you can <strong>in</strong>dicate <strong>in</strong> the documentation that it will just conta<strong>in</strong>str<strong>in</strong>gs, but callers will have to trust that you’re tell<strong>in</strong>g the truth, and will have to <strong>in</strong>sertcasts when they access the elements of the list.F<strong>in</strong>ally, consider the strongly typed collections such as Str<strong>in</strong>gCollection. Theseprovide an API that is strongly typed, so you can be confident that when you receivea Str<strong>in</strong>gCollection as a parameter or return value it will only conta<strong>in</strong> str<strong>in</strong>gs, andyou don’t need to cast when fetch<strong>in</strong>g elements of the collection. It sounds ideal, butthere are two problems. First, it implements IList, so you can still try to add nonstr<strong>in</strong>gsto it (although you’ll fail at runtime). Second, it only deals with str<strong>in</strong>gs.There are other specialized collections, but all told they don’t cover much ground.There’s the CollectionBase type, which can be used to build your own stronglytyped collections, but that means creat<strong>in</strong>g a new collection type for each elementtype, which is also not ideal.Now that we’ve seen the problem with collections, let’s consider the issue that canoccur when overrid<strong>in</strong>g methods and implement<strong>in</strong>g <strong>in</strong>terfaces. It’s related to the ideaof covariance, which we’ve already seen with arrays.LACK OF COVARIANT RETURN TYPESICloneable is one of the simplest <strong>in</strong>terfaces <strong>in</strong> the framework. It has a s<strong>in</strong>gle method,Clone, which should return a copy of the object that the method is called on. Now,leav<strong>in</strong>g aside the issue of whether this should be a deep or shallow copy, let’s look atthe signature of the Clone method:object Clone()It’s a straightforward signature, certa<strong>in</strong>ly—but as I said, the method should return acopy of the object it’s called on. That means it needs to return an object of the sametype—or at least a compatible one (where that mean<strong>in</strong>g will vary depend<strong>in</strong>g on thetype). It would make sense to be able to override the method with a signature that givesa more accurate description of what the method actually returns. For example, <strong>in</strong> aPerson class it would be nice to be able to implement ICloneable withpublic Person Clone()That wouldn’t break anyth<strong>in</strong>g—code expect<strong>in</strong>g any old object would still work f<strong>in</strong>e. Thisfeature is called return type covariance but unfortunately, <strong>in</strong>terface implementation andmethod overrid<strong>in</strong>g don’t support it. Instead, the normal workaround for <strong>in</strong>terfaces isto use explicit <strong>in</strong>terface implementation to achieve the desired effect:Licensed to Rhona Hadida


Type system characteristics47public Person Clone(){}[Implementation goes here]object ICloneable.Clone(){return Clone();}Calls non<strong>in</strong>terfacemethodImplements<strong>in</strong>terface explicitlyAny code that calls Clone() on an expression that the compiler knows is of type Personwill call the top method; if the type of the expression is just ICloneable, it will call thebottom method. This works but is really ugly.The mirror image of this situation also occurs with parameters, where if you hadan <strong>in</strong>terface or virtual method with a signature of, say void Process(str<strong>in</strong>g x), thenit would seem logical to be able to implement or override the method with a lessdemand<strong>in</strong>g signature, such as void Process(object x). This is called parameter typecontravariance—and is just as unsupported as return type covariance, with the sameworkaround for <strong>in</strong>terfaces and normal overload<strong>in</strong>g for virtual methods. It’s not ashowstopper, but it’s irritat<strong>in</strong>g.Of course, <strong>C#</strong> 1 developers put up with all of these issues for a long time—and Javadevelopers had a similar situation for far longer. While compile-time type safety is agreat feature <strong>in</strong> general, I can’t remember see<strong>in</strong>g many bugs where people actually putthe wrong type of element <strong>in</strong> a collection. I can live with the workaround for the lackof covariance and contravariance. But there’s such a th<strong>in</strong>g as elegance and mak<strong>in</strong>gyour code clearly express what you mean, preferably without need<strong>in</strong>g explanatorycomments. We’ll see later that <strong>C#</strong> 2 isn’t flawless either, but it makes large improvements.As an aside, let’s briefly go <strong>in</strong>to an area that isn’t improved upon <strong>in</strong> any way by<strong>C#</strong> 2 and is only tangentially touched on by <strong>C#</strong> 3—dynamic typ<strong>in</strong>g.2.2.3 When does <strong>C#</strong> 1’s type system get <strong>in</strong> the way?I like static typ<strong>in</strong>g. Most of the time, it does what I want without much fuss, and it stopsme from mak<strong>in</strong>g silly mistakes. It also means the IDE has more <strong>in</strong>formation to help mewith features such as IntelliSense.Just occasionally, it’s a real pa<strong>in</strong>. If I’ve got two types that already have a particularmethod signature, but the types have noth<strong>in</strong>g else <strong>in</strong> common or don’t know aboutmy code at all, why can’t I “pretend” they implement an <strong>in</strong>terface conta<strong>in</strong><strong>in</strong>g that signature?This feature is known as duck typ<strong>in</strong>g (and yet aga<strong>in</strong>, different people use theterm to mean slightly different th<strong>in</strong>gs) <strong>in</strong> that if someth<strong>in</strong>g walks like a duck andquacks like a duck, you might as well treat it like a duck.If I’m work<strong>in</strong>g with an API (usually through COM) that doesn’t have strongly typedmethods, why can’t I ask the compiler to just stop us<strong>in</strong>g static typ<strong>in</strong>g for a while? VisualBasic has this feature, which is controlled with Option Strict; likewise, implicit typ<strong>in</strong>gis turned on and off us<strong>in</strong>g Option Explicit. Both of these features have received badpress, partly due to poor choices for default values <strong>in</strong> the past. It’s also partly due toLicensed to Rhona Hadida


48 CHAPTER 2 Core foundations: build<strong>in</strong>g on <strong>C#</strong> 1the overuse of these features by developers who decided to make their code compileat all costs (and then wondered why it failed at runtime). I can’t say I would use thissort of feature particularly heavily <strong>in</strong> <strong>C#</strong> if it were there—but every so often, it wouldbe welcome.Of course, a dynamic type system doesn’t have to stop there. More adventurouslanguages such as Ruby, Python, and Groovy allow methods to be added to types (oreven <strong>in</strong>dividual objects) at execution time, and allow the type itself to determ<strong>in</strong>ewhat should happen if you try to call a method that doesn’t exist. These powerful(though expensive <strong>in</strong> performance terms) features can lead to programs that appearto work almost by magic, and that certa<strong>in</strong>ly make <strong>C#</strong>’s static typ<strong>in</strong>g look positivelyantiquated. They have their costs <strong>in</strong> terms of understand<strong>in</strong>g what’s go<strong>in</strong>g on <strong>in</strong> yourcode, as well as reduc<strong>in</strong>g compile-time safety, but aga<strong>in</strong>, used <strong>in</strong> the right place theycan be positive.It’s important that languages don’t try too hard to be all th<strong>in</strong>gs to all people. Suchattempts always end <strong>in</strong> tears, and the cost <strong>in</strong> terms of the complexity of the languagecan be high. We’ll see this balance between add<strong>in</strong>g new features and limit<strong>in</strong>g the costof them time and time aga<strong>in</strong> throughout this book—the <strong>C#</strong> team has quite a highbar <strong>in</strong> terms of how useful a feature must be before it’s considered for <strong>in</strong>clusion <strong>in</strong>the language.2.2.4 Summary of type system characteristicsIn this section we’ve learned some of the differences between type systems, and <strong>in</strong> particularwhich characteristics apply to <strong>C#</strong> 1:■■■■■<strong>C#</strong> 1 is statically typed—the compiler knows what members to let you use.<strong>C#</strong> 1 is explicit—you have to tell the compiler what types variables have.<strong>C#</strong> 1 is safe—you can’t treat one type as if it were another without the availabilityof a genu<strong>in</strong>e conversion.Static typ<strong>in</strong>g still doesn’t allow a s<strong>in</strong>gle collection to be a strongly typed “list ofstr<strong>in</strong>gs” or “list of <strong>in</strong>tegers.”Method overrid<strong>in</strong>g and <strong>in</strong>terface implementation don’t allow covariance/contravariance.Our next section covers one of the most fundamental aspects of <strong>C#</strong>’s type systembeyond its high-level characteristics—the differences between structs and classes.2.3 Value types and reference typesIt would be hard to overstate how important the subject of this section is. Everyth<strong>in</strong>g youdo <strong>in</strong> .NET will deal with either a value type or a reference type—and yet it’s curiouslypossible to develop for a long time with only a vague idea of what the difference is. Worseyet, there are plenty of myths around to confuse th<strong>in</strong>gs further. The unfortunate fact isthat it’s quite easy to make a short but <strong>in</strong>correct statement that is close enough to thetruth to be plausible but <strong>in</strong>accurate enough to be mislead<strong>in</strong>g—but it’s relatively trickyto come up with a concise but accurate description.Licensed to Rhona Hadida


Value types and reference types49Once you “get it,” the difference between value types and reference types is simple.It can take a while to reach that stage, and you may well have been able to write a lot ofcorrect code without really understand<strong>in</strong>g it. It’s worth persever<strong>in</strong>g, though: for oneth<strong>in</strong>g, a lot of seem<strong>in</strong>gly complicated situations are much easier to understand whenyou’re aware of what’s really go<strong>in</strong>g on.This section is not a complete breakdown of how types are handled, marshal<strong>in</strong>gbetween application doma<strong>in</strong>s, <strong>in</strong>teroperability with native code, and the like. Instead,it’s a fairly brief look at the absolute basics of the topic (as applied to <strong>C#</strong> 1) that are crucialto understand <strong>in</strong> order to come to grips with <strong>C#</strong> 2 and 3.We’ll start off by see<strong>in</strong>g how the fundamental differences between value types andreference types appear naturally <strong>in</strong> the real world as well as <strong>in</strong> .NET.2.3.1 Values and references <strong>in</strong> the real worldSuppose you’re read<strong>in</strong>g someth<strong>in</strong>g really fantastic, and want a friend to read it too.Let’s further suppose that it’s a document <strong>in</strong> the public doma<strong>in</strong>, just to avoid any accusationsof support<strong>in</strong>g copyright violation. What do you need to give your friend so thathe can read it too? It entirely depends on just what you’re read<strong>in</strong>g.First we’ll deal with the case where you’ve got real paper <strong>in</strong> your hands. To giveyour friend a copy, you’d need to photocopy all the pages and then give it to him. Atthat po<strong>in</strong>t, he has his own complete copy of the document. In this situation, we aredeal<strong>in</strong>g with value type behavior. All the <strong>in</strong>formation is directly <strong>in</strong> your hands—youdon’t need to go anywhere else to get it. Your copy of the <strong>in</strong>formation is also <strong>in</strong>dependentof your friend’s after you’ve made the copy. You could add some notes to yourpages, and his pages wouldn’t be changed at all.Compare that with the situation where you’re actually read<strong>in</strong>g a web page. Thistime, all you have to give your friend is the URL of the web page. This is reference typebehavior, with the URL tak<strong>in</strong>g the place of the reference. In order to actually read thedocument, you have to navigate the reference by putt<strong>in</strong>g the URL <strong>in</strong> your browser andask<strong>in</strong>g it to load the page. On the other hand, if the web page changes for some reason(imag<strong>in</strong>e it’s a wiki page and you’ve added your notes to the page) both you andyour friend will see that change the next time each of you loads the page.The differences we’ve seen <strong>in</strong> the real world form the heart of the dist<strong>in</strong>ctionbetween value types and reference types <strong>in</strong> <strong>C#</strong> and .NET. Most types <strong>in</strong> .NET are referencetypes, and you’re likely to create far more reference than value types. Aside fromthe special cases that follow, classes (declared us<strong>in</strong>g class) are reference types, andstructures (declared us<strong>in</strong>g struct) are value types. The other cases are as follows:■ Array types are reference types, even if the element type is a value type (so<strong>in</strong>t[] is still a reference type, even though <strong>in</strong>t is a value type).■■■Enumerations (declared us<strong>in</strong>g enum) are value types.Delegate types (declared us<strong>in</strong>g delegate) are reference types.Interface types (declared us<strong>in</strong>g <strong>in</strong>terface) are reference types, but they can beimplemented by value types.Licensed to Rhona Hadida


50 CHAPTER 2 Core foundations: build<strong>in</strong>g on <strong>C#</strong> 1Now that we’ve got the basic idea of what reference types and value types are about,we’ll look at a few of the most important details.2.3.2 Value and reference type fundamentalsThe key concept to grasp when it comes to value types and reference types is what thevalue of a particular expression is. To keep th<strong>in</strong>gs concrete, I’ll use variables as themost common examples of expressions—but the same th<strong>in</strong>g applies to properties,method calls, <strong>in</strong>dexers, and other expressions.As we discussed <strong>in</strong> section 2.2.1, most expressions have types associatedwith them. The value of a value type expression is the value, pla<strong>in</strong> andsimple. For <strong>in</strong>stance, the value of the expression “2+3” is just 5. The valueof a reference type expression, however, is a reference. It’s not the objectthat the reference refers to. So, the value of the expression Str<strong>in</strong>g.Empty is not an empty str<strong>in</strong>g—it’s a reference to an empty str<strong>in</strong>g. In everydaydiscussions and even <strong>in</strong> documentation we tend to blur this dist<strong>in</strong>ction.For <strong>in</strong>stance, we might describe Str<strong>in</strong>g.Concat as return<strong>in</strong>g “astr<strong>in</strong>g that is the concatenation of all the parameters.” Us<strong>in</strong>g very precise term<strong>in</strong>ologyhere would be time-consum<strong>in</strong>g and distract<strong>in</strong>g, and there’s no problem so long aseveryone <strong>in</strong>volved understands that it’s only a reference that is returned.To demonstrate this further, consider a Po<strong>in</strong>t type that stores two <strong>in</strong>tegers, x and y.It could have a constructor that takes the two values. Now, this type could be implementedas either a struct or a class. Figure 2.3 shows the result of execut<strong>in</strong>g the follow<strong>in</strong>gl<strong>in</strong>es of code:Key po<strong>in</strong>t!Po<strong>in</strong>t p1 = new Po<strong>in</strong>t(10, 20);Po<strong>in</strong>t p2 = p1;The left side of figure 2.3 <strong>in</strong>dicates thevalues <strong>in</strong>volved when Po<strong>in</strong>t is a class (areference type), and the right side showsthe situation when Po<strong>in</strong>t is a struct (avalue type).In both cases, p1 and p2 have thesame value after the assignment. However,<strong>in</strong> the case where Po<strong>in</strong>t is a referencetype, that value is a reference: bothp1 and p2 refer to the same object. WhenPo<strong>in</strong>t is a value type, the value of p1 is thewhole of the data for a po<strong>in</strong>t—the x andWhen Po<strong>in</strong>t isa value typey values. Assign<strong>in</strong>g the value of p1 to p2 copies all of that data.The values of variables are stored wherever they are declared. Local variable valuesare always stored on the stack, 9 and <strong>in</strong>stance variable values are always stored whereverp1refp2refx10When Po<strong>in</strong>t is areference type20x10x10Figure 2.3 Compar<strong>in</strong>g value type andreference type behaviors, particularlywith regard to assignmentyp1p1y20y209This is only totally true for <strong>C#</strong> 1. We’ll see later that local variables can end up on the heap <strong>in</strong> certa<strong>in</strong> situations<strong>in</strong> <strong>C#</strong> 2 and 3.Licensed to Rhona Hadida


Value types and reference types51the <strong>in</strong>stance itself is stored. Reference type <strong>in</strong>stances (objects) are always stored on theheap, as are static variables.Another difference between the two k<strong>in</strong>ds of type is that value types cannot bederived from. One consequence of this is that the value doesn’t need any extra <strong>in</strong>formationabout what type that value actually is. Compare that with reference types,where each object conta<strong>in</strong>s a block of data at the start of it identify<strong>in</strong>g the actual typeof the object, along with some other <strong>in</strong>formation. You can never change the type of anobject—when you perform a simple cast, the runtime just takes a reference, checkswhether the object it refers to is a valid object of the desired type, and returns the orig<strong>in</strong>alreference if it’s valid or throws an exception otherwise. The reference itselfdoesn’t know the type of the object—so the same reference value can be used for multiplevariables of different types. For <strong>in</strong>stance, consider the follow<strong>in</strong>g code:Stream stream = new MemoryStream();MemoryStream memoryStream = (MemoryStream) stream;The first l<strong>in</strong>e creates a new MemoryStream object, and sets the value of the stream variableto be a reference to that new object. The second l<strong>in</strong>e checks that the value ofstream refers to a MemoryStream (or derived type) object and sets the value ofmemoryStream to the same value.Once you understand these basic po<strong>in</strong>ts, you can apply them when th<strong>in</strong>k<strong>in</strong>g aboutsome of the falsehoods that are often stated about value types and reference types.2.3.3 Dispell<strong>in</strong>g mythsThere are various myths that do the rounds on a regular basis. I’m sure the mis<strong>in</strong>formationis almost always passed on with no malice and with no idea of the <strong>in</strong>accuracies<strong>in</strong>volved. In this section I’ll tackle the most prom<strong>in</strong>ent myths, expla<strong>in</strong><strong>in</strong>g the true situationas I go.MYTH #1: “STRUCTS ARE LIGHTWEIGHT CLASSES”This myth comes <strong>in</strong> a variety of forms. Some people believe that value types can’t orshouldn’t have methods or other significant behavior—they should be used as simpledata transfer types, with just public fields or simple properties. The DateTime type is agood counterexample to this: it makes sense for it to be a value type, <strong>in</strong> terms of be<strong>in</strong>ga fundamental unit like a number or a character, and it also makes sense for it to beable to perform calculations to do with its value. Look<strong>in</strong>g at th<strong>in</strong>gs from the otherdirection, data transfer types should often be reference types anyway—the decisionshould be based on the desired value or reference type semantics, not the simplicity ofthe type.Other people believe that value types are “lighter” than reference types <strong>in</strong> terms ofperformance. The truth is that <strong>in</strong> some cases value types are more performant—theydon’t require garbage collection, don’t have the type identification overhead, anddon’t require dereferenc<strong>in</strong>g, for example. However, <strong>in</strong> other ways reference types aremore performant—parameter pass<strong>in</strong>g, assignment, return<strong>in</strong>g values and similar operationsonly require 4 or 8 bytes to be copied (depend<strong>in</strong>g on whether you’re runn<strong>in</strong>gLicensed to Rhona Hadida


52 CHAPTER 2 Core foundations: build<strong>in</strong>g on <strong>C#</strong> 1the 32-bit or 64-bit CLR) rather than copy<strong>in</strong>g all the data. Imag<strong>in</strong>e if ArrayList weresomehow a “pure” value type, and pass<strong>in</strong>g an ArrayList expression to a method<strong>in</strong>volved copy<strong>in</strong>g all its data!MYTH #2: “REFERENCE TYPES LIVE ON THE HEAP, VALUE TYPES LIVE ON THE STACK”This one is often caused just by laz<strong>in</strong>ess on the part of the person repeat<strong>in</strong>g it. Thefirst part is correct—an <strong>in</strong>stance of a reference type is always created on the heap. It’sthe second part that is problematic. As we’ve already noted, a variable’s value liveswherever it’s declared—so if you have a class with an <strong>in</strong>stance variable of type <strong>in</strong>t, thatvariable’s value for any given object will always be where the rest of the data for theobject is—on the heap. Only local variables (variables declared with<strong>in</strong> methods) andmethod parameters live on the stack.NOTE Are these concepts relevant now? It’s arguable that if you’re writ<strong>in</strong>g managedcode, you should let the runtime worry about how memory is bestused. Indeed, the language specification makes no guarantees aboutwhat lives where; a future runtime may be able to create some objects onthe stack if it knows it could get away with it, or the <strong>C#</strong> compiler couldgenerate code that hardly uses the stack at all.The next myth is usually just a term<strong>in</strong>ology issue.MYTH #3: “OBJECTS ARE PASSED BY REFERENCE IN <strong>C#</strong> BY DEFAULT”This is probably the most widely propagated myth. Aga<strong>in</strong>, the people who make thisclaim often (though not always) know what the actual <strong>C#</strong> behavior is, but they don’tknow what “pass by reference” really means. Unfortunately, this leads to people whodo know what it means gett<strong>in</strong>g confused. The formal def<strong>in</strong>ition of “pass by reference”is relatively complicated, <strong>in</strong>volv<strong>in</strong>g l-values and similar computer science term<strong>in</strong>ology,but the important th<strong>in</strong>g is that if you pass a variable by reference, the method you’recall<strong>in</strong>g can change the value of the caller’s variable by chang<strong>in</strong>g its parameter value. Nowremember that the value of a reference type variable is the reference, not the objectitself. You can change the contents of the object that a parameter refers to without theparameter itself be<strong>in</strong>g passed by reference. For <strong>in</strong>stance, the follow<strong>in</strong>g methodchanges the contents of the Str<strong>in</strong>gBuilder object <strong>in</strong> question, but the caller’s expressionwill still refer to the same object as before:void AppendHello (Str<strong>in</strong>gBuilder builder){builder.Append("hello");}When this method is called, the parameter value (a reference to a Str<strong>in</strong>gBuilder) ispassed by value. If I were to change the value of the builder variable with<strong>in</strong> themethod—for example with the statement builder = null;—that change wouldn’t beseen by the caller, contrary to the myth.It’s <strong>in</strong>terest<strong>in</strong>g to note that not only is the “by reference” bit of the myth <strong>in</strong>accurate,but so is the “objects are passed” bit. Objects themselves are never passed, eitherby reference or by value. When a reference type is <strong>in</strong>volved, either the variable isLicensed to Rhona Hadida


Value types and reference types53passed by reference or the value of the argument (the reference) is passed by value.Aside from anyth<strong>in</strong>g else, this answers the question of what happens when null isused as a by-value argument—if objects were be<strong>in</strong>g passed around, that would causeissues, as there wouldn’t be an object to pass! Instead, the null reference is passed byvalue <strong>in</strong> just the same way as any other reference would be.These myths aren’t the only ones around. Box<strong>in</strong>g and unbox<strong>in</strong>g come <strong>in</strong> for theirfair share of misunderstand<strong>in</strong>g, which I’ll try to clear up next.2.3.4 Box<strong>in</strong>g and unbox<strong>in</strong>gSometimes, you just don’t want a value type value. You want a reference. There are anynumber of reasons why this can happen, and fortunately <strong>C#</strong> and .NET provide a mechanism,called box<strong>in</strong>g, that lets you create an object from a value type value and use areference to that new object. Before we leap straight <strong>in</strong>to an example, let’s start off byreview<strong>in</strong>g two important facts:■■The value of a reference type variable is always a reference.The value of a value type variable is always a value of that type.Given those two facts, the follow<strong>in</strong>g three l<strong>in</strong>es of code don’t seem to make muchsense at first glance:<strong>in</strong>t i = 5;object o = i;<strong>in</strong>t j = (<strong>in</strong>t) o;We have two variables: i is a value type variable, and o is a reference type variable. Howdoes it make sense to assign the value of i to o? The value of o has to be a reference,and the number 5 isn’t a reference—it’s an <strong>in</strong>teger value. What’s actually happen<strong>in</strong>g isbox<strong>in</strong>g: the runtime creates an object (on the heap—it’s a normal object) that conta<strong>in</strong>sthe value (5). The value of o is then a reference to that new object. The third l<strong>in</strong>eperforms the reverse operation—unbox<strong>in</strong>g. We have to tell the compiler which type tounbox the object as, and if we use the wrong type (if it’s a boxed u<strong>in</strong>t or long, forexample, or not a boxed value at all) an InvalidCastException is thrown. 10That’s it, really—box<strong>in</strong>g and unbox<strong>in</strong>g <strong>in</strong> a nutshell. The only rema<strong>in</strong><strong>in</strong>g problemis know<strong>in</strong>g when box<strong>in</strong>g and unbox<strong>in</strong>g occur. Unbox<strong>in</strong>g is usually obvious, because thecast is present <strong>in</strong> the code. Box<strong>in</strong>g can be more subtle. We’ve seen the simple version,but it can also occur if you call the ToStr<strong>in</strong>g, Equals, or GetHashCode methods on thevalue of a type that doesn’t override them, or if you use the value as an <strong>in</strong>terface expression—assign<strong>in</strong>git to a variable whose type is an <strong>in</strong>terface type or pass<strong>in</strong>g it as a parameterwith an <strong>in</strong>terface type. For example, the statement IComparable x = 5; would boxthe number 5.It’s worth be<strong>in</strong>g aware of box<strong>in</strong>g and unbox<strong>in</strong>g because of the potential performancepenalty <strong>in</strong>volved. A s<strong>in</strong>gle box or unbox operation is very cheap, but if you10 There are corner cases where the type doesn’t have to be exactly right, mostly to do with enums. These are sofiddly that even the <strong>C#</strong> language specification hasn’t got it quite right yet!Licensed to Rhona Hadida


<strong>C#</strong> 2 and 3: new features on a solid base55fixes this with anonymous methods, and <strong>in</strong>troduces a simpler syntax for the cases whereyou still want to use a normal method to provide the action for the delegate. You can alsocreate delegate <strong>in</strong>stances us<strong>in</strong>g methods with compatible signatures—the method signatureno longer has to be exactly the same as the delegate’s declaration.List<strong>in</strong>g 2.4 demonstrates all these improvements.List<strong>in</strong>g 2.4 Improvements <strong>in</strong> delegate <strong>in</strong>stantiation brought <strong>in</strong> by <strong>C#</strong> 2static void HandleDemoEvent(object sender, EventArgs e){Console.WriteL<strong>in</strong>e ("Handled by HandleDemoEvent");}...EventHandler handler;handler = new EventHandler(HandleDemoEvent);BSpecifies delegatetype and methodhandler(null, EventArgs.Empty);handler = HandleDemoEvent;CImplicitly convertsto delegate <strong>in</strong>stancehandler(null, EventArgs.Empty);handler = delegate(object sender, EventArgs e){ Console.WriteL<strong>in</strong>e ("Handled anonymously"); };handler(null, EventArgs.Empty);handler = delegate{ Console.WriteL<strong>in</strong>e ("Handled anonymously aga<strong>in</strong>"); };handler(null, EventArgs.Empty);MouseEventHandler mouseHandler = HandleDemoEvent;mouseHandler(null, new MouseEventArgs(MouseButtons.None,0, 0, 0, 0));Specifies actionwith anonymousmethodThe first part of the ma<strong>in</strong> code B is just <strong>C#</strong> 1 code, kept for comparison. The rema<strong>in</strong><strong>in</strong>gdelegates all use new features of <strong>C#</strong> 2. The conversion <strong>in</strong>volved C makes eventsubscription code read a lot more pleasantly—l<strong>in</strong>es such as saveButton.Click +=SaveDocument; are very straightforward, with no extra fluff to distract the eye. Theanonymous method syntax D is a little cumbersome, but does allow the action tobe very clear at the po<strong>in</strong>t of creation, rather than be<strong>in</strong>g another method to look atbefore you understand what’s go<strong>in</strong>g on. The shortcut used E is another example ofanonymous method syntax, but this form can only be used when you don’t need theparameters. Anonymous methods have other powerful features as well, but we’ll seethose later.The f<strong>in</strong>al delegate <strong>in</strong>stance created F is an <strong>in</strong>stance of MouseEventHandler ratherthan just EventHandler—but the HandleDemoEvent method can still be used due tocontravariance, which specifies parameter compatibility. Covariance specifies returntype compatibility. We’ll be look<strong>in</strong>g at both of these <strong>in</strong> more detail <strong>in</strong> chapter 5. Eventhandlers are probably the biggest beneficiaries of this, as suddenly the Microsoftguidel<strong>in</strong>e to make all delegate types used <strong>in</strong> events follow the same convention makesDEUsesanonymousmethodshortcutFUsesdelegatecontravarianceLicensed to Rhona Hadida


56 CHAPTER 2 Core foundations: build<strong>in</strong>g on <strong>C#</strong> 1a lot more sense. In <strong>C#</strong> 1, it didn’t matter whether or not two different event handlerslooked “quite similar”—you had to have a method with an exactly match<strong>in</strong>g signature<strong>in</strong> order to create a delegate <strong>in</strong>stance. In <strong>C#</strong> 2, you may well f<strong>in</strong>d yourself able to usethe same method to handle many different k<strong>in</strong>ds of events, particularly if the purposeof the method is fairly event <strong>in</strong>dependent, such as logg<strong>in</strong>g.<strong>C#</strong> 3 provides special syntax for <strong>in</strong>stantiat<strong>in</strong>g delegate types, us<strong>in</strong>g lambda expressions.To demonstrate these, we’ll use a new delegate type. As part of the CLR ga<strong>in</strong><strong>in</strong>ggenerics <strong>in</strong> .NET 2.0, generic delegate types became available and were used <strong>in</strong> a numberof API calls <strong>in</strong> generic collections. However, .NET 3.5 takes th<strong>in</strong>gs a step further,<strong>in</strong>troduc<strong>in</strong>g a group of generic delegate types called Func that all take a number ofparameters of specified types and return a value of another specified type. List<strong>in</strong>g 2.5gives an example of the use of a Func delegate type as well as lambda expressions.List<strong>in</strong>g 2.5Lambda expressions, which are like improved anonymous methodsFunc func = (x,y) => (x*y).ToStr<strong>in</strong>g();Console.WriteL<strong>in</strong>e (func(5, 20));Func is a delegate type that takes two <strong>in</strong>tegers and returns a str<strong>in</strong>g.The lambda expression <strong>in</strong> list<strong>in</strong>g 2.5 specifies that the delegate <strong>in</strong>stance (held <strong>in</strong>func) should multiply the two <strong>in</strong>tegers together and call ToStr<strong>in</strong>g(). The syntax ismuch more straightforward than that of anonymous methods, and there are otherbenefits <strong>in</strong> terms of the amount of type <strong>in</strong>ference the compiler is prepared to performfor you. Lambda expressions are absolutely crucial to LINQ, and you should get readyto make them a core part of your language toolkit. They’re not restricted to work<strong>in</strong>gwith LINQ, however—almost any use of anonymous methods from <strong>C#</strong> 2 can uselambda expressions <strong>in</strong> <strong>C#</strong> 3.To summarize, the new features related to delegates are as follows:■ Generics (generic delegate types)—<strong>C#</strong> 2■ Delegate <strong>in</strong>stance creation expressions—<strong>C#</strong> 2■ Anonymous methods—<strong>C#</strong> 2■ Delegate covariance/contravariance—<strong>C#</strong> 2■ Lambda expressions—<strong>C#</strong> 3The use of generics extends well beyond delegates, of course—they’re one of the pr<strong>in</strong>cipleenhancements to the type system, which we’ll look at next.2.4.2 Features related to the type systemThe primary new feature <strong>in</strong> <strong>C#</strong> 2 regard<strong>in</strong>g the type system is that of generics. Itlargely addresses the issues I raised <strong>in</strong> section 2.2.2 about strongly typed collections,although generic types are useful <strong>in</strong> a number of other situations too. As a feature, it’selegant, it solves a real problem, and despite a few wr<strong>in</strong>kles it generally works verywell. We’ve seen examples of this <strong>in</strong> quite a few places already, and it’s described fully<strong>in</strong> the next chapter, so I won’t go <strong>in</strong>to any more details here. It’ll be a brief reprieve,Licensed to Rhona Hadida


<strong>C#</strong> 2 and 3: new features on a solid base57though—generics form probably the most important feature <strong>in</strong> <strong>C#</strong> 2 with respect tothe type system, and you’ll see generic types throughout the rest of the book.<strong>C#</strong> 2 doesn’t tackle the general issue of covariant return types and contravariantparameters, but it does cover it for creat<strong>in</strong>g delegate <strong>in</strong>stances <strong>in</strong> certa<strong>in</strong> situations, aswe saw <strong>in</strong> section 2.4.1. <strong>C#</strong> 3 <strong>in</strong>troduces a wealth of new concepts <strong>in</strong> the type system,most notably anonymous types, implicitly typed local variables, and extension methods.Anonymous types themselves are mostly present for the sake of LINQ, where it’suseful to be able to effectively create a data transfer type with a bunch of read-onlyproperties without hav<strong>in</strong>g to actually write the code for them. There’s noth<strong>in</strong>g to stopthem from be<strong>in</strong>g used outside LINQ, however, which makes life easier for demonstrations.List<strong>in</strong>g 2.6 shows both features <strong>in</strong> action.List<strong>in</strong>g 2.6Demonstration of anonymous types and implicit typ<strong>in</strong>gvar jon = new { Name="Jon", Age=31 };var tom = new { Name="Tom", Age=4 };Console.WriteL<strong>in</strong>e ("{0} is {1}", jon.Name, jon.Age);Console.WriteL<strong>in</strong>e ("{0} is {1}", tom.Name, tom.Age);The first two l<strong>in</strong>es each show implicit typ<strong>in</strong>g (the use of var) and anonymous object<strong>in</strong>itializers (the new {…} bit), which create <strong>in</strong>stances of anonymous types.There are two th<strong>in</strong>gs worth not<strong>in</strong>g at this stage, long before we get <strong>in</strong>to thedetails—po<strong>in</strong>ts that have caused people to worry needlessly before. The first is that<strong>C#</strong> is still statically typed. The <strong>C#</strong> compiler has declared jon and tom to be of aparticular type, just as normal, and when we use the properties of the objects theyare normal properties—there’s no dynamic lookup go<strong>in</strong>g on. It’s just that we (assource code authors) couldn’t tell the compiler what type to use <strong>in</strong> the variabledeclaration because the compiler will be generat<strong>in</strong>g the type itself. The propertiesare also statically typed—here the Age property is of type <strong>in</strong>t, and the Name propertyof type str<strong>in</strong>g.The second po<strong>in</strong>t is that we haven’t created two different anonymous types here.The variables jon and tom both have the same type because the compiler uses theproperty names, types, and order to work out that it can generate just one type anduse it for both statements. This is done on a per-assembly basis, and makes life a lotsimpler <strong>in</strong> terms of be<strong>in</strong>g able to assign the value of one variable to another (forexample, jon=tom; would be permitted <strong>in</strong> the previous code) and similar operations.Extension methods are also there for the sake of LINQ but can be useful outside it.Th<strong>in</strong>k of all the times you’ve wished that a framework type had a certa<strong>in</strong> method, andyou’ve had to write a static utility method to implement it. For <strong>in</strong>stance, to create anew str<strong>in</strong>g by revers<strong>in</strong>g an exist<strong>in</strong>g one you might write a static Str<strong>in</strong>gUtil.Reversemethod. Well, the extension method feature effectively lets you call that static methodas if it existed on the str<strong>in</strong>g type itself, so you could writestr<strong>in</strong>g x = "dlrow olleH".Reverse();Licensed to Rhona Hadida


58 CHAPTER 2 Core foundations: build<strong>in</strong>g on <strong>C#</strong> 1Extension methods also let you appear to add methods with implementations to<strong>in</strong>terfaces—and <strong>in</strong>deed that’s what LINQ relies on heavily, allow<strong>in</strong>g calls to all k<strong>in</strong>ds ofmethods on IEnumerable that have never previously existed.Here’s the quick-view list of these features, along with which version of <strong>C#</strong> they’re<strong>in</strong>troduced <strong>in</strong>:■ Generics—<strong>C#</strong> 2■ Limited delegate covariance/contravariance—<strong>C#</strong> 2■ Anonymous types—<strong>C#</strong> 3■ Implicit typ<strong>in</strong>g—<strong>C#</strong> 3■ Extension methods—<strong>C#</strong> 3After that fairly diverse set of features on the type system <strong>in</strong> general, let’s look at thefeatures added to one very specific part of typ<strong>in</strong>g <strong>in</strong> .NET—value types.2.4.3 Features related to value typesThere are only two features to talk about here, and <strong>C#</strong> 2 <strong>in</strong>troduces them both. Thefirst goes back to generics yet aga<strong>in</strong>, and <strong>in</strong> particular collections. One common compla<strong>in</strong>tabout us<strong>in</strong>g value types <strong>in</strong> collections with .NET 1.1 was that due to all of the“general purpose” APIs be<strong>in</strong>g specified <strong>in</strong> terms of the object type, every operationthat added a struct value to a collection would <strong>in</strong>volve box<strong>in</strong>g it, and when retriev<strong>in</strong>g ityou’d have to unbox it. While box<strong>in</strong>g is pretty cheap on a “per call” basis, it can causea significant performance hit when it’s used every time with frequently accessed collections.It also takes more memory than it needs to, due to the per-object overhead.Generics fix both the speed and memory deficiencies by us<strong>in</strong>g the real type <strong>in</strong>volvedrather than just a general-purpose object. As an example, it would have been madnessto read a file and store each byte as an element <strong>in</strong> an ArrayList <strong>in</strong> .NET 1.1—but <strong>in</strong>.NET 2.0 it wouldn’t be particularly crazy to do the same with a List.The second feature addresses another common cause of compla<strong>in</strong>t, particularlywhen talk<strong>in</strong>g to databases—the fact that you can’t assign null to a value type variable.There’s no such concept as an <strong>in</strong>t value of null, for <strong>in</strong>stance, even though a database<strong>in</strong>teger field may well be nullable. At that po<strong>in</strong>t it can be hard to model the databasetable with<strong>in</strong> a statically typed class without a bit of ugl<strong>in</strong>ess of some form or another.Nullable types are part of .NET 2.0, and <strong>C#</strong> 2 <strong>in</strong>cludes extra syntax to make them easyto use. List<strong>in</strong>g 2.7 gives a brief example of this.List<strong>in</strong>g 2.7Demonstration of a variety of nullable type features<strong>in</strong>t? x = null; Declares and sets nullable variablex = 5;if (x != null) Tests for presence of “real” value{<strong>in</strong>t y = x.Value; Obta<strong>in</strong>s “real” valueConsole.WriteL<strong>in</strong>e (y);}<strong>in</strong>t z = x ?? 10; Uses null-coalesc<strong>in</strong>g operatorLicensed to Rhona Hadida


Summary59List<strong>in</strong>g 2.7 shows a number of the features of nullable types and the shorthand that <strong>C#</strong>provides for work<strong>in</strong>g with them. We’ll get around to the details of each feature <strong>in</strong>chapter 4, but the important th<strong>in</strong>g to th<strong>in</strong>k about is how much easier and cleaner allof this is than any of the alternative workarounds that have been used <strong>in</strong> the past.The list of enhancements is smaller this time, but they’re very important features<strong>in</strong> terms of both performance and elegance of expression:2.5 Summary■ Generics—<strong>C#</strong> 2■ Nullable types—<strong>C#</strong> 2This chapter has provided some revision of a few topics from <strong>C#</strong> 1. The aim wasn’t tocover any of the topics <strong>in</strong> its entirety, but merely to get everyone on the same page sothat I can describe the <strong>C#</strong> 2 and 3 features without worry<strong>in</strong>g about the ground thatI’m build<strong>in</strong>g on.All of the topics we’ve covered are core to <strong>C#</strong> and .NET, but with<strong>in</strong> community discussions,blogs, and even occasionally books often they’re either skimmed over tooquickly or not enough care is taken with the details. This has often left developers witha mistaken understand<strong>in</strong>g of how th<strong>in</strong>gs work, or with an <strong>in</strong>adequate vocabulary toexpress themselves. Indeed, <strong>in</strong> the case of characteriz<strong>in</strong>g type systems, computer sciencehas provided such a variety of mean<strong>in</strong>gs for some terms that they’ve become almost useless.Although this chapter hasn’t gone <strong>in</strong>to much depth about any one po<strong>in</strong>t, it willhopefully have cleared up any confusion that would have made the rest of the bookharder to understand.The three core topics we’ve briefly covered <strong>in</strong> this chapter are all significantlyenhanced <strong>in</strong> <strong>C#</strong> 2 and 3, and some features touch on more than one topic. In particular,generics has an impact on almost every area we’ve covered <strong>in</strong> this chapter—it’s probablythe most widely used and important feature <strong>in</strong> <strong>C#</strong> 2. Now that we’ve f<strong>in</strong>ished all ourpreparations, we can start look<strong>in</strong>g at it properly <strong>in</strong> the next chapter.Licensed to Rhona Hadida


Licensed to Rhona Hadida


Part 2<strong>C#</strong>2:solv<strong>in</strong>g the issuesof <strong>C#</strong>1In part 1 we took a quick look at a few of the features of <strong>C#</strong> 2. Now it’s time todo the job properly. We’ll see how <strong>C#</strong> 2 fixes various problems that developersran <strong>in</strong>to when us<strong>in</strong>g <strong>C#</strong> 1, and how <strong>C#</strong> 2 makes exist<strong>in</strong>g features more useful bystreaml<strong>in</strong><strong>in</strong>g them. This is no mean feat, and life with <strong>C#</strong> 2 is much more pleasantthan with <strong>C#</strong> 1.The new features <strong>in</strong> <strong>C#</strong> 2 have a certa<strong>in</strong> amount of <strong>in</strong>dependence. That’s notto say they’re not related at all, of course; many of the features are based on—orat least <strong>in</strong>teract with—the massive contribution that generics make to the language.However, the different topics we’ll look at <strong>in</strong> the next five chapters don’tcomb<strong>in</strong>e <strong>in</strong>to one cohesive whole.The first four chapters of this part cover the biggest new features. We’ll lookat the follow<strong>in</strong>g:■ Generics-—The most important new feature <strong>in</strong> <strong>C#</strong> 2 (and <strong>in</strong>deed <strong>in</strong> theCLR for .NET 2.0), generics allow type and method parameterization.■ Nullable types —Value types such as <strong>in</strong>t and DateTime don’t have any conceptof “no value present”—nullable types allow you to represent theabsence of a mean<strong>in</strong>gful value.■ Delegates —Although delegates haven’t changed at the CLR level, <strong>C#</strong> 2makes them a lot easier to work with. As well as a few simple shortcuts, the<strong>in</strong>troduction of anonymous methods beg<strong>in</strong>s the movement toward a morefunctional style of programm<strong>in</strong>g—this trend cont<strong>in</strong>ues <strong>in</strong> <strong>C#</strong> 3.■ Iterators —While us<strong>in</strong>g iterators has always been simple <strong>in</strong> <strong>C#</strong> with theforeach statement, it’s a pa<strong>in</strong> to implement them <strong>in</strong> <strong>C#</strong> 1. The <strong>C#</strong> 2 compileris happy to build a state mach<strong>in</strong>e for you beh<strong>in</strong>d the scenes, hid<strong>in</strong>g alot of the complexity <strong>in</strong>volved.Licensed to Rhona Hadida


Hav<strong>in</strong>g covered the major, complex new features of <strong>C#</strong> 2 with a chapter dedicated toeach one, chapter 7 rounds off our coverage by cover<strong>in</strong>g several simpler features. Simplerdoesn’t necessarily mean less useful, of course: partial types <strong>in</strong> particular are veryimportant for better designer support <strong>in</strong> Visual Studio 2005.As you can see, there’s a lot to cover. Take a deep breath, and let’s dive <strong>in</strong>to theworld of generics…Licensed to Rhona Hadida


Parameterizedtyp<strong>in</strong>g with genericsThis chapter covers■Generic types and methods■ Generic collections <strong>in</strong> .NET 2.0■■Limitations of genericsComparisons with other languagesTrue 1 story: the other day my wife and I did our weekly grocery shopp<strong>in</strong>g. Justbefore we left, she asked me if I had the list. I confirmed that <strong>in</strong>deed I did have thelist, and off we went. It was only when we got to the grocery store that our mistakemade itself obvious. My wife had been ask<strong>in</strong>g about the shopp<strong>in</strong>g list whereas I’dactually brought the list of neat features <strong>in</strong> <strong>C#</strong> 2. When we asked an assistantwhether we could buy any anonymous methods, we received a very strange look.If only we could have expressed ourselves more clearly! If only she’d had someway of say<strong>in</strong>g that she wanted me to br<strong>in</strong>g the list of items we wanted to buy! If onlywe’d had generics…1By which I mean “convenient for the purposes of <strong>in</strong>troduc<strong>in</strong>g the chapter”—not, you know, accurate as such.63Licensed to Rhona Hadida


64 CHAPTER 3 Parameterized typ<strong>in</strong>g with genericsFor most people, generics will be the most important new feature of <strong>C#</strong> 2. Theyenhance performance, make your code more expressive, and move a lot of safetyfrom execution time to compile time. Essentially they allow you to parameterize typesand methods—just as normal method calls often have parameters to tell them whatvalues to use, generic types and methods have type parameters to tell them what typesto use. It all sounds very confus<strong>in</strong>g to start with—and if you’re completely new togenerics you can expect a certa<strong>in</strong> amount of head scratch<strong>in</strong>g—but once you’ve gotthe basic idea, you’ll come to love them.In this chapter we’ll be look<strong>in</strong>g at how to use generic types and methods that othershave provided (whether <strong>in</strong> the framework or as third-party libraries), and how to writeyour own. We’ll see the most important generic types with<strong>in</strong> the framework, and take alook just under the surface to understand some of the performance implications ofgenerics. To conclude the chapter, I’ll present some of the most frequently encounteredlimitations of generics, along with possible workarounds, and compare generics<strong>in</strong> <strong>C#</strong> with similar features <strong>in</strong> other languages.First, though, we need to understand the problems that caused generics to bedevised <strong>in</strong> the first place.3.1 Why generics are necessaryHave you ever counted how many casts you have <strong>in</strong> your <strong>C#</strong> 1 code? If you use any ofthe built-<strong>in</strong> collections, or if you’ve written your own types that are designed to workwith many different types of data, you’ve probably got plenty of casts lurk<strong>in</strong>g <strong>in</strong> yoursource, quietly tell<strong>in</strong>g the compiler not to worry, that everyth<strong>in</strong>g’s f<strong>in</strong>e, just treat theexpression over there as if it had this particular type. Us<strong>in</strong>g almost any API that hasobject as either a parameter type or a return type will probably <strong>in</strong>volve casts at somepo<strong>in</strong>t. Hav<strong>in</strong>g a s<strong>in</strong>gle-class hierarchy with object as the root makes th<strong>in</strong>gs morestraightforward, but the object type <strong>in</strong> itself is extremely dull, and <strong>in</strong> order to do anyth<strong>in</strong>ggenu<strong>in</strong>ely useful with an object you almost always need to cast it.Casts are bad, m’kay? Not bad <strong>in</strong> an “almost never do this” k<strong>in</strong>d of way (like mutablestructs and nonprivate fields) but bad <strong>in</strong> a “necessary evil” k<strong>in</strong>d of way. They’re an<strong>in</strong>dication that you ought to give the compiler more <strong>in</strong>formation somehow, and thatthe way you’re choos<strong>in</strong>g is to get the compiler to trust you at compile time and generatea check to run at execution time, to keep you honest.Now, if you need to tell the compiler the <strong>in</strong>formation somewhere, chances are thatanyone read<strong>in</strong>g your code is also go<strong>in</strong>g to need that same <strong>in</strong>formation. They can see itwhere you’re cast<strong>in</strong>g, of course, but that’s not terribly useful. The ideal place to keepsuch <strong>in</strong>formation is usually at the po<strong>in</strong>t of declar<strong>in</strong>g a variable or method. This is evenmore important if you’re provid<strong>in</strong>g a type or method which other people will call withoutaccess to your code. Generics allow library providers to prevent their users from compil<strong>in</strong>gcode that calls the library with bad arguments. Previously we’ve had to rely on manuallywritten documentation—which is often <strong>in</strong>complete or <strong>in</strong>accurate, and is rarely read anyway.Armed with the extra <strong>in</strong>formation, everyone can work more productively: the compileris able to do more check<strong>in</strong>g; the IDE is able to present IntelliSense options basedLicensed to Rhona Hadida


Simple generics for everyday use65on the extra <strong>in</strong>formation (for <strong>in</strong>stance, offer<strong>in</strong>g the members of str<strong>in</strong>g as next stepswhen you access an element of a list of str<strong>in</strong>gs); callers of methods can be more certa<strong>in</strong>of correctness <strong>in</strong> terms of arguments passed <strong>in</strong> and values returned; and anyone ma<strong>in</strong>ta<strong>in</strong><strong>in</strong>gyour code can better understand what was runn<strong>in</strong>g through your head when youorig<strong>in</strong>ally wrote it <strong>in</strong> the first place.NOTE Will generics reduce your bug count? Every description of generics I’ve read(<strong>in</strong>clud<strong>in</strong>g my own) emphasizes the importance of compile-time typecheck<strong>in</strong>g over execution-time type check<strong>in</strong>g. I’ll let you <strong>in</strong> on a secret: Ican’t remember ever fix<strong>in</strong>g a bug <strong>in</strong> released code that was directly dueto the lack of type check<strong>in</strong>g. In other words, the casts we’ve been putt<strong>in</strong>g<strong>in</strong> our <strong>C#</strong> 1 code have always worked <strong>in</strong> my experience. Those casts havebeen like warn<strong>in</strong>g signs, forc<strong>in</strong>g us to th<strong>in</strong>k about the type safety explicitlyrather than it flow<strong>in</strong>g naturally <strong>in</strong> the code we write. Although genericsmay not radically reduce the number of type safety bugs you encounter,the greater readability afforded can reduce the number of bugs acrossthe board. Code that is simple to understand is simple to get right.All of this would be enough to make generics worthwhile—but there are performanceimprovements too. First, as the compiler is able to perform more check<strong>in</strong>g, that leavesless need<strong>in</strong>g to be checked at execution time. Second, the JIT is able to treat value types<strong>in</strong> a particularly clever way that manages to elim<strong>in</strong>ate box<strong>in</strong>g and unbox<strong>in</strong>g <strong>in</strong> many situations.In some cases, this can make a huge difference to performance <strong>in</strong> terms ofboth speed and memory consumption.Many of the benefits of generics may strike you as be<strong>in</strong>g remarkably similar to thebenefits of static languages over dynamic ones: better compile-time check<strong>in</strong>g, more<strong>in</strong>formation expressed directly <strong>in</strong> the code, more IDE support, better performance.The reason for this is fairly simple: when you’re us<strong>in</strong>g a general API (for example,ArrayList) that can’t differentiate between the different types, you effectively are <strong>in</strong> adynamic situation <strong>in</strong> terms of access to that API. The reverse isn’t generally true, by theway—there are plenty of benefits available from dynamic languages <strong>in</strong> many situations,but they rarely apply to the choice between generic/nongeneric APIs. When youcan reasonably use generics, the decision to do so is usually a no-bra<strong>in</strong>er.So, those are the goodies await<strong>in</strong>g us <strong>in</strong> <strong>C#</strong> 2—now it’s time to actually start us<strong>in</strong>ggenerics.3.2 Simple generics for everyday useThe topic of generics has a lot of dark corners if you want to know everyth<strong>in</strong>g about it.The <strong>C#</strong> 2 language specification goes <strong>in</strong>to a great deal of detail <strong>in</strong> order to make surethat the behavior is specified <strong>in</strong> pretty much every conceivable case. However, wedon’t need to understand most of those corner cases <strong>in</strong> order to be productive. (Thesame is true <strong>in</strong> other areas, <strong>in</strong> fact. For example, you don’t need to know all the exactrules about def<strong>in</strong>itely assigned variables—you just fix the code appropriately when thecompiler compla<strong>in</strong>s.)Licensed to Rhona Hadida


66 CHAPTER 3 Parameterized typ<strong>in</strong>g with genericsThis section will cover most of what you’ll need <strong>in</strong> your day-to-day use of generics,both consum<strong>in</strong>g generic APIs that other people have created and creat<strong>in</strong>g your own. Ifyou get stuck while read<strong>in</strong>g this chapter but want to keep mak<strong>in</strong>g progress, I suggest youconcentrate on what you need to know <strong>in</strong> order to use generic types and methods with<strong>in</strong>the framework and other libraries; writ<strong>in</strong>g your own generic types and methods cropsup a lot less often than us<strong>in</strong>g the framework ones.We’ll start by look<strong>in</strong>g at one of the collection classes from .NET 2.0—Dictionary.3.2.1 Learn<strong>in</strong>g by example: a generic dictionaryUs<strong>in</strong>g generic types can be very straightforward if you don’t happen to hit some of thelimitations and start wonder<strong>in</strong>g what’s wrong. You don’t need to know any of the term<strong>in</strong>ologyto have a pretty good guess as to what the code will do when read<strong>in</strong>g it, andwith a bit of trial and error you can experiment your way to writ<strong>in</strong>g your own work<strong>in</strong>gcode too. (One of the benefits of generics is that more check<strong>in</strong>g is done at compiletime, so you’re more likely to have work<strong>in</strong>g code by the time it all compiles—thismakes the experimentation simpler.) Of course, the aim of this chapter is to give youthe knowledge so that you won’t be us<strong>in</strong>g guesswork—you’ll know what’s go<strong>in</strong>g on atevery stage.For now, though, let’s look at some code that is straightforward even if the syntax isunfamiliar. List<strong>in</strong>g 3.1 uses a Dictionary (roughly the generic equivalentof the Hashtable class you’ve almost certa<strong>in</strong>ly used with <strong>C#</strong> 1) to count the frequenciesof words <strong>in</strong> a given piece of text.List<strong>in</strong>g 3.1Us<strong>in</strong>g a Dictionary to count words <strong>in</strong> textstatic Dictionary CountWords(str<strong>in</strong>g text){Dictionary frequencies;frequencies = new Dictionary();str<strong>in</strong>g[] words = Regex.Split(text, @"\W+");foreach (str<strong>in</strong>g word <strong>in</strong> words){if (frequencies.Conta<strong>in</strong>sKey(word)){frequencies[word]++;}else{frequencies[word] = 1;}}return frequencies;}...str<strong>in</strong>g text = @"Do you like green eggs and ham?I do not like them, Sam-I-am.DBCAdds to orupdates mapCreates new map fromword to frequencySplits text<strong>in</strong>to wordsLicensed to Rhona Hadida


Simple generics for everyday use67I do not like green eggs and ham.";Dictionary frequencies = CountWords(text);foreach (KeyValuePair entry <strong>in</strong> frequencies){str<strong>in</strong>g word = entry.Key;<strong>in</strong>t frequency = entry.Value;Console.WriteL<strong>in</strong>e ("{0}: {1}", word, frequency);}Pr<strong>in</strong>ts eachkey/value pairfrom mapThe CountWords method B first creates an empty map from str<strong>in</strong>g to <strong>in</strong>t. This willeffectively count how often each word is used with<strong>in</strong> the given text. We then use a regularexpression C to split the text <strong>in</strong>to words. It’s crude—we end up with two empty str<strong>in</strong>gs(one at each end of the text), and I haven’t worried about the fact that “do” and “Do”are counted separately. These issues are easily fixable, but I wanted to keep the code assimple as possible for this example. For each word, we check whether or not it’s already<strong>in</strong> the map. If it is, we <strong>in</strong>crement the exist<strong>in</strong>g count; otherwise, we give the word an <strong>in</strong>itialcount of 1 D. Notice how the <strong>in</strong>crement<strong>in</strong>g code doesn’t need to do a cast to <strong>in</strong>t <strong>in</strong>order to perform the addition: the value we retrieve is known to be an <strong>in</strong>t at compiletime. The step <strong>in</strong>crement<strong>in</strong>g the count is actually perform<strong>in</strong>g a get on the <strong>in</strong>dexer forthe map, then <strong>in</strong>crement<strong>in</strong>g, then perform<strong>in</strong>g a set on the <strong>in</strong>dexer. Some developersmay f<strong>in</strong>d it easier to keep this explicit, us<strong>in</strong>g frequencies[word] = frequencies[word]+1; <strong>in</strong>stead.The f<strong>in</strong>al part of the list<strong>in</strong>g is fairly familiar: enumerat<strong>in</strong>g through a Hashtablegives a similar (nongeneric) DictionaryEntry with Key and Value properties foreach entry E. However, <strong>in</strong> <strong>C#</strong> 1 we would have needed to cast both the word and thefrequency as the key and value would have been returned as just object. That alsomeans that the frequency would have been boxed. Admittedly we don’t really have toput the word and the frequency <strong>in</strong>to variables—we could just have had a s<strong>in</strong>gle call toConsole.WriteL<strong>in</strong>e and passed entry.Key and entry.Value as arguments. I’ve reallyjust got the variables here to ram home the po<strong>in</strong>t that no cast<strong>in</strong>g is necessary.There are some differences between Hashtable and Dictionarybeyond what you might expect. We’re not look<strong>in</strong>g at them right now, but we’ll coverthem when we look at all of the .NET 2.0 collections <strong>in</strong> section 3.4. For the moment, ifyou experiment beyond any of the code listed here (and please do—there’s noth<strong>in</strong>glike actually cod<strong>in</strong>g to get the hang of a concept) and if it doesn’t do what you expect,just be aware that it might not be due to a lack of understand<strong>in</strong>g of generics. Checkthe documentation before panick<strong>in</strong>g!Now that we’ve seen an example, let’s look at what it means to talk aboutDictionary <strong>in</strong> the first place. What are TKey and TValue, and why dothey have angle brackets round them?3.2.2 Generic types and type parametersThere are two forms of generics: generic types (<strong>in</strong>clud<strong>in</strong>g classes, <strong>in</strong>terfaces, delegates,and structures—there are no generic enums) and generic methods. Both are essentiallya way of express<strong>in</strong>g an API (whether it’s for a s<strong>in</strong>gle generic method or a wholeELicensed to Rhona Hadida


68 CHAPTER 3 Parameterized typ<strong>in</strong>g with genericsgeneric type) such that <strong>in</strong> some places where you’d expect to see a normal type, yousee a type parameter <strong>in</strong>stead.A type parameter is a placeholder for a real type. Type parameters appear <strong>in</strong> anglebrackets with<strong>in</strong> a generic declaration, us<strong>in</strong>g commas to separate them. So <strong>in</strong> Dictionary the type parameters are TKey and TValue. When you use a generic typeor method, you specify the real types you want to use. These are called the typearguments—<strong>in</strong> list<strong>in</strong>g 3.1, for example, the type arguments were str<strong>in</strong>g (for TKey) and<strong>in</strong>t (for TValue).NOTE Jargon alert! There’s a lot of detailed term<strong>in</strong>ology <strong>in</strong>volved <strong>in</strong> generics.I’ve <strong>in</strong>cluded it for reference—and because very occasionally it makes iteasier to talk about topics <strong>in</strong> a precise manner. It could well be useful ifyou ever need to consult the language specification, but you’re unlikelyto need to use this term<strong>in</strong>ology <strong>in</strong> day-to-day life. Just gr<strong>in</strong> and bear it forthe moment.The form where none of the type parameters have been provided with type argumentsis called an unbound generic type. When type arguments are specified, the type is said tobe a constructed type. Unbound generic types are effectively bluepr<strong>in</strong>ts for constructedtypes, <strong>in</strong> a way similar to how types (generic or not) can be regarded as bluepr<strong>in</strong>ts forobjects. It’s a sort of extra layer of abstraction. Figure 3.1 shows this graphically.As a further complication, constructed types can be open or closed. An open type isone that <strong>in</strong>volves a type parameter from elsewhere (the enclos<strong>in</strong>g generic method orDictionary(unbound generic type)Specification oftype argumentsHashtableDictionary(constructed type)Dictionary(constructed type)(etc)InstantiationInstantiationInstantiationInstance ofHashtableInstance ofDictionaryInstance ofDictionaryNongenericbluepr<strong>in</strong>tsGenericbluepr<strong>in</strong>tsFigure 3.1 Unbound generic types act as bluepr<strong>in</strong>ts for constructed types, which then act as bluepr<strong>in</strong>tsfor actual objects, just as nongeneric types do.Licensed to Rhona Hadida


Simple generics for everyday use69type), whereas for a closed type all the types <strong>in</strong>volved are completely known about. Allcode actually executes <strong>in</strong> the context of a closed constructed type. The only time yousee an unbound generic type appear<strong>in</strong>g with<strong>in</strong> <strong>C#</strong> code (other than as a declaration)is with<strong>in</strong> the typeof operator, which we’ll meet <strong>in</strong> section 3.4.4.The idea of a type parameter “receiv<strong>in</strong>g” <strong>in</strong>formation and a type argument “provid<strong>in</strong>g”the <strong>in</strong>formation—the dashed l<strong>in</strong>es <strong>in</strong> figure 3.1—is exactly the same as withmethod parameters and arguments, although type arguments are always just names oftypes or type parameters.You can th<strong>in</strong>k of a closed type as hav<strong>in</strong>g the API of the open type, butwith the type parameters be<strong>in</strong>g replaced with their correspond<strong>in</strong>g type arguments.2 Table 3.1 shows some method and property declarations from the open typeDictionary and the equivalent member <strong>in</strong> closed type we built fromit—Dictionary .Table 3.1Examples of how method signatures <strong>in</strong> generic types conta<strong>in</strong> placeholders, which arereplaced when the type arguments are specifiedMethod signature <strong>in</strong> generic typepublic void Add(TKey key, TValue value)public TValue this [TKey key]{ get; set; }public bool Conta<strong>in</strong>sValue(TValue value)public bool Conta<strong>in</strong>sKey(TKey key)Method signature after type parameter replacementpublic void Add(str<strong>in</strong>g key, <strong>in</strong>t value)public <strong>in</strong>t this [str<strong>in</strong>g key]{ get; set; }public bool Conta<strong>in</strong>sValue(<strong>in</strong>t value)public bool Conta<strong>in</strong>sKey(str<strong>in</strong>g key)One important th<strong>in</strong>g to note is that none of the methods <strong>in</strong> table 3.1 are actuallygeneric methods. They’re just “normal” methods with<strong>in</strong> a generic type, and they happento use the type parameters declared as part of the type.Now that you know what TKey and TValue mean, and what the angle brackets arethere for, we can have a look at how Dictionary might be implemented,<strong>in</strong> terms of the type and member declarations. Here’s part of it—althoughthe actual method implementations are all miss<strong>in</strong>g, and there are more members<strong>in</strong> reality:namespace System.Collections.Generic{public class DictionaryDeclaresgeneric class2It doesn’t always work exactly that way—there are corner cases that break when you apply that simple rule—but it’s an easy way of th<strong>in</strong>k<strong>in</strong>g about generics that works <strong>in</strong> the vast majority of situations.Licensed to Rhona Hadida


70 CHAPTER 3 Parameterized typ<strong>in</strong>g with generics}{}: IEnumerablepublic Dictionary(){[...]}Declaresparameterlessconstructorpublic void Add (TKey key, TValue value){[...]}public TValue this [TKey key]{get { [...] }set { [...] }}public bool Conta<strong>in</strong>sValue (TValue value){[...]}public bool Conta<strong>in</strong>sKey (TKey key){[...]}[... other members ...]Implementsgeneric <strong>in</strong>terfaceDeclares methodus<strong>in</strong>g typeparametersNotice how Dictionary implements the generic <strong>in</strong>terface IEnumerable (and many other <strong>in</strong>terfaces <strong>in</strong> real life). Whatever typearguments you specify for the class are applied to the <strong>in</strong>terface where the same typeparameters are used—so <strong>in</strong> our example, Dictionary implementsIEnumerable. Now that’s actually sort of a “doublygeneric” <strong>in</strong>terface—it’s the IEnumerable <strong>in</strong>terface, with the structure KeyValue-Pair as the type argument. It’s because it implements that <strong>in</strong>terface thatlist<strong>in</strong>g 3.1 was able to enumerate the keys and values <strong>in</strong> the way that it did. It’s also worthpo<strong>in</strong>t<strong>in</strong>g out that the constructor doesn’t list the type parameters <strong>in</strong> angle brackets. Thetype parameters belong to the type rather than to the particular constructor, so that’swhere they’re declared.Generic types can effectively be overloaded on the number of type parameters—soyou could def<strong>in</strong>e MyType, MyType, MyType, MyType, and so forth, allwith<strong>in</strong> the same namespace. The names of the type parameters aren’t used when consider<strong>in</strong>gthis—just how many there are of them. These types are unrelated except <strong>in</strong>name—there’s no default conversion from one to another, for <strong>in</strong>stance. The same istrue for generic methods: two methods can be exactly the same <strong>in</strong> signature otherthan the number of type parameters.Licensed to Rhona Hadida


Simple generics for everyday use71NOTENam<strong>in</strong>g conventions for type parameters—Although you could have a type withtype parameters T, U, and V, it wouldn’t give much <strong>in</strong>dication of what theyactually meant, or how they should be used. Compare this with Dictionary, where it’s obvious that TKey represents the type of the keysand TValue represents the type of the values. Where you have a s<strong>in</strong>gle typeparameter and it’s clear what it means, T is conventionally used (Listis a good example of this). Multiple type parameters should usually benamed accord<strong>in</strong>g to mean<strong>in</strong>g, us<strong>in</strong>g the prefix T to <strong>in</strong>dicate a type parameter.Every so often you may run <strong>in</strong>to a type with multiple s<strong>in</strong>gle-letter typeparameters (SynchronizedKeyedCollection, for example), but youshould try to avoid creat<strong>in</strong>g the same situation yourself.Now that we’ve got an idea of what generic types do, let’s look at generic methods.3.2.3 Generic methods and read<strong>in</strong>g generic declarationsWe’ve mentioned generic methods a few times, but we haven’t actually met one yet.You may f<strong>in</strong>d the overall idea of generic methods more confus<strong>in</strong>g than generictypes—they’re somehow less natural for the bra<strong>in</strong>—but it’s the same basic pr<strong>in</strong>ciple.We’re used to the parameters and return value of a method hav<strong>in</strong>g firmly specifiedtypes—and we’ve seen how a generic type can use its type parameters <strong>in</strong> method declarations.Well, generic methods go one step further—even if you know exactly whichconstructed type you’re deal<strong>in</strong>g with, an <strong>in</strong>dividual method can have type parameterstoo. Don’t worry if you’re still none the wiser—the concept is likely to “click” at somepo<strong>in</strong>t after you’ve seen enough examples.Dictionary doesn’t have any generic methods, but its close neighborList does. As you can imag<strong>in</strong>e, List is just a list of items of whatever typeis specified—so List is just a list of str<strong>in</strong>gs, for <strong>in</strong>stance. Remember<strong>in</strong>g thatT is the type parameter for the whole class, let’s dissect a generic method declaration.Figure 3.2 shows what the different parts of the declaration of the ConvertAllmethod mean. 3Return type(a generic list)Type parameterParameter nameList ConvertAll(Converter conv)Method nameParameter type (generic delegate)Figure 3.2The anatomy of a generic method declaration3I’ve renamed the parameter from converter to conv so that it fits on one l<strong>in</strong>e, but everyth<strong>in</strong>g else is as documented.Licensed to Rhona Hadida


72 CHAPTER 3 Parameterized typ<strong>in</strong>g with genericsWhen you look at a generic declaration—whether it’s for a generic type or a genericmethod—it can be a bit daunt<strong>in</strong>g try<strong>in</strong>g to work out what it means, particularly if youhave to deal with generic types of generic types, as we did when we looked at the <strong>in</strong>terfaceimplemented by the dictionary. The key is not to panic—just take th<strong>in</strong>gs calmly,and pick an example situation. Use a different type for each type parameter, and applythem all consistently.In this case, let’s start off by replac<strong>in</strong>g the type parameter of the type conta<strong>in</strong><strong>in</strong>gthe method (the part of List). We’ve used List as an examplebefore, so let’s cont<strong>in</strong>ue to do so and replace T with str<strong>in</strong>g everywhere:List ConvertAll(Converter conv)That looks a bit better, but we’ve still got TOutput to deal with. We can tell that it’s amethod’s type parameter (apologies for the confus<strong>in</strong>g term<strong>in</strong>ology) because it’s <strong>in</strong>angle brackets directly after the name of the method. So, let’s try to use another familiartype—Guid—as the type argument for TOutput. The method declaration becomesList ConvertAll(Converter conv)To go through the bits of this from left to right:■ The method returns a List.■ The method’s name is ConvertAll.■ The method has a s<strong>in</strong>gle type parameter, and the type argument we’re us<strong>in</strong>g isGuid.■ The method takes a s<strong>in</strong>gle parameter, which is a Converter andis called conv.Now we just need to know what Converter is and we’re all done. Not surpris<strong>in</strong>gly,Converter is a constructed generic delegate type (the unboundtype is Converter), which is used to convert a str<strong>in</strong>g to a GUID.So, we have a method that can operate on a list of str<strong>in</strong>gs, us<strong>in</strong>g a converter to producea list of GUIDs. Now that we understand the method’s signature, it’s easier tounderstand the documentation, which confirms that this method does the obviousth<strong>in</strong>g and converts each element <strong>in</strong> the orig<strong>in</strong>al list <strong>in</strong>to the target type, and adds it toa list, which is then returned. Th<strong>in</strong>k<strong>in</strong>g about the signature <strong>in</strong> concrete terms gives usa clearer mental model, and makes it simpler to th<strong>in</strong>k about what we might expect themethod to do.Just to prove I haven’t been lead<strong>in</strong>g you down the garden path, let’s take a look atthis method <strong>in</strong> action. List<strong>in</strong>g 3.2 shows the conversion of a list of <strong>in</strong>tegers <strong>in</strong>to a list offloat<strong>in</strong>g-po<strong>in</strong>t numbers, where each element of the second list is the square root of thecorrespond<strong>in</strong>g element <strong>in</strong> the first list. After the conversion, we pr<strong>in</strong>t out the results.List<strong>in</strong>g 3.2The List.ConvertAll method <strong>in</strong> actionstatic double TakeSquareRoot (<strong>in</strong>t x){return Math.Sqrt(x);Licensed to Rhona Hadida


Simple generics for everyday use73}...List <strong>in</strong>tegers = new List();<strong>in</strong>tegers.Add(1);<strong>in</strong>tegers.Add(2);<strong>in</strong>tegers.Add(3);<strong>in</strong>tegers.Add(4);Converter converter = TakeSquareRoot;BCreates andpopulates listof <strong>in</strong>tegersCCreates delegate<strong>in</strong>stanceList doubles;doubles = <strong>in</strong>tegers.ConvertAll(converter);foreach (double d <strong>in</strong> doubles){Console.WriteL<strong>in</strong>e (d);}The creation and population of the list B is straightforward enough—it’s just a stronglytyped list of <strong>in</strong>tegers. C uses a feature of delegates (method group conversions), whichis new to <strong>C#</strong> 2 and which we’ll discuss <strong>in</strong> more detail <strong>in</strong> section 5.2. Although I don’t likeus<strong>in</strong>g a feature before describ<strong>in</strong>g it fully, the l<strong>in</strong>e would just have been too long to fit onthe page with the full version. It does what you expect it to, though. At D we call thegeneric method, specify<strong>in</strong>g the type argument for the method <strong>in</strong> the same way as we’veseen for generic types. We’ll see later (section 3.3.2) that you don’t always need to specifythe type argument—often the compiler can work it out itself, mak<strong>in</strong>g the code that bitmore compact. We could have omitted it this time, but I wanted to show the full syntax.Writ<strong>in</strong>g out the list that has been returned is simple, and when you run the code you’llsee it pr<strong>in</strong>t 1, 1.414..., 1.732..., and 2, as expected.So, what’s the po<strong>in</strong>t of all of this? We could have just used a foreach loop to gothrough the <strong>in</strong>tegers and pr<strong>in</strong>ted out the square root immediately, of course, but it’snot at all uncommon to want to convert a list of one type to a list of another by perform<strong>in</strong>gsome logic on it. The code to do it manually is still simple, but it’s easier to read aversion that just does it <strong>in</strong> a s<strong>in</strong>gle method call. That’s often the way with generic methods—theyoften do th<strong>in</strong>gs that previously you’d have happily done “longhand” but thatare just simpler with a method call. Before generics, there could have been a similaroperation to ConvertAll on ArrayList convert<strong>in</strong>g from object to object, but it wouldhave been a lot less satisfactory. Anonymous methods (see section 5.4) also help here—if we hadn’t wanted to <strong>in</strong>troduce an extra method, we could just have specified the conversion“<strong>in</strong>l<strong>in</strong>e.”Note that just because a method is generic doesn’t mean it has to be part of ageneric type. List<strong>in</strong>g 3.3 shows a generic method be<strong>in</strong>g declared and used with<strong>in</strong> aperfectly normal class.DCalls genericmethod toconvert listList<strong>in</strong>g 3.3Implement<strong>in</strong>g a generic method <strong>in</strong> a nongeneric typestatic List MakeList (T first, T second){List list = new List();list.Add (first);Licensed to Rhona Hadida


74 CHAPTER 3 Parameterized typ<strong>in</strong>g with genericslist.Add (second);return list;}...List list = MakeList ("L<strong>in</strong>e 1", "L<strong>in</strong>e 2");foreach (str<strong>in</strong>g x <strong>in</strong> list){Console.WriteL<strong>in</strong>e (x);}The MakeList generic method only needs one type parameter (T). All it does is builda list conta<strong>in</strong><strong>in</strong>g the two parameters. It’s worth not<strong>in</strong>g that we can use T as a type argumentwhen we create the List <strong>in</strong> the method, however. Just as when we were look<strong>in</strong>gat generic declarations, th<strong>in</strong>k of the implementation as (roughly speak<strong>in</strong>g) replac<strong>in</strong>g allof the places where it says T with str<strong>in</strong>g. When we call the method, we use the same syntaxwe’ve seen before. In case you were wonder<strong>in</strong>g, a generic method with<strong>in</strong> a generictype doesn’t have to use the generic type’s type parameters—although most do.All OK so far? You should now have the hang of “simple” generics. There’s a bitmore complexity to come, I’m afraid, but if you’re happy with the fundamental ideaof generics, you’ve jumped the biggest hurdle. Don’t worry if it’s still a bit hazy—particularlywhen it comes to the open/closed/unbound/constructed term<strong>in</strong>ology—butnow would be a good time to do some experimentation so you can see generics <strong>in</strong>action before we go any further.The most important types to play with are List and Dictionary.A lot of the time you can get by just by <strong>in</strong>st<strong>in</strong>ct and experimentation, but if you wantmore details of these types, you can skip ahead to sections 3.5.1 and 3.5.2. Once you’reconfident us<strong>in</strong>g these types, you should f<strong>in</strong>d that you rarely want to use ArrayList orHashtable anymore.One th<strong>in</strong>g you may f<strong>in</strong>d when you experiment is that it’s hard to only go part ofthe way. Once you make one part of an API generic, you often f<strong>in</strong>d that you need torework other code to either also be generic or to put <strong>in</strong> the casts required by the morestrongly typed method calls you have now. An alternative can be to have a stronglytyped implementation, us<strong>in</strong>g generic classes under the covers, but leav<strong>in</strong>g a weaklytyped API for the moment. As time goes on, you’ll become more confident aboutwhen it’s appropriate to use generics.3.3 Beyond the basicsWhile the relatively simple uses of generics we’ve seen can get you a long way, thereare some more features available that can help you further. We’ll start off by exam<strong>in</strong><strong>in</strong>gtype constra<strong>in</strong>ts, which allow you more control over which type arguments can bespecified. They are useful when creat<strong>in</strong>g your own generic types and methods, andyou’ll need to understand them <strong>in</strong> order to know what options are available whenus<strong>in</strong>g the framework, too.We’ll then exam<strong>in</strong>e type <strong>in</strong>ference —a handy compiler trick that means that whenyou’re us<strong>in</strong>g generic methods, you don’t always have to explicitly state the typeLicensed to Rhona Hadida


Beyond the basics75parameters. You don’t have to use it, but it can make your code a lot easier to readwhen used appropriately. We’ll see <strong>in</strong> part 3 that the <strong>C#</strong> compiler is gradually be<strong>in</strong>gallowed to <strong>in</strong>fer a lot more <strong>in</strong>formation from your code, while still keep<strong>in</strong>g the languagesafe and statically typed.The last part of this section deals with obta<strong>in</strong><strong>in</strong>g the default value of a type parameterand what comparisons are available when you’re writ<strong>in</strong>g generic code. We’ll wrapup with an example demonstrat<strong>in</strong>g most of the features we’ve covered, as well as be<strong>in</strong>ga useful class <strong>in</strong> itself.Although this section delves a bit deeper <strong>in</strong>to generics, there’s noth<strong>in</strong>g really hardabout it. There’s plenty to remember, but all the features serve a purpose, and you’llbe grateful for them when you need them. Let’s get started.3.3.1 Type constra<strong>in</strong>tsSo far, all the type parameters we’ve seen can be applied to any type at all—they areunconstra<strong>in</strong>ed. We can have a List, a Dictionary, anyth<strong>in</strong>g.That’s f<strong>in</strong>e when we’re deal<strong>in</strong>g with collections that don’t have to <strong>in</strong>teract with whatthey store—but not all uses of generics are like that. Often you want to call methodson <strong>in</strong>stances of the type parameter, or create new <strong>in</strong>stances, or make sure you onlyaccept reference types (or only accept value types). In other words, you want to specifyrules to say which type arguments are considered valid for your generic type ormethod. In <strong>C#</strong> 2, you do this with constra<strong>in</strong>ts.Four k<strong>in</strong>ds of constra<strong>in</strong>ts are available, and the general syntax is the same for all ofthem. Constra<strong>in</strong>ts come at the end of the declaration of a generic method or type,and are <strong>in</strong>troduced by the contextual keyword where. They can be comb<strong>in</strong>ed together<strong>in</strong> sensible ways, as we’ll see later. First, however, we’ll explore each k<strong>in</strong>d of constra<strong>in</strong>t<strong>in</strong> turn.REFERENCE TYPE CONSTRAINTSThe first k<strong>in</strong>d of constra<strong>in</strong>t (which is expressed as T : class and must be the first constra<strong>in</strong>tspecified for that type parameter) simply ensures that the type argument usedis a reference type. This can be any class, <strong>in</strong>terface, array, or delegate—or anothertype parameter that is already known to be a reference type. For example, considerthe follow<strong>in</strong>g declaration:struct RefSample where T : classValid closed types <strong>in</strong>clude■■■RefSampleRefSampleRefSampleInvalid closed types <strong>in</strong>clude■■RefSampleRefSampleLicensed to Rhona Hadida


76 CHAPTER 3 Parameterized typ<strong>in</strong>g with genericsI deliberately made RefSample a struct (and therefore a value type) to emphasize thedifference between the constra<strong>in</strong>ed type parameter and the type itself. RefSample is still a value type with value semantics everywhere—it just happens to usethe str<strong>in</strong>g type wherever T is specified <strong>in</strong> its API.When a type parameter is constra<strong>in</strong>ed this way, you can compare references (<strong>in</strong>clud<strong>in</strong>gnull) with == and !=, but be aware that unless there are any other constra<strong>in</strong>ts, onlyreferences will be compared, even if the type <strong>in</strong> question overloads those operators (asstr<strong>in</strong>g does, for example). With a derivation type constra<strong>in</strong>t (described <strong>in</strong> a littlewhile), you can end up with “compiler guaranteed” overloads of == and !=, <strong>in</strong> whichcase those overloads are used—but that’s a relatively rare situation.VALUE TYPE CONSTRAINTSThis constra<strong>in</strong>t (expressed as T : struct) ensures that the type argument used is avalue type, <strong>in</strong>clud<strong>in</strong>g enums. It excludes nullable types (as described <strong>in</strong> chapter 4),however. Let’s look at an example declaration:class ValSample where T : structValid closed types <strong>in</strong>clude■■ValSampleValSampleInvalid closed types <strong>in</strong>clude■■ValSampleValSampleThis time ValSample is a reference type, despite T be<strong>in</strong>g constra<strong>in</strong>ed to be a valuetype. Note that System.Enum and System.ValueType are both reference types <strong>in</strong>themselves, so aren’t allowed as valid type arguments for ValSample. Like referencetype constra<strong>in</strong>ts, when there are multiple constra<strong>in</strong>ts for a particular type parameter,a value type constra<strong>in</strong>t must be the first one specified. When a type parameter is constra<strong>in</strong>edto be a value type, comparisons us<strong>in</strong>g == and != are prohibited.I rarely f<strong>in</strong>d myself us<strong>in</strong>g value or reference type constra<strong>in</strong>ts, although we’ll see<strong>in</strong> the next chapter that nullable types rely on value type constra<strong>in</strong>ts. The rema<strong>in</strong><strong>in</strong>gtwo constra<strong>in</strong>ts are likely to prove more useful to you when writ<strong>in</strong>g your owngeneric types.CONSTRUCTOR TYPE CONSTRAINTSThe third k<strong>in</strong>d of constra<strong>in</strong>t (which is expressed as T : new() and must be the lastconstra<strong>in</strong>t for any particular type parameter) simply checks that the type argumentused has a parameterless constructor, which can be used to create an <strong>in</strong>stance. Thisapplies to any value type; any nonstatic, nonabstract class without any explicitlydeclared constructors; and any nonabstract class with an explicit public parameterlessconstructor.Licensed to Rhona Hadida


Beyond the basics77NOTE<strong>C#</strong> vs. CLI standards—There is a discrepancy between the <strong>C#</strong> and CLIstandards when it comes to value types and constructors. The CLI specificationstates that value types can’t have parameterless constructors, butthere’s a special <strong>in</strong>struction to create a value without specify<strong>in</strong>g anyparameters. The <strong>C#</strong> specification states that all value types have a defaultparameterless constructor, and it uses the same syntax to call both explicitlydeclared constructors and the parameterless one, rely<strong>in</strong>g on the compilerto do the right th<strong>in</strong>g underneath. You can see this discrepancy atwork when you use reflection to f<strong>in</strong>d the constructors of a value type—you won’t see a parameterless one.Aga<strong>in</strong>, let’s look at a quick example, this time for a method. Just to show how it’s useful,I’ll give the implementation of the method too.public T CreateInstance() where T : new(){return new T();}This method just returns a new <strong>in</strong>stance of whatever type you specify, provid<strong>in</strong>gthat it has a parameterless constructor. So CreateInstance(); and Create-Instance(); are OK, but CreateInstance(); isn’t, because str<strong>in</strong>gdoesn’t have a parameterless constructor.There is no way of constra<strong>in</strong><strong>in</strong>g type parameters to force other constructor signatures—for<strong>in</strong>stance, you can’t specify that there has to be a constructor tak<strong>in</strong>g a s<strong>in</strong>glestr<strong>in</strong>g parameter. It can be frustrat<strong>in</strong>g, but that’s unfortunately just the way it is.Constructor type constra<strong>in</strong>ts can be useful when you need to use factory-like patterns,where one object will create another one as and when it needs to. Factories oftenneed to produce objects that are compatible with a certa<strong>in</strong> <strong>in</strong>terface, of course—andthat’s where our last type of constra<strong>in</strong>t comes <strong>in</strong>.DERIVATION TYPE CONSTRAINTSThe f<strong>in</strong>al (and most complicated) k<strong>in</strong>d of constra<strong>in</strong>t lets you specify another type thatthe type argument must derive from (<strong>in</strong> the case of a class) or implement (<strong>in</strong> the caseof an <strong>in</strong>terface). 4 For the purposes of constra<strong>in</strong>ts, types are deemed to derive fromthemselves. You can specify that one type argument must derive from another, too—this is called a type parameter constra<strong>in</strong>t and makes it harder to understand the declaration,but can be handy every so often. Table 3.2 shows some examples of generic typedeclarations with derivation type constra<strong>in</strong>ts, along with valid and <strong>in</strong>valid examples ofcorrespond<strong>in</strong>g constructed types.The third constra<strong>in</strong>t of T : IComparable is just one example of us<strong>in</strong>g a generictype as the constra<strong>in</strong>t. Other variations such as T : List (where U is another type4Strictly speak<strong>in</strong>g, an implicit reference conversion is OK too. This allows for a constra<strong>in</strong>t such as where T :IList to be satisfied by Circle[]. Even though Circle[] doesn’t actually implement IList, there is an implicit reference conversion available.Licensed to Rhona Hadida


78 CHAPTER 3 Parameterized typ<strong>in</strong>g with genericsTable 3.2Examples of derivation type constra<strong>in</strong>tsDeclarationclass Samplewhere T : Streamstruct Samplewhere T : IDisposableclass Samplewhere T : IComparableclass Samplewhere T : UConstructed type examplesValid: SampleSampleInvalid: SampleSampleValid: SampleSampleInvalid: SampleValid: SampleInvalid: SampleValid: SampleSampleInvalid: Sampleparameter) and T : IList are also f<strong>in</strong>e. You can specify multiple <strong>in</strong>terfaces,but only one class. For <strong>in</strong>stance, this is f<strong>in</strong>e (if hard to satisfy):class Sample where T : Stream,IEnumerable,IComparableBut this isn’t:class Sample where T : Stream,ArrayList,IComparableNo type can derive directly from more than one class anyway, so such a constra<strong>in</strong>twould usually either be impossible (like this one) or part of it would be redundant(specify<strong>in</strong>g that the type had to derive from both Stream and MemoryStream, for example).One more set of restrictions: the class you specify can’t be a struct, a sealed class(such as str<strong>in</strong>g), or any of the follow<strong>in</strong>g “special” types:■■■■System.ObjectSystem.EnumSystem.ValueTypeSystem.DelegateDerivation type constra<strong>in</strong>ts are probably the most useful k<strong>in</strong>d, as they mean you canuse members of the specified type on <strong>in</strong>stances of the type parameter. One particularlyhandy example of this is T : IComparable, so that you know you can comparetwo <strong>in</strong>stances of T mean<strong>in</strong>gfully and directly. We’ll see an example of this (as well asdiscuss other forms of comparison) <strong>in</strong> section 3.3.3.COMBINING CONSTRAINTSI’ve mentioned the possibility of hav<strong>in</strong>g multiple constra<strong>in</strong>ts, and we’ve seen them <strong>in</strong>action for derivation constra<strong>in</strong>ts, but we haven’t seen the different k<strong>in</strong>ds be<strong>in</strong>gLicensed to Rhona Hadida


Beyond the basics79comb<strong>in</strong>ed together. Obviously no type can be both a reference type and a value type, sothat comb<strong>in</strong>ation is forbidden, and as every value type has a parameterless constructor,specify<strong>in</strong>g the construction constra<strong>in</strong>t when you’ve already got a value type constra<strong>in</strong>tis also not allowed (but you can still use new T() with<strong>in</strong> methods if T is constra<strong>in</strong>ed tobe a value type). Different type parameters can have different constra<strong>in</strong>ts, and they’reeach <strong>in</strong>troduced with a separate where.Let’s see some valid and <strong>in</strong>valid examples:Valid:class Sample where T : class, Stream, new()class Sample where T : struct, IDisposableclass Sample where T : class where U : struct, Tclass Sample where T : Stream where U : IDisposableInvalid:class Sample where T : class, structclass Sample where T : Stream, classclass Sample where T : new(), Streamclass Sample where T : struct where U : class, Tclass Sample where T : Stream, U : IDisposableI <strong>in</strong>cluded the last example on each list because it’s so easy to try the <strong>in</strong>valid one<strong>in</strong>stead of the valid version, and the compiler error is not at all helpful. Just rememberthat each list of type parameter constra<strong>in</strong>ts needs its own <strong>in</strong>troductory where. Thethird valid example is <strong>in</strong>terest<strong>in</strong>g—if U is a value type, how can it derive from T, whichis a reference type? The answer is that T could be object or an <strong>in</strong>terface that U implements.It’s a pretty nasty constra<strong>in</strong>t, though.Now that you’ve got all the knowledge you need to read generic type declarations,let’s look at the type argument <strong>in</strong>ference that I mentioned earlier. In list<strong>in</strong>g 3.2 weexplicitly stated the type arguments to List.ConvertAll—but let’s now ask the compilerto work them out when it can, mak<strong>in</strong>g it simpler to call generic methods.3.3.2 Type <strong>in</strong>ference for type arguments of generic methodsSpecify<strong>in</strong>g type arguments when you’re call<strong>in</strong>g a generic method can often seempretty redundant. Usually it’s obvious what the type arguments should be, based onthe method arguments themselves. To make life easier, the <strong>C#</strong> 2 compiler is allowed tobe smart <strong>in</strong> tightly def<strong>in</strong>ed ways, so you can call the method without explicitly stat<strong>in</strong>gthe type arguments.Before we go any further, I should stress that this is only true for generic methods. Itdoesn’t apply to generic types. Now that we’ve got that cleared up, let’s look at the relevantl<strong>in</strong>es from list<strong>in</strong>g 3.3, and see how th<strong>in</strong>gs can be simplified. Here are the l<strong>in</strong>esdeclar<strong>in</strong>g and <strong>in</strong>vok<strong>in</strong>g the method:static List MakeList (T first, T second)...List list = MakeList ("L<strong>in</strong>e 1", "L<strong>in</strong>e 2");Now look at the arguments we’ve specified—they’re both str<strong>in</strong>gs. Each of the parameters<strong>in</strong> the method is declared to be of type T. Even if we hadn’t got the partLicensed to Rhona Hadida


80 CHAPTER 3 Parameterized typ<strong>in</strong>g with genericsof the method <strong>in</strong>vocation expression, it would be fairly obvious that we meant to callthe method us<strong>in</strong>g str<strong>in</strong>g as the type argument for T. The compiler allows you to omitit, leav<strong>in</strong>g this:List list = MakeList ("L<strong>in</strong>e 1", "L<strong>in</strong>e 2");That’s a bit neater, isn’t it? At least, it’s shorter. That doesn’t always mean it’s morereadable, of course—<strong>in</strong> some cases it’ll make it harder for the reader to work out whattype arguments you’re try<strong>in</strong>g to use, even if the compiler can do it easily. I recommendthat you judge each case on its merits.Notice how the compiler def<strong>in</strong>itely knows that we’re us<strong>in</strong>g str<strong>in</strong>g as the typeparameter, because the assignment to list works too, and that still does specify thetype argument (and has to). The assignment has no <strong>in</strong>fluence on the type parameter<strong>in</strong>ference process, however. It just means that if the compiler works out what type argumentsit th<strong>in</strong>ks you want to use but gets it wrong, you’re still likely to get a compiletimeerror.How could the compiler get it wrong? Well, suppose we actually wanted to useobject as the type argument. Our method parameters are still valid, but the compilerth<strong>in</strong>ks we actually meant to use str<strong>in</strong>g, as they’re both str<strong>in</strong>gs. Chang<strong>in</strong>g one of theparameters to explicitly be cast to object makes type <strong>in</strong>ference fail, as one of themethod arguments would suggest that T should be str<strong>in</strong>g, and the other suggests thatT should be object. The compiler could look at this and say that sett<strong>in</strong>g T to objectwould satisfy everyth<strong>in</strong>g but sett<strong>in</strong>g T to str<strong>in</strong>g wouldn’t, but the specification only givesa limited number of steps to follow. This area is already fairly complicated <strong>in</strong> <strong>C#</strong> 2, and<strong>C#</strong> 3 takes th<strong>in</strong>gs even further. I won’t try to give all of the nuts and bolts of the <strong>C#</strong> 2 ruleshere, but the basic steps are as follows.1 For each method argument (the bits <strong>in</strong> normal parentheses, not angle brackets),try to <strong>in</strong>fer some of the type arguments of the generic method, us<strong>in</strong>g somefairly simple techniques.2 Check that all the results from the first step are consistent—<strong>in</strong> other words, ifone argument implied one type argument for a particular type parameter, andanother implied a different type argument for the same type parameter, then<strong>in</strong>ference fails for the method call.3 Check that all the type parameters needed for the generic method have been<strong>in</strong>ferred. You can’t let the compiler <strong>in</strong>fer some while you specify others explicitly—it’sall or noth<strong>in</strong>g.To avoid learn<strong>in</strong>g all the rules (and I wouldn’t recommend it unless you’re particularly<strong>in</strong>terested <strong>in</strong> the f<strong>in</strong>e details), there’s one simple th<strong>in</strong>g to do: try it to see what happens.If you th<strong>in</strong>k the compiler might be able to <strong>in</strong>fer all the type arguments, try call<strong>in</strong>g themethod without specify<strong>in</strong>g any. If it fails, stick the type arguments <strong>in</strong> explicitly. You losenoth<strong>in</strong>g more than the time it takes to compile the code once, and you don’t have tohave all the extra language-lawyer garbage <strong>in</strong> your head.Licensed to Rhona Hadida


Beyond the basics813.3.3 Implement<strong>in</strong>g genericsAlthough you’re likely to spend more time us<strong>in</strong>g generic types and methods than writ<strong>in</strong>gthem yourself, there are a few th<strong>in</strong>gs you should know for those occasions whereyou’re provid<strong>in</strong>g the implementation. Most of the time you can just pretend T (or whateveryour type parameter is called) is just the name of a type and get on with writ<strong>in</strong>gcode as if you weren’t us<strong>in</strong>g generics at all. There are a few extra th<strong>in</strong>gs you shouldknow, however.DEFAULT VALUE EXPRESSIONSWhen you know exactly what type you’re work<strong>in</strong>g with, you know its “default” value—the value an otherwise un<strong>in</strong>itialized field would have, for <strong>in</strong>stance. When you don’tknow what type you’re referr<strong>in</strong>g to, you can’t specify that default value directly. Youcan’t use null because it might not be a reference type. You can’t use 0 because itmight not be a numeric type. While it’s fairly rare to need the default value, it can beuseful on occasion. Dictionary provides a good example—it has aTryGetValue method that works a bit like the TryParse methods on the numerictypes: it uses an output parameter for the value you’re try<strong>in</strong>g to fetch, and a Booleanreturn value to <strong>in</strong>dicate whether or not it succeeded. This means that the method hasto have some value of type TValue to populate the output parameter with. (Rememberthat output parameters must be assigned before the method returns normally.)NOTEThe TryXXX pattern—There are a few patterns <strong>in</strong> .NET that are easily identifiableby the names of the methods <strong>in</strong>volved—Beg<strong>in</strong>XXX and EndXXXsuggest an asynchronous operation, for example. The TryXXX pattern isone that has had its use expanded between .NET 1.1 and 2.0. It’sdesigned for situations that might normally be considered to be errors(<strong>in</strong> that the method can’t perform its primary duty) but where failurecould well occur without this <strong>in</strong>dicat<strong>in</strong>g a serious issue, and shouldn’t bedeemed exceptional. For <strong>in</strong>stance, users can often fail to type <strong>in</strong> numberscorrectly, so be<strong>in</strong>g able to try to parse some text without hav<strong>in</strong>g to catchan exception and swallow it is very useful. Not only does it improve performance<strong>in</strong> the failure case, but more importantly, it saves exceptionsfor genu<strong>in</strong>e error cases where someth<strong>in</strong>g is wrong <strong>in</strong> the system (howeverwidely you wish to <strong>in</strong>terpret that). It’s a useful pattern to have up yoursleeve as a library designer, when applied appropriately.<strong>C#</strong> 2 provides the default value expression to cater for just this need. The specificationdoesn’t refer to it as an operator, but you can th<strong>in</strong>k of it as be<strong>in</strong>g similar to the typeofoperator, just return<strong>in</strong>g a different value. List<strong>in</strong>g 3.4 shows this <strong>in</strong> a generic method,and also gives an example of type <strong>in</strong>ference and a derivation type constra<strong>in</strong>t <strong>in</strong> action.List<strong>in</strong>g 3.4Compar<strong>in</strong>g a given value to the default <strong>in</strong> a generic waystatic <strong>in</strong>t CompareToDefault (T value)where T : IComparable{return value.CompareTo(default(T));Licensed to Rhona Hadida


82 CHAPTER 3 Parameterized typ<strong>in</strong>g with generics}...Console.WriteL<strong>in</strong>e(CompareToDefault("x"));Console.WriteL<strong>in</strong>e(CompareToDefault(10));Console.WriteL<strong>in</strong>e(CompareToDefault(0));Console.WriteL<strong>in</strong>e(CompareToDefault(-10));Console.WriteL<strong>in</strong>e(CompareToDefault(DateTime.M<strong>in</strong>Value));List<strong>in</strong>g 3.4 shows a generic method be<strong>in</strong>g used with three different types: str<strong>in</strong>g,<strong>in</strong>t, and DateTime. The CompareToDefault method dictates that it can only beused with types implement<strong>in</strong>g the IComparable <strong>in</strong>terface, which allows us to callCompareTo(T) on the value passed <strong>in</strong>. The other value we use for the comparison is thedefault value for the type. As str<strong>in</strong>g is a reference type, the default value is null—andthe documentation for CompareTo states that for reference types, everyth<strong>in</strong>g should begreater than null so the first result is 1. The next three l<strong>in</strong>es show comparisons with thedefault value of <strong>in</strong>t, demonstrat<strong>in</strong>g that the default value is 0. The output of the last l<strong>in</strong>eis 0, show<strong>in</strong>g that DateTime.M<strong>in</strong>Value is the default value for DateTime.Of course, the method <strong>in</strong> list<strong>in</strong>g 3.4 will fail if you pass it null as the argument—the l<strong>in</strong>e call<strong>in</strong>g CompareTo will throw NullReferenceException <strong>in</strong> the normal way.Don’t worry about it for the moment—there’s an alternative us<strong>in</strong>g IComparer, aswe’ll see soon.DIRECT COMPARISONSAlthough list<strong>in</strong>g 3.4 showed how a comparison is possible, we don’t always want to constra<strong>in</strong>our types to implement IComparable or its sister <strong>in</strong>terface, IEquatable,which provides a strongly typed Equals(T) method to complement the Equals(object)method that all types have. Without the extra <strong>in</strong>formation these <strong>in</strong>terfaces give us accessto, there is little we can do <strong>in</strong> terms of comparisons, other than call<strong>in</strong>g Equals(object),which will result <strong>in</strong> box<strong>in</strong>g the value we want to compare with when it’s a value type.(In fact, there are a couple of types to help us <strong>in</strong> some situations—we’ll come to them<strong>in</strong> a m<strong>in</strong>ute.)When a type parameter is unconstra<strong>in</strong>ed (<strong>in</strong> other words, no constra<strong>in</strong>ts are appliedto it), you can use == and != operators but only to compare a value of that type withnull. You can’t compare two values of T with each other. In the case where the typeargument provided for T is a value type (other than a nullable type), a comparison withnull will always decide they are unequal (so the comparison can be removed by the JITcompiler). When the type argument is a reference type, the normal reference comparisonwill be used. When the type argument is a nullable type, the comparison will do theobvious th<strong>in</strong>g, treat<strong>in</strong>g an <strong>in</strong>stance without a value as null. (Don’t worry if this last bitdoesn’t make sense yet—it will when you’ve read the next chapter. Some features aretoo <strong>in</strong>tertw<strong>in</strong>ed to allow me to describe either of them completely without referr<strong>in</strong>g tothe other, unfortunately.)When a type parameter is constra<strong>in</strong>ed to be a value type, == and != can’t be used withit at all. When it’s constra<strong>in</strong>ed to be a reference type, the k<strong>in</strong>d of comparison performeddepends on exactly what the type parameter is constra<strong>in</strong>ed to be. If it’s just a referencetype, simple reference comparisons are performed. If it’s further constra<strong>in</strong>ed toLicensed to Rhona Hadida


Beyond the basics83derive from a particular type that overloads the == and != operators, those overloads areused. Beware, however—extra overloads that happen to be made available by the typeargument specified by the caller are not used. List<strong>in</strong>g 3.5 demonstrates this with a simplereference type constra<strong>in</strong>t and a type argument of str<strong>in</strong>g.List<strong>in</strong>g 3.5Comparisons us<strong>in</strong>g == and != us<strong>in</strong>g reference comparisonsstatic bool AreReferencesEqual (T first, T second)where T : class{B Comparesreturn first==second;references}...str<strong>in</strong>g name = "Jon";str<strong>in</strong>g <strong>in</strong>tro1 = "My name is "+name;str<strong>in</strong>g <strong>in</strong>tro2 = "My name is "+name;Console.WriteL<strong>in</strong>e (<strong>in</strong>tro1==<strong>in</strong>tro2);Console.WriteL<strong>in</strong>e (AreReferencesEqual(<strong>in</strong>tro1, <strong>in</strong>tro2));Even though str<strong>in</strong>g overloads == (as demonstrated by C pr<strong>in</strong>t<strong>in</strong>g True), this overloadis not used by the comparison at B. Basically, when AreReferencesEqual is compiledthe compiler doesn’t know what overloads will be available—it’s as if the parameterspassed <strong>in</strong> were just of type object.This is not just specific to operators—when the compiler encounters ageneric type, it resolves all the method overloads when compil<strong>in</strong>gthe unbound generic type, rather than reconsider<strong>in</strong>g each possiblemethod call for more specific overloads at execution time. For <strong>in</strong>stance, astatement of Console.WriteL<strong>in</strong>e (default(T)); will always resolve to callConsole.WriteL<strong>in</strong>e(object value)—it doesn’t call Console.WriteL<strong>in</strong>e(str<strong>in</strong>g value) when T happens to be str<strong>in</strong>g. This is similar to the normalsituation of overloads be<strong>in</strong>g chosen at compile time rather than executiontime, but readers familiar with templates <strong>in</strong> C++ may be surprised nonetheless.Two classes that are extremely useful when it comes to compar<strong>in</strong>g values are Equality-Comparer and Comparer, both <strong>in</strong> the System.Collections.Generic namespace.They implement IEqualityComparer (useful for compar<strong>in</strong>g and hash<strong>in</strong>g dictionarykeys) and IComparer (useful for sort<strong>in</strong>g) respectively, and the Default propertyreturns an implementation that generally does the right th<strong>in</strong>g for the appropriate type.See the documentation for more details, but consider us<strong>in</strong>g these (and similar typessuch as Str<strong>in</strong>gComparer) when perform<strong>in</strong>g comparisons. We’ll use Equality-Comparer <strong>in</strong> our next example.Caution!Possiblyunexpectedbehavior!Compares us<strong>in</strong>gstr<strong>in</strong>g overloadFULL COMPARISON EXAMPLE: REPRESENTING A PAIR OF VALUESTo f<strong>in</strong>ish off our section on implement<strong>in</strong>g generics—and <strong>in</strong>deed “medium-level” generics—here’sa complete example. It implements a useful generic type—a Pair, which just holds two values together, like a key/value pair, but withno expectations as to the relationship between the two values. As well as provid<strong>in</strong>g propertiesto access the values themselves, we’ll override Equals and GetHashCode to allowCLicensed to Rhona Hadida


84 CHAPTER 3 Parameterized typ<strong>in</strong>g with generics<strong>in</strong>stances of our type to play nicely when used as keys <strong>in</strong> a dictionary. List<strong>in</strong>g 3.6 givesthe complete code.List<strong>in</strong>g 3.6Generic class represent<strong>in</strong>g a pair of valuesus<strong>in</strong>g System;us<strong>in</strong>g System.Collections.Generic;public sealed class Pair: IEquatable{private readonly TFirst first;private readonly TSecond second;}public Pair(TFirst first, TSecond second){this.first = first;this.second = second;}public TFirst First{get { return first; }}public TSecond Second{get { return second; }}public bool Equals(Pair other){if (other == null){return false;}return EqualityComparer.Default.Equals(this.First, other.First) &&EqualityComparer.Default.Equals(this.Second, other.Second);}public override bool Equals(object o){return Equals(o as Pair);}public override <strong>in</strong>t GetHashCode(){return EqualityComparer.Default.GetHashCode(first) * 37 +EqualityComparer.Default.GetHashCode(second);}List<strong>in</strong>g 3.6 is very straightforward. The constituent values are stored <strong>in</strong> appropriatelytyped member variables, and access is provided by simple read-only properties. WeLicensed to Rhona Hadida


Advanced generics85implement IEquatable to give a strongly typed API that willavoid unnecessary execution time checks. The equality and hash-code computationsboth use the default equality comparer for the two type parameters—these handlenulls for us automatically, which makes the code somewhat simpler. 5If we wanted to support sort<strong>in</strong>g, we could implement IComparer , perhaps order<strong>in</strong>g by the first component and then the second.This k<strong>in</strong>d of type is a good candidate for bear<strong>in</strong>g <strong>in</strong> m<strong>in</strong>d what functionality you mightwant, but not actually implement<strong>in</strong>g until you need it.We’ve f<strong>in</strong>ished look<strong>in</strong>g at our “<strong>in</strong>termediate” features now. I realize it can all seemcomplicated at first sight, but don’t be put off: the benefits far outweigh the addedcomplexity. Over time they become second nature. Now that you’ve got the Pair classas an example, it might be worth look<strong>in</strong>g over your own code base to see whetherthere are some patterns that you keep reimplement<strong>in</strong>g solely to use different types.With any large topic there is always more to learn. The next section will take youthrough the most important advanced topics <strong>in</strong> generics. If you’re feel<strong>in</strong>g a bit overwhelmedby now, you might want to skip to the relative comfort of section 3.5, where weexplore the generic collections provided <strong>in</strong> the framework. It’s well worth understand<strong>in</strong>gthe topics <strong>in</strong> the next section eventually, but if everyth<strong>in</strong>g so far has been new to youit wouldn’t hurt to skip it for the moment.3.4 Advanced genericsYou may be expect<strong>in</strong>g me to claim that <strong>in</strong> the rest of this chapter we’ll be cover<strong>in</strong>gevery aspect of generics that we haven’t looked at so far. However, there are so many littlenooks and crannies <strong>in</strong>volv<strong>in</strong>g generics, that’s simply not possible—or at least, I certa<strong>in</strong>lywouldn’t want to even read about all the details, let alone write about them.Fortunately the nice people at Microsoft and ECMA have written all the details <strong>in</strong> thefreely available language specification, 6 so if you ever want to check some obscure situationthat isn’t covered here, that should be your next port of call. Arguably if yourcode ends up <strong>in</strong> a corner case complicated enough that you need to consult the specificationto work out what it should do, you should refactor it <strong>in</strong>to a more obvious formanyway; you don’t want each ma<strong>in</strong>tenance eng<strong>in</strong>eer from now until eternity to have toread the specification.My aim with this section is to cover everyth<strong>in</strong>g you’re likely to want to know aboutgenerics. It talks more about the CLR and framework side of th<strong>in</strong>gs than the particularsyntax of the <strong>C#</strong> 2 language, although of course it’s all relevant when develop<strong>in</strong>g<strong>in</strong> <strong>C#</strong> 2. We’ll start by consider<strong>in</strong>g static members of generic types, <strong>in</strong>clud<strong>in</strong>g type <strong>in</strong>itialization.From there, it’s a natural step to wonder just how all this is implemented5The formula used for calculat<strong>in</strong>g the hash code based on the two “part” results comes from read<strong>in</strong>g EffectiveJava (Prentice Hall PTR, 2001) by Joshua Bloch. It certa<strong>in</strong>ly doesn’t guarantee a good distribution of hashcodes, but <strong>in</strong> my op<strong>in</strong>ion it’s better than us<strong>in</strong>g a bitwise exclusive OR. See Effective Java for more details, and<strong>in</strong>deed for many other useful tips.6http://www.ecma-<strong>in</strong>ternational.org/publications/standards/Ecma-334.htmLicensed to Rhona Hadida


86 CHAPTER 3 Parameterized typ<strong>in</strong>g with genericsunder the covers—although we won’t be go<strong>in</strong>g so deep that you need a flashlight.We’ll have a look at what happens when you enumerate a generic collection us<strong>in</strong>gforeach <strong>in</strong> <strong>C#</strong> 2, and round off the section by see<strong>in</strong>g how reflection <strong>in</strong> the .NETFramework is affected by generics.3.4.1 Static fields and static constructorsJust as <strong>in</strong>stance fields belong to an <strong>in</strong>stance, static fields belong to the type they’redeclared <strong>in</strong>. That is, if you declare a static field x <strong>in</strong> class SomeClass, there’s exactlyone SomeClass.x field, no matter how many <strong>in</strong>stances of SomeClass you create, andno matter how many types derive from SomeClass. 7 That’s the familiar scenario from<strong>C#</strong> 1—so how does it map across to generics?The answer is that each closed type has its own set of static fields. This is easiest tosee with an example. List<strong>in</strong>g 3.7 creates a generic type <strong>in</strong>clud<strong>in</strong>g a static field. We setthe field’s value for different closed types, and then pr<strong>in</strong>t out the values to show thatthey are separate.List<strong>in</strong>g 3.7Proof that different closed types have different static fieldsclass TypeWithField{public static str<strong>in</strong>g field;public static void Pr<strong>in</strong>tField(){Console.WriteL<strong>in</strong>e(field+": "+typeof(T).Name);}}...TypeWithField.field = "First";TypeWithField.field = "Second";TypeWithField.field = "Third";TypeWithField.Pr<strong>in</strong>tField();TypeWithField.Pr<strong>in</strong>tField();TypeWithField.Pr<strong>in</strong>tField();We set the value of each field to a different value, and pr<strong>in</strong>t out each field along withthe name of the type argument used for that closed type. Here’s the output from list<strong>in</strong>g3.7:First: Int32Second: Str<strong>in</strong>gThird: DateTimeSo the basic rule is “one static field per closed type.” The same applies for static <strong>in</strong>itializersand static constructors. However, it’s possible to have one generic type nested7Well, one per application doma<strong>in</strong>. For the purposes of this section, we’ll assume we’re only deal<strong>in</strong>g with oneapplication doma<strong>in</strong>. The concepts for different application doma<strong>in</strong>s work the same with generics as with nongenerictypes.Licensed to Rhona Hadida


Advanced generics87with<strong>in</strong> another, and types with multiple generic parameters. This sounds a lot morecomplicated, but it works as you probably th<strong>in</strong>k it should. List<strong>in</strong>g 3.8 shows this <strong>in</strong>action, this time us<strong>in</strong>g static constructors to show just how many types there are.List<strong>in</strong>g 3.8Static constructors with nested generic typesNote!Only 5 l<strong>in</strong>esof output…public class Outer{public class Inner{static Inner(){Console.WriteL<strong>in</strong>e("Outer.Inner",typeof(T).Name,typeof(U).Name,typeof(V).Name);}public static void DummyMethod(){}}}...Outer.Inner.DummyMethod();Outer.Inner.DummyMethod();Outer.Inner.DummyMethod();Outer.Inner.DummyMethod();Outer.Inner.DummyMethod();Outer.Inner.DummyMethod();Each different list of type arguments counts as a different closed type, sothe output of list<strong>in</strong>g 3.8 looks like this:Outer.InnerOuter.InnerOuter.InnerOuter.InnerOuter.InnerJust as with nongeneric types, the static constructor for any closed type is only executedonce, which is why the last l<strong>in</strong>e of list<strong>in</strong>g 3.8 doesn’t create a sixth l<strong>in</strong>e of output—thestatic constructor for Outer.Inner executed earlier,produc<strong>in</strong>g the second l<strong>in</strong>e of output. To clear up any doubts, if we had a nongenericPla<strong>in</strong>Inner class <strong>in</strong>side Outer, there would still have been one possibleOuter.Pla<strong>in</strong>Inner type per closed Outer type, so Outer.Pla<strong>in</strong>Inner wouldbe separate from Outer.Pla<strong>in</strong>Inner, with a separate set of static fields asseen earlier.Now that we’ve seen just what constitutes a different type, we should th<strong>in</strong>k aboutwhat the effects of that might be <strong>in</strong> terms of the amount of native code generated.And no, it’s not as bad as you might th<strong>in</strong>k…Licensed to Rhona Hadida


88 CHAPTER 3 Parameterized typ<strong>in</strong>g with generics3.4.2 How the JIT compiler handles genericsGiven that we have all of these different closed types, the JIT’s job is to convert the ILof the generic type <strong>in</strong>to native code so it can actually be run. In some ways, weshouldn’t care exactly how it does that—beyond keep<strong>in</strong>g a close eye on memory andCPU time, we wouldn’t see much difference if the JIT did the obvious th<strong>in</strong>g and generatednative code for each closed type separately, as if each one had noth<strong>in</strong>g to do withany other type. However, the JIT authors are clever enough that it’s worth see<strong>in</strong>g justwhat they’ve done.Let’s start with a simple situation first, with a s<strong>in</strong>gle type parameter—we’ll useList for the sake of convenience. The JIT creates different code for each value typeargument—<strong>in</strong>t, long, Guid, and the like—that we use. However, it shares the nativecode generated for all the closed types that use a reference type as the type argument,such as str<strong>in</strong>g, Stream, and Str<strong>in</strong>gBuilder. It can do this because all references are thesame size (they are 4 bytes on a 32-bit CLR and 8 bytes on a 64-bit CLR, but with<strong>in</strong> anyone CLR all references are the same size). An array of references will always be the samesize whatever the references happen to be. The space required on the stack for a referencewill always be the same. It can use the same register optimizations whatever typeis be<strong>in</strong>g used—the List goes on.Each of the types still has its own static fields as described <strong>in</strong> section 3.4.1, but thecode that is executed is reused. Of course, the JIT still does all of this lazily—it won’tgenerate the code for List before it needs to, and it will cache that code for allfuture uses of List. In theory, it’s possible to share code for at least some valuetypes. The JIT would have to be careful, not just due to size but also for garbage collectionreasons—it has to be able to quickly identify areas of a struct value that are livereferences. However, value types that are the same size and have the same <strong>in</strong>-memoryfootpr<strong>in</strong>t as far as the GC is concerned could share code. At the time of this writ<strong>in</strong>g,that’s been of sufficiently low priority that it hasn’t been implemented and it may wellstay that way.This level of detail is primarily of academic <strong>in</strong>terest, but it does have aslight performance impact <strong>in</strong> terms of more code be<strong>in</strong>g JIT compiled.However, the performance benefits of generics can be huge, and aga<strong>in</strong>that comes down to hav<strong>in</strong>g the opportunity to JIT to different code fordifferent types. Consider a List, for <strong>in</strong>stance. In .NET 1.1, add<strong>in</strong>g<strong>in</strong>dividual bytes to an ArrayList would have meant box<strong>in</strong>g eachone of them, and stor<strong>in</strong>g a reference to each boxed value. Us<strong>in</strong>gList has no such impact—List has a member of type T[] toreplace the object[] with<strong>in</strong> ArrayList, and that array is of the appropriate type,tak<strong>in</strong>g the appropriate space. So List has a straight byte[] with<strong>in</strong> it used tostore the elements of the array. (In many ways this makes a List behave likea MemoryStream.)Figure 3.3 shows an ArrayList and a List, each with the same six values.(The arrays themselves have more than six elements, to allow for growth. BothHighperformance—avoids box<strong>in</strong>gLicensed to Rhona Hadida


Advanced generics89ArrayListobject[]BoxedbytesListbyte[]elementsref5elements5refrefref3ref315countref15count106refref106121312......13Figure 3.3 Visual demonstration of why List takes up a lot less space than ArrayListwhen stor<strong>in</strong>g value typesList and ArrayList have a buffer, and they create a larger buffer when theyneed to.)The difference <strong>in</strong> efficiency here is <strong>in</strong>credible. Let’s look at the ArrayList first,consider<strong>in</strong>g a 32-bit CLR. 8 Each of the boxed bytes will take up 8 bytes of object overhead,plus 4 bytes (1 byte, rounded up to a word boundary) for the data itself. On topof that, you’ve got all the references themselves, each of which takes up 4 bytes. So foreach byte of useful data, we’re pay<strong>in</strong>g at least 16 bytes—and then there’s the extraunused space for references <strong>in</strong> the buffer.Compare this with the List. Each byte <strong>in</strong> the list takes up a s<strong>in</strong>gle bytewith<strong>in</strong> the elements array. There’s still “wasted” space <strong>in</strong> the buffer, wait<strong>in</strong>g to be usedpotentially by new items—but at least we’re only wast<strong>in</strong>g a s<strong>in</strong>gle byte per unused elementthere.We don’t just ga<strong>in</strong> space, but execution speed too. We don’t need the time taken toallocate the box, the type check<strong>in</strong>g <strong>in</strong>volved <strong>in</strong> unbox<strong>in</strong>g the bytes <strong>in</strong> order to get atthem, or the garbage collection of the boxes when they’re no longer referenced.We don’t have to go down to the CLR level to f<strong>in</strong>d th<strong>in</strong>gs happen<strong>in</strong>g transparentlyon our behalf, however. <strong>C#</strong> has always made life easier with syntactic shortcuts,and our next section looks at a familiar example but with a generic twist: iterat<strong>in</strong>gwith foreach.8When runn<strong>in</strong>g on a 64-bit CLR, the overheads are bigger.Licensed to Rhona Hadida


90 CHAPTER 3 Parameterized typ<strong>in</strong>g with generics3.4.3 Generic iterationOne of the most common operations you’ll want to perform on a collection (usuallyan array or a list) is to iterate through all its elements. The simplest way of do<strong>in</strong>g thatis usually to use the foreach statement. In <strong>C#</strong> 1 this relied on the collection eitherimplement<strong>in</strong>g the System.Collections.IEnumerable <strong>in</strong>terface or hav<strong>in</strong>g a similarGetEnumerator() method that returned a type with a suitable MoveNext() methodand Current property. The Current property didn’t have to be of type object—andthat was the whole po<strong>in</strong>t of hav<strong>in</strong>g these extra rules, which look odd on first sight. Yes,even <strong>in</strong> <strong>C#</strong> 1 you could avoid box<strong>in</strong>g and unbox<strong>in</strong>g dur<strong>in</strong>g iteration if you had acustom-made enumeration type.<strong>C#</strong> 2 makes this somewhat easier, as the rules for the foreach statement have beenextended to also use the System.Collections.Generic.IEnumerable <strong>in</strong>terfacealong with its partner, IEnumerator. These are simply the generic equivalents ofthe old enumeration <strong>in</strong>terfaces, and they’re used <strong>in</strong> preference to the nongeneric versions.That means that if you iterate through a generic collection of value type elements—List,for example—then no box<strong>in</strong>g is performed at all. If the old<strong>in</strong>terface had been used <strong>in</strong>stead, then although we wouldn’t have <strong>in</strong>curred the box<strong>in</strong>gcost while stor<strong>in</strong>g the elements of the list, we’d still have ended up box<strong>in</strong>g them whenwe retrieved them us<strong>in</strong>g foreach!All of this is done for you under the covers—all you need to do is use the foreachstatement <strong>in</strong> the normal way, us<strong>in</strong>g the type argument of the collection as the type ofthe iteration variable, and all will be well. That’s not the end of the story, however. Inthe relatively rare situation that you need to implement iteration over one of your owntypes, you’ll f<strong>in</strong>d that IEnumerable extends the old IEnumerable <strong>in</strong>terface, whichmeans you’ve got to implement two different methods:IEnumerator GetEnumerator();IEnumerator GetEnumerator();Can you see the problem? The methods differ only <strong>in</strong> return type, and the rules of<strong>C#</strong> prevent you from writ<strong>in</strong>g two such methods normally. If you th<strong>in</strong>k back to section2.2.2, we saw a similar situation—and we can use the same workaround. If youimplement IEnumerable us<strong>in</strong>g explicit <strong>in</strong>terface implementation, you can implementIEnumerable with a “normal” method. Fortunately, because IEnumeratorextends IEnumerator, you can use the same return value for both methods, andimplement the nongeneric method by just call<strong>in</strong>g the generic version. Of course,now you need to implement IEnumerator and you quickly run <strong>in</strong>to similar problems,this time with the Current property.List<strong>in</strong>g 3.9 gives a full example, implement<strong>in</strong>g an enumerable class that always justenumerates to the <strong>in</strong>tegers 0 to 9.List<strong>in</strong>g 3.9 A full generic enumeration—of the numbers 0 to 9class Count<strong>in</strong>gEnumerable: IEnumerable{public IEnumerator GetEnumerator()BImplementsIEnumerableimplicitlyLicensed to Rhona Hadida


Advanced generics91}{}return new Count<strong>in</strong>gEnumerator();IEnumerator IEnumerable.GetEnumerator(){return GetEnumerator();}CImplementsIEnumerableexplicitlyclass Count<strong>in</strong>gEnumerator : IEnumerator{<strong>in</strong>t current = -1;public bool MoveNext(){current++;return current < 10;}public <strong>in</strong>t Current{get { return current; }}object IEnumerator.Current{get { return Current; }}public void Reset(){current = -1;}public void Dispose(){}}...Count<strong>in</strong>gEnumerable counter = new Count<strong>in</strong>gEnumerable();foreach (<strong>in</strong>t x <strong>in</strong> counter){Console.WriteL<strong>in</strong>e(x);}DImplementsIEnumerator.CurrentimplicitlyEImplementsIEnumerator.CurrentexplicitlyProves thatenumerabletype worksClearly this isn’t useful <strong>in</strong> terms of the result, but it shows the little hoops you haveto go through <strong>in</strong> order to implement generic enumeration appropriately—at least ifyou’re do<strong>in</strong>g it all longhand. (And that’s without mak<strong>in</strong>g an effort to throw exceptionsif Current is accessed at an <strong>in</strong>appropriate time.) If you th<strong>in</strong>k that list<strong>in</strong>g 3.9looks like a lot of work just to pr<strong>in</strong>t out the numbers 0 to 9, I can’t help but agreewith you—and there’d be even more code if we wanted to iterate through anyth<strong>in</strong>guseful. Fortunately we’ll see <strong>in</strong> chapter 6 that <strong>C#</strong> 2 takes a large amount of the workaway from enumerators <strong>in</strong> many cases. I’ve shown the “full” version so you canappreciate the slight wr<strong>in</strong>kles that have been <strong>in</strong>troduced by the design decision forIEnumerable to extend IEnumerable.FLicensed to Rhona Hadida


92 CHAPTER 3 Parameterized typ<strong>in</strong>g with genericsWe only need the trick of us<strong>in</strong>g explicit <strong>in</strong>terface implementation twice—oncefor IEnumerable.GetEnumerator C, and once at IEnumerator.Current E. Both ofthese call their generic equivalents (B and D respectively). Another addition toIEnumerator is that it extends IDisposable, so you have to provide a Disposemethod. The foreach statement <strong>in</strong> <strong>C#</strong> 1 already called Dispose on an enumerator ifit implemented IDisposable, but <strong>in</strong> <strong>C#</strong> 2 there’s no execution time test<strong>in</strong>grequired—if the compiler f<strong>in</strong>ds that you’ve implemented IEnumerable, it createsan unconditional call to Dispose at the end of the loop (<strong>in</strong> a f<strong>in</strong>ally block).Many enumerators won’t actually need to dispose of anyth<strong>in</strong>g, but it’s nice to knowthat when it is required, the most common way of work<strong>in</strong>g through an enumerator(the foreach statement F) handles the call<strong>in</strong>g side automatically.We’ll now go from compile-time efficiency to execution-time flexibility: our f<strong>in</strong>aladvanced topic is reflection. Even <strong>in</strong> .NET 1.0/1.1 reflection could be a little tricky,but generic types and methods <strong>in</strong>troduce an extra level of complexity. The frameworkprovides everyth<strong>in</strong>g we need (with a bit of helpful syntax from <strong>C#</strong> 2 as a language),and although the additional considerations can be daunt<strong>in</strong>g, it’s not too bad if youtake it one step at a time.3.4.4 Reflection and genericsReflection is used by different people for all sorts of th<strong>in</strong>gs. You might use it forexecution-time <strong>in</strong>trospection of objects to perform a simple form of data b<strong>in</strong>d<strong>in</strong>g. Youmight use it to <strong>in</strong>spect a directory full of assemblies to f<strong>in</strong>d implementations of a plug<strong>in</strong><strong>in</strong>terface. You might write a file for an Inversion of Control 9 framework to load anddynamically configure your application’s components. As the uses of reflection are sodiverse, I won’t focus on any particular one but give you more general guidance on perform<strong>in</strong>gcommon tasks. We’ll start by look<strong>in</strong>g at the extensions to the typeof operator.USING TYPEOF WITH GENERIC TYPESReflection is all about exam<strong>in</strong><strong>in</strong>g objects and their types. As such, one of the mostimportant th<strong>in</strong>gs you need to be able to do is obta<strong>in</strong> a reference to the System.Typeobject, which allows access to all the <strong>in</strong>formation about a particular type. <strong>C#</strong> uses thetypeof operator to obta<strong>in</strong> such a reference for types known at compile time, and thishas been extended to encompass generic types.There are two ways of us<strong>in</strong>g typeof with generic types—one retrieves the generictype def<strong>in</strong>ition (<strong>in</strong> other words, the unbound generic type) and one retrieves a particularconstructed type. To obta<strong>in</strong> the generic type def<strong>in</strong>ition—that is, the type with noneof the type arguments specified—you simply take the name of the type as it wouldhave been declared and remove the type parameter names, keep<strong>in</strong>g any commas. Toretrieve constructed types, you specify the type arguments <strong>in</strong> the same way as youwould to declare a variable of the generic type. List<strong>in</strong>g 3.10 gives an example of bothuses. It uses a generic method so we can revisit how typeof can be used with a typeparameter, which we previously saw <strong>in</strong> list<strong>in</strong>g 3.7.9See http://www.mart<strong>in</strong>fowler.com/articles/<strong>in</strong>jection.htmlLicensed to Rhona Hadida


Advanced generics93List<strong>in</strong>g 3.10Us<strong>in</strong>g the typeof operator with type parametersstatic void DemonstrateTypeof(){Console.WriteL<strong>in</strong>e(typeof(X));Console.WriteL<strong>in</strong>e(typeof(List));Console.WriteL<strong>in</strong>e(typeof(Dictionary));Displays method’stype parameterConsole.WriteL<strong>in</strong>e(typeof(List));Console.WriteL<strong>in</strong>e(typeof(Dictionary));DisplaysgenerictypesConsole.WriteL<strong>in</strong>e(typeof(List));Console.WriteL<strong>in</strong>e(typeof(Dictionary));}...DemonstrateTypeof();Displays closedtypes (despiteus<strong>in</strong>g typeparameter)Most of list<strong>in</strong>g 3.10 is as you might naturally expect, but it’s worth po<strong>in</strong>t<strong>in</strong>g out twoth<strong>in</strong>gs. First, look at the syntax for obta<strong>in</strong><strong>in</strong>g the generic type def<strong>in</strong>ition of Dictionary. The comma <strong>in</strong> the angle brackets is required to effectively tell the compilerto look for the type with two type parameters: remember that there can be severalgeneric types with the same name, as long as they vary by the number of type parametersthey have. Similarly, you’d retrieve the generic type def<strong>in</strong>ition for MyClassus<strong>in</strong>g typeof(MyClass). The number of type parameters is specified <strong>in</strong> IL (and <strong>in</strong>full type names as far as the framework is concerned) by putt<strong>in</strong>g a back tick after the firstpart of the type name and then the number. The type parameters are then <strong>in</strong>dicated <strong>in</strong>square brackets <strong>in</strong>stead of the angle brackets we’re used to. For <strong>in</strong>stance, the second l<strong>in</strong>epr<strong>in</strong>ted ends with List`1[T], show<strong>in</strong>g that there is one type parameter, and the thirdl<strong>in</strong>e <strong>in</strong>cludes Dictionary`2[TKey,TValue].Second, note that wherever the method’s type parameter is used, the actual valueof the type argument is used at execution time. So the first l<strong>in</strong>e B pr<strong>in</strong>ts List`1 rather than List`1, which you might have expected. In otherwords, a type that is open at compile time may be closed at execution time. This is veryconfus<strong>in</strong>g. You should be aware of it <strong>in</strong> case you don’t get the results you expect, but otherwisedon’t worry. To retrieve a truly open constructed type at execution time, you need towork a bit harder. See the MSDN documentation for Type.IsGenericType for a suitablyconvoluted example.For reference, here’s the output of list<strong>in</strong>g 3.10:System.Int32System.Collections.Generic.List`1[T]System.Collections.Generic.Dictionary`2[TKey,TValue]System.Collections.Generic.List`1[System.Int32]System.Collections.Generic.Dictionary`2[System.Str<strong>in</strong>g,System.Int32]System.Collections.Generic.List`1[System.Int64]System.Collections.Generic.Dictionary`2[System.Int64,System.Guid]Hav<strong>in</strong>g retrieved an object represent<strong>in</strong>g a generic type, there are many “next steps”you can take. All the previously available ones (f<strong>in</strong>d<strong>in</strong>g the members of the type, creat<strong>in</strong>gan <strong>in</strong>stance, and so on) are still present—although some are not applicable forBDisplaysclosed typesLicensed to Rhona Hadida


94 CHAPTER 3 Parameterized typ<strong>in</strong>g with genericsgeneric type def<strong>in</strong>itions—and there are new ones as well that let you <strong>in</strong>quire about thegeneric nature of the type.METHODS AND PROPERTIES OF SYSTEM.TYPEThere are far too many new methods and properties to look at them all <strong>in</strong> detail, butthere are two particularly important ones: GetGenericTypeDef<strong>in</strong>ition and Make-GenericType. They are effectively opposites—the first acts on a constructed type,retriev<strong>in</strong>g the generic type def<strong>in</strong>ition; the second acts on a generic type def<strong>in</strong>ition andreturns a constructed type. Arguably it would have been clearer if this method hadbeen called ConstructGenericType, MakeConstructedType, or some other name withconstruct or constructed <strong>in</strong> it, but we’re stuck with what we’ve got.Just like normal types, there is only one Type object for any particular type—so call<strong>in</strong>gMakeGenericType twice with the same types as parameters will return the same referencetwice, and call<strong>in</strong>g GetGenericTypeDef<strong>in</strong>ition on two types constructed fromthe same generic type def<strong>in</strong>ition will likewise give the same result for both calls.Another method—this time one which already existed <strong>in</strong> .NET 1.1—that is worthexplor<strong>in</strong>g is Type.GetType, and its related Assembly.GetType method, both of whichprovide a dynamic equivalent to typeof. You might expect to be able to feed each l<strong>in</strong>eof the output of list<strong>in</strong>g 3.10 to the GetType method called on an appropriate assembly,but unfortunately life isn’t quite that straightforward. It’s f<strong>in</strong>e for closed constructedtypes—the type arguments just go <strong>in</strong> square brackets. For generic type def<strong>in</strong>itions,however, you need to remove the square brackets entirely—otherwise GetType th<strong>in</strong>ksyou mean an array type. List<strong>in</strong>g 3.11 shows all of these methods <strong>in</strong> action.List<strong>in</strong>g 3.11Various ways of retriev<strong>in</strong>g generic and constructed Type objectsstr<strong>in</strong>g listTypeName = "System.Collections.Generic.List`1";Type defByName = Type.GetType(listTypeName);Type closedByName = Type.GetType(listTypeName+"[System.Str<strong>in</strong>g]");Type closedByMethod = defByName.MakeGenericType(typeof(str<strong>in</strong>g));Type closedByTypeof = typeof(List);Console.WriteL<strong>in</strong>e (closedByMethod==closedByName);Console.WriteL<strong>in</strong>e (closedByName==closedByTypeof);Type defByTypeof = typeof(List);Type defByMethod = closedByName.GetGenericTypeDef<strong>in</strong>ition();Console.WriteL<strong>in</strong>e (defByMethod==defByName);Console.WriteL<strong>in</strong>e (defByName==defByTypeof);The output of list<strong>in</strong>g 3.11 is just True four times, validat<strong>in</strong>g that however you obta<strong>in</strong> areference to a particular type object, there is only one such object <strong>in</strong>volved.As I mentioned earlier, there are many new methods and properties on Type,such as GetGenericArguments, IsGenericTypeDef<strong>in</strong>ition, and IsGenericType. Thedocumentation for IsGenericType is probably the best start<strong>in</strong>g po<strong>in</strong>t for furtherexploration.Licensed to Rhona Hadida


Advanced generics95REFLECTING GENERIC METHODSGeneric methods have a similar (though smaller) set of additional properties andmethods. List<strong>in</strong>g 3.12 gives a brief demonstration of this, call<strong>in</strong>g a generic method byreflection.List<strong>in</strong>g 3.12Retriev<strong>in</strong>g and <strong>in</strong>vok<strong>in</strong>g a generic method with reflectionpublic static void Pr<strong>in</strong>tTypeParameter(){Console.WriteL<strong>in</strong>e (typeof(T));}...Type type = typeof(Snippet);MethodInfo def<strong>in</strong>ition = type.GetMethod("Pr<strong>in</strong>tTypeParameter");MethodInfo constructed;constructed = def<strong>in</strong>ition.MakeGenericMethod(typeof(str<strong>in</strong>g));constructed.Invoke(null, null);First we retrieve the generic method def<strong>in</strong>ition, and then we make a constructedgeneric method us<strong>in</strong>g MakeGenericMethod. As with types, we could go the other way ifwe wanted to—but unlike Type.GetType, there is no way of specify<strong>in</strong>g a constructedmethod <strong>in</strong> the GetMethod call. The framework also has a problem if there are methodsthat are overloaded purely by number of type parameters—there are no methods<strong>in</strong> Type that allow you to specify the number of type parameters, so <strong>in</strong>stead you’d haveto call Type.GetMethods and f<strong>in</strong>d the right one by look<strong>in</strong>g through all the methods.After retriev<strong>in</strong>g the constructed method, we <strong>in</strong>voke it. The arguments <strong>in</strong> this exampleare both null as we’re <strong>in</strong>vok<strong>in</strong>g a static method that doesn’t take any “normal”parameters. The output is System.Str<strong>in</strong>g, as we’d expect.Note that the methods retrieved from generic type def<strong>in</strong>itions cannot be <strong>in</strong>vokeddirectly—<strong>in</strong>stead, you must get the method from a constructed type. This applies toboth generic methods and nongeneric methods.Aga<strong>in</strong>, more methods and properties are available on MethodInfo, and IsGeneric-Method is a good start<strong>in</strong>g po<strong>in</strong>t <strong>in</strong> MSDN. Hopefully the <strong>in</strong>formation <strong>in</strong> this section willhave been enough to get you go<strong>in</strong>g, though—and to po<strong>in</strong>t out some of the added complexitiesyou might not have otherwise anticipated when first start<strong>in</strong>g to access generictypes and methods with reflection.That’s all we’re go<strong>in</strong>g to cover <strong>in</strong> the way of advanced features. Just to reiterate,this is not meant to have been an absolutely complete guide by any means—but mostdevelopers are unlikely to need to know the more obscure areas. I hope for your sakethat you fall <strong>in</strong>to this camp, as specifications tend to get harder to read the deeper yougo <strong>in</strong>to them. Remember that unless you’re work<strong>in</strong>g alone and just for yourself,you’re unlikely to be the only one to work on your code. If you need features that aremore complex than the ones demonstrated here, you almost certa<strong>in</strong>ly shouldn’tassume that anyone read<strong>in</strong>g your code will understand it without help. On the otherhand, if you f<strong>in</strong>d that your coworkers don’t know about some of the topics we’ve coveredso far, please feel free to direct them to the nearest bookshop…Licensed to Rhona Hadida


96 CHAPTER 3 Parameterized typ<strong>in</strong>g with genericsThe next section is much more down to earth than our <strong>in</strong>vestigations <strong>in</strong>to reflectionand the bowels of the JIT. It covers the most common use of generics: the standardcollection classes.3.5 Generic collection classes <strong>in</strong> .NET 2.0Although this book is primarily about <strong>C#</strong> as a language, it would be foolish to ignore thefact that <strong>C#</strong> is almost always used with<strong>in</strong> the .NET Framework, and that <strong>in</strong> order to usethe language effectively you’ll need to have a certa<strong>in</strong> amount of knowledge of thelibraries too. I won’t be go<strong>in</strong>g <strong>in</strong>to the details of ADO.NET, ASP.NET, and the like, butyou’re bound to use collections <strong>in</strong> almost any .NET program of any size. This sectionwill cover the core collections found <strong>in</strong> the System.Collections.Generic namespace.We’ll start <strong>in</strong> familiar territory with List.3.5.1 ListWe’ve already seen List several times. Broadly speak<strong>in</strong>g, it’s the generic equivalentof the nongeneric ArrayList type, which has been a part of .NET from the wordgo. There are some new features, and a few operations <strong>in</strong> ArrayList didn’t make it toList. Most of the features that have been removed from List have also beenremoved from other collections, so we’ll cover them here and then just refer to themlater on when talk<strong>in</strong>g about the other collections. Many of the new features <strong>in</strong>List (beyond “be<strong>in</strong>g generic”) aren’t available <strong>in</strong> the other generic collections.The comb<strong>in</strong>ation of these factors leads to our discussion of List be<strong>in</strong>g the longest<strong>in</strong> this section—but then it’s probably the most widely used collection <strong>in</strong> real-lifecode, too. When you th<strong>in</strong>k of us<strong>in</strong>g a list of data items <strong>in</strong> your code, List is thedefault choice.I won’t bore you with the most common operations (add<strong>in</strong>g, remov<strong>in</strong>g, fetch<strong>in</strong>g,and replac<strong>in</strong>g items) but will merely po<strong>in</strong>t out that List makes itself available <strong>in</strong> alarge number of situations us<strong>in</strong>g old APIs by implement<strong>in</strong>g IList as well as IList.Enough of look<strong>in</strong>g backward, though—let’s see what’s new.NEW FEATURES OF LISTThe new methods available with<strong>in</strong> List are all powered by generics—<strong>in</strong> particular,generic delegates. This is part of a general trend toward us<strong>in</strong>g delegates more heavily<strong>in</strong> the framework, which has been made simpler by the improvements <strong>in</strong> delegate syntaxavailable <strong>in</strong> <strong>C#</strong> 2. (There would have been little po<strong>in</strong>t <strong>in</strong> add<strong>in</strong>g lots of delegatespecificfeatures <strong>in</strong>to the framework with the syntax be<strong>in</strong>g as clunky as it was <strong>in</strong> <strong>C#</strong> 1.)We can now do the follow<strong>in</strong>g:■ Convert each element of the list to a different type, result<strong>in</strong>g <strong>in</strong> a new list(ConvertAll).■ Check whether any of the elements <strong>in</strong> the list match a given predicate (Exists).■ Check whether all of the elements <strong>in</strong> the list match a given predicate(TrueForAll).■ F<strong>in</strong>d the first, last, or all elements <strong>in</strong> the list match<strong>in</strong>g a predicate (F<strong>in</strong>dXXX).Licensed to Rhona Hadida


Generic collection classes <strong>in</strong> .NET 2.097■Remove all elements <strong>in</strong> the list match<strong>in</strong>g a given predicate (RemoveAll).■ Perform a given action on each element on the list (ForEach). 10We’ve already seen the ConvertAll method <strong>in</strong> list<strong>in</strong>g 3.2, but there are two more delegatetypes that are very important for this extra functionality: Predicate andAction, which have the follow<strong>in</strong>g signatures:public delegate bool Predicate (T obj)public delegate void Action (T obj)A predicate is a way of test<strong>in</strong>g whether a value matches a criterion. For <strong>in</strong>stance, youcould have a predicate that tested for str<strong>in</strong>gs hav<strong>in</strong>g a length greater than 5, or onethat tested whether an <strong>in</strong>teger was even. An action does exactly what you might expectit to—performs an action with the specified value. You might pr<strong>in</strong>t the value to theconsole, add it to another collection—whatever you want.For simple examples, most of the methods listed here are easily achieved with aforeach loop. However, us<strong>in</strong>g a delegate allows the behavior to come from somewhereother than the immediate code <strong>in</strong> the foreach loop. With the improvements todelegates <strong>in</strong> <strong>C#</strong> 2, it can also be a bit simpler than the loop.List<strong>in</strong>g 3.13 shows the last two methods—ForEach and RemoveAll—<strong>in</strong> action. Wetake a list of the <strong>in</strong>tegers from 2 to 100, remove multiples of 2, then multiples of 3, andso forth up to 10, f<strong>in</strong>ally list<strong>in</strong>g the numbers. You may well recognize this as a slightvariation on the “Sieve of Eratosthenes” method of f<strong>in</strong>d<strong>in</strong>g prime numbers. I’ve usedthe streaml<strong>in</strong>ed method of creat<strong>in</strong>g delegates to make the example more realistic.Even though we haven’t covered the syntax yet (you can peep ahead to chapter 5 ifyou want to get the details), it should be fairly obvious what’s go<strong>in</strong>g on here.List<strong>in</strong>g 3.13Pr<strong>in</strong>t<strong>in</strong>g primes us<strong>in</strong>g RemoveAll and ForEach from ListList candidates = new List();for (<strong>in</strong>t i=2; i


98 CHAPTER 3 Parameterized typ<strong>in</strong>g with genericsList<strong>in</strong>g 3.13 starts off by just creat<strong>in</strong>g a list of all the <strong>in</strong>tegers between 2 and 100 <strong>in</strong>clusiveB—noth<strong>in</strong>g spectacular here, although once aga<strong>in</strong> I should po<strong>in</strong>t out thatthere’s no box<strong>in</strong>g <strong>in</strong>volved. The delegate used <strong>in</strong> step C is a Predicate , andthe one used <strong>in</strong> D is an Action. One po<strong>in</strong>t to note is how simple the use ofRemoveAll is. Because you can’t change the contents of a collection while iterat<strong>in</strong>gover it, the typical ways of remov<strong>in</strong>g multiple elements from a list have previously beenas follows:■ Iterate us<strong>in</strong>g the <strong>in</strong>dex <strong>in</strong> ascend<strong>in</strong>g order, decrement<strong>in</strong>g the <strong>in</strong>dex variablewhenever you remove an element.■ Iterate us<strong>in</strong>g the <strong>in</strong>dex <strong>in</strong> descend<strong>in</strong>g order to avoid excessive copy<strong>in</strong>g.■ Create a new list of the elements to remove, and then iterate through the newlist, remov<strong>in</strong>g each element <strong>in</strong> turn from the old list.None of these is particularly satisfactory—the predicate approach is much neater, giv<strong>in</strong>gemphasis to what you want to achieve rather than how exactly it should happen. It’s agood idea to experiment with predicates a bit to get comfortable with them, particularlyif you’re likely to be us<strong>in</strong>g <strong>C#</strong> 3 <strong>in</strong> a production sett<strong>in</strong>g any time <strong>in</strong> the near future—thismore functional style of cod<strong>in</strong>g is go<strong>in</strong>g to be <strong>in</strong>creas<strong>in</strong>gly important over time.Next we’ll have a brief look at the methods that are present <strong>in</strong> ArrayList but notList, and consider why that might be the case.FEATURES “MISSING” FROM LISTA few methods <strong>in</strong> ArrayList have been shifted around a little—the static ReadOnlymethod is replaced by the AsReadOnly <strong>in</strong>stance method, and TrimToSize is nearlyreplaced by TrimExcess (the difference is that TrimExcess won’t do anyth<strong>in</strong>g if thesize and capacity are nearly the same anyway). There are a few genu<strong>in</strong>ely “miss<strong>in</strong>g”pieces of functionality, however. These are listed, along with the suggestedworkaround, <strong>in</strong> table 3.3.Table 3.3Methods from ArrayList with no direct equivalent <strong>in</strong> ListArrayList methodWay of achiev<strong>in</strong>g similar effectAdapterCloneFixedSizeRepeatSetRangeSynchronizedNone providedlist.GetRange (0, list.Count) or new List(list)Nonefor loop or write a replacement generic methodfor loop or write a replacement generic methodSynchronizedCollectionThe Synchronized method was a bad idea <strong>in</strong> ArrayList to start with, <strong>in</strong> my view. Mak<strong>in</strong>g<strong>in</strong>dividual calls to a collection doesn’t make the collection thread-safe, because somany operations (the most common is iterat<strong>in</strong>g over the collection) <strong>in</strong>volve multipleLicensed to Rhona Hadida


Generic collection classes <strong>in</strong> .NET 2.099calls. To make those operations thread-safe, the collection needs to be locked for theduration of the operation. (It requires cooperation from other code us<strong>in</strong>g the samecollection, of course.) In short, the Synchronized method gave the appearance ofsafety without the reality. It’s better not to give the wrong impression <strong>in</strong> the firstplace—developers just have to be careful when work<strong>in</strong>g with collections accessed <strong>in</strong>multiple threads. SynchronizedCollection performs broadly the same role as asynchronized ArrayList. I would argue that it’s still not a good idea to use this, for thereasons outl<strong>in</strong>ed <strong>in</strong> this paragraph—the safety provided is largely illusory. Ironically,this would be a great collection to support a ForEach method, where it could automaticallyhold the lock for the duration of the iteration over the collection—but there’sno such method.That completes our coverage of List. The next collection under the microscopeis Dictionary, which we’ve already seen so much of.3.5.2 DictionaryThere is less to say about Dictionary (just called Dictionary forthe rest of this section, for simplicity) than there was about List, although it’sanother heavily used type. As stated earlier, it’s the generic replacement for Hashtableand the related classes, such as Str<strong>in</strong>gDictionary. There aren’t many features present<strong>in</strong> Dictionary that aren’t <strong>in</strong> Hashtable, although this is partly because the ability tospecify a comparison <strong>in</strong> the form of an IEqualityComparer was added to Hashtable <strong>in</strong>.NET 2.0. This allows for th<strong>in</strong>gs like case-<strong>in</strong>sensitive comparisons of str<strong>in</strong>gs withoutus<strong>in</strong>g a separate type of dictionary. IEqualityComparer and its generic equivalent,IEqualityComparer, have both Equals and GetHashCode. Prior to .NET 2.0 thesewere split <strong>in</strong>to IComparer (which had to give an order<strong>in</strong>g, not just test for equality) andIHashCodeProvider. This separation was awkward, hence the move to IEquality-Comparer for 2.0. Dictionary exposes its IEqualityComparer <strong>in</strong> the publicComparer property.The most important difference between Dictionary and Hashtable (beyond thenormal benefits of generics) is their behavior when asked to fetch the value associatedwith a key that they don’t know about. When presented with a key that isn’t <strong>in</strong> themap, the <strong>in</strong>dexer of Hashtable will just return null. By contrast, Dictionary willthrow a KeyNotFoundException. Both of them support the Conta<strong>in</strong>sKey method totell beforehand whether a given key is present. Dictionary also providesTryGetValue, which retrieves the value if a suitable entry is present, stor<strong>in</strong>g it <strong>in</strong> theoutput parameter and return<strong>in</strong>g true. If the key is not present, TryGetValue will setthe output parameter to the default value of TValue and return false. This avoidshav<strong>in</strong>g to search for the key twice, while still allow<strong>in</strong>g the caller to dist<strong>in</strong>guish betweenthe situation where a key isn’t present at all, and the one where it’s present but its associatedvalue is the default value of TValue. Mak<strong>in</strong>g the <strong>in</strong>dexer throw an exception isof more debatable merit, but it does make it very clear when a lookup has failed<strong>in</strong>stead of mask<strong>in</strong>g the failure by return<strong>in</strong>g a potentially valid value.Licensed to Rhona Hadida


100 CHAPTER 3 Parameterized typ<strong>in</strong>g with genericsJust as with List, there is no way of obta<strong>in</strong><strong>in</strong>g a synchronized Dictionary,nor does it implement ICloneable. The dictionary equivalent of Synchronized-Collection is SynchronizedKeyedCollection (which <strong>in</strong> fact derives fromSynchronizedCollection).With the lack of additional functionality, another example of Dictionarywould be relatively po<strong>in</strong>tless. Let’s move on to two types that are closely related toeach other: Queue and Stack.3.5.3 Queue and StackThe generic queue and stack classes are essentially the same as their nongeneric counterparts.The same features are “miss<strong>in</strong>g” from the generic versions as with the othercollections—lack of clon<strong>in</strong>g, and no way of creat<strong>in</strong>g a synchronized version. Asbefore, the two types are closely related—both act as lists that don’t allow randomaccess, <strong>in</strong>stead only allow<strong>in</strong>g elements to be removed <strong>in</strong> a certa<strong>in</strong> order. Queues act <strong>in</strong>a first <strong>in</strong>, first out (FIFO) fashion, while stacks have last <strong>in</strong>, first out (LIFO) semantics.Both have Peek methods that return the next element that would be removed butwithout actually remov<strong>in</strong>g it. This behavior is demonstrated <strong>in</strong> list<strong>in</strong>g 3.14.List<strong>in</strong>g 3.14Demonstration of Queue and StackQueue queue = new Queue();Stack stack = new Stack();for (<strong>in</strong>t i=0; i < 10; i++){queue.Enqueue(i);stack.Push(i);}for (<strong>in</strong>t i=0; i < 10; i++){Console.WriteL<strong>in</strong>e ("Stack:{0} Queue:{1}",stack.Pop(), queue.Dequeue());}The output of list<strong>in</strong>g 3.14 is as follows:Stack:9 Queue:0Stack:8 Queue:1Stack:7 Queue:2Stack:6 Queue:3Stack:5 Queue:4Stack:4 Queue:5Stack:3 Queue:6Stack:2 Queue:7Stack:1 Queue:8Stack:0 Queue:9You can enumerate Stack and Queue <strong>in</strong> the same way as with a list, but <strong>in</strong> myexperience this is used relatively rarely. Most of the uses I’ve seen have <strong>in</strong>volved athread-safe wrapper be<strong>in</strong>g put around either class, enabl<strong>in</strong>g a producer/consumerLicensed to Rhona Hadida


Generic collection classes <strong>in</strong> .NET 2.0101pattern for multithread<strong>in</strong>g. This is not particularly hard to write, and third-partyimplementations are available, but hav<strong>in</strong>g these classes directly available <strong>in</strong> the frameworkwould be more welcome.Next we’ll look at the generic versions of SortedList, which are similar enough tobe tw<strong>in</strong>s.3.5.4 SortedList and SortedDictionaryThe nam<strong>in</strong>g of SortedList has always bothered me. It feels more like a map or dictionarythan a list. You can access the elements by <strong>in</strong>dex as you can for other lists(although not with an <strong>in</strong>dexer)—but you can also access the value of each element(which is a key/value pair) by key. The important part of SortedList is that when youenumerate it, the entries come out sorted by key. Indeed, a common way of us<strong>in</strong>gSortedList is to access it as a map when writ<strong>in</strong>g to it, but then enumerate the entries<strong>in</strong> order.There are two generic classes that map to the same sort of behavior: Sorted-List and SortedDictionary. (From here on I’ll justcall them SortedList and SortedDictionary to save space.) They’re very similar<strong>in</strong>deed—it’s mostly the performance that differs. SortedList uses less memory,but SortedDictionary is faster <strong>in</strong> the general case when it comes to add<strong>in</strong>g entries.However, if you add them <strong>in</strong> the sort order of the keys to start with, SortedListwill be faster.NOTEA difference of limited benefit—SortedList allows you to f<strong>in</strong>d the <strong>in</strong>dex ofa particular key or value us<strong>in</strong>g IndexOfKey and IndexOfValue, and toremove an entry by <strong>in</strong>dex with RemoveAt. To retrieve an entry by <strong>in</strong>dex,however, you have to use the Keys or Values properties, which implementIList and IList, respectively. The nongeneric versionsupports more direct access, and a private method exists <strong>in</strong> the generic version,but it’s not much use while it’s private. SortedDictionary doesn’tsupport any of these operations.If you want to see either of these classes <strong>in</strong> action, use list<strong>in</strong>g 3.1 as a good start<strong>in</strong>gpo<strong>in</strong>t. Just chang<strong>in</strong>g Dictionary to SortedDictionary or SortedList will ensure thatthe words are pr<strong>in</strong>ted <strong>in</strong> alphabetical order, for example.Our f<strong>in</strong>al collection class is genu<strong>in</strong>ely new, rather than a generic version of anexist<strong>in</strong>g nongeneric type. It’s that staple of computer science courses everywhere: thel<strong>in</strong>ked list.3.5.5 L<strong>in</strong>kedListI suspect you know what a l<strong>in</strong>ked list is. Instead of keep<strong>in</strong>g an array that is quick toaccess but slow to <strong>in</strong>sert <strong>in</strong>to, a l<strong>in</strong>ked list stores its data by build<strong>in</strong>g up a cha<strong>in</strong> ofnodes, each of which is l<strong>in</strong>ked to the next one. Doubly l<strong>in</strong>ked lists (likeL<strong>in</strong>kedList) store a l<strong>in</strong>k to the previous node as well as the next one, so you caneasily iterate backward as well as forward.Licensed to Rhona Hadida


102 CHAPTER 3 Parameterized typ<strong>in</strong>g with genericsL<strong>in</strong>ked lists make it easy to <strong>in</strong>sert another node <strong>in</strong>to the cha<strong>in</strong>—as long as youalready have a handle on the node represent<strong>in</strong>g the <strong>in</strong>sertion position. All the listneeds to do is create a new node, and make the appropriate l<strong>in</strong>ks between that nodeand the ones that will be before and after it. Lists stor<strong>in</strong>g all their data <strong>in</strong> a pla<strong>in</strong> array(as List does) need to move all the entries that will come after the new one, whichcan be very expensive—and if the array runs out of spare capacity, the whole lot mustbe copied. Enumerat<strong>in</strong>g a l<strong>in</strong>ked list from start to end is also cheap—but randomaccess (fetch<strong>in</strong>g the fifth element, then the thousandth, then the second) is slowerthan us<strong>in</strong>g an array-backed list. Indeed, L<strong>in</strong>kedList doesn’t even provide a randomaccess method or <strong>in</strong>dexer. Despite its name, it doesn’t implement IList.L<strong>in</strong>ked lists are usually more expensive <strong>in</strong> terms of memory than their array-backedcous<strong>in</strong>s due to the extra l<strong>in</strong>k node required for each value. However, they don’t havethe “wasted” space of the spare array capacity of List.The l<strong>in</strong>ked list implementation <strong>in</strong> .NET 2.0 is a relatively pla<strong>in</strong> one—it doesn’t supportcha<strong>in</strong><strong>in</strong>g two lists together to form a larger one, or splitt<strong>in</strong>g an exist<strong>in</strong>g one <strong>in</strong>totwo, for example. However, it can still be useful if you want fast <strong>in</strong>sertions at both thestart and end of the list (or <strong>in</strong> between if you keep a reference to the appropriate node),and only need to read the values from start to end, or vice versa.Our f<strong>in</strong>al ma<strong>in</strong> section of the chapter looks at some of the limitations of generics<strong>in</strong> <strong>C#</strong> and considers similar features <strong>in</strong> other languages.3.6 Limitations of generics <strong>in</strong> <strong>C#</strong> and other languagesThere is no doubt that generics contribute a great deal to <strong>C#</strong> <strong>in</strong> terms of expressiveness,type safety, and performance. The feature has been carefully designed to copewith most of the tasks that C++ programmers typically used templates for, but withoutsome of the accompany<strong>in</strong>g disadvantages. However, this is not to say limitations don’texist. There are some problems that C++ templates solve with ease but that <strong>C#</strong> genericscan’t help with. Similarly, while generics <strong>in</strong> Java are generally less powerful than <strong>in</strong><strong>C#</strong>, there are some concepts that can be expressed <strong>in</strong> Java but that don’t have a <strong>C#</strong>equivalent. This section will take you through some of the most commonly encounteredweaknesses, as well as briefly compare the <strong>C#</strong>/.NET implementation of genericswith C++ templates and Java generics.It’s important to stress that po<strong>in</strong>t<strong>in</strong>g out these snags does not imply that theyshould have been avoided <strong>in</strong> the first place. In particular, I’m <strong>in</strong> no way say<strong>in</strong>g that Icould have done a better job! The language and platform designers have had to balancepower with complexity (and the small matter of achiev<strong>in</strong>g both design andimplementation with<strong>in</strong> a reasonable timescale). It’s possible that future improvementswill either remove some of these issues or lessen their impact. Most likely, youwon’t encounter problems, and if you do, you’ll be able to work around them with theguidance given here.We’ll start with the answer to a question that almost everyone raises sooner or later:why can’t I convert a List to List?Licensed to Rhona Hadida


Limitations of generics <strong>in</strong> <strong>C#</strong> and other languages1033.6.1 Lack of covariance and contravarianceIn section 2.3.2, we looked at the covariance of arrays—the fact that an array of a referencetype can be viewed as an array of its base type, or an array of any of the <strong>in</strong>terfacesit implements. Generics don’t support this—they are <strong>in</strong>variant. This is for the sake oftype safety, as we’ll see, but it can be annoy<strong>in</strong>g.WHY DON’T GENERICS SUPPORT COVARIANCE?Let’s suppose we have two classes, Animal and Cat, where Cat derives from Animal. Inthe code that follows, the array code (on the left) is valid <strong>C#</strong> 2; the generic code (onthe right) isn’t:Valid (at compile-time):Animal[] animals = new Cat[5];animals[0] = new Animal();Invalid:List animals=new List();animals.Add(new Animal());The compiler has no problem with the second l<strong>in</strong>e <strong>in</strong> either case, but the first l<strong>in</strong>e onthe right causes the error:error CS0029: Cannot implicitly convert type'System.Collections.Generic.List' to'System.Collections.Generic.List'This was a deliberate choice on the part of the framework and language designers. Theobvious question to ask is why this is prohibited—and the answer lies on the secondl<strong>in</strong>e. There is noth<strong>in</strong>g about the second l<strong>in</strong>e that should raise any suspicion. After all,List effectively has a method with the signature void Add(Animal value)—you should be able to put a Turtle <strong>in</strong>to any list of animals, for <strong>in</strong>stance. However, theactual object referred to by animals is a Cat[] (<strong>in</strong> the code on the left) or a List(on the right), both of which require that only references to <strong>in</strong>stances of Cat are stored<strong>in</strong> them. Although the array version will compile, it will fail at execution time. This wasdeemed by the designers of generics to be worse than fail<strong>in</strong>g at compile time, which isreasonable—the whole po<strong>in</strong>t of static typ<strong>in</strong>g is to f<strong>in</strong>d out about errors before the codeever gets run.NOTE So why are arrays covariant? Hav<strong>in</strong>g answered the question about whygenerics are <strong>in</strong>variant, the next obvious step is to question why arrays arecovariant. Accord<strong>in</strong>g to the Common Language Infrastructure AnnotatedStandard (Addison-Wesley Professional, 2003), for the first edition thedesigners wished to reach as broad an audience as possible, which <strong>in</strong>cludedbe<strong>in</strong>g able to run code compiled from Java source. In other words, .NET hascovariant arrays because Java has covariant arrays—despite this be<strong>in</strong>g aknown “wart” <strong>in</strong> Java.So, that’s why th<strong>in</strong>gs are the way they are—but why should you care, and how can youget around the restriction?Licensed to Rhona Hadida


104 CHAPTER 3 Parameterized typ<strong>in</strong>g with genericsWHERE COVARIANCE WOULD BE USEFULSuppose you are implement<strong>in</strong>g a platform-agnostic storage system, 11 which could runacross WebDAV, NFS, Samba, NTFS, ReiserFS, files <strong>in</strong> a database, you name it. You mayhave the idea of storage locations, which may conta<strong>in</strong> sublocations (th<strong>in</strong>k of directoriesconta<strong>in</strong><strong>in</strong>g files and more directories, for <strong>in</strong>stance). You could have an <strong>in</strong>terface like this:■■■■public <strong>in</strong>terface IStorageLocation{Stream OpenForRead();...IEnumerable GetSublocations();}That all seems reasonable and easy to implement. The problem comes when yourimplementation (FabulousStorageLocation for <strong>in</strong>stance) stores its list of sublocationsfor any particular location as List. You mightexpect to be able to either return the list reference directly, or possibly call AsRead-Only to avoid clients tamper<strong>in</strong>g with your list, and return the result—but that wouldbe an implementation of IEnumerable <strong>in</strong>stead of anIEnumerable.Here are some options:■ Make your list a List <strong>in</strong>stead. This is likely to mean you needto cast every time you fetch an entry <strong>in</strong> order to get at your implementationspecificbehavior. You might as well not be us<strong>in</strong>g generics <strong>in</strong> the first place.Implement GetSublocations us<strong>in</strong>g the funky new iteration features of <strong>C#</strong> 2, asdescribed <strong>in</strong> chapter 6. That happens to work <strong>in</strong> this example, because the<strong>in</strong>terface uses IEnumerable. It wouldn’t work if we had toreturn an IList <strong>in</strong>stead. It also requires each implementationto have the same k<strong>in</strong>d of code. It’s only a few l<strong>in</strong>es, but it’s still <strong>in</strong>elegant.Create a new copy of the list, this time as List. In somecases (particularly if the <strong>in</strong>terface did require you to return an IList), this would be a good th<strong>in</strong>g to do anyway—it keeps thelist returned separate from the <strong>in</strong>ternal list. You could even use List.Convert-All to do it <strong>in</strong> a s<strong>in</strong>gle l<strong>in</strong>e. It <strong>in</strong>volves copy<strong>in</strong>g everyth<strong>in</strong>g <strong>in</strong> the list, though,which may be an unnecessary expense if you trust your callers to use thereturned list reference appropriately.Make the <strong>in</strong>terface generic, with the type parameter represent<strong>in</strong>g the actual typeof storage sublocation be<strong>in</strong>g represented. For <strong>in</strong>stance, FabulousStorage-Location might implement IStorageLocation.It looks a little odd, but this recursive-look<strong>in</strong>g use of generics can be quite usefulat times. 12Create a generic helper method (preferably <strong>in</strong> a common class library) thatconverts IEnumerator to IEnumerator, where TSourcederives from TDest.11 Yes, another one.12 For <strong>in</strong>stance, you might have a type parameter T with a constra<strong>in</strong>t that any <strong>in</strong>stance can be compared to another<strong>in</strong>stance of T for equality—<strong>in</strong> other words, someth<strong>in</strong>g like MyClass where T : IEquatable.Licensed to Rhona Hadida


Limitations of generics <strong>in</strong> <strong>C#</strong> and other languages105When you run <strong>in</strong>to covariance issues, you may need to consider all of these optionsand anyth<strong>in</strong>g else you can th<strong>in</strong>k of. It depends heavily on the exact nature of the situation.Unfortunately, covariance isn’t the only problem we have to consider. There’salso the matter of contravariance, which is like covariance <strong>in</strong> reverse.WHERE CONTRAVARIANCE WOULD BE USEFULContravariance feels slightly less <strong>in</strong>tuitive than covariance, but it does make sense.Where covariance is about declar<strong>in</strong>g that we will return a more specific object from amethod than the <strong>in</strong>terface requires us to, contravariance is about be<strong>in</strong>g will<strong>in</strong>g toaccept a more general parameter.For <strong>in</strong>stance, suppose we had an IShape <strong>in</strong>terface 13 that conta<strong>in</strong>ed the Area property.It’s easy to write an implementation of IComparer that sorts by area.We’d then like to be able to write the follow<strong>in</strong>g code:IComparer areaComparer = new AreaComparer();List circles = new List();circles.Add(new Circle(20));circles.Add(new Circle(10));circles.Sort(areaComparer);That won’t work, though, because the Sort method on List effectively takesan IComparer. The fact that our AreaComparer can compare any shaperather than just circles doesn’t impress the compiler at all. It considers IComparer and IComparer to be completely different types. Madden<strong>in</strong>g, isn’tit? It would be nice if the Sort method had this signature <strong>in</strong>stead:void Sort(IComparer comparer) where T : SUnfortunately, not only is that not the signature of Sort, but it can’t be—the constra<strong>in</strong>tis <strong>in</strong>valid, because it’s a constra<strong>in</strong>t on T <strong>in</strong>stead of S. We want a derivation typeconstra<strong>in</strong>t but <strong>in</strong> the other direction, constra<strong>in</strong><strong>in</strong>g the S to be somewhere up the<strong>in</strong>heritance tree of T <strong>in</strong>stead of down.Given that this isn’t possible, what can we do? There are fewer options this timethan before. First, you could create a generic class with the follow<strong>in</strong>g declaration:ComparisonHelper : IComparerwhere TDerived : TBaseYou’d then create a constructor that takes (and stores) an IComparer as aparameter. The implementation of IComparer would just return the resultof call<strong>in</strong>g the Compare method of the IComparer. You could then sort theList by creat<strong>in</strong>g a new ComparisonHelper that uses thearea comparison.The second option is to make the area comparison class generic, with a derivationconstra<strong>in</strong>t, so it can compare any two values of the same type, as long as that typeimplements IShape. Of course, you can only do this when you’re able to change thecomparison class—but it’s a nice solution when it’s available.13 You didn’t really expect to get through the whole book without see<strong>in</strong>g a shape-related example, did you?Licensed to Rhona Hadida


106 CHAPTER 3 Parameterized typ<strong>in</strong>g with genericsNotice that the various options for both covariance and contravariance use moregenerics and constra<strong>in</strong>ts to express the <strong>in</strong>terface <strong>in</strong> a more general manner, or to providegeneric “helper” methods. I know that add<strong>in</strong>g a constra<strong>in</strong>t makes it sound lessgeneral, but the generality is added by first mak<strong>in</strong>g the type or method generic. Whenyou run <strong>in</strong>to a problem like this, add<strong>in</strong>g a level of genericity somewhere with anappropriate constra<strong>in</strong>t should be the first option to consider. Generic methods (ratherthan generic types) are often helpful here, as type <strong>in</strong>ference can make the lack of variance<strong>in</strong>visible to the naked eye. This is particularly true <strong>in</strong> <strong>C#</strong> 3, which has strongertype <strong>in</strong>ference capabilities than <strong>C#</strong> 2.NOTEIs this really the best we can do?—As we’ll see later, Java supports covarianceand contravariance with<strong>in</strong> its generics—so why can’t <strong>C#</strong>? Well, a lot of itboils down to the implementation—the fact that the Java runtimedoesn’t get <strong>in</strong>volved with generics; it’s basically a compile-time feature.However, the CLR does support limited generic covariance and contravariance,just on <strong>in</strong>terfaces and delegates. <strong>C#</strong> doesn’t expose this feature(neither does VB.NET), and none of the framework libraries use it. The<strong>C#</strong> compiler consumes covariant and contravariant <strong>in</strong>terfaces as if theywere <strong>in</strong>variant. Add<strong>in</strong>g variance is under consideration for <strong>C#</strong> 4,although no firm commitments have been made. Eric Lippert has writtena whole series of blog posts about the general problem, and what mighthappen <strong>in</strong> future versions of <strong>C#</strong>: http://blogs.msdn.com/ericlippert/archive/tags/Covariance+and+Contravariance/default.aspx.This limitation is a very common cause of questions on <strong>C#</strong> discussion groups. Therema<strong>in</strong><strong>in</strong>g issues are either relatively academic or affect only a moderate subset of thedevelopment community. The next one mostly affects those who do a lot of calculations(usually scientific or f<strong>in</strong>ancial) <strong>in</strong> their work.3.6.2 Lack of operator constra<strong>in</strong>ts or a “numeric” constra<strong>in</strong>t<strong>C#</strong> is not without its downside when it comes to heavily mathematical code. The needto explicitly use the Math class for every operation beyond the simplest arithmetic andthe lack of C-style typedefs to allow the data representation used throughout a programto be easily changed have always been raised by the scientific community as barriersto <strong>C#</strong>’s adoption. Generics weren’t likely to fully solve either of those issues, butthere’s a common problem that stops generics from help<strong>in</strong>g as much as they couldhave. Consider this (illegal) generic method:public T F<strong>in</strong>dMean(IEnumerable data){T sum = default(T);<strong>in</strong>t count = 0;foreach (T datum <strong>in</strong> data){sum += datum;count++;}Licensed to Rhona Hadida


Limitations of generics <strong>in</strong> <strong>C#</strong> and other languages107}return sum/count;Obviously that could never work for all types of data—what could it mean to add oneException to another, for <strong>in</strong>stance? Clearly a constra<strong>in</strong>t of some k<strong>in</strong>d is called for…someth<strong>in</strong>g that is able to express what we need to be able to do: add two <strong>in</strong>stances of Ttogether, and divide a T by an <strong>in</strong>teger. If that were available, even if it were limited tobuilt-<strong>in</strong> types, we could write generic algorithms that wouldn’t care whether they werework<strong>in</strong>g on an <strong>in</strong>t, a long, a double, a decimal, and so forth. Limit<strong>in</strong>g it to the built<strong>in</strong>types would have been disappo<strong>in</strong>t<strong>in</strong>g but better than noth<strong>in</strong>g. The ideal solutionwould have to also allow user-def<strong>in</strong>ed types to act <strong>in</strong> a numeric capacity—so you coulddef<strong>in</strong>e a Complex type to handle complex numbers, for <strong>in</strong>stance. That complex numbercould then store each of its components <strong>in</strong> a generic way as well, so you couldhave a Complex, a Complex, and so on. 14Two related solutions present themselves. One would be simply to allow constra<strong>in</strong>tson operators, so you could write a set of constra<strong>in</strong>ts such aswhere T : T operator+ (T,T), T operator/ (T, <strong>in</strong>t)This would require that T have the operations we need <strong>in</strong> the earlier code. The othersolution would be to def<strong>in</strong>e a few operators and perhaps conversions that must be supported<strong>in</strong> order for a type to meet the extra constra<strong>in</strong>t—we could make it the“numeric constra<strong>in</strong>t” written where T : numeric.One problem with both of these options is that they can’t be expressed as normal<strong>in</strong>terfaces, because operator overload<strong>in</strong>g is performed with static members, whichcan’t implement <strong>in</strong>terfaces. It would require a certa<strong>in</strong> amount of shoehorn<strong>in</strong>g, <strong>in</strong>other words.Various smart people (<strong>in</strong>clud<strong>in</strong>g Eric Gunnerson and Anders Hejlsberg, whoought to be able to th<strong>in</strong>k of <strong>C#</strong> tricks if anyone can) have thought about this, and witha bit of extra code, some solutions have been found. They’re slightly clumsy, but theywork. Unfortunately, due to current JIT optimization limitations, you have to pickbetween pleasant syntax (x=y+z) that reads nicely but performs poorly, and a methodbasedsyntax (x=y.Add(z)) that performs without significant overhead but looks like adog’s d<strong>in</strong>ner when you’ve got anyth<strong>in</strong>g even moderately complicated go<strong>in</strong>g on.The details are beyond the scope of this book, but are very clearly presented athttp://www.lambda-comput<strong>in</strong>g.com/publications/articles/generics2/ <strong>in</strong> an article onthe matter.The two limitations we’ve looked at so far have been quite practical—they’ve beenissues you may well run <strong>in</strong>to dur<strong>in</strong>g actual development. However, if you’re generallycurious like I am, you may also be ask<strong>in</strong>g yourself about other limitations that don’tnecessarily slow down development but are <strong>in</strong>tellectual curiosities. In particular, justwhy are generics limited to types and methods?14 More mathematically m<strong>in</strong>ded readers might want to consider what a Complex wouldmean. You’re on your own there, I’m afraid.Licensed to Rhona Hadida


108 CHAPTER 3 Parameterized typ<strong>in</strong>g with generics3.6.3 Lack of generic properties, <strong>in</strong>dexers, and other member typesWe’ve seen generic types (classes, structs, delegates, and <strong>in</strong>terfaces) and we’ve seengeneric methods. There are plenty of other members that could be parameterized.However, there are no generic properties, <strong>in</strong>dexers, operators, constructors, f<strong>in</strong>alizers,or events. First let’s be clear about what we mean here: clearly an <strong>in</strong>dexer can havea return type that is a type parameter—List is an obvious example. KeyValue-Pair provides similar examples for properties. What you can’t have isan <strong>in</strong>dexer or property (or any of the other members <strong>in</strong> that list) with extra typeparameters. Leav<strong>in</strong>g the possible syntax of declaration aside for the m<strong>in</strong>ute, let’s lookat how these members might have to be called:SomeClass <strong>in</strong>stance = new SomeClass("x");<strong>in</strong>t x = <strong>in</strong>stance.SomeProperty;byte y = <strong>in</strong>stance.SomeIndexer["key"];<strong>in</strong>stance.Click += ByteHandler;<strong>in</strong>stance = <strong>in</strong>stance + <strong>in</strong>stance;I hope you’ll agree that all of those look somewhat silly. F<strong>in</strong>alizers can’t even be calledexplicitly from <strong>C#</strong> code, which is why there isn’t a l<strong>in</strong>e for them. The fact that we can’tdo any of these isn’t go<strong>in</strong>g to cause significant problems anywhere, as far as I cansee—it’s just worth be<strong>in</strong>g aware of it as an academic limitation.The one exception to this is possibly the constructor. However, a static genericmethod <strong>in</strong> the class is a good workaround for this, and the syntax with two lists of typearguments is horrific.These are by no means the only limitations of <strong>C#</strong> generics, but I believe they’re theones that you’re most likely to run up aga<strong>in</strong>st, either <strong>in</strong> your daily work, <strong>in</strong> communityconversations, or when idly consider<strong>in</strong>g the feature as a whole. In our next two sectionswe’ll see how some aspects of these aren’t issues <strong>in</strong> the two languages whose featuresare most commonly compared with <strong>C#</strong>’s generics: C++ (with templates) and Java(with generics as of Java 5). We’ll tackle C++ first.3.6.4 Comparison with C++ templatesC++ templates are a bit like macros taken to an extreme level. They’re <strong>in</strong>credibly powerful,but have costs associated with them both <strong>in</strong> terms of code bloat and ease ofunderstand<strong>in</strong>g.When a template is used <strong>in</strong> C++, the code is compiled for that particular set of templatearguments, as if the template arguments were <strong>in</strong> the source code. This means thatthere’s not as much need for constra<strong>in</strong>ts, as the compiler will check whether you’reallowed to do everyth<strong>in</strong>g you want to with the type anyway while it’s compil<strong>in</strong>g the codefor this particular set of template arguments. The C++ standards committee has recognizedthat constra<strong>in</strong>ts are still useful, though, and they will be present <strong>in</strong> C++0x (thenext version of C++) under the name of concepts.The C++ compiler is smart enough to compile the code only once for any given setof template arguments, but it isn’t able to share code <strong>in</strong> the way that the CLR does withLicensed to Rhona Hadida


Limitations of generics <strong>in</strong> <strong>C#</strong> and other languages109reference types. That lack of shar<strong>in</strong>g does have its benefits, though—it allows typespecificoptimizations, such as <strong>in</strong>l<strong>in</strong><strong>in</strong>g method calls for some type parameters but notothers, from the same template. It also means that overload resolution can be performedseparately for each set of type parameters, rather than just once based solelyon the limited knowledge the <strong>C#</strong> compiler has due to any constra<strong>in</strong>ts present.Don’t forget that with “normal” C++ there’s only one compilation <strong>in</strong>volved, ratherthan the “compile to IL” then “JIT compile to native code” model of .NET. A programus<strong>in</strong>g a standard template <strong>in</strong> ten different ways will <strong>in</strong>clude the code ten times <strong>in</strong> a C++program. A similar program <strong>in</strong> <strong>C#</strong> us<strong>in</strong>g a generic type from the framework <strong>in</strong> ten differentways won’t <strong>in</strong>clude the code for the generic type at all—it will refer to it, and theJIT will compile as many different versions as required (as described <strong>in</strong> section 3.4.2) atexecution time.One significant feature that C++ templates have over <strong>C#</strong> generics is that the templatearguments don’t have to be type names. Variable names, function names, and constantexpressions can be used as well. A common example of this is a buffer type that has thesize of the buffer as one of the template arguments—so a buffer will alwaysbe a buffer of 20 <strong>in</strong>tegers, and a buffer will always be a buffer of 35 doubles.This ability is crucial to template metaprogramm<strong>in</strong>g 15 —an 15 advanced C++ technique thevery idea of which scares me, but that can be very powerful <strong>in</strong> the hands of experts.C++ templates are more flexible <strong>in</strong> other ways, too. They don’t suffer from theproblem described <strong>in</strong> 3.6.2, and there are a few other restrictions that don’t exist <strong>in</strong>C++: you can derive a class from one of its type parameters, and you can specialize atemplate for a particular set of type arguments. The latter ability allows the templateauthor to write general code to be used when there’s no more knowledge availablebut specific (often highly optimized) code for particular types.The same variance issues of .NET generics exist <strong>in</strong> C++ templates as well—anexample given by Bjarne Stroustrup 16 is that there are no implicit conversionsbetween Vector and Vector with similar reason<strong>in</strong>g—<strong>in</strong> this case,it might allow you to put a square peg <strong>in</strong> a round hole.For further details of C++ templates, I recommend Stroustrup’s The C++Programm<strong>in</strong>g Language (Addison-Wesley, 1991). It’s not always the easiest book tofollow, but the templates chapter is fairly clear (once you get your m<strong>in</strong>d around C++term<strong>in</strong>ology and syntax). For more comparisons with .NET generics, look at the blogpost by the Visual C++ team on this topic: http://blogs.msdn.com/branbray/archive/2003/11/19/51023.aspx.The other obvious language to compare with <strong>C#</strong> <strong>in</strong> terms of generics is Java, which<strong>in</strong>troduced the feature <strong>in</strong>to the ma<strong>in</strong>stream language for the 1.5 release, 17 severalyears after other projects had compilers for their Java-like languages.15 http://en.wikipedia.org/wiki/Template_metaprogramm<strong>in</strong>g16 The <strong>in</strong>ventor of C++.17 Or 5.0, depend<strong>in</strong>g on which number<strong>in</strong>g system you use. Don’t get me started.Licensed to Rhona Hadida


110 CHAPTER 3 Parameterized typ<strong>in</strong>g with generics3.6.5 Comparison with Java genericsWhere C++ <strong>in</strong>cludes more of the template <strong>in</strong> the generated code than <strong>C#</strong> does, Java<strong>in</strong>cludes less. In fact, the Java runtime doesn’t know about generics at all. The Javabytecode (roughly equivalent term<strong>in</strong>ology to IL) for a generic type <strong>in</strong>cludes someextra metadata to say that it’s generic, but after compilation the call<strong>in</strong>g code doesn’thave much to <strong>in</strong>dicate that generics were <strong>in</strong>volved at all—and certa<strong>in</strong>ly an <strong>in</strong>stance ofa generic type only knows about the nongeneric side of itself. For example, an<strong>in</strong>stance of HashSet doesn’t know whether it was created as a HashSet ora HashSet. The compiler effectively just adds casts where necessary and performsmore sanity check<strong>in</strong>g. Here’s an example—first the generic Java code:ArrayList str<strong>in</strong>gs = new ArrayList();str<strong>in</strong>gs.add("hello");Str<strong>in</strong>g entry = str<strong>in</strong>gs.get(0);str<strong>in</strong>gs.add(new Object());and now the equivalent nongeneric code:ArrayList str<strong>in</strong>gs = new ArrayList();str<strong>in</strong>gs.add("hello");Str<strong>in</strong>g entry = (Str<strong>in</strong>g) str<strong>in</strong>gs.get(0);str<strong>in</strong>gs.add(new Object());They would generate the same Java bytecode, except for the last l<strong>in</strong>e—which is valid<strong>in</strong> the nongeneric case but caught by the compiler as an error <strong>in</strong> the generic version.You can use a generic type as a “raw” type, which is equivalent to us<strong>in</strong>gjava.lang.Object for each of the type arguments. This rewrit<strong>in</strong>g—and loss of <strong>in</strong>formation—iscalled type erasure. Java doesn’t have user-def<strong>in</strong>ed value types, but you can’teven use the built-<strong>in</strong> ones as type arguments. Instead, you have to use the boxed version—ArrayListfor a list of <strong>in</strong>tegers, for example.You may be forgiven for th<strong>in</strong>k<strong>in</strong>g this is all a bit disappo<strong>in</strong>t<strong>in</strong>g compared withgenerics <strong>in</strong> <strong>C#</strong>, but there are some nice features of Java generics too:■■■■The runtime doesn’t know anyth<strong>in</strong>g about generics, so you can use code compiledus<strong>in</strong>g generics on an older version, as long as you don’t use any classes ormethods that aren’t present on the old version. Version<strong>in</strong>g <strong>in</strong> .NET is muchstricter <strong>in</strong> general—you have to compile us<strong>in</strong>g the oldest environment you wantto run on. That’s safer, but less flexible.You don’t need to learn a new set of classes to use Java generics—where a nongenericdeveloper would use ArrayList, a generic developer just uses Array-List. Exist<strong>in</strong>g classes can reasonably easily be “upgraded” to generic versions.The previous feature has been utilized quite effectively with the reflection system—java.lang.Class(the equivalent of System.Type) is generic, whichallows compile-time type safety to be extended to cover many situations <strong>in</strong>volv<strong>in</strong>greflection. In some other situations it’s a pa<strong>in</strong>, however.Java has support for covariance and contravariance us<strong>in</strong>g wildcards. For<strong>in</strong>stance, ArrayList


Summary111My personal op<strong>in</strong>ion is that .NET generics are superior <strong>in</strong> almost every respect,although every time I run <strong>in</strong>to a covariance/contravariance issue I suddenly wish Ihad wildcards. Java with generics is still much better than Java without generics, butthere are no performance benefits and the safety only applies at compile time. Ifyou’re <strong>in</strong>terested <strong>in</strong> the details, they’re <strong>in</strong> the Java language specification, or youcould read Gilad Bracha’s excellent guide to them at http://java.sun.com/j2se/1.5/pdf/generics-tutorial.pdf.3.7 SummaryPhew! It’s a good th<strong>in</strong>g generics are simpler to use <strong>in</strong> reality than they are <strong>in</strong> description.Although they can get complicated, they’re widely regarded as the most importantaddition to <strong>C#</strong> 2 and are <strong>in</strong>credibly useful. The worst th<strong>in</strong>g about writ<strong>in</strong>g codeus<strong>in</strong>g generics is that if you ever have to go back to <strong>C#</strong> 1, you’ll miss them terribly.In this chapter I haven’t tried to cover absolutely every detail of what is and isn’tallowed when us<strong>in</strong>g generics—that’s the job of the language specification, and itmakes for very dry read<strong>in</strong>g. Instead, I’ve aimed for a practical approach, provid<strong>in</strong>g the<strong>in</strong>formation you’ll need <strong>in</strong> everyday use, with a smatter<strong>in</strong>g of theory for the sake ofacademic <strong>in</strong>terest.We’ve seen three ma<strong>in</strong> benefits to generics: compile-time type safety, performance,and code expressiveness. Be<strong>in</strong>g able to get the IDE and compiler to validate your codeearly is certa<strong>in</strong>ly a good th<strong>in</strong>g, but it’s arguable that more is to be ga<strong>in</strong>ed from tools provid<strong>in</strong>g<strong>in</strong>telligent options based on the types <strong>in</strong>volved than the actual “safety” aspect.Performance is improved most radically when it comes to value types, which nolonger need to be boxed and unboxed when they’re used <strong>in</strong> strongly typed genericAPIs, particularly the generic collection types provided <strong>in</strong> .NET 2.0. Performance withreference types is usually improved but only slightly.Your code is able to express its <strong>in</strong>tention more clearly us<strong>in</strong>g generics—<strong>in</strong>stead of acomment or a long variable name required to describe exactly what types are<strong>in</strong>volved, the details of the type itself can do the work. Comments and variable namescan often become <strong>in</strong>accurate over time, as they can be left alone when code ischanged—but the type <strong>in</strong>formation is “correct” by def<strong>in</strong>ition.Generics aren’t capable of do<strong>in</strong>g everyth<strong>in</strong>g we might sometimes like them to do,and we’ve studied some of their limitations <strong>in</strong> the chapter, but if you truly embrace<strong>C#</strong> 2 and the generic types with<strong>in</strong> the .NET 2.0 Framework, you’ll come across gooduses for them <strong>in</strong>credibly frequently <strong>in</strong> your code.This topic will come up time and time aga<strong>in</strong> <strong>in</strong> future chapters, as other new featuresbuild on this key one. Indeed, the subject of our next chapter would be verydifferent without generics—we’re go<strong>in</strong>g to look at nullable types, as implementedby Nullable.Licensed to Rhona Hadida


Say<strong>in</strong>g noth<strong>in</strong>gwith nullable typesThis chapter covers■■Motivation for null valuesFramework and runtime support■ Language support <strong>in</strong> <strong>C#</strong> 2■Patterns us<strong>in</strong>g nullable typesNullity is a concept that has provoked a certa<strong>in</strong> amount of debate over the years. Isa null reference a value, or the absence of a value? Is “noth<strong>in</strong>g” a “someth<strong>in</strong>g”? Inthis chapter, I’ll try to stay more practical than philosophical. First we’ll look at whythere’s a problem <strong>in</strong> the first place—why you can’t set a value type variable to null<strong>in</strong> <strong>C#</strong> 1 and what the traditional alternatives have been. After that I’ll <strong>in</strong>troduce youto our knight <strong>in</strong> sh<strong>in</strong><strong>in</strong>g armor—System.Nullable—before we see how <strong>C#</strong> 2makes work<strong>in</strong>g with nullable types a bit simpler and more compact. Like generics,nullable types sometimes have some uses beyond what you might expect, and we’lllook at a few examples of these at the end of the chapter.So, when is a value not a value? Let’s f<strong>in</strong>d out.112Licensed to Rhona Hadida


What do you do when you just don’t have a value?1134.1 What do you do when you just don’t have a value?The <strong>C#</strong> and .NET designers don’t add features just for kicks. There has to be a real, significantproblem to be fixed before they’ll go as far as chang<strong>in</strong>g <strong>C#</strong> as a language or.NET at the platform level. In this case, the problem is best summed up <strong>in</strong> one of themost frequently asked questions <strong>in</strong> <strong>C#</strong> and .NET discussion groups:I need to set my DateTime 1 variable to null, but the compiler won’t let me.What should I do?It’s a question that comes up fairly naturally—a simple example might be <strong>in</strong> ane-commerce application where users are look<strong>in</strong>g at their account history. If an orderhas been placed but not delivered, there may be a purchase date but no dispatchdate—so how would you represent that <strong>in</strong> a type that is meant to provide theorder details?The answer to the question is usually <strong>in</strong> two parts: first, why you can’t just use null<strong>in</strong> the first place, and second, which options are available. Let’s look at the two parts separately—assum<strong>in</strong>gthat the developer ask<strong>in</strong>g the question is us<strong>in</strong>g <strong>C#</strong> 1.4.1.1 Why value type variables can’t be nullAs we saw <strong>in</strong> chapter 2, the value of a reference type variable is a reference, and thevalue of a value type variable is the “real” value itself. A “normal” reference value issome way of gett<strong>in</strong>g at an object, but null acts as a special value that means “I don’trefer to any object.” If you want to th<strong>in</strong>k of references as be<strong>in</strong>g like URLs, null is (veryroughly speak<strong>in</strong>g) the reference equivalent of about:blank. It’s represented as allzeroes <strong>in</strong> memory (which is why it’s the default value for all reference types—clear<strong>in</strong>ga whole block of memory is cheap, so that’s the way objects are <strong>in</strong>itialized), but it’s stillbasically stored <strong>in</strong> the same way as other references. There’s no “extra bit” hiddensomewhere for each reference type variable. That means we can’t use the “all zeroes”value for a “real” reference, but that’s OK—our memory is go<strong>in</strong>g to run out longbefore we have that many live objects anyway.The last sentence is the key to why null isn’t a valid value type value, though. Let’sconsider the byte type as a familiar one that is easy to th<strong>in</strong>k about. The value of a variableof type byte is stored <strong>in</strong> a s<strong>in</strong>gle byte—it may be padded for alignment purposes,but the value itself is conceptually only made up of one byte. We’ve got to be able tostore the values 0–255 <strong>in</strong> that variable; otherwise it’s useless for read<strong>in</strong>g arbitraryb<strong>in</strong>ary data. So, with the 256 “normal” values and one null value, we’d have to copewith a total of 257 values, and there’s no way of squeez<strong>in</strong>g that many values <strong>in</strong>to a s<strong>in</strong>glebyte. Now, the designers could have decided that every value type would have anextra flag bit somewhere determ<strong>in</strong><strong>in</strong>g whether a value was null or a “real” value, butthe memory usage implications are horrible, not to mention the fact that we’d have tocheck the flag every time we wanted to use the value. So <strong>in</strong> a nutshell, with value types1It’s almost always DateTime rather than any other value type. I’m not entirely sure why—it’s as if developers<strong>in</strong>herently understand why a byte shouldn’t be null, but feel that dates are more “<strong>in</strong>herently nullable.”Licensed to Rhona Hadida


114 CHAPTER 4 Say<strong>in</strong>g noth<strong>in</strong>g with nullable typesyou often care about hav<strong>in</strong>g the whole range of possible bit patterns available as realvalues, whereas with reference types we’re happy enough to lose one potential value<strong>in</strong> order to ga<strong>in</strong> the benefits of hav<strong>in</strong>g a null value.That’s the usual situation—now why would you want to be able to represent nullfor a value type anyway? The most common immediate reason is simply because databasestypically support NULL as a value for every type (unless you specifically make thefield non-nullable), so you can have nullable character data, nullable <strong>in</strong>tegers, nullableBooleans—the whole works. When you fetch data from a database, it’s generallynot a good idea to lose <strong>in</strong>formation, so you want to be able to represent the nullity ofwhatever you read, somehow.That just moves the question one step further on, though. Why do databasesallow null values for dates, <strong>in</strong>tegers and the like? Null values are typically used forunknown or miss<strong>in</strong>g values such as the dispatch date <strong>in</strong> our earlier e-commerceexample. Nullity represents an absence of def<strong>in</strong>ite <strong>in</strong>formation, which can be important<strong>in</strong> many situations.That br<strong>in</strong>gs us to options for represent<strong>in</strong>g null values <strong>in</strong> <strong>C#</strong> 1.4.1.2 Patterns for represent<strong>in</strong>g null values <strong>in</strong> <strong>C#</strong> 1There are three basic patterns commonly used to get around the lack of nullablevalue types <strong>in</strong> <strong>C#</strong> 1. Each of them has its pros and cons—mostly cons—and all of themare fairly unsatisfy<strong>in</strong>g. However, it’s worth know<strong>in</strong>g them, partly to more fully appreciatethe benefits of the <strong>in</strong>tegrated solution <strong>in</strong> <strong>C#</strong> 2.PATTERN 1: THE MAGIC VALUEThe first pattern tends to be used as the solution for DateTime, because few peopleexpect their databases to actually conta<strong>in</strong> dates <strong>in</strong> 1AD. In other words, it goes aga<strong>in</strong>st thereason<strong>in</strong>g I gave earlier, expect<strong>in</strong>g every possible value to be available. So, we sacrificeone value (typically DateTime.M<strong>in</strong>Value) to mean a null value. The semantic mean<strong>in</strong>g ofthat will vary from application to application—it may mean that the user hasn’t enteredthe value <strong>in</strong>to a form yet, or that it’s <strong>in</strong>appropriate for that record, for example.The good news is that us<strong>in</strong>g a magic value doesn’t waste any memory or need anynew types. However, it does rely on you pick<strong>in</strong>g an appropriate value that will never beone you actually want to use for real data. Also, it’s basically <strong>in</strong>elegant. It just doesn’tfeel right. If you ever f<strong>in</strong>d yourself need<strong>in</strong>g to go down this path, you should at leasthave a constant (or static read-only value for types that can’t be expressed as constants)represent<strong>in</strong>g the magic value—comparisons with DateTime.M<strong>in</strong>Value everywhere,for <strong>in</strong>stance, don’t express the mean<strong>in</strong>g of the magic value.ADO.NET has a variation on this pattern where the same magic value—DBNull.Value—is used for all null values, of whatever type. In this case, an extra valueand <strong>in</strong>deed an extra type have been <strong>in</strong>troduced to <strong>in</strong>dicate when a database hasreturned null. However, it’s only applicable where compile-time type safety isn’timportant (<strong>in</strong> other words when you’re happy to use object and cast after test<strong>in</strong>g fornullity), and aga<strong>in</strong> it doesn’t feel quite right. In fact, it’s a mixture of the “magic value”pattern and the “reference type wrapper” pattern, which we’ll look at next.Licensed to Rhona Hadida


System.Nullable and System.Nullable115PATTERN 2: A REFERENCE TYPE WRAPPERThe second solution can take two forms. The simpler one is to just use object as thevariable type, box<strong>in</strong>g and unbox<strong>in</strong>g values as necessary. The more complex (andrather more appeal<strong>in</strong>g) form is to have a reference type for each value type you need<strong>in</strong> a nullable form, conta<strong>in</strong><strong>in</strong>g a s<strong>in</strong>gle <strong>in</strong>stance variable of that value type, and withimplicit conversion operators to and from the value type. With generics, you could dothis <strong>in</strong> one generic type—but if you’re us<strong>in</strong>g <strong>C#</strong> 2 anyway, you might as well use thenullable types described <strong>in</strong> this chapter <strong>in</strong>stead. If you’re stuck <strong>in</strong> <strong>C#</strong> 1, you have tocreate extra source code for each type you wish to wrap. This isn’t hard to put <strong>in</strong> theform of a template for automatic code generation, but it’s still a burden that is bestavoided if possible.Both of these forms have the problem that while they allow you to use nulldirectly, they do require objects to be created on the heap, which can lead to garbagecollection pressure if you need to use this approach very frequently, and adds memoryuse due to the overheads associated with objects. For the more complex solution, youcould make the reference type mutable, which may reduce the number of <strong>in</strong>stancesyou need to create but could also make for some very un<strong>in</strong>tuitive code.PATTERN 3: AN EXTRA BOOLEAN FLAGThe f<strong>in</strong>al pattern revolves around hav<strong>in</strong>g a normal value type value available, andanother value—a Boolean flag—<strong>in</strong>dicat<strong>in</strong>g whether the value is “real” or whether itshould be disregarded. Aga<strong>in</strong>, there are two ways of implement<strong>in</strong>g this solution.Either you could ma<strong>in</strong>ta<strong>in</strong> two separate variables <strong>in</strong> the code that uses the value, oryou could encapsulate the “value plus flag” <strong>in</strong>to another value type.This latter solution is quite similar to the more complicated reference type ideadescribed earlier, except that you avoid the garbage-collection issue by us<strong>in</strong>g a valuetype, and <strong>in</strong>dicate nullity with<strong>in</strong> the encapsulated value rather than by virtue of a nullreference. The downside of hav<strong>in</strong>g to create a new one of these types for every valuetype you wish to handle is the same, however. Also, if the value is ever boxed for somereason, it will be boxed <strong>in</strong> the normal way whether it’s considered to be null or not.The last pattern (<strong>in</strong> the more encapsulated form) is effectively how nullable typeswork <strong>in</strong> <strong>C#</strong> 2. We’ll see that when the new features of the framework, CLR, and languageare all comb<strong>in</strong>ed, the solution is significantly neater than anyth<strong>in</strong>g that was possible <strong>in</strong><strong>C#</strong> 1. Our next section deals with just the support provided by the framework and theCLR: if <strong>C#</strong> 2 only supported generics, the whole of section 4.2 would still be relevant andthe feature would still work and be useful. However, <strong>C#</strong> 2 provides extra syntactic sugarto make it even better—that’s the subject of section 4.3.4.2 System.Nullable and System.NullableThe core structure at the heart of nullable types is System.Nullable. In addition,the System.Nullable static class provides utility methods that occasionally make nullabletypes easier to work with. (From now on I’ll leave out the namespace, to make lifesimpler.) We’ll look at both of these types <strong>in</strong> turn, and for this section I’ll avoid any extrafeatures provided by the language, so you’ll be able to understand what’s go<strong>in</strong>g on <strong>in</strong>the IL code when we do look at the <strong>C#</strong> 2 syntactic sugar.Licensed to Rhona Hadida


116 CHAPTER 4 Say<strong>in</strong>g noth<strong>in</strong>g with nullable types4.2.1 Introduc<strong>in</strong>g NullableAs you can tell by its name, Nullable is a generic type. The type parameter T has thevalue type constra<strong>in</strong>t on it. As I mentioned <strong>in</strong> section 3.3.1, this also means you can’tuse another nullable type as the argument—so Nullable is forbidden,for <strong>in</strong>stance, even though Nullable is a value type <strong>in</strong> every other way. The typeof T for any particular nullable type is called the underly<strong>in</strong>g type of that nullable type. Forexample, the underly<strong>in</strong>g type of Nullable is <strong>in</strong>t.The most important parts of Nullable are its properties, HasValue andValue. They do the obvious th<strong>in</strong>g: Value represents the non-nullable value (the“real” one, if you will) when there is one, and throws an InvalidOperation-Exception if (conceptually) there is no real value. HasValue is simply a Booleanproperty <strong>in</strong>dicat<strong>in</strong>g whether there’s a real value or whether the <strong>in</strong>stance should beregarded as null. For now, I’ll talk about an “<strong>in</strong>stance with a value” and an “<strong>in</strong>stancewithout a value,” which mean <strong>in</strong>stances where the HasValue property returns true orfalse, respectively.Now that we know what we want the properties to achieve, let’s see how to createan <strong>in</strong>stance of the type. Nullable has two constructors: the default one (creat<strong>in</strong>gan <strong>in</strong>stance without a value) and one tak<strong>in</strong>g an <strong>in</strong>stance of T as the value. Once an<strong>in</strong>stance has been constructed, it is immutable.NOTEValue types and mutability—A type is said to be immutable if it is designed sothat an <strong>in</strong>stance can’t be changed after it’s been constructed. Immutabletypes often make life easier when it comes to topics such as multithread<strong>in</strong>g,where it helps to know that nobody can be chang<strong>in</strong>g values <strong>in</strong> onethread while you’re read<strong>in</strong>g them <strong>in</strong> a different one. However, immutabilityis also important for value types. As a general rule, value types shouldalmost always be immutable. If you need a way of bas<strong>in</strong>g one value onanother, follow the lead of DateTime and TimeSpan—provide methodsthat return a new value rather than modify<strong>in</strong>g an exist<strong>in</strong>g one. That way,you avoid situations where you th<strong>in</strong>k you’re chang<strong>in</strong>g a variable but actuallyyou’re chang<strong>in</strong>g the value returned by a property or method, which is justa copy of the variable’s value. The compiler is usually smart enough towarn you about this, but it’s worth try<strong>in</strong>g to avoid the situation <strong>in</strong> the firstplace. Very few value types <strong>in</strong> the framework are mutable, fortunately.Nullable <strong>in</strong>troduces a s<strong>in</strong>gle new method, GetValueOrDefault, which has twooverloads. Both return the value of the <strong>in</strong>stance if there is one, or a default value otherwise.One overload doesn’t have any parameters (<strong>in</strong> which case the generic defaultvalue of the underly<strong>in</strong>g type is used), and the other allows you to specify the defaultvalue to return if necessary.The other methods implemented by Nullable all override exist<strong>in</strong>g methods:GetHashCode, ToStr<strong>in</strong>g, and Equals. GetHashCode returns 0 if the <strong>in</strong>stance doesn’thave a value, or the result of call<strong>in</strong>g GetHashCode on the value if there is one.ToStr<strong>in</strong>g returns an empty str<strong>in</strong>g if there isn’t a value, or the result of call<strong>in</strong>gLicensed to Rhona Hadida


System.Nullable and System.Nullable117ToStr<strong>in</strong>g on the value if there is. Equals is slightly more complicated—we’ll comeback to it when we’ve discussed box<strong>in</strong>g.F<strong>in</strong>ally, two conversions are provided by the framework. First, there is an implicitconversion from T to Nullable. This always results <strong>in</strong> an <strong>in</strong>stance where HasValuereturns true. Likewise, there is an explicit operator convert<strong>in</strong>g from Nullable toT, which behaves exactly the same as the Value property, <strong>in</strong>clud<strong>in</strong>g throw<strong>in</strong>g an exceptionwhen there is no real value to return.NOTEWrapp<strong>in</strong>g and unwrapp<strong>in</strong>g—The <strong>C#</strong> specification names the process ofconvert<strong>in</strong>g an <strong>in</strong>stance of T to an <strong>in</strong>stance of Nullable wrapp<strong>in</strong>g, withthe obvious opposite process be<strong>in</strong>g called unwrapp<strong>in</strong>g. The <strong>C#</strong> specificationactually def<strong>in</strong>es these terms with reference to the constructor tak<strong>in</strong>ga parameter and the Value property, respectively. Indeed these calls aregenerated by the <strong>C#</strong> code, even when it otherwise looks as if you’re us<strong>in</strong>gthe conversions provided by the framework. The results are the sameeither way, however. For the rest of this chapter, I won’t dist<strong>in</strong>guishbetween the two implementations available.Before we go any further, let’s see all this <strong>in</strong> action. List<strong>in</strong>g 4.1 shows everyth<strong>in</strong>g youcan do with Nullable directly, leav<strong>in</strong>g Equals aside for the moment.List<strong>in</strong>g 4.1Us<strong>in</strong>g various members of Nullablestatic void Display(Nullable x){Console.WriteL<strong>in</strong>e ("HasValue: {0}", x.HasValue);if (x.HasValue){Console.WriteL<strong>in</strong>e ("Value: {0}", x.Value);Console.WriteL<strong>in</strong>e ("Explicit conversion: {0}", (<strong>in</strong>t)x);}Console.WriteL<strong>in</strong>e ("GetValueOrDefault(): {0}",x.GetValueOrDefault());Console.WriteL<strong>in</strong>e ("GetValueOrDefault(10): {0}",x.GetValueOrDefault(10));Console.WriteL<strong>in</strong>e ("ToStr<strong>in</strong>g(): \"{0}\"", x.ToStr<strong>in</strong>g());Console.WriteL<strong>in</strong>e ("GetHashCode(): {0}", x.GetHashCode());Console.WriteL<strong>in</strong>e ();}...Nullable x = 5;x = new Nullable(5);Console.WriteL<strong>in</strong>e("Instance with value:");Display(x);x = new Nullable();Console.WriteL<strong>in</strong>e("Instance without value:");Display(x);In list<strong>in</strong>g 4.1 we first show the two different ways (<strong>in</strong> terms of <strong>C#</strong> source code) of wrapp<strong>in</strong>ga value of the underly<strong>in</strong>g type, and then we use various different members on theLicensed to Rhona Hadida


118 CHAPTER 4 Say<strong>in</strong>g noth<strong>in</strong>g with nullable types<strong>in</strong>stance. Next, we create an <strong>in</strong>stance that doesn’t have a value, and use the same members<strong>in</strong> the same order, just omitt<strong>in</strong>g the Value property and the explicit conversion to<strong>in</strong>t s<strong>in</strong>ce these would throw exceptions. The output of list<strong>in</strong>g 4.1 is as follows:Instance with value:HasValue: TrueValue: 5Explicit conversion: 5GetValueOrDefault(): 5GetValueOrDefault(10): 5ToStr<strong>in</strong>g(): "5"GetHashCode(): 5Instance without value:HasValue: FalseGetValueOrDefault(): 0GetValueOrDefault(10): 10ToStr<strong>in</strong>g(): ""GetHashCode(): 0So far, you could probably have predicted all of the results just by look<strong>in</strong>g at the membersprovided by Nullable. When it comes to box<strong>in</strong>g and unbox<strong>in</strong>g, however,there’s special behavior to make nullable types behave how we’d really like them tobehave, rather than how they’d behave if we slavishly followed the normal box<strong>in</strong>g rules.4.2.2 Box<strong>in</strong>g and unbox<strong>in</strong>gIt’s important to remember that Nullable is a struct—a value type. This means thatif you want to convert it to a reference type (object is the most obvious example), you’llneed to box it. It is only with respect to box<strong>in</strong>g and unbox<strong>in</strong>g that the CLR itself has anyspecial behavior regard<strong>in</strong>g nullable types—the rest is “standard” generics, conversions,method calls, and so forth. In fact, the behavior was only changed shortly before therelease of .NET 2.0, as the result of community requests.An <strong>in</strong>stance of Nullable is boxed to either a null reference (if it doesn’t have avalue) or a boxed value of T (if it does). You can unbox from a boxed value either toits normal type or to the correspond<strong>in</strong>g nullable type. Unbox<strong>in</strong>g a null reference willthrow a NullReferenceException if you unbox to the normal type, but will unbox toan <strong>in</strong>stance without a value if you unbox to the appropriate nullable type. This behavioris shown <strong>in</strong> list<strong>in</strong>g 4.2.List<strong>in</strong>g 4.2Box<strong>in</strong>g and unbox<strong>in</strong>g behavior of nullable typesNullable nullable = 5;object boxed = nullable;Console.WriteL<strong>in</strong>e(boxed.GetType());Boxes a nullablewith value<strong>in</strong>t normal = (<strong>in</strong>t)boxed;Console.WriteL<strong>in</strong>e(normal);nullable = (Nullable)boxed;Console.WriteL<strong>in</strong>e(nullable);Unboxes to nonnullablevariableUnboxes tonullable variableLicensed to Rhona Hadida


System.Nullable and System.Nullable119nullable = new Nullable();boxed = nullable;Console.WriteL<strong>in</strong>e (boxed==null);nullable = (Nullable)boxed;Console.WriteL<strong>in</strong>e(nullable.HasValue);Boxes a nullablewithout valueUnboxes tonullable variableThe output of list<strong>in</strong>g 4.2 shows that the type of the boxed value is pr<strong>in</strong>ted as System.Int32 (not System.Nullable). It then confirms that we can retrievethe value by unbox<strong>in</strong>g to either just <strong>in</strong>t or to Nullable. F<strong>in</strong>ally, the output demonstrateswe can box from a nullable <strong>in</strong>stance without a value to a null reference andsuccessfully unbox aga<strong>in</strong> to another value-less nullable <strong>in</strong>stance. If we’d tried unbox<strong>in</strong>gthe last value of boxed to a non-nullable <strong>in</strong>t, the program would have blown up with aNullReferenceException.Now that we understand the behavior of box<strong>in</strong>g and unbox<strong>in</strong>g, we can beg<strong>in</strong> totackle the behavior of Nullable.Equals.4.2.3 Equality of Nullable <strong>in</strong>stancesNullable overrides object.Equals(object) but doesn’t <strong>in</strong>troduce any equalityoperators or provide an Equals(Nullable) method. S<strong>in</strong>ce the framework has suppliedthe basic build<strong>in</strong>g blocks, languages can add extra functionality on top, <strong>in</strong>clud<strong>in</strong>gmak<strong>in</strong>g exist<strong>in</strong>g operators work as we’d expect them to. We’ll see the details ofthat <strong>in</strong> section 4.3.3, but the basic equality as def<strong>in</strong>ed by the vanilla Equals methodfollows these rules for a call to first.Equals(second):■ If first has no value and second is null, they are equal.■ If first has no value and second isn’t null, they aren’t equal.■ If first has a value and second is null, they aren’t equal.■ Otherwise, they’re equal if first’s value is equal to second.Note that we don’t have to consider the case where second is another Nullablebecause the rules of box<strong>in</strong>g prohibit that situation. The type of second is object, so <strong>in</strong>order to be a Nullable it would have to be boxed, and as we have just seen, box<strong>in</strong>ga nullable <strong>in</strong>stance creates a box of the non-nullable type or returns a null reference.The rules are consistent with the rules of equality elsewhere <strong>in</strong> .NET, so you can usenullable <strong>in</strong>stances as keys for dictionaries and any other situations where you needequality. Just don’t expect it to differentiate between a non-nullable <strong>in</strong>stance and anullable <strong>in</strong>stance with a value—it’s all been carefully set up so that those two cases aretreated the same way as each other.That covers the Nullable structure itself, but it has a shadowy partner: theNullable class.4.2.4 Support from the nongeneric Nullable classThe System.Nullable struct does almost everyth<strong>in</strong>g you want it to. However, itreceives a little help from the System.Nullable class. This is a static class—it onlyLicensed to Rhona Hadida


120 CHAPTER 4 Say<strong>in</strong>g noth<strong>in</strong>g with nullable typesconta<strong>in</strong>s static methods, and you can’t create an <strong>in</strong>stance of it. 2 In fact, everyth<strong>in</strong>g itdoes could have been done equally well by other types, and if Microsoft had seenwhere they were go<strong>in</strong>g right from the beg<strong>in</strong>n<strong>in</strong>g, it might not have even existed—which would have saved a little confusion over what the two types are there for, asidefrom anyth<strong>in</strong>g else. However, this accident of history has three methods to its name,and they’re still useful.The first two are comparison methods:public static <strong>in</strong>t Compare(Nullable n1, Nullable n2)public static bool Equals(Nullable n1, Nullable n2)Compare uses Comparer.Default to compare the two underly<strong>in</strong>g values (if theyexist), and Equals uses EqualityComparer.Default. In the face of <strong>in</strong>stances withno values, the values returned from each method comply with the .NET conventionsof nulls compar<strong>in</strong>g equal to each other and less than anyth<strong>in</strong>g else.Both of these methods could quite happily be part of Nullable as static butnongeneric methods. The one small advantage of hav<strong>in</strong>g them as generic methods <strong>in</strong>a nongeneric type is that generic type <strong>in</strong>ference can be applied, so you’ll rarely needto explicitly specify the type parameter.The f<strong>in</strong>al method of System.Nullable isn’t generic—<strong>in</strong>deed, it absolutely couldn’tbe. Its signature is as follows:public static Type GetUnderly<strong>in</strong>gType (Type nullableType)If the parameter is a nullable type, the method returns its underly<strong>in</strong>g type; otherwiseit returns null. The reason this couldn’t be a generic method is that if you knew theunderly<strong>in</strong>g type to start with, you wouldn’t have to call it!We’ve now seen what the framework and the CLR provide to support nullabletypes—but <strong>C#</strong> 2 adds language features to make life a lot more pleasant.4.3 <strong>C#</strong> 2’s syntactic sugar for nullable typesThe examples so far have shown nullable types do<strong>in</strong>g their job, but they’ve not beenparticularly pretty to look at. Admittedly it makes it obvious that you are us<strong>in</strong>g nullabletypes when you have to type Nullable around the name of the type you’re really<strong>in</strong>terested <strong>in</strong>, but it makes the nullability more prom<strong>in</strong>ent than the name of the typeitself, which is surely not a good idea.In addition, the very name “nullable” suggests that we should be able to assignnull to a variable of a nullable type, and we haven’t seen that—we’ve always used thedefault constructor of the type. In this section we’ll see how <strong>C#</strong> 2 deals with theseissues and others.Before we get <strong>in</strong>to the details of what <strong>C#</strong> 2 provides as a language, there’s one def<strong>in</strong>itionI can f<strong>in</strong>ally <strong>in</strong>troduce. The null value of a nullable type is the value whereHasValue returns false—or an “<strong>in</strong>stance without a value,” as I’ve referred to it <strong>in</strong> section4.2. I didn’t use it before because it’s specific to <strong>C#</strong>. The CLI specification2You’ll learn more about static classes <strong>in</strong> chapter 7.Licensed to Rhona Hadida


<strong>C#</strong> 2’s syntactic sugar for nullable types121doesn’t mention it, and the documentation for Nullable itself doesn’t mention it.I’ve honored that difference by wait<strong>in</strong>g until we’re specifically talk<strong>in</strong>g about <strong>C#</strong> 2itself before <strong>in</strong>troduc<strong>in</strong>g the term.With that out of the way, let’s see what features <strong>C#</strong> 2 gives us, start<strong>in</strong>g by reduc<strong>in</strong>gthe clutter <strong>in</strong> our code.4.3.1 The ? modifierThere are some elements of syntax that may be unfamiliar at first but have an appropriatefeel to them. The conditional operator (a ? b : c) is one of them for me—it asksa question and then has two correspond<strong>in</strong>g answers. In the same way, the ? operatorfor nullable types just feels right to me.It’s a shorthand way of us<strong>in</strong>g a nullable type, so <strong>in</strong>stead of us<strong>in</strong>g Nullable we can use byte? throughout our code. The two are <strong>in</strong>terchangeable and compile toexactly the same IL, so you can mix and match them if you want to, but on behalf ofwhoever reads your code next, I’d urge you to pick one way or the other and use itconsistently. List<strong>in</strong>g 4.3 is exactly equivalent to list<strong>in</strong>g 4.2 but uses the ? modifier.List<strong>in</strong>g 4.3The same code as list<strong>in</strong>g 4.2 but us<strong>in</strong>g the ? modifier<strong>in</strong>t? nullable = 5;object boxed = nullable;Console.WriteL<strong>in</strong>e(boxed.GetType());<strong>in</strong>t normal = (<strong>in</strong>t)boxed;Console.WriteL<strong>in</strong>e(normal);nullable = (<strong>in</strong>t?)boxed;Console.WriteL<strong>in</strong>e(nullable);nullable = new <strong>in</strong>t?();boxed = nullable;Console.WriteL<strong>in</strong>e (boxed==null);nullable = (<strong>in</strong>t?)boxed;Console.WriteL<strong>in</strong>e(nullable.HasValue);I won’t go through what the code does or how it does it, because the result is exactly thesame as list<strong>in</strong>g 4.2. The two list<strong>in</strong>gs compile down to the same IL—they’re just us<strong>in</strong>g differentsyntax, just as us<strong>in</strong>g <strong>in</strong>t is <strong>in</strong>terchangeable with System.Int32. The only changesare the ones <strong>in</strong> bold. You can use the shorthand version everywhere, <strong>in</strong>clud<strong>in</strong>g <strong>in</strong>method signatures, typeof expressions, casts, and the like.The reason I feel the modifier is very well chosen is that it adds an air of uncerta<strong>in</strong>tyto the nature of the variable. Does the variable nullable <strong>in</strong> list<strong>in</strong>g 4.3 have an<strong>in</strong>teger value? Well, at any particular time it might, or it might be the null value. Fromnow on, we’ll use the ? modifier <strong>in</strong> all the examples—it’s neater, and it’s arguably theidiomatic way to use nullable types <strong>in</strong> <strong>C#</strong> 2. However, you may feel that it’s too easy tomiss when read<strong>in</strong>g the code, <strong>in</strong> which case there’s certa<strong>in</strong>ly noth<strong>in</strong>g to stop you fromus<strong>in</strong>g the longer syntax. You may wish to compare the list<strong>in</strong>gs <strong>in</strong> this section and theprevious one to see which you f<strong>in</strong>d clearer.Licensed to Rhona Hadida


122 CHAPTER 4 Say<strong>in</strong>g noth<strong>in</strong>g with nullable typesGiven that the <strong>C#</strong> 2 specification def<strong>in</strong>es the null value, it would be pretty odd if wecouldn’t use the null literal we’ve already got <strong>in</strong> the language <strong>in</strong> order to represent it.Fortunately we can, as our next section will show.4.3.2 Assign<strong>in</strong>g and compar<strong>in</strong>g with nullA very concise author could cover this whole section <strong>in</strong> a s<strong>in</strong>gle sentence: “The <strong>C#</strong>compiler allows the use of null to represent the null value of a nullable type <strong>in</strong> bothcomparisons and assignments.” I prefer to show you what it means <strong>in</strong> real code, as wellas th<strong>in</strong>k about why the language has been given this feature.You may have felt a bit uncomfortable every time we’ve used the default constructorof Nullable. It achieves the desired behavior, but it doesn’t express the reasonwe want to do it—it doesn’t leave the right impression with the reader. We want togive the same sort of feel<strong>in</strong>g that us<strong>in</strong>g null does with reference types. If it seems oddto you that I’ve talked about feel<strong>in</strong>gs <strong>in</strong> both this section and the last one, just th<strong>in</strong>kabout who writes code, and who reads it. Sure, the compiler has to understand thecode, and it couldn’t care less about the subtle nuances of style—but very few piecesof code used <strong>in</strong> production systems are written and then never read aga<strong>in</strong>. Anyth<strong>in</strong>gyou can do to get the reader <strong>in</strong>to the mental process you were go<strong>in</strong>g through whenyou orig<strong>in</strong>ally wrote the code is good—and us<strong>in</strong>g the familiar null literal helps toachieve that.With that <strong>in</strong> m<strong>in</strong>d, we’re go<strong>in</strong>g to change the example we’re us<strong>in</strong>g from one thatjust shows syntax and behavior to one that gives an impression of how nullable typesmight be used. We’ll consider model<strong>in</strong>g a Person class where you need to know thename, date of birth, and date of death of a person. We’ll only keep track of peoplewho have def<strong>in</strong>itely been born, but some of those people may still be alive—<strong>in</strong> whichcase our date of death is represented by null. List<strong>in</strong>g 4.4 shows some of the possiblecode. Although a real class would clearly have more operations available, we’re justlook<strong>in</strong>g at the calculation of age for this example.List<strong>in</strong>g 4.4Part of a Person class <strong>in</strong>clud<strong>in</strong>g calculation of ageclass Person{DateTime birth;DateTime? death;str<strong>in</strong>g name;public TimeSpan Age{get{if (death==null){BChecksHasValuereturn DateTime.Now-birth;}else{return death.Value-birth;CUnwraps forcalculationLicensed to Rhona Hadida


<strong>C#</strong> 2’s syntactic sugar for nullable types123}}}public Person(str<strong>in</strong>g name,DateTime birth,DateTime? death){this.birth = birth;this.death = death;this.name = name;}}...Person tur<strong>in</strong>g = new Person("Alan Tur<strong>in</strong>g",new DateTime(1912, 6, 23),new DateTime(1954, 6, 7));Person knuth = new Person("Donald Knuth",new DateTime(1938, 1, 10),null);List<strong>in</strong>g 4.4 doesn’t produce any output, but the very fact that it compiles might havesurprised you before read<strong>in</strong>g this chapter. Apart from the use of the ? modifier caus<strong>in</strong>gconfusion, you might have found it very odd that you could compare a DateTime?with null, or pass null as the argument for a DateTime? parameter.Hopefully by now the mean<strong>in</strong>g is <strong>in</strong>tuitive—when we compare the death variablewith null, we’re ask<strong>in</strong>g whether its value is the null value or not. Likewise when we usenull as a DateTime? <strong>in</strong>stance, we’re really creat<strong>in</strong>g the null value for the type by call<strong>in</strong>gthe default constructor. Indeed, you can see <strong>in</strong> the generated IL that the code thecompiler spits out for list<strong>in</strong>g 4.4 really does just call the death.HasValue property B,and creates a new <strong>in</strong>stance of DateTime? E us<strong>in</strong>g the default constructor (represented<strong>in</strong> IL as the <strong>in</strong>itobj <strong>in</strong>struction). The date of Alan Tur<strong>in</strong>g’s death D is createdby call<strong>in</strong>g the normal DateTime constructor and then pass<strong>in</strong>g the result <strong>in</strong>to theNullable constructor, which takes a parameter.I mention look<strong>in</strong>g at the IL because that can be a useful way of f<strong>in</strong>d<strong>in</strong>g out whatyour code is actually do<strong>in</strong>g, particularly if someth<strong>in</strong>g compiles when you don’t expectit to. You can use the ildasm tool that comes with the .NET SDK, or for a rather betteruser <strong>in</strong>terface you can use Reflector, 3 which has many other features (most notablydecompilation to high-level languages such as <strong>C#</strong> as well as disassembly to IL).We’ve seen how <strong>C#</strong> provides shorthand syntax for the concept of a null value, mak<strong>in</strong>gthe code more expressive once nullable types are understood <strong>in</strong> the first place. However,one part of list<strong>in</strong>g 4.4 took a bit more work than we might have hoped—the subtractionat C. Why did we have to unwrap the value? Why could we not just return death-birthdirectly? What would we want that expression to mean <strong>in</strong> the case (excluded <strong>in</strong> our codeby our earlier test for null, of course) where death had been null? These questions—and more—are answered <strong>in</strong> our next section.EDWraps DateTimeas a nullableSpecifies a nulldate of death3Available free of charge from http://www.aisto.com/roeder/dotnet/Licensed to Rhona Hadida


124 CHAPTER 4 Say<strong>in</strong>g noth<strong>in</strong>g with nullable types4.3.3 Nullable conversions and operatorsWe’ve seen that we can compare <strong>in</strong>stances of nullable types with null, but there areother comparisons that can be made and other operators that can be used <strong>in</strong> somecases. Likewise we’ve seen wrapp<strong>in</strong>g and unwrapp<strong>in</strong>g, but other conversions can beused with some types. This section expla<strong>in</strong>s what’s available. I’m afraid it’s pretty muchimpossible to make this k<strong>in</strong>d of topic genu<strong>in</strong>ely excit<strong>in</strong>g, but carefully designed featureslike these are what make <strong>C#</strong> a pleasant language to work with <strong>in</strong> the long run.Don’t worry if not all of it s<strong>in</strong>ks <strong>in</strong> the first time: just remember that the details are hereif you need to reference them <strong>in</strong> the middle of a cod<strong>in</strong>g session.The “executive summary” is that if there is an operator or conversion available ona non-nullable value type, and that operator or conversion only <strong>in</strong>volves other nonnullablevalue types, then the nullable value type also has the same operator or conversionavailable, usually convert<strong>in</strong>g the non-nullable value types <strong>in</strong>to their nullableequivalents. To give a more concrete example, there’s an implicit conversion from <strong>in</strong>tto long, and that means there’s also an implicit conversion from <strong>in</strong>t? to long? thatbehaves <strong>in</strong> the obvious manner.Unfortunately, although that broad description gives the right general idea, theexact rules are slightly more complicated. Each one is simple, but there are quite a fewof them. It’s worth know<strong>in</strong>g about them because otherwise you may well end up star<strong>in</strong>gat a compiler error or warn<strong>in</strong>g for a while, wonder<strong>in</strong>g why it believes you’re try<strong>in</strong>gto make a conversion that you never <strong>in</strong>tended <strong>in</strong> the first place. We’ll start with theconversions, and then look at operators.CONVERSIONS INVOLVING NULLABLE TYPESFor completeness, let’s start with the conversions we already know about:■ An implicit conversion from the null literal to T?■ An implicit conversion from T to T?■An explicit conversion from T? to TNow consider the predef<strong>in</strong>ed and user-def<strong>in</strong>ed conversions available on types. For<strong>in</strong>stance, there is a predef<strong>in</strong>ed conversion from <strong>in</strong>t to long. For any conversion likethis, from one non-nullable value type (S) to another (T), the follow<strong>in</strong>g conversionsare also available:■ S? to T? (explicit or implicit depend<strong>in</strong>g on orig<strong>in</strong>al conversion)■ S to T? (explicit or implicit depend<strong>in</strong>g on orig<strong>in</strong>al conversion)■ S? to T (always explicit)To carry our example forward, this means that you can convert implicitly from <strong>in</strong>t? tolong? and from <strong>in</strong>t to long? as well as explicitly from long? to <strong>in</strong>t. The conversionsbehave <strong>in</strong> the natural way, with null values of S? convert<strong>in</strong>g to null values of T?, andnon-null values us<strong>in</strong>g the orig<strong>in</strong>al conversion. As before, the explicit conversion fromS? to T will throw an InvalidOperationException when convert<strong>in</strong>g from a null valueof S?. For user-def<strong>in</strong>ed conversions, these extra conversions <strong>in</strong>volv<strong>in</strong>g nullable typesare known as lifted conversions.Licensed to Rhona Hadida


<strong>C#</strong> 2’s syntactic sugar for nullable types125So far, so relatively simple. Now let’s consider the operators, where th<strong>in</strong>gs areslightly more tricky.OPERATORS INVOLVING NULLABLE TYPES<strong>C#</strong> allows the follow<strong>in</strong>g operators to be overloaded:■ Unary: + ++ - -- ! ~ true false■ B<strong>in</strong>ary: + - * / % & | ^ >■ Equality: 4 == !=■ Relational: < > =When these operators are overloaded for a non-nullable value type T, the nullable typeT? has the same operators, with slightly different operand and result types. These arecalled lifted operators whether they’re predef<strong>in</strong>ed operators like addition on numerictypes, or user-def<strong>in</strong>ed operators like add<strong>in</strong>g a TimeSpan to a DateTime. There are a fewrestrictions as to when they apply:■ The true and false operators are never lifted. They’re <strong>in</strong>credibly rare <strong>in</strong> thefirst place, though, so it’s no great loss.■ Only operators with non-nullable value types for the operands are lifted.■ For the unary and b<strong>in</strong>ary operators (other than equality and relational operators),the return type has to be a non-nullable value type.■ For the equality and relational operators, the return type has to be bool.■ The & and | operators on bool? have separately def<strong>in</strong>ed behavior, which we’llsee <strong>in</strong> section 4.3.6.For all the operators, the operand types become their nullable equivalents. For theunary and b<strong>in</strong>ary operators, the return type also becomes nullable, and a null value isreturned if any of the operands is a null value. The equality and relational operatorskeep their non-nullable Boolean return types. For equality, two null values are consideredequal, and a null value and any non-null value are considered different, which isconsistent with the behavior we saw <strong>in</strong> section 4.2.3. The relational operators alwaysreturn false if either operand is a null value. When none of the operands is a null value,the operator of the non-nullable type is <strong>in</strong>voked <strong>in</strong> the obvious way.All these rules sound more complicated than they really are—for the most part,everyth<strong>in</strong>g works as you probably expect it to. It’s easiest to see what happens with afew examples, and as <strong>in</strong>t has so many predef<strong>in</strong>ed operators (and <strong>in</strong>tegers can be soeasily expressed), it’s the natural demonstration type. Table 4.1 shows a number ofexpressions, the lifted operator signature, and the result. It is assumed that there arevariables four, five, and nullInt, each with type <strong>in</strong>t? and with the obvious values.Possibly the most surpris<strong>in</strong>g l<strong>in</strong>e of the table is the bottom one—that a null valueisn’t deemed “less than or equal to” another null value, even though they are deemed4The equality and relational operators are, of course, b<strong>in</strong>ary operators themselves, but we’ll see that theybehave slightly differently to the others, hence their separation here.Licensed to Rhona Hadida


126 CHAPTER 4 Say<strong>in</strong>g noth<strong>in</strong>g with nullable typesTable 4.1Examples of lifted operators applied to nullable <strong>in</strong>tegersExpression Lifted operator Result-nullInt <strong>in</strong>t? –(<strong>in</strong>t? x) null-five <strong>in</strong>t? –(<strong>in</strong>t? x) -5five + nullInt <strong>in</strong>t? +(<strong>in</strong>t? x, <strong>in</strong>t? y) nullfive + five <strong>in</strong>t? +(<strong>in</strong>t? x, <strong>in</strong>t? y) 10nullInt == nullInt bool ==(<strong>in</strong>t? x, <strong>in</strong>t? y) truefive == five bool ==(<strong>in</strong>t? x, <strong>in</strong>t? y) truefive == nullInt bool ==(<strong>in</strong>t? x, <strong>in</strong>t? y) falsefive == four bool ==(<strong>in</strong>t? x, <strong>in</strong>t? y) falsefour < five bool


<strong>C#</strong> 2’s syntactic sugar for nullable types127Now we’re able to answer the question at the end of the previous section—why weused death.Value-birth <strong>in</strong> list<strong>in</strong>g 4.4 <strong>in</strong>stead of just death-birth. Apply<strong>in</strong>g the previousrules, we could have used the latter expression, but the result would have been aTimeSpan? <strong>in</strong>stead of a TimeSpan. This would have left us with the options of cast<strong>in</strong>g theresult to TimeSpan, us<strong>in</strong>g its Value property, or chang<strong>in</strong>g the Age property to return aTimeSpan?—which just pushes the issue onto the caller. It’s still a bit ugly, but we’ll seea nicer implementation of the Age property <strong>in</strong> section 4.3.5.In the list of restrictions regard<strong>in</strong>g operator lift<strong>in</strong>g, I mentioned that bool? worksslightly differently than the other types. Our next section expla<strong>in</strong>s this and pulls backthe lens to see the bigger picture of why all these operators work the way they do.4.3.4 Nullable logicI vividly remember my early electronics lessons at school. They always seemed torevolve around either work<strong>in</strong>g out the voltage across different parts of a circuit us<strong>in</strong>gthe V=IR formula, or apply<strong>in</strong>g truth tables—the reference charts for expla<strong>in</strong><strong>in</strong>g the differencebetween NAND gates and NOR gates and so on. The idea is simple—a truthtable maps out every possible comb<strong>in</strong>ation of <strong>in</strong>puts <strong>in</strong>to whatever piece of logicyou’re <strong>in</strong>terested <strong>in</strong> and tells you the output.The truth tables we drew for simple, two-<strong>in</strong>put logic gates always had four rows—each of the two <strong>in</strong>puts had two possible values, which means there were four possiblecomb<strong>in</strong>ations. Boolean logic is simple like that—but what should happen when you’vegot a tristate logical type? Well, bool? is just such a type—the value can be true,false, or null. That means that our truth tables now have to have n<strong>in</strong>e rows for ourb<strong>in</strong>ary operators, as there are n<strong>in</strong>e comb<strong>in</strong>ations. The specification only highlightsthe logical AND and <strong>in</strong>clusive OR operators (& and |, respectively) because the otheroperators—unary logical negation (!) and exclusive OR (^)—follow the same rules asother lifted operators. There are no conditional logical operators (the short-circuit<strong>in</strong>g&& and || operators) def<strong>in</strong>ed for bool?, which makes life simpler.For the sake of completeness, table 4.2 gives the truth tables for all four valid bool?operators.Table 4.2Truth table for the logical operators AND, <strong>in</strong>clusive OR, exclusive OR, and logicalnegation, applied to the bool? typex y x & y x | y x ^ y !xtrue true true true false falsetrue false false true true falsetrue null null true null falsefalse true false true true truefalse false false false false truefalse null false null null trueLicensed to Rhona Hadida


128 CHAPTER 4 Say<strong>in</strong>g noth<strong>in</strong>g with nullable typesTable 4.2Truth table for the logical operators AND, <strong>in</strong>clusive OR, exclusive OR, and logicalnegation, applied to the bool? type (cont<strong>in</strong>ued)x y x & y x | y x ^ y !xnull true null true null nullnull false false null null nullnull null null null null nullFor those who f<strong>in</strong>d reason<strong>in</strong>g about rules easier to understand than look<strong>in</strong>g up values<strong>in</strong> tables, the idea is that a null bool? value is <strong>in</strong> some senses a “maybe.” If you imag<strong>in</strong>ethat each null entry <strong>in</strong> the <strong>in</strong>put side of the table is a variable <strong>in</strong>stead, then you willalways get a null value on the output side of the table if the result depends on thevalue of that variable. For <strong>in</strong>stance, look<strong>in</strong>g at the third l<strong>in</strong>e of the table, the expressiontrue & y will only be true if y is true, but the expression true | y will always betrue whatever the value of y is, so the nullable results are null and true, respectively.When consider<strong>in</strong>g the lifted operators and particularly how nullable logic works, thelanguage designers had two slightly contradictory sets of exist<strong>in</strong>g behavior—<strong>C#</strong> 1 nullreferences and SQL NULL values. In many cases these don’t conflict at all—<strong>C#</strong> 1 had noconcept of apply<strong>in</strong>g logical operators to null references, so there was no problem <strong>in</strong>us<strong>in</strong>g the SQL-like results given earlier. The def<strong>in</strong>itions we’ve seen may surprise someSQL developers, however, when it comes to comparisons. In standard SQL, the result ofcompar<strong>in</strong>g two values (<strong>in</strong> terms of equality or greater than/less than) is always unknownif either value is NULL. The result <strong>in</strong> <strong>C#</strong> 2 is never null, and <strong>in</strong> particular two null valuesare considered to be equal to each other.NOTE Rem<strong>in</strong>der: this is <strong>C#</strong> specific! It’s worth remember<strong>in</strong>g that the lifted operatorsand conversions, along with the bool? logic described <strong>in</strong> this section,are all provided by the <strong>C#</strong> compiler and not by the CLR or the frameworkitself. If you use ildasm on code that evaluates any of these nullable operators,you’ll f<strong>in</strong>d that the compiler has created all the appropriate IL totest for null values and deal with them accord<strong>in</strong>gly. This means that differentlanguages can behave differently on these matters—def<strong>in</strong>itelysometh<strong>in</strong>g to look out for if you need to port code between different.NET-based languages.We now certa<strong>in</strong>ly know enough to use nullable types and predict how they’ll behave,but <strong>C#</strong> 2 has a sort of “bonus track” when it comes to syntax enhancements: the nullcoalesc<strong>in</strong>g operator.4.3.5 The null coalesc<strong>in</strong>g operatorAside from the ? modifier, all of the rest of the <strong>C#</strong> compiler’s tricks so far to do withnullable types have worked with the exist<strong>in</strong>g syntax. However, <strong>C#</strong> 2 <strong>in</strong>troduces a newoperator that can occasionally make code shorter and sweeter. It’s called the null coalesc<strong>in</strong>goperator and appears <strong>in</strong> code as ?? between its two operands. It’s a bit like theconditional operator but specially tweaked for nulls.Licensed to Rhona Hadida


<strong>C#</strong> 2’s syntactic sugar for nullable types129It’s a b<strong>in</strong>ary operator that evaluates first ?? second by go<strong>in</strong>g through the follow<strong>in</strong>gsteps (roughly speak<strong>in</strong>g):1 Evaluate first.2 If the result is non-null, that’s the result of the whole expression.3 Otherwise, evaluate second; the result then becomes the result of the wholeexpression.I say “roughly speak<strong>in</strong>g” because the formal rules <strong>in</strong> the specification <strong>in</strong>volve lots of situationswhere there are conversions <strong>in</strong>volved between the types of first and second.As ever, these aren’t important <strong>in</strong> most uses of the operator, and I don’t <strong>in</strong>tend to gothrough them—consult the specification if you need the details.Importantly, if the type of the second operand is the underly<strong>in</strong>g type of the firstoperand (and therefore non-nullable), then the overall result is that underly<strong>in</strong>g type.Let’s take a look at a practical use for this by revisit<strong>in</strong>g the Age property of list<strong>in</strong>g 4.4.As a rem<strong>in</strong>der, here’s how it was implemented back then, along with the relevant variabledeclarations:DateTime birth;DateTime? death;public TimeSpan Age{get{if (death==null){return DateTime.Now-birth;}else{return death.Value-birth;}}}Note how both branches of the if statement subtract the value of birth from some nonnullDateTime value. The value we’re <strong>in</strong>terested <strong>in</strong> is the latest time the person wasalive—the time of the person’s death if he or she has already died, or now otherwise. Tomake progress <strong>in</strong> little steps, let’s try just us<strong>in</strong>g the normal conditional operator first:DateTime lastAlive = (death==null ? DateTime.Now : death.Value);return lastAlive–birth;That’s progress of a sort, but arguably the conditional operator has actually made itharder to read rather than easier, even though the new code is shorter. The conditionaloperator is often like that—how much you use it is a matter of personal preference,although it’s worth consult<strong>in</strong>g the rest of your team before us<strong>in</strong>g it extensively. Let’s seehow the null coalesc<strong>in</strong>g operator improves th<strong>in</strong>gs. We want to use the value of death ifit’s non-null, and DateTime.Now otherwise. We can change the implementation toDateTime lastAlive = death ?? DateTime.Now;return lastAlive–birth;Licensed to Rhona Hadida


130 CHAPTER 4 Say<strong>in</strong>g noth<strong>in</strong>g with nullable typesNote how the type of the result is DateTime rather than DateTime? because we’veused DateTime.Now as the second operand. We could shorten the whole th<strong>in</strong>g to oneexpression:return (death ?? DateTime.Now)-birth;However, this is a bit more obscure—<strong>in</strong> particular, <strong>in</strong> the two-l<strong>in</strong>e version the name ofthe lastAlive variable helps the reader to see why we’re apply<strong>in</strong>g the null coalesc<strong>in</strong>goperator. I hope you agree that the two-l<strong>in</strong>e version is simpler and more readable thaneither the orig<strong>in</strong>al version us<strong>in</strong>g the if statement or the version us<strong>in</strong>g the normal conditionaloperator from <strong>C#</strong> 1. Of course, it relies on the reader understand<strong>in</strong>g what thenull coalesc<strong>in</strong>g operator does. In my experience, this is one of the least well-knownaspects of <strong>C#</strong> 2, but it’s useful enough to make it worth try<strong>in</strong>g to enlighten yourcoworkers rather than avoid<strong>in</strong>g it.There are two further aspects that <strong>in</strong>crease the operator’s usefulness, too. First, itdoesn’t just apply to nullable types—reference types can also be used; you just can’tuse a non-nullable value type for the first operand as that would be po<strong>in</strong>tless. Also, it’sright associative, which means an expression of the form first ?? second ?? third isevaluated as first ?? (second ?? third)—and so it cont<strong>in</strong>ues for more operands.You can have any number of expressions, and they’ll be evaluated <strong>in</strong> order, stopp<strong>in</strong>gwith the first non-null result. If all of the expressions evaluate to null, the result will benull too.As a concrete example of this, suppose you have an onl<strong>in</strong>e order<strong>in</strong>g system (andwho doesn’t these days?) with the concepts of a bill<strong>in</strong>g address, contact address, andshipp<strong>in</strong>g address. The bus<strong>in</strong>ess rules declare that any user must have a bill<strong>in</strong>g address,but the contact address is optional. The shipp<strong>in</strong>g address for a particular order is alsooptional, default<strong>in</strong>g to the bill<strong>in</strong>g address. These “optional” addresses are easily representedas null references <strong>in</strong> the code. To work out who to contact <strong>in</strong> the case of aproblem with a shipment, the code <strong>in</strong> <strong>C#</strong> 1 might look someth<strong>in</strong>g like this:Address contact = user.ContactAddress;if (contact==null){contact = order.Shipp<strong>in</strong>gAddress;if (contact==null){contact = user.Bill<strong>in</strong>gAddress;}}Us<strong>in</strong>g the conditional operator <strong>in</strong> this case is even more horrible. Us<strong>in</strong>g the null coalesc<strong>in</strong>goperator, however, makes the code very straightforward:Address contact = user.ContactAddress ??order.Shipp<strong>in</strong>gAddress ??user.Bill<strong>in</strong>gAddress;If the bus<strong>in</strong>ess rules changed to use the shipp<strong>in</strong>g address by default <strong>in</strong>stead of theuser’s contact address, the change here would be extremely obvious. It wouldn’t beLicensed to Rhona Hadida


Novel uses of nullable types131particularly tax<strong>in</strong>g with the if/else version, but I know I’d have to stop and th<strong>in</strong>ktwice, and verify the code mentally. I’d also be rely<strong>in</strong>g on unit tests, so there’d be relativelylittle chance of me actually gett<strong>in</strong>g it wrong, but I’d prefer not to th<strong>in</strong>k aboutth<strong>in</strong>gs like this unless I absolutely have to.NOTEEveryth<strong>in</strong>g <strong>in</strong> moderation—Just <strong>in</strong> case you may be th<strong>in</strong>k<strong>in</strong>g that my code islittered with uses of the null coalesc<strong>in</strong>g operator, it’s really not. I tend toconsider it when I see default<strong>in</strong>g mechanisms <strong>in</strong>volv<strong>in</strong>g nulls and possiblythe conditional operator, but it doesn’t come up very often. When its useis natural, however, it can be a powerful tool <strong>in</strong> the battle for readability.We’ve seen how nullable types can be used for “ord<strong>in</strong>ary” properties of objects—caseswhere we just naturally might not have a value for some particular aspect that is stillbest expressed with a value type. Those are the more obvious uses for nullable typesand <strong>in</strong>deed the most common ones. However, a few patterns aren’t as obvious but canstill be powerful when you’re used to them. We’ll explore two of these patterns <strong>in</strong> ournext section. This is more for the sake of <strong>in</strong>terest than as part of learn<strong>in</strong>g about thebehavior of nullable types themselves—you now have all the tools you need to usethem <strong>in</strong> your own code. If you’re <strong>in</strong>terested <strong>in</strong> quirky ideas and perhaps try<strong>in</strong>g someth<strong>in</strong>gnew, however, read on…4.4 Novel uses of nullable typesBefore nullable types became a reality, I saw lots of people effectively ask<strong>in</strong>g for them,usually related to database access. That’s not the only use they can be put to, however.The patterns presented <strong>in</strong> this section are slightly unconventional but can make codesimpler. If you only ever stick to “normal” idioms of <strong>C#</strong>, that’s absolutely f<strong>in</strong>e—this sectionmight not be for you, and I have a lot of sympathy with that po<strong>in</strong>t of view. I usuallyprefer simple code over code that is “clever”—but if a whole pattern provides benefitswhen it’s known, that sometimes makes the pattern worth learn<strong>in</strong>g. Whether or notyou use these techniques is of course entirely up to you—but you may f<strong>in</strong>d that theysuggest other ideas to use elsewhere <strong>in</strong> your code. Without further ado, let’s start withan alternative to the TryXXX pattern mentioned <strong>in</strong> section 3.2.6.4.4.1 Try<strong>in</strong>g an operation without us<strong>in</strong>g output parametersThe pattern of us<strong>in</strong>g a return value to say whether or not an operation worked, andan output parameter to return the real result, is becom<strong>in</strong>g an <strong>in</strong>creas<strong>in</strong>gly commonone <strong>in</strong> the .NET Framework. I have no issues with the aims—the idea that somemethods are likely to fail to perform their primary purpose <strong>in</strong> nonexceptional circumstancesis common sense. My one problem with it is that I’m just not a huge fanof output parameters. There’s someth<strong>in</strong>g slightly clumsy about the syntax of declar<strong>in</strong>ga variable on one l<strong>in</strong>e, then immediately us<strong>in</strong>g it as an output parameter.Methods return<strong>in</strong>g reference types have often used a pattern of return<strong>in</strong>g nullon failure and non-null on success. It doesn’t work so well when null is a valid returnvalue <strong>in</strong> the success case. Hashtable is an example of both of these statements, <strong>in</strong> aLicensed to Rhona Hadida


132 CHAPTER 4 Say<strong>in</strong>g noth<strong>in</strong>g with nullable typesslightly ambivalent way. You see, null is a theoretically valid value <strong>in</strong> a Hashtable, but<strong>in</strong> my experience most uses of Hashtable never use null values, which makes it perfectlyacceptable to have code that assumes that a null value means a miss<strong>in</strong>g key.One common scenario is to have each value of the Hashtable as a list: the first timean item is added for a particular key, a new list is created and the item added to it.Thereafter, add<strong>in</strong>g another item for the same key <strong>in</strong>volves add<strong>in</strong>g the item to theexist<strong>in</strong>g list. Here’s the code <strong>in</strong> <strong>C#</strong> 1:ArrayList list = hash[key];if (list==null){list = new ArrayList();hash[key] = list;}list.Add(newItem);Hopefully you’d use variable names more specific to your situation, but I’m sure youget the idea and may well have used the pattern yourself. 5 With nullable types, this patterncan be extended to value types—and <strong>in</strong> fact, it’s safer with value types, because ifthe natural result type is a value type, then a null value could only be returned as a failurecase. Nullable types add that extra Boolean piece of <strong>in</strong>formation <strong>in</strong> a nice genericway with language support—so why not use them?To demonstrate this pattern <strong>in</strong> practice and <strong>in</strong> a context other than dictionarylookups, I’ll use the classic example of the TryXXX pattern—pars<strong>in</strong>g an <strong>in</strong>teger. Theimplementation of the TryParse method <strong>in</strong> list<strong>in</strong>g 4.5 shows the version of the patternus<strong>in</strong>g an output parameter, but then we see the use of the version us<strong>in</strong>g nullabletypes <strong>in</strong> the ma<strong>in</strong> part at the bottom.List<strong>in</strong>g 4.5An alternative implementation of the TryXXX patternstatic <strong>in</strong>t? TryParse (str<strong>in</strong>g data){<strong>in</strong>t ret;Classic call withif (<strong>in</strong>t.TryParse(data, out ret)) output parameter{return ret;}else{return null;}}...Nullable<strong>in</strong>t? parsed = TryParse("Not valid"); callif (parsed != null){Console.WriteL<strong>in</strong>e ("Parsed to {0}", parsed.Value);}5Wouldn’t it be great if Hashtable and Dictionary could take a delegate to call whenever anew value was required due to look<strong>in</strong>g up a miss<strong>in</strong>g key? Situations like this would be a lot simpler.Licensed to Rhona Hadida


Novel uses of nullable types133else{Console.WriteL<strong>in</strong>e ("Couldn't parse");}You may well th<strong>in</strong>k there’s very little to dist<strong>in</strong>guish the two versions here—they’re thesame number of l<strong>in</strong>es, after all. However, I believe there’s a difference <strong>in</strong> emphasis. Thenullable version encapsulates the natural return value and the success or failure <strong>in</strong>to as<strong>in</strong>gle variable. It also separates the “do<strong>in</strong>g” from the “test<strong>in</strong>g,” which puts the emphasis<strong>in</strong> the right place <strong>in</strong> my op<strong>in</strong>ion. Usually, if I call a method <strong>in</strong> the condition part of anif statement, that method’s primary purpose is to return a Boolean value. Here, thereturn value is <strong>in</strong> some ways less important than the output parameter. When you’reread<strong>in</strong>g code, it’s easy to miss an output parameter <strong>in</strong> a method call and be left wonder<strong>in</strong>gwhat’s actually do<strong>in</strong>g all the work and magically giv<strong>in</strong>g the answer. With the nullableversion, this is more explicit—the result of the method has all the <strong>in</strong>formationwe’re <strong>in</strong>terested <strong>in</strong>. I’ve used this technique <strong>in</strong> a number of places (often with rathermore method parameters, at which po<strong>in</strong>t output parameters become even harder tospot) and believe it has improved the general feel of the code. Of course, this only worksfor value types.Another advantage of this pattern is that it can be used <strong>in</strong> conjunction with thenull coalesc<strong>in</strong>g operator—you can try to understand several pieces of <strong>in</strong>put, stopp<strong>in</strong>gat the first valid one. The normal TryXXX pattern allows this us<strong>in</strong>g the short-circuit<strong>in</strong>goperators, but the mean<strong>in</strong>g isn’t nearly as clear when you use the same variable fortwo different output parameters <strong>in</strong> the same statement.The next pattern is an answer to a specific pa<strong>in</strong> po<strong>in</strong>t—the irritation and fluff thatcan be present when writ<strong>in</strong>g multitiered comparisons.4.4.2 Pa<strong>in</strong>less comparisons with the null coalesc<strong>in</strong>g operatorI suspect you dislike writ<strong>in</strong>g the same code over and over aga<strong>in</strong> as much as I do. Refactor<strong>in</strong>gcan often get rid of duplication, but there are some cases that resist refactor<strong>in</strong>gsurpris<strong>in</strong>gly effectively. Code for Equals and Compare often falls firmly <strong>in</strong>to this category<strong>in</strong> my experience.Suppose you are writ<strong>in</strong>g an e-commerce site and have a list of products. You maywish to sort them by popularity (descend<strong>in</strong>g), then price, then name—so that the fivestar-ratedproducts come first, but the cheapest with<strong>in</strong> those come before the moreexpensive ones. If there are multiple products with the same price, products beg<strong>in</strong>n<strong>in</strong>gwith A are listed before products beg<strong>in</strong>n<strong>in</strong>g with B. This isn’t a problem specificto e-commerce sites—sort<strong>in</strong>g data by multiple criteria is a fairly common requirement<strong>in</strong> comput<strong>in</strong>g.Assum<strong>in</strong>g we have a suitable Product type, we can write the comparison with codelike this <strong>in</strong> <strong>C#</strong> 1:public <strong>in</strong>t Compare(Product first, Product second){// Reverse comparison of popularity to sort descend<strong>in</strong>g<strong>in</strong>t ret = second.Popularity.CompareTo(first.Popularity);Licensed to Rhona Hadida


134 CHAPTER 4 Say<strong>in</strong>g noth<strong>in</strong>g with nullable types}if (ret != 0){return ret;}ret = first.Price.CompareTo(second.Price);if (ret != 0){return ret;}return first.Name.CompareTo(second.Name);This assumes that we won’t be asked to compare null references, and that all of theproperties will return non-null references too. We could use some up-front null comparisonsand Comparer.Default to handle those cases, but that would make thecode even longer and more <strong>in</strong>volved. The code could be shorter (and avoid return<strong>in</strong>gfrom the middle of the method) by rearrang<strong>in</strong>g it slightly, but the fundamental “compare,check, compare, check” pattern would still be present, and it wouldn’t be asobvious that once we’ve got a nonzero answer, we’re done.Ah… now, that last sentence is rem<strong>in</strong>iscent of someth<strong>in</strong>g else: the null coalesc<strong>in</strong>goperator. As we saw <strong>in</strong> section 4.3, if we have a lot of expressions separated by ?? thenthe operator will be repeatedly applied until it hits a non-null expression. Now all we’vegot to do is work out a way of return<strong>in</strong>g null <strong>in</strong>stead of zero from a comparison. This iseasy to do <strong>in</strong> a separate method, and that can also encapsulate the use of the defaultcomparer. We can even have an overload to use a specific comparer if we want. We’llalso deal with the case where either of the Product references we’re passed is null. First,let’s look at the class implement<strong>in</strong>g our helper methods, as shown <strong>in</strong> list<strong>in</strong>g 4.6.List<strong>in</strong>g 4.6Helper class for provid<strong>in</strong>g “partial comparisons”public static class PartialComparer{public static <strong>in</strong>t? Compare(T first, T second){return Compare(Comparer.Default, first, second);}public static <strong>in</strong>t? Compare(IComparer comparer,T first,T second){<strong>in</strong>t ret = comparer.Compare(first, second);if (ret == 0){return null;}return ret;}public static <strong>in</strong>t? ReferenceCompare(T first, T second)where T : classLicensed to Rhona Hadida


Novel uses of nullable types135}{}if (first==second){return 0;}if (first==null){return -1;}if (second==null){return 1;}return null;The Compare methods <strong>in</strong> list<strong>in</strong>g 4.6 are almost pathetically simple—when a comparerisn’t specified, the default comparer for the type is used, and all that happens to thecomparison’s return value is that zero is translated to null. The ReferenceComparemethod is longer but still very straightforward: it basically returns the correct comparisonresult (–1, 0, or 1) if it can tell the result just from the references, and nullotherwise. Even though this class is simple, it’s remarkably useful. We can nowreplace our previous product comparison with a neater implementation:public <strong>in</strong>t Compare(Product first, Product second){return PC.ReferenceCompare (first, second) ??// Reverse comparison of popularity to sort descend<strong>in</strong>gPC.Compare (second.Popularity, first.Popularity) ??PC.Compare (first.Price, second.Price) ??PC.Compare (first.Name, second.Name) ??0;}As you may have noticed, I’ve used PC rather than PartialComparer—this is solelyfor the sake of be<strong>in</strong>g able to fit the l<strong>in</strong>es on the pr<strong>in</strong>ted page. In real source I woulduse the full type name and have one comparison per l<strong>in</strong>e. Of course, if you wantedshort l<strong>in</strong>es for some reason, you could specify a us<strong>in</strong>g directive to make PC an aliasfor PartialComparer—I just wouldn’t recommend it.The f<strong>in</strong>al 0 is to <strong>in</strong>dicate that if all of the earlier comparisons have passed, the twoProduct <strong>in</strong>stances are equal. We could have just used Comparer.Default.Compare(first.Name, second.Name) as the f<strong>in</strong>al comparison, but that would hurt thesymmetry of the method.This comparison plays nicely with nulls, is easy to modify, forms an easy pattern touse for other comparisons, and only compares as far as it needs to: if the prices are different,the names won’t be compared.You may be wonder<strong>in</strong>g whether the same technique could be applied to equalitytests, which often have similar patterns. There’s much less po<strong>in</strong>t <strong>in</strong> the case of equality,because after the nullity and reference equality tests, you can just use && to provide theLicensed to Rhona Hadida


136 CHAPTER 4 Say<strong>in</strong>g noth<strong>in</strong>g with nullable typesdesired short-circuit<strong>in</strong>g functionality for Booleans. A method return<strong>in</strong>g a bool? can beused to obta<strong>in</strong> an <strong>in</strong>itial def<strong>in</strong>itely equal, def<strong>in</strong>itely not equal or unknown result based on thereferences, however. The complete code of PartialComparer on this book’s websiteconta<strong>in</strong>s the appropriate utility method and examples of its use.4.5 SummaryWhen faced with a problem, developers tend to take the easiest short-term solution,even if it’s not particularly elegant. That’s often exactly the right decision—we don’twant to be guilty of overeng<strong>in</strong>eer<strong>in</strong>g, after all. However, it’s always nice when a goodsolution is also the easiest solution.Nullable types solve a very specific problem that only had somewhat ugly solutionsbefore <strong>C#</strong> 2. The features provided are just a better-supported version of a solutionthat was feasible but time-consum<strong>in</strong>g <strong>in</strong> <strong>C#</strong> 1. The comb<strong>in</strong>ation of generics (to avoidcode duplication), CLR support (to provide suitable box<strong>in</strong>g and unbox<strong>in</strong>g behavior),and language support (to provide concise syntax along with convenient conversionsand operators) makes the solution far more compell<strong>in</strong>g than it was previously.It so happens that <strong>in</strong> provid<strong>in</strong>g nullable types, the <strong>C#</strong> and Framework designershave made some other patterns available that just weren’t worth the effort before.We’ve looked at some of them <strong>in</strong> this chapter, and I wouldn’t be at all surprised to seemore of them appear<strong>in</strong>g over time.So far our two new features (generics and nullable types) have addressed areaswhere <strong>in</strong> <strong>C#</strong> 1 we occasionally had to hold our noses due to unpleasant code smells.This pattern cont<strong>in</strong>ues <strong>in</strong> the next chapter, where we discuss the enhancements to delegates.These form an important part of the subtle change of direction of both the <strong>C#</strong>language and the .NET Framework, toward a slightly more functional viewpo<strong>in</strong>t. Thisemphasis is made even clearer <strong>in</strong> <strong>C#</strong> 3, so while we’re not look<strong>in</strong>g at those features quiteyet, the delegate enhancements <strong>in</strong> <strong>C#</strong> 2 act as a bridge between the familiarity of <strong>C#</strong> 1and the potentially revolutionary style of <strong>C#</strong> 3.Licensed to Rhona Hadida


Fast-tracked delegatesThis chapter covers■■■■■Longw<strong>in</strong>ded <strong>C#</strong> 1 syntaxSimplified delegate constructionCovariance and contravarianceAnonymous methodsCaptured variablesThe journey of delegates <strong>in</strong> <strong>C#</strong> and .NET is an <strong>in</strong>terest<strong>in</strong>g one, show<strong>in</strong>g remarkableforesight (or really good luck) on the part of the designers. The conventions suggestedfor event handlers <strong>in</strong> .NET 1.0/1.1 didn’t make an awful lot of sense—until<strong>C#</strong> 2 showed up. Likewise, the effort put <strong>in</strong>to delegates for <strong>C#</strong> 2 seems <strong>in</strong> some waysout of proportion to how widely used they are—until you see how pervasive theyare <strong>in</strong> idiomatic <strong>C#</strong> 3 code. In other words, it’s as if the language and platformdesigners had a vision of at least the rough direction they would be tak<strong>in</strong>g, yearsbefore the dest<strong>in</strong>ation itself became clear.Of course, <strong>C#</strong> 3 is not a “f<strong>in</strong>al dest<strong>in</strong>ation” <strong>in</strong> itself, and we may be see<strong>in</strong>g furtheradvances for delegates <strong>in</strong> the future—but the differences between <strong>C#</strong> 1 and<strong>C#</strong> 3 <strong>in</strong> this area are startl<strong>in</strong>g. (The primary change <strong>in</strong> <strong>C#</strong> 3 support<strong>in</strong>g delegates is<strong>in</strong> lambda expressions, which we’ll meet <strong>in</strong> chapter 9.)137Licensed to Rhona Hadida


138 CHAPTER 5 Fast-tracked delegates<strong>C#</strong> 2 is a sort of stepp<strong>in</strong>g stone <strong>in</strong> terms of delegates. Its new features pave the wayfor the even more dramatic changes of <strong>C#</strong> 3, keep<strong>in</strong>g developers reasonably comfortablewhile still provid<strong>in</strong>g useful benefits. The extent to which this was a f<strong>in</strong>ely balancedact as opposed to <strong>in</strong>tuition and a follow<strong>in</strong>g w<strong>in</strong>d is likely to stay unknown, butwe can certa<strong>in</strong>ly reap the benefits.Delegates play a more prom<strong>in</strong>ent part <strong>in</strong> .NET 2.0 than <strong>in</strong> earlier versions,although they’re not as common as they are <strong>in</strong> .NET 3.5. In chapter 3 we saw how theycan be used to convert from a list of one type to a list of another type, and way back <strong>in</strong>chapter 1 we sorted a list of products us<strong>in</strong>g the Comparison delegate <strong>in</strong>stead of theIComparer <strong>in</strong>terface. Although the framework and <strong>C#</strong> keep a respectful distance fromeach other where possible, I believe that the language and platform drove each otherhere: the <strong>in</strong>clusion of more delegate-based API calls supported the improved syntaxavailable <strong>in</strong> <strong>C#</strong> 2, and vice versa.In this chapter we’ll see how <strong>C#</strong> 2 makes two small changes that make life easierwhen creat<strong>in</strong>g delegate <strong>in</strong>stances from normal methods, and then we’ll look at thebiggest change: anonymous methods, which allow you to specify a delegate <strong>in</strong>stance’saction <strong>in</strong>l<strong>in</strong>e at the po<strong>in</strong>t of its creation. The largest section of the chapter is devotedto the most complicated part of anonymous methods, captured variables, which providedelegate <strong>in</strong>stances with a richer environment to play <strong>in</strong>. We’ll cover the topic <strong>in</strong>significant detail due to its importance and complexity.First, though, let’s rem<strong>in</strong>d ourselves of the pa<strong>in</strong> po<strong>in</strong>ts of <strong>C#</strong> 1’s delegate facilities.5.1 Say<strong>in</strong>g goodbye to awkward delegate syntaxThe syntax for delegates <strong>in</strong> <strong>C#</strong> 1 doesn’t sound too bad—the language already has syntacticsugar around Delegate.Comb<strong>in</strong>e, Delegate.Remove, and the <strong>in</strong>vocation of delegate<strong>in</strong>stances. It makes sense to specify the delegate type when creat<strong>in</strong>g a delegate<strong>in</strong>stance—it’s the same syntax used to create <strong>in</strong>stances of other types, after all.This is all true, but for some reason it also sucks. It’s hard to say exactly why the delegatecreation expressions of <strong>C#</strong> 1 raise hackles, but they do—at least for me. Whenhook<strong>in</strong>g up a bunch of event handlers, it just looks ugly to have to write “newEventHandler” (or whatever is required) all over the place, when the event itself hasspecified which delegate type it will use. Beauty is <strong>in</strong> the eye of the beholder, of course,and you could argue that there’s less call for guesswork when read<strong>in</strong>g event handler wir<strong>in</strong>gcode <strong>in</strong> the <strong>C#</strong> 1 style, but the extra text just gets <strong>in</strong> the way and distracts from theimportant part of the code: which method you want to handle the event.Life becomes a bit more black and white when you consider covariance and contravarianceas applied to delegates. Suppose you’ve got an event handl<strong>in</strong>g methodthat saves the current document, or just logs that it’s been called, or any number ofother actions that may well not need to know details of the event. The event itselfshouldn’t m<strong>in</strong>d that your method is capable of work<strong>in</strong>g with only the <strong>in</strong>formation providedby the EventHandler signature, even though it is declared to pass <strong>in</strong> mouseevent details. Unfortunately, <strong>in</strong> <strong>C#</strong> 1 you have to have a different method for each differentevent handler signature.Licensed to Rhona Hadida


Say<strong>in</strong>g goodbye to awkward delegate syntax139Likewise it’s undeniably ugly to write methods that are so simple that their implementationis shorter than their signature, solely because delegates need to have codeto execute and that code has to be <strong>in</strong> the form of a method. It adds an extra layer of<strong>in</strong>direction between the code creat<strong>in</strong>g the delegate <strong>in</strong>stance and the code that shouldexecute when the delegate <strong>in</strong>stance is <strong>in</strong>voked. Often extra layers of <strong>in</strong>direction arewelcome—and of course that option hasn’t been removed <strong>in</strong> <strong>C#</strong> 2—but at the sametime it often makes the code harder to read, and pollutes the class with a bunch ofmethods that are only used for delegates.Unsurpris<strong>in</strong>gly, all of these are improved greatly <strong>in</strong> <strong>C#</strong> 2. The syntax can still occasionallybe wordier than we might like (which is where lambda expressions come <strong>in</strong>toplay <strong>in</strong> <strong>C#</strong> 3), but the difference is significant. To illustrate the pa<strong>in</strong>, we’ll start withsome code <strong>in</strong> <strong>C#</strong> 1 and improve it <strong>in</strong> the next couple of sections. List<strong>in</strong>g 5.1 builds a(very) simple form with a button and subscribes to three of the button’s events.List<strong>in</strong>g 5.1Subscrib<strong>in</strong>g to three of a button's eventsstatic void LogPla<strong>in</strong>Event(object sender, EventArgs e){Console.WriteL<strong>in</strong>e ("LogPla<strong>in</strong>");}static void LogKeyEvent(object sender, KeyPressEventArgs e){Console.WriteL<strong>in</strong>e ("LogKey");}static void LogMouseEvent(object sender, MouseEventArgs e){Console.WriteL<strong>in</strong>e ("LogMouse");}...Button button = new Button();button.Text = "Click me";button.Click += new EventHandler(LogPla<strong>in</strong>Event);button.KeyPress += new KeyPressEventHandler(LogKeyEvent);button.MouseClick += new MouseEventHandler(LogMouseEvent);Form form = new Form();form.AutoSize=true;form.Controls.Add(button);Application.Run(form);The output l<strong>in</strong>es <strong>in</strong> the three event handl<strong>in</strong>g methods are there to prove that the codeis work<strong>in</strong>g: if you press the spacebar with the button highlighted, you’ll see that theClick and KeyPress events are both raised; press<strong>in</strong>g Enter just raises the Click event;click<strong>in</strong>g on the button raises the Click and MouseClick events. In the follow<strong>in</strong>g sectionswe’ll improve this code us<strong>in</strong>g some of the <strong>C#</strong> 2 features.Let’s start by ask<strong>in</strong>g the compiler to make a pretty obvious deduction—which delegatetype we want to use when subscrib<strong>in</strong>g to an event.Licensed to Rhona Hadida


140 CHAPTER 5 Fast-tracked delegates5.2 Method group conversionsIn <strong>C#</strong> 1, if you want to create a delegate <strong>in</strong>stance you need to specify both the delegatetype and the action. If you remember from chapter 2, we def<strong>in</strong>ed the action to be themethod to call and (for <strong>in</strong>stance methods) the target to call it on. So for example, <strong>in</strong> list<strong>in</strong>g5.1 when we needed to create a KeyPressEventHandler we used this expression:new KeyPressEventHandler(LogKeyEvent)As a stand-alone expression, it doesn’t look too bad. Even used <strong>in</strong> a simple event subscriptionit’s tolerable. It becomes a bit uglier when used as part of a longer expression.A common example of this is start<strong>in</strong>g a new thread:Thread t = new Thread (new ThreadStart(MyMethod));What we want to do is start a new thread that will execute MyMethod as simply as possible.<strong>C#</strong> 2 allows you to do this by means of an implicit conversion from a method groupto a compatible delegate type. A method group is simply the name of a method,optionally with a target—exactly the same k<strong>in</strong>d of expression as we used <strong>in</strong> <strong>C#</strong> 1 to createdelegate <strong>in</strong>stances, <strong>in</strong> other words. (Indeed, the expression was called a methodgroup back then—it’s just that the conversion wasn’t available.) If the method isgeneric, the method group may also specify type arguments. The new implicit conversionallows us to turn our event subscription <strong>in</strong>tobutton.KeyPress += LogKeyEvent;Likewise the thread creation code becomes simplyThread t = new Thread (MyMethod);The readability differences between the orig<strong>in</strong>al and the “streaml<strong>in</strong>ed” versions aren’thuge for a s<strong>in</strong>gle l<strong>in</strong>e, but <strong>in</strong> the context of a significant amount of code, they canreduce the clutter considerably. To make it look less like magic, let’s take a brief lookat what this conversion is do<strong>in</strong>g.First, let’s consider the expressions LogKeyEvent and MyMethod as they appear <strong>in</strong>the examples. The reason they’re classified as method groups is that more than onemethod may be available, due to overload<strong>in</strong>g. The implicit conversions available willconvert a method group to any delegate type with a compatible signature. So, if youhad two method signatures as follows:void MyMethod()void MyMethod(object sender, EventArgs e)you could use MyMethod as the method group <strong>in</strong> an assignment to either a ThreadStartor an EventHandler as follows:ThreadStart x = MyMethod;EventHandler y = MyMethod;However, you couldn’t use it as the parameter to a method that itself was overloaded totake either a ThreadStart or an EventHandler—the compiler would compla<strong>in</strong> thatthe conversion was ambiguous. Likewise, you unfortunately can’t use an implicitLicensed to Rhona Hadida


Covariance and contravariance141method group conversion to convert to the pla<strong>in</strong> System.Delegate type s<strong>in</strong>ce thecompiler doesn’t know which specific delegate type to create an <strong>in</strong>stance of. This is abit of a pa<strong>in</strong>, but you can still be slightly briefer than <strong>in</strong> <strong>C#</strong> 1 by mak<strong>in</strong>g the conversionexplicit. For example:Delegate <strong>in</strong>valid = SomeMethod;Delegate valid = (ThreadStart)SomeMethod;As with generics, the precise rules of conversion are slightly complicated, and the “justtry it” rule works very well: if the compiler compla<strong>in</strong>s that it doesn’t have enough<strong>in</strong>formation, just tell it what conversion to use and all should be well. If it doesn’t compla<strong>in</strong>,you should be f<strong>in</strong>e. For the exact details, consult the language specification.Speak<strong>in</strong>g of possible conversions, there may be more than you expect, as we’ll see <strong>in</strong>our next section.5.3 Covariance and contravarianceWe’ve already talked quite a lot about the concepts of covariance and contravariance<strong>in</strong> different contexts, usually bemoan<strong>in</strong>g their absence, but delegate constructionis the one area <strong>in</strong> which they are actually available <strong>in</strong> <strong>C#</strong>. If you want to refreshyourself about the mean<strong>in</strong>g of the terms at a relatively detailed level, refer back tosection 2.3.2—but the gist of the topic with respect to delegates is that if it would bevalid (<strong>in</strong> a static typ<strong>in</strong>g sense) to call a method and use its return value everywherethat you could <strong>in</strong>voke an <strong>in</strong>stance of a particular delegate type and use its returnvalue, then that method can be used to create an <strong>in</strong>stance of that delegate type.That’s all pretty wordy, but it’s a lot simpler with examples.Let’s consider the event handlers we’ve got <strong>in</strong> our little W<strong>in</strong>dows Forms application.The signatures 1 of the three delegate types <strong>in</strong>volved are as follows:void EventHandler (object sender, EventArgs e)void KeyPressEventHandler (object sender, KeyPressEventArgs e)void MouseEventHandler (object sender, MouseEventArgs e)Now, consider that KeyPressEventArgs and MouseEventArgs both derive from Event-Args (as do a lot of other types—at the time of this writ<strong>in</strong>g, MSDN lists 386 types thatderive directly from EventArgs). So, if you have a method that takes an EventArgsparameter, you could always call it with a KeyPressEventArgs argument <strong>in</strong>stead. Ittherefore makes sense to be able to use a method with the same signature asEventHandler to create an <strong>in</strong>stance of KeyPressEventHandler—and that’s exactly what<strong>C#</strong> 2 does. This is an example of contravariance of parameter types.To see that <strong>in</strong> action, let’s th<strong>in</strong>k back to list<strong>in</strong>g 5.1 and suppose that we don’t needto know which event was fir<strong>in</strong>g—we just want to write out the fact that an event hashappened. Us<strong>in</strong>g method group conversions and contravariance, our code becomesquite a lot simpler, as shown <strong>in</strong> list<strong>in</strong>g 5.2.1I’ve removed the public delegate part for reasons of space.Licensed to Rhona Hadida


142 CHAPTER 5 Fast-tracked delegatesList<strong>in</strong>g 5.2Demonstration of method group conversions and delegate contravariancestatic void LogPla<strong>in</strong>Event(object sender, EventArgs e){Console.WriteL<strong>in</strong>e ("An event occurred");}...Button button = new Button();button.Text = "Click me";button.Click += LogPla<strong>in</strong>Event;button.KeyPress += LogPla<strong>in</strong>Event;button.MouseClick += LogPla<strong>in</strong>Event;Form form = new Form();form.AutoSize=true;form.Controls.Add(button);Application.Run(form);We’ve managed to completely remove the two handler methods that dealt specificallywith key and mouse events, us<strong>in</strong>g one event handl<strong>in</strong>g method B for everyth<strong>in</strong>g. Ofcourse, this isn’t terribly useful if you want to do different th<strong>in</strong>gs for different types ofevents, but sometimes all you need to know is that an event occurred and, potentially,the source of the event. The subscription to the Click event C only uses the implicitconversion we discussed <strong>in</strong> the previous section because it has a simple EventArgsparameter, but the other event subscriptions D <strong>in</strong>volve the conversion and contravariancedue to their different parameter types.I mentioned earlier that the .NET 1.0/1.1 event handler convention didn’t makemuch sense when it was first <strong>in</strong>troduced. This example shows exactly why the guidel<strong>in</strong>esare more useful with <strong>C#</strong> 2. The convention dictates that event handlers shouldhave a signature with two parameters, the first of which is of type object and is theorig<strong>in</strong> of the event, and the second of which carries any extra <strong>in</strong>formation about theevent <strong>in</strong> a type deriv<strong>in</strong>g from EventArgs. Before contravariance became available,this wasn’t useful—there was no benefit to mak<strong>in</strong>g the <strong>in</strong>formational parameterderive from EventArgs, and sometimes there wasn’t much use for the orig<strong>in</strong> of theevent. It was often more sensible just to pass the relevant <strong>in</strong>formation directly <strong>in</strong> theform of normal parameters, just like any other method. Now, however, you can use amethod with the EventHandler signature as the action for any delegate type thathonors the convention.Demonstrat<strong>in</strong>g covariance of return types is a little harder as relatively few built-<strong>in</strong>delegates are declared with a nonvoid return type. There are some available, but it’s easierto declare our own delegate type that uses Stream as its return type. For simplicitywe’ll make it parameterless: 2delegate Stream StreamFactory();Uses methodgroup conversionB Handlesall eventsWe can now use this with a method that is declared to return a specific type of stream,as shown <strong>in</strong> list<strong>in</strong>g 5.3. We declare a method that always returns a MemoryStream with2Return type covariance and parameter type contravariance can be used at the same time, although you’reunlikely to come across situations where it would be useful.CUses conversionD and contravarianceLicensed to Rhona Hadida


Covariance and contravariance143some random data, and then use that method as the action for a StreamFactory delegate<strong>in</strong>stance.List<strong>in</strong>g 5.3Demonstration of covariance of return types for delegatesdelegate Stream StreamFactory();static MemoryStream GenerateRandomData(){byte[] buffer = new byte[16];new Random().NextBytes(buffer);return new MemoryStream(buffer);}...StreamFactory factory = GenerateRandomData;B Declares delegate type return<strong>in</strong>g StreamCDeclares methodreturn<strong>in</strong>g MemoryStreamDConverts methodgroup with covarianceStream stream = factory();<strong>in</strong>t data;while ( (data=stream.ReadByte()) != -1){Console.WriteL<strong>in</strong>e(data);}E InvokesdelegateThe actual generation and display of the data <strong>in</strong> list<strong>in</strong>g 5.3 is only present to give the codesometh<strong>in</strong>g to do. (In particular, the way of generat<strong>in</strong>g random data is pretty awful!)The important po<strong>in</strong>ts are the annotated l<strong>in</strong>es. We declare that the delegate type has areturn type of Stream B, but the GenerateRandomData method C has a return type ofMemoryStream. The l<strong>in</strong>e creat<strong>in</strong>g the delegate <strong>in</strong>stance D performs the conversion wesaw earlier and uses covariance of return types to allow GenerateRandomData to be usedfor the action for StreamFactory. By the time we <strong>in</strong>voke the delegate <strong>in</strong>stance E, thecompiler no longer knows that a MemoryStream will be returned—if we changed the typeof the stream variable to MemoryStream, we’d get a compilation error.Covariance and contravariance can also be used to construct one delegate <strong>in</strong>stancefrom another. For <strong>in</strong>stance, consider these two l<strong>in</strong>es of code (which assume an appropriateHandleEvent method):EventHandler general = new EventHandler(HandleEvent);KeyPressEventHandler key = new KeyPressEventHandler(general);The first l<strong>in</strong>e is valid <strong>in</strong> <strong>C#</strong> 1, but the second isn’t—<strong>in</strong> order to construct one delegatefrom another <strong>in</strong> <strong>C#</strong> 1, the signatures of the two delegate types <strong>in</strong>volved have to match.For <strong>in</strong>stance, you could create a MethodInvoker from a ThreadStart—but youcouldn’t do what we’re do<strong>in</strong>g <strong>in</strong> the previous code. We’re us<strong>in</strong>g contravariance to createa new delegate <strong>in</strong>stance from an exist<strong>in</strong>g one with a compatible delegate type signature,where compatibility is def<strong>in</strong>ed <strong>in</strong> a less restrictive manner <strong>in</strong> <strong>C#</strong> 2 than <strong>in</strong> <strong>C#</strong> 1.This new flexibility <strong>in</strong> <strong>C#</strong> 2 causes one of the very few cases where exist<strong>in</strong>g valid<strong>C#</strong>1 code may produce different results when compiled under <strong>C#</strong> 2: if a derived classoverloads a method declared <strong>in</strong> its base class, a delegate creation expression that previouslyonly matched the base class method could now match the derived classmethod due to covariance or contravariance. In this case the derived class methodwill take priority <strong>in</strong> <strong>C#</strong> 2. List<strong>in</strong>g 5.4 gives an example of this.Licensed to Rhona Hadida


144 CHAPTER 5 Fast-tracked delegatesList<strong>in</strong>g 5.4 Demonstration of break<strong>in</strong>g change between <strong>C#</strong> 1 and <strong>C#</strong> 2delegate void SampleDelegate(str<strong>in</strong>g x);public void CandidateAction(str<strong>in</strong>g x){Console.WriteL<strong>in</strong>e("Snippet.CandidateAction");}public class Derived : Snippet{public void CandidateAction(object o){Console.WriteL<strong>in</strong>e("Derived.CandidateAction");}}...Derived x = new Derived();SampleDelegate factory = new SampleDelegate(x.CandidateAction);factory("test");Remember that Snippy 3 will be generat<strong>in</strong>g all of this code with<strong>in</strong> a class called Snippetwhich the nested type derives from. Under <strong>C#</strong> 1, list<strong>in</strong>g 5.4 would pr<strong>in</strong>t Snippet.CandidateAction because the method tak<strong>in</strong>g an object parameter wasn’t compatiblewith SampleDelegate. Under <strong>C#</strong> 2, however, it is compatible and is the method chosendue to be<strong>in</strong>g declared <strong>in</strong> a more derived type—so the result is that Derived.CandidateAction is pr<strong>in</strong>ted. Fortunately, the <strong>C#</strong> 2 compiler knows that this is a break<strong>in</strong>gchange and issues an appropriate warn<strong>in</strong>g.Enough doom and gloom about potential breakage, however. We’ve still got to seethe most important new feature regard<strong>in</strong>g delegates: anonymous methods. They’re abit more complicated than the topics we’ve covered so far, but they’re also very powerful—anda large step toward <strong>C#</strong> 3.5.4 Inl<strong>in</strong>e delegate actions with anonymous methodsHave you ever been writ<strong>in</strong>g <strong>C#</strong> 1 and had to implement a delegate with a particularsignature, even though you’ve already got a method that does what you want butdoesn’t happen to have quite the right parameters? Have you ever had to implement adelegate that only needs to do one teeny, t<strong>in</strong>y th<strong>in</strong>g, and yet you need a whole extramethod? Have you ever been frustrated at hav<strong>in</strong>g to navigate away from an importantbit of code <strong>in</strong> order to see what the delegate you’re us<strong>in</strong>g does, only to f<strong>in</strong>d that themethod used is only two l<strong>in</strong>es long? This k<strong>in</strong>d of th<strong>in</strong>g happened to me quite regularlywith <strong>C#</strong> 1. The covariance and contravariance features we’ve just talked about cansometimes help with the first problem, but often they don’t. Anonymous methods, whichare also new <strong>in</strong> <strong>C#</strong> 2, can pretty much always help with these issues.Informally, anonymous methods allow you to specify the action for a delegate<strong>in</strong>stance <strong>in</strong>l<strong>in</strong>e as part of the delegate <strong>in</strong>stance creation expression. This means there’s3In case you skipped the first chapter, Snippy is a tool I’ve used to create short but complete code samples. Seesection 1.4.2 for more details.Licensed to Rhona Hadida


Inl<strong>in</strong>e delegate actions with anonymous methods145no need to “pollute” the rest of your class with an extra method conta<strong>in</strong><strong>in</strong>g a smallpiece of code that is only useful <strong>in</strong> one place and doesn’t make sense elsewhere.Anonymous methods also provide some far more powerful behavior <strong>in</strong> the formof closures, but we’ll come to them <strong>in</strong> section 5.5. For the moment, let’s stick with relativelysimple stuff—as you may have noticed, a common theme <strong>in</strong> this book is thatyou can go a long way <strong>in</strong> <strong>C#</strong> 2 without deal<strong>in</strong>g with the more complex aspects of thelanguage. Not only is this good <strong>in</strong> terms of learn<strong>in</strong>g the new features gradually, butif you only use the more complicated areas when they provide a lot of benefit, yourcode will be easier to understand as well. First we’ll see examples of anonymousmethods that take parameters but don’t return any values; then we’ll explore thesyntax <strong>in</strong>volved <strong>in</strong> provid<strong>in</strong>g return values and a shortcut available when we don’tneed to use the parameters passed to us.5.4.1 Start<strong>in</strong>g simply: act<strong>in</strong>g on a parameterIn chapter 3 we saw the Action delegate type. As a rem<strong>in</strong>der, its signature is verysimple (aside from the fact that it’s generic):public delegate void Action(T obj)In other words, an Action does someth<strong>in</strong>g with an <strong>in</strong>stance of T. So anAction could reverse the str<strong>in</strong>g and pr<strong>in</strong>t it out, an Action could pr<strong>in</strong>tout the square root of the number passed to it, and an Action couldf<strong>in</strong>d the average of all the numbers given to it and pr<strong>in</strong>t that out. By complete co<strong>in</strong>cidence,these examples are all implemented us<strong>in</strong>g anonymous methods <strong>in</strong> list<strong>in</strong>g 5.5.List<strong>in</strong>g 5.5Anonymous methods used with the Action delegate typeAction pr<strong>in</strong>tReverse = delegate(str<strong>in</strong>g text){char[] chars = text.ToCharArray();Array.Reverse(chars);Console.WriteL<strong>in</strong>e(new str<strong>in</strong>g(chars));};BUses anonymousmethod to createActionAction pr<strong>in</strong>tRoot = delegate(<strong>in</strong>t number){Console.WriteL<strong>in</strong>e(Math.Sqrt(number));};Action pr<strong>in</strong>tMean = delegate(IList numbers){double total = 0;foreach (double value <strong>in</strong> numbers){total += value;}Console.WriteL<strong>in</strong>e(total/numbers.Count);};CUses loop <strong>in</strong>anonymousmethoddouble[] samples = {1.5, 2.5, 3, 4.5};Licensed to Rhona Hadida


146 CHAPTER 5 Fast-tracked delegatespr<strong>in</strong>tReverse("Hello world");pr<strong>in</strong>tRoot(2);pr<strong>in</strong>tMean(samples);List<strong>in</strong>g 5.5 shows a few of the different features of anonymous methods.First, the syntax of anonymous methods: use the delegate keyword, followedby the parameters (if there are any), followed by the code for theaction of the delegate <strong>in</strong>stance, <strong>in</strong> a block. The str<strong>in</strong>g reversal code Bshows that the block can conta<strong>in</strong> local variable declarations, and the “listaverag<strong>in</strong>g” code C demonstrates loop<strong>in</strong>g with<strong>in</strong> the block. Basically, anyth<strong>in</strong>gyou can do <strong>in</strong> a normal method body, you can do <strong>in</strong> an anonymousmethod. 4 Likewise, the result of an anonymous method is a delegate<strong>in</strong>stance that can be used like any other one D. Be warned that contravariancedoesn’t apply to anonymous methods: you have to specify the parameter types thatmatch the delegate type exactly.In terms of implementation, we are still creat<strong>in</strong>g a method for each delegate<strong>in</strong>stance: the compiler will generate a method with<strong>in</strong> the class and use that as theaction it uses to create the delegate <strong>in</strong>stance, just as if it were a normal method. TheCLR neither knows nor cares that an anonymous method was used. You can see theextra methods with<strong>in</strong> the compiled code us<strong>in</strong>g ildasm or Reflector. (Reflector knowshow to <strong>in</strong>terpret the IL to display anonymous methods <strong>in</strong> the method that uses them,but the extra methods are still visible.)It’s worth po<strong>in</strong>t<strong>in</strong>g out at this stage that list<strong>in</strong>g 5.5 is “exploded” compared withhow you may well see anonymous methods <strong>in</strong> real code. You’ll often see them usedas parameters to another method (rather than assigned to a variable of the delegatetype) and with very few l<strong>in</strong>e breaks—compactness is part of the reason forus<strong>in</strong>g them, after all. For example, we mentioned <strong>in</strong> chapter 3 that List has aForEach method that takes an Action as a parameter and performs that actionon each element. List<strong>in</strong>g 5.6 shows an extreme example of this, apply<strong>in</strong>g the same“square root<strong>in</strong>g” action we used <strong>in</strong> list<strong>in</strong>g 5.5, but <strong>in</strong> a compact form.Anonymousmethodsjust conta<strong>in</strong>normal codeDInvokes delegatesas normalList<strong>in</strong>g 5.6Extreme example of code compactness. Warn<strong>in</strong>g: unreadable code ahead!List x = new List();x.Add(5);x.Add(10);x.Add(15);x.Add(20);x.Add(25);x.ForEach(delegate(<strong>in</strong>t n){Console.WriteL<strong>in</strong>e(Math.Sqrt(n));});That’s pretty horrendous—especially when at first sight the last six characters appear tobe ordered almost at random. There’s a happy medium, of course. I tend to break my4One slight oddity is that if you’re writ<strong>in</strong>g an anonymous method <strong>in</strong> a value type, you can’t reference this fromwith<strong>in</strong> it. There’s no such restriction with<strong>in</strong> a reference type.Licensed to Rhona Hadida


Inl<strong>in</strong>e delegate actions with anonymous methods147usual “braces on a l<strong>in</strong>e on their own” rule for anonymous methods (as I do for trivialproperties) but still allow a decent amount of whitespace. I’d usually write the last l<strong>in</strong>eof list<strong>in</strong>g 5.6 as someth<strong>in</strong>g likex.ForEach(delegate(<strong>in</strong>t n){ Console.WriteL<strong>in</strong>e(Math.Sqrt(n)); });The parentheses and braces are now less confus<strong>in</strong>g, and the “what it does” part standsout appropriately. Of course, how you space out your code is entirely your own bus<strong>in</strong>ess,but I encourage you to actively th<strong>in</strong>k about where you want to strike the balance,and talk about it with your teammates to try to achieve some consistency. Consistencydoesn’t always lead to the most readable code, however—sometimes keep<strong>in</strong>g everyth<strong>in</strong>gon one l<strong>in</strong>e is the most straightforward format.You should also consider how much code it makes sense to <strong>in</strong>clude <strong>in</strong> anonymousmethods. The first two examples <strong>in</strong> list<strong>in</strong>g 5.5 are reasonable , but pr<strong>in</strong>tMean is probablydo<strong>in</strong>g enough work to make it worth hav<strong>in</strong>g as a separate method. Aga<strong>in</strong>, it’s a balanc<strong>in</strong>gact.So far the only <strong>in</strong>teraction we’ve had with the call<strong>in</strong>g code is through parameters.What about return values?5.4.2 Return<strong>in</strong>g values from anonymous methodsThe Action delegate has a void return type, so we haven’t had to return anyth<strong>in</strong>gfrom our anonymous methods. To demonstrate how we can do so when we need to,we’ll use the new Predicate delegate type. We saw this briefly <strong>in</strong> chapter 3, buthere’s its signature just as a rem<strong>in</strong>der:public delegate bool Predicate(T obj)List<strong>in</strong>g 5.7 shows an anonymous method creat<strong>in</strong>g an <strong>in</strong>stance of Predicate toreturn whether the argument passed <strong>in</strong> is odd or even. Predicates are usually used <strong>in</strong>filter<strong>in</strong>g and match<strong>in</strong>g—you could use the code <strong>in</strong> list<strong>in</strong>g 5.7 to filter a list to one conta<strong>in</strong><strong>in</strong>gjust the even elements, for <strong>in</strong>stance.List<strong>in</strong>g 5.7Return<strong>in</strong>g a value from an anonymous methodPredicate isEven = delegate(<strong>in</strong>t x){ return x%2 == 0; };Console.WriteL<strong>in</strong>e(isEven(1));Console.WriteL<strong>in</strong>e(isEven(4));The new syntax is almost certa<strong>in</strong>ly what you’d have expected—we just return theappropriate value as if the anonymous method were a normal method. You may haveexpected to see a return type declared near the parameter type, but there’s no need.The compiler just checks that all the possible return values are compatible with thedeclared return type of the delegate type it’s try<strong>in</strong>g to convert the anonymousmethod <strong>in</strong>to.Licensed to Rhona Hadida


148 CHAPTER 5 Fast-tracked delegatesNOTE Just what are you return<strong>in</strong>g from? When you return a value from an anonymousmethod it really is only return<strong>in</strong>g from the anonymous method—it’s not return<strong>in</strong>g from the method that is creat<strong>in</strong>g the delegate <strong>in</strong>stance.It’s all too easy to look down some code, see the return keyword, andth<strong>in</strong>k that it’s an exit po<strong>in</strong>t from the current method.Relatively few delegates <strong>in</strong> .NET 2.0 return values—<strong>in</strong> particular, few event handlersdo, partly because when the event is raised only the return value from the last actionto be called would be available. The Predicate delegate type we’ve used so farisn’t used very widely <strong>in</strong> .NET 2.0, but it becomes important <strong>in</strong> .NET 3.5 where it’s a keypart of LINQ. Another useful delegate type with a return value is Comparison,which can be used when sort<strong>in</strong>g items. This works very well with anonymous methods.Often you only need a particular sort order <strong>in</strong> one situation, so it makes sense to beable to specify that order <strong>in</strong>l<strong>in</strong>e, rather than expos<strong>in</strong>g it as a method with<strong>in</strong> the rest ofthe class. List<strong>in</strong>g 5.8 demonstrates this, pr<strong>in</strong>t<strong>in</strong>g out the files with<strong>in</strong> the C:\ directory,order<strong>in</strong>g them first by name and then (separately) by size.List<strong>in</strong>g 5.8Us<strong>in</strong>g anonymous methods to sort files simplystatic void SortAndShowFiles(str<strong>in</strong>g title,Comparison sortOrder){FileInfo[] files = new DirectoryInfo(@"C:\").GetFiles();Array.Sort(files, sortOrder);Console.WriteL<strong>in</strong>e (title);foreach (FileInfo file <strong>in</strong> files){Console.WriteL<strong>in</strong>e (" {0} ({1} bytes)",file.Name, file.Length);}}...SortAndShowFiles("Sorted by name:",delegate(FileInfo first, FileInfo second){ return first.Name.CompareTo(second.Name); });SortAndShowFiles("Sorted by length:",delegate(FileInfo first, FileInfo second){ return first.Length.CompareTo(second.Length); });If we weren’t us<strong>in</strong>g anonymous methods, we’d have to have a separate method foreach of these sort orders. Instead, list<strong>in</strong>g 5.8 makes it clear what we’ll sort by <strong>in</strong> eachcase right where we call SortAndShowFiles. (Sometimes you’ll be call<strong>in</strong>g Sort directlyat the po<strong>in</strong>t where the anonymous method is called for. In this case we’re perform<strong>in</strong>gthe same fetch/sort/display sequence twice, just with different sort orders, so I encapsulatedthat sequence <strong>in</strong> its own method.)Licensed to Rhona Hadida


Inl<strong>in</strong>e delegate actions with anonymous methods149There’s one special syntactic shortcut that is sometimes available. If you don’t careabout the parameters of a delegate, you don’t have to declare them at all. Let’s seehow that works.5.4.3 Ignor<strong>in</strong>g delegate parametersJust occasionally, you want to implement a delegate that doesn’t depend on its parametervalues. You may wish to write an event handler whose behavior was only appropriatefor one event and didn’t depend on the event arguments: sav<strong>in</strong>g the user’swork, for <strong>in</strong>stance. Indeed, the event handlers from our orig<strong>in</strong>al example <strong>in</strong> list<strong>in</strong>g5.1 fit this criterion perfectly. In this case, you can leave out the parameter listentirely, just us<strong>in</strong>g the delegate keyword and then the block of code to use as theaction for the method. List<strong>in</strong>g 5.9 is equivalent to list<strong>in</strong>g 5.1 but uses this syntax.List<strong>in</strong>g 5.9Subscrib<strong>in</strong>g to events with anonymous methods that ignore parametersButton button = new Button();button.Text = "Click me";button.Click += delegate { Console.WriteL<strong>in</strong>e("LogPla<strong>in</strong>"); };button.KeyPress += delegate { Console.WriteL<strong>in</strong>e("LogKey"); };button.MouseClick += delegate { Console.WriteL<strong>in</strong>e("LogMouse"); };Form form = new Form();form.AutoSize=true;form.Controls.Add(button);Application.Run(form);Normally we’d have had to write each subscription as someth<strong>in</strong>g like this:button.Click += delegate (object sender, EventArgs e) { ... };That wastes a lot of space for little reason—we don’t need the values of the parameters,so the compiler lets us get away with not specify<strong>in</strong>g them at all. List<strong>in</strong>g 5.9 alsohappens to be a perfect example of how consistency of formatt<strong>in</strong>g isn’t always a goodth<strong>in</strong>g—I played around with a few ways of lay<strong>in</strong>g out the code and decided this was theclearest form.I’ve found this shortcut most useful when it comes to implement<strong>in</strong>g myNeat trick forevents!own events. I get sick of hav<strong>in</strong>g to perform a nullity check before rais<strong>in</strong>gan event. One way of gett<strong>in</strong>g around this is to make sure that theevent starts off with a handler, which is then never removed. As long asthe handler doesn’t do anyth<strong>in</strong>g, all you lose is a t<strong>in</strong>y bit of performance.Before <strong>C#</strong> 2, you had to explicitly create a method with theright signature, which usually wasn’t worth the benefit. Now, however,you can do this:public event EventHandler Click = delegate {};From then on, you can just call Click without any tests to see whether there are anyhandlers subscribed to the event.You should be aware of one trap about this “parameter wildcard<strong>in</strong>g” feature—ifthe anonymous method could be converted to multiple delegate types (for example,Licensed to Rhona Hadida


150 CHAPTER 5 Fast-tracked delegatesto call different method overloads) then the compiler needs more help. To show youwhat I mean, let’s look at how we start threads. There are four thread constructors <strong>in</strong>.NET 2.0:public Thread (ParameterizedThreadStart start)public Thread (ThreadStart start)public Thread (ParameterizedThreadStart start, <strong>in</strong>t maxStackSize)public Thread (ThreadStart start, <strong>in</strong>t maxStackSize)The two delegate types <strong>in</strong>volved arepublic delegate void ThreadStart()public delegate void ParameterizedThreadStart(object obj)Now, consider the follow<strong>in</strong>g three attempts to create a new thread:new Thread(delegate() { Console.WriteL<strong>in</strong>e("t1"); } );new Thread(delegate(object o) { Console.WriteL<strong>in</strong>e("t2"); } );new Thread(delegate { Console.WriteL<strong>in</strong>e("t3"); } );The first and second l<strong>in</strong>es conta<strong>in</strong> parameter lists—the compiler knows that it can’t convertthe anonymous method <strong>in</strong> the first l<strong>in</strong>e <strong>in</strong>to a ParameterizedThreadStart, or convertthe anonymous method <strong>in</strong> the second l<strong>in</strong>e <strong>in</strong>to a ThreadStart. Those l<strong>in</strong>es compile,because there’s only one applicable constructor overload <strong>in</strong> each case. The third l<strong>in</strong>e,however, is ambiguous—the anonymous method can be converted <strong>in</strong>to either delegatetype, so both of the constructor overloads tak<strong>in</strong>g just one parameter are applicable. Inthis situation, the compiler throws its hands up and issues an error. You can solve thiseither by specify<strong>in</strong>g the parameter list explicitly or cast<strong>in</strong>g the anonymous method to theright delegate type.Hopefully what you’ve seen of anonymous methods so far will have provokedsome thought about your own code, and made you consider where you could usethese techniques to good effect. Indeed, even if anonymous methods could only dowhat we’ve already seen, they’d still be very useful. However, there’s more to anonymousmethods than just avoid<strong>in</strong>g the <strong>in</strong>clusion of an extra method <strong>in</strong> your code.Anonymous methods are <strong>C#</strong> 2’s implementation of a feature known elsewhere as closuresby way of captured variables. Our next section expla<strong>in</strong>s both of these terms andshows how anonymous methods can be extremely powerful—and confus<strong>in</strong>g if you’renot careful.5.5 Captur<strong>in</strong>g variables <strong>in</strong> anonymous methodsI don’t like hav<strong>in</strong>g to give warn<strong>in</strong>gs, but I th<strong>in</strong>k it makes sense to <strong>in</strong>clude one here: ifthis topic is new to you, then don’t start this section until you’re feel<strong>in</strong>g reasonablyawake and have a bit of time to spend on it. I don’t want to alarm you unnecessarily,and you should feel confident that there’s noth<strong>in</strong>g so <strong>in</strong>sanely complicated that youwon’t be able to understand it with a little effort. It’s just that captured variables canbe somewhat confus<strong>in</strong>g to start with, partly because they overturn some of your exist<strong>in</strong>gknowledge and <strong>in</strong>tuition.Licensed to Rhona Hadida


Captur<strong>in</strong>g variables <strong>in</strong> anonymous methods151Stick with it, though! The payback can be massive <strong>in</strong> terms of code simplicity andreadability. This topic will also be crucial when we come to look at lambda expressionsand LINQ <strong>in</strong> <strong>C#</strong> 3, so it’s worth the <strong>in</strong>vestment. Let’s start off with a few def<strong>in</strong>itions.5.5.1 Def<strong>in</strong><strong>in</strong>g closures and different types of variablesThe concept of closures is a very old one, first implemented <strong>in</strong> Scheme, but it’s been ga<strong>in</strong><strong>in</strong>gmore prom<strong>in</strong>ence <strong>in</strong> recent years as more ma<strong>in</strong>stream languages have taken it onboard. The basic idea is that a function 5 is able to <strong>in</strong>teract with an environment beyondthe parameters provided to it. That’s all there is to it <strong>in</strong> abstract terms, but to understandhow it applies to <strong>C#</strong> 2, we need a couple more terms:■■An outer variable is a local variable or parameter 6 whose scope <strong>in</strong>cludes an anonymousmethod. The this reference also counts as an outer variable of anyanonymous method where it can be used.A captured outer variable (usually shortened to just “captured variable”) is an outervariable that is used with<strong>in</strong> an anonymous method. So to go back to closures,the function part is the anonymous method, and the environment it can <strong>in</strong>teractwith is the set of variables captured by it.That’s all very dry and may be hard to imag<strong>in</strong>e, but the ma<strong>in</strong> thrust is that an anonymousmethod can use local variables def<strong>in</strong>ed <strong>in</strong> the same method that declares it. Thismay not sound like a big deal, but <strong>in</strong> many situations it’s enormously handy—you canuse contextual <strong>in</strong>formation that you have “on hand” rather than hav<strong>in</strong>g to set up extratypes just to store data you already know. We’ll see some useful concrete examplessoon, I promise—but first it’s worth look<strong>in</strong>g at some code to clarify these def<strong>in</strong>itions.List<strong>in</strong>g 5.10 provides an example with a number of local variables. It’s just a s<strong>in</strong>glemethod, so it can’t be run on its own. I’m not go<strong>in</strong>g to expla<strong>in</strong> how it would work orwhat it would do yet, but just expla<strong>in</strong> how the different variables are classified.List<strong>in</strong>g 5.10Examples of different k<strong>in</strong>ds of variables with respect to anonymous methodsvoid Enclos<strong>in</strong>gMethod(){<strong>in</strong>t outerVariable = 5;str<strong>in</strong>g capturedVariable = "captured";BOuter variable(uncaptured)COuter variablecaptured byanonymousmethodif (DateTime.Now.Hour==23){<strong>in</strong>t normalLocalVariable = DateTime.Now.M<strong>in</strong>ute;Console.WriteL<strong>in</strong>e(normalLocalVariable);}ThreadStart x = delegate(){str<strong>in</strong>g anonLocal="local to anonymous method";EDLocal variable ofnormal methodLocal variableof anonymousmethod5This is general computer science term<strong>in</strong>ology, not <strong>C#</strong> term<strong>in</strong>ology.6Exclud<strong>in</strong>g ref and out parameters.Licensed to Rhona Hadida


152 CHAPTER 5 Fast-tracked delegates}};x();Console.WriteL<strong>in</strong>e(capturedVariable + anonLocal);Let’s go through all the variables from the simplest to the most complicated:■ normalLocalVariable d isn’t an outer variable because there are no anonymousmethods with<strong>in</strong> its scope. It behaves exactly the way that local variablesalways have.■ anonLocal e isn’t an outer variable either, but it’s local to the anonymousmethod, not to Enclos<strong>in</strong>gMethod. It will only exist (<strong>in</strong> terms of be<strong>in</strong>g present <strong>in</strong>an execut<strong>in</strong>g stack frame) when the delegate <strong>in</strong>stance is <strong>in</strong>voked.■ outerVariable B is an outer variable because the anonymous method isdeclared with<strong>in</strong> its scope. However, the anonymous method doesn’t refer to it,so it’s not captured.■ capturedVariable c is an outer variable because the anonymous method isdeclared with<strong>in</strong> its scope, and it’s captured by virtue of be<strong>in</strong>g used at f.Okay, so we now understand the term<strong>in</strong>ology, but we’re not a lot closer to see<strong>in</strong>gwhat captured variables do. I suspect you could guess the output if we ran themethod from list<strong>in</strong>g 5.10, but there are some other cases that would probably surpriseyou. We’ll start off with a simple example and gradually build up to more complexones.5.5.2 Exam<strong>in</strong><strong>in</strong>g the behavior of captured variablesWhen a variable is captured, it really is the variable that’s captured by the anonymousmethod, not its value at the time the delegate <strong>in</strong>stance was created. We’ll see later thatthis has far-reach<strong>in</strong>g consequences, but first we’ll make sure we understand what thatmeans for a relatively straightforward situation. List<strong>in</strong>g 5.11 has a captured variableand an anonymous method that both pr<strong>in</strong>ts out and changes the variable. We’ll seethat changes to the variable from outside the anonymous method are visible with<strong>in</strong>the anonymous method, and vice versa. We’re us<strong>in</strong>g the ThreadStart delegate typefor simplicity as we don’t need a return type or any parameters—no extra threads areactually created, though.FCapture ofouter variableList<strong>in</strong>g 5.11Access<strong>in</strong>g a variable both <strong>in</strong>side and outside an anonymous methodstr<strong>in</strong>g captured = "before x is created";ThreadStart x = delegate{Console.WriteL<strong>in</strong>e(captured);captured = "changed by x";};captured = "directly before x is <strong>in</strong>voked";x();Licensed to Rhona Hadida


Captur<strong>in</strong>g variables <strong>in</strong> anonymous methods153Console.WriteL<strong>in</strong>e (captured);captured = "before second <strong>in</strong>vocation";x();The output of list<strong>in</strong>g 5.11 is as follows:directly before x is <strong>in</strong>vokedchanged by xbefore second <strong>in</strong>vocationLet’s look at how this happens. First, we declare the variable captured and set its valuewith a perfectly normal str<strong>in</strong>g literal. So far, there’s noth<strong>in</strong>g special about the variable.We then declare x and set its value us<strong>in</strong>g an anonymous method that capturescaptured. The delegate <strong>in</strong>stance will always pr<strong>in</strong>t out the current value of captured,and then set it to “changed by x”.Just to make it absolutely clear that just creat<strong>in</strong>g the delegate <strong>in</strong>stance didn’t readthe variable and stash its value away somewhere, we now change the value of capturedto “directly before x is <strong>in</strong>voked”. We then <strong>in</strong>voke x for the first time. It reads the valueof captured and pr<strong>in</strong>ts it out—our first l<strong>in</strong>e of output. It sets the value of captured to“changed by x” and returns. When the delegate <strong>in</strong>stance returns, the “normal”method cont<strong>in</strong>ues <strong>in</strong> the usual way. It pr<strong>in</strong>ts out the current value of captured, giv<strong>in</strong>gus our second l<strong>in</strong>e of output.The normal method then changes the value of captured yet aga<strong>in</strong> (this time tobefore second <strong>in</strong>vocation) and <strong>in</strong>vokes x for the second time. The current value ofcaptured is pr<strong>in</strong>ted out, giv<strong>in</strong>g our last l<strong>in</strong>e of output. The delegate <strong>in</strong>stance changescaptured to changed by x and returns, at which po<strong>in</strong>t the normal method has run outof code and we’re done.That’s a lot of detail about how a pretty short piece of code works, but there’s reallyonly one crucial idea <strong>in</strong> it: the captured variable is the same one that the rest of the methoduses. For some people, that’s hard to grasp; for others it comes naturally. Don’t worry ifit’s tricky to start with—it’ll get easier over time. Even if you’ve understood everyth<strong>in</strong>geasily so far, you may be wonder<strong>in</strong>g why you’d want to do any of this. It’s about time wehad an example that was actually useful.5.5.3 What’s the po<strong>in</strong>t of captured variables?To put it simply, captured variables get rid of the need for you to write extra classesjust to store the <strong>in</strong>formation a delegate needs to act on, beyond what it’s passed asparameters. Before ParameterizedThreadStart existed, if you wanted to start a new(non-threadpool) thread and give it some <strong>in</strong>formation—the URL of a page to fetch,for <strong>in</strong>stance—you had to create an extra type to hold the URL and put the action ofthe ThreadStart delegate <strong>in</strong>stance <strong>in</strong> that type. It was all a very ugly way of achiev<strong>in</strong>gsometh<strong>in</strong>g that should have been simple.As another example, suppose you had a list of people and wanted to write amethod that would return a second list conta<strong>in</strong><strong>in</strong>g all the people who were under agiven age. We know about a method on List that returns another list of everyth<strong>in</strong>gmatch<strong>in</strong>g a predicate: the F<strong>in</strong>dAll method. Before anonymous methods and capturedLicensed to Rhona Hadida


154 CHAPTER 5 Fast-tracked delegatesvariables were around, it wouldn’t have made much sense for List.F<strong>in</strong>dAll toexist, because of all the hoops you’d have to go through <strong>in</strong> order to create the rightdelegate to start with. It would have been simpler to do all the iteration and copy<strong>in</strong>gmanually. With <strong>C#</strong> 2, however, we can do it all very, very easily:List F<strong>in</strong>dAllYoungerThan(List people, <strong>in</strong>t limit){return people.F<strong>in</strong>dAll (delegate (Person person){ return person.Age < limit; });}Here we’re captur<strong>in</strong>g the limit parameter with<strong>in</strong> the delegate <strong>in</strong>stance—if we’d hadanonymous methods but not captured variables, we could have performed a test aga<strong>in</strong>sta hard-coded limit, but not one that was passed <strong>in</strong>to the method as a parameter. I hopeyou’ll agree that this approach is very neat—it expresses exactly what we want to do withmuch less fuss about exactly how it should happen than you’d have seen <strong>in</strong> a <strong>C#</strong> 1 version.(It’s even neater <strong>in</strong> <strong>C#</strong> 3, admittedly… 7 ) It’s relatively rare that you come across a situationwhere you need to write to a captured variable, but aga<strong>in</strong> that can certa<strong>in</strong>ly haveits uses.Still with me? Good. So far, we’ve only used the delegate <strong>in</strong>stance with<strong>in</strong> the methodthat creates it. That doesn’t raise many questions about the lifetime of the captured variables—butwhat would happen if the delegate <strong>in</strong>stance escaped <strong>in</strong>to the big bad world?How would it cope after the method that created it had f<strong>in</strong>ished?5.5.4 The extended lifetime of captured variablesThe simplest way of tackl<strong>in</strong>g this topic is to state a rule, give an example, and thenth<strong>in</strong>k about what would happen if the rule weren’t <strong>in</strong> place. Here we go:A captured variable lives for at least as long as any delegate <strong>in</strong>stancereferr<strong>in</strong>g to it.Don’t worry if it doesn’t make a lot of sense yet—that’s what the example is for. List<strong>in</strong>g5.12 shows a method that returns a delegate <strong>in</strong>stance. That delegate <strong>in</strong>stance is createdus<strong>in</strong>g an anonymous method that captures an outer variable. So, what will happenwhen the delegate is <strong>in</strong>voked after the method has returned?List<strong>in</strong>g 5.12Demonstration of a captured variable hav<strong>in</strong>g its lifetime extendedstatic ThreadStart CreateDelegateInstance(){<strong>in</strong>t counter = 5;ThreadStart ret = delegate{Console.WriteL<strong>in</strong>e(counter);counter++;};7In case you’re wonder<strong>in</strong>g: return people.Where(person => person.Age < limit);Licensed to Rhona Hadida


Captur<strong>in</strong>g variables <strong>in</strong> anonymous methods155ret();return ret;}...ThreadStart x = CreateDelegateInstance();x();x();The output of list<strong>in</strong>g 5.12 consists of the numbers 5, 6, and 7 on separate l<strong>in</strong>es. Thefirst l<strong>in</strong>e of output comes from the <strong>in</strong>vocation of the delegate <strong>in</strong>stance with<strong>in</strong>CreateDelegateInstance, so it makes sense that the value of i is available at thatpo<strong>in</strong>t. But what about after the method has returned? Normally we would considercounter to be on the stack, so when the stack frame for CreateDelegateInstance isdestroyed we’d expect counter to effectively vanish… and yet subsequent <strong>in</strong>vocationsof the returned delegate <strong>in</strong>stance seem to keep us<strong>in</strong>g it!The secret is to challenge the assumption that counter is on the stack <strong>in</strong> the firstplace. It isn’t. The compiler has actually created an extra class to hold the variable.The CreateDelegateInstance method has a reference to an <strong>in</strong>stance of that class so itcan use counter, and the delegate has a reference to the same <strong>in</strong>stance—which liveson the heap <strong>in</strong> the normal way. That <strong>in</strong>stance isn’t eligible for garbage collection untilthe delegate is ready to be collected. Some aspects of anonymous methods are verycompiler specific (<strong>in</strong> other words different compilers could achieve the same semantics<strong>in</strong> different ways), but it’s hard to see how the specified behavior could beachieved without us<strong>in</strong>g an extra class to hold the captured variable. Note that if youonly capture this, no extra types are required—the compiler just creates an <strong>in</strong>stancemethod to act as the delegate’s action.OK, so local variables aren’t always local anymore. You may well be wonder<strong>in</strong>g whatI could possibly throw at you next—let’s see now, how about multiple delegates captur<strong>in</strong>gdifferent <strong>in</strong>stances of the same variable? It sounds crazy, so it’s just the k<strong>in</strong>d ofth<strong>in</strong>g you should be expect<strong>in</strong>g by now.5.5.5 Local variable <strong>in</strong>stantiationsOn a good day, captured variables act exactly the way I expect them to at a glance. Ona bad day, I’m still surprised when I’m not tak<strong>in</strong>g a great deal of care. When there areproblems, it’s almost always due to forgett<strong>in</strong>g just how many “<strong>in</strong>stances” of local variablesI’m actually creat<strong>in</strong>g. A local variable is said to be <strong>in</strong>stantiated each time executionenters the scope where it’s declared. Here’s a simple example compar<strong>in</strong>g two verysimilar bits of code:<strong>in</strong>t s<strong>in</strong>gle;for (<strong>in</strong>t i=0; i < 10; i++){s<strong>in</strong>gle = 5;Console.WriteL<strong>in</strong>e(s<strong>in</strong>gle+i);}for (<strong>in</strong>t i=0; i < 10; i++){<strong>in</strong>t multiple = 5;Console.WriteL<strong>in</strong>e(multiple+i);}Licensed to Rhona Hadida


156 CHAPTER 5 Fast-tracked delegatesIn the good old days, it was reasonable to say that pieces of code like this were semanticallyidentical. Indeed, they’d usually compile to the same IL. They still will, if therearen’t any anonymous methods <strong>in</strong>volved. All the space for local variables is allocatedon the stack at the start of the method, so there’s no cost to “redeclar<strong>in</strong>g” the variablefor each iteration of the loop. However, <strong>in</strong> our new term<strong>in</strong>ology the s<strong>in</strong>gle variablewill be <strong>in</strong>stantiated only once, but the multiple variable will be <strong>in</strong>stantiated tentimes—it’s as if there are ten local variables, all called multiple, which are createdone after another.Variable<strong>in</strong>stance iscapturedI’m sure you can see where I’m go<strong>in</strong>g—when a variable is captured, it’sthe relevant “<strong>in</strong>stance” of the variable that is captured. If we capturedmultiple <strong>in</strong>side the loop, the variable captured <strong>in</strong> the first iterationwould be different from the variable captured the second time round,and so on. List<strong>in</strong>g 5.13 shows exactly this effect.List<strong>in</strong>g 5.13Captur<strong>in</strong>g multiple variable <strong>in</strong>stantiations with multiple delegatesList list = new List();for (<strong>in</strong>t <strong>in</strong>dex=0; <strong>in</strong>dex < 5; <strong>in</strong>dex++){<strong>in</strong>t counter = <strong>in</strong>dex*10;list.Add (delegate{Console.WriteL<strong>in</strong>e(counter);counter++;});}B Instantiates counterCPr<strong>in</strong>ts and <strong>in</strong>crementscaptured variableforeach (ThreadStart t <strong>in</strong> list){t();}DExecutes allfive delegate<strong>in</strong>stanceslist[0]();list[0]();list[0]();EExecutes first onethree more timeslist[1](); F Executes second one aga<strong>in</strong>List<strong>in</strong>g 5.13 creates five different delegate <strong>in</strong>stances C—one for each time we goaround the loop. Invok<strong>in</strong>g the delegate will pr<strong>in</strong>t out the value of counter and then<strong>in</strong>crement it. Now, because counter is declared <strong>in</strong>side the loop, it is <strong>in</strong>stantiated foreach iteration B, and so each delegate captures a different variable. So, when we gothrough and <strong>in</strong>voke each delegate D, we see the different values <strong>in</strong>itially assigned tocounter: 0, 10, 20, 30, 40. Just to hammer the po<strong>in</strong>t home, when we then go back tothe first delegate <strong>in</strong>stance and execute it three more times E, it keeps go<strong>in</strong>g fromwhere that <strong>in</strong>stance’s counter variable had left off: 1, 2, 3. F<strong>in</strong>ally we execute the seconddelegate <strong>in</strong>stance F, and that keeps go<strong>in</strong>g from where that <strong>in</strong>stance’s countervariable had left off: 11.Licensed to Rhona Hadida


Captur<strong>in</strong>g variables <strong>in</strong> anonymous methods157So, each of the delegate <strong>in</strong>stances has captured a different variable <strong>in</strong> thiscase. Before we leave this example, I should po<strong>in</strong>t out what would have happenedif we’d captured <strong>in</strong>dex—the variable declared by the for loop—<strong>in</strong>stead of counter. In this case, all the delegates would have shared thesame variable. The output would have been the numbers 5 to 14; 5 firstbecause the last assignment to <strong>in</strong>dex before the loop term<strong>in</strong>ates wouldhave set it to 5, and then <strong>in</strong>crement<strong>in</strong>g the same variable regardless ofwhich delegate was <strong>in</strong>volved. We’d see the same behavior with a foreachloop: the variable declared by the <strong>in</strong>itial part of the loop is only <strong>in</strong>stantiated once. It’seasy to get this wrong! If you want to capture the value of a loop variable for that particulariteration of the loop, <strong>in</strong>troduce another variable with<strong>in</strong> the loop, copy the loopvariable’s value <strong>in</strong>to it, and capture that new variable—effectively what we’ve done <strong>in</strong> list<strong>in</strong>g5.13 with the counter variable.For our f<strong>in</strong>al example, let’s look at someth<strong>in</strong>g really nasty—shar<strong>in</strong>g some capturedvariables but not others.Be carefulwithfor/foreachloops!5.5.6 Mixtures of shared and dist<strong>in</strong>ct variablesLet me just say before I show you this next example that it’s not code I’d recommend.In fact, the whole po<strong>in</strong>t of present<strong>in</strong>g it is to show how if you try to use captured variables<strong>in</strong> too complicated a fashion, th<strong>in</strong>gs can get tricky really fast. List<strong>in</strong>g 5.14 createstwo delegate <strong>in</strong>stances that each capture “the same” two variables. However, the storygets more convoluted when we look at what’s actually captured.List<strong>in</strong>g 5.14Captur<strong>in</strong>g variables <strong>in</strong> different scopes. Warn<strong>in</strong>g: nasty code ahead!ThreadStart[] delegates = new ThreadStart[2];<strong>in</strong>t outside = 0;for (<strong>in</strong>t i=0; i < 2; i++){<strong>in</strong>t <strong>in</strong>side = 0;C Instantiatesvariablemultiple timesB Instantiatesvariable once}delegates[i] = delegate{Console.WriteL<strong>in</strong>e ("({0},{1})",outside, <strong>in</strong>side);outside++;<strong>in</strong>side++;};DCaptures variableswith anonymousmethodThreadStart first = delegates[0];ThreadStart second = delegates[1];first();first();first();second();second();Licensed to Rhona Hadida


158 CHAPTER 5 Fast-tracked delegatesHow long would it take you to predict the output from list<strong>in</strong>g 5.14 (even with theannotations)? Frankly it would take me a little while—longer than I like to spendunderstand<strong>in</strong>g code. Just as an exercise, though, let’s look at what happens.First let’s consider the outside variable B. The scope it’s declared <strong>in</strong> is only enteredonce, so it’s a straightforward case—there’s only ever one of it, effectively. The <strong>in</strong>sidevariable C is a different matter—each loop iteration <strong>in</strong>stantiates a new one. Thatmeans that when we create the delegate <strong>in</strong>stance D the outside variable is sharedbetween the two delegate <strong>in</strong>stances, but each of them has its own <strong>in</strong>side variable.After the loop has ended, we call the first delegate <strong>in</strong>stance we created three times.Because it’s <strong>in</strong>crement<strong>in</strong>g both of its captured variables each time, and we started offwith them both as 0, we see (0,0), then (1,1), then (2,2). The difference between thetwo variables <strong>in</strong> terms of scope becomes apparent when we execute the second delegate<strong>in</strong>stance. It has a different <strong>in</strong>side variable, so that still has its <strong>in</strong>itial value of 0,but the outside variable is the one we’ve already <strong>in</strong>cremented three times. The outputfrom call<strong>in</strong>g the second delegate twice is therefore (3,0), then (4,1).NOTE How does this happen <strong>in</strong>ternally? Just for the sake of <strong>in</strong>terest, let’s th<strong>in</strong>kabout how this is implemented—at least with Microsoft’s <strong>C#</strong> 2 compiler.What happens is that one extra class is generated to hold the outer variable,and another one is generated to hold an <strong>in</strong>ner variable and a referenceto the first extra class. Essentially, each scope that conta<strong>in</strong>s a capturedvariable gets its own type, with a reference to the next scope out that conta<strong>in</strong>sa captured variable. In our case, there were two <strong>in</strong>stances of the typehold<strong>in</strong>g <strong>in</strong>ner, and they both refer to the same <strong>in</strong>stance of the type hold<strong>in</strong>gouter. Other implementations may vary, but this is the most obviousway of do<strong>in</strong>g th<strong>in</strong>gs.Even after you understand this code fully, it’s still quite a good template for experiment<strong>in</strong>gwith other elements of captured variables. As we noted earlier, certa<strong>in</strong> elementsof variable capture are implementation specific, and it’s often useful to refer tothe specification to see what’s guaranteed—but it’s also important to be able to justplay with code to see what happens.It’s possible that there are situations where code like list<strong>in</strong>g 5.14 would be the simplestand clearest way of express<strong>in</strong>g the desired behavior—but I’d have to see it to believeit, and I’d certa<strong>in</strong>ly want comments <strong>in</strong> the code to expla<strong>in</strong> what would happen. So, whenis it appropriate to use captured variables, and what do you need to look out for?5.5.7 Captured variable guidel<strong>in</strong>es and summaryHopefully this section has conv<strong>in</strong>ced you to be very careful with captured variables.They make good logical sense (and any change to make them simpler would probablyeither make them less useful or less logical), but they make it quite easy to producehorribly complicated code.Don’t let that discourage you from us<strong>in</strong>g them sensibly, though—they can save youmasses of tedious code, and when they’re used appropriately they can be the mostreadable way of gett<strong>in</strong>g the job done. But what counts as “sensible”?Licensed to Rhona Hadida


Captur<strong>in</strong>g variables <strong>in</strong> anonymous methods159GUIDELINES FOR USING CAPTURED VARIABLESThe follow<strong>in</strong>g is a list of suggestions for us<strong>in</strong>g captured variables:■ If code that doesn’t use captured variables is just as simple as code that does,don’t use them.■ Before captur<strong>in</strong>g a variable declared by a for or foreach statement, considerwhether your delegate is go<strong>in</strong>g to live beyond the loop iteration, and whetheryou want it to see the subsequent values of that variable. If not, create anothervariable <strong>in</strong>side the loop that just copies the value you do want.■ If you create multiple delegate <strong>in</strong>stances (whether <strong>in</strong> a loop or explicitly) thatcapture variables, put thought <strong>in</strong>to whether you want them to capture the samevariable.■ If you capture a variable that doesn’t actually change (either <strong>in</strong> the anonymousmethod or the enclos<strong>in</strong>g method body), then you don’t need to worry as much.■ If the delegate <strong>in</strong>stances you create never “escape” from the method—<strong>in</strong> otherwords, they’re never stored anywhere else, or returned, or used for start<strong>in</strong>gthreads—life is a lot simpler.■ Consider the extended lifetime of any captured variables <strong>in</strong> terms of garbagecollection. This is normally not an issue, but if you capture an object that isexpensive <strong>in</strong> terms of memory, it may be significant.The first po<strong>in</strong>t is the golden rule. Simplicity is a good th<strong>in</strong>g—so any time the use of acaptured variable makes your code simpler (after you’ve factored <strong>in</strong> the additional<strong>in</strong>herent complexity of forc<strong>in</strong>g your code’s ma<strong>in</strong>ta<strong>in</strong>ers to understand what the capturedvariable does), use it. You need to <strong>in</strong>clude that extra complexity <strong>in</strong> your considerations,that’s all—don’t just go for m<strong>in</strong>imal l<strong>in</strong>e count.We’ve covered a lot of ground <strong>in</strong> this section, and I’m aware that it can be hard totake <strong>in</strong>. I’ve listed the most important th<strong>in</strong>gs to remember next, so that if you need tocome back to this section another time you can jog your memory without hav<strong>in</strong>g toread through the whole th<strong>in</strong>g aga<strong>in</strong>:■ The variable is captured—not its value at the po<strong>in</strong>t of delegate <strong>in</strong>stance creation.■ Captured variables have lifetimes extended to at least that of the captur<strong>in</strong>g delegate.■ Multiple delegates can capture the same variable…■ …but with<strong>in</strong> loops, the same variable declaration can effectively refer to differentvariable “<strong>in</strong>stances.”■ for/foreach loop declarations create variables that live for the duration of theloop—they’re not <strong>in</strong>stantiated on each iteration.■ Captured variables aren’t really local variables—extra types are created wherenecessary.■ Be careful! Simple is almost always better than clever.We’ll see more variables be<strong>in</strong>g captured when we look at <strong>C#</strong> 3 and its lambda expressions,but for now you may be relieved to hear that we’ve f<strong>in</strong>ished our rundown of thenew <strong>C#</strong> 2 delegate features.Licensed to Rhona Hadida


160 CHAPTER 5 Fast-tracked delegates5.6 Summary<strong>C#</strong> 2 has radically changed the ways <strong>in</strong> which delegates can be created, and <strong>in</strong> do<strong>in</strong>gso it’s opened up the framework to a more functional style of programm<strong>in</strong>g. Thereare more methods <strong>in</strong> .NET 2.0 that take delegates as parameters than there were <strong>in</strong>.NET 1.0/1.1, and this trend cont<strong>in</strong>ues <strong>in</strong> .NET 3.5. The List type is the bestexample of this, and is a good test-bed for check<strong>in</strong>g your skills at us<strong>in</strong>g anonymousmethods and captured variables. Programm<strong>in</strong>g <strong>in</strong> this way requires a slightly differentm<strong>in</strong>d-set—you must be able to take a step back and consider what the ultimateaim is, and whether it’s best expressed <strong>in</strong> the traditional <strong>C#</strong> manner, or whether afunctional approach makes th<strong>in</strong>gs clearer.All the changes to delegate handl<strong>in</strong>g are useful, but they do add complexity to thelanguage, particularly when it comes to captured variables. Closures are always tricky<strong>in</strong> terms of quite how the available environment is shared, and <strong>C#</strong> is no different <strong>in</strong>this respect. The reason they’ve lasted so long as an idea, however, is that they canmake code simpler to understand and more immediate. The balanc<strong>in</strong>g act betweencomplexity and simplicity is always a difficult one, and it’s worth not be<strong>in</strong>g too ambitiousto start with. As anonymous methods and captured variables become more common,we should all expect to get better at work<strong>in</strong>g with them and understand<strong>in</strong>g whatthey’ll do. They’re certa<strong>in</strong>ly not go<strong>in</strong>g away, and <strong>in</strong>deed LINQ encourages their useeven further.Anonymous methods aren’t the only change <strong>in</strong> <strong>C#</strong> 2 that <strong>in</strong>volves the compiler creat<strong>in</strong>gextra types beh<strong>in</strong>d the scenes, do<strong>in</strong>g devious th<strong>in</strong>gs with variables that appear tobe local. We’ll see a lot more of this <strong>in</strong> our next chapter, where the compiler effectivelybuilds a whole state mach<strong>in</strong>e for us <strong>in</strong> order to make it easier for the developerto implement iterators.Licensed to Rhona Hadida


Implement<strong>in</strong>giterators the easy wayThis chapter covers■ Implement<strong>in</strong>g iterators <strong>in</strong> <strong>C#</strong> 1■ Iterator blocks <strong>in</strong> <strong>C#</strong> 2■■A simple Range typeIterators as corout<strong>in</strong>esThe iterator pattern is an example of a behavioral pattern—a design pattern thatsimplifies communication between objects. It’s one of the simplest patterns tounderstand, and <strong>in</strong>credibly easy to use. In essence, it allows you to access all theelements <strong>in</strong> a sequence of items without car<strong>in</strong>g about what k<strong>in</strong>d of sequence it is—an array, a list, a l<strong>in</strong>ked list, or none of the above. This can be very effective forbuild<strong>in</strong>g a data pipel<strong>in</strong>e, where an item of data enters the pipel<strong>in</strong>e and goes througha number of different transformations or filters before com<strong>in</strong>g out at the otherend. Indeed, this is one of the core patterns of LINQ, as we’ll see <strong>in</strong> part 3.In .NET, the iterator pattern is encapsulated by the IEnumerator and IEnumerable<strong>in</strong>terfaces and their generic equivalents. (The nam<strong>in</strong>g is unfortunate—the patternis normally called iteration rather than enumeration to avoid gett<strong>in</strong>g confused with161Licensed to Rhona Hadida


162 CHAPTER 6 Implement<strong>in</strong>g iterators the easy wayother mean<strong>in</strong>gs of the word enumeration. I’ve used iterator and iterable throughout thischapter.) If a type implements IEnumerable, that means it can be iterated over; call<strong>in</strong>gthe GetEnumerator method will return the IEnumerator implementation, which is theiterator itself.As a language, <strong>C#</strong> 1 has built-<strong>in</strong> support for consum<strong>in</strong>g iterators us<strong>in</strong>g the foreachstatement. This makes it <strong>in</strong>credibly easy to iterate over collections—easier than us<strong>in</strong>ga straight for loop—and is nicely expressive. The foreach statement compiles downto calls to the GetEnumerator and MoveNext methods and the Current property, withsupport for dispos<strong>in</strong>g the iterator afterwards if IDisposable has been implemented.It’s a small but useful piece of syntactic sugar.In <strong>C#</strong> 1, however, implement<strong>in</strong>g an iterator is a relatively difficult task. The syntacticsugar provided by <strong>C#</strong> 2 makes this much simpler, which can sometimes lead to the iteratorpattern be<strong>in</strong>g worth implement<strong>in</strong>g <strong>in</strong> cases where otherwise it would have causedmore work than it saved.In this chapter we’ll look at just what is required to implement an iterator and thesupport given by <strong>C#</strong> 2. As a complete example we’ll create a useful Range class that canbe used <strong>in</strong> numerous situations, and then we’ll explore an excit<strong>in</strong>g (if slightly off-thewall)use of the iteration syntax <strong>in</strong> a new concurrency library from Microsoft.As <strong>in</strong> other chapters, let’s start off by look<strong>in</strong>g at why this new feature was <strong>in</strong>troduced.We’ll implement an iterator the hard way.6.1 <strong>C#</strong> 1: the pa<strong>in</strong> of handwritten iteratorsWe’ve already seen one example of an iterator implementation <strong>in</strong> section 3.4.3 whenwe looked at what happens when you iterate over a generic collection. In some waysthat was harder than a real <strong>C#</strong> 1 iterator implementation would have been, because weimplemented the generic <strong>in</strong>terfaces as well—but <strong>in</strong> some ways it was easier because itwasn’t actually iterat<strong>in</strong>g over anyth<strong>in</strong>g useful.To put the <strong>C#</strong> 2 features <strong>in</strong>to context, we’ll first implement an iterator that is aboutas simple as it can be while still provid<strong>in</strong>g real, useful values. Suppose we had a new typeof collection—which can happen, even though .NET provides most of the collectionsyou’ll want to use <strong>in</strong> normal applications. We’ll implement IEnumerable so that usersof our new class can easily iterate over all the values <strong>in</strong> the collection. We’ll ignore theguts of the collection here and just concentrate on the iteration side. Our collection willstore its values <strong>in</strong> an array (object[]—no generics here!), and the collection will havethe <strong>in</strong>terest<strong>in</strong>g feature that you can set its logical “start<strong>in</strong>g po<strong>in</strong>t”—so if the array had fiveelements, you could set the start po<strong>in</strong>t to 2, and expect elements 2, 3, 4, 0, and then 1to be returned. (This constra<strong>in</strong>t prevents us from implement<strong>in</strong>g GetEnumerator bysimply call<strong>in</strong>g the same method on the array itself. That would defeat the purpose ofthe exercise.)To make the class easy to demonstrate, we’ll provide both the values and the start<strong>in</strong>gpo<strong>in</strong>t <strong>in</strong> the constructor. So, we should be able to write code such as list<strong>in</strong>g 6.1 <strong>in</strong>order to iterate over the collection.Licensed to Rhona Hadida


<strong>C#</strong> 1: the pa<strong>in</strong> of handwritten iterators163List<strong>in</strong>g 6.1Code us<strong>in</strong>g the (as yet unimplemented) new collection typeobject[] values = {"a", "b", "c", "d", "e"};IterationSample collection = new IterationSample(values, 3);foreach (object x <strong>in</strong> collection){Console.WriteL<strong>in</strong>e (x);}Runn<strong>in</strong>g list<strong>in</strong>g 6.1 should (eventually) produce output of “d”, “e”, “a”, “b”, and f<strong>in</strong>ally“c” because we specified a start<strong>in</strong>g po<strong>in</strong>t of 3. Now that we know what we need toachieve, let’s look at the skeleton of the class as shown <strong>in</strong> list<strong>in</strong>g 6.2.List<strong>in</strong>g 6.2Skeleton of the new collection type, with no iterator implementationus<strong>in</strong>g System;us<strong>in</strong>g System.Collections;public class IterationSample : IEnumerable{object[] values;<strong>in</strong>t start<strong>in</strong>gPo<strong>in</strong>t;}public IterationSample (object[] values, <strong>in</strong>t start<strong>in</strong>gPo<strong>in</strong>t){this.values = values;this.start<strong>in</strong>gPo<strong>in</strong>t = start<strong>in</strong>gPo<strong>in</strong>t;}public IEnumerator GetEnumerator(){throw new NotImplementedException();}As you can see, we haven’t implemented GetEnumerator yet, but the rest of the code isready to go. So, how do we go about implement<strong>in</strong>g GetEnumerator? The first th<strong>in</strong>g tounderstand is that we need to store some state somewhere. One important aspect ofthe iterator pattern is that we don’t return all of the data <strong>in</strong> one go—the client justasks for one element at a time. That means we need to keep track of how far we’vealready gone through our array.So, where should this state live? Suppose we tried to put it <strong>in</strong> the IterationSampleclass itself, mak<strong>in</strong>g that implement IEnumerator as well as IEnumerable. At first sight,this looks like a good plan—after all, the data is <strong>in</strong> the right place, <strong>in</strong>clud<strong>in</strong>g the start<strong>in</strong>gpo<strong>in</strong>t. Our GetEnumerator method could just return this. However, there’s a bigproblem with this approach—if GetEnumerator is called several times, several <strong>in</strong>dependentiterators should be returned. For <strong>in</strong>stance, we should be able to use twoforeach statements, one <strong>in</strong>side another, to get all possible pairs of values. That suggestswe need to create a new object each time GetEnumerator is called. We could stillimplement the functionality directly with<strong>in</strong> IterationSample, but then we’d have aclass that didn’t have a clear s<strong>in</strong>gle responsibility—it would be pretty confus<strong>in</strong>g.Licensed to Rhona Hadida


164 CHAPTER 6 Implement<strong>in</strong>g iterators the easy wayInstead, let’s create another class to implement the iterator itself. We’ll use the factthat <strong>in</strong> <strong>C#</strong> a nested type has access to its enclos<strong>in</strong>g type’s private members, whichmeans we can just store a reference to the “parent” IterationSample, along with thestate of how far we’ve gone so far. This is shown <strong>in</strong> list<strong>in</strong>g 6.3.List<strong>in</strong>g 6.3class IterationSampleIterator : IEnumerator{IterationSample parent;<strong>in</strong>t position;}<strong>in</strong>ternal IterationSampleIterator(IterationSample parent){this.parent = parent;position = -1;Starts before}first elementpublic bool MoveNext(){if (position != parent.values.Length){position++;}return position < parent.values.Length;}public object Current{get{if (position==-1 ||position==parent.values.Length){throw new InvalidOperationException();}}}Nested class implement<strong>in</strong>g the collection’s iterator<strong>in</strong>t <strong>in</strong>dex = (position+parent.start<strong>in</strong>gPo<strong>in</strong>t);<strong>in</strong>dex = <strong>in</strong>dex % parent.values.Length;return parent.values[<strong>in</strong>dex];public void Reset(){position = -1;}HDMoves back tobefore first elementWhat a lot of code to perform such a simple task! We remember the orig<strong>in</strong>al collectionof values we’re iterat<strong>in</strong>g over B and keep track of where we would be <strong>in</strong> a simplezero-based array C. To return an element we offset that <strong>in</strong>dex by the start<strong>in</strong>gpo<strong>in</strong>t G. In keep<strong>in</strong>g with the <strong>in</strong>terface, we consider our iterator to start logicallybefore the first element D, so the client will have to call MoveNext before us<strong>in</strong>g theCurrent property for the first time. The conditional <strong>in</strong>crement at E makes the testBRefers tocollection we’reiterat<strong>in</strong>g overCE Incrementsposition if we’restill go<strong>in</strong>gFIndicateshow far we’veiteratedPrevents access beforefirst or after last elementGImplementswraparoundLicensed to Rhona Hadida


<strong>C#</strong> 2: simple iterators with yield statements165at F simple and correct even if MoveNext is called aga<strong>in</strong> after it’s first reported thatthere’s no more data available. To reset the iterator, we set our logical position backto “before the first element” H.Most of the logic <strong>in</strong>volved is fairly straightforward, although there’s lots of roomfor off-by-one errors; <strong>in</strong>deed, my first implementation failed its unit tests for preciselythat reason. The good news is that it works, and that we only need to implementIEnumerable <strong>in</strong> IterationSample to complete the example:public IEnumerator GetEnumerator(){return new IterationSampleIterator(this);}I won’t reproduce the comb<strong>in</strong>ed code here, but it’s available on the book’s website,<strong>in</strong>clud<strong>in</strong>g list<strong>in</strong>g 6.1, which now has the expected output.It’s worth bear<strong>in</strong>g <strong>in</strong> m<strong>in</strong>d that this is a relatively simple example—there’s not a lotof state to keep track of, and no attempt to check whether the collection has changedbetween iterations. With this large burden <strong>in</strong>volved to implement a simple iterator, weshouldn’t be surprised at the rarity of implement<strong>in</strong>g the pattern <strong>in</strong> <strong>C#</strong> 1. Developershave generally been happy to use foreach on the collections provided by the framework,but they use more direct (and collection-specific) access when it comes to theirown collections.So, 40 l<strong>in</strong>es of code to implement the iterator <strong>in</strong> <strong>C#</strong> 1, not <strong>in</strong>clud<strong>in</strong>g comments.Let’s see if <strong>C#</strong> 2 can do any better.6.2 <strong>C#</strong> 2: simple iterators with yield statementsI’ve always been the k<strong>in</strong>d of person who likes to stay up until midnight on ChristmasEve <strong>in</strong> order to open a present as soon as Christmas Day arrives. In the same way, Ith<strong>in</strong>k I’d f<strong>in</strong>d it almost impossible to wait any significant amount of time before show<strong>in</strong>gyou how neat the solution is <strong>in</strong> <strong>C#</strong> 2.6.2.1 Introduc<strong>in</strong>g iterator blocks and yield returnThis chapter wouldn’t exist if <strong>C#</strong> 2 didn’t have a powerful feature that cut down theamount of code you had to write to implement iterators. In some other topics theamount of code has only been reduced slightly, or has just made someth<strong>in</strong>g more elegant.In this case, however, the amount of code required is reduced massively. List<strong>in</strong>g 6.4shows the complete implementation of the GetEnumerator method <strong>in</strong> <strong>C#</strong> 2.List<strong>in</strong>g 6.4Iterat<strong>in</strong>g through the sample collection with <strong>C#</strong> 2 and yield returnMuchsimpler, isn’tit?public IEnumerator GetEnumerator(){for (<strong>in</strong>t <strong>in</strong>dex=0; <strong>in</strong>dex < values.Length; <strong>in</strong>dex++){yield return values[(<strong>in</strong>dex+start<strong>in</strong>gPo<strong>in</strong>t)%values.Length];}}Licensed to Rhona Hadida


166 CHAPTER 6 Implement<strong>in</strong>g iterators the easy wayFour l<strong>in</strong>es of implementation, two of which are just braces. Just to make it clear, thatreplaces the whole of the IterationSampleIterator class. Completely. At least <strong>in</strong> thesource code… Later on we’ll see what the compiler has done beh<strong>in</strong>d our back, andsome of the quirks of the implementation it’s provided, but for the moment let’s lookat the source code we’ve used.The method looks like a perfectly normal one until you see the use of yieldreturn. That’s what tells the <strong>C#</strong> compiler that this isn’t a normal method but oneimplemented with an iterator block. The method is declared to return an IEnumerator,and you can only use iterator blocks to implement methods 1 that have a return type ofIEnumerable, IEnumerator, or one of the generic equivalents. The yield type of the iteratorblock is object if the declared return type of the method is a nongeneric <strong>in</strong>terface,or the type argument of the generic <strong>in</strong>terface otherwise. For <strong>in</strong>stance, a methoddeclared to return IEnumerable would have a yield type of str<strong>in</strong>g.No normal return statements are allowed with<strong>in</strong> iterator blocks—only yieldreturn. All yield return statements <strong>in</strong> the block have to try to return a value compatiblewith the yield type of the block. To use our previous example, you couldn’t writeyield return 1; <strong>in</strong> a method declared to return IEnumerable.NOTERestrictions on yield return—There are a few further restrictions on yieldstatements. You can’t use yield return <strong>in</strong>side a try block if it has anycatch blocks, and you can’t use either yield return or yield break(which we’ll come to shortly) <strong>in</strong> a f<strong>in</strong>ally block. That doesn’t mean youcan’t use try/catch or try/f<strong>in</strong>ally blocks <strong>in</strong>side iterators—it justrestricts what you can do <strong>in</strong> them.The big idea that you need to get your head around when it comes toiterator blocks is that although you’ve written a method that looks like itexecutes sequentially, what you’ve actually asked the compiler to do iscreate a state mach<strong>in</strong>e for you. This is necessary for exactly the same reasonwe had to put so much effort <strong>in</strong>to implement<strong>in</strong>g the iterator <strong>in</strong><strong>C#</strong>1—the caller only wants to see one element at a time, so we need tokeep track of what we were do<strong>in</strong>g when we last returned a value.In iterator blocks, the compiler creates a state mach<strong>in</strong>e (<strong>in</strong> the form of anested type), which remembers exactly where we were with<strong>in</strong> the block and what valuesthe local variables (<strong>in</strong>clud<strong>in</strong>g parameters) had at that po<strong>in</strong>t. The compiler analyzesthe iterator block and creates a class that is similar to the longhandimplementation we wrote earlier, keep<strong>in</strong>g all the necessary state as <strong>in</strong>stance variables.Let’s th<strong>in</strong>k about what this state mach<strong>in</strong>e has to do <strong>in</strong> order to implement the iterator:Compilerdoes allthe work!■■It has to have some <strong>in</strong>itial state.Whenever MoveNext is called, it has to execute code from the GetEnumeratormethod until we’re ready to provide the next value (<strong>in</strong> other words, until we hita yield return statement).1Or properties, as we’ll see later on. You can’t use an iterator block <strong>in</strong> an anonymous method, though.Licensed to Rhona Hadida


<strong>C#</strong> 2: simple iterators with yield statements167■ When the Current property is used, it has to return the last value we yielded.■ It has to know when we’ve f<strong>in</strong>ished yield<strong>in</strong>g values so that MoveNext can returnfalse.The second po<strong>in</strong>t <strong>in</strong> this list is the tricky one, because it always needs to “restart” thecode from the po<strong>in</strong>t it had previously reached. Keep<strong>in</strong>g track of the local variables (asthey appear <strong>in</strong> the method) isn’t too hard—they’re just represented by <strong>in</strong>stance variables<strong>in</strong> the state mach<strong>in</strong>e. The restart<strong>in</strong>g aspect is trickier, but the good news is thatunless you’re writ<strong>in</strong>g a <strong>C#</strong> compiler yourself, you needn’t care about how it’s achieved:the result from a black box po<strong>in</strong>t of view is that it just works. You can write perfectlynormal code with<strong>in</strong> the iterator block and the compiler is responsible for mak<strong>in</strong>g surethat the flow of execution is exactly as it would be <strong>in</strong> any other method; the differenceis that a yield return statement appears to only “temporarily” exit the method—youcould th<strong>in</strong>k of it as be<strong>in</strong>g paused, effectively.Next we’ll exam<strong>in</strong>e the flow of execution <strong>in</strong> more detail, and <strong>in</strong> a more visual way.6.2.2 Visualiz<strong>in</strong>g an iterator’s workflowIt may help to th<strong>in</strong>k about how iterators execute <strong>in</strong> terms of a sequence diagram. 2Rather than draw<strong>in</strong>g the diagram out by hand, let’s write a program to pr<strong>in</strong>t it out(list<strong>in</strong>g 6.5). The iterator itself just provides a sequence of numbers (0, 1, 2, –1) andthen f<strong>in</strong>ishes. The <strong>in</strong>terest<strong>in</strong>g part isn’t the numbers provided so much as the flow ofthe code.List<strong>in</strong>g 6.5Show<strong>in</strong>g the sequence of calls between an iterator and its callerstatic readonly str<strong>in</strong>g Padd<strong>in</strong>g = new str<strong>in</strong>g(' ', 30);static IEnumerable GetEnumerable(){Console.WriteL<strong>in</strong>e ("{0}Start of GetEnumerator()", Padd<strong>in</strong>g);for (<strong>in</strong>t i=0; i < 3; i++){Console.WriteL<strong>in</strong>e ("{0}About to yield {1}", Padd<strong>in</strong>g, i);yield return i;Console.WriteL<strong>in</strong>e ("{0}After yield", Padd<strong>in</strong>g);}Console.WriteL<strong>in</strong>e ("{0}Yield<strong>in</strong>g f<strong>in</strong>al value", Padd<strong>in</strong>g);yield return -1;}Console.WriteL<strong>in</strong>e ("{0}End of GetEnumerator()", Padd<strong>in</strong>g);...IEnumerable iterable = GetEnumerable();IEnumerator iterator = iterable.GetEnumerator();2See http://en.wikipedia.org/wiki/Sequence_diagram if this is unfamiliar to you.Licensed to Rhona Hadida


168 CHAPTER 6 Implement<strong>in</strong>g iterators the easy wayConsole.WriteL<strong>in</strong>e ("Start<strong>in</strong>g to iterate");while (true){Console.WriteL<strong>in</strong>e ("Call<strong>in</strong>g MoveNext()...");bool result = iterator.MoveNext();Console.WriteL<strong>in</strong>e ("... MoveNext result={0}", result);if (!result){break;}Console.WriteL<strong>in</strong>e ("Fetch<strong>in</strong>g Current...");Console.WriteL<strong>in</strong>e ("... Current result={0}", iterator.Current);}List<strong>in</strong>g 6.5 certa<strong>in</strong>ly isn’t pretty, particularly around the iteration side of th<strong>in</strong>gs. Inthe normal course of events we’d just use a foreach loop, but to show exactly what’shappen<strong>in</strong>g when, I had to break the use of the iterator out <strong>in</strong>to little pieces. Thiscode broadly does what foreach does, although foreach also calls Dispose at theend, which is important for iterator blocks, as we’ll see shortly. As you can see,there’s no difference <strong>in</strong> the syntax with<strong>in</strong> the method even though this time we’rereturn<strong>in</strong>g IEnumerable <strong>in</strong>stead of IEnumerator. Here’s the output fromlist<strong>in</strong>g 6.5:Start<strong>in</strong>g to iterateCall<strong>in</strong>g MoveNext()...Start of GetEnumerator()About to yield 0... MoveNext result=TrueFetch<strong>in</strong>g Current...... Current result=0Call<strong>in</strong>g MoveNext()...After yieldAbout to yield 1... MoveNext result=TrueFetch<strong>in</strong>g Current...... Current result=1Call<strong>in</strong>g MoveNext()...After yieldAbout to yield 2... MoveNext result=TrueFetch<strong>in</strong>g Current...... Current result=2Call<strong>in</strong>g MoveNext()...After yieldYield<strong>in</strong>g f<strong>in</strong>al value... MoveNext result=TrueFetch<strong>in</strong>g Current...... Current result=-1Call<strong>in</strong>g MoveNext()...End of GetEnumerator()... MoveNext result=FalseThere are various important th<strong>in</strong>gs to note from this output:Licensed to Rhona Hadida


<strong>C#</strong> 2: simple iterators with yield statements169■ None of the code we wrote <strong>in</strong> GetEnumerator is called until the first call toMoveNext.■ Call<strong>in</strong>g MoveNext is the place all the work gets done; fetch<strong>in</strong>g Current doesn’trun any of our code.■ The code stops execut<strong>in</strong>g at yield return and picks up aga<strong>in</strong> just afterwards atthe next call to MoveNext.■ We can have multiple yield return statements <strong>in</strong> different places <strong>in</strong> the method.■ The code doesn’t end at the last yield return—<strong>in</strong>stead, the call to MoveNextthat causes us to reach the end of the method is the one that returns false.There are two th<strong>in</strong>gs we haven’t seen yet—an alternative way of halt<strong>in</strong>g the iteration,and how f<strong>in</strong>ally blocks work <strong>in</strong> this somewhat odd form of execution. Let’s take alook at them now.6.2.3 Advanced iterator execution flowIn normal methods, the return statement has two effects: First, it supplies the valuethe caller sees as the return value. Second, it term<strong>in</strong>ates the execution of the method,execut<strong>in</strong>g any appropriate f<strong>in</strong>ally blocks on the way out. We’ve seen that the yieldreturn statement temporarily exits the method, but only until MoveNext is calledaga<strong>in</strong>, and we haven’t exam<strong>in</strong>ed the behavior of f<strong>in</strong>ally blocks at all yet. How can wereally stop the method, and what happens to all of those f<strong>in</strong>ally blocks? We’ll startwith a fairly simple construct—the yield break statement.ENDING AN ITERATOR WITH YIELD BREAKYou can always f<strong>in</strong>d a way to make a method have a s<strong>in</strong>gle exit po<strong>in</strong>t, and many peoplework very hard to achieve this. 3 The same techniques can be applied <strong>in</strong> iteratorblocks. However, should you wish to have an “early out,” the yield break statement isyour friend. This effectively term<strong>in</strong>ates the iterator, mak<strong>in</strong>g the current call to Move-Next return false.List<strong>in</strong>g 6.6 demonstrates this by count<strong>in</strong>g up to 100 but stopp<strong>in</strong>g early if it runs outof time. This also demonstrates the use of a method parameter <strong>in</strong> an iterator block, 4and proves that the name of the method is irrelevant.List<strong>in</strong>g 6.6Demonstration of yield breakstatic IEnumerable CountWithTimeLimit(DateTime limit){for (<strong>in</strong>t i=1; i = limit){yield break;Stops if ourtime is up3I personally f<strong>in</strong>d that the hoops you have to jump through to achieve this often make the code much harderto read than just hav<strong>in</strong>g multiple return po<strong>in</strong>ts, especially as try/f<strong>in</strong>ally is available for cleanup and youneed to account for the possibility of exceptions occurr<strong>in</strong>g anyway. However, the po<strong>in</strong>t is that it can all be done.4Note that methods tak<strong>in</strong>g ref or out parameters can’t be implemented with iterator blocks.Licensed to Rhona Hadida


170 CHAPTER 6 Implement<strong>in</strong>g iterators the easy way}}...}yield return i;DateTime stop = DateTime.Now.AddSeconds(2);foreach (<strong>in</strong>t i <strong>in</strong> CountWithTimeLimit(stop)){Console.WriteL<strong>in</strong>e ("Received {0}", i);Thread.Sleep(300);}Typically when you run list<strong>in</strong>g 6.6 you’ll see about seven l<strong>in</strong>es of output. The foreachloop term<strong>in</strong>ates perfectly normally—as far as it’s concerned, the iterator has just runout of elements to iterate over. The yield break statement behaves very much like areturn statement <strong>in</strong> a normal method.So far, so simple. There’s one last aspect execution flow to explore: how and whenf<strong>in</strong>ally blocks are executed.EXECUTION OF FINALLY BLOCKSWe’re used to f<strong>in</strong>ally blocks execut<strong>in</strong>g whenever we leave the relevant scope. Iteratorblocks don’t behave quite like normal methods, though—as we’ve seen, a yieldreturn statement effectively pauses the method rather than exit<strong>in</strong>g it. Follow<strong>in</strong>g thatlogic, we wouldn’t expect any f<strong>in</strong>ally blocks to be executed at that po<strong>in</strong>t—and<strong>in</strong>deed they aren’t.However, appropriate f<strong>in</strong>ally blocks are executed when a yield break statement ishit, just as you’d expect them to be when return<strong>in</strong>g from a normal method. 5 List<strong>in</strong>g 6.7shows this <strong>in</strong> action—it’s the same code as list<strong>in</strong>g 6.6, but with a f<strong>in</strong>ally block. Thechanges are shown <strong>in</strong> bold.List<strong>in</strong>g 6.7Demonstration of yield break work<strong>in</strong>g with try/f<strong>in</strong>allystatic IEnumerable CountWithTimeLimit(DateTime limit){try{for (<strong>in</strong>t i=1; i = limit){yield break;}yield return i;}}f<strong>in</strong>ally{5They’re also called when execution leaves the relevant scope without reach<strong>in</strong>g either a yield return or ayield break statement. I’m only focus<strong>in</strong>g on the behavior of the two yield statements here because that’swhere the flow of execution is new and different.Licensed to Rhona Hadida


<strong>C#</strong> 2: simple iterators with yield statements171}}...Console.WriteL<strong>in</strong>e ("Stopp<strong>in</strong>g!");Executes howeverthe loop endsDateTime stop = DateTime.Now.AddSeconds(2);foreach (<strong>in</strong>t i <strong>in</strong> CountWithTimeLimit(stop)){Console.WriteL<strong>in</strong>e ("Received {0}", i);Thread.Sleep(300);}The f<strong>in</strong>ally block <strong>in</strong> list<strong>in</strong>g 6.7 is executed whether the iterator block just f<strong>in</strong>ishes bycount<strong>in</strong>g to 100, or whether it has to stop due to the time limit be<strong>in</strong>g reached. (Itwould also execute if the code threw an exception.) However, there are other ways wemight try to avoid the f<strong>in</strong>ally block from be<strong>in</strong>g called… let’s try to be sneaky.We’ve seen that code <strong>in</strong> the iterator block is only executed when MoveNext iscalled. So what happens if we never call MoveNext? Or if we call it a few times and thenstop? Let’s consider chang<strong>in</strong>g the “call<strong>in</strong>g” part of list<strong>in</strong>g 6.7 to this:DateTime stop = DateTime.Now.AddSeconds(2);foreach (<strong>in</strong>t i <strong>in</strong> CountWithTimeLimit(stop)){Console.WriteL<strong>in</strong>e ("Received {0}", i);if (i > 3){Console.WriteL<strong>in</strong>e("Return<strong>in</strong>g");return;}Thread.Sleep(300);}Here we’re not stopp<strong>in</strong>g early <strong>in</strong> the iterator code—we’re stopp<strong>in</strong>g early <strong>in</strong> the codeus<strong>in</strong>g the iterator. The output is perhaps surpris<strong>in</strong>g:Received 1Received 2Received 3Received 4Return<strong>in</strong>gStopp<strong>in</strong>g!Here, code is be<strong>in</strong>g executed after the return statement <strong>in</strong> the foreach loop. Thatdoesn’t normally happen unless there’s a f<strong>in</strong>ally block <strong>in</strong>volved—and <strong>in</strong> this casethere are two! We already know about the f<strong>in</strong>ally block <strong>in</strong> the iterator method, butthe question is what’s caus<strong>in</strong>g it to be executed. I gave a h<strong>in</strong>t to this earlier on—foreach calls Dispose on the IEnumerator it’s provided with, <strong>in</strong> its own f<strong>in</strong>ally block(just like the us<strong>in</strong>g statement). When you call Dispose on an iterator created with aniterator block before it’s f<strong>in</strong>ished iterat<strong>in</strong>g, the state mach<strong>in</strong>e executes any f<strong>in</strong>allyblocks that are <strong>in</strong> the scope of where the code is currently “paused.”We can prove very easily that it’s the call to Dispose that triggers this by us<strong>in</strong>g theiterator manually:Licensed to Rhona Hadida


172 CHAPTER 6 Implement<strong>in</strong>g iterators the easy wayDateTime stop = DateTime.Now.AddSeconds(2);IEnumerable iterable = CountWithTimeLimit(stop);IEnumerator iterator = iterable.GetEnumerator();iterator.MoveNext();Console.WriteL<strong>in</strong>e ("Received {0}", iterator.Current);iterator.MoveNext();Console.WriteL<strong>in</strong>e ("Received {0}", iterator.Current);This time the “stopp<strong>in</strong>g” l<strong>in</strong>e is never pr<strong>in</strong>ted. It’s relatively rare that you’ll want to term<strong>in</strong>atean iterator before it’s f<strong>in</strong>ished, and it’s relatively rare that you’ll be iterat<strong>in</strong>gmanually <strong>in</strong>stead of us<strong>in</strong>g foreach, but if you do, remember to wrap the iterator <strong>in</strong> aus<strong>in</strong>g statement.We’ve now covered most of the behavior of iterator blocks, but before we endthis section it’s worth consider<strong>in</strong>g a few oddities to do with the current Microsoftimplementation.6.2.4 Quirks <strong>in</strong> the implementationIf you compile iterator blocks with the Microsoft <strong>C#</strong> 2 compiler and look at the result<strong>in</strong>gIL <strong>in</strong> either ildasm or Reflector, you’ll see the nested type that the compiler hasgenerated for us beh<strong>in</strong>d the scenes. In my case when compil<strong>in</strong>g our (evolved) firstiterator block example, it was called IterationSample.d__0 (wherethe angle brackets aren’t <strong>in</strong>dicat<strong>in</strong>g a generic type parameter, by the way). I won’t gothrough exactly what’s generated <strong>in</strong> detail here, but it’s worth look<strong>in</strong>g at it <strong>in</strong> Reflectorto get a feel for what’s go<strong>in</strong>g on, preferably with the language specification next toyou: the specification def<strong>in</strong>es different states the type can be <strong>in</strong>, and this descriptionmakes the generated code easier to follow.Fortunately, as developers we don’t need to care much about the hoops the compilerhas to jump through. However, there are a few quirks about the implementationthat are worth know<strong>in</strong>g about:■ Before MoveNext is called for the first time, the Current property will alwaysreturn null (or the default value for the relevant type, for the generic <strong>in</strong>terface).■ After MoveNext has returned false, the Current property will always return thelast value returned.■ Reset always throws an exception <strong>in</strong>stead of resett<strong>in</strong>g like our manual implementationdid. This is required behavior, laid down <strong>in</strong> the specification.■ The nested class always implements both the generic and nongeneric form ofIEnumerator (and the generic and nongeneric IEnumerable where appropriate).Fail<strong>in</strong>g to implement Reset is quite reasonable—the compiler can’t reasonably workout what you’d need to do <strong>in</strong> order to reset the iterator, or even whether it’s feasible.Arguably Reset shouldn’t have been <strong>in</strong> the IEnumerator <strong>in</strong>terface to start with, and Icerta<strong>in</strong>ly can’t remember the last time I called it.Implement<strong>in</strong>g extra <strong>in</strong>terfaces does no harm either. It’s <strong>in</strong>terest<strong>in</strong>g that if yourmethod returns IEnumerable you end up with one class implement<strong>in</strong>g five <strong>in</strong>terfacesLicensed to Rhona Hadida


Real-life example: iterat<strong>in</strong>g over ranges173(<strong>in</strong>clud<strong>in</strong>g IDisposable). The language specification expla<strong>in</strong>s it <strong>in</strong> detail, but theupshot is that as a developer you don’t need to worry.The behavior of Current is odd—<strong>in</strong> particular, keep<strong>in</strong>g hold of the last item aftersupposedly mov<strong>in</strong>g off it could keep it from be<strong>in</strong>g garbage collected. It’s possible thatthis may be fixed <strong>in</strong> a later release of the <strong>C#</strong> compiler, though it’s unlikely as it couldbreak exist<strong>in</strong>g code. 6 Strictly speak<strong>in</strong>g, it’s correct from the <strong>C#</strong> 2 language specificationpo<strong>in</strong>t of view—the behavior of the Current property is undef<strong>in</strong>ed. It would benicer if it implemented the property <strong>in</strong> the way that the framework documentationsuggests, however, throw<strong>in</strong>g exceptions at appropriate times.So, there are a few t<strong>in</strong>y drawbacks from us<strong>in</strong>g the autogenerated code, but sensiblecallers won’t have any problems—and let’s face it, we’ve saved a lot of code <strong>in</strong> order tocome up with the implementation. This means it makes sense to use iterators morewidely than we might have done <strong>in</strong> <strong>C#</strong> 1. Our next section provides some sample codeso you can check your understand<strong>in</strong>g of iterator blocks and see how they’re useful <strong>in</strong>real life rather than just <strong>in</strong> theoretical scenarios.6.3 Real-life example: iterat<strong>in</strong>g over rangesHave you ever written some code that is really simple <strong>in</strong> itself but makes your projectmuch neater? It happens to me every so often, and it usually makes me happier than itprobably ought to—enough to get strange looks from colleagues, anyway. That sort ofslightly childish delight is particularly strong when it comes to us<strong>in</strong>g a new languagefeature <strong>in</strong> a way that is clearly nicer and not just do<strong>in</strong>g it for the sake of play<strong>in</strong>g withnew toys.6.3.1 Iterat<strong>in</strong>g over the dates <strong>in</strong> a timetableWhile work<strong>in</strong>g on a project <strong>in</strong>volv<strong>in</strong>g timetables, I came across a few loops, all ofwhich started like this:for (DateTime day = timetable.StartDate;day


174 CHAPTER 6 Implement<strong>in</strong>g iterators the easy wayonly made a few for loops neater <strong>in</strong> this case. In <strong>C#</strong> 2, however, it was easy. With<strong>in</strong> theclass represent<strong>in</strong>g the timetable, I simply added a property:public IEnumerable DateRange{get{for (DateTime day = StartDate;day


Real-life example: iterat<strong>in</strong>g over ranges175■ We want to be able to f<strong>in</strong>d out whether a particular value is with<strong>in</strong> the range.■ We want to be able to iterate through the range easily.The last po<strong>in</strong>t is obviously the most important one for this chapter, but the othersshape the fundamental decisions and ask further questions. In particular, it seemsobvious that this should use generics, but should we allow any type to be used for thebounds of the range, us<strong>in</strong>g an appropriate IComparer, or should we only allow typesthat implement IComparable, where T is the same type? When we’re iterat<strong>in</strong>g, howdo we move from one value to another? Should we always have to be able to iterateover a range, even if we’re only <strong>in</strong>terested <strong>in</strong> the other aspects? Should we be able tohave a “reverse” range (<strong>in</strong> other words, one with a start that is greater than the end,and therefore counts down rather than up)? Should the start and end po<strong>in</strong>ts be exclusiveor <strong>in</strong>clusive?All of these are important questions, and the normal answers would promote flexibilityand usefulness of the type—but our overrid<strong>in</strong>g priority here is to keep th<strong>in</strong>gssimple. So:■ We’ll make comparisons simple by constra<strong>in</strong><strong>in</strong>g the range’s type parameter T toimplement IComparable.■ We’ll make the class abstract and require a GetNextValue method to be implemented,which will be used dur<strong>in</strong>g iteration.■ We won’t worry about the idea of a range that can’t be iterated over.■ We won’t allow reverse ranges (so the end value must always be greater than orequal to the start value).■ Start and end po<strong>in</strong>ts will both be <strong>in</strong>clusive (so both the start and end po<strong>in</strong>ts areconsidered to be members of the range). One consequence of this is that wecan’t represent an empty range.The decision to make it an abstract class isn’t as limit<strong>in</strong>g as it possibly sounds—itmeans we’ll have derived classes like Int32Range and DateTimeRange that allow you tospecify the “step” to use when iterat<strong>in</strong>g. If we ever wanted a more general range, wecould always create a derived type that allows the step to be specified as a Converterdelegate. For the moment, however, let’s concentrate on the base type. With all therequirements specified, 8 we’re ready to write the code.6.3.3 Implementation us<strong>in</strong>g iterator blocksWith <strong>C#</strong> 2, implement<strong>in</strong>g this (fairly limited) Range type is remarkably easy. The hardestpart (for me) is remember<strong>in</strong>g how IComparable.CompareTo works. The trick I usuallyuse is to remember that if you compare the return value with 0, the result is the sameas apply<strong>in</strong>g that comparison operator between the two values <strong>in</strong>volved, <strong>in</strong> the orderthey’re specified. So x.CompareTo(y) < 0 has the same mean<strong>in</strong>g as x < y, for example.8If only real life were as simple as this. We haven’t had to get project approval and specification sign-off froma dozen different parties, nor have we had to create a project plan complete with resource requirements.Beautiful!Licensed to Rhona Hadida


176 CHAPTER 6 Implement<strong>in</strong>g iterators the easy wayList<strong>in</strong>g 6.8 is the complete Range class, although we can’t quite use it yet as it’s stillabstract.List<strong>in</strong>g 6.8The abstract Range class allow<strong>in</strong>g flexible iteration over its valuesus<strong>in</strong>g System;us<strong>in</strong>g System.Collections;us<strong>in</strong>g System.Collections.Generic;public abstract class Range : IEnumerablewhere T : IComparable{Breadonly T start;readonly T end;public Range(T start, T end) CPrevents{“reversed”if (start.CompareTo(end) > 0) ranges{throw new ArgumentOutOfRangeException();}this.start = start;this.end = end;}public T Start{get { return start; }}public T End{get { return end; }}public bool Conta<strong>in</strong>s(T value){return value.CompareTo(start) >= 0 &&value.CompareTo(end)


Real-life example: iterat<strong>in</strong>g over ranges177}}return GetEnumerator();protected abstract T GetNextValue(T current);The code is quite straightforward, due to <strong>C#</strong> 2’s iterator blocks. The type constra<strong>in</strong>t onT B ensures that we’ll be able to compare two values, and we perform an executiontimecheck to prevent a range be<strong>in</strong>g constructed with a lower bound higher than theupper bound C. We still need to work around the problem of implement<strong>in</strong>g bothIEnumerable and IEnumerable by us<strong>in</strong>g explicit <strong>in</strong>terface implementation for thenongeneric type E and expos<strong>in</strong>g the generic <strong>in</strong>terface implementation <strong>in</strong> the moreusual, implicit way D. If you don’t immediately see why this is necessary, look back tothe descriptions <strong>in</strong> sections 2.2.2 and 3.4.3.The actual iteration merely starts with the lower end of the range, and repeatedlyfetches the next value by call<strong>in</strong>g the abstract GetNextValue method F until we’vereached or exceeded the top of the range. If the last value found is <strong>in</strong> the range, weyield that as well. Note that the GetNextValue method shouldn’t need to keep anystate—given one value, it should merely return the next one <strong>in</strong> the range. This is usefulas it means that we should be able to make most of the derived types immutable,which is always nice. It’s easy to derive from Range, and we’ll implement the twoexamples given earlier—a range for dates and times (DateTimeRange) and a rangefor <strong>in</strong>tegers (Int32Range). They’re very short and very similar—list<strong>in</strong>g 6.9 shows bothof them together.F“Steps” fromone value tothe nextList<strong>in</strong>g 6.9Two classes derived from Range, to iterate over dates/times and <strong>in</strong>tegersus<strong>in</strong>g System;public class DateTimeRange : Range{readonly TimeSpan step;public DateTimeRange(DateTime start, DateTime end): this (start, end, TimeSpan.FromDays(1)){ }Usesdefaultsteppublic DateTimeRange(DateTime start,DateTime end,TimeSpan step): base(start, end){this.step = step;}Usesspecifiedstep}protected override DateTime GetNextValue(DateTime current){return current + step;Uses step to}f<strong>in</strong>d next valuepublic class Int32Range : Range{readonly <strong>in</strong>t step;Licensed to Rhona Hadida


178 CHAPTER 6 Implement<strong>in</strong>g iterators the easy way}public Int32Range(<strong>in</strong>t start, <strong>in</strong>t end): this (start, end, 1){ }public Int32Range(<strong>in</strong>t start, <strong>in</strong>t end, <strong>in</strong>t step): base(start, end){this.step = step;}protected override <strong>in</strong>t GetNextValue(<strong>in</strong>t current){return current + step;}If we could have specified addition (potentially us<strong>in</strong>g another type) as a type parameter,we could have used a s<strong>in</strong>gle type everywhere, which would have been neat. Thereare other obvious candidates, such as S<strong>in</strong>gleRange, DoubleRange, and DecimalRange,which I haven’t shown here. Even though we have to derive an extra class for eachtype we want to iterate over, the ga<strong>in</strong> over <strong>C#</strong> 1 is still tremendous. Without genericsthere would have been casts everywhere (and box<strong>in</strong>g for value types, which probably<strong>in</strong>cludes most types you want to use for ranges), and without iterator blocks the codefor the separate iterator type we’d have needed would probably have been about aslong as the base class itself. It’s worth not<strong>in</strong>g that when we use the step to f<strong>in</strong>d the nextvalue we don’t need to change anyth<strong>in</strong>g with<strong>in</strong> the <strong>in</strong>stance—both of these types areimmutable and so can be freely shared between threads, returned as properties, andused for all k<strong>in</strong>ds of other operations without fear.With the DateTimeRange type <strong>in</strong> place, I could replace the DateRange property <strong>in</strong>my timetable application, and remove the StartDate and EndDate properties entirely.The closely related values are now nicely encapsulated, the birds are s<strong>in</strong>g<strong>in</strong>g, and all isright with the world. There’s a lot more we could do to our Range type, but for themoment it’s served its purpose well.The Range type is just one example of a way <strong>in</strong> which iteration presents itself as anatural option <strong>in</strong> <strong>C#</strong> 2 where it would have been significantly less elegant <strong>in</strong> <strong>C#</strong> 1. Ihope you’ll consider it next time you f<strong>in</strong>d yourself writ<strong>in</strong>g a start/end pair of variablesor properties. As examples go, however, it’s pretty tame—iterat<strong>in</strong>g over a range isn’texactly a novel idea. To close the chapter, we’ll look at a considerably less conventionaluse of iterator blocks—this time for the purpose of provid<strong>in</strong>g one side of a multithreadedconversation.6.4 Pseudo-synchronous code withthe Concurrency and Coord<strong>in</strong>ation RuntimeThe Concurrency and Coord<strong>in</strong>ation Runtime (CCR) is a library developed by Microsoft tooffer an alternative way of writ<strong>in</strong>g asynchronous code that is amenable to complexcoord<strong>in</strong>ation. At the time of this writ<strong>in</strong>g, it’s only available as part of the MicrosoftRobotics Studio, 9 although hopefully that will change. We’re not go<strong>in</strong>g to delve <strong>in</strong>toLicensed to Rhona Hadida


Pseudo-synchronous code with the Concurrency and Coord<strong>in</strong>ation Runtime179the depths of it—fasc<strong>in</strong>at<strong>in</strong>g as it is—but it has one very <strong>in</strong>terest<strong>in</strong>g feature that’s relevantto this chapter. Rather than present real code that could compile and run (<strong>in</strong>volv<strong>in</strong>gpages and pages of description of the library), we’ll just look at some pseudo-codeand the ideas beh<strong>in</strong>d it. The purpose of this section isn’t to th<strong>in</strong>k too deeply aboutasynchronous code, but to show how by add<strong>in</strong>g <strong>in</strong>telligence to the compiler, a differentform of programm<strong>in</strong>g becomes feasible and reasonably elegant. The CCR usesiterator blocks <strong>in</strong> an <strong>in</strong>terest<strong>in</strong>g way that takes a certa<strong>in</strong> amount of mental effort tostart with. However, once you see the pattern it can lead to a radically different way ofth<strong>in</strong>k<strong>in</strong>g about asynchronous execution.Suppose we’re writ<strong>in</strong>g a server that needs to handle lots of requests. As part ofdeal<strong>in</strong>g with those requests, we need to first call a web service to fetch an authenticationtoken, and then use that token to get data from two <strong>in</strong>dependent data sources(say a database and another web service). We then process that data and return theresult. Each of the fetch stages could take a while—a second, say. The normal twooptions available are simply synchronous and asynchronous. The pseudo-code for thesynchronous version might look someth<strong>in</strong>g like this:Hold<strong>in</strong>gsValue ComputeTotalStockValue(str<strong>in</strong>g user, str<strong>in</strong>g password){Token token = AuthService.Check(user, password);Hold<strong>in</strong>gs stocks = DbService.GetStockHold<strong>in</strong>gs(token);StockRates rates = StockService.GetRates(token);}return ProcessStocks(stocks, rates);That’s very simple and easy to understand, but if each request takes a second, thewhole operation will take three seconds and tie up a thread for the whole timeit’s runn<strong>in</strong>g. If we want to scale up to hundreds of thousands of requests runn<strong>in</strong>g <strong>in</strong>parallel, we’re <strong>in</strong> trouble. Now let’s consider a fairly simple asynchronous version,which avoids ty<strong>in</strong>g up a thread when noth<strong>in</strong>g’s happen<strong>in</strong>g 10 and uses parallel callswhere possible:void StartComput<strong>in</strong>gTotalStockValue(str<strong>in</strong>g user, str<strong>in</strong>g password){AuthService.Beg<strong>in</strong>Check(user, password, AfterAuthCheck, null);}void AfterAuthCheck(IAsyncResult result){Token token = AuthService.EndCheck(result);IAsyncResult hold<strong>in</strong>gsAsync = DbService.Beg<strong>in</strong>GetStockHold<strong>in</strong>gs(token, null, null);StockService.Beg<strong>in</strong>GetRates(token, AfterGetRates, hold<strong>in</strong>gsAsync);}9http://www.microsoft.com/robotics10 Well, mostly—<strong>in</strong> order to keep it relatively simple, it might still be <strong>in</strong>efficient as we’ll see <strong>in</strong> a moment.Licensed to Rhona Hadida


180 CHAPTER 6 Implement<strong>in</strong>g iterators the easy wayvoid AfterGetRates(IAsyncResult result){IAsyncResult hold<strong>in</strong>gsAsync = (IAsyncResult)result.AsyncState;StockRates rates = StockService.EndGetRates(result);Hold<strong>in</strong>gs hold<strong>in</strong>gs = DbService.EndGetStockHold<strong>in</strong>gs(hold<strong>in</strong>gsAsync);}OnRequestComplete(ProcessStocks(stocks, rates));This is much harder to read and understand—and that’s only a simple version! Thecoord<strong>in</strong>ation of the two parallel calls is only achievable <strong>in</strong> a simple way because wedon’t need to pass any other state around, and even so it’s not ideal. (It will still blocka thread wait<strong>in</strong>g for the database call to complete if the second web service call completesfirst.) It’s far from obvious what’s go<strong>in</strong>g on, because the code is jump<strong>in</strong>g arounddifferent methods so much.By now you may well be ask<strong>in</strong>g yourself where iterators come <strong>in</strong>to the picture. Well,the iterator blocks provided by <strong>C#</strong> 2 effectively allow you to “pause” current executionat certa<strong>in</strong> po<strong>in</strong>ts of the flow through the block, and then come back to the same place,with the same state. The clever folks design<strong>in</strong>g the CCR realized that that’s exactlywhat’s needed for someth<strong>in</strong>g called a cont<strong>in</strong>uation-pass<strong>in</strong>g style of cod<strong>in</strong>g. We need totell the system that there are certa<strong>in</strong> operations we need to perform—<strong>in</strong>clud<strong>in</strong>g start<strong>in</strong>gother operations asynchronously—but that we’re then happy to wait until theasynchronous operations have f<strong>in</strong>ished before we cont<strong>in</strong>ue. We do this by provid<strong>in</strong>gthe CCR with an implementation of IEnumerator (where ITask is an <strong>in</strong>terfacedef<strong>in</strong>ed by the CCR). Here’s pseudo-code for our request handl<strong>in</strong>g <strong>in</strong> this style:IEnumerator ComputeTotalStockValue(str<strong>in</strong>g user, str<strong>in</strong>g pass){Token token = null;}yield return Ccr.ReceiveTask(AuthService.CcrCheck(user, pass)delegate(Token t){ token = t; });Hold<strong>in</strong>gs stocks = null;StockRates rates = null;yield return Ccr.MultipleReceiveTask(DbService.CcrGetStockHold<strong>in</strong>gs(token),StockService.CcrGetRates(token),delegate(Stocks s, StockRates sr){ stocks = s; rates = sr; });OnRequestComplete(ProcessStocks(stocks, rates));Confused? I certa<strong>in</strong>ly was when I first saw it—but now I’m somewhat <strong>in</strong> awe of how neatit is. The CCR will call <strong>in</strong>to our code (with a call to MoveNext on the iterator), and we’llexecute until and <strong>in</strong>clud<strong>in</strong>g the first yield return statement. The CcrCheck methodwith<strong>in</strong> AuthService would kick off an asynchronous request, and the CCR would waitLicensed to Rhona Hadida


Summary181(without us<strong>in</strong>g a dedicated thread) until it had completed, call<strong>in</strong>g the supplied delegateto handle the result. It would then call MoveNext aga<strong>in</strong>, and our method wouldcont<strong>in</strong>ue. This time we kick off two requests <strong>in</strong> parallel, and the CCR to call another delegatewith the results of both operations when they’ve both f<strong>in</strong>ished. After that, Move-Next is called for a f<strong>in</strong>al time and we get to complete the request process<strong>in</strong>g.Although it’s obviously more complicated than the synchronous version, it’s still all<strong>in</strong> one method, it will get executed <strong>in</strong> the order written, and the method itself canhold the state (<strong>in</strong> the local variables, which become state <strong>in</strong> the extra type generatedby the compiler). It’s fully asynchronous, us<strong>in</strong>g as few threads as it can get away with. Ihaven’t shown any error handl<strong>in</strong>g, but that’s also available <strong>in</strong> a sensible fashion thatforces you to th<strong>in</strong>k about the issue at appropriate places.It all takes a while to get your head around (at least unless you’ve seen cont<strong>in</strong>uationpass<strong>in</strong>gstyle code before) but the potential benefits <strong>in</strong> terms of writ<strong>in</strong>g correct, scalablecode are enormous—and it’s only feasible <strong>in</strong> such a neat way due to <strong>C#</strong> 2’s syntacticsugar around iterators and anonymous methods. The CCR hasn’t hit the ma<strong>in</strong>stream atthe time of writ<strong>in</strong>g, but it’s possible that it will become another normal part of thedevelopment toolkit 11 —and that other novel uses for iterator blocks will be thought upover time. As I said earlier, the po<strong>in</strong>t of the section is to open your m<strong>in</strong>d to possible usesof the work that the compiler can do for you beyond just simple iteration.6.5 Summary<strong>C#</strong> supports many patterns <strong>in</strong>directly, <strong>in</strong> terms of it be<strong>in</strong>g feasible to implement them<strong>in</strong> <strong>C#</strong>. However, relatively few patterns are directly supported <strong>in</strong> terms of language featuresbe<strong>in</strong>g specifically targeted at a particular pattern. In <strong>C#</strong> 1, the iterator patternwas directly supported from the po<strong>in</strong>t of view of the call<strong>in</strong>g code, but not from theperspective of the collection be<strong>in</strong>g iterated over. Writ<strong>in</strong>g a correct implementation ofIEnumerable was time-consum<strong>in</strong>g and error-prone, without be<strong>in</strong>g <strong>in</strong>terest<strong>in</strong>g. In <strong>C#</strong> 2the compiler does all the mundane work for you, build<strong>in</strong>g a state mach<strong>in</strong>e to copewith the “call-back” nature of iterators.It should be noted that iterator blocks have one aspect <strong>in</strong> common with the anonymousmethods we saw <strong>in</strong> chapter 5, even though the actual features are very different.In both cases, extra types may be generated, and a potentially complicated code transformationis applied to the orig<strong>in</strong>al source. Compare this with <strong>C#</strong> 1 where most of thetransformations for syntactic sugar (lock, us<strong>in</strong>g, and foreach be<strong>in</strong>g the most obviousexamples) were quite straightforward. We’ll see this trend toward smarter compilationcont<strong>in</strong>u<strong>in</strong>g with almost every aspect of <strong>C#</strong> 3.As well as see<strong>in</strong>g a real-life example of the use of iterators, we’ve taken a look athow one particular library has used them <strong>in</strong> a fairly radical way that has little to dowith what comes to m<strong>in</strong>d when we th<strong>in</strong>k about iteration over a collection. It’s worthbear<strong>in</strong>g <strong>in</strong> m<strong>in</strong>d that different languages have also looked at this sort of problem11 Some aspects of the CCR may also become available as part of the Parallel Extensions library described <strong>in</strong>chapter 13.Licensed to Rhona Hadida


182 CHAPTER 6 Implement<strong>in</strong>g iterators the easy waybefore—<strong>in</strong> computer science the term corout<strong>in</strong>e is applied to concepts of this nature.Different languages have historically supported them to a greater or lesser extent,with tricks be<strong>in</strong>g applicable to simulate them sometimes—for example, SimonTatham has an excellent article 12 on how even C can express corout<strong>in</strong>es if you’re will<strong>in</strong>gto bend cod<strong>in</strong>g standards somewhat. We’ve seen that <strong>C#</strong> 2 makes corout<strong>in</strong>es easyto write and use.Hav<strong>in</strong>g seen some major and sometimes m<strong>in</strong>d-warp<strong>in</strong>g language changes focusedaround a few key features, our next chapter is a change of pace. It describes a numberof small changes that make <strong>C#</strong> 2 more pleasant to work with than its predecessor,learn<strong>in</strong>g from the little niggles of the past to produce a language that has fewer roughedges, more scope for deal<strong>in</strong>g with awkward backward-compatibility cases, and a betterstory around work<strong>in</strong>g with generated code. Each feature is relatively straightforward,but there are quite a few of them…12 http://www.chiark.greenend.org.uk/~sgtatham/corout<strong>in</strong>es.htmlLicensed to Rhona Hadida


Conclud<strong>in</strong>g <strong>C#</strong> 2:the f<strong>in</strong>al featuresThis chapter covers■■■■■■■Partial typesStatic classesSeparate getter/setter property accessNamespace aliasesPragma directivesFixed-size buffersFriend assembliesSo far we’ve looked at the four biggest new features of <strong>C#</strong> 2: generics, nullable types,delegate enhancements, and iterator blocks. Each of these addresses a fairly complexrequirement, which is why we’ve gone <strong>in</strong>to each of them <strong>in</strong> some depth. The rema<strong>in</strong><strong>in</strong>gnew features of <strong>C#</strong> 2 are knock<strong>in</strong>g a few rough edges off <strong>C#</strong> 1. They’re little nigglesthat the language designers decided to correct—either areas where the languageneeded a bit of improvement for its own sake, or where the experience of work<strong>in</strong>gwith code generation and native code could be made more pleasant.183Licensed to Rhona Hadida


184 CHAPTER 7 Conclud<strong>in</strong>g <strong>C#</strong> 2: the f<strong>in</strong>al featuresOver time, Microsoft has received a lot of feedback from the <strong>C#</strong> community (andits own developers, no doubt) about areas where <strong>C#</strong> hasn’t gleamed quite as brightlyas it might. Although it’s impossible to please everyone—and <strong>in</strong> particular the value ofeach feature has to be weighed aga<strong>in</strong>st the extra complexity it might br<strong>in</strong>g to the language—severalsmaller changes made it <strong>in</strong>to <strong>C#</strong> 2 along with the larger ones.None of the features <strong>in</strong> this chapter are particularly difficult, and we’ll go throughthem fairly quickly. Don’t underestimate how important they are, however—justbecause a topic can be explored <strong>in</strong> a few pages doesn’t mean it’s useless. You’re likelyto use some of these features on a fairly frequent basis. Here’s a quick rundown of thefeatures covered <strong>in</strong> this chapter, so you know what to expect:■ Partial types—The ability to write the code for a type <strong>in</strong> multiple source files;particularly handy for types where part of the code is autogenerated and therest is written manually.■ Static classes—Tidy<strong>in</strong>g up utility classes so that the compiler can make it clearerwhen you’re try<strong>in</strong>g to use them <strong>in</strong>appropriately.■ Separate getter/setter property access—F<strong>in</strong>ally, the ability to have a public getter anda private setter for properties! (That’s not the only comb<strong>in</strong>ation available, butit’s the most common one.)■ Namespace aliases—Ways out of sticky situations where type names aren’t unique.■ Pragma directives—Compiler-specific <strong>in</strong>structions for actions such as suppress<strong>in</strong>gspecific warn<strong>in</strong>gs for a particular section of code.■ Fixed-size buffers—More control over how structs handle arrays <strong>in</strong> unsafe code.■ InternalsVisibleToAttribute (friend assemblies)—A feature spann<strong>in</strong>g library, framework,and runtime, this allows selected assemblies more access when required.You may well be itch<strong>in</strong>g to get on to the sexy stuff from <strong>C#</strong> 3 by this po<strong>in</strong>t, and I don’tblame you. Noth<strong>in</strong>g <strong>in</strong> this chapter is go<strong>in</strong>g to set the world on fire—but each of thesefeatures can make your life more pleasant, or dig you out of a hole <strong>in</strong> some cases. Hav<strong>in</strong>gdampened your expectations somewhat, our first feature is actually pretty nifty.7.1 Partial typesThe first change we’ll look at is due to the power struggle that was usually <strong>in</strong>volvedwhen us<strong>in</strong>g code generators with <strong>C#</strong> 1. For W<strong>in</strong>dows Forms, the designer <strong>in</strong> Visual Studiohad to have its own regions of code that couldn’t be touched by developers, with<strong>in</strong>the same file that developers had to edit for user <strong>in</strong>terface functionality. This wasclearly a brittle situation.In other cases, code generators create source that is compiled alongside manuallywritten code. In <strong>C#</strong> 1, add<strong>in</strong>g extra functionality <strong>in</strong>volved deriv<strong>in</strong>g new classes from theautogenerated ones, which is ugly. There are plenty of other scenarios where hav<strong>in</strong>g anunnecessary l<strong>in</strong>k <strong>in</strong> the <strong>in</strong>heritance cha<strong>in</strong> can cause problems or reduce encapsulation.For <strong>in</strong>stance, if two different parts of your code want to call each other, you need virtualmethods for the parent type to call the child, and protected methods for the reverse situation,where normally you’d just use two private nonvirtual methods.Licensed to Rhona Hadida


Partial types185<strong>C#</strong> 2 allows more than one file to contribute to a type, and <strong>in</strong>deed IDEs can extendthis notion so that some of the code that is used for a type may not even be visible as<strong>C#</strong> source code at all. Types built from multiple source files are called partial types.In this section we’ll also learn about partial methods, which are only relevant <strong>in</strong> partialtypes and allow a rich but efficient way of add<strong>in</strong>g manually written hooks <strong>in</strong>toautogenerated code. This is actually a <strong>C#</strong> 3 feature (this time based on feedback about<strong>C#</strong> 2), but it’s far more logical to discuss it when we exam<strong>in</strong>e partial types than to waituntil the next part of the book.7.1.1 Creat<strong>in</strong>g a type with multiple filesCreat<strong>in</strong>g a partial type is a c<strong>in</strong>ch—you just need to <strong>in</strong>clude the partial contextualkeyword <strong>in</strong> the declaration for the type <strong>in</strong> each file it occurs <strong>in</strong>. A partial type can bedeclared with<strong>in</strong> as many files as you like, although all the examples <strong>in</strong> this sectionuse two.The compiler effectively comb<strong>in</strong>es all the source files together before compil<strong>in</strong>g.This means that code <strong>in</strong> one file can call code <strong>in</strong> another and vice versa, as shown <strong>in</strong>figure 7.1—there’s no need for “forward references” or other tricks.You can’t write half of a member <strong>in</strong> one file and half of it <strong>in</strong> another—each <strong>in</strong>dividualmember has to be complete with<strong>in</strong> its own file. 1 There are a few obvious restrictionsabout the declarations of the type—the declarations have to be compatible. Anyfile can specify <strong>in</strong>terfaces to be implemented (and they don’t have to be implemented<strong>in</strong> that file); any file can specify the base type; any file can specify a type parameterconstra<strong>in</strong>t. However, if multiple files specify a base type, those base types have to bethe same, and if multiple files specify type parameter constra<strong>in</strong>ts, the constra<strong>in</strong>ts haveto be identical. List<strong>in</strong>g 7.1 gives an example of the flexibility afforded (while notdo<strong>in</strong>g anyth<strong>in</strong>g even remotely useful).Example1.cspartial class Example{void FirstMethod(){SecondMethod();}}void ThirdMethod(){}Example2.cspartial class Example{void SecondMethod(){ThirdMethod();}}Figure 7.1 Code <strong>in</strong> partial types is able to “see” all of the members of the type,regardless of which file each member is <strong>in</strong>.1There’s an exception here: partial types can conta<strong>in</strong> nested partial types spread across the same set of files.Licensed to Rhona Hadida


186 CHAPTER 7 Conclud<strong>in</strong>g <strong>C#</strong> 2: the f<strong>in</strong>al featuresList<strong>in</strong>g 7.1// Example1.csus<strong>in</strong>g System;partial class Example: IEquatablewhere TFirst : class{public bool Equals(str<strong>in</strong>g other)){return false;}}// Example2.csus<strong>in</strong>g System;partial class Example: EventArgs, IDisposable{public void Dispose()}{}I stress that this list<strong>in</strong>g is solely for the purpose of talk<strong>in</strong>g about what’s legal <strong>in</strong> a declaration—thetypes <strong>in</strong>volved were only picked for convenience and familiarity. We cansee that both declarations (B and D) contribute to the list of <strong>in</strong>terfaces that mustbe implemented. In this example, each file implements the <strong>in</strong>terfaces it declares,and that’s a common scenario, but it would be legal to move the implementation ofIDisposable E to Example1.cs and the implementation of IEquatable Cto Example2.cs. Only B specified any type constra<strong>in</strong>ts, and only C specified a baseclass. If B specified a base class, it would have to be EventArgs, and if D specifiedany type constra<strong>in</strong>ts they’d have to be exactly as <strong>in</strong> B. In particular, we couldn’t specifya type constra<strong>in</strong>t for TSecond <strong>in</strong> D even though it’s not mentioned <strong>in</strong> B. Bothtypes have to have the same access modifier, if any—we couldn’t make one declaration<strong>in</strong>ternal and the other public, for example.In “s<strong>in</strong>gle file” types, <strong>in</strong>itialization of member and static variables is guaranteed tooccur <strong>in</strong> the order they appear <strong>in</strong> the file, but there’s no guaranteed order when multiplefiles are <strong>in</strong>volved. Rely<strong>in</strong>g on the order of declaration with<strong>in</strong> the file is brittle tostart with—it leaves your code wide open to subtle bugs if a developer decides to“harmlessly” move th<strong>in</strong>gs around. So, it’s worth avoid<strong>in</strong>g this situation where you cananyway, but particularly avoid it with partial types.Now that we know what we can and can’t do, let’s take a closer look at why we’dwant to do it.7.1.2 Uses of partial typesDemonstration of mix<strong>in</strong>g declarations of a partial typeE ImplementsIDisposableSpecifies <strong>in</strong>terfaceand type parameterconstra<strong>in</strong>tAs I mentioned earlier, partial types are primarily useful <strong>in</strong> conjunction with designersand other code generators. If a code generator has to modify a file that is “owned” bya developer, there’s always a risk of th<strong>in</strong>gs go<strong>in</strong>g wrong. With the partial types model, aBC ImplementsIEquatableDSpecifies baseclass and <strong>in</strong>terfaceLicensed to Rhona Hadida


Partial types187code generator can own the file where it will work, and completely overwrite thewhole file every time it wants to.Some code generators may even choose not to generate a <strong>C#</strong> file at all until thebuild is well under way. For <strong>in</strong>stance, the W<strong>in</strong>dows Presentation Foundation version ofthe Snippy application has Extensible Application Markup Language (XAML) filesthat describe the user <strong>in</strong>terface. When the project is built, each XAML file is converted<strong>in</strong>to a <strong>C#</strong> file <strong>in</strong> the obj directory (the filenames end with “.g.cs” to show they’ve beengenerated) and compiled along with the partial class provid<strong>in</strong>g extra code for thattype (typically event handlers and extra construction code). This completely preventsdevelopers from tweak<strong>in</strong>g the generated code, at least without go<strong>in</strong>g to extremelengths of hack<strong>in</strong>g the build file.I’ve been careful to use the phrase code generator <strong>in</strong>stead of just designer becausethere are plenty of code generators around besides designers. For <strong>in</strong>stance, <strong>in</strong> VisualStudio 2005 web service proxies are generated as partial classes, and you may wellhave your own tools that generate code based on other data sources. One reasonablycommon example of this is Object Relational Mapp<strong>in</strong>g (ORM)—some ORM tools usedatabase entity descriptions from a configuration file (or straight from the database)and generate partial classes represent<strong>in</strong>g those entities.This makes it very straightforward to add behavior to the type: overrid<strong>in</strong>g virtualmethods of the base class, add<strong>in</strong>g new members with bus<strong>in</strong>ess logic, and so forth. It’s agreat way of lett<strong>in</strong>g the developer and the tool work together, rather than constantlysquabbl<strong>in</strong>g about who’s <strong>in</strong> charge.One scenario that is occasionally useful is for one file to be generated conta<strong>in</strong><strong>in</strong>gmultiple partial types, and then some of those types are enhanced <strong>in</strong> other files, onemanually generated file per type. To return to the ORM example, the tool could generatea s<strong>in</strong>gle file conta<strong>in</strong><strong>in</strong>g all the entity def<strong>in</strong>itions, and some of those entities couldhave extra code provided by the developer, us<strong>in</strong>g one file per entity. This keeps thenumber of automatically generated files low, but still provides good visibility of themanual code <strong>in</strong>volved.Figure 7.2 shows how the uses of partial types for XAML and entities are similar, butwith slightly different tim<strong>in</strong>g <strong>in</strong>volved when it comes to creat<strong>in</strong>g the autogenerated <strong>C#</strong>code.A somewhat different use of partial types is as an aid to refactor<strong>in</strong>g. Sometimes atype gets too big and assumes too many responsibilities. One first step to divid<strong>in</strong>g thebloated type <strong>in</strong>to smaller, more coherent ones can be to first split it <strong>in</strong>to a partial typeover two or more files. This can be done with no risk and <strong>in</strong> an experimental manner,mov<strong>in</strong>g methods between files until each file only addresses a particular concern.Although the next step of splitt<strong>in</strong>g the type up is still far from automatic at that stage,it should be a lot easier to see the end goal.When partial types first appeared <strong>in</strong> <strong>C#</strong> 2, no one knew exactly how they’d be used.One feature that was almost immediately requested was a way to provide optional“extra” code for generated methods to call. This need has been addressed by <strong>C#</strong> 3 withpartial methods.Licensed to Rhona Hadida


188 CHAPTER 7 Conclud<strong>in</strong>g <strong>C#</strong> 2: the f<strong>in</strong>al featuresGuiPage.xaml.cs(Handwritten <strong>C#</strong>)GuiPage.xaml(XAML)Schema/model(Database, XML, etc)XAML to <strong>C#</strong>converter(Build time)Code generator(Prebuild)GuiPage.g.cs(<strong>C#</strong>)Customer.cs(Handwritten <strong>C#</strong>)GeneratedEntities.cs(<strong>C#</strong> - <strong>in</strong>cludes partialCustomer class)<strong>C#</strong> compilation<strong>C#</strong> compilationGuiPage type(Part of an assembly)Customer type(Part of an assembly)Us<strong>in</strong>g XAML for declarative UI designPrebuild<strong>in</strong>g partial classes for database entitiesFigure 7.2Comparison between XAML precompilation and autogenerated entity classes7.1.3 Partial methods—<strong>C#</strong> 3 only!Just to reiterate my previous explanation, I realize that the rest of this part of the bookhas just been deal<strong>in</strong>g with <strong>C#</strong> 2 features—but partial methods don’t fit with any of theother <strong>C#</strong> 3 features and they do fit <strong>in</strong> very well when describ<strong>in</strong>g partial types. Apologiesfor any confusion this may cause.Back to the feature: sometimes we want to be able to specify behavior <strong>in</strong> a manuallycreated file and use that behavior from an automatically generated file. For <strong>in</strong>stance,<strong>in</strong> a class that has lots of automatically generated properties, we might want to be ableto specify code to be executed as validation of a new value for some of those properties.Another common scenario is for a code-generation tool to <strong>in</strong>clude constructors—manuallywritten code often wants to hook <strong>in</strong>to object construction to setdefault values, perform some logg<strong>in</strong>g, and so forth.In <strong>C#</strong> 2, these requirements could only be met either by us<strong>in</strong>g events that the manuallygenerated code could subscribe to, or by mak<strong>in</strong>g the automatically generatedcode assume that the handwritten code will <strong>in</strong>clude methods of a particular name—mak<strong>in</strong>g the whole code fail to compile unless the relevant methods are provided.Alternatively, the generated code can provide a base class with virtual methods that donoth<strong>in</strong>g by default. The manually generated code can then derive from the class andoverride some or all of the methods.All of these solutions are somewhat messy. <strong>C#</strong> 3’s partial methods effectively provideoptional hooks that have no cost whatsoever if they’re not implemented—any callsto the unimplemented partial methods are removed by the compiler. It’s easiest tounderstand this with an example. List<strong>in</strong>g 7.2 shows a partial type specified <strong>in</strong> two files,with the constructor <strong>in</strong> the automatically generated code call<strong>in</strong>g two partial methods,one of which is implemented <strong>in</strong> the manually generated code.Licensed to Rhona Hadida


Partial types189List<strong>in</strong>g 7.2A partial method called from a constructor// Generated.csus<strong>in</strong>g System;partial class PartialMethodDemo{public PartialMethodDemo(){OnConstructorStart();Console.WriteL<strong>in</strong>e("Generated constructor");OnConstructorEnd();}}partial void OnConstructorStart();partial void OnConstructorEnd();// Handwritten.csus<strong>in</strong>g System;partial class PartialMethodDemo{partial void OnConstructorEnd(){Console.WriteL<strong>in</strong>e("Manual code");}}As shown <strong>in</strong> list<strong>in</strong>g 7.2, partial methods are declared just like abstract methods: by provid<strong>in</strong>gthe signature without any implementation but us<strong>in</strong>g the partial modifier.Similarly, the actual implementations just have the partial modifier but are otherwiselike normal methods.Call<strong>in</strong>g the parameterless constructor of PartialMethodDemo would result <strong>in</strong> “Generatedconstructor” and then “Manual code” be<strong>in</strong>g pr<strong>in</strong>ted out. Exam<strong>in</strong><strong>in</strong>g the IL forthe constructor, you wouldn’t see a call to OnConstructorStart because it no longerexists—there’s no trace of it anywhere <strong>in</strong> the compiled type.Because the method may not exist, partial methods must have a return type ofvoid and can’t take out parameters. They have to be private, but they can be staticand/or generic. If the method isn’t implemented <strong>in</strong> one of the files, the whole statementcall<strong>in</strong>g it is removed, <strong>in</strong>clud<strong>in</strong>g any argument evaluations. If evaluat<strong>in</strong>g any of thearguments has a side effect that you want to occur whether or not the partial methodis implemented, you should perform the evaluation separately. For <strong>in</strong>stance, supposeyou have the follow<strong>in</strong>g code:LogEntity(LoadAndCache(id));Here LogEntity is a partial method, and LoadAndCache loads an entity from the databaseand <strong>in</strong>serts it <strong>in</strong>to the cache. You might want to use this <strong>in</strong>stead:MyEntity entity = LoadAndCache(id);LogEntity(entity);That way, the entity is loaded and cached regardless of whether an implementationhas been provided for LogEntity. Of course, if the entity can be loaded equallyLicensed to Rhona Hadida


190 CHAPTER 7 Conclud<strong>in</strong>g <strong>C#</strong> 2: the f<strong>in</strong>al featurescheaply later on, and may not even be required, you should leave the statement <strong>in</strong> thefirst form and avoid an unnecessary load <strong>in</strong> some cases.To be honest, unless you’re writ<strong>in</strong>g your own code generators, you’re more likely tobe implement<strong>in</strong>g partial methods than declar<strong>in</strong>g and call<strong>in</strong>g them. If you’re only implement<strong>in</strong>gthem, you don’t need to worry about the argument evaluation side of th<strong>in</strong>gs.In summary, partial methods <strong>in</strong> <strong>C#</strong> 3 allow generated code to <strong>in</strong>teract with handwrittencode <strong>in</strong> a rich manner without any performance penalties for situations where the<strong>in</strong>teraction is unnecessary. This is a natural cont<strong>in</strong>uation of the <strong>C#</strong> 2 partial types feature,which enables a much more productive relationship between code-generationtools and developers.Our next feature is entirely different, and is just a way of tell<strong>in</strong>g the compiler moreabout the <strong>in</strong>tended nature of a type so that it can perform more check<strong>in</strong>g on both thetype itself and any code us<strong>in</strong>g it.7.2 Static classesOur second new feature is <strong>in</strong> some ways completely unnecessary—it just makes th<strong>in</strong>gstidier and a bit more elegant when you write utility classes.Everyone has utility classes. I haven’t seen a significant project <strong>in</strong> either Java or <strong>C#</strong>that didn’t have at least one class consist<strong>in</strong>g solely of static methods. The classic exampleappear<strong>in</strong>g <strong>in</strong> developer code is a type with str<strong>in</strong>g helper methods, do<strong>in</strong>g anyth<strong>in</strong>gfrom escap<strong>in</strong>g, revers<strong>in</strong>g, smart replac<strong>in</strong>g—you name it. An example from the Frameworkis the System.Math class. The key features of a utility class are as follows:■ All members are static (except a private constructor).■ The class derives directly from object.■ Typically there’s no state at all, unless there’s some cach<strong>in</strong>g or a s<strong>in</strong>gleton<strong>in</strong>volved.■ There are no visible constructors.■ The class is sealed if the developer remembers to do so.The last two po<strong>in</strong>ts are optional, and <strong>in</strong>deed if there are no visible constructors(<strong>in</strong>clud<strong>in</strong>g protected ones) then the class is effectively sealed anyway. Both of them helpmake the purpose of the class more obvious, however.List<strong>in</strong>g 7.3 gives an example of a <strong>C#</strong> 1 utility class—then we’ll see how <strong>C#</strong> 2improves matters.List<strong>in</strong>g 7.3A typical <strong>C#</strong> 1 utility classus<strong>in</strong>g System;public sealed class Str<strong>in</strong>gHelper{private Str<strong>in</strong>gHelper(){}BSeals class toprevent derivationPrevents <strong>in</strong>stantiationfrom other codeLicensed to Rhona Hadida


Static classes191}public static str<strong>in</strong>g Reverse(str<strong>in</strong>g <strong>in</strong>put){char[] chars = <strong>in</strong>put.ToCharArray();Array.Reverse(chars);return new str<strong>in</strong>g(chars);}All methodsare staticThe private constructor B may seem odd—why have it at all if it’s private and nevergo<strong>in</strong>g to be used? The reason is that if you don’t supply any constructors for a class, the<strong>C#</strong> 1 compiler will always provide a default constructor that is public and parameterless. Inthis case, we don’t want any visible constructors, so we have to provide a private one.This pattern works reasonably well, but <strong>C#</strong> 2 makes it explicit and actively preventsthe type from be<strong>in</strong>g misused. First we’ll see what changes are needed to turn list<strong>in</strong>g 7.3<strong>in</strong>to a “proper” static class as def<strong>in</strong>ed <strong>in</strong> <strong>C#</strong> 2. As you can see from list<strong>in</strong>g 7.4, very littleaction is required.List<strong>in</strong>g 7.4The same utility class as <strong>in</strong> list<strong>in</strong>g 7.3 but converted <strong>in</strong>to a <strong>C#</strong> 2 static classus<strong>in</strong>g System;public static class Str<strong>in</strong>gHelper{public static str<strong>in</strong>g Reverse(str<strong>in</strong>g <strong>in</strong>put){char[] chars = <strong>in</strong>put.ToCharArray();Array.Reverse(chars);return new str<strong>in</strong>g(chars);}}We’ve used the static modifier <strong>in</strong> the class declaration this time <strong>in</strong>stead of sealed,and we haven’t <strong>in</strong>cluded a constructor at all—those are the only code differences. The<strong>C#</strong> 2 compiler knows that a static class shouldn’t have any constructors, so it doesn’tprovide a default one. In fact, the compiler enforces a number of constra<strong>in</strong>ts on theclass def<strong>in</strong>ition:■ It can’t be declared as abstract or sealed, although it’s implicitly both.■ It can’t specify any implemented <strong>in</strong>terfaces.■ It can’t specify a base type.■ It can’t <strong>in</strong>clude any nonstatic members, <strong>in</strong>clud<strong>in</strong>g constructors.■ It can’t <strong>in</strong>clude any operators.■ It can’t <strong>in</strong>clude any protected or protected <strong>in</strong>ternal members.It’s worth not<strong>in</strong>g that although all the members have to be static, you’ve got to explicitlymake them static except for nested types and constants. Although nested types areimplicitly static members of the enclos<strong>in</strong>g class, the nested type itself can be a nonstatictype if that’s required.The compiler not only puts constra<strong>in</strong>ts on the def<strong>in</strong>ition of static classes, though—it also guards aga<strong>in</strong>st their misuse. As it knows that there can never be any <strong>in</strong>stances ofLicensed to Rhona Hadida


192 CHAPTER 7 Conclud<strong>in</strong>g <strong>C#</strong> 2: the f<strong>in</strong>al featuresthe class, it prevents any use of it that would require one. For <strong>in</strong>stance, all of the follow<strong>in</strong>gare <strong>in</strong>valid when Str<strong>in</strong>gHelper is a static class:Str<strong>in</strong>gHelper variable = null;Str<strong>in</strong>gHelper[] array = null;public void Method1(Str<strong>in</strong>gHelper x) {}public Str<strong>in</strong>gHelper Method1() { return null; }List x = new List();None of these are prevented if the class just follows the <strong>C#</strong> 1 pattern—but all of themare essentially useless. In short, static classes <strong>in</strong> <strong>C#</strong> 2 don’t allow you to do anyth<strong>in</strong>g youcouldn’t do before—but they prevent you from do<strong>in</strong>g th<strong>in</strong>gs that you shouldn’t havebeen do<strong>in</strong>g anyway.The next feature on our list is one with a more positive feel. It’s aimed at a very specific—althoughwidely encountered—situation, and allows a solution that is neitherugly nor breaks encapsulation, which was the choice available <strong>in</strong> <strong>C#</strong> 1.7.3 Separate getter/setter property accessI’ll admit to be<strong>in</strong>g slightly bemused when I first saw that <strong>C#</strong> 1 didn’t allow you to havea public getter and a private setter for properties. This isn’t the only comb<strong>in</strong>ation ofaccess modifiers that is prohibited by <strong>C#</strong> 1, but it’s the most commonly desired one. Infact, <strong>in</strong> <strong>C#</strong> 1 both the getter and the setter have to have the same accessibility—it’sdeclared as part of the property declaration rather than as part of the getter or setter.There are perfectly good reasons to want different accessibility for the getter andthe setter—often you may want some validation, logg<strong>in</strong>g, lock<strong>in</strong>g, or other code to beexecuted when chang<strong>in</strong>g a variable that backs the property but you don’t want tomake the property writable to code outside the class. In <strong>C#</strong> 1 the alternatives wereeither to break encapsulation by mak<strong>in</strong>g the property writable aga<strong>in</strong>st your betterjudgment or to write a SetXXX() method <strong>in</strong> the class to do the sett<strong>in</strong>g, which franklylooks ugly when you’re used to “real” properties.<strong>C#</strong> 2 fixes the problem by allow<strong>in</strong>g either the getter or the setter to explicitly have amore restrictive access than that declared for the property itself. This is most easilyseen with an example:str<strong>in</strong>g name;public str<strong>in</strong>g Name{get { return name; }private set{// Validation, logg<strong>in</strong>g etc herename = value;}}In this case, the Name property is effectively read-only to all other types, 2 but we canuse the familiar property syntax for sett<strong>in</strong>g the property with<strong>in</strong> the type itself. The2Except nested types, which always have access to private members of their enclos<strong>in</strong>g types.Licensed to Rhona Hadida


Namespace aliases193same syntax is also available for <strong>in</strong>dexers as well as properties. You could make the settermore public than the getter (a protected getter and a public setter, for example)but that’s a pretty rare situation, <strong>in</strong> the same way that write-only properties are few andfar between compared with read-only properties.NOTETrivia: The only place where “private” is required—Everywhere else <strong>in</strong> <strong>C#</strong>, thedefault access modifier <strong>in</strong> any given situation is the most private one possible.In other words, if someth<strong>in</strong>g can be declared to be private, then leav<strong>in</strong>gout the access modifiers entirely will default it to be<strong>in</strong>g private. This isa nice element of language design, because it’s hard to get it wrong accidentally:if you want someth<strong>in</strong>g to be more public than it is, you’ll noticewhen you try to use it. If, however, you accidentally make someth<strong>in</strong>g “toopublic,” then the compiler can’t help you to spot the problem. For this reason,my personal convention is not to use any access modifiers unless Ineed to; that way, the code highlights (by way of an access modifier) whenI’ve explicitly chosen to make someth<strong>in</strong>g more public than it might be.Specify<strong>in</strong>g the access of a property getter or setter is the one exception tothis rule—if you don’t specify anyth<strong>in</strong>g, the default is to give the getter/setter the same access as the overall property itself.Note that you can’t declare the property itself to be private and make the getter public—youcan only make a particular getter/setter more private than the property. Also,you can’t specify an access modifier for both the getter and the setter—that would justbe silly, as you could declare the property itself to be whichever is the more public ofthe two modifiers.This aid to encapsulation is long overdue. There’s still noth<strong>in</strong>g <strong>in</strong> <strong>C#</strong> 2 to stopother code <strong>in</strong> the same class from bypass<strong>in</strong>g the property and go<strong>in</strong>g directly to whateverfields are back<strong>in</strong>g it, unfortunately. As we’ll see <strong>in</strong> the next chapter, <strong>C#</strong> 3 fixes this<strong>in</strong> one particular case, but not <strong>in</strong> general. Maybe <strong>in</strong> <strong>C#</strong> 4…We move from a feature you may well want to use fairly regularly to one that youwant to avoid most of the time—it allows your code to be absolutely explicit <strong>in</strong> termsof which types it’s referr<strong>in</strong>g to, but at a significant cost to readability.7.4 Namespace aliasesNamespaces are simply ways of keep<strong>in</strong>g fully qualified names of types dist<strong>in</strong>ct even whenthe unqualified names may be the same. An example of this is the unqualified nameButton. There are two classes with that name <strong>in</strong> the .NET 2.0 Framework: System.W<strong>in</strong>dows.Forms.Button and System.Web.UI.WebControls.Button. Although they’reboth called Button, it’s easy to tell them apart by their namespaces. This mirrors real lifequite closely—you may know several people called Jon, but you’re unlikely to know anyoneelse called Jon Skeet. If you’re talk<strong>in</strong>g with friends <strong>in</strong> a particular context, you maywell be able to just use the name Jon without specify<strong>in</strong>g exactly which one you’re talk<strong>in</strong>gabout—but <strong>in</strong> other contexts you may need to provide more exact <strong>in</strong>formation.The us<strong>in</strong>g directive of <strong>C#</strong> 1 (not to be confused with the us<strong>in</strong>g statement that callsDispose automatically) was available <strong>in</strong> two flavors—one created an alias for aLicensed to Rhona Hadida


194 CHAPTER 7 Conclud<strong>in</strong>g <strong>C#</strong> 2: the f<strong>in</strong>al featuresnamespace or type (for example, us<strong>in</strong>g Out = System.Console;) and the other just<strong>in</strong>troduced a namespace <strong>in</strong>to the list of contexts the compiler would search whenlook<strong>in</strong>g for a type (for example, us<strong>in</strong>g System.IO;). By and large, this was adequate,but there are a few situations that the language simply couldn’t cope with, and otherswhere automatically generated code would have to go out of its way to make absolutelysure that the right namespaces and types were be<strong>in</strong>g used whatever happened.<strong>C#</strong> 2 fixes these problems, br<strong>in</strong>g<strong>in</strong>g an additional robustness to the language. It’snot that the code be<strong>in</strong>g generated is any more robust <strong>in</strong> terms of execution, but youcan write code that is guaranteed to mean what you want it to regardless of whichother types, assemblies, and namespaces are <strong>in</strong>troduced. These extreme measures arerarely needed outside automatically generated code, but it’s nice to know that they’rethere when you need them. In <strong>C#</strong> 2 there are three types of aliases: the namespacealiases of <strong>C#</strong> 1, the global namespace alias, and extern aliases. We’ll start off with the onetype of alias that was already present <strong>in</strong> <strong>C#</strong> 1, but we’ll <strong>in</strong>troduce a new way of us<strong>in</strong>galiases to ensure that the compiler knows to treat it as an alias rather than check<strong>in</strong>g tosee whether it’s the name of another namespace or type.7.4.1 Qualify<strong>in</strong>g namespace aliasesEven <strong>in</strong> <strong>C#</strong> 1, it was a good idea to avoid namespace aliases wherever possible. Every sooften you might f<strong>in</strong>d that one type name clashed with another—as with our Buttonexample earlier—and so you either had to specify the full name <strong>in</strong>clud<strong>in</strong>g thenamespace every time you used them, or have an alias that dist<strong>in</strong>guished the two, <strong>in</strong>some ways act<strong>in</strong>g like a shortened form of the namespace. List<strong>in</strong>g 7.5 shows an examplewhere the two types of Button are used, qualified by an alias.List<strong>in</strong>g 7.5Us<strong>in</strong>g aliases to dist<strong>in</strong>guish between different Button typesus<strong>in</strong>g System;us<strong>in</strong>g W<strong>in</strong>Forms = System.W<strong>in</strong>dows.Forms;us<strong>in</strong>g WebForms = System.Web.UI.WebControls;class Test{static void Ma<strong>in</strong>(){Console.WriteL<strong>in</strong>e (typeof (W<strong>in</strong>Forms.Button));Console.WriteL<strong>in</strong>e (typeof (WebForms.Button));}}List<strong>in</strong>g 7.5 compiles without any errors or warn<strong>in</strong>gs, although it’s still not as pleasantas it would be if we only needed to deal with one k<strong>in</strong>d of Button to start with. There’sa problem, however—what if someone were to <strong>in</strong>troduce a type or namespace calledW<strong>in</strong>Forms or WebForms? The compiler wouldn’t know what W<strong>in</strong>Forms.Button meant,and would use the type or namespace <strong>in</strong> preference to the alias. We want to be able totell the compiler that we need it to treat W<strong>in</strong>Forms as an alias, even though it’s availableelsewhere. <strong>C#</strong> 2 <strong>in</strong>troduces the ::syntax to do this, as shown <strong>in</strong> list<strong>in</strong>g 7.6.Licensed to Rhona Hadida


Namespace aliases195List<strong>in</strong>g 7.6us<strong>in</strong>g System;us<strong>in</strong>g W<strong>in</strong>Forms = System.W<strong>in</strong>dows.Forms;us<strong>in</strong>g WebForms = System.Web.UI.WebControls;class W<strong>in</strong>Forms{}class Test{static void Ma<strong>in</strong>(){Console.WriteL<strong>in</strong>e (typeof (W<strong>in</strong>Forms::Button));Console.WriteL<strong>in</strong>e (typeof (WebForms::Button));}}Instead of W<strong>in</strong>Forms.Button, list<strong>in</strong>g 7.6 uses W<strong>in</strong>Forms::Button, and the compiler ishappy. If you change the :: back to . you’ll get a compilation error. So, if you use ::everywhere you use an alias, you’ll be f<strong>in</strong>e, right? Well, not quite…7.4.2 The global namespace aliasUs<strong>in</strong>g :: to tell the compiler to use aliasesThere’s one part of the namespace hierarchy that you can’t def<strong>in</strong>e your own alias for:the root of it, or the global namespace. Suppose you have two classes, both namedConfiguration; one is with<strong>in</strong> a namespace of MyCompany and the other doesn’t havea namespace specified at all. Now, how can you refer to the “root” Configurationclass from with<strong>in</strong> the MyCompany namespace? You can’t use a normal alias, and if youjust specify Configuration the compiler will use MyCompany.Configuration.In <strong>C#</strong> 1, there was simply no way of gett<strong>in</strong>g around this. Aga<strong>in</strong>, <strong>C#</strong> 2 comes to therescue, allow<strong>in</strong>g you to use global::Configuration to tell the compiler exactly whatyou want. List<strong>in</strong>g 7.7 demonstrates both the problem and the solution.List<strong>in</strong>g 7.7Use of the global namespace alias to specify the desired type exactlyus<strong>in</strong>g System;class Configuration {}namespace Chapter7{class Configuration {}}class Test{static void Ma<strong>in</strong>(){Console.WriteL<strong>in</strong>e(typeof(Configuration));Console.WriteL<strong>in</strong>e(typeof(global::Configuration));Console.WriteL<strong>in</strong>e(typeof(global::Chapter7.Test));}}Licensed to Rhona Hadida


196 CHAPTER 7 Conclud<strong>in</strong>g <strong>C#</strong> 2: the f<strong>in</strong>al featuresMost of list<strong>in</strong>g 7.7 is just sett<strong>in</strong>g up the situation—the three l<strong>in</strong>es with<strong>in</strong> Ma<strong>in</strong> are the<strong>in</strong>terest<strong>in</strong>g ones. The first l<strong>in</strong>e pr<strong>in</strong>ts “Chapter7.Configuration” as the compilerresolves Configuration to that type before mov<strong>in</strong>g out to the namespace root. Thesecond l<strong>in</strong>e <strong>in</strong>dicates that the type has to be <strong>in</strong> the global namespace, and so simplypr<strong>in</strong>ts “Configuration.” I <strong>in</strong>cluded the third l<strong>in</strong>e to demonstrate that us<strong>in</strong>g the globalalias you can still refer to types with<strong>in</strong> namespaces, but you have to specify the fullyqualified name.At this po<strong>in</strong>t we can get to any uniquely named type, us<strong>in</strong>g the global namespacealias if necessary—and <strong>in</strong>deed if you ever write a code generator where the codedoesn’t need to be readable, you may wish to use this feature liberally to make surethat you always refer to the correct type whatever other types are actually present bythe time the code is compiled. What do we do if the type’s name isn’t unique evenwhen we <strong>in</strong>clude its namespace? The plot thickens…7.4.3 Extern aliasesAt the start of this section, I referred to human names as examples of namespaces andcontexts. I specifically said that you’re unlikely to know more than one person calledJon Skeet. However, I know that there is more than one person with my name, and it’snot beyond the realm of possibility to suppose that you might know two or more of us.In this case, <strong>in</strong> order to specify which one you mean you have to provide some more<strong>in</strong>formation beyond just the full name—the reason you know the particular person,or the country he lives <strong>in</strong>, or someth<strong>in</strong>g similarly dist<strong>in</strong>ctive.<strong>C#</strong> 2 lets you specify that extra <strong>in</strong>formation <strong>in</strong> the form of an extern alias—a namethat exists not only <strong>in</strong> your source code, but also <strong>in</strong> the parameters you pass to the compiler.For the Microsoft <strong>C#</strong> compiler, this means specify<strong>in</strong>g the assembly that conta<strong>in</strong>sthe types <strong>in</strong> question. Let’s suppose that two assemblies First.dll and Second.dllboth conta<strong>in</strong>ed a type called Demo.Example. We couldn’t just use the fully qualifiedname to dist<strong>in</strong>guish them, as they’ve both got the same fully qualified name. Instead, wecan use extern aliases to specify which we mean. List<strong>in</strong>g 7.8 shows an example of the <strong>C#</strong>code <strong>in</strong>volved, along with the command l<strong>in</strong>e needed to compile it.List<strong>in</strong>g 7.8Work<strong>in</strong>g with different types of the same type <strong>in</strong> different assemblies// Compile with// csc Test.cs /r:FirstAlias=First.dll /r:SecondAlias=Second.dllextern alias FirstAlias;extern alias SecondAlias;us<strong>in</strong>g System;Specifies twoextern aliasesus<strong>in</strong>g FD = FirstAlias::Demo;Refers to extern aliasclass TestB with namespace alias{static void Ma<strong>in</strong>(){Console.WriteL<strong>in</strong>e(typeof(FD.Example));CUsesnamespacealiasLicensed to Rhona Hadida


Pragma directives197}}Console.WriteL<strong>in</strong>e(typeof(SecondAlias::Demo.Example));Console.ReadL<strong>in</strong>e();The code <strong>in</strong> list<strong>in</strong>g 7.8 is quite straightforward, and demonstrates that you caneither use extern directives directly D or via namespace aliases (B and C). In fact,a normal us<strong>in</strong>g directive without an alias (for example, us<strong>in</strong>g First-Alias::Demo;) would have allowed us to use the name Example without any furtherqualification at all. It’s also worth not<strong>in</strong>g that one extern alias can cover multipleassemblies, and several extern aliases can all refer to the same assembly. To specifyan external alias <strong>in</strong> Visual Studio (2005 or 2008), just select the assembly referencewith<strong>in</strong> Solution Explorer and modify the Aliases value <strong>in</strong> the Properties w<strong>in</strong>dow, asshown <strong>in</strong> figure 7.3.Hopefully I don’t need to persuade you toavoid this k<strong>in</strong>d of situation wherever you possiblycan. It can be necessary to work with assembliesfrom different third parties who happen to haveused the same fully qualified type name. Whereyou have more control over the nam<strong>in</strong>g, however,make sure that your names never lead you <strong>in</strong>tothis territory <strong>in</strong> the first place.Our next feature is almost a meta-feature. Theexact purpose it serves depends on which compileryou’re us<strong>in</strong>g, because its whole purpose is toenable control over compiler-specific features—but we’ll concentrate on the Microsoft compiler.7.5 Pragma directivesDescrib<strong>in</strong>g pragma directives <strong>in</strong> general is extremely easy: a pragma directive is a preprocess<strong>in</strong>gdirective represented by a l<strong>in</strong>e beg<strong>in</strong>n<strong>in</strong>g with #pragma. The rest of the l<strong>in</strong>ecan conta<strong>in</strong> any text at all. The result of a pragma directive cannot change the behaviorof the program to contravene anyth<strong>in</strong>g with<strong>in</strong> the <strong>C#</strong> language specification, but itcan do anyth<strong>in</strong>g outside the scope of the specification. If the compiler doesn’t understanda particular pragma directive, it can issue a warn<strong>in</strong>g but not an error.That’s basically everyth<strong>in</strong>g the specification has to say on the subject. TheMicrosoft <strong>C#</strong> compiler understands two pragma directives: warn<strong>in</strong>gs and checksums.7.5.1 Warn<strong>in</strong>g pragmasJust occasionally, the <strong>C#</strong> compiler issues warn<strong>in</strong>gs that are justifiable but annoy<strong>in</strong>g.The correct response to a compiler warn<strong>in</strong>g is almost always to fix it—the code is rarelymade worse by fix<strong>in</strong>g the warn<strong>in</strong>g, and usually it’s improved.However, just occasionally there’s a good reason to ignore a warn<strong>in</strong>g—and that’swhat warn<strong>in</strong>g pragmas are available for. As an example, we’ll create a private field thatDUsesextern aliasdirectlyFigure 7.3 Part of the Propertiesw<strong>in</strong>dow of Visual Studio 2008,show<strong>in</strong>g an extern alias of FirstAliasfor the First.dll referenceLicensed to Rhona Hadida


198 CHAPTER 7 Conclud<strong>in</strong>g <strong>C#</strong> 2: the f<strong>in</strong>al featuresis never read from or written to. It’s almost always go<strong>in</strong>g to be useless… unless we happento know that it will be used by reflection. List<strong>in</strong>g 7.9 is a complete class demonstrat<strong>in</strong>gthis.List<strong>in</strong>g 7.9Class conta<strong>in</strong><strong>in</strong>g an unused fieldpublic class FieldUsedOnlyByReflection{<strong>in</strong>t x;}If you try to compile list<strong>in</strong>g 7.9, you’ll get a warn<strong>in</strong>g message like this:FieldUsedOnlyByReflection.cs(3,9): warn<strong>in</strong>g CS0169:The private field 'FieldUsedOnlyByReflection.x' is never usedThat’s the output from the command-l<strong>in</strong>e compiler. In the Error List w<strong>in</strong>dow of VisualStudio, you can see the same <strong>in</strong>formation (plus the project it’s <strong>in</strong>) except that you don’tget the warn<strong>in</strong>g number (CS0169). To f<strong>in</strong>d out the number, you need to either selectthe warn<strong>in</strong>g and br<strong>in</strong>g up the help related to it, or look <strong>in</strong> the Output w<strong>in</strong>dow, wherethe full text is shown. We need the number <strong>in</strong> order to make the code compile withoutwarn<strong>in</strong>gs, as shown <strong>in</strong> list<strong>in</strong>g 7.10.List<strong>in</strong>g 7.10Disabl<strong>in</strong>g (and restor<strong>in</strong>g) warn<strong>in</strong>g CS0169public class FieldUsedOnlyByReflection{#pragma warn<strong>in</strong>g disable 0169<strong>in</strong>t x;#pragma warn<strong>in</strong>g restore 0169}List<strong>in</strong>g 7.10 is self-explanatory—the first pragma disables the particular warn<strong>in</strong>g we’re<strong>in</strong>terested <strong>in</strong>, and the second one restores it. It’s good practice to disable warn<strong>in</strong>gs foras short a space as you can, so that you don’t miss any warn<strong>in</strong>gs you genu<strong>in</strong>ely ought tofix. If you want to disable or restore multiple warn<strong>in</strong>gs <strong>in</strong> a s<strong>in</strong>gle l<strong>in</strong>e, just use acomma-separated list of warn<strong>in</strong>g numbers. If you don’t specify any warn<strong>in</strong>g numbersat all, the result is to disable or restore all warn<strong>in</strong>gs <strong>in</strong> one fell swoop—but that’s a badidea <strong>in</strong> almost every imag<strong>in</strong>able scenario.7.5.2 Checksum pragmasYou’re very unlikely to need the second form of pragma recognized by the Microsoftcompiler. It supports the debugger by allow<strong>in</strong>g it to check that it’s found the rightsource file. Normally when a <strong>C#</strong> file is compiled, the compiler generates a checksumfrom the file and <strong>in</strong>cludes it <strong>in</strong> the debugg<strong>in</strong>g <strong>in</strong>formation. When the debugger needsto locate a source file and f<strong>in</strong>ds multiple potential matches, it can generate the checksumitself for each of the candidate files and see which is correct.Now, when an ASP.NET page is converted <strong>in</strong>to <strong>C#</strong>, the generated file is what the <strong>C#</strong>compiler sees. The generator calculates the checksum of the .aspx page, and uses aLicensed to Rhona Hadida


Fixed-size buffers <strong>in</strong> unsafe code199checksum pragma to tell the <strong>C#</strong> compiler to use that checksum <strong>in</strong>stead of calculat<strong>in</strong>gone from the generated page.The syntax of the checksum pragma is#pragma checksum "filename" "{guid}" "checksum bytes"The GUID <strong>in</strong>dicates which hash<strong>in</strong>g algorithm has been used to calculate the checksum.The documentation for the CodeChecksumPragma class gives GUIDs for SHA-1 andMD5, should you ever wish to implement your own dynamic compilation frameworkwith debugger support.It’s possible that future versions of the <strong>C#</strong> compiler will <strong>in</strong>clude more pragmadirectives, and other compilers (such as the Mono compiler, mcs) could have theirown support for different features. Consult your compiler documentation for themost up-to-date <strong>in</strong>formation.The next feature is another one that you may well never use—but at the same time,if you ever do, it’s likely to make your life somewhat simpler.7.6 Fixed-size buffers <strong>in</strong> unsafe codeWhen call<strong>in</strong>g <strong>in</strong>to native code with P/Invoke, it’s not particularly unusual to f<strong>in</strong>d yourselfdeal<strong>in</strong>g with a structure that is def<strong>in</strong>ed to have a buffer of a particular lengthwith<strong>in</strong> it. Prior to <strong>C#</strong> 2, such structures were difficult to handle directly, even withunsafe code. Now, you can declare a buffer of the right size to be embedded directlywith the rest of the data for the structure.This capability isn’t just available for call<strong>in</strong>g native code, although that is its primaryuse. You could use it to easily populate a data structure directly correspond<strong>in</strong>g toa file format, for <strong>in</strong>stance. The syntax is simple, and once aga<strong>in</strong> we’ll demonstrate itwith an example. To create a field that embeds an array of 20 bytes with<strong>in</strong> its enclos<strong>in</strong>gstructure, you would usefixed byte data[20];This would allow data to be used as if it were a byte* (a po<strong>in</strong>ter to byte data),although the implementation used by the <strong>C#</strong> compiler is to create a new nested typewith<strong>in</strong> the declar<strong>in</strong>g type and apply the new FixedBuffer attribute to the variableitself. The CLR then takes care of allocat<strong>in</strong>g the memory appropriately.One downside of this feature is that it’s only available with<strong>in</strong> unsafe code: theenclos<strong>in</strong>g structure has to be declared <strong>in</strong> an unsafe context, and you can only use thefixed-size buffer member with<strong>in</strong> an unsafe context too. This limits the situations <strong>in</strong>which it’s useful, but it can still be a nice trick to have up your sleeve at times. Also,fixed-size buffers are only applicable to primitive types, and can’t be members ofclasses (only structures).There are remarkably few W<strong>in</strong>dows APIs where this feature is directly useful. Thereare numerous situations where a fixed array of characters is called for—the TIME_ZONE_INFORMATION structure, for example—but unfortunately fixed-size buffers of charactersappear to be handled poorly by P/Invoke, with the marshaler gett<strong>in</strong>g <strong>in</strong> the way.Licensed to Rhona Hadida


200 CHAPTER 7 Conclud<strong>in</strong>g <strong>C#</strong> 2: the f<strong>in</strong>al featuresAs one example, however, list<strong>in</strong>g 7.11 is a console application that displays the colorsavailable <strong>in</strong> the current console w<strong>in</strong>dow. It uses an API function GetConsoleScreen-BufferEx that is new to Vista and W<strong>in</strong>dows Server 2008, and that retrieves extended console<strong>in</strong>formation. List<strong>in</strong>g 7.11 displays all 16 colors <strong>in</strong> hexadecimal format (bbggrr).List<strong>in</strong>g 7.11Demonstration of fixed-size buffers to obta<strong>in</strong> console color <strong>in</strong>formationus<strong>in</strong>g System;us<strong>in</strong>g System.Runtime.InteropServices;struct COORD{public short X, Y;}struct SMALL_RECT{public short Left, Top, Right, Bottom;}unsafe struct CONSOLE_SCREEN_BUFFER_INFOEX{public <strong>in</strong>t StructureSize;public COORD ConsoleSize, CursorPosition;public short Attributes;public SMALL_RECT DisplayW<strong>in</strong>dow;public COORD MaximumW<strong>in</strong>dowSize;public short PopupAttributes;public <strong>in</strong>t FullScreenSupported;public fixed <strong>in</strong>t ColorTable[16];}static class FixedSizeBufferDemo{const <strong>in</strong>t StdOutputHandle = -11;[DllImport("kernel32.dll")]static extern IntPtr GetStdHandle(<strong>in</strong>t nStdHandle);[DllImport("kernel32.dll")]static extern bool GetConsoleScreenBufferInfoEx(IntPtr handle, ref CONSOLE_SCREEN_BUFFER_INFOEX <strong>in</strong>fo);unsafe static void Ma<strong>in</strong>(){IntPtr handle = GetStdHandle(StdOutputHandle);CONSOLE_SCREEN_BUFFER_INFOEX <strong>in</strong>fo;<strong>in</strong>fo = new CONSOLE_SCREEN_BUFFER_INFOEX();<strong>in</strong>fo.StructureSize = sizeof(CONSOLE_SCREEN_BUFFER_INFOEX);GetConsoleScreenBufferInfoEx(handle, ref <strong>in</strong>fo);}}for (<strong>in</strong>t i=0; i < 16; i++){Console.WriteL<strong>in</strong>e ("{0:x6}", <strong>in</strong>fo.ColorTable[i]);}Licensed to Rhona Hadida


Expos<strong>in</strong>g <strong>in</strong>ternal members to selected assemblies201List<strong>in</strong>g 7.11 uses fixed-size buffers for the table of colors. Before fixed-size buffers, wecould still have used the API either with a field for each color table entry or by marshal<strong>in</strong>ga normal array as UnmanagedType.ByValArray. However, this would have createda separate array on the heap <strong>in</strong>stead of keep<strong>in</strong>g the <strong>in</strong>formation all with<strong>in</strong> thestructure. That’s not a problem here, but <strong>in</strong> some high-performance situations it’snice to be able to keep “lumps” of data together. On a different performance note, ifthe buffer is part of a data structure on the managed heap, you have to p<strong>in</strong> it beforeaccess<strong>in</strong>g it. If you do this a lot, it can significantly affect the garbage collector. Stackbasedstructures don’t have this problem, of course.I’m not go<strong>in</strong>g to claim that fixed-size buffers are a hugely important feature <strong>in</strong><strong>C#</strong> 2—at least, they’re not important to most people. I’ve <strong>in</strong>cluded them for completeness,however, and doubtless someone, somewhere will f<strong>in</strong>d them <strong>in</strong>valuable.Our f<strong>in</strong>al feature can barely be called a <strong>C#</strong> 2 language feature at all—but it just aboutcounts, so I’ve <strong>in</strong>cluded it for completeness.7.7 Expos<strong>in</strong>g <strong>in</strong>ternal members to selected assembliesThere are some features that are obviously <strong>in</strong> the language—iterator blocks, for example.There are some features that obviously belong to the runtime, such as JIT compileroptimizations. There are some that clearly sit <strong>in</strong> both camps, like generics. Thislast feature has a toe <strong>in</strong> each but is sufficiently odd that it doesn’t merit a mention <strong>in</strong>either specification. In addition, it uses a term that has different mean<strong>in</strong>gs <strong>in</strong> C++ andVB.NET—add<strong>in</strong>g a third mean<strong>in</strong>g to the mix. To be fair, all the terms are used <strong>in</strong> thecontext of access permissions, but they have different effects.7.7.1 Friend assemblies <strong>in</strong> the simple caseIn .NET 1.1 it was entirely accurate to say that someth<strong>in</strong>g def<strong>in</strong>ed to be <strong>in</strong>ternal(whether a type, a method, a property, a variable, or an event) could only beaccessed with<strong>in</strong> the same assembly <strong>in</strong> which it was declared. 3 In .NET 2.0 that’s stillmostly true, but there’s a new attribute to let you bend the rules slightly: Internals-VisibleToAttribute, usually referred to as just InternalsVisibleTo. (When apply<strong>in</strong>gan attribute whose name ends with Attribute, the <strong>C#</strong> compiler will apply thesuffix automatically.)InternalsVisibleTo can only be applied to an assembly (not a specific member),and you can apply it multiple times to the same assembly. We will call the assemblyconta<strong>in</strong><strong>in</strong>g the attribute the source assembly, although this is entirely unofficial term<strong>in</strong>ology.When you apply the attribute, you have to specify another assembly, known asthe friend assembly. The result is that the friend assembly can see all the <strong>in</strong>ternal membersof the source assembly as if they were public. This may sound alarm<strong>in</strong>g, but it canbe useful, as we’ll see <strong>in</strong> a m<strong>in</strong>ute.List<strong>in</strong>g 7.12 shows this with three classes <strong>in</strong> three different assemblies.3Us<strong>in</strong>g reflection when runn<strong>in</strong>g with suitable permissions doesn’t count.Licensed to Rhona Hadida


202 CHAPTER 7 Conclud<strong>in</strong>g <strong>C#</strong> 2: the f<strong>in</strong>al featuresList<strong>in</strong>g 7.12Demonstration of friend assemblies// Compiled to Source.dllus<strong>in</strong>g System.Runtime.CompilerServices;[assembly:InternalsVisibleTo("FriendAssembly")]public class Source{<strong>in</strong>ternal static void InternalMethod(){}Grantsadditional access}public static void PublicMethod(){}// Compiled to FriendAssembly.dllpublic class Friend{static void Ma<strong>in</strong>(){Source.InternalMethod();Source.PublicMethod();}}Uses additionalaccess with<strong>in</strong>FriendAssembly// Compiled to EnemyAssembly.dllpublic class Enemy{static void Ma<strong>in</strong>(){// Source.InternalMethod();Source.PublicMethod();}}EnemyAssembly hasno special accessIn list<strong>in</strong>g 7.12 a special relationship exists between FriendAssembly.dll andSource.dll—although it only operates one way: Source.dll has no access to <strong>in</strong>ternalmembers of FriendAssembly.dll. If we were to uncomment the l<strong>in</strong>e at B, the Enemyclass would fail to compile.So, why on earth would we want to open up our well-designed assembly to certa<strong>in</strong>assemblies to start with?7.7.2 Why use InternalsVisibleTo?I can’t say I’ve ever used InternalsVisibleTo between two production assemblies.I’m not say<strong>in</strong>g there aren’t legitimate use cases for that, but I’ve not come acrossthem. However, I have used the attribute when it comes to unit test<strong>in</strong>g.There are some who say you should only test the public <strong>in</strong>terface to code. PersonallyI’m happy to test whatever I can <strong>in</strong> the simplest manner possible. Friend assembliesmake that a lot easier: suddenly it’s trivial to test code that only has <strong>in</strong>ternal access withouttak<strong>in</strong>g the dubious step of mak<strong>in</strong>g members public just for the sake of test<strong>in</strong>g, or<strong>in</strong>clud<strong>in</strong>g the test code with<strong>in</strong> the production assembly. (It does occasionally meanBAccesses publicmethod as normalLicensed to Rhona Hadida


Expos<strong>in</strong>g <strong>in</strong>ternal members to selected assemblies203mak<strong>in</strong>g members <strong>in</strong>ternal for the sake of test<strong>in</strong>g where they might otherwise be private,but that’s a less worry<strong>in</strong>g step.)The only downside to this is that the name of your test assembly lives on <strong>in</strong> yourproduction assembly. In theory this could represent a security attack vector if yourassemblies aren’t signed, and if your code normally operates under a restricted set ofpermissions. (Anyone with full trust could use reflection to access the members <strong>in</strong> thefirst place. You could do that yourself for unit tests, but it’s much nastier.) If this everends up as a genu<strong>in</strong>e issue for anyone, I’ll be very surprised. It does, however, br<strong>in</strong>gthe option of sign<strong>in</strong>g assemblies <strong>in</strong>to the picture. Just when you thought this was anice, simple little feature…7.7.3 InternalsVisibleTo and signed assembliesIf a friend assembly is signed, the source assembly needs to specify the public key of thefriend assembly, to make sure it’s trust<strong>in</strong>g the right code. Contrary to a lot of documentation,it isn’t the public key token that is required but the public key itself. For <strong>in</strong>stance,consider the follow<strong>in</strong>g command l<strong>in</strong>e and output (rewrapped and modified slightly forformatt<strong>in</strong>g) used to discover the public key of a signed FriendAssembly.dll:c:\Users\Jon\Test>sn -Tp FriendAssembly.dllMicrosoft (R) .NET Framework Strong Name Utility Version 3.5.21022.8Copyright (c) Microsoft Corporation. All rights reserved.Public key is0024000004800000940000000602000000240000525341310004000001000100a51372c81ccfb8fba9c5fb84180c4129e50f0facdce932cf31fe563d0fe3cb6b1d5129e28326060a3a539f287aaf59affc5aabc4d8f981e1a82479ab795f410eab22e3266033c633400463ee7513378bb4ef41fc0cae5fb03986d133677c82a865b278c48d99dc251201b9c43edd7bedefd4b5306efd0dec7787ec6b664471c2Public key token is 647b99330b7f792cThe source code for the Source class would now need to have this as the attribute:[assembly:InternalsVisibleTo("FriendAssembly,PublicKey="+"0024000004800000940000000602000000240000525341310004000001"+"000100a51372c81ccfb8fba9c5fb84180c4129e50f0facdce932cf31fe"+"563d0fe3cb6b1d5129e28326060a3a539f287aaf59affc5aabc4d8f981"+"e1a82479ab795f410eab22e3266033c633400463ee7513378bb4ef41fc"+"0cae5fb03986d133677c82a865b278c48d99dc251201b9c43edd7bedef"+"d4b5306efd0dec7787ec6b664471c2")]Unfortunately, you have to either have the public key on one l<strong>in</strong>e or use str<strong>in</strong>g concatenation—whitespace<strong>in</strong> the public key will cause a compilation failure. It would be alot more pleasant to look at if we really could specify the token <strong>in</strong>stead of the wholekey, but fortunately this ugl<strong>in</strong>ess is usually conf<strong>in</strong>ed to AssemblyInfo.cs, so you won’tneed to see it often.In theory, it’s possible to have an unsigned source assembly and a signed friendassembly. In practice, that’s not terribly useful, as the friend assembly typically wants toLicensed to Rhona Hadida


204 CHAPTER 7 Conclud<strong>in</strong>g <strong>C#</strong> 2: the f<strong>in</strong>al featureshave a reference to the source assembly—and you can’t refer to an unsigned assemblyfrom one that is signed! Likewise a signed assembly can’t specify an unsigned friendassembly, so typically you end up with both assemblies be<strong>in</strong>g signed if either one ofthem is.7.8 SummaryThis completes our tour of the new features <strong>in</strong> <strong>C#</strong> 2. The topics we’ve looked at <strong>in</strong> thischapter have broadly fallen <strong>in</strong>to two categories: “nice to have” improvements thatstreaml<strong>in</strong>e development, and “hope you don’t need it” features that can get you out oftricky situations when you need them. To make an analogy between <strong>C#</strong> 2 and improvementsto a house, the major features from our earlier chapters are comparable to fullscaleadditions. Some of the features we’ve seen <strong>in</strong> this chapter (such as partial typesand static classes) are more like redecorat<strong>in</strong>g a bedroom, and features like namespacealiases are ak<strong>in</strong> to fitt<strong>in</strong>g smoke alarms—you may never see a benefit, but it’s nice toknow they’re there if you ever need them.The range of features <strong>in</strong> <strong>C#</strong> 2 is very broad—the designers tackled many of theareas where developers were feel<strong>in</strong>g pa<strong>in</strong>, without any one overarch<strong>in</strong>g goal. That’snot to say the features don’t work well together—nullable value types wouldn’t be feasiblewithout generics, for <strong>in</strong>stance—but there’s no one aim that every feature contributesto, unless you count general productivity.Now that we’ve f<strong>in</strong>ished exam<strong>in</strong><strong>in</strong>g <strong>C#</strong> 2, it’s time to move on to <strong>C#</strong> 3, where thepicture is very different. Nearly every feature <strong>in</strong> <strong>C#</strong> 3 (with the exception of partialmethods, which we’ve covered <strong>in</strong> this chapter) forms part of the grand picture ofLINQ, a conglomeration of technologies that could well change the way traditionalprogrammers th<strong>in</strong>k—forever.Licensed to Rhona Hadida


Part 3<strong>C#</strong> 3—revolutioniz<strong>in</strong>ghow we codeThere is no doubt that <strong>C#</strong> 2 is a significant improvement over <strong>C#</strong> 1. The benefitsof generics <strong>in</strong> particular are fundamental to other changes, not just <strong>in</strong> <strong>C#</strong> 2but also <strong>in</strong> <strong>C#</strong> 3. However, <strong>C#</strong> 2 was <strong>in</strong> some sense a piecemeal collection of features.Don’t get me wrong: they fit together nicely enough, but they address a setof <strong>in</strong>dividual issues. That was appropriate at that stage of <strong>C#</strong>’s development, but<strong>C#</strong> 3 is different.Almost every feature <strong>in</strong> <strong>C#</strong> 3 enables one very specific technology: LINQ.Many of the features are useful outside this context, and you certa<strong>in</strong>ly shouldn’tconf<strong>in</strong>e yourself to only us<strong>in</strong>g them when you happen to be writ<strong>in</strong>g a queryexpression, for example—but it would be equally silly not to recognise the completepicture created by the set of jigsaw puzzle pieces presented <strong>in</strong> the rema<strong>in</strong><strong>in</strong>gchapters.I’m writ<strong>in</strong>g this before <strong>C#</strong> 3 and .NET 3.5 have been fully released, but I’d liketo make a prediction: <strong>in</strong> a few years, we’ll be collectively kick<strong>in</strong>g ourselves for notus<strong>in</strong>g LINQ <strong>in</strong> a more widespread fashion <strong>in</strong> the early days of <strong>C#</strong> 3. The buzzaround LINQ—both with<strong>in</strong> the community and <strong>in</strong> the messages from Microsoft—has been largely around database access and LINQ to SQL. Now databases are certa<strong>in</strong>lyimportant—but we manipulate data all the time, not just from databases but<strong>in</strong> memory, and from files, network resources, and other places. Why shouldn’tother data sources get just as much benefit from LINQ as databases?Licensed to Rhona Hadida


They do, of course—and that’s the hidden jewel of LINQ. It’s been <strong>in</strong> broad daylight,<strong>in</strong> public view—just not talked about very much. Even if you don’t talk aboutit, I’d like you to keep it <strong>in</strong> the back of your m<strong>in</strong>d while you read about the featuresof <strong>C#</strong> 3. Look at your exist<strong>in</strong>g code <strong>in</strong> the light of the possibilities that LINQ has tooffer. It’s not suitable for all tasks, but where it is appropriate it can make a spectaculardifference.It’s only been <strong>in</strong> the course of writ<strong>in</strong>g this book that I’ve become thoroughly conv<strong>in</strong>cedof the elegance and beauty of LINQ. The deeper you study the language, themore clearly you see the harmony between the various elements that have been <strong>in</strong>troduced.Hopefully this will become apparent <strong>in</strong> the rema<strong>in</strong>der of the book, but you’remore likely to feel it gradually as you beg<strong>in</strong> to see LINQ improv<strong>in</strong>g your own code. Idon’t wish to sound like a m<strong>in</strong>dless and noncritical <strong>C#</strong> devotee, but I feel there’ssometh<strong>in</strong>g special <strong>in</strong> <strong>C#</strong> 3.With that brief burst of abstract admiration out of the way, let’s start look<strong>in</strong>g at<strong>C#</strong> 3 <strong>in</strong> a more concrete manner.Licensed to Rhona Hadida


Cutt<strong>in</strong>g fluffwith a smart compilerThis chapter covers■■■■■Automatically implemented propertiesImplicitly typed local variablesObject and collection <strong>in</strong>itializersImplicitly typed arraysAnonymous typesWe start look<strong>in</strong>g at <strong>C#</strong> 3 <strong>in</strong> the same way that we f<strong>in</strong>ished look<strong>in</strong>g at <strong>C#</strong> 2—with acollection of relatively simple features. These are just the first small steps on thepath to LINQ, however. Each of them can be used outside that context, but they’reall pretty important for simplify<strong>in</strong>g code to the extent that LINQ requires <strong>in</strong> orderto be effective.One important po<strong>in</strong>t to note is that while two of the biggest features of <strong>C#</strong> 2—generics and nullable types—required CLR changes, there are no significantchanges to the CLR that ships with .NET 3.5. There are some bug fixes, but noth<strong>in</strong>gfundamental. The framework library has grown to support LINQ, along with <strong>in</strong>troduc<strong>in</strong>ga few more features to the base class library, but that’s a different matter. It’s207Licensed to Rhona Hadida


208 CHAPTER 8 Cutt<strong>in</strong>g fluff with a smart compilerworth be<strong>in</strong>g quite clear <strong>in</strong> your m<strong>in</strong>d which changes are only <strong>in</strong> the <strong>C#</strong> language,which are library changes, and which are CLR changes.The fact that there are no CLR changes for .NET 3.5 means that almost all of thenew features exposed <strong>in</strong> <strong>C#</strong> 3 are due to the compiler be<strong>in</strong>g will<strong>in</strong>g to do more workfor you. We saw some evidence of this <strong>in</strong> <strong>C#</strong> 2—particularly with anonymous methodsand iterator blocks—and <strong>C#</strong> 3 cont<strong>in</strong>ues <strong>in</strong> the same ve<strong>in</strong>. In this chapter, we’ll meetthe follow<strong>in</strong>g features that are new to <strong>C#</strong> 3:■■■■■Automatically implemented properties—Remov<strong>in</strong>g the drudgery of writ<strong>in</strong>g simpleproperties backed directly by fields.Implicitly typed local variables—When you declare a variable and immediatelyassign a value to it, you no longer need to specify the type <strong>in</strong> the declaration.Object and collection <strong>in</strong>itializers—Simple ways to <strong>in</strong>itialize objects <strong>in</strong> s<strong>in</strong>gle expressions.Implicitly typed arrays—Let the compiler work out the type of new arrays, basedon their contents.Anonymous types—Primarily used <strong>in</strong> LINQ, these allow you to create new “adhoc” types to conta<strong>in</strong> simple properties.As well as describ<strong>in</strong>g what the new features do, I’ll make recommendations about theiruse. Many of the features of <strong>C#</strong> 3 require a certa<strong>in</strong> amount of discretion and restra<strong>in</strong>ton the part of the developer. That’s not to say they’re not powerful and <strong>in</strong>credibly useful—quitethe reverse—but the temptation to use the latest and greatest syntactic sugarshouldn’t be allowed to overrule the drive toward clear and readable code.The considerations I’ll discuss <strong>in</strong> this chapter (and <strong>in</strong>deed <strong>in</strong> the rest of the book)will rarely be black and white. Perhaps more than ever before, readability is <strong>in</strong> the eyeof the beholder—and as you become more comfortable with the new features, they’relikely to become more readable to you. I should stress, however, that unless you havegood reason to suppose you’ll be the only one to ever read your code, you should considerthe needs and views of your colleagues carefully.Enough navel gaz<strong>in</strong>g for the moment. We’ll start off with a feature that shouldn’tcause any controversy—and that I always miss when cod<strong>in</strong>g <strong>in</strong> <strong>C#</strong> 2. Simple but effective,automatically implemented properties just make life better.8.1 Automatically implemented propertiesOur first feature is probably the simplest <strong>in</strong> the whole of <strong>C#</strong> 3. In fact, it’s even simplerthan any of the new features <strong>in</strong> <strong>C#</strong> 2. Despite that—or possibly because of that—it’s alsoimmediately applicable <strong>in</strong> many, many situations. When you read about iterator blocks<strong>in</strong> chapter 6, you may not immediately have thought of any areas of your current codebasethat could be improved by us<strong>in</strong>g them, but I’d be surprised to f<strong>in</strong>d any nontrivial<strong>C#</strong> program that couldn’t be modified to use automatically implemented properties.This fabulously simple feature allows you to express trivial properties with less codethan before.Licensed to Rhona Hadida


Automatically implemented properties209What do I mean by a trivial property? I mean one that is read/write and that storesits value <strong>in</strong> a straightforward private variable without any validation or other customcode. In other words, it’s a property like this:str<strong>in</strong>g name;public str<strong>in</strong>g Name{get { return name; }set { name = value; }}Now, that’s not an awful lot of code, but it’s still five l<strong>in</strong>es—and that’s assum<strong>in</strong>g yourcod<strong>in</strong>g conventions allow you to get away with the “one l<strong>in</strong>e” forms of the getter andsetter. If your cod<strong>in</strong>g conventions force you to keep member variables <strong>in</strong> one area ofcode and properties <strong>in</strong> another, it becomes a bit uglier—and then there’s the questionof whether to add XML documentation to the property, the variable, or both.The <strong>C#</strong> 3 version us<strong>in</strong>g an automatically implemented property is a s<strong>in</strong>gle l<strong>in</strong>e:public str<strong>in</strong>g Name { get; set; }Where previously you might have been tempted to use a public variable (particularlyfor “throwaway code”—which we all know tends to live for longer than anticipated)just to make the code simple, there’s now even less excuse for not follow<strong>in</strong>g the bestpractice of us<strong>in</strong>g a property <strong>in</strong>stead. The compiler generates a private variable thatcan’t be referenced directly <strong>in</strong> the source, and fills <strong>in</strong> the property getter and setterwith the simple code to read and write that variable.NOTETerm<strong>in</strong>ology: Automatic property vs. automatically implemented property—Whenautomatically implemented properties were first discussed, long beforethe full <strong>C#</strong> 3 specification was published, they were called automatic properties.Personally, I f<strong>in</strong>d this a lot less of a mouthful than the full name, andit’s not like anyth<strong>in</strong>g other than the implementation is go<strong>in</strong>g to be automatic.For the rest of this book I will use automatic property and automaticallyimplemented property synonymously.The feature of <strong>C#</strong> 2 that allows you to specify different access for the getter and thesetter is still available here, and you can also create static automatic properties. You needto be careful with static properties <strong>in</strong> terms of multithread<strong>in</strong>g, however—although mosttypes don’t claim to have thread-safe <strong>in</strong>stance members, publicly visible static membersusually should be thread-safe, and the compiler doesn’t do anyth<strong>in</strong>g to help you <strong>in</strong> thisrespect. It’s best to restrict automatic static properties to be private, and make sure youdo any appropriate lock<strong>in</strong>g yourself. List<strong>in</strong>g 8.1 gives an example of this.List<strong>in</strong>g 8.1A Person class that counts created <strong>in</strong>stancespublic class Person{public str<strong>in</strong>g Name { get; private set; }public <strong>in</strong>t Age { get; private set; }Declares propertieswith public gettersLicensed to Rhona Hadida


210 CHAPTER 8 Cutt<strong>in</strong>g fluff with a smart compilerprivate static <strong>in</strong>t InstanceCounter { get; set; }private static readonly object counterLock = new object();public Person(str<strong>in</strong>g name, <strong>in</strong>t age){Name = name;Age = age;Declaresprivate staticproperty andlock}}lock (counterLock){InstanceCounter++;}Uses lock whileaccess<strong>in</strong>g staticpropertyAn alternative <strong>in</strong> this case is to use a simple static variable and rely on Interlocked.Increment to update the <strong>in</strong>stance counter. You may decide that’s simpler (and moreefficient) code than us<strong>in</strong>g an explicit lock—it’s a judgment call. Due to this sort ofissue, static automatic properties are rarely useful: it’s usually better to implement normalproperties, allow<strong>in</strong>g you more control. Note that you can’t use automatic propertiesand use Interlocked.Increment: you no longer have access to the field, so youcan’t pass it by reference to the method.The other automatic properties <strong>in</strong> list<strong>in</strong>g 8.1, represent<strong>in</strong>g the name and age ofthe person, are real no-bra<strong>in</strong>ers. Where you’ve got properties that you would haveimplemented trivially <strong>in</strong> previous versions of <strong>C#</strong>, there’s no benefit <strong>in</strong> not us<strong>in</strong>g automaticproperties.One slight wr<strong>in</strong>kle occurs if you use automatic properties when writ<strong>in</strong>g your ownstructs: all of your constructors need to explicitly call the parameterless constructor—this()—so that the compiler knows that all the fields have been def<strong>in</strong>itely assigned.You can’t set the fields directly because they’re anonymous, and you can’t use theproperties until all the fields have been set. The only way of proceed<strong>in</strong>g is to call theparameterless constructor, which will set the fields to their default values.That’s all there is to automatically implemented properties. There are no bellsand whistles to them—for <strong>in</strong>stance, there’s no way of declar<strong>in</strong>g them with <strong>in</strong>itialdefault values, and no way of mak<strong>in</strong>g them read-only. If all the <strong>C#</strong> 3 features werethat simple, we could cover everyth<strong>in</strong>g <strong>in</strong> a s<strong>in</strong>gle chapter. Of course, that’s not thecase—but there are still some features that don’t take very much explanation. Ournext topic removes duplicate code <strong>in</strong> another common but specific situation—declar<strong>in</strong>g local variables.8.2 Implicit typ<strong>in</strong>g of local variablesIn chapter 2, I discussed the nature of the <strong>C#</strong> 1 type system. In particular, I stated thatit was static, explicit, and safe. That’s still true <strong>in</strong> <strong>C#</strong> 2, and <strong>in</strong> <strong>C#</strong> 3 it’s still almost completelytrue. The static and safe parts are still true (ignor<strong>in</strong>g explicitly unsafe code, justas we did <strong>in</strong> chapter 2) and most of the time it’s still explicitly typed—but you can askthe compiler to <strong>in</strong>fer the types of local variables for you.Licensed to Rhona Hadida


Implicit typ<strong>in</strong>g of local variables2118.2.1 Us<strong>in</strong>g var to declare a local variableIn order to use implicit typ<strong>in</strong>g, all you need to do is replace the type part of a normallocal variable declaration with var. Certa<strong>in</strong> restrictions exist (we’ll come to those <strong>in</strong> amoment), but essentially it’s as easy as chang<strong>in</strong>g<strong>in</strong>toMyType variableName = someInitialValue;var variableName = someInitialValue;The results of the two l<strong>in</strong>es (<strong>in</strong> terms of compiled code) are exactly the same, assum<strong>in</strong>gthat the type of someInitialValue is MyType. The compiler simply takes the compiletimetype of the <strong>in</strong>itialization expression and makes the variable have that type too. Thetype can be any normal .NET type, <strong>in</strong>clud<strong>in</strong>g generics, delegates, and <strong>in</strong>terfaces. Thevariable is still statically typed; it’s just that you haven’t written the name of the type <strong>in</strong>your code.This is important to understand, as it goes to the heart of what a lot of developers<strong>in</strong>itially fear when they see this feature—that <strong>C#</strong> has become dynamically or weaklytyped. That’s not true at all. The best way of expla<strong>in</strong><strong>in</strong>g this is to show you some<strong>in</strong>valid code:var str<strong>in</strong>gVariable = "Hello, world.";str<strong>in</strong>gVariable = 0;That doesn’t compile, because the type of str<strong>in</strong>gVariable is System.Str<strong>in</strong>g, and youcan’t assign the value 0 to a str<strong>in</strong>g variable. In many dynamic languages, the codewould have compiled, leav<strong>in</strong>g the variable with no particularly useful type as far as thecompiler, IDE, or runtime environment is concerned. Us<strong>in</strong>g var is not like us<strong>in</strong>g aVariant type from COM or VB6. The variable is statically typed; it’s just that the typehas been <strong>in</strong>ferred by the compiler. I apologize if I seem to be go<strong>in</strong>g on about thissomewhat, but it’s <strong>in</strong>credibly important.In Visual Studio 2008, you can tell the type that the compiler has used for the variableby hover<strong>in</strong>g over the var part of the declaration, as shown <strong>in</strong> figure 8.1. Note howthe type parameters for the generic Dictionary type are also expla<strong>in</strong>ed.If this looks familiar, that’s because it’s exactly the same behavior you get when youdeclare local variables explicitly.Tooltips aren’t just available at the po<strong>in</strong>t of declaration, either. As you’d probablyexpect, the tooltip displayed when you hover over the variable name later on <strong>in</strong> thecode <strong>in</strong>dicates the type of the variable too. This is shown <strong>in</strong> figure 8.2, where the samedeclaration is used and then I’ve hovered over a use of the variable.Figure 8.1 Hover<strong>in</strong>g over var <strong>in</strong> Visual Studio2008 displays the type of the declared variable.Licensed to Rhona Hadida


212 CHAPTER 8 Cutt<strong>in</strong>g fluff with a smart compilerFigure 8.2 Hover<strong>in</strong>gover the use of animplicitly typed localvariable displays its type.Aga<strong>in</strong>, that’s exactly the same behavior as a normal local variable declaration. Now,there are two reasons for br<strong>in</strong>g<strong>in</strong>g up Visual Studio 2008 <strong>in</strong> this context. The first isthat it’s more evidence of the static typ<strong>in</strong>g <strong>in</strong>volved—the compiler clearly knows thetype of the variable. The second is to po<strong>in</strong>t out that you can easily discover the type<strong>in</strong>volved, even from deep with<strong>in</strong> a method. This will be important when we talk aboutthe pros and cons of us<strong>in</strong>g implicit typ<strong>in</strong>g <strong>in</strong> a m<strong>in</strong>ute. First, though, I ought to mentionsome limitations.8.2.2 Restrictions on implicit typ<strong>in</strong>gYou can’t use implicit typ<strong>in</strong>g for every variable <strong>in</strong> every situation. You can only use itwhen■ The variable be<strong>in</strong>g declared is a local variable.■ The variable is <strong>in</strong>itialized as part of the declaration.■ The <strong>in</strong>itialization expression isn’t a method group or anonymous function 1(without cast<strong>in</strong>g).■ The <strong>in</strong>itialization expression isn’t null.■ Only one variable is declared <strong>in</strong> the statement.■ The type you want the variable to have is the compile-time type of the <strong>in</strong>itializationexpression.The third and fourth po<strong>in</strong>ts are <strong>in</strong>terest<strong>in</strong>g. You can’t writevar starter = delegate() { Console.WriteL<strong>in</strong>e(); }This is because the compiler doesn’t know what type to use. You can writevar starter = (ThreadStart) delegate() { Console.WriteL<strong>in</strong>e(); }but if you’re go<strong>in</strong>g to do that you’d be better off explicitly declar<strong>in</strong>g the variable <strong>in</strong>the first place. The same is true <strong>in</strong> the null case—you could cast the null appropriately,but there’d be no po<strong>in</strong>t. Note that you can use the result of method calls orproperties as the <strong>in</strong>itialization expression—you’re not limited to constants and constructorcalls. For <strong>in</strong>stance, you could usevar args = Environment.CommandL<strong>in</strong>e;In that case args would then be of type str<strong>in</strong>g[]. In fact, <strong>in</strong>itializ<strong>in</strong>g a variable withthe result of a method call is likely to be the most common situation where implicit1The term anonymous function covers both anonymous methods and lambda expressions, which we’ll delve <strong>in</strong>to<strong>in</strong> chapter 9.Licensed to Rhona Hadida


Implicit typ<strong>in</strong>g of local variables213typ<strong>in</strong>g is used, as part of LINQ. We’ll see all that later on—just bear it <strong>in</strong> m<strong>in</strong>d as theexamples progress.It’s also worth not<strong>in</strong>g that you are allowed to use implicit typ<strong>in</strong>g for the local variablesdeclared <strong>in</strong> the first part of a us<strong>in</strong>g, for, or foreach statement. For example, thefollow<strong>in</strong>g are all valid (with appropriate bodies, of course):for (var i = 0; i < 10; i++)us<strong>in</strong>g (var x = File.OpenText("test.dat"))foreach (var s <strong>in</strong> Environment.CommandL<strong>in</strong>e)The variables <strong>in</strong> question would end up with types of <strong>in</strong>t, StreamReader and str<strong>in</strong>g,respectively. Of course, just because you can do this doesn’t mean you should. Let’shave a look at the reasons for and aga<strong>in</strong>st us<strong>in</strong>g implicit typ<strong>in</strong>g.8.2.3 Pros and cons of implicit typ<strong>in</strong>gThe question of when it’s a good idea to use implicit typ<strong>in</strong>g is the cause of an awful lotof community discussion. Views range from “everywhere” to “nowhere” with plenty ofmore balanced approaches between the two. We’ll see <strong>in</strong> section 8.5 that <strong>in</strong> order touse another of <strong>C#</strong> 3’s features—anonymous types—you’ve often got to use implicit typ<strong>in</strong>g.You could avoid anonymous types as well, of course, but that’s throw<strong>in</strong>g the babyout with the bathwater.The ma<strong>in</strong> reason for us<strong>in</strong>g implicit typ<strong>in</strong>g (leav<strong>in</strong>g anonymous types aside for themoment) is that it reduces not only the number of keystrokes required to enter thecode, but also the amount of code on the screen. In particular, when generics are<strong>in</strong>volved the type names can get very long. Figures 8.1 and 8.2 used a type of Dictionary, which is 33 characters. By the time you’ve got that twice ona l<strong>in</strong>e (once for the declaration and once for the <strong>in</strong>itialization), you end up with a massivel<strong>in</strong>e just for declar<strong>in</strong>g and <strong>in</strong>itializ<strong>in</strong>g a s<strong>in</strong>gle variable! An alternative is to use analias, but that puts the “real” type <strong>in</strong>volved a long way (conceptually at least) from thecode that uses it.When read<strong>in</strong>g the code, there’s no po<strong>in</strong>t <strong>in</strong> see<strong>in</strong>g the same long type name twiceon the same l<strong>in</strong>e when it’s obvious that they should be the same. If the declaration isn’tvisible on the screen, you’re <strong>in</strong> the same boat whether implicit typ<strong>in</strong>g was used or not(all the ways you’d use to f<strong>in</strong>d out the variable type are still valid) and if it is visible, theexpression used to <strong>in</strong>itialize the variable tells you the type anyway.All of this sounds good, so what are the arguments aga<strong>in</strong>st implicit typ<strong>in</strong>g? Paradoxicallyenough, readability is the most important one, despite also be<strong>in</strong>g an argument<strong>in</strong> favor of implicit typ<strong>in</strong>g! By not be<strong>in</strong>g explicit about what type of variableyou’re declar<strong>in</strong>g, you may be mak<strong>in</strong>g it harder to work it out just by read<strong>in</strong>g the code.It breaks the “state what are we declar<strong>in</strong>g, then what value it will start off with” m<strong>in</strong>dsetthat keeps the declaration and the <strong>in</strong>itialization quite separate. To what extentthat’s an issue depends on both the reader and the <strong>in</strong>itialization expression <strong>in</strong>volved.If you’re explicitly call<strong>in</strong>g a constructor, it’s always go<strong>in</strong>g to be pretty obvious whattype you’re creat<strong>in</strong>g. If you’re call<strong>in</strong>g a method or us<strong>in</strong>g a property, it depends on howLicensed to Rhona Hadida


214 CHAPTER 8 Cutt<strong>in</strong>g fluff with a smart compilerobvious the return type is just from look<strong>in</strong>g at the call. Integer literals are anotherexample where guess<strong>in</strong>g the <strong>in</strong>ferred type is harder than one might suppose. Howquickly can you work out the type of each of the variables declared here?var a = 2147483647;var b = 2147483648;var c = 4294967295;var d = 4294967296;var e = 9223372036854775807;var f = 9223372036854775808;The answers are <strong>in</strong>t, u<strong>in</strong>t, u<strong>in</strong>t, long, long, and ulong, respectively—the type useddepends on the value of the expression. There’s noth<strong>in</strong>g new here <strong>in</strong> terms of thehandl<strong>in</strong>g of literals—<strong>C#</strong> has always behaved like this—but implicit typ<strong>in</strong>g makes it easierto write obscure code <strong>in</strong> this case.The argument that is rarely explicitly stated but that I believe is beh<strong>in</strong>d a lot of theconcern over implicit typ<strong>in</strong>g is “It just doesn’t feel right.” If you’ve been writ<strong>in</strong>g <strong>in</strong> a C-like language for years and years, there is someth<strong>in</strong>g unnerv<strong>in</strong>g about the whole bus<strong>in</strong>ess,however much you tell yourself that it’s still static typ<strong>in</strong>g under the covers. This maynot be a rational concern, but that doesn’t make it any less real. If you’re uncomfortable,you’re likely to be less productive. If the advantages don’t outweigh your negative feel<strong>in</strong>gs,that’s f<strong>in</strong>e. Depend<strong>in</strong>g on your personality, you may wish to try to push yourself tobecome more comfortable with implicit typ<strong>in</strong>g—but you certa<strong>in</strong>ly don’t have to.8.2.4 RecommendationsHere are some recommendations based on my experience with implicit typ<strong>in</strong>g. That’sall they are—recommendations—and you should feel free to take them with a p<strong>in</strong>chof salt.■■■■■■Consult your teammates on the matter when embark<strong>in</strong>g on a <strong>C#</strong> 3 project.When <strong>in</strong> doubt, try a l<strong>in</strong>e both ways and go with your gut feel<strong>in</strong>gs.Unless there’s a significant ga<strong>in</strong> <strong>in</strong> code simplicity, use explicit typ<strong>in</strong>g. Note thatnumeric variables always fall <strong>in</strong>to this category s<strong>in</strong>ce you’d never ga<strong>in</strong> morethan a few characters anyway.If it’s important that someone read<strong>in</strong>g the code knows the type of the variableat a glance, use explicit typ<strong>in</strong>g.If the variable is directly <strong>in</strong>itialized with a constructor and the type name is long(which often occurs with generics) consider us<strong>in</strong>g implicit typ<strong>in</strong>g.If the precise type of the variable isn’t important, but its general nature is clearfrom the context, use implicit typ<strong>in</strong>g to deemphasize how the code achieves itsaim and concentrate on the higher level of what it’s achiev<strong>in</strong>g.Effectively, my recommendation boils down to not us<strong>in</strong>g implicit typ<strong>in</strong>g either“because it’s new” or for reasons of laz<strong>in</strong>ess, sav<strong>in</strong>g a few keystrokes. Where it keepsthe code tidier, allow<strong>in</strong>g you to concentrate on the most important elements of thecode, go for it. I’ll be us<strong>in</strong>g implicit typ<strong>in</strong>g extensively <strong>in</strong> the rest of the book, for theLicensed to Rhona Hadida


Simplified <strong>in</strong>itialization215simple reason that code is harder to format <strong>in</strong> pr<strong>in</strong>t than on a screen—there’s not asmuch width available.We’ll come back to implicit typ<strong>in</strong>g when we see anonymous types, as they create situationswhere you are forced to ask the compiler to <strong>in</strong>fer the types of some variables.Before that, let’s have a look at how <strong>C#</strong> 3 makes it easier to construct and populate anew object <strong>in</strong> one expression.8.3 Simplified <strong>in</strong>itializationOne would have thought that object-oriented languages would have streaml<strong>in</strong>edobject creation long ago. After all, before you start us<strong>in</strong>g an object, someth<strong>in</strong>g has tocreate it, whether it’s through your code directly or a factory method of some sort.And yet <strong>in</strong> <strong>C#</strong> 2 very few language features are geared toward mak<strong>in</strong>g life easier whenit comes to <strong>in</strong>itialization. If you can’t do what you want us<strong>in</strong>g constructor arguments,you’re basically out of luck—you need to create the object, then manually <strong>in</strong>itialize itwith property calls and the like.This is particularly annoy<strong>in</strong>g when you want to create a whole bunch of objects <strong>in</strong>one go, such as <strong>in</strong> an array or other collection—without a “s<strong>in</strong>gle expression” way of<strong>in</strong>itializ<strong>in</strong>g an object, you’re forced to either use local variables for temporary manipulation,or create a helper method that performs the appropriate <strong>in</strong>itialization basedon parameters.<strong>C#</strong> 3 comes to the rescue <strong>in</strong> a number of ways, as we’ll see <strong>in</strong> this section.8.3.1 Def<strong>in</strong><strong>in</strong>g our sample typesThe expressions we’re go<strong>in</strong>g to be us<strong>in</strong>g <strong>in</strong> this section are called object <strong>in</strong>itializers.These are just ways of specify<strong>in</strong>g <strong>in</strong>itialization that should occur after an object hasbeen created. You can set properties, set properties of properties (don’t worry, it’s simplerthan it sounds), and add to collections that are accessible via properties. To demonstrateall this, we’ll use a Person class aga<strong>in</strong>. To start with, there’s the name and agewe’ve used before, exposed as writable properties. We’ll provide both a parameterlessconstructor and one that accepts the name as a parameter. We’ll also add a list offriends and the person’s home location, both of which are accessible as read-onlyproperties, but that can still be modified by manipulat<strong>in</strong>g the retrieved objects. A simpleLocation class provides Country and Town properties to represent the person’shome. List<strong>in</strong>g 8.2 shows the complete code for the classes.List<strong>in</strong>g 8.2A fairly simple Person class used for further demonstrationspublic class Person{public <strong>in</strong>t Age { get; set; }public str<strong>in</strong>g Name { get; set; }List friends = new List();public List Friends { get { return friends; } }Location home = new Location();Licensed to Rhona Hadida


216 CHAPTER 8 Cutt<strong>in</strong>g fluff with a smart compilerpublic Location Home { get { return home; } }public Person() { }}public Person(str<strong>in</strong>g name){Name = name;}public class Location{public str<strong>in</strong>g Country { get; set; }public str<strong>in</strong>g Town { get; set; }}List<strong>in</strong>g 8.2 is straightforward, but it’s worth not<strong>in</strong>g that both the list of friends and thehome location are created <strong>in</strong> a “blank” way when the person is created, rather thanbe<strong>in</strong>g left as just null references. That’ll be important later on—but for the momentlet’s look at the properties represent<strong>in</strong>g the name and age of a person.8.3.2 Sett<strong>in</strong>g simple propertiesNow that we’ve got our Person type, we want to create some <strong>in</strong>stances of it us<strong>in</strong>g thenew features of <strong>C#</strong> 3. In this section we’ll look at sett<strong>in</strong>g the Name and Age properties—we’ll come to the others later.In fact, object <strong>in</strong>itializers aren’t restricted to us<strong>in</strong>g properties. All of the syntacticsugar here also applies to fields, but the vast majority of the time you’ll be us<strong>in</strong>g properties.In a well-encapsulated system you’re unlikely to have access to fields anyway,unless you’re creat<strong>in</strong>g an <strong>in</strong>stance of a type with<strong>in</strong> that type’s own code. It’s worthknow<strong>in</strong>g that you can use fields, of course—so for the rest of the section, just read propertyand field whenever the text says property.With that out of the way, let’s get down to bus<strong>in</strong>ess. Suppose we want to create aperson called Tom, who is four years old. Prior to <strong>C#</strong> 3, there are two ways this can beachieved:Person tom1 = new Person();tom1.Name = "Tom";tom1.Age = 4;Person tom2 = new Person("Tom");tom2.Age = 4;The first version simply uses the parameterless constructor and then sets both properties.The second version uses the constructor overload which sets the name, and thensets the age afterward. Both of these options are still available <strong>in</strong> <strong>C#</strong> 3 of course, butthere are other alternatives:Person tom3 = new Person() { Name="Tom", Age=4 };Person tom4 = new Person { Name="Tom", Age=4 };Person tom5 = new Person("Tom") { Age = 4 };Licensed to Rhona Hadida


Simplified <strong>in</strong>itialization217The part <strong>in</strong> braces at the end of each l<strong>in</strong>e is the object <strong>in</strong>itializer. Aga<strong>in</strong>, it’s just compilertrickery. The IL used to <strong>in</strong>itialize tom3 and tom4 is identical, and <strong>in</strong>deed it’s verynearly 2 the same as we used for tom1. Predictably, the code for tom5 is nearly the sameas for tom2. Note how for tom4 we omitted the parentheses for the constructor. Youcan use this shorthand for types with a parameterless constructor, which is what getscalled <strong>in</strong> the compiled code.After the constructor has been called, the specified properties are set <strong>in</strong> the obviousway. They’re set <strong>in</strong> the order specified <strong>in</strong> the object <strong>in</strong>itializer, and you can onlyspecify any particular property at most once—you can’t set the Name property twice,for example. (You could, however, call the constructor tak<strong>in</strong>g the name as a parameter,and then set the Name property. It would be po<strong>in</strong>tless, but the compiler wouldn’tstop you from do<strong>in</strong>g it.) The expression used as the value for a property can be anyexpression that isn’t itself an assignment—you can call methods, create new objects(potentially us<strong>in</strong>g another object <strong>in</strong>itializer), pretty much anyth<strong>in</strong>g.You may well be wonder<strong>in</strong>g just how useful this is—we’ve saved one orImportant!Oneexpression to<strong>in</strong>itialize anobjecttwo l<strong>in</strong>es of code, but surely that’s not a good enough reason to make thelanguage more complicated, is it? There’s a subtle po<strong>in</strong>t here, though:we’ve not just created an object <strong>in</strong> one l<strong>in</strong>e—we’ve created it <strong>in</strong> oneexpression. That difference can be very important. Suppose you want tocreate an array of type Person[] with some predef<strong>in</strong>ed data <strong>in</strong> it. Evenwithout us<strong>in</strong>g the implicit array typ<strong>in</strong>g we’ll see later, the code is neat andreadable:Person[] family = new Person[]{new Person { Name="Holly", Age=31 },new Person { Name="Jon", Age=31 },new Person { Name="Tom", Age=4 },new Person { Name="William", Age=1 },new Person { Name="Rob<strong>in</strong>", Age=1 }};Now, <strong>in</strong> a simple example like this we could have written a constructor tak<strong>in</strong>g both thename and age as parameters, and <strong>in</strong>itialized the array <strong>in</strong> a similar way <strong>in</strong> <strong>C#</strong> 1 or 2.However, appropriate constructors aren’t always available—and if there are severalconstructor parameters, it’s often not clear which one means what just from the position.By the time a constructor needs to take five or six parameters, I often f<strong>in</strong>d myselfrely<strong>in</strong>g on IntelliSense more than I want to. Us<strong>in</strong>g the property names is a great boonto readability <strong>in</strong> such cases.This form of object <strong>in</strong>itializer is the one you’ll probably use most often. However,there are two other forms—one for sett<strong>in</strong>g subproperties, and one for add<strong>in</strong>g to collections.Let’s look at subproperties—properties of properties—first.2In fact, the variable’s new value isn’t assigned until all the properties have been set. A temporary local variableis used until then. This is very rarely noticeable, though, and where it is the code should probably be morestraightforward anyway.Licensed to Rhona Hadida


218 CHAPTER 8 Cutt<strong>in</strong>g fluff with a smart compiler8.3.3 Sett<strong>in</strong>g properties on embedded objectsSo far we’ve found it easy to set the Name and Age properties, but we can’t set the Homeproperty <strong>in</strong> the same way—it’s read-only. However, we can set the town and the countryof a person, by first fetch<strong>in</strong>g the Home property, and then sett<strong>in</strong>g properties on theresult. The language specification refers to this as sett<strong>in</strong>g the properties of an embeddedobject. Just to make it clear, what we’re talk<strong>in</strong>g about is the follow<strong>in</strong>g <strong>C#</strong> 1 code:Person tom = new Person("Tom");tom.Age = 4;tom.Home.Country = "UK";tom.Home.Town = "Read<strong>in</strong>g";When we’re populat<strong>in</strong>g the home location, each statement is do<strong>in</strong>g a get to retrievethe Location <strong>in</strong>stance, and then a set on the relevant property on that <strong>in</strong>stance.There’s noth<strong>in</strong>g new <strong>in</strong> that, but it’s worth slow<strong>in</strong>g your m<strong>in</strong>d down to look at it carefully;otherwise, it’s easy to miss what’s go<strong>in</strong>g on beh<strong>in</strong>d the scenes.<strong>C#</strong> 3 allows all of this to be done <strong>in</strong> one expression, as shown here:Looks like anassignment toHome, but it’snot really!Person tom = new Person("Tom"){Age = 4,Home = { Country="UK", Town="Read<strong>in</strong>g" }};The compiled code for these snippets is effectively the same. The compilerspots that to the right side of the = sign is another object <strong>in</strong>itializer,and applies the properties to the embedded object appropriately. One po<strong>in</strong>t aboutthe formatt<strong>in</strong>g I’ve used—just as <strong>in</strong> almost all <strong>C#</strong> features, it’s whitespace <strong>in</strong>dependent:you can collapse the whitespace <strong>in</strong> the object <strong>in</strong>itializer, putt<strong>in</strong>g it all on one l<strong>in</strong>eif you like. It’s up to you to work out where the sweet spot is <strong>in</strong> balanc<strong>in</strong>g long l<strong>in</strong>esaga<strong>in</strong>st lots of l<strong>in</strong>es.The absence of the new keyword <strong>in</strong> the part <strong>in</strong>itializ<strong>in</strong>g Home is significant. If youneed to work out where the compiler is go<strong>in</strong>g to create new objects and where it’sgo<strong>in</strong>g to set properties on exist<strong>in</strong>g ones, look for occurrences of new <strong>in</strong> the <strong>in</strong>itializer.Every time a new object is created, the new keyword appears somewhere.We’ve dealt with the Home property—but what about Tom’s friends? There areproperties we can set on a List, but none of them will add entries to the list.It’s time for the next feature—collection <strong>in</strong>itializers.8.3.4 Collection <strong>in</strong>itializersCreat<strong>in</strong>g a collection with some <strong>in</strong>itial values is an extremely common task. Until <strong>C#</strong> 3arrived, the only language feature that gave any assistance was array creation—andeven that was clumsy <strong>in</strong> many situations. <strong>C#</strong> 3 has collection <strong>in</strong>itializers, which allow youto use the same type of syntax as array <strong>in</strong>itializers but with arbitrary collections andmore flexibility.Licensed to Rhona Hadida


Simplified <strong>in</strong>itialization219CREATING NEW COLLECTIONS WITH COLLECTION INITIALIZERSAs a first example, let’s use the now-familiar List type. In <strong>C#</strong> 2, you could populatea list either by pass<strong>in</strong>g <strong>in</strong> an exist<strong>in</strong>g collection, or by call<strong>in</strong>g Add repeatedly after creat<strong>in</strong>gan empty list. Collection <strong>in</strong>itializers <strong>in</strong> <strong>C#</strong> 3 take the latter approach. Suppose wewant to populate a list of str<strong>in</strong>gs with some names—here’s the <strong>C#</strong> 2 code (on the left)and the close equivalent <strong>in</strong> <strong>C#</strong> 3 (on the right):List names = new List();names.Add("Holly");names.Add("Jon");names.Add("Tom");names.Add("Rob<strong>in</strong>");names.Add("William");var names = new List{"Holly", "Jon", "Tom","Rob<strong>in</strong>", "William"};Just as with object <strong>in</strong>itializers, you can specify constructor parameters if you want, oruse a parameterless constructor either explicitly or implicitly. Also as before, the decisionabout how much whitespace to use is entirely yours—<strong>in</strong> real code (where there’ssignificantly more room than <strong>in</strong> a book), I might well have put the entire <strong>C#</strong> 3 statementon one l<strong>in</strong>e. The use of implicit typ<strong>in</strong>g here was partly for space reasons—thenames variable could equally well have been declared explicitly. Reduc<strong>in</strong>g the numberof l<strong>in</strong>es of code (without reduc<strong>in</strong>g readability) is nice, but there are two bigger benefitsof collection <strong>in</strong>itializers:■■The “create and <strong>in</strong>itialize” part counts as a s<strong>in</strong>gle expression.There’s a lot less clutter <strong>in</strong> the code.The first po<strong>in</strong>t becomes important when you want to use a collection as either anargument to a method or as one element <strong>in</strong> a larger collection. That happens relativelyrarely (although often enough to still be useful)—but the second po<strong>in</strong>t is the real reasonthis is a killer feature <strong>in</strong> my view. If you look at the code on the right, you see the<strong>in</strong>formation you need, with each piece of <strong>in</strong>formation written only once. The variablename occurs once, the type be<strong>in</strong>g used occurs once, and each of the elements of the<strong>in</strong>itialized collection appears once. It’s all extremely simple, and much clearer thanthe <strong>C#</strong> 2 code, which conta<strong>in</strong>s a lot of fluff around the useful bits.Collection <strong>in</strong>itializers aren’t limited to just lists. You can use them with any typethat implements IEnumerable, as long as it has an appropriate public Add method foreach element <strong>in</strong> the <strong>in</strong>itializer. You can use an Add method with more than one parameterby putt<strong>in</strong>g the values with<strong>in</strong> another set of braces. The most common use for thisis creat<strong>in</strong>g dictionaries. For example, if we wanted a dictionary mapp<strong>in</strong>g names toages, we could use the follow<strong>in</strong>g code:Dictionary nameAgeMap = new Dictionary{{"Holly", 31},Licensed to Rhona Hadida


220 CHAPTER 8 Cutt<strong>in</strong>g fluff with a smart compiler{"Jon", 31},{"Tom", 4}};In this case, the Add(str<strong>in</strong>g, <strong>in</strong>t) method would be called three times. If multipleAdd overloads are available, different elements of the <strong>in</strong>itializer can call different overloads.If no compatible overload is available for a specified element, the code will failto compile. There are two <strong>in</strong>terest<strong>in</strong>g po<strong>in</strong>ts about the design decision here:■■The fact that the type has to implement IEnumerable is never used by the compiler.The Add method is only found by name—there’s no <strong>in</strong>terface requirementspecify<strong>in</strong>g it.These are both pragmatic decisions. Requir<strong>in</strong>g IEnumerable to be implemented is areasonable attempt to check that the type really is a collection of some sort, and us<strong>in</strong>gany public overload of the Add method (rather than requir<strong>in</strong>g an exact signature)allows for simple <strong>in</strong>itializations such as the earlier dictionary example. Nonpublicoverloads, <strong>in</strong>clud<strong>in</strong>g those that explicitly implement an <strong>in</strong>terface, are not used. This isa slightly different situation from object <strong>in</strong>itializers sett<strong>in</strong>g properties, where <strong>in</strong>ternalproperties are available too (with<strong>in</strong> the same assembly, of course).An early draft of the specification required ICollection to be implemented<strong>in</strong>stead, and the implementation of the s<strong>in</strong>gle-parameter Add method (as specifiedby the <strong>in</strong>terface) was called rather than allow<strong>in</strong>g different overloads. Thissounds more “pure,” but there are far more types that implement IEnumerable thanICollection—and us<strong>in</strong>g the s<strong>in</strong>gle-parameter Add method would be <strong>in</strong>convenient.For example, <strong>in</strong> our case it would have forced us to explicitly create an<strong>in</strong>stance of a KeyValuePair for each element of the <strong>in</strong>itializer. Sacrific<strong>in</strong>ga bit of academic purity has made the language far more useful <strong>in</strong> real life.POPULATING COLLECTIONS WITHIN OTHER OBJECT INITIALIZERSSo far we’ve only seen collection <strong>in</strong>itializers used <strong>in</strong> a stand-alone fashion to createwhole new collections. They can also be comb<strong>in</strong>ed with object <strong>in</strong>itializers to populateembedded collections. To show this, we’ll go back to our Person example. TheFriends property is read-only, so we can’t create a new collection and specify that asthe collection of friends—but we can add to whatever collection is returned by theproperty’s getter. The way we do this is similar to the syntax we’ve already seen for sett<strong>in</strong>gproperties of embedded objects, but we just specify a collection <strong>in</strong>itializer <strong>in</strong>steadof a sequence of properties.Let’s see this <strong>in</strong> action by creat<strong>in</strong>g another Person <strong>in</strong>stance for Tom, this time withfriends (list<strong>in</strong>g 8.3).List<strong>in</strong>g 8.3Build<strong>in</strong>g up a rich object us<strong>in</strong>g object and collection <strong>in</strong>itializersPerson tom = new Person{Name = "Tom",Age = 4,Calls parameterless constructorSets propertiesdirectlyLicensed to Rhona Hadida


Simplified <strong>in</strong>itialization221Home = { Town="Read<strong>in</strong>g", Country="UK" },Friends ={new Person { Name = "Phoebe" },new Person("Abi"),new Person { Name = "Ethan", Age = 4 },new Person("Ben"){Age = 4,Home = { Town = "Purley", Country="UK" }}}};List<strong>in</strong>g 8.3 uses all the features of object and collection <strong>in</strong>itializers we’ve come across.The ma<strong>in</strong> part of <strong>in</strong>terest is the collection <strong>in</strong>itializer, which itself uses all k<strong>in</strong>ds of differentforms of object <strong>in</strong>itializers <strong>in</strong>ternally. Note that we’re not specify<strong>in</strong>g a type hereas we did with the stand-alone collection creation: we’re not creat<strong>in</strong>g a new collection,just add<strong>in</strong>g to an exist<strong>in</strong>g one.We could have gone further, specify<strong>in</strong>g friends of friends, friends of friends offriends, and so forth. What we couldn’t do with this syntax is specify that Tom is Ben’sfriend—while you’re still <strong>in</strong>itializ<strong>in</strong>g an object, you don’t have access to it. This can beawkward <strong>in</strong> a few cases, but usually isn’t a problem.Collection <strong>in</strong>itialization with<strong>in</strong> object <strong>in</strong>itializers works as a sort of cross betweenstand-alone collection <strong>in</strong>itializers and sett<strong>in</strong>g embedded object properties. For eachelement <strong>in</strong> the collection <strong>in</strong>itializer, the collection property getter (Friends <strong>in</strong> thiscase) is called, and then the appropriate Add method is called on the returned value.The collection isn’t cleared <strong>in</strong> any way before elements are added. For example, if youwere to decide that someone should always be their own friend, and added this tothe list of friends with<strong>in</strong> the Person constructor, us<strong>in</strong>g a collection <strong>in</strong>itializer wouldonly add extra friends.As you can see, the comb<strong>in</strong>ation of collection and object <strong>in</strong>itializers can be used topopulate whole trees of objects. But when and where is this likely to actually happen?8.3.5 Uses of <strong>in</strong>itialization featuresInitializesembeddedobjectInitializes collectionwith further object<strong>in</strong>itializersTry<strong>in</strong>g to p<strong>in</strong> down exactly where these features are useful is rem<strong>in</strong>iscent of be<strong>in</strong>g <strong>in</strong>a Monty Python sketch about the Spanish Inquisition—every time you th<strong>in</strong>k you’vegot a reasonably complete list, another fairly common example pops up. I’ll justmention three examples, which I hope will encourage you to consider where else youmight use them.“CONSTANT” COLLECTIONSIt’s not uncommon for me to want some k<strong>in</strong>d of collection (often a map) that is effectivelyconstant. Of course, it can’t be a constant as far as the <strong>C#</strong> language is concerned,but it can be declared static and read-only, with big warn<strong>in</strong>gs to say that it shouldn’t bechanged. (It’s usually private, so that’s good enough.) Typically, this <strong>in</strong>volves creat<strong>in</strong>ga static constructor and often a helper method, just to populate the map. With <strong>C#</strong> 3’scollection <strong>in</strong>itializers, it’s easy to set the whole th<strong>in</strong>g up <strong>in</strong>l<strong>in</strong>e.Licensed to Rhona Hadida


222 CHAPTER 8 Cutt<strong>in</strong>g fluff with a smart compilerSETTING UP UNIT TESTSWhen writ<strong>in</strong>g unit tests, I frequently want to populate an object just for one test, oftenpass<strong>in</strong>g it <strong>in</strong> as a parameter to the method I’m try<strong>in</strong>g to test at the time. This is particularlycommon with entity classes. Writ<strong>in</strong>g all of the <strong>in</strong>itialization “long-hand” can belongw<strong>in</strong>ded and also hides the essential structure of the object from the reader of thecode, just as XML creation code can often obscure what the document would look likeif you viewed it (appropriately formatted) <strong>in</strong> a text editor. With appropriate <strong>in</strong>dentationof object <strong>in</strong>itializers, the nested structure of the object hierarchy can becomeobvious <strong>in</strong> the very shape of the code, as well as make the values stand out more thanthey would otherwise.PARAMETER ENCAPSULATIONSometimes patterns occur <strong>in</strong> production code that can be aided by <strong>C#</strong> 3’s <strong>in</strong>itializationfeatures. For <strong>in</strong>stance, rather than specify<strong>in</strong>g several parameters to a s<strong>in</strong>gle method,you can sometimes make code more straightforward by collect<strong>in</strong>g the parameterstogether <strong>in</strong> an extra type. The framework ProcessStartInfo type is a good exampleof this—the designers could have overloaded Process.Start with many different setsof parameters, but us<strong>in</strong>g ProcessStartInfo makes everyth<strong>in</strong>g clearer. <strong>C#</strong> 3 allows youto create a ProcessStartInfo and fill <strong>in</strong> all the properties <strong>in</strong> a clearer manner—andyou could even specify it <strong>in</strong>l<strong>in</strong>e <strong>in</strong> a call to Process.Start. In some ways, the methodcall would then act as if it had a lot of default parameters, with the properties provid<strong>in</strong>gthe names of parameters you want to specify. It’s worth consider<strong>in</strong>g this patternwhen you f<strong>in</strong>d yourself us<strong>in</strong>g lots of parameters—it was always a useful technique toknow about, but <strong>C#</strong> 3 makes it that bit more elegant.Of course, there are uses beyond these three <strong>in</strong> ord<strong>in</strong>ary code, and I certa<strong>in</strong>ly don’twant to put you off us<strong>in</strong>g the new features elsewhere. There’s very little reason not touse them, other than possibly confus<strong>in</strong>g developers who aren’t familiar with <strong>C#</strong> 3 yet.You may decide that us<strong>in</strong>g an object <strong>in</strong>itializer just to set one property (as opposed tojust explicitly sett<strong>in</strong>g it <strong>in</strong> a separate statement) is over the top—that’s a matter of aesthetics,and I can’t give you much guidance there. As with implicit typ<strong>in</strong>g, it’s a goodidea to try the code both ways, and learn to predict your own (and your team’s) read<strong>in</strong>gpreferences.So far we’ve looked at a fairly diverse range of features: implement<strong>in</strong>g propertieseasily, simplify<strong>in</strong>g local variable declarations, and populat<strong>in</strong>g objects <strong>in</strong> s<strong>in</strong>gle expressions.In the rema<strong>in</strong>der of this chapter we’ll be gradually br<strong>in</strong>g<strong>in</strong>g these topicstogether, us<strong>in</strong>g more implicit typ<strong>in</strong>g and more object population, and creat<strong>in</strong>g wholetypes without giv<strong>in</strong>g any implementation details.Our next topic appears to be quite similar to collection <strong>in</strong>itializers when you lookat code us<strong>in</strong>g it. I mentioned earlier that array <strong>in</strong>itialization was a bit clumsy <strong>in</strong> <strong>C#</strong> 1and 2. I’m sure it won’t surprise you to learn that it’s been streaml<strong>in</strong>ed for <strong>C#</strong> 3. Let’stake a look.Licensed to Rhona Hadida


Implicitly typed arrays2238.4 Implicitly typed arraysIn <strong>C#</strong> 1 and 2, <strong>in</strong>itializ<strong>in</strong>g an array as part of a variable declaration and <strong>in</strong>itializationstatement was quite neat—but if you wanted to do it anywhere else, you had to specifythe exact array type <strong>in</strong>volved. So for example, this compiles without any problem:str<strong>in</strong>g[] names = {"Holly", "Jon", "Tom", "Rob<strong>in</strong>", "William"};This doesn’t work for parameters, though: suppose we want to make a call toMyMethod, declared as void MyMethod(str<strong>in</strong>g[] names). This code won’t work:MyMethod({"Holly", "Jon", "Tom", "Rob<strong>in</strong>", "William"});Instead, you have to tell the compiler what type of array you want to <strong>in</strong>itialize:MyMethod(new str<strong>in</strong>g[] {"Holly", "Jon", "Tom", "Rob<strong>in</strong>", "William"});<strong>C#</strong> 3 allows someth<strong>in</strong>g <strong>in</strong> between:MyMethod(new[] {"Holly", "Jon", "Tom", "Rob<strong>in</strong>", "William"});Clearly the compiler needs to work out what type of array to use. It starts by form<strong>in</strong>g aset conta<strong>in</strong><strong>in</strong>g all the compile-time types of the expressions <strong>in</strong>side the braces. Ifthere’s exactly one type <strong>in</strong> that set that all the others can be implicitly converted to,that’s the type of the array. Otherwise, (or if all the values are typeless expressions,such as constant null values or anonymous methods, with no casts) the code won’tcompile. Note that only the types of the expressions are considered as candidates forthe overall array type. This means that occasionally you might have to explicitly cast avalue to a less specific type. For <strong>in</strong>stance, this won’t compile:new[] { new MemoryStream(), new Str<strong>in</strong>gWriter() }There’s no conversion from MemoryStream to Str<strong>in</strong>gWriter, or vice versa. Both areimplicitly convertible to object and IDisposable, but the compiler only considers typesthat are <strong>in</strong> the orig<strong>in</strong>al set produced by the expressions themselves. If we change oneof the expressions <strong>in</strong> this situation so that its type is either object or IDisposable, thecode compiles:new[] { (IDisposable) new MemoryStream(), new Str<strong>in</strong>gWriter() }The type of this last expression is implicitly IDisposable[]. Of course, at that po<strong>in</strong>tyou might as well explicitly state the type of the array just as you would <strong>in</strong> <strong>C#</strong> 1 and 2,to make it clearer what you’re try<strong>in</strong>g to achieve.Compared with the earlier features, implicitly typed arrays are a bit of an anticlimax.I f<strong>in</strong>d it hard to get particularly excited about them, even though they do makelife that bit simpler <strong>in</strong> cases where an array is passed as a parameter. You could wellargue that this feature doesn’t prove itself <strong>in</strong> the “usefulness versus complexity” balanceused by the language designers to decide what should be part of the language.The designers haven’t gone mad, however—there’s one important situation <strong>in</strong>which this implicit typ<strong>in</strong>g is absolutely crucial. That’s when you don’t know (and<strong>in</strong>deed can’t know) the name of the type of the elements of the array. How can youpossibly get <strong>in</strong>to this peculiar state? Read on…Licensed to Rhona Hadida


224 CHAPTER 8 Cutt<strong>in</strong>g fluff with a smart compiler8.5 Anonymous typesImplicit typ<strong>in</strong>g, object and collection <strong>in</strong>itializers, and implicit array typ<strong>in</strong>g are all useful<strong>in</strong> their own right, to a greater or lesser extent. However, they all really serve ahigher purpose—they make it possible to work with our f<strong>in</strong>al feature of the chapter,anonymous types. They, <strong>in</strong> turn, serve a higher purpose—LINQ.8.5.1 First encounters of the anonymous k<strong>in</strong>dIt’s much easier to expla<strong>in</strong> anonymous types when you’ve already got some idea ofwhat they are through an example. I’m sorry to say that without the use of extensionmethods and lambda expressions, the examples <strong>in</strong> this section are likely to be a littlecontrived, but there’s a dist<strong>in</strong>ct chicken-and-egg situation here: anonymous types aremost useful with<strong>in</strong> the context of the more advanced features, but we need to understandthe build<strong>in</strong>g blocks before we can see much of the bigger picture. Stick with it—it will make sense <strong>in</strong> the long run, I promise.Let’s pretend we didn’t have the Person class, and the only properties we caredabout were the name and age. List<strong>in</strong>g 8.4 shows how we could still build objects withthose properties, without ever declar<strong>in</strong>g a type.List<strong>in</strong>g 8.4Creat<strong>in</strong>g objects of an anonymous type with Name and Age propertiesvar tom = new { Name = "Tom", Age = 4 };var holly = new { Name = "Holly", Age = 31 };var jon = new { Name = "Jon", Age = 31 };Console.WriteL<strong>in</strong>e("{0} is {1} years old", jon.Name, jon.Age);As you can tell from list<strong>in</strong>g 8.4, the syntax for <strong>in</strong>itializ<strong>in</strong>g an anonymous type is similarto the object <strong>in</strong>itializers we saw <strong>in</strong> section 8.3.2—it’s just that the name of the type ismiss<strong>in</strong>g between new and the open<strong>in</strong>g brace. We’re us<strong>in</strong>g implicitly typed local variablesbecause that’s all we can use—we don’t have a type name to declare the variablewith. As you can see from the last l<strong>in</strong>e, the type has properties for the Name and Age,both of which can be read and which will have the values specified <strong>in</strong> the anonymousobject <strong>in</strong>itializer used to create the <strong>in</strong>stance—so <strong>in</strong> this case the output is “Jon is 31 yearsold.” The properties have the same types as the expressions <strong>in</strong> the <strong>in</strong>itializers—str<strong>in</strong>gfor Name, and <strong>in</strong>t for Age. Just as <strong>in</strong> normal object <strong>in</strong>itializers, the expressions used <strong>in</strong>anonymous object <strong>in</strong>itializers can call methods or constructors, fetch properties, performcalculations—whatever you need to do.You may now be start<strong>in</strong>g to see why implicitly typed arrays are important. Supposewe want to create an array conta<strong>in</strong><strong>in</strong>g the whole family, and then iterate through it towork out the total age. List<strong>in</strong>g 8.5 does just that—and demonstrates a few other <strong>in</strong>terest<strong>in</strong>gfeatures of anonymous types at the same time.List<strong>in</strong>g 8.5var family = new[]{Populat<strong>in</strong>g an array us<strong>in</strong>g anonymous types and then f<strong>in</strong>d<strong>in</strong>g the total ageBUses an implicitly typed array <strong>in</strong>itializerLicensed to Rhona Hadida


Anonymous types225new { Name = "Holly", Age = 31 },new { Name = "Jon", Age = 31 },new { Name = "Tom", Age = 4 },new { Name = "Rob<strong>in</strong>", Age = 1 },new { Name = "William", Age = 1 }};<strong>in</strong>t totalAge = 0;Uses implicitforeach (var person <strong>in</strong> family)typ<strong>in</strong>g for person{totalAge += person.Age; E Sums ages}Console.WriteL<strong>in</strong>e("Total age: {0}", totalAge);Putt<strong>in</strong>g together list<strong>in</strong>g 8.5 and what we learned about implicitly typed arrays <strong>in</strong> section8.4, we can deduce someth<strong>in</strong>g very important: all the people <strong>in</strong> the family are of thesame type. If each use of an anonymous object <strong>in</strong>itializer <strong>in</strong> C created a new type, therewouldn’t be any appropriate type for the array declared at B. With<strong>in</strong> any given assembly,the compiler treats two anonymous object <strong>in</strong>itializers as the same type if there arethe same number of properties, with the same names and types, and they appear <strong>in</strong>the same order. In other words, if we swapped the Name and Age properties <strong>in</strong> one ofthe <strong>in</strong>itializers, there’d be two different types <strong>in</strong>volved—likewise if we <strong>in</strong>troduced anextra property <strong>in</strong> one l<strong>in</strong>e, or used a long <strong>in</strong>stead of an <strong>in</strong>t for the age of one person,another anonymous type would have been <strong>in</strong>troduced.NOTEImplementation detail: how many types?—If you ever decide to look at the IL(or decompiled <strong>C#</strong>) for an anonymous type, be aware that although twoanonymous object <strong>in</strong>itializers with the same property names <strong>in</strong> the sameorder but us<strong>in</strong>g different property types will produce two different types,they’ll actually be generated from a s<strong>in</strong>gle generic type. The generic typeis parameterized, but the closed, constructed types will be different becausethey’ll be given different type arguments for the different <strong>in</strong>itializers.Notice that we’re able to use a foreach statement to iterate over the array just as wewould any other collection. The type <strong>in</strong>volved is <strong>in</strong>ferred D, and the type of theperson variable is the same anonymous type we’ve used <strong>in</strong> the array. Aga<strong>in</strong>, we canuse the same variable for different <strong>in</strong>stances because they’re all of the same type.List<strong>in</strong>g 8.5 also proves that the Age property really is strongly typed as an <strong>in</strong>t—otherwise try<strong>in</strong>g to sum the ages E wouldn’tcompile. The compiler knows about the anonymoustype, and Visual Studio 2008 is evenwill<strong>in</strong>g to share the <strong>in</strong>formation via tooltips,just <strong>in</strong> case you’re uncerta<strong>in</strong>. Figure 8.3 showsthe result of hover<strong>in</strong>g over the person part ofthe person.Age expression from list<strong>in</strong>g 8.5.Now that we’ve seen anonymous types <strong>in</strong>action, let’s go back and look at what the compileris actually do<strong>in</strong>g for us.CDUses sameanonymous typefive timesFigure 8.3 Hover<strong>in</strong>g over a variablethat is declared (implicitly) to be of ananonymous type shows the details ofthat anonymous type.Licensed to Rhona Hadida


226 CHAPTER 8 Cutt<strong>in</strong>g fluff with a smart compiler8.5.2 Members of anonymous typesAnonymous types are created by the compiler and <strong>in</strong>cluded <strong>in</strong> the compiled assembly<strong>in</strong> the same way as the extra types for anonymous methods and iterator blocks. The CLRtreats them as perfectly ord<strong>in</strong>ary types, and so they are—if you later move from an anonymoustype to a normal, manually coded type with the behavior described <strong>in</strong> this section,you shouldn’t see anyth<strong>in</strong>g change. Anonymous types conta<strong>in</strong> the follow<strong>in</strong>g members:■ A constructor tak<strong>in</strong>g all the <strong>in</strong>itialization values. The parameters are <strong>in</strong> thesame order as they were specified <strong>in</strong> the anonymous object <strong>in</strong>itializer, and havethe same names and types.■ Public read-only properties.■ Private read-only fields back<strong>in</strong>g the properties.■ Overrides for Equals, GetHashCode, and ToStr<strong>in</strong>g.That’s it—there are no implemented <strong>in</strong>terfaces, no clon<strong>in</strong>g or serializationcapabilities—just a constructor, some properties and the normal methods from object.The constructor and the properties do the obvious th<strong>in</strong>gs. Equality between two<strong>in</strong>stances of the same anonymous type is determ<strong>in</strong>ed <strong>in</strong> the natural manner, compar<strong>in</strong>geach property value <strong>in</strong> turn us<strong>in</strong>g the property type’s Equals method. The hashcode generation is similar, call<strong>in</strong>g GetHashCode on each property value <strong>in</strong> turn andcomb<strong>in</strong><strong>in</strong>g the results. The exact method for comb<strong>in</strong><strong>in</strong>g the various hash codestogether to form one “composite” hash is unspecified, and you shouldn’t write codethat depends on it anyway—all you need to be confident <strong>in</strong> is that two equal <strong>in</strong>stanceswill return the same hash, and two unequal <strong>in</strong>stances will probably return differenthashes. All of this only works if the Equals and GetHashCode implementations of allthe different types <strong>in</strong>volved as properties conform to the normal rules, of course.Note that because the properties are read-only, all anonymous types are immutableso long as the types used for their properties are immutable. This provides you with allthe normal benefits of immutability—be<strong>in</strong>g able to pass values to methods withoutfear of them chang<strong>in</strong>g, simple shar<strong>in</strong>g of data across threads, and so forth.We’re almost done with anonymous types now. However, there’s one slight wr<strong>in</strong>klestill to talk about—a shortcut for a situation that is fairly common <strong>in</strong> LINQ.8.5.3 Projection <strong>in</strong>itializersThe anonymous object <strong>in</strong>itializers we’ve seen so far have all been lists of name/valuepairs—Name = "Jon", Age=31 and the like. As it happens, I’ve always used constantsbecause they make for smaller examples, but <strong>in</strong> real code you often want to copy propertiesfrom an exist<strong>in</strong>g object. Sometimes you’ll want to manipulate the values <strong>in</strong> someway, but often a straight copy is enough.Aga<strong>in</strong>, without LINQ it’s hard to give conv<strong>in</strong>c<strong>in</strong>g examples of this, but let’s go backto our Person class, and just suppose we had a good reason to want to convert a collectionof Person <strong>in</strong>stances <strong>in</strong>to a similar collection where each element has just a name,and a flag to say whether or not that person is an adult. Given an appropriate personvariable, we could use someth<strong>in</strong>g like this:new { Name = person.Name, IsAdult = (person.Age >= 18) }Licensed to Rhona Hadida


Anonymous types227That certa<strong>in</strong>ly works, and for just a s<strong>in</strong>gle property the syntax for sett<strong>in</strong>g the name(the part <strong>in</strong> bold) is not too clumsy—but if you were copy<strong>in</strong>g several properties itwould get tiresome. <strong>C#</strong> 3 provides a shortcut: if you don’t specify the property name,but just the expression to evaluate for the value, it will use the last part of the expressionas the name—provided it’s a simple field or property. This is called a projection <strong>in</strong>itializer.It means we can rewrite the previous code asnew { person.Name, IsAdult = (person.Age >= 18) }It’s quite common for all the bits of an anonymous object <strong>in</strong>itializer to be projection<strong>in</strong>itializers—it typically happens when you’re tak<strong>in</strong>g some properties from one objectand some properties from another, often as part of a jo<strong>in</strong> operation. Anyway, I’m gett<strong>in</strong>gahead of myself. List<strong>in</strong>g 8.6 shows the previous code <strong>in</strong> action, us<strong>in</strong>g theList.ConvertAll method and an anonymous delegate.List<strong>in</strong>g 8.6Transformation from Person to a name and adulthood flagList family = new List{new Person {Name="Holly", Age=31},new Person {Name="Jon", Age=31},new Person {Name="Tom", Age=4},new Person {Name="Rob<strong>in</strong>", Age=1},new Person {Name="William", Age=1}};var converted = family.ConvertAll(delegate(Person person){ return new { person.Name, IsAdult = (person.Age >= 18) }; });foreach (var person <strong>in</strong> converted){Console.WriteL<strong>in</strong>e("{0} is an adult? {1}",person.Name, person.IsAdult);}In addition to the use of a projection <strong>in</strong>itializer for the Name property, list<strong>in</strong>g 8.6 showsthe value of delegate type <strong>in</strong>ference and anonymous methods. Without them, wecouldn’t have reta<strong>in</strong>ed our strong typ<strong>in</strong>g of converted, as we wouldn’t have been ableto specify what the TOutput type parameter of Converter should be. As it is, we caniterate through the new list and access the Name and IsAdult properties as if we wereus<strong>in</strong>g any other type.Don’t spend too long th<strong>in</strong>k<strong>in</strong>g about projection <strong>in</strong>itializers at this po<strong>in</strong>t—theimportant th<strong>in</strong>g is to be aware that they exist, so you won’t get confused when you seethem later. In fact, that advice applies to this entire section on anonymous types—sowithout go<strong>in</strong>g <strong>in</strong>to details, let’s look at why they’re present at all.8.5.4 What’s the po<strong>in</strong>t?I hope you’re not feel<strong>in</strong>g cheated at this po<strong>in</strong>t, but I sympathize if you do. Anonymoustypes are a fairly complex solution to a problem we haven’t really encountered yet…except that I bet you have seen part of the problem before, really.Licensed to Rhona Hadida


228 CHAPTER 8 Cutt<strong>in</strong>g fluff with a smart compilerIf you’ve ever done any real-life work <strong>in</strong>volv<strong>in</strong>g databases, you’ll know that you don’talways want all of the data that’s available on all the rows that match your query criteria.Often it’s not a problem to fetch more than you need, but if you only need two columnsout of the fifty <strong>in</strong> the table, you wouldn’t bother to select all fifty, would you?The same problem occurs <strong>in</strong> nondatabase code. Suppose we have a class that readsa log file and produces a sequence of log l<strong>in</strong>es with many fields. Keep<strong>in</strong>g all of the<strong>in</strong>formation might be far too memory <strong>in</strong>tensive if we only care about a couple of fieldsfrom the log. LINQ lets you filter that <strong>in</strong>formation easily.But what’s the result of that filter<strong>in</strong>g? How can we keep some data and discard therest? How can we easily keep some derived data that isn’t directly represented <strong>in</strong> the orig<strong>in</strong>alform? How can we comb<strong>in</strong>e pieces of data that may not <strong>in</strong>itially have been consciouslyassociated, or that may only have a relationship <strong>in</strong> a particular situation?Effectively, we want a new data type—but manually creat<strong>in</strong>g such a type <strong>in</strong> every situationis tedious, particularly when you have tools such as LINQ available that make therest of the process so simple. Figure 8.4 shows the three elements that make anonymoustypes a powerful feature.If you f<strong>in</strong>d yourself creat<strong>in</strong>g a type that is only used <strong>in</strong> a s<strong>in</strong>gle method, and that onlyconta<strong>in</strong>s fields and trivial properties, consider whether an anonymous type would beappropriate. Even if you’re not develop<strong>in</strong>g <strong>in</strong> <strong>C#</strong> 3 yet, keep an eye out for places whereit might be worth us<strong>in</strong>g an anonymous type when you upgrade. The more you th<strong>in</strong>kabout this sort of feature, the easier thedecisions about when to use it will become.I suspect that you’ll f<strong>in</strong>d that most of thetimes when you f<strong>in</strong>d yourself lean<strong>in</strong>gtoward anonymous types, you could alsouse LINQ to help you—look out for that too.If you f<strong>in</strong>d yourself us<strong>in</strong>g the samesequence of properties for the same purpose<strong>in</strong> several places, however, you mightwant to consider creat<strong>in</strong>g a normal typefor the purpose, even if it still just conta<strong>in</strong>strivial properties. Anonymous types naturally“<strong>in</strong>fect” whatever code they’re used <strong>in</strong>with implicit typ<strong>in</strong>g—which is often f<strong>in</strong>e,but can be a nuisance at other times. Aswith the previous features, use anonymoustypes when they genu<strong>in</strong>ely make the codesimpler to work with, not just becausethey’re new and cool.8.6 SummaryTailor<strong>in</strong>g dataencapsulation toone situationAvoid<strong>in</strong>g excessivedata accumulationAnonymoustypesAvoid<strong>in</strong>g manual"turn the handle"cod<strong>in</strong>gFigure 8.4 Anonymous types allow you to keepjust the data you need for a particular situation,<strong>in</strong> a form that is tailored to that situation, withoutthe tedium of writ<strong>in</strong>g a fresh type each time.What a seem<strong>in</strong>gly mixed bag of features! We’ve seen four features that are quite similar,at least <strong>in</strong> syntax: object <strong>in</strong>itializers, collection <strong>in</strong>itializers, implicitly typed arrays,and anonymous types. The other two features—automatic properties and implicitlyLicensed to Rhona Hadida


Summary229typed local variables—are somewhat different. Likewise, most of the features wouldhave been useful <strong>in</strong>dividually <strong>in</strong> <strong>C#</strong> 2, whereas implicitly typed arrays and anonymoustypes only pay back the cost of learn<strong>in</strong>g about them when the rest of the <strong>C#</strong> 3 featuresare brought <strong>in</strong>to play.So what do these features really have <strong>in</strong> common? They all relieve the developer oftedious cod<strong>in</strong>g. I’m sure you don’t enjoy writ<strong>in</strong>g trivial properties any more than I do, orsett<strong>in</strong>g several properties, one at a time, us<strong>in</strong>g a local variable—particularly whenyou’re try<strong>in</strong>g to build up a collection of similar objects. Not only do the new featuresof <strong>C#</strong> 3 make it easier to write the code, they also make it easier to read it too, at leastwhen they’re applied sensibly.In our next chapter we’ll look at a major new language feature, along with a frameworkfeature it provides direct support for. If you thought anonymous methods madecreat<strong>in</strong>g delegates easy, just wait until you see lambda expressions…Licensed to Rhona Hadida


Lambda expressionsand expression treesThis chapter covers■■■■■■■Lambda expression syntaxConversions from lambdas to delegatesExpression tree framework classesConversions from lambdas to expression treesWhy expression trees matterChanges to type <strong>in</strong>ferenceChanges to overload resolutionIn chapter 5 we saw how <strong>C#</strong> 2 made delegates much easier to use due to implicitconversions of method groups, anonymous methods, and parameter covariance.This is enough to make event subscription significantly simpler and more readable,but delegates <strong>in</strong> <strong>C#</strong> 2 are still too bulky to be used all the time: a page of code full ofanonymous methods is quite pa<strong>in</strong>ful to read, and you certa<strong>in</strong>ly wouldn’t want tostart putt<strong>in</strong>g multiple anonymous methods <strong>in</strong> a s<strong>in</strong>gle statement on a regular basis.One of the fundamental build<strong>in</strong>g blocks of LINQ is the ability to create pipel<strong>in</strong>esof operations, along with any state required by those operations. These operations230Licensed to Rhona Hadida


Lambda expressions and expression trees231express all k<strong>in</strong>ds of logic about data: how to filter it, how to order it, how to jo<strong>in</strong> differentdata sources together, and much more. When LINQ queries are executed “<strong>in</strong> process,”those operations are usually represented by delegates.Statements conta<strong>in</strong><strong>in</strong>g several delegates are common when manipulat<strong>in</strong>g data withLINQ to Objects, 1 and lambda expressions <strong>in</strong> <strong>C#</strong> 3 make all of this possible without sacrific<strong>in</strong>greadability. (While I’m mention<strong>in</strong>g readability, this chapter uses lambda expressionand lambda <strong>in</strong>terchangeably; as I need to refer to normal expressions quite a lot, ithelps to use the short version <strong>in</strong> many cases.)NOTEIt’s all Greek to me—The term lambda expression comes from lambda calculus,also written as -calculus, where is the Greek letter lambda. This isan area of math and computer science deal<strong>in</strong>g with def<strong>in</strong><strong>in</strong>g and apply<strong>in</strong>gfunctions. It’s been around for a long time and is the basis of functionallanguages such as ML. The good news is that you don’t need toknow lambda calculus to use lambda expressions <strong>in</strong> <strong>C#</strong> 3.Lambda expressions and expression treesExecut<strong>in</strong>g delegates is only part of the LINQ story. To use databases and other queryeng<strong>in</strong>es efficiently, we need a different representation of the operations <strong>in</strong> the pipel<strong>in</strong>e:a way of treat<strong>in</strong>g code as data that can be exam<strong>in</strong>ed programmatically. The logicwith<strong>in</strong> the operations can then be transformed <strong>in</strong>to a different form, such as a webservice call, a SQL or LDAP query—whatever is appropriate.Although it’s possible to build up representations of queries <strong>in</strong> a particular API, it’susually tricky to read and sacrifices a lot of compiler support. This is where lambdassave the day aga<strong>in</strong>: not only can they be used to create delegate <strong>in</strong>stances, but the <strong>C#</strong>compiler can also transform them <strong>in</strong>to expression trees—data structures represent<strong>in</strong>gthe logic of the lambda expressions so that other code can exam<strong>in</strong>e it. In short,lambda expressions are the idiomatic way of represent<strong>in</strong>g the operations <strong>in</strong> LINQ datapipel<strong>in</strong>es—but we’ll be tak<strong>in</strong>g th<strong>in</strong>gs one step at a time, exam<strong>in</strong><strong>in</strong>g them <strong>in</strong> a fairly isolatedway before we embrace the whole of LINQ.In this chapter we’ll look at both ways of us<strong>in</strong>g lambda expressions, although forthe moment our coverage of expression trees will be relatively basic—we’re not go<strong>in</strong>gto actually create any SQL just yet. However, with the theory under your belt youshould be relatively comfortable with lambda expressions and expression trees by thetime we hit the really impressive stuff <strong>in</strong> chapter 12.In the f<strong>in</strong>al part of this chapter, we’ll exam<strong>in</strong>e how type <strong>in</strong>ference has changed for<strong>C#</strong> 3, mostly due to lambdas with implicit parameter types. This is a bit like learn<strong>in</strong>ghow to tie shoelaces: far from excit<strong>in</strong>g, but without this ability you’ll trip over yourselfwhen you start runn<strong>in</strong>g.Let’s beg<strong>in</strong> by see<strong>in</strong>g what lambda expressions look like. We’ll start with an anonymousmethod and gradually transform it <strong>in</strong>to shorter and shorter forms.1LINQ to Objects is the LINQ provider <strong>in</strong> .NET 3.5 that handles sequences of data with<strong>in</strong> the same process. Bycontrast, providers such as LINQ to SQL offload the work to other “out of process” systems—databases, forexample.Licensed to Rhona Hadida


232 CHAPTER 9 Lambda expressions and expression trees9.1 Lambda expressions as delegatesIn many ways, lambda expressions can be seen as an evolution of anonymous methodsfrom <strong>C#</strong> 2. There’s almost noth<strong>in</strong>g that an anonymous method can do that can’t be doneus<strong>in</strong>g a lambda expression, and it’s almost always more readable and compact us<strong>in</strong>glambdas. In particular, the behavior of captured variables is exactly the same <strong>in</strong> lambdaexpressions as <strong>in</strong> anonymous methods. In their most explicit form, not much differenceexists between the two—but lambda expressions have a lot of shortcuts available to makethem compact <strong>in</strong> common situations. Like anonymous methods, lambda expressionshave special conversion rules—the type of the expression isn’t a delegate type <strong>in</strong> itself,but it can be converted <strong>in</strong>to a delegate <strong>in</strong>stance <strong>in</strong> various ways, both implicitly andexplicitly. The term anonymous function covers anonymous methods and lambda expressions—<strong>in</strong>many cases the same conversion rules apply to both of them.We’re go<strong>in</strong>g to start with a very simple example, <strong>in</strong>itially expressed as an anonymousmethod. We’ll create a delegate <strong>in</strong>stance that takes a str<strong>in</strong>g parameter and returns an<strong>in</strong>t (which is the length of the str<strong>in</strong>g). First we need to choose a delegate type to use;fortunately, .NET 3.5 comes with a whole family of generic delegate types to help us out.9.1.1 Prelim<strong>in</strong>aries: <strong>in</strong>troduc<strong>in</strong>g the Func delegate typesThere are five generic Func delegate types <strong>in</strong> the System namespace of .NET 3.5.There’s noth<strong>in</strong>g special about Func—it’s just handy to have some predef<strong>in</strong>ed generictypes that are capable of handl<strong>in</strong>g many situations. Each delegate signature takesbetween zero and four parameters, the types of which are specified as type parameters.The last type parameter is used for the return type <strong>in</strong> each case. Here are the signaturesof all the Func delegate types:public delegate TResult Func()public delegate TResult Func(T arg)public delegate TResult Func(T1 arg1, T2 arg2)public delegate TResult Func(T1 arg1, T2 arg2, T3 arg3)public delegate TResult Func(T1 arg1, T2 arg2, T3 arg3, T4 arg4)For example, Func is equivalent to a delegate type of the formdelegate <strong>in</strong>t SomeDelegate(str<strong>in</strong>g arg1, double arg2)The Action set of delegates provide the equivalent functionality when you want avoid return type. The s<strong>in</strong>gle parameter form of Action existed <strong>in</strong> .NET 2.0, but the restare new to .NET 3.5. For our example we need a type that takes a str<strong>in</strong>g parameterand returns an <strong>in</strong>t, so we’ll use Func.9.1.2 First transformation to a lambda expressionNow that we know the delegate type, we can use an anonymous method to create ourdelegate <strong>in</strong>stance. List<strong>in</strong>g 9.1 shows this, along with execut<strong>in</strong>g the delegate <strong>in</strong>stanceafterward so we can see it work<strong>in</strong>g.Licensed to Rhona Hadida


Lambda expressions as delegates233List<strong>in</strong>g 9.1Us<strong>in</strong>g an anonymous method to create a delegate <strong>in</strong>stanceFunc returnLength;returnLength = delegate (str<strong>in</strong>g text) { return text.Length; };Console.WriteL<strong>in</strong>e (returnLength("Hello"));List<strong>in</strong>g 9.1 pr<strong>in</strong>ts “5,” just as we’d expect it to. I’ve separated out the declaration ofreturnLength from the assignment to it so we can keep it on one l<strong>in</strong>e—it’s easier tokeep track of that way. The anonymous method expression is the part <strong>in</strong> bold, andthat’s the part we’re go<strong>in</strong>g to convert <strong>in</strong>to a lambda expression.The most long-w<strong>in</strong>ded form of a lambda expression is this:(explicitly-typed-parameter-list) => { statements }The => part is new to <strong>C#</strong> 3 and tells the compiler that we’re us<strong>in</strong>g a lambda expression.Most of the time lambda expressions are used with a delegate type that has a nonvoidreturn type—the syntax is slightly less <strong>in</strong>tuitive when there isn’t a result. This isanother <strong>in</strong>dication of the changes <strong>in</strong> idiom between <strong>C#</strong> 1 and <strong>C#</strong> 3. In <strong>C#</strong> 1, delegateswere usually used for events and rarely returned anyth<strong>in</strong>g. Although lambda expressionscerta<strong>in</strong>ly can be used <strong>in</strong> this way (and we’ll show an example of this later), muchof their elegance comes from the shortcuts that are available when they need toreturn a value.With the explicit parameters and statements <strong>in</strong> braces, this version looks very similarto an anonymous method. List<strong>in</strong>g 9.2 is equivalent to list<strong>in</strong>g 9.1 but uses a lambdaexpression.List<strong>in</strong>g 9.2A long-w<strong>in</strong>ded first lambda expression, similar to an anonymous methodFunc returnLength;returnLength = (str<strong>in</strong>g text) => { return text.Length; };Console.WriteL<strong>in</strong>e (returnLength("Hello"));Aga<strong>in</strong>, I’ve used bold to <strong>in</strong>dicate the expression used to create the delegate <strong>in</strong>stance.When read<strong>in</strong>g lambda expressions, it helps to th<strong>in</strong>k of the => part as “goes to”—so theexample <strong>in</strong> list<strong>in</strong>g 9.2 could be read as “text goes to text.Length.” As this is the onlypart of the list<strong>in</strong>g that is <strong>in</strong>terest<strong>in</strong>g for a while, I’ll show it alone from now on. You canreplace the bold text from list<strong>in</strong>g 9.2 with any of the lambda expressions listed <strong>in</strong> thissection and the result will be the same.The same rules that govern return statements <strong>in</strong> anonymous methods apply tolambdas too: you can’t try to return a value from a lambda expression with a voidreturn type, whereas if there’s a nonvoid return type every code path has to return acompatible value. 2 It’s all pretty <strong>in</strong>tuitive and rarely gets <strong>in</strong> the way.So far, we haven’t saved much space or made th<strong>in</strong>gs particularly easy to read. Let’sstart apply<strong>in</strong>g the shortcuts.2Code paths throw<strong>in</strong>g exceptions don’t need to return a value, of course, and neither do detectable <strong>in</strong>f<strong>in</strong>iteloops.Licensed to Rhona Hadida


234 CHAPTER 9 Lambda expressions and expression trees9.1.3 Us<strong>in</strong>g a s<strong>in</strong>gle expression as the bodyThe form we’ve seen so far uses a full block of code to return the value. This is veryflexible—you can have multiple statements, perform loops, return from differentplaces <strong>in</strong> the block, and so on, just as with anonymous methods. Most of the time,however, you can easily express the whole of the body <strong>in</strong> a s<strong>in</strong>gle expression, the valueof which is the result of the lambda. In these cases, you can specify just that expression,without any braces, return statements, or semicolons. The format then is(explicitly-typed-parameter-list) => expressionIn our case, this means that the lambda expression becomes(str<strong>in</strong>g text) => text.LengthThat’s start<strong>in</strong>g to look simpler already. Now, what about that parameter type? Thecompiler already knows that <strong>in</strong>stances of Func take a s<strong>in</strong>gle str<strong>in</strong>gparameter, so we should be able to just name that parameter…9.1.4 Implicitly typed parameter listsMost of the time, the compiler can guess the parameter types without you explicitlystat<strong>in</strong>g them. In these cases, you can write the lambda expression as(implicitly-typed-parameter-list) => expressionAn implicitly typed parameter list is just a comma-separated list of names, without thetypes. You can’t mix and match for different parameters—either the whole list isexplicitly typed, or it’s all implicitly typed. Also, if any of the parameters are out or refparameters, you are forced to use explicit typ<strong>in</strong>g. In our case, however, it’s f<strong>in</strong>e—soour lambda expression is now just(text) => text.LengthThat’s gett<strong>in</strong>g pretty short now—there’s not a lot more we could get rid of. The parenthesesseem a bit redundant, though.9.1.5 Shortcut for a s<strong>in</strong>gle parameterWhen the lambda expression only needs a s<strong>in</strong>gle parameter, and that parameter canbe implicitly typed, <strong>C#</strong> 3 allows us to omit the parentheses, so it now has this form:parameter-name => expressionThe f<strong>in</strong>al form of our lambda expression is thereforetext => text.LengthYou may be wonder<strong>in</strong>g why there are so many special cases with lambda expressions—none of the rest of the language cares whether a method has one parameter or more,for <strong>in</strong>stance. Well, what sounds like a very particular case actually turns out to beextremely common, and the improvement <strong>in</strong> readability from remov<strong>in</strong>g the parenthesesfrom the parameter list can be significant when there are many lambdas <strong>in</strong> a shortpiece of code.Licensed to Rhona Hadida


Simple examples us<strong>in</strong>g List and events235It’s worth not<strong>in</strong>g that you can put parentheses around the whole lambda expressionif you want to, just like other expressions. Sometimes this helps readability <strong>in</strong> the casewhere you’re assign<strong>in</strong>g the lambda to a variable or property—otherwise, the equals symbolscan get confus<strong>in</strong>g. List<strong>in</strong>g 9.3 shows this <strong>in</strong> the context of our orig<strong>in</strong>al code.List<strong>in</strong>g 9.3A concise lambda expression, bracketed for clarityFunc returnLength;returnLength = (text => text.Length);Console.WriteL<strong>in</strong>e (returnLength("Hello"));At first you may f<strong>in</strong>d list<strong>in</strong>g 9.3 a bit confus<strong>in</strong>g to read, <strong>in</strong> the same way that anonymousmethods appear strange to many developers until they get used to them. When you areused to lambda expressions, however, you can appreciate how concise they are. Itwould be hard to imag<strong>in</strong>e a shorter, clearer way of creat<strong>in</strong>g a delegate <strong>in</strong>stance. 3 Wecould have changed the variable name text to someth<strong>in</strong>g like x, and <strong>in</strong> full LINQ that’soften useful, but longer names give a bit more <strong>in</strong>formation to the reader.The decision of whether to use the short form for the body of the lambda expression,specify<strong>in</strong>g just an expression <strong>in</strong>stead of a whole block, is completely <strong>in</strong>dependentfrom the decision about whether to use explicit or implicit parameters. We happen tohave gone down one route of shorten<strong>in</strong>g the lambda, but we could have started off bymak<strong>in</strong>g the parameters implicit.NOTEHigher-order functions—The body of a lambda expression can itself conta<strong>in</strong>a lambda expression—and it tends to be as confus<strong>in</strong>g as it sounds.Alternatively, the parameter to a lambda expression can be another delegate,which is just as bad. Both of these are examples of higher-orderfunctions. If you enjoy feel<strong>in</strong>g dazed and confused, have a look at someof the sample code <strong>in</strong> the downloadable source. Although I’m be<strong>in</strong>gflippant, this approach is common <strong>in</strong> functional programm<strong>in</strong>g and canbe very useful. It just takes a certa<strong>in</strong> degree of perseverance to get <strong>in</strong>tothe right m<strong>in</strong>d-set.So far we’ve only dealt with a s<strong>in</strong>gle lambda expression, just putt<strong>in</strong>g it <strong>in</strong>to differentforms. Let’s take a look at a few examples to make th<strong>in</strong>gs more concrete before weexam<strong>in</strong>e the details.9.2 Simple examples us<strong>in</strong>g List and eventsWhen we look at extension methods <strong>in</strong> chapter 10, we’ll use lambda expressions all thetime. Until then, List and event handlers give us the best examples. We’ll start offwith lists, us<strong>in</strong>g automatically implemented properties, implicitly typed local variables,and collection <strong>in</strong>itializers for the sake of brevity. We’ll then call methods that take delegateparameters—us<strong>in</strong>g lambda expressions to create the delegates, of course.3That’s not to say it’s impossible, however. Some languages allow closures to be represented as simple blocksof code with a magic variable name to represent the common case of a s<strong>in</strong>gle parameter.Licensed to Rhona Hadida


236 CHAPTER 9 Lambda expressions and expression trees9.2.1 Filter<strong>in</strong>g, sort<strong>in</strong>g, and actions on listsIf you remember the F<strong>in</strong>dAll method on List, it takes a Predicate andreturns a new list with all the elements from the orig<strong>in</strong>al list that match the predicate.The Sort method takes a Comparison and sorts the list accord<strong>in</strong>gly. F<strong>in</strong>ally, theForEach method takes an Action to perform on each element. List<strong>in</strong>g 9.4 useslambda expressions to provide the delegate <strong>in</strong>stance to each of these methods. Thesample data <strong>in</strong> question is just the name and year of release for various films. We pr<strong>in</strong>tout the orig<strong>in</strong>al list, then create and pr<strong>in</strong>t out a filtered list of only old films, then sortand pr<strong>in</strong>t out the orig<strong>in</strong>al list, ordered by name. (It’s <strong>in</strong>terest<strong>in</strong>g to consider howmuch more code would have been required to do the same th<strong>in</strong>g <strong>in</strong> <strong>C#</strong> 1, by the way.)List<strong>in</strong>g 9.4Manipulat<strong>in</strong>g a list of films us<strong>in</strong>g lambda expressionsclass Film{public str<strong>in</strong>g Name { get; set; }public <strong>in</strong>t Year { get; set; }public override str<strong>in</strong>g ToStr<strong>in</strong>g(){return str<strong>in</strong>g.Format("Name={0}, Year={1}", Name, Year);}}...var films = new List{new Film {Name="Jaws", Year=1975},new Film {Name="S<strong>in</strong>g<strong>in</strong>g <strong>in</strong> the Ra<strong>in</strong>", Year=1952},new Film {Name="Some Like It Hot", Year=1959},new Film {Name="The Wizard of Oz", Year=1939},new Film {Name="It's a Wonderful Life", Year=1946},new Film {Name="American Beauty", Year=1999},new Film {Name="High Fidelity", Year=2000},new Film {Name="The Usual Suspects", Year=1995}};Action pr<strong>in</strong>t = film => { Console.WriteL<strong>in</strong>e(film); };films.ForEach(pr<strong>in</strong>t);C Pr<strong>in</strong>ts orig<strong>in</strong>al listfilms.F<strong>in</strong>dAll(film => film.Year < 1960).ForEach(pr<strong>in</strong>t);films.Sort((f1, f2) => f1.Name.CompareTo(f2.Name));films.ForEach(pr<strong>in</strong>t);Createsreusablelist-pr<strong>in</strong>t<strong>in</strong>gdelegateThe first half of list<strong>in</strong>g 9.4 <strong>in</strong>volves just sett<strong>in</strong>g up the data. I would have used an anonymoustype, but it’s relatively tricky to create a generic list from a collection of anonymoustype <strong>in</strong>stances. (You can do it by creat<strong>in</strong>g a generic method that takes an arrayand converts it to a list of the same type, then pass an implicitly typed array <strong>in</strong>to thatmethod. An extension method <strong>in</strong> .NET 3.5 called ToList provides this functionalitytoo, but that would be cheat<strong>in</strong>g as we haven’t looked at extension methods yet!)DCreatesfiltered listEBSortsorig<strong>in</strong>al listLicensed to Rhona Hadida


Simple examples us<strong>in</strong>g List and events237Before we use the newly created list, we create a delegate <strong>in</strong>stance B, which we’lluse to pr<strong>in</strong>t out the items of the list. We use this delegate <strong>in</strong>stance three times, which iswhy I’ve created a variable to hold it rather than us<strong>in</strong>g a separate lambda expressioneach time. It just pr<strong>in</strong>ts a s<strong>in</strong>gle element, but by pass<strong>in</strong>g it <strong>in</strong>to List.ForEach wecan simply dump the whole list to the console.The first list we pr<strong>in</strong>t out C is just the orig<strong>in</strong>al one without any modifications. Wethen f<strong>in</strong>d all the films <strong>in</strong> our list that were made before 1960 and pr<strong>in</strong>t those out D. Thisis done with another lambda expression, which is executed for each film <strong>in</strong> the list—itonly has to determ<strong>in</strong>e whether or not a s<strong>in</strong>gle film should be <strong>in</strong>cluded <strong>in</strong> the filtered list.The source code uses the lambda expression as a method argument, but really the compilerhas created a method like this:private static bool SomeAutoGeneratedName(Film film){return film.Year < 1960;}The method call to F<strong>in</strong>dAll is then effectively this:films.F<strong>in</strong>dAll(new Predicate(SomeAutoGeneratedName))The lambda expression support here is just like the anonymous method support <strong>in</strong><strong>C#</strong> 2; it’s all cleverness on the part of the compiler. (In fact, the Microsoft compiler iseven smarter <strong>in</strong> this case—it realizes it can get away with reus<strong>in</strong>g the delegate<strong>in</strong>stance if the code is ever called aga<strong>in</strong>, so caches it.)The sort E is also performed us<strong>in</strong>g a lambda expression, which compares any twofilms us<strong>in</strong>g their names. I have to confess that explicitly call<strong>in</strong>g CompareTo ourselves isa bit ugly. In the next chapter we’ll see how the OrderBy extension method allows usto express order<strong>in</strong>g <strong>in</strong> a neater way.Let’s look at a different example, this time us<strong>in</strong>g lambda expressions with eventhandl<strong>in</strong>g.9.2.2 Logg<strong>in</strong>g <strong>in</strong> an event handlerIf you th<strong>in</strong>k back to chapter 5, <strong>in</strong> list<strong>in</strong>g 5.9 we saw an easy way of us<strong>in</strong>g anonymousmethods to log which events were occurr<strong>in</strong>g—but we were only able to get away with acompact syntax because we didn’t m<strong>in</strong>d los<strong>in</strong>g the parameter <strong>in</strong>formation. What if wewanted to log both the nature of the event and <strong>in</strong>formation about its sender and arguments?Lambda expressions enable this <strong>in</strong> a very neat way, as shown <strong>in</strong> list<strong>in</strong>g 9.5.List<strong>in</strong>g 9.5Logg<strong>in</strong>g events us<strong>in</strong>g lambda expressionsstatic void Log(str<strong>in</strong>g title, object sender, EventArgs e){Console.WriteL<strong>in</strong>e("Event: {0}", title);Console.WriteL<strong>in</strong>e(" Sender: {0}", sender);Console.WriteL<strong>in</strong>e(" Arguments: {0}", e.GetType());foreach (PropertyDescriptor prop <strong>in</strong>TypeDescriptor.GetProperties(e))Licensed to Rhona Hadida


238 CHAPTER 9 Lambda expressions and expression trees{str<strong>in</strong>g name = prop.DisplayName;object value = prop.GetValue(e);Console.WriteL<strong>in</strong>e(" {0}={1}", name, value);}}...Button button = new Button();button.Text = "Click me";button.Click += (src, e) => { Log("Click", src, e); };button.KeyPress += (src, e) => { Log("KeyPress", src, e); };button.MouseClick += (src, e) => { Log("MouseClick", src, e); };Form form = new Form();form.AutoSize=true;form.Controls.Add(button);Application.Run(form);List<strong>in</strong>g 9.5 uses lambda expressions to pass the event name and parameters to the Logmethod, which logs details of the event. We don’t log the details of the source event,beyond whatever its ToStr<strong>in</strong>g override returns, because there’s an overwhelm<strong>in</strong>gamount of <strong>in</strong>formation associated with controls. However, we use reflection over propertydescriptors to show the details of the EventArgs <strong>in</strong>stance passed to us. Here’ssome sample output when you click the button:Event: ClickSender: System.W<strong>in</strong>dows.Forms.Button, Text: Click meArguments: System.W<strong>in</strong>dows.Forms.MouseEventArgsButton=LeftClicks=1X=53Y=17Delta=0Location={X=53,Y=17}Event: MouseClickSender: System.W<strong>in</strong>dows.Forms.Button, Text: Click meArguments: System.W<strong>in</strong>dows.Forms.MouseEventArgsButton=LeftClicks=1X=53Y=17Delta=0Location={X=53,Y=17}All of this is possible without lambda expressions, of course—but it’s a lot neater than itwould have been otherwise. Now that we’ve seen lambdas be<strong>in</strong>g converted <strong>in</strong>to delegate<strong>in</strong>stances, it’s time to look at expression trees, which represent lambda expressionsas data <strong>in</strong>stead of code.9.3 Expression treesThe idea of “code as data” is an old one, but it hasn’t been used much <strong>in</strong> popular programm<strong>in</strong>glanguages. You could argue that all .NET programs use the concept,because the IL code is treated as data by the JIT, which then converts it <strong>in</strong>to nativeLicensed to Rhona Hadida


Expression trees239code to run on your CPU. That’s quite deeply hidden, though, and while libraries existto manipulate IL programmatically, they’re not widely used.Expression trees <strong>in</strong> .NET 3.5 provide an abstract way of represent<strong>in</strong>g some code as atree of objects. It’s like CodeDOM but operat<strong>in</strong>g at a slightly higher level, and only forexpressions. The primary use of expression trees is <strong>in</strong> LINQ, and later <strong>in</strong> this sectionwe’ll see how crucial expression trees are to the whole LINQ story.<strong>C#</strong> 3 provides built-<strong>in</strong> support for convert<strong>in</strong>g lambda expressions to expressiontrees, but before we cover that let’s explore how they fit <strong>in</strong>to the .NET Frameworkwithout us<strong>in</strong>g any compiler tricks.9.3.1 Build<strong>in</strong>g expression trees programmaticallyExpression trees aren’t as mystical as they sound, although some of the uses they’reput to look like magic. As the name suggests, they’re trees of objects, where each node<strong>in</strong> the tree is an expression <strong>in</strong> itself. Different types of expressions represent the differentoperations that can be performed <strong>in</strong> code: b<strong>in</strong>ary operations, such as addition;unary operations, such as tak<strong>in</strong>g the length of an array; method calls; constructorcalls; and so forth.The System.L<strong>in</strong>q.Expressions namespace conta<strong>in</strong>s the various classes that representexpressions. All of them derive from the Expression class, which is abstract andmostly consists of static factory methods to create <strong>in</strong>stances of other expressionclasses. It exposes two properties, however:■■The Type property represents the .NET type of the evaluated expression—youcan th<strong>in</strong>k of it like a return type. The type of an expression that fetches theLength property of a str<strong>in</strong>g would be <strong>in</strong>t, for example.The NodeType property returns the k<strong>in</strong>d of expression represented, as a memberof the ExpressionType enumeration, with values such as LessThan, Multiply,and Invoke. To use the same example, <strong>in</strong> myStr<strong>in</strong>g.Length the property accesspart would have a node type of MemberAccess.There are many classes derived from Expression, and some of them can have manydifferent node types: B<strong>in</strong>aryExpression, for <strong>in</strong>stance, represents any operation withtwo operands: arithmetic, logic, comparisons, array <strong>in</strong>dex<strong>in</strong>g, and the like. This iswhere the NodeType property is important, as it dist<strong>in</strong>guishes between different k<strong>in</strong>dsof expressions that are represented by the same class.I don’t <strong>in</strong>tend to cover every expression class or node type—there are far toomany, and MSDN does a perfectly good job of expla<strong>in</strong><strong>in</strong>g them. Instead, we’ll try to geta general feel for what you can do with expression trees.Let’s start off by creat<strong>in</strong>g one of the simplest possible expression trees, add<strong>in</strong>g twoconstant <strong>in</strong>tegers together. List<strong>in</strong>g 9.6 creates an expression tree to represent 2+3.List<strong>in</strong>g 9.6 A very simple expression tree, add<strong>in</strong>g 2 and 3Expression firstArg = Expression.Constant(2);Expression secondArg = Expression.Constant(3);Expression add = Expression.Add(firstArg, secondArg);Console.WriteL<strong>in</strong>e(add);Licensed to Rhona Hadida


240 CHAPTER 9 Lambda expressions and expression treesRunn<strong>in</strong>g list<strong>in</strong>g 9.6 will produce the output“(2 + 3),” which demonstrates thatthe various expression classes overrideToStr<strong>in</strong>g to produce human-readableoutput. Figure 9.1 depicts the tree generatedby the code.It’s worth not<strong>in</strong>g that the “leaf” expressionsare created first <strong>in</strong> the code: youbuild expressions from the bottom up.This is enforced by the fact that expressionsare immutable—once you’ve createdan expression, it will never change,so you can cache and reuse expressionsat will.Now that we’ve built up an expressiontree, let’s try to actually execute it.LeftfirstArgConstantExpressionNodeType=ConstantType=System.Int32Value=2addB<strong>in</strong>aryExpressionNodeType=AddType=System.Int32RightsecondArgConstantExpressionNodeType=ConstantType=System.Int32Value=3Figure 9.1 Graphical representation of theexpression tree created by list<strong>in</strong>g 9.69.3.2 Compil<strong>in</strong>g expression trees <strong>in</strong>to delegatesOne of the types derived from Expression is LambdaExpression. The generic classExpression then derives from LambdaExpression. It’s all slightly confus<strong>in</strong>g—figure9.2 shows the type hierarchy to make th<strong>in</strong>gs clearer.The difference between Expression and Expression is that thegeneric class is statically typed to <strong>in</strong>dicate what k<strong>in</strong>d of expression it is, <strong>in</strong> terms ofreturn type and parameters. Fairly obviously, this is expressed by the TDelegate typeparameter, which must be a delegate type. For <strong>in</strong>stance, our simple addition expressionis one that takes no parameters and returns an <strong>in</strong>teger—this is matched by thesignature of Func, so we could use an Expression to represent theexpression <strong>in</strong> a statically typed manner. We do this us<strong>in</strong>g the Expression.Lambdamethod. This has a number of overloads—our examples use the generic method,which uses a type parameter to <strong>in</strong>dicate the type of delegate we want to represent. SeeMSDN for alternatives.ExpressionLambdaExpressionB<strong>in</strong>aryExpression(Other types)ExpressionFigure 9.2Type hierarchy from Expression up to ExpressionLicensed to Rhona Hadida


Expression trees241So, what’s the po<strong>in</strong>t of do<strong>in</strong>g this? Well, LambdaExpression has a Compile method thatcreates a delegate of the appropriate type. This delegate can now be executed <strong>in</strong> thenormal manner, as if it had been created us<strong>in</strong>g a normal method or any other means.List<strong>in</strong>g 9.7 shows this <strong>in</strong> action, with the same expression as before.List<strong>in</strong>g 9.7Compil<strong>in</strong>g and execut<strong>in</strong>g an expression treeExpression firstArg = Expression.Constant(2);Expression secondArg = Expression.Constant(3);Expression add = Expression.Add(firstArg, secondArg);Func compiled = Expression.Lambda(add).Compile();Console.WriteL<strong>in</strong>e(compiled());Arguably list<strong>in</strong>g 9.7 is one of the most convoluted ways of pr<strong>in</strong>t<strong>in</strong>g out “5” that youcould ask for. At the same time, it’s also rather impressive. We’re programmaticallycreat<strong>in</strong>g some logical blocks and represent<strong>in</strong>g them as normal objects, and then ask<strong>in</strong>gthe framework to compile the whole th<strong>in</strong>g <strong>in</strong>to “real” code that can be executed.You may never need to actually use expression trees this way, or even build them upprogrammatically at all, but it’s useful background <strong>in</strong>formation that will help youunderstand how LINQ works.As I said at the beg<strong>in</strong>n<strong>in</strong>g of this section, expression trees are not too far removedfrom CodeDOM—Snippy compiles and executes <strong>C#</strong> code that has been entered aspla<strong>in</strong> text, for <strong>in</strong>stance. However, two significant differences exist between CodeDOMand expression trees.First, expression trees are only able to represent s<strong>in</strong>gle expressions. They’re notdesigned for whole classes, methods, or even just statements. Second, <strong>C#</strong> supportsexpression trees directly <strong>in</strong> the language, through lambda expressions. Let’s take alook at that now.9.3.3 Convert<strong>in</strong>g <strong>C#</strong> lambda expressions to expression treesAs we’ve already seen, lambda expressions can be converted to appropriate delegate<strong>in</strong>stances, either implicitly or explicitly. That’s not the only conversion that is available,however. You can also ask the compiler to build an expression tree from yourlambda expression, creat<strong>in</strong>g an <strong>in</strong>stance of Expression at executiontime. For example, list<strong>in</strong>g 9.8 shows a much shorter way of creat<strong>in</strong>g the “return 5”expression, compil<strong>in</strong>g it and then <strong>in</strong>vok<strong>in</strong>g the result<strong>in</strong>g delegate.List<strong>in</strong>g 9.8Us<strong>in</strong>g lambda expressions to create expression treesExpression return5 = () => 5;Func compiled = return5.Compile();Console.WriteL<strong>in</strong>e(compiled());In the first l<strong>in</strong>e of list<strong>in</strong>g 9.8, the () => 5 part is the lambda expression. In this case,putt<strong>in</strong>g it <strong>in</strong> an extra pair of parentheses around the whole th<strong>in</strong>g makes it look worserather than better. Notice that we don’t need any casts because the compiler can verifyLicensed to Rhona Hadida


242 CHAPTER 9 Lambda expressions and expression treeseveryth<strong>in</strong>g as it goes. We could have written 2+3 <strong>in</strong>stead of 5, but the compiler wouldhave optimized the addition away for us. The important po<strong>in</strong>t to take away is that thelambda expression has been converted <strong>in</strong>to an expression tree.NOTEThere are limitations—Not all lambda expressions can be converted toexpression trees. You can’t convert a lambda with a block of statements(even just one return statement) <strong>in</strong>to an expression tree—it has to be <strong>in</strong>the form that just evaluates a s<strong>in</strong>gle expression. That expression can’tconta<strong>in</strong> assignments, as they can’t be represented <strong>in</strong> expression trees.Although these are the most common restrictions, they’re not the onlyones—the full list is not worth describ<strong>in</strong>g here, as this issue comes up sorarely. If there’s a problem with an attempted conversion, you’ll f<strong>in</strong>d outat compile time.Let’s take a look at a more complicated example just to see how th<strong>in</strong>gs work, particularlywith respect to parameters. This time we’ll write a predicate that takes two str<strong>in</strong>gsand checks to see if the first one beg<strong>in</strong>s with the second. The code is simple when writtenas a lambda expression, as shown <strong>in</strong> list<strong>in</strong>g 9.9.List<strong>in</strong>g 9.9Demonstration of a more complicated expression treeExpression expression =( (x,y) => x.StartsWith(y) );var compiled = expression.Compile();Console.WriteL<strong>in</strong>e(compiled("First", "Second"));Console.WriteL<strong>in</strong>e(compiled("First", "Fir"));The expression tree itself is more complicated, especially by the time we’ve convertedit <strong>in</strong>to an <strong>in</strong>stance of LambdaExpression. List<strong>in</strong>g 9.10 shows how it’s built <strong>in</strong> code.List<strong>in</strong>g 9.10Build<strong>in</strong>g a method call expression tree <strong>in</strong> codeMethodInfo method = typeof(str<strong>in</strong>g).GetMethod("StartsWith", new[] { typeof(str<strong>in</strong>g) });var target = Expression.Parameter(typeof(str<strong>in</strong>g), "x");var methodArg = Expression.Parameter(typeof(str<strong>in</strong>g), "y");Expression[] methodArgs = new[] { methodArg };Expression call = Expression.Call(target, method, methodArgs);var lambdaParameters = new[] { target, methodArg };var lambda = Expression.Lambda(call, lambdaParameters);var compiled = lambda.Compile();Console.WriteL<strong>in</strong>e(compiled("First", "Second"));Console.WriteL<strong>in</strong>e(compiled("First", "Fir"));Builds upparts ofmethod callAs you can see, list<strong>in</strong>g 9.10 is considerably more <strong>in</strong>volved than the version with the <strong>C#</strong>lambda expression. However, it does make it more obvious exactly what is <strong>in</strong>volved <strong>in</strong>CCreates CallExpressionfrom partsBDConverts call<strong>in</strong>to Lambda-ExpressionLicensed to Rhona Hadida


Expression trees243the tree and how parameters are bound. We start off by work<strong>in</strong>g out everyth<strong>in</strong>g weneed to know about the method call that forms the body of the f<strong>in</strong>al expression B:the target of the method (<strong>in</strong> other words, the str<strong>in</strong>g we’re call<strong>in</strong>g StartsWith on); themethod itself (as a MethodInfo); and the list of arguments (<strong>in</strong> this case, just the one).It so happens that our method target and argument will both be parameters passed<strong>in</strong>to the expression, but they could be other types of expressions—constants, theresults of other method calls, property evaluations, and so forth.After build<strong>in</strong>g the method call as an expression C, we then need to convert it<strong>in</strong>to a lambda expression D, b<strong>in</strong>d<strong>in</strong>g the parameters as we go. We reuse the sameParameterExpression values we created as <strong>in</strong>formation for the method call: theorder <strong>in</strong> which they’re specified when creat<strong>in</strong>g the lambda expression is the order <strong>in</strong>which they’ll be picked up when we eventually call the delegate.Figure 9.3 shows the same f<strong>in</strong>al expression tree graphically. To be picky, eventhough it’s still called an expression tree, the fact that we reuse the parameter expressions(and we have to—creat<strong>in</strong>g a new one with the same name and attempt<strong>in</strong>g tob<strong>in</strong>d parameters that way causes an exception at execution time) means that it’s not atree anymore.lambdaExpressionNodeType=LambdaType=System.BooleanBodyParameterscallMethodCallExpressionNodeType=CallType=System.BooleanlambdaParametersCollection ofParameterExpressionsMethodArgumentsObject(Conta<strong>in</strong>s)methodmethodArgstargetMethodInfo forstr<strong>in</strong>g.StartsWith(str<strong>in</strong>g)(Conta<strong>in</strong>s)Collection ofExpressionsParameterExpressionNodeType=ParameterType=System.Str<strong>in</strong>gName="x"methodArgParameterExpressionNodeType=ParameterType=System.Str<strong>in</strong>gName="y"(Conta<strong>in</strong>s)Figure 9.3 Graphical representation of expression tree that calls a method and uses parameters froma lambda expressionLicensed to Rhona Hadida


244 CHAPTER 9 Lambda expressions and expression treesGlanc<strong>in</strong>g at the complexity of figure 9.3 and list<strong>in</strong>g 9.10 without try<strong>in</strong>g to look at thedetails, you’d be forgiven for th<strong>in</strong>k<strong>in</strong>g that we were do<strong>in</strong>g someth<strong>in</strong>g really complicatedwhen <strong>in</strong> fact it’s just a s<strong>in</strong>gle method call. Imag<strong>in</strong>e what the expression tree for agenu<strong>in</strong>ely complex expression would look like—and then be grateful that <strong>C#</strong> 3 cancreate expression trees from lambda expressions!One small po<strong>in</strong>t to note is that although the <strong>C#</strong> 3 compiler builds expression trees<strong>in</strong> the compiled code us<strong>in</strong>g code similar to list<strong>in</strong>g 9.10, it has one shortcut up itssleeve: it doesn’t need to use normal reflection to get the MethodInfo forstr<strong>in</strong>g.StartsWith. Instead, it uses the method equivalent of the typeof operator.This is only available <strong>in</strong> IL, not <strong>in</strong> <strong>C#</strong> itself—and the same operator is also used to createdelegate <strong>in</strong>stances from method groups.Now that we’ve seen how expression trees and lambda expressions are l<strong>in</strong>ked, let’stake a brief look at why they’re so useful.9.3.4 Expression trees at the heart of LINQWithout lambda expressions, expression trees would have relatively little value.They’d be an alternative to CodeDOM <strong>in</strong> cases where you only wanted to model a s<strong>in</strong>gleexpression <strong>in</strong>stead of whole statements, methods, types and so forth—but the benefitwould still be limited.The reverse is also true to a limited extent: without expression trees, lambda expressionswould certa<strong>in</strong>ly be less useful. Hav<strong>in</strong>g a more compact way of creat<strong>in</strong>g delegate<strong>in</strong>stances would still be welcome, and the shift toward a more functional mode of developmentwould still be viable. Lambda expressions are particularly effective when comb<strong>in</strong>edwith extension methods, as we’ll see <strong>in</strong> the next chapter. However, with expressiontrees <strong>in</strong> the picture as well, th<strong>in</strong>gs get a lot more <strong>in</strong>terest<strong>in</strong>g.So what do we get by comb<strong>in</strong><strong>in</strong>g lambda expressions, expression trees, and extensionmethods? The answer is the language side of LINQ, pretty much. The extra syntaxwe’ll see <strong>in</strong> chapter 11 is ic<strong>in</strong>g on the cake, but the story would still have been quitecompell<strong>in</strong>g with just those three <strong>in</strong>gredients. For a long time we’ve been able to eitherhave nice compile-time check<strong>in</strong>g or we’ve been able to tell another platform to runsome code, usually expressed as text (SQL queries be<strong>in</strong>g the most obvious example).We haven’t been able to do both at the same time.By comb<strong>in</strong><strong>in</strong>g lambda expressions that provide compile-time checks and expressiontrees that abstract the execution model away from the desired logic, we can havethe best of both worlds—with<strong>in</strong> reason. At the heart of “out of process” LINQ providersis the idea that we can produce an expression tree from a familiar source language(<strong>C#</strong> <strong>in</strong> our case) and use the result as an <strong>in</strong>termediate format, which can then be converted<strong>in</strong>to the native language of the target platform: SQL, for example. In somecases there may not be a simple native language so much as a native API—mak<strong>in</strong>g differentweb service calls depend<strong>in</strong>g on what the expression represents, perhaps. Figure9.4 shows the different paths of LINQ to Objects and LINQ to SQL.In some cases the conversion may try to perform all the logic on the target platform,whereas other cases may use the compilation facilities of expression trees to executeLicensed to Rhona Hadida


Changes to type <strong>in</strong>ference and overload resolution245<strong>C#</strong> query code withlambda expressions<strong>C#</strong> query code withlambda expressionsCompile timeIL us<strong>in</strong>gdelegates<strong>C#</strong> compiler<strong>C#</strong> compilerIL us<strong>in</strong>gexpression treesExecution timeQuery resultsDelegate codeexecuted directly<strong>in</strong> the CLRDynamic SQLQuery resultsLINQ to SQL providerExecuted at databaseand fetched backLINQ to ObjectsLINQ to SQLFigure 9.4 Both LINQ to Objects and LINQ to SQL start off with <strong>C#</strong> code, and end with query results.The ability to execute the code remotely comes through expression trees.some of the expression locally and some elsewhere. We’ll look at some of the details ofthis conversion step <strong>in</strong> chapter 12, but you should bear this end goal <strong>in</strong> m<strong>in</strong>d as weexplore extension methods and LINQ syntax <strong>in</strong> chapters 10 and 11.NOTENot all check<strong>in</strong>g can be done by the compiler—When expression trees areexam<strong>in</strong>ed by some sort of converter, there are often cases that have to berejected. For <strong>in</strong>stance, although it’s possible to convert a call tostr<strong>in</strong>g.StartsWith <strong>in</strong>to a similar SQL expression, a call to str<strong>in</strong>g.IsInterned doesn’t make sense <strong>in</strong> a database environment. Expressiontrees allow a large amount of compile-time safety, but the compiler canonly check that the lambda expression can be converted <strong>in</strong>to a validexpression tree; it can’t make sure that the expression tree will be suitablefor its eventual use.That f<strong>in</strong>ishes our direct coverage of lambda expressions and expression trees. Beforewe go any further, however, there are a few changes to <strong>C#</strong> that need some explanation,regard<strong>in</strong>g type <strong>in</strong>ference and how the compiler selects between overloaded methods.9.4 Changes to type <strong>in</strong>ference and overload resolutionThe steps <strong>in</strong>volved <strong>in</strong> type <strong>in</strong>ference and overload resolution have been altered <strong>in</strong><strong>C#</strong> 3 to accommodate lambda expressions and <strong>in</strong>deed to make anonymous methodsmore useful. This doesn’t count as a new feature of <strong>C#</strong> as such, but it can be importantto understand what the compiler is go<strong>in</strong>g to do. If you f<strong>in</strong>d details like thistedious and irrelevant, feel free to skip to the chapter summary—but remember thatthis section exists, so you can read it if you run across a compilation error related tothis topic and can’t understand why your code doesn’t work. (Alternatively, youmight want to come back to this section if you f<strong>in</strong>d your code does compile, but youdon’t th<strong>in</strong>k it should!)Licensed to Rhona Hadida


246 CHAPTER 9 Lambda expressions and expression treesEven with<strong>in</strong> this section I’m not go<strong>in</strong>g to go <strong>in</strong>to absolutely every nook andcranny—that’s what the language specification is for. Instead, I’ll give an overview ofthe new behavior, provid<strong>in</strong>g examples of common cases. The primary reason forchang<strong>in</strong>g the specification is to allow lambda expressions to work <strong>in</strong> a concise fashion,which is why I’ve <strong>in</strong>cluded the topic <strong>in</strong> this particular chapter. Let’s look a little deeperat what problems we’d have run <strong>in</strong>to if the <strong>C#</strong> team had stuck with the old rules.9.4.1 Reasons for change: streaml<strong>in</strong><strong>in</strong>g generic method callsType <strong>in</strong>ference occurs <strong>in</strong> a few situations. We’ve already seen it apply to implicitlytyped arrays, and it’s also required when you try to implicitly convert a method groupto a delegate type as the parameter to a method—with overload<strong>in</strong>g of the methodbe<strong>in</strong>g called, and overload<strong>in</strong>g of methods with<strong>in</strong> the method group, and the possibilityof generic methods gett<strong>in</strong>g <strong>in</strong>volved, the set of potential conversions can becomequite overwhelm<strong>in</strong>g.By far the most common situation for type <strong>in</strong>ference is when you’re call<strong>in</strong>g ageneric method without specify<strong>in</strong>g the type arguments for that method. This happensall the time <strong>in</strong> LINQ—the way that query expressions work depends on this heavily. It’sall handled so smoothly that it’s easy to ignore how much the compiler has to workout on your behalf, all for the sake of mak<strong>in</strong>g your code clearer and more concise.The rules were reasonably straightforward <strong>in</strong> <strong>C#</strong> 2, although method groups andanonymous methods weren’t always handled as well as we might have liked. The type<strong>in</strong>ference process didn’t deduce any <strong>in</strong>formation from them, lead<strong>in</strong>g to situationswhere the desired behavior was obvious to developers but not to the compiler. Life ismore complicated <strong>in</strong> <strong>C#</strong> 3 due to lambda expressions—if you call a generic methodus<strong>in</strong>g a lambda expression with an implicitly typed parameter list, the compiler needsto work out what types you’re talk<strong>in</strong>g about, even before it can check the lambdaexpression’s body.This is much easier to see <strong>in</strong> code than <strong>in</strong> words. List<strong>in</strong>g 9.11 gives an example of thek<strong>in</strong>d of issue we want to solve: call<strong>in</strong>g a generic method us<strong>in</strong>g a lambda expression.List<strong>in</strong>g 9.11Example of code requir<strong>in</strong>g the new type <strong>in</strong>ference rulesstatic void Pr<strong>in</strong>tConvertedValue(TInput <strong>in</strong>put, Converter converter){Console.WriteL<strong>in</strong>e(converter(<strong>in</strong>put));}...Pr<strong>in</strong>tConvertedValue("I'm a str<strong>in</strong>g", x => x.Length);The method Pr<strong>in</strong>tConvertedValue <strong>in</strong> list<strong>in</strong>g 9.11 simply takes an <strong>in</strong>put value and adelegate that can convert that value <strong>in</strong>to a different type. It’s completely generic—itmakes no assumptions about the type parameters TInput and TOutput. Now, look atthe types of the arguments we’re call<strong>in</strong>g it with <strong>in</strong> the bottom l<strong>in</strong>e of the list<strong>in</strong>g. Thefirst argument is clearly a str<strong>in</strong>g, but what about the second? It’s a lambda expression,so we need to convert it <strong>in</strong>to a Converter—and that means weneed to know the types of TInput and TOutput.Licensed to Rhona Hadida


Changes to type <strong>in</strong>ference and overload resolution247If you remember, the type <strong>in</strong>ference rules of <strong>C#</strong> 2 were applied to each argument <strong>in</strong>dividually,with no way of us<strong>in</strong>g the types <strong>in</strong>ferred from one argument to another. In ourcase, these rules would have stopped us from f<strong>in</strong>d<strong>in</strong>g the types of TInput and TOutputfor the second argument, so the code <strong>in</strong> list<strong>in</strong>g 9.11 would have failed to compile.Our eventual goal is to understand what makes list<strong>in</strong>g 9.11 compile <strong>in</strong> <strong>C#</strong> 3 (and itdoes, I promise you), but we’ll start with someth<strong>in</strong>g a bit more modest.9.4.2 Inferred return types of anonymous functionsList<strong>in</strong>g 9.12 shows an example of some code that looks like it should compile butdoesn’t under the type <strong>in</strong>ference rules of <strong>C#</strong> 2.List<strong>in</strong>g 9.12Attempt<strong>in</strong>g to <strong>in</strong>fer the return type of an anonymous methoddelegate T MyFunc();static void WriteResult (MyFunc function){Console.WriteL<strong>in</strong>e(function()); Declares generic}method with...delegate parameterWriteResult(delegate { return 5; });Compil<strong>in</strong>g list<strong>in</strong>g 9.12 under <strong>C#</strong> 2 gives an errorDeclares Functhat isn’t <strong>in</strong> .NET 2.0Requires type<strong>in</strong>ference for Terror CS0411: The type arguments for method'Snippet.WriteResult(Snippet.MyFunc)' cannot be <strong>in</strong>ferred from theusage. Try specify<strong>in</strong>g the type arguments explicitly.We can fix the error <strong>in</strong> two ways—either specify the type argument explicitly (as suggestedby the compiler) or cast the anonymous method to a concrete delegate type:WriteResult(delegate { return 5; });WriteResult((MyFunc)delegate { return 5; });Both of these work, but they’re slightly ugly. We’d like the compiler to perform thesame k<strong>in</strong>d of type <strong>in</strong>ference as for nondelegate types, us<strong>in</strong>g the type of the returnedexpression to <strong>in</strong>fer the type of T. That’s exactly what <strong>C#</strong> 3 does for both anonymousmethods and lambda expressions—but there’s one catch. Although <strong>in</strong> many casesonly one return statement is <strong>in</strong>volved, there can sometimes be more. List<strong>in</strong>g 9.13 is aslightly modified version of list<strong>in</strong>g 9.12 where the anonymous method sometimesreturns an <strong>in</strong>teger and sometimes returns an object.List<strong>in</strong>g 9.13Code return<strong>in</strong>g an <strong>in</strong>teger or an object depend<strong>in</strong>g on the time of daydelegate T MyFunc();static void WriteResult (MyFunc function){Console.WriteL<strong>in</strong>e(function());}...Licensed to Rhona Hadida


248 CHAPTER 9 Lambda expressions and expression treesWriteResult(delegate{if (DateTime.Now.Hour < 12){return 10;}else{return new object();}});The compiler uses the same logic to determ<strong>in</strong>e the return type <strong>in</strong> this situation as itdoes for implicitly typed arrays, as described <strong>in</strong> section 8.4. It forms a set of all thetypes from the return statements <strong>in</strong> the body of the anonymous function 4 (<strong>in</strong> this case<strong>in</strong>t and object) and checks to see if exactly one of the types can be implicitly convertedto from all the others. There’s an implicit conversion from <strong>in</strong>t to object (viabox<strong>in</strong>g) but not from object to <strong>in</strong>t, so the <strong>in</strong>ference succeeds with object as the<strong>in</strong>ferred return type. If there are no types match<strong>in</strong>g that criterion, or more than one,no return type can be <strong>in</strong>ferred and you’ll get a compilation error.So, now we know how to work out the return type of an anonymous function—butwhat about lambda expressions where the parameter types can be implicitly def<strong>in</strong>ed?9.4.3 Two-phase type <strong>in</strong>ferenceReturn typeis <strong>in</strong>tReturn typeis objectThe details of type <strong>in</strong>ference <strong>in</strong> <strong>C#</strong> 3 are much more complicated than they are for<strong>C#</strong> 2. It’s rare that you’ll need to reference the specification for the exact behavior,but if you do I recommend you write down all the type parameters, arguments, andso forth on a piece of paper, and then follow the specification step by step, carefullynot<strong>in</strong>g down every action it requires. You’ll end up with a sheet full of fixed andunfixed type variables, with a different set of bounds for each of them. A fixed typevariable is one that the compiler has decided the value of; otherwise it is unfixed. Abound is a piece of <strong>in</strong>formation about a type variable. I suspect you’ll get a headache,too.I’m go<strong>in</strong>g to present a more “fuzzy” way of th<strong>in</strong>k<strong>in</strong>g about type <strong>in</strong>ference—onethat is likely to serve just as well as know<strong>in</strong>g the specification, and will be a lot easier tounderstand. The fact is, if the compiler doesn’t perform type <strong>in</strong>ference <strong>in</strong> exactly theway you want it to, it will almost certa<strong>in</strong>ly result <strong>in</strong> a compilation error rather thancode that builds but doesn’t behave properly. If your code doesn’t build, try giv<strong>in</strong>g thecompiler more <strong>in</strong>formation—it’s as simple as that. However, here’s roughly what’schanged for <strong>C#</strong> 3.The first big difference is that the method arguments work as a team <strong>in</strong> <strong>C#</strong> 3. In<strong>C#</strong> 2 every argument was used to try to p<strong>in</strong> down some type parameters exactly, and4Returned expressions which don’t have a type, such as null or another lambda expression, aren’t <strong>in</strong>cluded<strong>in</strong> this set. Their validity is checked later, once a return type has been determ<strong>in</strong>ed, but they don’t contributeto that decision.Licensed to Rhona Hadida


Changes to type <strong>in</strong>ference and overload resolution249the compiler would compla<strong>in</strong> if any two arguments came up with different resultsfor a particular type parameter, even if they were compatible. In <strong>C#</strong> 3, arguments cancontribute pieces of <strong>in</strong>formation—types that must be implicitly convertible to thef<strong>in</strong>al fixed value of a particular type parameter. The logic used to come up with thatfixed value is the same as for <strong>in</strong>ferred return types and implicitly typed arrays. List<strong>in</strong>g9.14 shows an example of this—without us<strong>in</strong>g any lambda expressions or evenanonymous methods.List<strong>in</strong>g 9.14Flexible type <strong>in</strong>ference comb<strong>in</strong><strong>in</strong>g <strong>in</strong>formation from multiple argumentsstatic void Pr<strong>in</strong>tType (T first, T second){Console.WriteL<strong>in</strong>e(typeof(T));}...Pr<strong>in</strong>tType(1, new object());Although the code <strong>in</strong> list<strong>in</strong>g 9.14 is syntactically valid <strong>in</strong> <strong>C#</strong> 2, it wouldn’t build: type<strong>in</strong>ference would fail, because the first parameter would decide that T must be <strong>in</strong>tand the second parameter would decide that T must be object. In <strong>C#</strong> 3 the compilerdeterm<strong>in</strong>es that T should be object <strong>in</strong> exactly the same way that it did for the <strong>in</strong>ferredreturn type <strong>in</strong> list<strong>in</strong>g 9.13. In fact, the <strong>in</strong>ferred return type rules are effectively oneexample of the more general process <strong>in</strong> <strong>C#</strong> 3.The second change is that type <strong>in</strong>ference is now performed <strong>in</strong> two phases. The firstphase deals with “normal” arguments where the types <strong>in</strong>volved are known to beg<strong>in</strong>with. This <strong>in</strong>cludes explicitly typed anonymous functions.The second phase then kicks <strong>in</strong>, where implicitly typed lambda expressions andmethod groups have their types <strong>in</strong>ferred. The idea is to see whether any of the <strong>in</strong>formationwe’ve pieced together so far is enough to work out the parameter types of thelambda expression (or method group). If it is, the compiler is then able to exam<strong>in</strong>ethe body of the lambda expression and work out the <strong>in</strong>ferred return type—which isoften another of the type parameters we’re look<strong>in</strong>g for. If the second phase gives somemore <strong>in</strong>formation, we go through it aga<strong>in</strong>, repeat<strong>in</strong>g until either we run out of cluesor we’ve worked out all the type parameters <strong>in</strong>volved.Let’s look at two examples to show how it works. First we’ll take the code we startedthe section with—list<strong>in</strong>g 9.11.static void Pr<strong>in</strong>tConvertedValue(TInput <strong>in</strong>put, Converter converter){Console.WriteL<strong>in</strong>e(converter(<strong>in</strong>put));}...Pr<strong>in</strong>tConvertedValue("I'm a str<strong>in</strong>g", x => x.Length);The type parameters we need to work out <strong>in</strong> list<strong>in</strong>g 9.11 are TInput and TOutput. Thesteps performed are as follows:Licensed to Rhona Hadida


250 CHAPTER 9 Lambda expressions and expression trees1 Phase 1 beg<strong>in</strong>s.2 The first parameter is of type TInput, and the first argument is of type str<strong>in</strong>g.We <strong>in</strong>fer that there must be an implicit conversion from str<strong>in</strong>g to TInput.3 The second parameter is of type Converter, and the secondargument is an implicitly typed lambda expression. No <strong>in</strong>ference is performed—wedon’t have enough <strong>in</strong>formation.4 Phase 2 beg<strong>in</strong>s.5 TInput doesn’t depend on any unfixed type parameters, so it’s fixed to str<strong>in</strong>g.6 The second argument now has a fixed <strong>in</strong>put type, but an unfixed output type. Wecan consider it to be (str<strong>in</strong>g x) => x.Length and <strong>in</strong>fer the return type as <strong>in</strong>t.Therefore an implicit conversion must take place from <strong>in</strong>t to TOutput.7 Phase 2 repeats.8 TOutput doesn’t depend on anyth<strong>in</strong>g unfixed, so it’s fixed to <strong>in</strong>t.9 There are now no unfixed type parameters, so <strong>in</strong>ference succeeds.Complicated, eh? Still, it does the job—the result is what we’d want (TInput=str<strong>in</strong>g,TOutput=<strong>in</strong>t) and everyth<strong>in</strong>g compiles without any problems. The importance ofphase 2 repeat<strong>in</strong>g is best shown with another example, however. List<strong>in</strong>g 9.15 shows twoconversions be<strong>in</strong>g performed, with the output of the first one becom<strong>in</strong>g the <strong>in</strong>put ofthe second. Until we’ve worked out the output type of the first conversion, we don’tknow the <strong>in</strong>put type of the second, so we can’t <strong>in</strong>fer its output type either.List<strong>in</strong>g 9.15Multistage type <strong>in</strong>ferencestatic void ConvertTwice(TInput <strong>in</strong>put,Converter firstConversion,Converter secondConversion){TMiddle middle = firstConversion(<strong>in</strong>put);TOutput output = secondConversion(middle);Console.WriteL<strong>in</strong>e(output);}...ConvertTwice("Another str<strong>in</strong>g",text => text.Length,length => Math.Sqrt(length));The first th<strong>in</strong>g to notice is that the method signature appears to be pretty horrific. It’snot too bad when you stop be<strong>in</strong>g scared and just look at it carefully—and certa<strong>in</strong>ly theexample usage makes it more obvious. We take a str<strong>in</strong>g, and perform a conversion onit: the same conversion as before, just a length calculation. We then take that length(an <strong>in</strong>t) and f<strong>in</strong>d its square root (a double).Phase 1 of type <strong>in</strong>ference tells the compiler that there must be a conversion fromstr<strong>in</strong>g to TInput. The first time through phase 2, TInput is fixed to str<strong>in</strong>g and we<strong>in</strong>fer that there must be a conversion from <strong>in</strong>t to TMiddle. The second time throughLicensed to Rhona Hadida


Changes to type <strong>in</strong>ference and overload resolution251phase 2, TMiddle is fixed to <strong>in</strong>t and we <strong>in</strong>fer that there must be a conversion fromdouble to TOutput. The third time through phase 2, TOutput is fixed to double andtype <strong>in</strong>ference succeeds. When type <strong>in</strong>ference has f<strong>in</strong>ished, the compiler can look atthe code with<strong>in</strong> the lambda expression properly.NOTECheck<strong>in</strong>g the body of a lambda expression—The body of a lambda expressioncannot be checked until the <strong>in</strong>put parameter types are known. The lambdaexpression x => x.Length is valid if x is an array or a str<strong>in</strong>g, but <strong>in</strong>valid <strong>in</strong>many other cases. This isn’t a problem when the parameter types areexplicitly declared, but with an implicit parameter list the compiler needsto wait until it’s performed the relevant type <strong>in</strong>ference before it can try towork out what the lambda expression means.These examples have shown only one change work<strong>in</strong>g at a time—<strong>in</strong> practice there canbe several pieces of <strong>in</strong>formation about different type variables, potentially discovered<strong>in</strong> different iterations of the process. In an effort to save your sanity (and m<strong>in</strong>e), I’mnot go<strong>in</strong>g to present any more complicated examples—hopefully you understand thegeneral mechanism, even if the exact details are hazy.Although it may seem as if this k<strong>in</strong>d of situation will occur so rarely that it’s notworth hav<strong>in</strong>g such complex rules to cover it, <strong>in</strong> fact it’s quite common <strong>in</strong> <strong>C#</strong> 3, particularlywith LINQ. Indeed, you could easily use type <strong>in</strong>ference extensively without eventh<strong>in</strong>k<strong>in</strong>g about it—it’s likely to become second nature to you. If it fails and you wonderwhy, however, you can always revisit this section and the language specification.There’s one more change we need to cover, but you’ll be glad to hear it’s easierthan type <strong>in</strong>ference: method overload<strong>in</strong>g.9.4.4 Pick<strong>in</strong>g the right overloaded methodOverload<strong>in</strong>g occurs when there are multiple methods available with the same namebut different signatures. Sometimes it’s obvious which method is appropriate, becauseit’s the only one with the right number of parameters, or it’s the only one where allthe arguments can be converted <strong>in</strong>to the correspond<strong>in</strong>g parameter types.The tricky bit comes when there are multiple methods that could be the right one.The rules are quite complicated (yes, aga<strong>in</strong>)—but the key part is the way that eachargument type is converted <strong>in</strong>to the parameter type. For <strong>in</strong>stance, consider thesemethod signatures, as if they were both declared <strong>in</strong> the same type:void Write(<strong>in</strong>t x)void Write(double y)The mean<strong>in</strong>g of a call to Write(1.5) is obvious, because there’s no implicit conversionfrom double to <strong>in</strong>t, but a call to Write(1) is trickier. There is an implicitconversion from <strong>in</strong>t to double, so both methods are possible. At that po<strong>in</strong>t, thecompiler considers the conversion from <strong>in</strong>t to <strong>in</strong>t, and from <strong>in</strong>t to double. A conversionfrom any type to itself is def<strong>in</strong>ed to be better than any conversion to a differenttype, so the Write(<strong>in</strong>t x) method is better than Write(double y) for thisparticular call.Licensed to Rhona Hadida


252 CHAPTER 9 Lambda expressions and expression treesWhen there are multiple parameters, the compiler has to make sure there isexactly one method that is at least as good as all the others for every parameter. As asimple example, suppose we hadvoid Write(<strong>in</strong>t x, double y)void Write(double x, <strong>in</strong>t y)A call to Write(1, 1) would be ambiguous, and the compiler would force you to add acast to at least one of the parameters to make it clear which method you meant to call.That logic still applies to <strong>C#</strong> 3, but with one extra rule about anonymous functions,which never specify a return type. In this case, the <strong>in</strong>ferred return type (as described<strong>in</strong> 9.4.2) is used <strong>in</strong> the “better conversion” rules.Let’s see an example of the k<strong>in</strong>d of situation that needs the new rule. List<strong>in</strong>g 9.16conta<strong>in</strong>s two methods with the name Execute, and a call us<strong>in</strong>g a lambda expression.List<strong>in</strong>g 9.16Sample of overload<strong>in</strong>g choice <strong>in</strong>fluenced by delegate return typestatic void Execute(Func action){Console.WriteL<strong>in</strong>e("action returns an <strong>in</strong>t: "+action());}static void Execute(Func action){Console.WriteL<strong>in</strong>e("action returns a double: "+action());}...Execute( () => 1 );The call to Execute <strong>in</strong> list<strong>in</strong>g 9.16 could have been written with an anonymous methodor a method group <strong>in</strong>stead—the same rules are applied whatever k<strong>in</strong>d of conversion is<strong>in</strong>volved. So, which Execute method should be called? The overload<strong>in</strong>g rules say thatwhen two methods are both applicable after perform<strong>in</strong>g conversions on the arguments,then those argument conversions are exam<strong>in</strong>ed to see which one is “better.” The conversionshere aren’t from a normal .NET type to the parameter type—they’re from alambda expression to two different delegate types. So, which conversion is better?Surpris<strong>in</strong>gly enough, the same situation <strong>in</strong> <strong>C#</strong> 2 would result <strong>in</strong> a compilationerror—there was no language rule cover<strong>in</strong>g this case. In <strong>C#</strong> 3, however, the methodwith the Func parameter would be chosen. The extra rule that has been addedcan be paraphrased to this:If an anonymous function can be converted to two delegate types that havethe same parameter list but different return types, then the delegateconversions are judged by the conversions from the <strong>in</strong>ferred return type tothe delegates’ return types.That’s pretty much gibberish without referr<strong>in</strong>g to an example. Let’s look back at list<strong>in</strong>g9.16: we’re convert<strong>in</strong>g from a lambda expression with no parameters and an<strong>in</strong>ferred return type of <strong>in</strong>t to either Func or Func. The parameter listsare the same (empty) for both delegate types, so the rule applies. We then just need toLicensed to Rhona Hadida


Summary253f<strong>in</strong>d the better conversion: <strong>in</strong>t to <strong>in</strong>t, or <strong>in</strong>t to double. This puts us <strong>in</strong> more familiarterritory—as we saw earlier, the <strong>in</strong>t to <strong>in</strong>t conversion is better. List<strong>in</strong>g 9.16 thereforepr<strong>in</strong>ts out “action returns an <strong>in</strong>t: 1.”9.4.5 Wrapp<strong>in</strong>g up type <strong>in</strong>ference and overload resolutionThis section has been pretty heavy. I would have loved to make it simpler—but it’s afundamentally complicated topic. The term<strong>in</strong>ology <strong>in</strong>volved doesn’t make it any easier,especially as parameter type and type parameter mean completely different th<strong>in</strong>gs!Congratulations if you made it through and actually understood it all. Don’t worry ifyou didn’t: hopefully next time you read through the section, it will shed a bit morelight on the topic—particularly after you’ve run <strong>in</strong>to situations where it’s important <strong>in</strong>your own code. For the moment, here are the most important po<strong>in</strong>ts:■ Anonymous functions (anonymous methods and lambda expressions) have<strong>in</strong>ferred return types based on the types of all the return statements.■ Lambda expressions can only be understood by the compiler when the types ofall the parameters are known.■ Type <strong>in</strong>ference no longer requires that each argument <strong>in</strong>dependently comes toexactly the same conclusion about type parameters, as long as the results staycompatible.■ Type <strong>in</strong>ference is now multistage: the <strong>in</strong>ferred return type of one anonymousfunction may be used as a parameter type for another.■ F<strong>in</strong>d<strong>in</strong>g the “best” overloaded method when anonymous functions are <strong>in</strong>volvedtakes the <strong>in</strong>ferred return type <strong>in</strong>to account.9.5 SummaryIn <strong>C#</strong> 3, lambda expressions almost entirely replace anonymous methods. The only th<strong>in</strong>gyou can do with an anonymous method that you can’t do with a lambda expression issay that you don’t care about the parameters <strong>in</strong> the way that we saw <strong>in</strong> section 5.4.3. Ofcourse, anonymous methods are supported for the sake of backward compatibility, butidiomatic, freshly written <strong>C#</strong> 3 code will conta<strong>in</strong> very few of them.We’ve seen how lambda expressions are much more than just a more compact syntaxfor delegate creation, however. They can be converted <strong>in</strong>to expression trees, whichcan then be processed by other code, possibly perform<strong>in</strong>g equivalent actions <strong>in</strong> differentexecution environments. This is arguably the most important part of the LINQstory.Our discussion of type <strong>in</strong>ference and overload<strong>in</strong>g was a necessary evil to some extent:no one actually enjoys discuss<strong>in</strong>g the sort of rules which are required, but it’s importantto have at least a pass<strong>in</strong>g understand<strong>in</strong>g of what’s go<strong>in</strong>g on. Before we all feel too sorryfor ourselves, spare a thought for the poor language designers who have to live andbreathe this k<strong>in</strong>d of th<strong>in</strong>g, mak<strong>in</strong>g sure the rules are consistent and don’t fall apart <strong>in</strong>nasty situations. Then pity the testers who have to try to break the implementation!Licensed to Rhona Hadida


254 CHAPTER 9 Lambda expressions and expression treesThat’s it <strong>in</strong> terms of describ<strong>in</strong>g lambda expressions—but we’ll be see<strong>in</strong>g a lot moreof them <strong>in</strong> the rest of the book. For <strong>in</strong>stance, our next chapter is all about extensionmethods. Superficially, they’re completely separate from lambda expressions—but <strong>in</strong>reality the two features are often used together.Licensed to Rhona Hadida


Extension methodsThis chapter covers■■■Writ<strong>in</strong>g extension methodsCall<strong>in</strong>g extension methodsMethod cha<strong>in</strong><strong>in</strong>g■ Extension methods <strong>in</strong> .NET 3.5■Other uses for extension methodsI’m not a fan of <strong>in</strong>heritance. Or rather, I’m not a fan of a number of places where<strong>in</strong>heritance has been used <strong>in</strong> code that I’ve ma<strong>in</strong>ta<strong>in</strong>ed, or class libraries I’veworked with. As with so many th<strong>in</strong>gs, it’s powerful when used properly, but it’s got adesign overhead to it that is often overlooked and can become pa<strong>in</strong>ful over time.It’s sometimes used as a way of add<strong>in</strong>g extra behavior and functionality to a class,even when no real <strong>in</strong>formation is be<strong>in</strong>g added about the object—where noth<strong>in</strong>g isbe<strong>in</strong>g specialized.Sometimes that’s appropriate—if objects of the new type should carry aroundthe details of the extra behavior—but often it’s not. Indeed, often it’s just not possibleto use <strong>in</strong>heritance <strong>in</strong> this way <strong>in</strong> the first place—if you’re work<strong>in</strong>g with a valuetype, a sealed class, or an <strong>in</strong>terface. The alternative is usually to write a bunch ofstatic methods, most of which take an <strong>in</strong>stance of the type <strong>in</strong> question as at least255Licensed to Rhona Hadida


256 CHAPTER 10 Extension methodsone of their parameters. This works f<strong>in</strong>e, without the design penalty of <strong>in</strong>heritance,but does tend to make code look ugly.<strong>C#</strong> 3 <strong>in</strong>troduces the idea of extension methods, which have the benefits of the staticmethods solution and also improve the readability of code that calls them. They letyou call static methods as if they were <strong>in</strong>stance methods of a completely different class.Don’t panic—it’s not as crazy or as arbitrary as it sounds.In this chapter we’ll first look at how to use extension methods and how to writethem. We’ll then exam<strong>in</strong>e a few of the extension methods provided by the .NET 3.5Framework, and see how they can be cha<strong>in</strong>ed together easily. This cha<strong>in</strong><strong>in</strong>g ability isan important part of the reason for <strong>in</strong>troduc<strong>in</strong>g extension methods to the language <strong>in</strong>the first place. F<strong>in</strong>ally, we’ll consider some of the pros and cons of us<strong>in</strong>g extensionmethods <strong>in</strong>stead of “pla<strong>in</strong>” static methods.First, though, let’s have a closer look at why extension methods are sometimesdesirable compared with the pla<strong>in</strong> old static methods available <strong>in</strong> <strong>C#</strong> 1 and 2, particularlywhen you create utility classes.10.1 Life before extension methodsYou may be gett<strong>in</strong>g a sense of déjà vu at this po<strong>in</strong>t, because utility classes came up <strong>in</strong>chapter 7 when we looked at static classes. If you’ve written a lot of <strong>C#</strong> 2 code by thetime you start us<strong>in</strong>g <strong>C#</strong> 3, you should look at your static classes—many of the methods<strong>in</strong> them may well be good candidates for turn<strong>in</strong>g <strong>in</strong>to extension methods. That’s notto say that all exist<strong>in</strong>g static classes are a good fit, but you may well recognize the follow<strong>in</strong>gtraits:■■■You want to add some members to a type.You don’t need to add any more data to the <strong>in</strong>stances of the type.You can’t change the type itself, because it’s <strong>in</strong> someone else’s code.One slight variation on this is where you want to work with an <strong>in</strong>terface <strong>in</strong>stead of aclass, add<strong>in</strong>g useful behavior while only call<strong>in</strong>g methods on the <strong>in</strong>terface. A goodexample of this is IList. Wouldn’t it be nice to be able to sort any (mutable)implementation of IList? It would be horrendous to force all implementations ofthe <strong>in</strong>terface to implement sort<strong>in</strong>g themselves, but it would be nice from the po<strong>in</strong>t ofview of the user of the list.The th<strong>in</strong>g is, IList provides all the build<strong>in</strong>g blocks for a completely genericsort rout<strong>in</strong>e (several, <strong>in</strong> fact), but you can’t put that implementation <strong>in</strong> the <strong>in</strong>terface.IList could have been specified as an abstract class <strong>in</strong>stead, and the sort<strong>in</strong>g functionality<strong>in</strong>cluded that way, but as <strong>C#</strong> and .NET have s<strong>in</strong>gle <strong>in</strong>heritance of implementation,that would have placed a significant restriction on the types deriv<strong>in</strong>g from it.Extension methods would allow us to sort any IList implementation, mak<strong>in</strong>g itappear as if the list itself provided the functionality.We’ll see later that a lot of the functionality of LINQ is built on extension methodsover <strong>in</strong>terfaces. For the moment, though, we’ll use a different type for our examples:Licensed to Rhona Hadida


Life before extension methods257System.IO.Stream. The Stream class is the bedrock of b<strong>in</strong>ary communications <strong>in</strong> .NET.Stream itself is an abstract class with several concrete derived classes, such as Network-Stream, FileStream, and MemoryStream. Unfortunately, there are a few pieces of functionalitythat would have been handy to <strong>in</strong>clude <strong>in</strong> Stream but that just aren’t there.The “miss<strong>in</strong>g features” I come across most often are the ability to read the whole ofa stream <strong>in</strong>to memory as a byte array, and the ability to copy 1 the contents of onestream <strong>in</strong>to another. Both of these are frequently implemented badly, mak<strong>in</strong>g assumptionsabout streams that just aren’t valid (the most common is that Stream.Read willcompletely fill the given buffer if the data doesn’t run out first).It would be nice to have the functionality <strong>in</strong> a s<strong>in</strong>gle place, rather than duplicat<strong>in</strong>git <strong>in</strong> several projects. That’s why I wrote the StreamUtil class <strong>in</strong> my miscellaneous utilitylibrary. The real code conta<strong>in</strong>s a fair amount of error check<strong>in</strong>g and other functionality,but list<strong>in</strong>g 10.1 shows a cut-down version that is more than adequate for our needs.List<strong>in</strong>g 10.1A simple utility class to provide extra functionality for streamsus<strong>in</strong>g System.IO;public static class StreamUtil{const <strong>in</strong>t BufferSize = 8192;}public static void Copy(Stream <strong>in</strong>put,Stream output){byte[] buffer = new byte[BufferSize];<strong>in</strong>t read;while ((read = <strong>in</strong>put.Read(buffer, 0, buffer.Length)) > 0){output.Write(buffer, 0, read);}}public static byte[] ReadFully(Stream <strong>in</strong>put){us<strong>in</strong>g (MemoryStream tempStream = new MemoryStream()){Copy(<strong>in</strong>put, tempStream);return tempStream.ToArray();}}The implementation details don’t matter much, although it’s worth not<strong>in</strong>g that theReadFully method calls the Copy method—that will be useful to demonstrate a po<strong>in</strong>tabout extension methods later. The class is easy to use—list<strong>in</strong>g 10.2 shows how we canwrite a web response to disk, for example.1Due to the nature of streams, this “copy<strong>in</strong>g” doesn’t necessarily duplicate the data—it just reads it from onestream and writes it to another. Although “copy” isn’t a strictly accurate term <strong>in</strong> this sense, the difference isusually irrelevant.Licensed to Rhona Hadida


258 CHAPTER 10 Extension methodsList<strong>in</strong>g 10.2WebRequest request = WebRequest.Create("http://mann<strong>in</strong>g.com");us<strong>in</strong>g (WebResponse response = request.GetResponse())us<strong>in</strong>g (Stream responseStream = response.GetResponseStream())us<strong>in</strong>g (FileStream output = File.Create("response.dat")){StreamUtil.Copy(responseStream, output);}List<strong>in</strong>g 10.2 is quite compact, and the StreamUtil class has taken care of loop<strong>in</strong>g andask<strong>in</strong>g the response stream for more data until it’s all been received. It’s done its jobas a utility class perfectly reasonably. Even so, it doesn’t feel very object-oriented. We’dreally like to ask the response stream to copy itself to the output stream, just like theMemoryStream class has a WriteTo method. It’s not a big problem, but it’s just a littleugly as it is.Inheritance wouldn’t help us <strong>in</strong> this situation (we want this behavior to be availablefor all streams, not just ones we’re responsible for) and we can’t go chang<strong>in</strong>g theStream class itself—so what can we do? With <strong>C#</strong> 2, we were out of options—we had tostick with the static methods and live with the clums<strong>in</strong>ess. <strong>C#</strong> 3 allows us to change ourstatic class to expose its members as extension methods, so we can pretend that themethods have been part of Stream all along. Let’s see what changes are required.10.2 Extension method syntaxUs<strong>in</strong>g StreamUtil to copy a web response stream to a fileExtension methods are almost embarrass<strong>in</strong>gly easy to create, and simple to use too.The considerations around when and how to use them are significantly deeper thanthe difficulties <strong>in</strong>volved <strong>in</strong> learn<strong>in</strong>g how to write them <strong>in</strong> the first place. Let’s start offby convert<strong>in</strong>g our StreamUtil class to have a couple of extension methods.10.2.1 Declar<strong>in</strong>g extension methodsYou can’t use just any method as an extension method—it has to have the follow<strong>in</strong>gcharacteristics:■ It has to be <strong>in</strong> a non-nested, nongeneric static class (and therefore has to be astatic method).■ It has to have at least one parameter.■ The first parameter has to be prefixed with the this keyword.■ The first parameter can’t have any other modifiers (such as out or ref).■ The type of the first parameter must not be a po<strong>in</strong>ter type.That’s it—the method can be generic, return a value, have ref/out parameters otherthan the first one, be implemented with an iterator block, be part of a partial class, usenullable types—anyth<strong>in</strong>g, as long as the above constra<strong>in</strong>ts are met.We will call the type of the first parameter the extended type of the method. It’s notofficial specification term<strong>in</strong>ology, but it’s a useful piece of shorthand.Not only does the previous list provide all the restrictions, but it also gives thedetails of what you need to do to turn a “normal” static method <strong>in</strong> a static class <strong>in</strong>to anLicensed to Rhona Hadida


Extension method syntax259extension method—just add the this keyword. List<strong>in</strong>g 10.3 shows the same class as <strong>in</strong>list<strong>in</strong>g 10.1, but this time with both methods as extension methods.List<strong>in</strong>g 10.3The StreamUtil class aga<strong>in</strong>, but this time with extension methodspublic static class StreamUtil{const <strong>in</strong>t BufferSize = 8192;}public static void CopyTo(this Stream <strong>in</strong>put,Stream output){byte[] buffer = new byte[BufferSize];<strong>in</strong>t read;while ((read = <strong>in</strong>put.Read(buffer, 0, buffer.Length)) > 0){output.Write(buffer, 0, read);}}public static byte[] ReadFully(this Stream <strong>in</strong>put){us<strong>in</strong>g (MemoryStream tempStream = new MemoryStream()){CopyTo(<strong>in</strong>put, tempStream);return tempStream.ToArray();}}Yes, the only big change <strong>in</strong> list<strong>in</strong>g 10.3 is the addition of the two modifiers, as shown <strong>in</strong>bold. I’ve also changed the name of the method from Copy to CopyTo. As we’ll see <strong>in</strong> am<strong>in</strong>ute, that will allow call<strong>in</strong>g code to read more naturally, although it does lookslightly strange <strong>in</strong> the ReadFully method at the moment.Now, it’s not much use hav<strong>in</strong>g extension methods if we can’t use them…10.2.2 Call<strong>in</strong>g extension methodsI’ve mentioned it <strong>in</strong> pass<strong>in</strong>g, but we haven’t yet seen what an extension method actuallydoes. Simply put, it pretends to be an <strong>in</strong>stance method of another type—the typeof the first parameter of the method.The transformation of our example code that uses StreamUtil is as simple as thetransformation of the utility class itself. This time, <strong>in</strong>stead of add<strong>in</strong>g someth<strong>in</strong>g <strong>in</strong>we’ll take it away. List<strong>in</strong>g 10.4 is a repeat performance of list<strong>in</strong>g 10.2, but us<strong>in</strong>g the“new” syntax to call CopyTo. I say “new,” but it’s really not new at all—it’s the same syntaxwe’ve always used for call<strong>in</strong>g <strong>in</strong>stance methods.List<strong>in</strong>g 10.4Copy<strong>in</strong>g a stream us<strong>in</strong>g an extension methodWebRequest request = WebRequest.Create("http://mann<strong>in</strong>g.com");us<strong>in</strong>g (WebResponse response = request.GetResponse())us<strong>in</strong>g (Stream responseStream = response.GetResponseStream())Licensed to Rhona Hadida


260 CHAPTER 10 Extension methodsus<strong>in</strong>g (FileStream output = File.Create("response.dat")){responseStream.CopyTo(output);}In list<strong>in</strong>g 10.4 it at least looks like we’re ask<strong>in</strong>g the response stream to do the copy<strong>in</strong>g.It’s still StreamUtil do<strong>in</strong>g the work beh<strong>in</strong>d the scenes, but the code reads <strong>in</strong> a morenatural way. In fact, the compiler has converted the CopyTo call <strong>in</strong>to a normal staticmethod call to StreamUtil.CopyTo, pass<strong>in</strong>g the value of responseStream as the firstargument (followed by output as normal).Now that you can see the code <strong>in</strong> question, I hope you can understand why Ichanged the method name from Copy to CopyTo. Some names work just as well forstatic methods as <strong>in</strong>stance methods, but you’ll f<strong>in</strong>d that others need tweak<strong>in</strong>g to getthe maximum readability benefit.If we want to make the StreamUtil code slightly more pleasant, you can changethe l<strong>in</strong>e of ReadFully that calls CopyTo like this:<strong>in</strong>put.CopyTo(tempStream);At this po<strong>in</strong>t the name change is fully appropriate for all the uses—although there’snoth<strong>in</strong>g to stop you from us<strong>in</strong>g the extension method as a normal static method,which is useful when you’re migrat<strong>in</strong>g a lot of code.You may have noticed that there’s noth<strong>in</strong>g <strong>in</strong> these method calls to <strong>in</strong>dicate thatwe’re us<strong>in</strong>g an extension method <strong>in</strong>stead of a regular <strong>in</strong>stance method of Stream. Thiscan be seen <strong>in</strong> two ways: it’s a good th<strong>in</strong>g if our aim is to make extension methods blend<strong>in</strong> as much as possible and cause very little alarm—but it’s a bad th<strong>in</strong>g if you want to beable to immediately see what’s really go<strong>in</strong>g on. If you’re us<strong>in</strong>g Visual Studio 2008, youcan hover over a method call and get an <strong>in</strong>dication <strong>in</strong> the tooltip when it’s an extensionmethod, as shown <strong>in</strong> figure 10.1.IntelliSense also <strong>in</strong>dicates when it’s offer<strong>in</strong>g an extension method, <strong>in</strong> both theicon for the method and the tooltip when it’s selected. Of course, you don’t want tohave to hover over every method call you make or be super-careful with IntelliSense,but most of the time it doesn’t matter whether you’re call<strong>in</strong>g an <strong>in</strong>stance or extensionmethod.There’s one th<strong>in</strong>g that’s still rather strange about our call<strong>in</strong>g code, though—itdoesn’t mention StreamUtil anywhere! How does the compiler know to use theextension method <strong>in</strong> the first place?Figure 10.1 Hover<strong>in</strong>g overa method call <strong>in</strong> Visual Studio2008 reveals whether or notthe method is actually anextension method.Licensed to Rhona Hadida


Extension method syntax26110.2.3 How extension methods are foundIt’s important to know how to call extension methods—but it’s also important to knowhow to not call them—how to not be presented with unwanted options. To achievethat, we need to know how the compiler decides which extension methods to use <strong>in</strong>the first place.Extension methods are made available to the code <strong>in</strong> the same way that classes aremade available without qualification—with us<strong>in</strong>g directives. When the compiler seesan expression that looks like it’s try<strong>in</strong>g to use an <strong>in</strong>stance method but none of the<strong>in</strong>stance methods are compatible with the method call (if there’s no method with thatname, for <strong>in</strong>stance, or no overload matches the arguments given), it then looks for anappropriate extension method. It considers all the extension methods <strong>in</strong> all theimported namespaces and the current namespaces, and matches ones where there’san implicit conversion from the expression type to the extended type.NOTE Implementation: How does the compiler spot an extension method <strong>in</strong> a library? Towork out whether or not to use an extension method, the compiler has tobe able to tell the difference between an extension method and othermethods with<strong>in</strong> a static class that happen to have an appropriate signature.It does this by check<strong>in</strong>g whether System.Runtime.CompilerServices.ExtensionAttribute has been applied to the method. This attribute is newto .NET 3.5, but the compiler doesn’t check which assembly the attributecomes from. This means that you can still use extension methods even ifyour project targets .NET 2.0—you just need to def<strong>in</strong>e your own attributewith the right name, <strong>in</strong> the right namespace.If multiple applicable extension methods are available for different extended types(us<strong>in</strong>g implicit conversions), the most appropriate one is chosen with the “better conversion”rules used <strong>in</strong> overload<strong>in</strong>g. For <strong>in</strong>stance, if IChild <strong>in</strong>herits from IParent, andthere’s an extension method with the same name for both, then the IChild extensionmethod is used <strong>in</strong> preference to the one on IParent. This is crucial to LINQ, as you’llsee <strong>in</strong> section 12.2, where we meet the IQueryable <strong>in</strong>terface.It’s important to note that <strong>in</strong>stance methods are always used beforeextension methods, but the compiler doesn’t warn of an extensionmethod that matches an exist<strong>in</strong>g <strong>in</strong>stance method. If a new version of theframework were to <strong>in</strong>troduce a CopyTo method <strong>in</strong> Stream that took thesame parameters as our extension method, recompil<strong>in</strong>g our code aga<strong>in</strong>stthe new framework would silently change the mean<strong>in</strong>g of the methodcall. (Indeed, that’s one reason for choos<strong>in</strong>g CopyTo <strong>in</strong>stead ofWriteTo—we wouldn’t want the mean<strong>in</strong>g to change depend<strong>in</strong>g onwhether the compile-time type was Stream or MemoryStream.)One potential problem with the way that extension methods are made available tocode is that it’s very wide-rang<strong>in</strong>g. If there are two classes <strong>in</strong> the same namespace conta<strong>in</strong><strong>in</strong>gmethods with the same extended type, there’s no way of only us<strong>in</strong>g the extensionmethods from one of the classes. Likewise, there’s no way of import<strong>in</strong>g aWarn<strong>in</strong>g:subtleversion<strong>in</strong>gissue!Licensed to Rhona Hadida


262 CHAPTER 10 Extension methodsnamespace for the sake of mak<strong>in</strong>g types available us<strong>in</strong>g only their simple names, butwithout mak<strong>in</strong>g the extension methods with<strong>in</strong> that namespace available at the sametime. I recommend us<strong>in</strong>g a namespace that solely conta<strong>in</strong>s static classes with extensionmethods to mitigate this problem.There’s one aspect of extension methods that can be quite surpris<strong>in</strong>g when youfirst encounter it but is also useful <strong>in</strong> some situations. It’s all about null references—let’s take a look.10.2.4 Call<strong>in</strong>g a method on a null referenceI’d be amazed if I ever encountered anyone who’d done a significant amount of .NETprogramm<strong>in</strong>g without see<strong>in</strong>g a NullReferenceException due to call<strong>in</strong>g a method witha variable whose value turned out to be a null reference. You can’t call <strong>in</strong>stance methodson null references <strong>in</strong> <strong>C#</strong> (although IL itself supports it for nonvirtual calls)—but you cancall extension methods with a null reference. This is demonstrated by list<strong>in</strong>g 10.5. Notethat this isn’t a snippet s<strong>in</strong>ce nested classes can’t conta<strong>in</strong> extension methods.List<strong>in</strong>g 10.5Extension method be<strong>in</strong>g called on a null referenceus<strong>in</strong>g System;public static class NullUtil{public static bool IsNull(this object x){return x==null;}}public class Test{static void Ma<strong>in</strong>(){object y = null;Console.WriteL<strong>in</strong>e(y.IsNull());y = new object();Console.WriteL<strong>in</strong>e(y.IsNull());}}The output of list<strong>in</strong>g 10.5 is “True” then “False”—if IsNull had been a normal<strong>in</strong>stance method, an exception would have been thrown <strong>in</strong> the second l<strong>in</strong>e of Ma<strong>in</strong>.Instead, IsNull was called with null as the argument. Prior to the advent of extensionmethods, <strong>C#</strong> had no way of lett<strong>in</strong>g you write the more readable y.IsNull() formsafely, requir<strong>in</strong>g NullUtil.IsNull(y) <strong>in</strong>stead. There’s one particularly obvious example<strong>in</strong> the framework where this could be useful: str<strong>in</strong>g.IsNullOrEmpty. <strong>C#</strong> 3 allowsyou to write an extension method that has the same signature (other than the “extra”parameter for the extended type) as an exist<strong>in</strong>g static method on the extended type.To save you read<strong>in</strong>g through that sentence several times, here’s an example—eventhough the str<strong>in</strong>g class has a static, parameterless method IsNullOrEmpty, you canstill create and use the follow<strong>in</strong>g extension method:Licensed to Rhona Hadida


Extension methods <strong>in</strong> .NET 3.5263public static bool IsNullOrEmpty(this str<strong>in</strong>g text){return str<strong>in</strong>g.IsNullOrEmpty(text);}At first it seems odd to be able to call IsNullOrEmpty on a variable that is null withoutan exception be<strong>in</strong>g thrown, particularly if you’re familiar with it as a static method from.NET 2.0. In my view, code us<strong>in</strong>g the extension method is more easily understandable.For <strong>in</strong>stance, if you read the expression if (name.IsNullOrEmpty()) out loud, it saysexactly what it’s do<strong>in</strong>g. As always, experiment to see what works for you—but be awareof the possibility of other people us<strong>in</strong>g this technique if you’re debugg<strong>in</strong>g code. Don’tbe certa<strong>in</strong> that an exception will be thrown on a method call unless you’re sure it’s notan extension method! Also note that you should th<strong>in</strong>k carefully before reus<strong>in</strong>g an exist<strong>in</strong>gname for an extension method—the previous extension method could confusereaders who are only familiar with the static method from the framework.Now that we know the syntax and behavior of extension methods, we can have a lookat some examples of them, which are provided <strong>in</strong> .NET 3.5 as part of the framework.10.3 Extension methods <strong>in</strong> .NET 3.5The biggest use of extension methods <strong>in</strong> .NET 3.5 is <strong>in</strong> LINQ. Some LINQ providers havea few extension methods to help them along, but there are two classes that stand out,both of them appear<strong>in</strong>g <strong>in</strong> the System.L<strong>in</strong>q namespace: Enumerable and Queryable.These conta<strong>in</strong> many, many extension methods: most of the ones <strong>in</strong> Enumerable operateon IEnumerable and most of those <strong>in</strong> Queryable operate on IQueryable. We’llsee the purpose of IQueryable <strong>in</strong> chapter 12, but for the moment let’s concentrateon Enumerable.10.3.1 First steps with EnumerableEven just look<strong>in</strong>g at Enumerable, we’re gett<strong>in</strong>g very close to LINQ now. Indeed, a lot ofthe time you don’t need full-blown query expressions to solve a problem. Enumerablehas a lot of methods <strong>in</strong> it, and the purpose of this section isn’t to cover all of them butto give you enough of a feel for them to let you go off and experiment. It’s a real joy tojust play with everyth<strong>in</strong>g available <strong>in</strong> Enumerable—although this time it’s def<strong>in</strong>itelyworth fir<strong>in</strong>g up Visual Studio 2008 for your experiments (rather than us<strong>in</strong>g Snippy) asIntelliSense is handy for this k<strong>in</strong>d of activity.All the complete examples <strong>in</strong> this section deal with a simple situation: we start offwith a collection of <strong>in</strong>tegers and transform it <strong>in</strong> various ways. Obviously real-life situationsare likely to be somewhat more complicated, usually deal<strong>in</strong>g with bus<strong>in</strong>ess-relatedtypes. At the end of this section, I’ll present a couple of examples of just the transformationside of th<strong>in</strong>gs applied to possible bus<strong>in</strong>ess situations, with full source code availableon the book’s website—but that’s harder to play with than a straightforwardcollection of numbers. It’s worth consider<strong>in</strong>g some recent projects you’ve been work<strong>in</strong>gon as we go, however—see if you can th<strong>in</strong>k of situations where you could have madeyour code simpler or more readable by us<strong>in</strong>g the k<strong>in</strong>d of operations described here.Licensed to Rhona Hadida


264 CHAPTER 10 Extension methodsThere are a few methods <strong>in</strong> Enumerable that aren’t extension methods, and we’lluse one of them <strong>in</strong> the examples for the rest of the chapter. The Range method takestwo <strong>in</strong>t parameters: a number to start with, and how many results to yield. The resultis an IEnumerable, which simply returns one number at a time <strong>in</strong> the obviousway. This isn’t as flexible as the Range type we built <strong>in</strong> chapter 6, but it’s still handyfor this sort of quick test<strong>in</strong>g. If you’ve downloaded or typed <strong>in</strong> Range, you canexperiment with types other than <strong>in</strong>t—all the extension methods presented herework with any IEnumerable.To demonstrate the Range method and give us a framework to play with, let’s justpr<strong>in</strong>t out the numbers 0 to 9, as shown <strong>in</strong> list<strong>in</strong>g 10.6. To keep the examples short, I’vestuck with the “snippet” format even though you’ll probably want to play with this <strong>in</strong>Visual Studio.List<strong>in</strong>g 10.6 Us<strong>in</strong>g Enumerable.Range to pr<strong>in</strong>t out the numbers 0 to 9var collection = Enumerable.Range(0, 10);foreach (var element <strong>in</strong> collection){Console.WriteL<strong>in</strong>e(element);}There are no extension methods called <strong>in</strong> list<strong>in</strong>g 10.6, just a pla<strong>in</strong> static method. Andyes, it really does just pr<strong>in</strong>t the numbers 0 to 9—I never claimed this code would setthe world on fire.NOTEDeferred execution—The Range method doesn’t build a list with the appropriatenumbers—it just yields them at the appropriate time. In otherwords, construct<strong>in</strong>g the enumerable <strong>in</strong>stance doesn’t do the bulk of thework—it just gets th<strong>in</strong>gs ready, so that the data can be provided <strong>in</strong> a “just<strong>in</strong>-time”fashion at the appropriate po<strong>in</strong>t. This is called deferred executionand is a crucial part of LINQ. We’ll learn much more about this <strong>in</strong> thenext chapter.Pretty much the simplest th<strong>in</strong>g we can do with a sequence of numbers (which isalready <strong>in</strong> order) is to reverse it. List<strong>in</strong>g 10.7 uses the Reverse extension method to dothis—it returns an IEnumerable that yields the same elements as the orig<strong>in</strong>alsequence but <strong>in</strong> the reverse order.List<strong>in</strong>g 10.7Revers<strong>in</strong>g a collection with the Reverse methodvar collection = Enumerable.Range(0, 10).Reverse();foreach (var element <strong>in</strong> collection){Console.WriteL<strong>in</strong>e(element);}Predictably enough, this pr<strong>in</strong>ts out 9, then 8, then 7, and so on right down to 0. We’vecalled Reverse (seem<strong>in</strong>gly) on an IEnumerable and the same type has beenLicensed to Rhona Hadida


Extension methods <strong>in</strong> .NET 3.5265returned. This pattern of return<strong>in</strong>g one enumerable based on another is pervasive <strong>in</strong>the Enumerable class.NOTEEfficiency: buffer<strong>in</strong>g vs. stream<strong>in</strong>g—The extension methods provided by theframework try very hard to “stream” or “pipe” data wherever possible—when an iterator is asked for its next element, it will often take an elementoff the iterator it’s cha<strong>in</strong>ed to, process that element, and thenreturn someth<strong>in</strong>g appropriate, preferably without us<strong>in</strong>g any more storageitself. Simple transformations and filters can do this very easily, andit’s a really powerful way of efficiently process<strong>in</strong>g data where it’s possible—butsome operations such as revers<strong>in</strong>g the order, or sort<strong>in</strong>g, requireall the data to be available, so it’s all loaded <strong>in</strong>to memory for bulk process<strong>in</strong>g.The difference between this buffered approach and pip<strong>in</strong>g issimilar to the difference between read<strong>in</strong>g data by load<strong>in</strong>g a wholeDataSet versus us<strong>in</strong>g a DataReader to process one record at a time. It’simportant to consider what’s required when us<strong>in</strong>g LINQ—a s<strong>in</strong>glemethod call can have significant performance implications.Let’s do someth<strong>in</strong>g a little more adventurous now—we’ll use a lambda expression toremove the even numbers.10.3.2 Filter<strong>in</strong>g with Where, and cha<strong>in</strong><strong>in</strong>g method calls togetherThe Where extension method is a simple but powerful way of filter<strong>in</strong>g collections: itaccepts a predicate, which it applies to each of the elements of the orig<strong>in</strong>al collection.Aga<strong>in</strong>, it returns an IEnumerable, and this time any element that matches the predicateis <strong>in</strong>cluded <strong>in</strong> the result<strong>in</strong>g collection. List<strong>in</strong>g 10.8 demonstrates this, apply<strong>in</strong>g theodd/even filter to the collection of <strong>in</strong>tegers before revers<strong>in</strong>g it. We don’t have to use alambda expression here—for <strong>in</strong>stance, we could use a delegate we’d created earlier, oran anonymous method. In this case (and <strong>in</strong> many other real-life situations), it’s simpleto put the filter<strong>in</strong>g logic <strong>in</strong>l<strong>in</strong>e, and lambda expressions keep the code concise.List<strong>in</strong>g 10.8Us<strong>in</strong>g the Where method with a lambda expression to keep odd numbers onlyvar collection = Enumerable.Range(0, 10).Where(x => x%2 != 0).Reverse();foreach (var element <strong>in</strong> collection){Console.WriteL<strong>in</strong>e(element);}List<strong>in</strong>g 10.8 pr<strong>in</strong>ts out the numbers 9, 7, 5, 3, and 1. Hopefully you’ll have noticed apattern form<strong>in</strong>g—we’re cha<strong>in</strong><strong>in</strong>g the method calls together. The cha<strong>in</strong><strong>in</strong>g idea itselfisn’t new. For example, Str<strong>in</strong>gBuilder.Append always returns the <strong>in</strong>stance you call iton, allow<strong>in</strong>g code like this:builder.Append(x).Append(y).Append(z)Licensed to Rhona Hadida


266 CHAPTER 10 Extension methodsThat’s f<strong>in</strong>e for <strong>in</strong>stance methods, but extension methods allow static method calls tobe cha<strong>in</strong>ed together. This is one of the primary reasons for extension methods exist<strong>in</strong>g.They’re useful for other utility classes, but their true power is revealed <strong>in</strong> this ability tocha<strong>in</strong> static methods <strong>in</strong> a natural way. That’s why extension methods primarily showup <strong>in</strong> Enumerable and Queryable <strong>in</strong> .NET 3.5: LINQ is geared toward this approach todata process<strong>in</strong>g, with <strong>in</strong>formation travel<strong>in</strong>g through pipel<strong>in</strong>es constructed of <strong>in</strong>dividualoperations cha<strong>in</strong>ed together.NOTEEfficiency consideration: reorder<strong>in</strong>g method calls to avoid waste—I’m certa<strong>in</strong>lynot a fan of micro-optimization without good cause, but it’s worth look<strong>in</strong>gat the order<strong>in</strong>g of the method calls <strong>in</strong> list<strong>in</strong>g 10.8. We could haveadded the Where call after the Reverse call and achieved the same results.However, that would have wasted some effort—the Reverse call wouldhave had to work out where the even numbers should come <strong>in</strong> thesequence even though they will be discarded from the f<strong>in</strong>al result. In thiscase it’s not go<strong>in</strong>g to make much difference, but it can have a significanteffect on performance: if you can reduce the amount of wasted workwithout compromis<strong>in</strong>g readability, that’s a good th<strong>in</strong>g. That doesn’tmean you should always put filters at the start of the pipel<strong>in</strong>e, however;you need to th<strong>in</strong>k carefully about any reorder<strong>in</strong>g to make sure you’ll stillget the correct results.There are two obvious ways of writ<strong>in</strong>g the first part of list<strong>in</strong>g 10.8 without us<strong>in</strong>g thefact that Reverse and Where are extension methods. One is to use a temporary variable,which keeps the structure <strong>in</strong>tact:var collection = Enumerable.Range(0, 10);collection = Enumerable.Where(collection, x => x%2 != 0)collection = Enumerable.Reverse(collection);I hope you’ll agree that the mean<strong>in</strong>g of the code is far less clear here than <strong>in</strong> list<strong>in</strong>g 10.8.It gets even worse with the other option, which is to keep the “s<strong>in</strong>gle statement” style:var collection = Enumerable.Reverse(Enumerable.Where(Enumerable.Range(0, 10),x => x%2 != 0));The method call order appears to be reversed, because the <strong>in</strong>nermost method call(Range) will be performed first, then the others, with execution work<strong>in</strong>g its way outward.Let’s get back to our nice clean syntax but <strong>in</strong>troduce another wr<strong>in</strong>kle—we’ll transform(or project) each element <strong>in</strong> our orig<strong>in</strong>al collection, creat<strong>in</strong>g an anonymous typefor the result.10.3.3 Projections us<strong>in</strong>g the Select method and anonymous typesThe most important projection method <strong>in</strong> Enumerable is Select—it operates on anIEnumerable and projects it <strong>in</strong>to an IEnumerable by way of aFunc, which is the transformation to use on each element, specifiedLicensed to Rhona Hadida


Extension methods <strong>in</strong> .NET 3.5267as a delegate. It’s very like the ConvertAll method <strong>in</strong> List, but operat<strong>in</strong>g on anyenumerable collection and us<strong>in</strong>g deferred execution to perform the projection only aseach element is requested.When I <strong>in</strong>troduced anonymous types, I said they were useful with lambda expressionsand LINQ—well, here’s an example of the k<strong>in</strong>d of th<strong>in</strong>g you can do with them.We’ve currently got the odd numbers between 0 and 9 (<strong>in</strong> reverse order)—let’s createa type that encapsulates the square root of the number as well as the orig<strong>in</strong>al number.List<strong>in</strong>g 10.9 shows both the projection and a slightly modified way of writ<strong>in</strong>g out theresults. I’ve adjusted the whitespace solely for the sake of space on the pr<strong>in</strong>ted page.List<strong>in</strong>g 10.9Projection us<strong>in</strong>g a lambda expression and an anonymous typevar collection = Enumerable.Range(0, 10).Where(x => x%2 != 0).Reverse().Select(x => new { Orig<strong>in</strong>al=x, SquareRoot=Math.Sqrt(x) } );foreach (var element <strong>in</strong> collection){Console.WriteL<strong>in</strong>e("sqrt({0})={1}",element.Orig<strong>in</strong>al,element.SquareRoot);}This time the type of collection isn’t IEnumerable—it’s IEnumerable,where Someth<strong>in</strong>g is the anonymous type created by the compiler. We can’texplicitly type the collection variable except as either the nongeneric IEnumerabletype or object. Implicit typ<strong>in</strong>g is what allows us to use the Orig<strong>in</strong>al and SquareRootproperties when writ<strong>in</strong>g out the results. The output of list<strong>in</strong>g 10.9 is as follows:sqrt(9)=3sqrt(7)=2.64575131106459sqrt(5)=2.23606797749979sqrt(3)=1.73205080756888sqrt(1)=1Of course, a Select method doesn’t have to use an anonymous type at all—we couldhave selected just the square root of the number, discard<strong>in</strong>g the orig<strong>in</strong>al. In that casethe result would have been IEnumerable. Alternatively, we could have manuallywritten a type to encapsulate an <strong>in</strong>teger and its square root—it was just easiest touse an anonymous type <strong>in</strong> this case.Let’s look at one last method to round off our coverage of Enumerable for themoment: OrderBy.10.3.4 Sort<strong>in</strong>g us<strong>in</strong>g the OrderBy methodSort<strong>in</strong>g data is a common requirement when process<strong>in</strong>g data, and <strong>in</strong> LINQ this is usuallyperformed us<strong>in</strong>g the OrderBy or OrderByDescend<strong>in</strong>g methods, sometimes followedby ThenBy or ThenByDescend<strong>in</strong>g if you need to sort by more than one propertyof the data. This ability to sort on multiple properties has always been available theLicensed to Rhona Hadida


268 CHAPTER 10 Extension methodshard way us<strong>in</strong>g a complicated comparison, but it’s much clearer to be able to presenta series of simple comparisons.To demonstrate this, I’m go<strong>in</strong>g to change the operations we’ll use a bit. We’ll startoff with the <strong>in</strong>tegers –5 to 5 (<strong>in</strong>clusive—11 elements <strong>in</strong> total), and then project to ananonymous type conta<strong>in</strong><strong>in</strong>g the orig<strong>in</strong>al number and its square (rather than squareroot). F<strong>in</strong>ally, we’ll sort by the square and then the orig<strong>in</strong>al number. List<strong>in</strong>g 10.10shows all of this.List<strong>in</strong>g 10.10Order<strong>in</strong>g a sequence by two propertiesvar collection = Enumerable.Range(-5, 11).Select(x => new { Orig<strong>in</strong>al=x, Square=x*x }).OrderBy(x => x.Square).ThenBy(x => x.Orig<strong>in</strong>al);foreach (var element <strong>in</strong> collection){Console.WriteL<strong>in</strong>e(element);}Notice how aside from the call to Enumerable.Range (which isn’t as clear as us<strong>in</strong>g ourown Range class) the code reads almost exactly like the textual description. Thistime I’ve decided to let the anonymous type’s ToStr<strong>in</strong>g implementation do the formatt<strong>in</strong>g,and here are the results:{ Orig<strong>in</strong>al = 0, Square = 0 }{ Orig<strong>in</strong>al = -1, Square = 1 }{ Orig<strong>in</strong>al = 1, Square = 1 }{ Orig<strong>in</strong>al = -2, Square = 4 }{ Orig<strong>in</strong>al = 2, Square = 4 }{ Orig<strong>in</strong>al = -3, Square = 9 }{ Orig<strong>in</strong>al = 3, Square = 9 }{ Orig<strong>in</strong>al = -4, Square = 16 }{ Orig<strong>in</strong>al = 4, Square = 16 }{ Orig<strong>in</strong>al = -5, Square = 25 }{ Orig<strong>in</strong>al = 5, Square = 25 }As <strong>in</strong>tended, the “ma<strong>in</strong>” sort<strong>in</strong>g property is Square—but for two values that both havethe same square, the negative orig<strong>in</strong>al number is always sorted before the positiveone. Writ<strong>in</strong>g a s<strong>in</strong>gle comparison to do the same k<strong>in</strong>d of th<strong>in</strong>g (<strong>in</strong> a general case—there are mathematical tricks to cope with this particular example) would have beensignificantly more complicated, to the extent that you wouldn’t want to <strong>in</strong>clude thecode “<strong>in</strong>l<strong>in</strong>e” <strong>in</strong> the lambda expression.We’ve seen just a few of the many extension methods available <strong>in</strong> Enumerable, buthopefully you can appreciate how neatly they can be cha<strong>in</strong>ed together. In the nextchapter we’ll see how this can be expressed <strong>in</strong> a different way us<strong>in</strong>g extra syntax providedby <strong>C#</strong> 3 (query expressions)—as well as some other operations we haven’t coveredhere. It’s worth remember<strong>in</strong>g that you don’t have to use query expressions,though—often it can be simpler to make a couple of calls to methods <strong>in</strong> Enumerable,us<strong>in</strong>g extension methods to cha<strong>in</strong> operations together.Licensed to Rhona Hadida


Extension methods <strong>in</strong> .NET 3.5269Now that we’ve seen how all these apply to our “collection of numbers” example, it’stime for me to make good on the promise of some more bus<strong>in</strong>ess-related situations.10.3.5 Bus<strong>in</strong>ess examples <strong>in</strong>volv<strong>in</strong>g cha<strong>in</strong><strong>in</strong>gMuch of what we do as developers <strong>in</strong>volves mov<strong>in</strong>g data around. In fact, for manyapplications that’s the only mean<strong>in</strong>gful th<strong>in</strong>g we do—the user <strong>in</strong>terface, web services,database, and other components often exist solely to get data from one placeto another, or from one form <strong>in</strong>to another. It should be of no surprise that theextension methods we’ve looked at <strong>in</strong> this section are well suited to many bus<strong>in</strong>essproblems. I’ll just give a couple of examples, as I’m sure you’ll be able to take themas a spr<strong>in</strong>gboard <strong>in</strong>to th<strong>in</strong>k<strong>in</strong>g about your bus<strong>in</strong>ess requirements and how <strong>C#</strong> 3 andthe Enumerable class can help you solve problems more expressively than before.For each example I’ll only <strong>in</strong>clude a sample query—it should be enough to understandthe purpose of the code, but without all the baggage. Full work<strong>in</strong>g code is onthe book’s website.AGGREGATION: SUMMING SALARIESThe first example <strong>in</strong>volves a company comprised of several departments. Each departmenthas a number of employees, each of whom has a salary. Suppose we want toreport on total salary cost by department, with the most expensive department first.The query is simplycompany.Departments.Select(dept => new{dept.Name,Cost=dept.Employees.Sum (person => person.Salary)}).OrderByDescend<strong>in</strong>g (deptWithCost => deptWithCost.Cost);This query uses an anonymous type to keep the department name (us<strong>in</strong>g a projection<strong>in</strong>itializer) and the sum of the salaries of all the employees with<strong>in</strong> that department.The salary summation uses a self-explanatory Sum extension method, aga<strong>in</strong> part ofEnumerable. In the result, the department name and total salary can be retrieved asproperties. If you wanted the orig<strong>in</strong>al department reference, you’d just need tochange the anonymous type used <strong>in</strong> the Select method.GROUPING: COUNTING BUGS ASSIGNED TO DEVELOPERSIf you’re a professional developer, I’m sure you’ve seen many project managementtools giv<strong>in</strong>g you different metrics. If you have access to the raw data, LINQ can helpyou transform it <strong>in</strong> practically any way you choose. As a simple example, we could lookat a list of developers and how many bugs they have assigned to them at the moment:bugs.GroupBy(bug => bug.AssignedTo).Select(list => new { Developer=list.Key, Count=list.Count() }).OrderByDescend<strong>in</strong>g (x => x.Count);This query uses the GroupBy extension method, which groups the orig<strong>in</strong>al collectionby a projection (the developer assigned to fix the bug <strong>in</strong> this case), result<strong>in</strong>g <strong>in</strong> anLicensed to Rhona Hadida


270 CHAPTER 10 Extension methodsIGroup<strong>in</strong>g. There are many overloads of GroupBy, but I’ve used thesimplest one here and then selected just the key (the name of the developer) and thenumber of bugs assigned to them. After that we’ve just ordered the result to show thedevelopers with the most bugs first.One of the problems when look<strong>in</strong>g at the Enumerable class can be work<strong>in</strong>g outexactly what’s go<strong>in</strong>g on—one of the overloads of GroupBy has four type parametersand five “normal” parameters (three of which are delegates), for <strong>in</strong>stance. Don’tpanic, though—just follow the steps shown <strong>in</strong> chapter 3, assign<strong>in</strong>g different types todifferent type parameters until you’ve got a concrete example of what the methodwould look like. That usually makes it a lot easier to understand what’s go<strong>in</strong>g on.We’ll use the example of defect track<strong>in</strong>g as our sample data when we look at queryexpressions <strong>in</strong> the next chapter.These examples aren’t particularly <strong>in</strong>volved ones, but I hope you can see thepower of cha<strong>in</strong><strong>in</strong>g method calls together, where each method takes an orig<strong>in</strong>al collectionand returns another one <strong>in</strong> some form or other, whether by filter<strong>in</strong>g out somevalues, order<strong>in</strong>g them, transform<strong>in</strong>g each element, aggregat<strong>in</strong>g some values, or manyother options. In many cases, the result<strong>in</strong>g code can be read aloud and understoodimmediately—and <strong>in</strong> other situations it’s still usually a lot simpler than the equivalentcode would have been <strong>in</strong> previous versions of <strong>C#</strong>.Now that we’ve seen some of the extension methods provided for us, we’ll considerjust how and when it makes sense for you to write them yourself.10.4 Usage ideas and guidel<strong>in</strong>esLike implicit typ<strong>in</strong>g of local variables, extension methods are controversial. It wouldbe hard to claim that they make the overall aim of the code harder to understand <strong>in</strong>many cases, but at the same time they do obscure the details of what method is gett<strong>in</strong>gcalled. In the words of one of the lecturers at my university, “I’m hid<strong>in</strong>g the truth <strong>in</strong>order to show you a bigger truth”—if you believe that the most important aspect of thecode is its result, extension methods are great. If the implementation is more importantto you, then explicitly call<strong>in</strong>g a static method is clearer. Effectively, it’s the differencebetween the “what” and the “how.”We’ve already looked at us<strong>in</strong>g extension methods for utility classes and methodcha<strong>in</strong><strong>in</strong>g, but before we discuss the pros and cons further, it’s worth call<strong>in</strong>g out a coupleof aspects of this that may not be obvious.10.4.1 “Extend<strong>in</strong>g the world” and mak<strong>in</strong>g <strong>in</strong>terfaces richerWes Dyer, a former developer on the <strong>C#</strong> compiler team, has a fantastic blog 2 cover<strong>in</strong>gall k<strong>in</strong>ds of subject matter. One of his posts about extension methods 3 particularlycaught my attention. It’s called “Extend<strong>in</strong>g the World,” and it talks about how2http://blogs.msdn.com/wesdyer3http://blogs.msdn.com/wesdyer/archive/2007/03/09/extend<strong>in</strong>g-the-world.aspxLicensed to Rhona Hadida


Usage ideas and guidel<strong>in</strong>es271extension methods can make code easier to read by effectively adapt<strong>in</strong>g your environmentto your needs:Typically for a given problem, a programmer is accustomed to build<strong>in</strong>g upa solution until it f<strong>in</strong>ally meets the requirements. Now, it is possible toextend the world to meet the solution <strong>in</strong>stead of solely just build<strong>in</strong>g up untilwe get to it. That library doesn’t provide what you need, just extend thelibrary to meet your needs.This has implications beyond situations where you’d use a utility class. Typically developersonly start creat<strong>in</strong>g utility classes when they’ve seen the same k<strong>in</strong>d of code reproduced<strong>in</strong> dozens of places—but extend<strong>in</strong>g a library is about clarity of expression asmuch as avoid<strong>in</strong>g duplication. Extension methods can make the call<strong>in</strong>g code feel likethe library is richer than it really is.We’ve already seen this with IEnumerable, where even the simplest implementationappears to have a wide set of operations available, such as sort<strong>in</strong>g, group<strong>in</strong>g, projection,and filter<strong>in</strong>g. Of course, the benefits aren’t limited to <strong>in</strong>terfaces—you can also“extend the world” with enums, abstract classes, and so forth.The .NET Framework also provides a good example of another use for extensionmethods: fluent <strong>in</strong>terfaces.10.4.2 Fluent <strong>in</strong>terfacesThere used to be a television program <strong>in</strong> the United K<strong>in</strong>gdom called Catchphrase. Theidea was that contestants would watch a screen where an animation would show somecryptic version of a phrase or say<strong>in</strong>g, which they’d have to guess. The host would oftentry to help by <strong>in</strong>struct<strong>in</strong>g them: “Say what you see.” That’s pretty much the ideabeh<strong>in</strong>d fluent <strong>in</strong>terfaces—that if you read the code verbatim, its purpose will leap offthe screen as if it were written <strong>in</strong> a natural human language. The term was orig<strong>in</strong>allyco<strong>in</strong>ed by Mart<strong>in</strong> Fowler 4 and Eric Evans. If you’re familiar with doma<strong>in</strong> specific languages(DSLs), you may be wonder<strong>in</strong>g what the differences are between a fluent <strong>in</strong>terfaceand a DSL. A lot has been written on the subject, but the consensus seems to bethat a DSL has more freedom to create its own syntax and grammar, whereas a fluent<strong>in</strong>terface is constra<strong>in</strong>ed by the “host” language (<strong>C#</strong> <strong>in</strong> our case).A good example of a fluent <strong>in</strong>terface <strong>in</strong> the framework is the OrderBy and ThenBymethods: with a bit of <strong>in</strong>terpretation of lambda expressions, the code expla<strong>in</strong>s exactlywhat it does. In the case of our numbers example earlier, we could read “order by thesquare, then by the orig<strong>in</strong>al number” without much work. Statements end up read<strong>in</strong>gas whole sentences rather than just <strong>in</strong>dividual noun-verb phrases.Writ<strong>in</strong>g fluent <strong>in</strong>terfaces can require a change of m<strong>in</strong>d-set. Method names defy thenormal “descriptive verb” form, with “And,” “Then,” and “If” sometimes be<strong>in</strong>g suitablemethods <strong>in</strong> a fluent <strong>in</strong>terface. The methods themselves often do little more thansett<strong>in</strong>g up context for future calls, often return<strong>in</strong>g a type whose sole purpose is to actas a bridge between calls. Figure 10.2 gives an example of how this “bridg<strong>in</strong>g” works. It4http://www.mart<strong>in</strong>fowler.com/bliki/FluentInterface.htmlLicensed to Rhona Hadida


272 CHAPTER 10 Extension methodsReturns SoloMeet<strong>in</strong>gMeet<strong>in</strong>g.Between("Jon").And("Russell")Returns UntimedMeet<strong>in</strong>g.At(8.OClock().Tomorrow())Returns Meet<strong>in</strong>gReturns TimeSpanReturns DateTimeFigure 10.2 Pull<strong>in</strong>g apart a fluent <strong>in</strong>terface expressionto create a meet<strong>in</strong>g. The time of the meet<strong>in</strong>g is specifiedus<strong>in</strong>g extension methods to create a TimeSpan from an<strong>in</strong>t, and a DateTime from a TimeSpan.only uses two extension methods (on <strong>in</strong>t and TimeSpan), but they make all the differenceto the readability.The grammar of the example <strong>in</strong> figure 10.2 could have many different forms:you may be able to add additional attendees to an UntimedMeet<strong>in</strong>g, or create anUnattendedMeet<strong>in</strong>g at a particular time before specify<strong>in</strong>g the attendees, for <strong>in</strong>stance.Anders Norås has a full example on his blog 5 that can be a useful start<strong>in</strong>g po<strong>in</strong>twhen you’re plann<strong>in</strong>g a fluent <strong>in</strong>terface.<strong>C#</strong> 3 only supports extension methods rather than extension properties, which restrictsfluent <strong>in</strong>terfaces slightly—it means we can’t have expressions such as 1.week.from.nowor 2.days + 10.hours (which are both valid <strong>in</strong> Groovy with an appropriate package 6 ) butwith a few superfluous parentheses we can have achieve similar results. At first it looksodd to call a method on a number (such as 2.Dollars() or 3.Meters ()), but it’s hardto deny that the mean<strong>in</strong>g is clear. Without extension methods, this sort of clarity simplyisn’t possible when you need to act on types like numbers that aren’t under your control.At the time of this writ<strong>in</strong>g, the development community is still on the fence aboutfluent <strong>in</strong>terfaces: they’re relatively rare <strong>in</strong> most fields, although many mock<strong>in</strong>g librariesused for unit test<strong>in</strong>g have at least some fluent aspects. They’re certa<strong>in</strong>ly not universallyapplicable, but <strong>in</strong> the right situations they can radically transform the readabilityof the call<strong>in</strong>g code.These aren’t the only uses available for extension methods, of course—you maywell discover someth<strong>in</strong>g new and wonderful that makes the world a generally betterplace us<strong>in</strong>g extension methods. I constantly f<strong>in</strong>d it amaz<strong>in</strong>g how such a simple littlefeature can have such a profound impact on readability when used appropriately. Thekey word there is “appropriately.”10.4.3 Us<strong>in</strong>g extension methods sensiblyI’m <strong>in</strong> no position to dictate how you write your code. It may be possible to write teststo objectively measure readability for an “average” developer, but it only matters for5http://andersnoras.com/blogs/anoras/archive/2007/07/09/beh<strong>in</strong>d-the-scenes-of-the-plann<strong>in</strong>g-dsl.aspx6http://groovy.codehaus.org/Google+Data+SupportLicensed to Rhona Hadida


Usage ideas and guidel<strong>in</strong>es273those who are go<strong>in</strong>g to use and ma<strong>in</strong>ta<strong>in</strong> your code. So, you need to consult with therelevant people as far as you can: this depends on your type of project and its audience,of course, but it’s nice to present different options and get appropriate feedback.Extension methods make this particularly easy <strong>in</strong> many cases, as you candemonstrate both options <strong>in</strong> work<strong>in</strong>g code simultaneously—turn<strong>in</strong>g a method <strong>in</strong>to anextension method doesn’t stop you from call<strong>in</strong>g it explicitly <strong>in</strong> the same way as before.The ma<strong>in</strong> question to ask is the one I referred to at the start of this section: is the“what does it do” of the code more important than the “how does it do it?” That variesby person and situation, but here are some guidel<strong>in</strong>es to bear <strong>in</strong> m<strong>in</strong>d:■■■■■■■■■Everyone on the development team should be aware of extension methods andwhere they might be used. Where possible, avoid surpris<strong>in</strong>g code ma<strong>in</strong>ta<strong>in</strong>ers.By putt<strong>in</strong>g extensions <strong>in</strong> their own namespace, you make it hard to use them accidentally.Even if it’s not obvious when read<strong>in</strong>g the code, the developer writ<strong>in</strong>g itshould at least be aware of what she’s do<strong>in</strong>g. Use a projectwide or companywideconvention for nam<strong>in</strong>g the namespace. You may choose to take this one step furtherand use a s<strong>in</strong>gle namespace for each extended type. For <strong>in</strong>stance, you couldcreate a TypeExtensions namespace for classes that extend System.Type.The decision to write an extension method should always be a conscious one. Itshouldn’t become habitual—certa<strong>in</strong>ly not every static method deserves to be anextension method.An extension method is reasonably valid if it’s applicable to all <strong>in</strong>stances of theextended type. If it’s only appropriate <strong>in</strong> certa<strong>in</strong> situations, I’d make it clearthat the method is not part of the type by leav<strong>in</strong>g it as a “normal” static method.Document whether or not the first parameter (the value your method appearsto be called on) is allowed to be null—if it’s not, check the value <strong>in</strong> the methodand throw an exception if necessary.Be careful not to use a method name that already has a mean<strong>in</strong>g <strong>in</strong> theextended type. If the extended type is a framework type or comes from a thirdpartylibrary, check all your extended method names whenever you change versionsof the library.Question your <strong>in</strong>st<strong>in</strong>cts, but acknowledge that they affect your productivity. Justlike with implicit typ<strong>in</strong>g, there’s little po<strong>in</strong>t <strong>in</strong> forc<strong>in</strong>g yourself to use a featureyou <strong>in</strong>st<strong>in</strong>ctively dislike.Try to group extension methods <strong>in</strong>to static classes deal<strong>in</strong>g with the sameextended type. Sometimes related classes (such as DateTime and TimeSpan) canbe sensibly grouped together, but avoid group<strong>in</strong>g extension methods target<strong>in</strong>gdisparate types such as Stream and str<strong>in</strong>g with<strong>in</strong> the same class.Th<strong>in</strong>k really carefully before add<strong>in</strong>g extension methods with the same extendedtype and same name <strong>in</strong> two different namespaces, particularly if there are situationswhere the different methods may both be applicable (they have the samenumber of parameters). It’s reasonable for add<strong>in</strong>g or remov<strong>in</strong>g a us<strong>in</strong>g directiveto make a program fail to build, but it’s nasty if it still builds but changesthe behavior.Licensed to Rhona Hadida


274 CHAPTER 10 Extension methodsFew of these guidel<strong>in</strong>es are particularly clear-cut—to some extent you’ll have to feelyour own way to the best use or avoidance of extension methods. It’s perfectly reasonableto never write your own extension methods at all but still use the LINQ-relatedones for the readability ga<strong>in</strong>s available there. It’s worth at least th<strong>in</strong>k<strong>in</strong>g about what’spossible, though.10.5 SummaryThe mechanical aspect of extension methods is straightforward—it’s a simple featureto describe and demonstrate. The benefits (and costs) of them are harder to talkabout <strong>in</strong> a def<strong>in</strong>itive manner—it’s a touchy-feely topic, and different people arebound to have different views on the value provided.In this chapter I’ve tried to show a bit of everyth<strong>in</strong>g—early on we looked at whatthe feature achieves <strong>in</strong> the language, before we saw some of the capabilities availablethrough the framework. In some ways, this was a relatively gentle <strong>in</strong>troduction toLINQ: we’ll be revisit<strong>in</strong>g some of the extension methods we’ve seen so far when wedelve <strong>in</strong>to query expressions <strong>in</strong> the next chapter, as well as see<strong>in</strong>g some new ones.A wide variety of methods are available with<strong>in</strong> the Enumerable class, and we’ve onlyscratched the surface. An exhaustive description with examples of all the methodswould take most of a book on its own, and it’d become rather dull. It’s much more<strong>in</strong>terest<strong>in</strong>g to come up with a scenario of your own devis<strong>in</strong>g (whether hypothetical or<strong>in</strong> a real project) and browse through MSDN to see what’s available to help you. I urgeyou to use a sandbox project of some description to play with the extension methodsprovided—it does feel like play rather than work, and you’re unlikely to want to constra<strong>in</strong>yourself to just look<strong>in</strong>g at what you need to achieve your most immediate goal.The appendix has a list of the standard query operators from LINQ, which coversmany of the methods with<strong>in</strong> Enumerable.New patterns and practices keep emerg<strong>in</strong>g <strong>in</strong> software eng<strong>in</strong>eer<strong>in</strong>g, and ideasfrom some systems often cross-poll<strong>in</strong>ate to others. That’s one of the th<strong>in</strong>gs that keepsdevelopment excit<strong>in</strong>g. Extension methods allow code to be written <strong>in</strong> a way which waspreviously unavailable <strong>in</strong> <strong>C#</strong>, creat<strong>in</strong>g fluent <strong>in</strong>terfaces and chang<strong>in</strong>g the environmentto suit our code rather than the other way around. Those are just the techniques we’velooked at <strong>in</strong> this chapter—there are bound to be <strong>in</strong>terest<strong>in</strong>g future developmentsus<strong>in</strong>g the new <strong>C#</strong> features, whether <strong>in</strong>dividually or comb<strong>in</strong>ed.The revolution obviously doesn’t end here, however. For a few calls, extensionmethods are f<strong>in</strong>e. In our next chapter we look at the real power tools: query expressionsand full-blown LINQ.Licensed to Rhona Hadida


Query expressionsand LINQ to ObjectsThis chapter covers■■■■■■■Stream<strong>in</strong>g sequences of dataDeferred executionStandard query operatorsQuery expression translationRange variables and transparent identifiersProject<strong>in</strong>g, filter<strong>in</strong>g, and sort<strong>in</strong>gJo<strong>in</strong><strong>in</strong>g and group<strong>in</strong>gYou may well be tired of all the hyperbole around LINQ by now. We’ve seen someexamples <strong>in</strong> chapters 1 and 3, and you’ve almost certa<strong>in</strong>ly read some examples andarticles on the Web. This is where we separate myth from reality:■■■LINQ isn’t go<strong>in</strong>g to turn the most complicated query <strong>in</strong>to a one-l<strong>in</strong>er.LINQ isn’t go<strong>in</strong>g to mean you never need to look at raw SQL aga<strong>in</strong>.LINQ isn’t go<strong>in</strong>g to magically imbue you with architectural genius.275Licensed to Rhona Hadida


276 CHAPTER 11 Query expressions and LINQ to ObjectsGiven all that, LINQ is still go<strong>in</strong>g to change how most of us th<strong>in</strong>k about code. It’s not asilver bullet, but it’s a very powerful tool to have <strong>in</strong> your development armory. We’llexplore two dist<strong>in</strong>ct aspects of LINQ: the framework support, and the compiler translationof query expressions. The latter can look odd to start with, but I’m sure you’lllearn to love them.Query expressions are effectively “preprocessed” by the compiler <strong>in</strong>to “normal”<strong>C#</strong> 3, which is then compiled <strong>in</strong> a perfectly ord<strong>in</strong>ary way. This is a neat way of <strong>in</strong>tegrat<strong>in</strong>gqueries <strong>in</strong>to the language without chang<strong>in</strong>g its semantics all over the place:it’s syntactic sugar <strong>in</strong> its purest form. Most of this chapter is a list of the preprocess<strong>in</strong>gtranslations performed by the compiler, as well as the effects achieved when theresult uses the Enumerable extension methods.You won’t see any SQL or XML here—all that awaits us <strong>in</strong> chapter 12. However, withthis chapter as a foundation you should be able to understand what the more excit<strong>in</strong>gLINQ providers do when we meet them. Call me a spoilsport, but I want to take awaysome of their magic. Even without the air of mystery, LINQ is still very cool.First let’s consider what LINQ is <strong>in</strong> the first place, and how we’re go<strong>in</strong>g to explore it.11.1 Introduc<strong>in</strong>g LINQA topic as large as LINQ needs a certa<strong>in</strong> amount of background before we’re ready tosee it <strong>in</strong> action. In this section we’ll look at what LINQ is (as far as we can discern it), afew of the core pr<strong>in</strong>ciples beh<strong>in</strong>d it, and the data model we’re go<strong>in</strong>g to use for all theexamples <strong>in</strong> this chapter and the next. I know you’re likely to be itch<strong>in</strong>g to get <strong>in</strong>tothe code, so I’ll keep it fairly brief. Let’s start with an issue you’d th<strong>in</strong>k would be quitestraightforward: what counts as LINQ?11.1.1 What’s <strong>in</strong> a name?LINQ has suffered from the same problem that .NET had early <strong>in</strong> its life: it’s a termthat has never been precisely def<strong>in</strong>ed. We know what it stands for: Language INtegratedQuery—but that doesn’t actually help much. LINQ is fundamentally about datamanipulation: it’s a means of access<strong>in</strong>g data from different sources <strong>in</strong> a unified manner.It allows you to express logic about that manipulation <strong>in</strong> the language of yourchoice—<strong>C#</strong> <strong>in</strong> our case—even when the logic needs to be executed <strong>in</strong> an entirely differentenvironment. It also attempts to elim<strong>in</strong>ate (or at least vastly reduce) the impedancemismatch 1 —the difficulties <strong>in</strong>troduced when you need to <strong>in</strong>tegrate two environmentswith very different data models, such as the object-oriented model of .NET and therelational data model of SQL.All of the <strong>C#</strong> 3 features we’ve seen so far (with the exception of partial methodsand automatic properties) contribute to the LINQ story, so how many of these do youneed to use before you can consider yourself to be us<strong>in</strong>g LINQ? Do you need to beus<strong>in</strong>g query expressions, which are the f<strong>in</strong>al genu<strong>in</strong>e language feature <strong>in</strong> <strong>C#</strong> 3? If you use1http://en.wikipedia.org/wiki/Object-Relational_impedance_mismatch expla<strong>in</strong>s this for the specific objectrelationalcase, but impedance mismatches are present <strong>in</strong> other situations, such as XML representations andweb services.Licensed to Rhona Hadida


Introduc<strong>in</strong>g LINQ277the extension methods of Enumerable but do it without query expressions, does thatcount? What about pla<strong>in</strong> lambda expressions without any query<strong>in</strong>g go<strong>in</strong>g on?As a concrete example of this question, let’s consider four slightly different ways ofachiev<strong>in</strong>g the same goal. Suppose we have a List (where Person has propertiesName and Age as <strong>in</strong> chapter 8) and we wish to apply a transformation so that we geta sequence of str<strong>in</strong>gs, just the names of the people <strong>in</strong> the list. Us<strong>in</strong>g lambda expressions<strong>in</strong> preference to anonymous methods, we can still use the standard ConvertAll methodof List, as follows:var names = people.ConvertAll(p => p.Name);Alternatively we can use the Select method of the Enumerable class, because Listimplements IEnumerable. The result of this will be an IEnumerablerather than a List, but <strong>in</strong> many cases that’s f<strong>in</strong>e. The code becomesvar names = Enumerable.Select(people, p => p.Name);Know<strong>in</strong>g that Select is an extension method, we can simplify th<strong>in</strong>gs a little bit:var names = people.Select(p => p.Name);F<strong>in</strong>ally, we could use a query expression:var names = from p <strong>in</strong> people select p.Name;Four one-l<strong>in</strong>ers, all of which accomplish much the same goal. 2 Which of them count asLINQ? My personal answer is that the first isn’t really us<strong>in</strong>g LINQ (even though it uses alambda expression), but the rest are LINQ-based solutions. The second form certa<strong>in</strong>lyisn’t idiomatic, but the last two are both perfectly respectable LINQ ways of achiev<strong>in</strong>gthe same aim, and <strong>in</strong> fact all of the last three samples compile to the same IL.In the end, there are no bonus po<strong>in</strong>ts available for us<strong>in</strong>g LINQ, so the question ismoot. However, it’s worth be<strong>in</strong>g aware that when a fellow developer says they’re us<strong>in</strong>gLINQ to solve a particular problem, that statement could have a variety of mean<strong>in</strong>gs.The long and the short of it is that LINQ is a collection of technologies, <strong>in</strong>clud<strong>in</strong>gthe language features of <strong>C#</strong> 3 (and VB9) along with the framework libraries provided aspart of .NET 3.5. If your project targets .NET 3.5, you can use LINQ as much or as littleas you like. You can use just the support for <strong>in</strong>-memory query<strong>in</strong>g, otherwise known asLINQ to Objects, or providers that target XML documents, relational databases, or otherdata sources. The only provider we’ll use <strong>in</strong> this chapter is LINQ to Objects, and <strong>in</strong> thenext chapter we’ll see how the same concepts apply to the other providers.There are a few concepts that are vital to LINQ. We’ve seen them tangentially <strong>in</strong>chapter 10, but let’s look at them a little more closely.11.1.2 Fundamental concepts <strong>in</strong> LINQMost of this chapter is dedicated to exactly what the <strong>C#</strong> 3 compiler does with queryexpressions, but it won’t make much sense until we have a better understand<strong>in</strong>g of the2There’s actually a big difference between ConvertAll and Select, as we’ll see <strong>in</strong> a m<strong>in</strong>ute, but <strong>in</strong> many caseseither could be used.Licensed to Rhona Hadida


278 CHAPTER 11 Query expressions and LINQ to Objectsideas underly<strong>in</strong>g LINQ as a whole. One of the problems with reduc<strong>in</strong>g the impedancemismatch between two data models is that it usually <strong>in</strong>volves creat<strong>in</strong>g yet anothermodel to act as the bridge. This section describes the LINQ model, beg<strong>in</strong>n<strong>in</strong>g with itsmost important aspect: sequences.SEQUENCESYou’re almost certa<strong>in</strong>ly familiar with the concept of a sequence: it’s encapsulated by theIEnumerable and IEnumerable <strong>in</strong>terfaces, and we’ve already looked at those fairlyclosely <strong>in</strong> chapter 6 when we studied iterators. Indeed, <strong>in</strong> many ways a sequence is justa slightly more abstract way of th<strong>in</strong>k<strong>in</strong>g of an iterator. A sequence is like a conveyor beltof items—you fetch them one at a time until either you’re no longer <strong>in</strong>terested or thesequence has run out of data. There are three other fairly obvious examples: a Streamrepresents a sequence of bytes, a TextReader represents a sequence of characters, anda DataReader represents a sequence of rows from a database. 3The key difference between a sequence and other collection data structures suchas lists and arrays is that when you’re read<strong>in</strong>g from a sequence, you don’t generallyknow how many more items are wait<strong>in</strong>g, or have access to arbitrary items—just thecurrent one. Indeed, some sequences could be never-end<strong>in</strong>g: you could easily have an<strong>in</strong>f<strong>in</strong>ite sequence of random numbers, for example. Only one piece of data is providedat a time by the sequence, so you can implement an <strong>in</strong>f<strong>in</strong>ite sequence withouthav<strong>in</strong>g <strong>in</strong>f<strong>in</strong>ite storage. Lists and arrays can act as sequences, of course—just asList implements IEnumerable—but the reverse isn’t always true. You can’thave an <strong>in</strong>f<strong>in</strong>ite array or list, for example.Sequences are the bread and butter of LINQ. When you read a queryexpression, it’s really helpful to th<strong>in</strong>k of the sequences <strong>in</strong>volved: there’salways at least one sequence to start with, and it’s usually transformed<strong>in</strong>to other sequences along the way, possibly be<strong>in</strong>g jo<strong>in</strong>ed with yet moresequences. We’ll see examples of this as we go further <strong>in</strong>to the chapter,but I can’t emphasize enough how important it is. Examples of LINQqueries are frequently provided on the Web with very little explanation:when you take them apart by look<strong>in</strong>g at the sequences <strong>in</strong>volved,th<strong>in</strong>gs make a lot more sense. As well as be<strong>in</strong>g an aid to read<strong>in</strong>g code, it can also helpa lot when writ<strong>in</strong>g it. Th<strong>in</strong>k<strong>in</strong>g <strong>in</strong> sequences can be tricky—it’s a bit of a mental leapsometimes—but if you can get there, it will help you immeasurably when you’rework<strong>in</strong>g with LINQ.As a simple example, let’s take another query expression runn<strong>in</strong>g aga<strong>in</strong>st a list ofpeople. We’ll apply the same transformation as before, but with a filter <strong>in</strong>volved thatkeeps only adults <strong>in</strong> the result<strong>in</strong>g sequence:Th<strong>in</strong>k <strong>in</strong>sequences!var adultNames = from person <strong>in</strong> peoplewhere person.Age >= 18select person.Name;3Interest<strong>in</strong>gly, none of these actually implements IEnumerable.Licensed to Rhona Hadida


Introduc<strong>in</strong>g LINQ279Figure 11.1 shows this query expressiongraphically, break<strong>in</strong>g it down <strong>in</strong>toits <strong>in</strong>dividual steps. I’ve <strong>in</strong>cluded anumber of similar figures <strong>in</strong> this chapter,but unfortunately for complicatedqueries there is simply not enoughroom on the pr<strong>in</strong>ted page to show asmuch data as we might like. Moredetailed diagrams are available on thebook’s website.Each arrow represents a sequence—the description is on the left side, andsome sample data is on the right. Eachbox is a transformation from our queryexpression. Initially, we have the wholefamily (as Person objects); then afterfilter<strong>in</strong>g, the sequence only conta<strong>in</strong>sadults (aga<strong>in</strong>, as Person objects); andthe f<strong>in</strong>al result has the names of thoseadults as str<strong>in</strong>gs. Each step simply takesone sequence and applies an operationAll "Person" objects<strong>in</strong> "people"All "Person" objects withan age of at least 18Names of people withan age of at least 18from person <strong>in</strong> peoplewhere person.Age >= 18select person.Name(Result of query)Name="Holly", Age=31Name="Tom", Age=4Name="Jon", Age=31Name="William", Age=1Name="Rob<strong>in</strong>", Age=1Name="Holly", Age=31Name="Jon", Age=31"Holly""Jon"Figure 11.1 A simple query expression broken down<strong>in</strong>to the sequences and transformations <strong>in</strong>volvedto produce a new sequence. The result isn’t the str<strong>in</strong>gs “Holly” and “Jon” —<strong>in</strong>stead, it’san IEnumerable, which, when asked for its elements one by one, will first yield“Holly” and then “Jon.”This example was straightforward to start with, but we’ll apply the same techniquelater to more complicated query expressions <strong>in</strong> order to understand them more easily.Some advanced operations <strong>in</strong>volve more than one sequence as <strong>in</strong>put, but it’s still a lotless to worry about than try<strong>in</strong>g to understand the whole query <strong>in</strong> one go.So, why are sequences so important? They’re the basis for a stream<strong>in</strong>g model fordata handl<strong>in</strong>g—one that allows us to process data only when we need to.DEFERRED EXECUTION AND STREAMINGWhen the query expression shown <strong>in</strong> figure 11.1 is created, no data is processed. Theorig<strong>in</strong>al list of people isn’t accessed at all. Instead, a representation of the query is builtup <strong>in</strong> memory. Delegate <strong>in</strong>stances are used to represent the predicate test<strong>in</strong>g for adulthoodand the conversion from a person to that person’s name. It’s only when the result<strong>in</strong>gIEnumerable is asked for its first element that the wheels start turn<strong>in</strong>g.This aspect of LINQ is called deferred execution. When the first element of the resultis requested, the Select transformation asks the Where transformation for its first element.The Where transformation asks the list for its first element, checks whether thepredicate matches (which it does <strong>in</strong> this case), and returns that element back toSelect. That <strong>in</strong> turn extracts the name and returns it as the result.That’s all a bit of a mouthful, but a sequence diagram makes it all much clearer.I’m go<strong>in</strong>g to collapse the calls to MoveNext and Current to a s<strong>in</strong>gle fetch operation: itLicensed to Rhona Hadida


280 CHAPTER 11 Query expressions and LINQ to Objectsmakes the diagram a lot simpler. Just remember that each time the fetch occurs, it’seffectively check<strong>in</strong>g for the end of the sequence as well. Figure 11.2 shows the first fewstages of our sample query expression <strong>in</strong> operation, when we pr<strong>in</strong>t out each elementof the result us<strong>in</strong>g a foreach loop.As you can see <strong>in</strong> figure 11.2, only one element of data is processed at a time. If wedecided to stop pr<strong>in</strong>t<strong>in</strong>g output after writ<strong>in</strong>g “Holly,” we would never execute any ofthe operations on the other elements of the orig<strong>in</strong>al sequence. Although severalstages are <strong>in</strong>volved here, process<strong>in</strong>g data <strong>in</strong> a stream<strong>in</strong>g manner like this is efficient andCaller(foreach)Select Where Listfetchfetchfetchreturn {"Holly", 31}return {"Holly", 31}Check: Age >= 18? Yesreturn "Holly"Transform:{"Holly", 31} =>"Holly”Pr<strong>in</strong>t "Holly"fetchfetchfetchreturn {"Tom", 4}Check: Age >= 18? Nofetchreturn {"Jon", 31}return {"Jon", 31}Check: Age >= 18? Yesreturn "Jon"Transform:{"Jon", 31} =>"Jon"Pr<strong>in</strong>t "Jon"(and so on)Figure 11.2Sequence diagram of the execution of a query expressionLicensed to Rhona Hadida


Introduc<strong>in</strong>g LINQ281flexible. In particular, regardless of how much source data there is, you don’t need toknow about more than one element of it at any one po<strong>in</strong>t <strong>in</strong> time. That’s the differencebetween us<strong>in</strong>g List.ConvertAll and Enumerable.Select: the former creates awhole new <strong>in</strong>-memory list, whereas the latter just iterates through the orig<strong>in</strong>alsequence, yield<strong>in</strong>g a s<strong>in</strong>gle converted element at a time.This is a best-case scenario, however. There are times where <strong>in</strong> order to fetch thefirst result of a query, you have to evaluate all of the data from the source. We’vealready seen one example of this <strong>in</strong> the previous chapter: the Enumerable.Reversemethod needs to fetch all the data available <strong>in</strong> order to return the last orig<strong>in</strong>al elementas the first element of the result<strong>in</strong>g sequence. This makes Reverse a buffer<strong>in</strong>goperation—which can have a huge effect on the efficiency (or even feasibility) of youroverall operation. If you can’t afford to have all the data <strong>in</strong> memory at one time, youcan’t use buffer<strong>in</strong>g operations.Just as the stream<strong>in</strong>g aspect depends on which operation you perform, some transformationstake place as soon as you call them, rather than us<strong>in</strong>g deferred execution.This is called immediate execution. Generally speak<strong>in</strong>g, operations that return anothersequence (usually an IEnumerable or IQueryable) use deferred execution,whereas operations that return a s<strong>in</strong>gle value use immediate execution.The operations that are widely available <strong>in</strong> LINQ are known as the standard queryoperators—let’s take a brief look at them now.STANDARD QUERY OPERATORSLINQ’s standard query operators are a collection of transformations that have well-understoodmean<strong>in</strong>gs. LINQ providers are encouraged to implement as many of these operatorsas possible, mak<strong>in</strong>g the implementation obey the expected behavior. This iscrucial <strong>in</strong> provid<strong>in</strong>g a consistent query framework across multiple data sources. Ofcourse, some LINQ providers may expose more functionality, and some of the operatorsmay not map appropriately to the target doma<strong>in</strong> of the provider—but at least theopportunity for consistency is there.<strong>C#</strong> 3 has support for some of the standard query operators built <strong>in</strong>to the languagevia query expressions, but they can always be called manually. You may be <strong>in</strong>terested toknow that VB9 has more of the operators present <strong>in</strong> the language: as ever, there’s atrade-off between the added complexity of <strong>in</strong>clud<strong>in</strong>g a feature <strong>in</strong> the language andthe benefits that feature br<strong>in</strong>gs. Personally I th<strong>in</strong>k the <strong>C#</strong> team has done an admirablejob: I’ve always been a fan of a small language with a large library beh<strong>in</strong>d it.We’ll see some of these operators <strong>in</strong> our examples as we go through this chapter andthe next, but I’m not aim<strong>in</strong>g to give a comprehensive guide to them here: this book isprimarily about <strong>C#</strong>, not the whole of LINQ. You don’t need to know all of the operators<strong>in</strong> order to be productive <strong>in</strong> LINQ, but your experience is likely to grow over time. Theappendix gives a brief description of each of the standard query operators, and MSDNgives more details of each specific overload. When you run <strong>in</strong>to a problem, check thelist: if it feels like there ought to be a built-<strong>in</strong> method to help you, there probably is!Hav<strong>in</strong>g mentioned examples, it’s time to <strong>in</strong>troduce the data model that most ofthe rest of the sample code <strong>in</strong> this chapter will use.Licensed to Rhona Hadida


282 CHAPTER 11 Query expressions and LINQ to Objects11.1.3 Def<strong>in</strong><strong>in</strong>g the sample data modelIn section 10.3.5 I gave a brief example of bug track<strong>in</strong>g as a real use for extensionmethods and lambda expressions. We’ll use the same idea for almost all of the samplecode <strong>in</strong> this chapter—it’s a fairly simple model, but one that can be manipulated <strong>in</strong>many different ways to give useful <strong>in</strong>formation. It’s also a doma<strong>in</strong> that most professionaldevelopers are familiar with, while not <strong>in</strong>volv<strong>in</strong>g frankly tedious relationshipsbetween customers and orders. You can f<strong>in</strong>d further examples us<strong>in</strong>g the same model<strong>in</strong> the downloadable source code, along with detailed comments. Some of these aremore complicated than the ones presented <strong>in</strong> this chapter and provide good exercisesfor understand<strong>in</strong>g larger query expressions.Our fictional sett<strong>in</strong>g is SkeetySoft, a small software company with big ambition. Thefounders have decided to attempt to create an office suite, a media player, and an <strong>in</strong>stantmessag<strong>in</strong>g application. After all, there are no big players <strong>in</strong> those markets, are there?The development department of SkeetySoft consists of five people: two developers(Deborah and Darren), two testers (Tara and Tim), and a manager (Mary). There’scurrently has a s<strong>in</strong>gle customer: Col<strong>in</strong>. The aforementioned products are SkeetyOffice,SkeetyMediaPlayer, and SkeetyTalk, respectively. We’re go<strong>in</strong>g to look at defectslogged dur<strong>in</strong>g August 2007, us<strong>in</strong>g the data model shown <strong>in</strong> figure 11.3.Figure 11.3Class diagram of the SkeetySoft defect data modelLicensed to Rhona Hadida


Simple beg<strong>in</strong>n<strong>in</strong>gs: select<strong>in</strong>g elements283As you can see, we’re not record<strong>in</strong>g an awful lot of data. In particular, there’s no realhistory to the defects, but there’s enough here to let us demonstrate the queryexpression features of <strong>C#</strong> 3. For the purposes of this chapter, all the data is stored <strong>in</strong>memory. We have a class named SampleData with properties AllDefects, AllUsers,AllProjects, and AllSubscriptions, which each return an appropriate type ofIEnumerable. The Start and End properties return DateTime <strong>in</strong>stances for thestart and end of August respectively, and there are nested classes Users andProjects with<strong>in</strong> SampleData to provide easy access to a particular user or project.The one type that may not be immediately obvious is NotificationSubscription:the idea beh<strong>in</strong>d this is to send an email to the specified address every time a defectis created or changed <strong>in</strong> the relevant project.There are 41 defects <strong>in</strong> the sample data, created us<strong>in</strong>g <strong>C#</strong> 3 object <strong>in</strong>itializers. All ofthe code is available on the book’s website, but I won’t <strong>in</strong>clude the sample data itself<strong>in</strong> this chapter.Now that the prelim<strong>in</strong>aries are dealt with, let’s get crack<strong>in</strong>g with some queries!11.2 Simple beg<strong>in</strong>n<strong>in</strong>gs: select<strong>in</strong>g elementsHav<strong>in</strong>g brought up some general LINQ concepts beforehand, I’ll <strong>in</strong>troduce the conceptsthat are specific to <strong>C#</strong> 3 as they arise <strong>in</strong> the course of the rest of the chapter. We’rego<strong>in</strong>g to start with a simple query (even simpler than the ones we’ve seen so far <strong>in</strong> thechapter) and work up to some quite complicated ones, not only build<strong>in</strong>g up your expertiseof what the <strong>C#</strong> 3 compiler is do<strong>in</strong>g, but also teach<strong>in</strong>g you how to read LINQ code.All of our examples will follow the pattern of def<strong>in</strong><strong>in</strong>g a query, and then pr<strong>in</strong>t<strong>in</strong>gthe results to the console. We’re not <strong>in</strong>terested <strong>in</strong> b<strong>in</strong>d<strong>in</strong>g queries to data grids or anyth<strong>in</strong>glike that—it’s all important, but not directly relevant to learn<strong>in</strong>g <strong>C#</strong> 3.We can use a simple expression that just pr<strong>in</strong>ts out all our users as the start<strong>in</strong>gpo<strong>in</strong>t for exam<strong>in</strong><strong>in</strong>g what the compiler is do<strong>in</strong>g beh<strong>in</strong>d the scenes and learn<strong>in</strong>g aboutrange variables.11.2.1 Start<strong>in</strong>g with a source and end<strong>in</strong>g with a selectionEvery query expression <strong>in</strong> <strong>C#</strong> 3 starts off <strong>in</strong> the same way—stat<strong>in</strong>g the source of asequence of data:from element <strong>in</strong> sourceThe element part is just an identifier, with an optional type name before it. Most ofthe time you won’t need the type name, and we won’t have one for our first example.Lots of different th<strong>in</strong>gs can happen after that first clause, but sooner or later youalways end with a select clause or a group clause. We’ll start off with a select clauseto keep th<strong>in</strong>gs nice and simple. The syntax for a select clause is also easy:select expressionThe select clause is known as a projection. Comb<strong>in</strong><strong>in</strong>g the two together and us<strong>in</strong>g themost trivial expression we can th<strong>in</strong>k of gives a simple (and practically useless) query,as shown <strong>in</strong> list<strong>in</strong>g 11.1.Licensed to Rhona Hadida


284 CHAPTER 11 Query expressions and LINQ to ObjectsList<strong>in</strong>g 11.1Trivial query to pr<strong>in</strong>t the list of usersvar query = from user <strong>in</strong> SampleData.AllUsersselect user;foreach (var user <strong>in</strong> query){Console.WriteL<strong>in</strong>e(user);}The query expression is the part highlighted <strong>in</strong> bold. I’ve overridden ToStr<strong>in</strong>g foreach of the entities <strong>in</strong> the model, so the results of list<strong>in</strong>g 11.1 are as follows:User: Tim Trotter (Tester)User: Tara Tutu (Tester)User: Deborah Denton (Developer)User: Darren Dahlia (Developer)User: Mary Malcop (Manager)User: Col<strong>in</strong> Carton (Customer)You may be wonder<strong>in</strong>g how useful this is as an example: after all, we could have justused SampleData.AllUsers directly <strong>in</strong> our foreach statement. However, we’ll use thisquery expression—however trivial—to <strong>in</strong>troduce two new concepts. First we’ll look atthe general nature of the translation process the compiler uses when it encounters aquery expression, and then we’ll discuss range variables.11.2.2 Compiler translations as the basis of query expressionsThe <strong>C#</strong> 3 query expression support is based on the compiler translat<strong>in</strong>g query expressions<strong>in</strong>to “normal” <strong>C#</strong> code. It does this <strong>in</strong> a mechanical manner that doesn’t try tounderstand the code, apply type <strong>in</strong>ference, check the validity of method calls, or anyof the normal bus<strong>in</strong>ess of a compiler. That’s all done later, after the translation. Inmany ways, this first phase can be regarded as a preprocessor step. The compiler translateslist<strong>in</strong>g 11.1 <strong>in</strong>to list<strong>in</strong>g 11.2. before do<strong>in</strong>g the real compilation.List<strong>in</strong>g 11.2The query expression of list<strong>in</strong>g 11.1 translated <strong>in</strong>to a method callvar query = SampleData.AllUsers.Select(user => user);foreach (var user <strong>in</strong> query){Console.WriteL<strong>in</strong>e(user);}The <strong>C#</strong> 3 compiler translates the query expression <strong>in</strong>to exactly that code before properlycompil<strong>in</strong>g it further. In particular, it doesn’t assume that it should use Enumerable.Select, or that List will conta<strong>in</strong> a method called Select. It merely translates thecode and then lets the next phase of compilation deal with f<strong>in</strong>d<strong>in</strong>g an appropriatemethod—whether as a straightforward member or as an extension method. The parametercan be a suitable delegate type or an Expression for an appropriate type T.This is where it’s important that lambda expressions can be converted <strong>in</strong>toboth delegate <strong>in</strong>stances and expression trees. All the examples <strong>in</strong> this chapter willLicensed to Rhona Hadida


Simple beg<strong>in</strong>n<strong>in</strong>gs: select<strong>in</strong>g elements285use delegates, but we’ll see how expression trees are used when we look at the otherLINQ providers <strong>in</strong> chapter 12. When I present the signatures for some of the methodscalled by the compiler later on, remember that these are just the ones called <strong>in</strong>LINQ to Objects—whenever the parameter is a delegate type (which most of themare), the compiler will use a lambda expression as the argument, and then try tof<strong>in</strong>d a method with a suitable signature.It’s also important to remember that wherever a normal variable (such as a localvariable with<strong>in</strong> the method) appears with<strong>in</strong> a lambda expression after translation hasbeen performed, it will become a captured variable <strong>in</strong> the same way that we saw back<strong>in</strong> chapter 5. This is just normal lambda expression behavior—but unless you understandwhich variables will be captured, you could easily be confused by the results ofyour queries.The language specification gives details of the query expression pattern, which mustbe implemented for all query expressions to work, but this isn’t def<strong>in</strong>ed as an <strong>in</strong>terfaceas you might expect. It makes a lot of sense, however: it allows LINQ to be appliedto <strong>in</strong>terfaces such as IEnumerable us<strong>in</strong>g extension methods. This chapter tackleseach element of the query expression pattern, one at a time.List<strong>in</strong>g 11.3 proves how the compiler translation works: it provides a dummyimplementation of both Select and Where, with Select be<strong>in</strong>g a normal <strong>in</strong>stancemethod and Where be<strong>in</strong>g an extension method. Our orig<strong>in</strong>al simple query expressiononly conta<strong>in</strong>ed a select clause, but I’ve <strong>in</strong>cluded the where clause to show both k<strong>in</strong>dsof methods <strong>in</strong> use. Unfortunately, because it requires a top-level class to conta<strong>in</strong> theextension method it can’t be represented as a snippet but only as a full list<strong>in</strong>g.List<strong>in</strong>g 11.3Compiler translation call<strong>in</strong>g methods on a dummy LINQ implementationus<strong>in</strong>g System;static class Extensions{public static Dummy Where(this Dummy dummy,Func predicate){Console.WriteL<strong>in</strong>e ("Where called");return dummy;}}DeclaresWhereextensionmethodpublic class Dummy{public Dummy Select(Func selector){Console.WriteL<strong>in</strong>e ("Select called");return new Dummy();}}Declares Select<strong>in</strong>stance methodpublic class TranslationExample{Licensed to Rhona Hadida


286 CHAPTER 11 Query expressions and LINQ to Objectsstatic void Ma<strong>in</strong>(){var source = new Dummy();Creates sourceto be queried}}var query = from dummy <strong>in</strong> sourcewhere dummy.ToStr<strong>in</strong>g()=="Ignored"select "Anyth<strong>in</strong>g";Runn<strong>in</strong>g list<strong>in</strong>g 11.3 pr<strong>in</strong>ts “Where called” and then “Select called” just as we’dexpect, because the query expression has been translated <strong>in</strong>to this code:var query = source.Where(dummy => dummy.ToStr<strong>in</strong>g()=="Ignored").Select(dummy => "Anyth<strong>in</strong>g");Calls methods via aquery expressionOf course, we’re not do<strong>in</strong>g any query<strong>in</strong>g or transformation here, but it shows how thecompiler is translat<strong>in</strong>g our query expression. If you’re puzzled as to why we’ve selected"Anyth<strong>in</strong>g" <strong>in</strong>stead of just dummy, it’s because a projection of just dummy (which is a “donoth<strong>in</strong>g” projection) would be removed by the compiler <strong>in</strong> this particular case. We’lllook at that later <strong>in</strong> section 11.3.2, but for the moment the important idea is the overalltype of translation <strong>in</strong>volved. We only need to learn what translations the <strong>C#</strong> compilerwill use, and then we can take any query expression, convert it <strong>in</strong>to the form thatdoesn’t use query expressions, and then look at what it’s do<strong>in</strong>g from that po<strong>in</strong>t of view.Notice how we don’t implement IEnumerable at all <strong>in</strong> Dummy. The translationfrom query expressions to “normal” code doesn’t depend on it, but <strong>in</strong> practice almostall LINQ providers will expose data either as IEnumerable or IQueryable (whichwe’ll look at <strong>in</strong> chapter 12). The fact that the translation doesn’t depend on any particulartypes but merely on the method names and parameters is a sort of compile-timeform of duck typ<strong>in</strong>g. This is similar to the same way that the collection <strong>in</strong>itializers presented<strong>in</strong> chapter 8 f<strong>in</strong>d a public method called Add us<strong>in</strong>g normal overload resolutionrather than us<strong>in</strong>g an <strong>in</strong>terface conta<strong>in</strong><strong>in</strong>g an Add method with a particular signature.Query expressions take this idea one step further—the translation occurs early <strong>in</strong> thecompilation process <strong>in</strong> order to allow the compiler to pick either <strong>in</strong>stance methods orextension methods. You could even consider the translation to be the work of a separatepreprocess<strong>in</strong>g eng<strong>in</strong>e.NOTE Why from … where … select <strong>in</strong>stead of select … from … where? Manydevelopers f<strong>in</strong>d the order of the clauses <strong>in</strong> query expressions confus<strong>in</strong>gto start with. It looks just like SQL—except back to front. If you look backto the translation <strong>in</strong>to methods, you’ll see the ma<strong>in</strong> reason beh<strong>in</strong>d it. Thequery expression is processed <strong>in</strong> the same order that it’s written: we startwith a source <strong>in</strong> the from clause, then filter it <strong>in</strong> the where clause, thenproject it <strong>in</strong> the select clause. Another way of look<strong>in</strong>g at it is to considerthe diagrams throughout this chapter. The data flows from top to bottom,and the boxes appear <strong>in</strong> the diagram <strong>in</strong> the same order as their correspond<strong>in</strong>gclauses appear <strong>in</strong> the query expression. Once you get overany <strong>in</strong>itial discomfort due to unfamiliarity, you may well f<strong>in</strong>d thisapproach appeal<strong>in</strong>g—I certa<strong>in</strong>ly do.Licensed to Rhona Hadida


Simple beg<strong>in</strong>n<strong>in</strong>gs: select<strong>in</strong>g elements287So, we know that a source level translation is <strong>in</strong>volved—but there’s another crucialconcept to understand before we move on any further.11.2.3 Range variables and nontrivial projectionsLet’s look back at our orig<strong>in</strong>al query expression <strong>in</strong> a bit more depth. We haven’t exam<strong>in</strong>edthe identifier <strong>in</strong> the from clause or the expression <strong>in</strong> the select clause. Figure 11.4shows the query expression aga<strong>in</strong>, with each part labeled to expla<strong>in</strong> its purpose.The contextual keywords are easy to expla<strong>in</strong>—they specify to the compiler what wewant to do with the data. Likewise the source expression is just a normal <strong>C#</strong> expression—a property <strong>in</strong> this case, but it could just as easily have been a method call, or a variable.The tricky bits are the range variable declaration and the projection expression.Range variables aren’t like any other type of variable. In some ways they’re not variablesat all! They’re only available <strong>in</strong> query expressions, and they’re effectively presentto propagate context from one expression to another. They represent one element ofa particular sequence, and they’re used <strong>in</strong> the compiler translation to allow otherexpressions to be turned <strong>in</strong>to lambda expressions easily.We’ve already seen that our orig<strong>in</strong>al query expression was turned <strong>in</strong>toSampleData.AllUsers.Select(user => user)The left side of the lambda expression—the part that provides the parameter name—comes from the range variable declaration. The right side comes from the selectclause. The translation is as simple as that (<strong>in</strong> this case). It all works out OK becausewe’ve used the same name on both sides. Suppose we’d written the query expressionlike this:from user <strong>in</strong> SampleData.AllUsersselect personIn that case, the translated version would have beenSampleData.AllUsers.Select(user => person)At that po<strong>in</strong>t the compiler would have compla<strong>in</strong>ed because it wouldn’t have knownwhat person referred to. Now that we know how simple the process is, however, itRange variabledeclarationSource expression(normal code)from user <strong>in</strong> SampleData.AllUsersselect user;Query expressioncontextual keywordsExpression us<strong>in</strong>grange variableFigure 11.4 A simple query expressionbroken down <strong>in</strong>to its constituent partsLicensed to Rhona Hadida


288 CHAPTER 11 Query expressions and LINQ to Objectsbecomes easier to understand a query expression that has a slightly more complicatedprojection. List<strong>in</strong>g 11.4 pr<strong>in</strong>ts out just the names of our users.List<strong>in</strong>g 11.4Query select<strong>in</strong>g just the names of the usersvar query = from user <strong>in</strong> SampleData.AllUsersselect user.Name;foreach (str<strong>in</strong>g name <strong>in</strong> query){Console.WriteL<strong>in</strong>e(name);}This time we’re us<strong>in</strong>g user.Name as the projection, and we can see that the result is asequence of str<strong>in</strong>gs, not of User objects. The translation of the query expression followsthe same rules as before, and becomesSampleData.AllUsers.Select(user => user.Name)The compiler allows this, because the Select extension method as applied to AllUserseffectively has this signature, act<strong>in</strong>g as if it were a member of IEnumerable:IEnumerable Select (Func selector)NOTEExtension methods pretend<strong>in</strong>g to be <strong>in</strong>stance methods—Just to be clear, Selectisn’t a member of IEnumerable itself, and the real signature has anIEnumerable parameter at the start. However, the translation processdoesn’t care whether the result is a call to a normal <strong>in</strong>stance method oran extension method. I f<strong>in</strong>d it easier to pretend it’s a normal method justfor the sake of work<strong>in</strong>g out the query translation. I’ve used the same conventionfor the rema<strong>in</strong>der of the chapter.The type <strong>in</strong>ference described <strong>in</strong> chapter 9 kicks <strong>in</strong>, convert<strong>in</strong>g the lambda expression<strong>in</strong>to a Func by decid<strong>in</strong>g that the user parameter must be of type User,and then <strong>in</strong>ferr<strong>in</strong>g that the return type (and thus TResult) should be str<strong>in</strong>g. This iswhy lambda expressions allow implicitly typed parameters, and why there are suchcomplicated type <strong>in</strong>ference rules: these are the gears and pistons of the LINQ eng<strong>in</strong>e.NOTE Why do you need to know all this? You can almost ignore what’s go<strong>in</strong>g onwith range variables for a lot of the time. You may well have seen many,many queries and understood what they achieve without ever know<strong>in</strong>gabout what’s go<strong>in</strong>g on beh<strong>in</strong>d the scenes. That’s f<strong>in</strong>e for when th<strong>in</strong>gs arework<strong>in</strong>g (as they tend to with examples <strong>in</strong> tutorials), but it’s when th<strong>in</strong>gsgo wrong that it pays to know about the details. If you have a queryexpression that won’t compile because the compiler is compla<strong>in</strong><strong>in</strong>g thatit doesn’t know about a particular identifier, you should look at the rangevariables <strong>in</strong>volved.So far we’ve only seen implicitly typed range variables. What happens when we<strong>in</strong>clude a type <strong>in</strong> the declaration? The answer lies <strong>in</strong> the Cast and OfType standardquery operators.Licensed to Rhona Hadida


Simple beg<strong>in</strong>n<strong>in</strong>gs: select<strong>in</strong>g elements28911.2.4 Cast, OfType, and explicitly typed range variablesMost of the time, range variables can be implicitly typed; <strong>in</strong> .NET 3.5, you’re likely tobe work<strong>in</strong>g with generic collections where the specified type is all you need. What ifthat weren’t the case, though? What if we had an ArrayList, or perhaps an object[]that we wanted to perform a query on? It would be a pity if LINQ couldn’t be applied<strong>in</strong> those situations. Fortunately, there are two standard query operators that come tothe rescue: Cast and OfType. Only Cast is supported directly by the query expressionsyntax, but we’ll look at both <strong>in</strong> this section.The two operators are similar: both take an arbitrary untyped sequence and returna strongly typed sequence. Cast does this by cast<strong>in</strong>g each element to the target type(and fail<strong>in</strong>g on any element that isn’t of the right type) and OfType does a test first,skipp<strong>in</strong>g any elements of the wrong type.List<strong>in</strong>g 11.5 demonstrates both of these operators, used as simple extension methodsfrom Enumerable. Just for a change, we won’t be us<strong>in</strong>g our SkeetySoft defect systemfor our sample data—after all, that’s all strongly typed! Instead, we’ll just use twoArrayList objects.List<strong>in</strong>g 11.5Us<strong>in</strong>g Cast and OfType to work with weakly typed collectionsArrayList list = new ArrayList { "First", "Second", "Third"};IEnumerable str<strong>in</strong>gs = list.Cast();foreach (str<strong>in</strong>g item <strong>in</strong> str<strong>in</strong>gs){Console.WriteL<strong>in</strong>e(item);}list = new ArrayList { 1, "not an <strong>in</strong>t", 2, 3};IEnumerable <strong>in</strong>ts = list.OfType();foreach (<strong>in</strong>t item <strong>in</strong> <strong>in</strong>ts){Console.WriteL<strong>in</strong>e(item);}The first list has only str<strong>in</strong>gs <strong>in</strong> it, so we’re safe to use Cast to obta<strong>in</strong> asequence of str<strong>in</strong>gs. The second list has mixed content, so <strong>in</strong> order to fetch just the<strong>in</strong>tegers from it we use OfType. If we’d used Cast on the second list, anexception would have been thrown when we tried to cast “not an <strong>in</strong>t” to <strong>in</strong>t. Note thatthis would only have happened after we’d pr<strong>in</strong>ted “1”—both operators stream theirdata, convert<strong>in</strong>g elements as they fetch them.When you <strong>in</strong>troduce a range variable with an explicit type, the compiler uses a callto Cast to make sure the sequence used by the rest of the query expression is of theappropriate type. List<strong>in</strong>g 11.6 shows this, with a projection us<strong>in</strong>g the Substr<strong>in</strong>g methodto prove that the sequence generated by the from clause is a sequence of str<strong>in</strong>gs.List<strong>in</strong>g 11.6Us<strong>in</strong>g an explicitly typed range variable to automatically call CastArrayList list = new ArrayList { "First", "Second", "Third"};var str<strong>in</strong>gs = from str<strong>in</strong>g entry <strong>in</strong> listselect entry.Substr<strong>in</strong>g(0, 3);Licensed to Rhona Hadida


290 CHAPTER 11 Query expressions and LINQ to Objectsforeach (str<strong>in</strong>g start <strong>in</strong> str<strong>in</strong>gs){Console.WriteL<strong>in</strong>e(start);}The output of list<strong>in</strong>g 11.6 is “Fir,” “Sec,” “Thi”—but what’s more <strong>in</strong>terest<strong>in</strong>g is thetranslated query expression, which islist.Cast().Select(entry => entry.Substr<strong>in</strong>g(0,3));Without the cast, we wouldn’t be able to call Select at all, because the extensionmethod is only def<strong>in</strong>ed for IEnumerable rather than IEnumerable. Even whenyou’re us<strong>in</strong>g a strongly typed collection, you might still want to use an explicitly typedrange variable, though. For <strong>in</strong>stance, you could have a collection that is def<strong>in</strong>ed to bea List but you know that all the elements are <strong>in</strong>stances ofMyImplementation. Us<strong>in</strong>g a range variable with an explicit type of MyImplementationallows you to access all the members of MyImplementation without manually <strong>in</strong>sert<strong>in</strong>gcasts all over the code.We’ve covered a lot of important conceptual ground so far, even though wehaven’t achieved any impressive results. To recap the most important po<strong>in</strong>ts briefly:■ LINQ is based on sequences of data, which are streamed wherever possible.■ Creat<strong>in</strong>g a query doesn’t immediately execute it: most operations use deferredexecution.■ Query expressions <strong>in</strong> <strong>C#</strong> 3 <strong>in</strong>volve a preprocess<strong>in</strong>g phase that converts theexpression <strong>in</strong>to normal <strong>C#</strong>, which is then compiled properly with all the normalrules of type <strong>in</strong>ference, overload<strong>in</strong>g, lambda expressions, and so forth.■ The variables declared <strong>in</strong> query expressions don’t act like anyth<strong>in</strong>g else: theyare range variables, which allow you to refer to data consistently with<strong>in</strong> thequery expression.I know that there’s a lot of somewhat abstract <strong>in</strong>formation to take <strong>in</strong>. Don’t worry ifyou’re beg<strong>in</strong>n<strong>in</strong>g to wonder if LINQ is worth all this trouble. I promise you that it is.With a lot of the groundwork out of the way, we can start do<strong>in</strong>g genu<strong>in</strong>ely usefulth<strong>in</strong>gs—like filter<strong>in</strong>g our data, and then order<strong>in</strong>g it.11.3 Filter<strong>in</strong>g and order<strong>in</strong>g a sequenceYou may be surprised to learn that these two operations are some of the simplest toexpla<strong>in</strong> <strong>in</strong> terms of compiler translations. The reason is that they always return asequence of the same type as their <strong>in</strong>put, which means we don’t need to worry aboutany new range variables be<strong>in</strong>g <strong>in</strong>troduced. It also helps that we’ve seen the correspond<strong>in</strong>gextension methods <strong>in</strong> chapter 10.11.3.1 Filter<strong>in</strong>g us<strong>in</strong>g a where clauseIt’s remarkably easy to understand the where clause. The format is justwhere filter-expressionLicensed to Rhona Hadida


Filter<strong>in</strong>g and order<strong>in</strong>g a sequence291The compiler translates this <strong>in</strong>to a call to the Where method with a lambda expression,which uses the appropriate range variable as the parameter and the filterexpression as the body. The filter expression is applied as a predicate to each elementof the <strong>in</strong>com<strong>in</strong>g stream of data, and only those that return true are present <strong>in</strong>the result<strong>in</strong>g sequence. Us<strong>in</strong>g multiple where clauses results <strong>in</strong> multiple cha<strong>in</strong>edWhere calls—only elements that match all of the predicates are part of the result<strong>in</strong>gsequence. List<strong>in</strong>g 11.7 demonstrates a query expression that f<strong>in</strong>ds all open defectsassigned to Tim.List<strong>in</strong>g 11.7Query expression us<strong>in</strong>g multiple where clausesUser tim = SampleData.Users.TesterTim;var query = from bug <strong>in</strong> SampleData.AllDefectswhere bug.Status != Status.Closedwhere bug.AssignedTo == timselect bug.Summary;foreach (var summary <strong>in</strong> query){Console.WriteL<strong>in</strong>e(summary);}The query expression <strong>in</strong> list<strong>in</strong>g 11.7 is translated <strong>in</strong>toSampleData.AllDefects.Where (bug => bug.Status != Status.Closed).Where (bug => bug.AssignedTo == tim).Select(bug => bug.Summary)The output of list<strong>in</strong>g 11.7 is as follows:Installation is slowSubtitles only work <strong>in</strong> WelshPlay button po<strong>in</strong>ts the wrong wayWebcam makes me look baldNetwork is saturated when play<strong>in</strong>g WAV fileOf course, we could write a s<strong>in</strong>gle where clause that comb<strong>in</strong>ed the two conditions asan alternative to us<strong>in</strong>g multiple where clauses. In some cases this may improve performance,but it’s worth bear<strong>in</strong>g the readability of the query expression <strong>in</strong> m<strong>in</strong>d, too.Once more, this is likely to be a fairly subjective matter. My personal <strong>in</strong>cl<strong>in</strong>ation is tocomb<strong>in</strong>e conditions that are logically related but keep others separate. In this case,both parts of the expression deal directly with a defect (as that’s all our sequence conta<strong>in</strong>s),so it would be reasonable to comb<strong>in</strong>e them. As before, it’s worth try<strong>in</strong>g bothforms to see which is clearer.In a moment, we’ll start try<strong>in</strong>g to apply some order<strong>in</strong>g rules to our query, but firstwe should look at a small detail to do with the select clause.11.3.2 Degenerate query expressionsWhile we’ve got a fairly simple translation to work with, let’s revisit a po<strong>in</strong>t I glossedover earlier <strong>in</strong> section 11.2.2 when I first <strong>in</strong>troduced the compiler translations. So far,Licensed to Rhona Hadida


292 CHAPTER 11 Query expressions and LINQ to Objectsall our translated query expressions have <strong>in</strong>cluded a call to Select. What happens ifour select clause does noth<strong>in</strong>g, effectively return<strong>in</strong>g the same sequence as it’s given?The answer is that the compiler removes that call to Select—but only if there areother operations be<strong>in</strong>g performed with<strong>in</strong> the query expression. For example, the follow<strong>in</strong>gquery expression just selects all the defects <strong>in</strong> the system:from defect <strong>in</strong> SampleData.AllDefectsselect defectThis is known as a degenerate query expression. The compiler deliberately generates a callto Select even though it seems to do noth<strong>in</strong>g:SampleData.Select (defect => defect)There’s a big difference between this and the simple expression SampleData, however.The items returned by the two sequences are the same, but the result of the Selectmethod is just the sequence of items, not the source itself. The result of a queryexpression is never the same object as the source data, unless the LINQ provider hasbeen poorly coded. This can be important from a data <strong>in</strong>tegrity po<strong>in</strong>t of view—a providercan return a mutable result object, know<strong>in</strong>g that changes to the returned dataset won’t affect the “master” even <strong>in</strong> the face of a degenerate query.When other operations are <strong>in</strong>volved, there’s no need for the compiler to keep “noop”select clauses. For example, suppose we change the query expression <strong>in</strong> list<strong>in</strong>g 11.7to select the whole defect rather than just the name:from bug <strong>in</strong> SampleData.AllDefectswhere bug.Status != Status.Closedwhere bug.AssignedTo == SampleData.Users.TesterTimselect bugWe now don’t need the f<strong>in</strong>al call to Select, so the translated code is just this:SampleData.AllDefects.Where (bug => bug.Status != Status.Closed).Where (bug => bug.AssignedTo == tim)These rules rarely get <strong>in</strong> the way when you’re writ<strong>in</strong>g query expressions, but they cancause confusion if you decompile the code with a tool such as Reflector—it can be surpris<strong>in</strong>gto see the Select call go miss<strong>in</strong>g for no apparent reason.With that knowledge <strong>in</strong> hand, let’s improve our query so that we know what Timshould work on next.11.3.3 Order<strong>in</strong>g us<strong>in</strong>g an orderby clauseIt’s not uncommon for developers and testers to be asked to work on the most criticaldefects before they tackle more trivial ones. We can use a simple query to tell Tim theorder <strong>in</strong> which he should tackle the open defects assigned to him. List<strong>in</strong>g 11.8 doesexactly this us<strong>in</strong>g an orderby clause, pr<strong>in</strong>t<strong>in</strong>g out all the details of the bugs, <strong>in</strong>descend<strong>in</strong>g order of priority.Licensed to Rhona Hadida


Filter<strong>in</strong>g and order<strong>in</strong>g a sequence293List<strong>in</strong>g 11.8Sort<strong>in</strong>g by the severity of a bug, from high to low priorityUser tim = SampleData.Users.TesterTim;var query = from bug <strong>in</strong> SampleData.AllDefectswhere bug.Status != Status.Closedwhere bug.AssignedTo == timorderby bug.Severity descend<strong>in</strong>gselect bug;foreach (var bug <strong>in</strong> query){Console.WriteL<strong>in</strong>e("{0}: {1}", bug.Severity, bug.Summary);}The output of list<strong>in</strong>g 11.8 shows that we’ve sorted the results appropriately:Showstopper: Webcam makes me look baldMajor: Subtitles only work <strong>in</strong> WelshMajor: Play button po<strong>in</strong>ts the wrong wayM<strong>in</strong>or: Network is saturated when play<strong>in</strong>g WAV fileTrivial: Installation is slowHowever, you can see that we’ve got two major defects. Which order should those betackled <strong>in</strong>? Currently no clear order<strong>in</strong>g is <strong>in</strong>volved. Let’s change the query so thatafter sort<strong>in</strong>g by severity <strong>in</strong> descend<strong>in</strong>g order, we sort by “last modified time” <strong>in</strong> ascend<strong>in</strong>gorder. This means that Tim will test the bugs that have been fixed a long time agobefore the ones that were fixed recently. This just requires an extra expression <strong>in</strong> theorderby clause, as shown <strong>in</strong> list<strong>in</strong>g 11.9.List<strong>in</strong>g 11.9Order<strong>in</strong>g by severity and then last modified timeUser tim = SampleData.Users.TesterTim;var query = from bug <strong>in</strong> SampleData.AllDefectswhere bug.Status != Status.Closedwhere bug.AssignedTo == timorderby bug.Severity descend<strong>in</strong>g, bug.LastModifiedselect bug;foreach (var bug <strong>in</strong> query){Console.WriteL<strong>in</strong>e("{0}: {1} ({2:d})",bug.Severity, bug.Summary, bug.LastModified);}The results of list<strong>in</strong>g 11.9 are shown here. Note how the order of the two majordefects has been reversed.Showstopper: Webcam makes me look bald (08/27/2007)Major: Play button po<strong>in</strong>ts the wrong way (08/17/2007)Major: Subtitles only work <strong>in</strong> Welsh (08/23/2007)M<strong>in</strong>or: Network is saturated when play<strong>in</strong>g WAV file (08/31/2007)Trivial: Installation is slow (08/15/2007)Licensed to Rhona Hadida


294 CHAPTER 11 Query expressions and LINQ to ObjectsSo, that’s what the query expression looks like—but what does the compiler do? It simplycalls the OrderBy and ThenBy methods (or OrderByDescend<strong>in</strong>g/ThenBy-Descend<strong>in</strong>g for descend<strong>in</strong>g orders). Our query expression is translated <strong>in</strong>toSampleData.AllDefects.Where (bug => bug.Status != Status.Closed).Where (bug => bug.AssignedTo == tim).OrderByDescend<strong>in</strong>g (bug => bug.Severity).ThenBy (bug => bug.LastModified)Now that we’ve seen an example, let’s look at the general syntax of orderby clauses.They’re basically the contextual keyword orderby followed by one or more order<strong>in</strong>gs.An order<strong>in</strong>g is just an expression (which can use range variables) optionally followed bydescend<strong>in</strong>g, which has the obvious mean<strong>in</strong>g. The translation for the first order<strong>in</strong>g is acall to OrderBy or OrderByDescend<strong>in</strong>g, and any other order<strong>in</strong>gs are translated us<strong>in</strong>g acall to ThenBy or ThenByDescend<strong>in</strong>g, as shown <strong>in</strong> our example.The difference between OrderBy and ThenBy is quite simple: OrderBy assumes ithas primary control over the order<strong>in</strong>g, whereas ThenBy understands that it’s subservientto one or more previous order<strong>in</strong>gs. For LINQ to Objects, ThenBy is only def<strong>in</strong>ed asan extension method for IOrderedEnumerable, which is the type returned byOrderBy (and by ThenBy itself, to allow further cha<strong>in</strong><strong>in</strong>g).It’s very important to note that although you can use multiple orderbyclauses, each one will start with its own OrderBy or OrderByDescend<strong>in</strong>gclause, which means the last one will effectively “w<strong>in</strong>.” There may be somereason for <strong>in</strong>clud<strong>in</strong>g multiple orderby clauses, but it would be veryunusual. You should almost always use a s<strong>in</strong>gle clause conta<strong>in</strong><strong>in</strong>g multipleorder<strong>in</strong>gs <strong>in</strong>stead.As noted <strong>in</strong> chapter 10, apply<strong>in</strong>g an order<strong>in</strong>g requires all the data tobe loaded (at least for LINQ to Objects)—you can’t order an <strong>in</strong>f<strong>in</strong>itesequence, for example. Hopefully the reason for this is obvious—youdon’t know whether you’ll see someth<strong>in</strong>g that should come at the start of the result<strong>in</strong>gsequence until you’ve seen all the elements, for example.We’re about halfway through learn<strong>in</strong>g about query expressions, and you may be surprisedthat we haven’t seen any jo<strong>in</strong>s yet. Obviously they’re important <strong>in</strong> LINQ just asthey’re important <strong>in</strong> SQL, but they’re also complicated. I promise we’ll get to them <strong>in</strong>due course, but <strong>in</strong> order to <strong>in</strong>troduce just one new concept at a time, we’ll detour vialet clauses first. That way we can learn about transparent identifiers before we hit jo<strong>in</strong>s.Warn<strong>in</strong>g:only use oneorderbyclause!11.4 Let clauses and transparent identifiersMost of the rest of the operators we still need to look at <strong>in</strong>volve transparent identifiers.Just like range variables, you can get along perfectly well without understand<strong>in</strong>g transparentidentifiers, if you only want to have a fairly shallow grasp of query expressions.If you’ve bought this book, I hope you want to know <strong>C#</strong> 3 at a deeper level, which will(among other th<strong>in</strong>gs) enable you to look compilation errors <strong>in</strong> the face and knowwhat they’re talk<strong>in</strong>g about.Licensed to Rhona Hadida


Let clauses and transparent identifiers295You don’t need to know everyth<strong>in</strong>g about transparent identifiers, but I’ll teach youenough so that if you see one <strong>in</strong> the language specification you won’t feel like runn<strong>in</strong>gand hid<strong>in</strong>g. You’ll also understand why they’re needed at all—and that’s where anexample will come <strong>in</strong> handy. The let clause is the simplest transformation availablethat uses transparent identifiers.11.4.1 Introduc<strong>in</strong>g an <strong>in</strong>termediate computation with letA let clause simply <strong>in</strong>troduces a new range variable with a value that can be based onother range variables. The syntax is as easy as pie:let identifier = expressionTo expla<strong>in</strong> this operator <strong>in</strong> terms that don’t use any other complicated operators, I’mgo<strong>in</strong>g to resort to a very artificial example. Suspend your disbelief, and imag<strong>in</strong>e thatf<strong>in</strong>d<strong>in</strong>g the length of a str<strong>in</strong>g is a costly operation. Now imag<strong>in</strong>e that we had a completelybizarre system requirement to order our users by the lengths of their names, andthen display the name and its length. Yes, I know it’s somewhat unlikely. List<strong>in</strong>g 11.10shows one way of do<strong>in</strong>g this without a let clause.List<strong>in</strong>g 11.10Sort<strong>in</strong>g by the lengths of user names without a let clausevar query = from user <strong>in</strong> SampleData.AllUsersorderby user.Name.Lengthselect user.Name;foreach (var name <strong>in</strong> query){Console.WriteL<strong>in</strong>e("{0}: {1}", name.Length, name);}That works f<strong>in</strong>e, but it uses the dreaded Length property twice—once to sort the users,and once <strong>in</strong> the display side. Surely not even the fastest supercomputer could cope withf<strong>in</strong>d<strong>in</strong>g the lengths of six str<strong>in</strong>gs twice! No, we need to avoid that redundant computation.We can do so with the let clause, which evaluates an expression and <strong>in</strong>troducesit as a new range variable. List<strong>in</strong>g 11.11 achieves the same result as list<strong>in</strong>g 11.10, but onlyuses the Length property once per user.List<strong>in</strong>g 11.11Us<strong>in</strong>g a let clause to remove redundant calculationsvar query = from user <strong>in</strong> SampleData.AllUserslet length = user.Name.Lengthorderby lengthselect new { Name = user.Name, Length = length };foreach (var entry <strong>in</strong> query){Console.WriteL<strong>in</strong>e("{0}: {1}", entry.Length, entry.Name);}List<strong>in</strong>g 11.11 <strong>in</strong>troduces a new range variable called length, which conta<strong>in</strong>s the lengthof the user’s name (for the current user <strong>in</strong> the orig<strong>in</strong>al sequence). We then use that newLicensed to Rhona Hadida


296 CHAPTER 11 Query expressions and LINQ to Objectsrange variable for both sort<strong>in</strong>g and the projection at the end. Have you spotted the problemyet? We need to use two range variables, but the lambda expression passed to Selectonly takes one parameter! This is where transparent identifiers come on the scene.11.4.2 Transparent identifiersIn list<strong>in</strong>g 11.11, we’ve got two range variables <strong>in</strong>volved <strong>in</strong> the f<strong>in</strong>al projection, but theSelect method only acts on a s<strong>in</strong>gle sequence. How can we comb<strong>in</strong>e the range variables?The answer is to create an anonymous type that conta<strong>in</strong>s both variables butapply a clever translation to make it look as if we’ve actually got two parameters for theselect and orderby clauses. Figure 11.5 shows the sequences <strong>in</strong>volved.The let clause achieves its objectives by us<strong>in</strong>g another call to Select, creat<strong>in</strong>g ananonymous type for the result<strong>in</strong>g sequence, and effectively creat<strong>in</strong>g a new range variablewhose name can never be seen or used <strong>in</strong> source code. Our query expressionfrom list<strong>in</strong>g 11.11 is translated <strong>in</strong>to someth<strong>in</strong>g like this:SampleData.AllUsers.Select(user => new { user,length=user.Name.Length }).OrderBy(z => z.length).Select(z => new { Name=z.user.Name,Length=z.length })from user <strong>in</strong>SampleData.AllUsersUser: { Name="Tim Trotter" ... }User: { Name="Tara Tutu" ... }User: { Name="Dave Denton" ... }...let length = user.Name.Lengthuser=User: { Name="Tim Trotter" ... }, length=11user=User: { Name="Tara Tutu" ... }, length=9user=User: { Name="Dave Denton" ... }, length=11...orderby lengthuser=User: { Name="Tara Tutu" ... }, length=9user=User: { Name="Tim Trotter" ...}, length=11user=User: { Name="Dave Denton" ... }, length=11...select new { Name=user.Name,Length=user.Length }(Result of query)Name="Tara Tutu", Length=9Name="Tim Trotter", Length=11Name="Dave Denton", Length=11...Figure 11.5 Sequences <strong>in</strong>volved <strong>in</strong> list<strong>in</strong>g11.11, where a let clause <strong>in</strong>troduces thelength range variableLicensed to Rhona Hadida


Jo<strong>in</strong>s29711.5 Jo<strong>in</strong>sEach part of the query has been adjusted appropriately: where the orig<strong>in</strong>al queryexpression referenced user or length directly, if the reference occurs after the letclause, it’s replaced by z.user or z.length. The choice of z as the name here is arbitrary—it’sall hidden by the compiler.If you read the <strong>C#</strong> 3 language specification on let clauses, you’ll see that the translationit describes is from one query expression to another. It uses an asterisk (*) torepresent the transparent identifier <strong>in</strong>troduced. The transparent identifier is thenerased as a f<strong>in</strong>al step <strong>in</strong> translation. I won’t use that notation <strong>in</strong> this chapter, as it’s hardto come to grips with and unnecessary at the level of detail we’re go<strong>in</strong>g <strong>in</strong>to. Hopefullywith this background, the specification won’t be quite as impenetrable as it mightbe otherwise, should you need to refer to it.The good news is that we can now take a look at the rest of the translations mak<strong>in</strong>gup <strong>C#</strong> 3’s query expression support. I won’t go <strong>in</strong>to the details of every transparentidentifier <strong>in</strong>troduced, but I’ll mention the situations <strong>in</strong> which they occur. Let’s look atthe support for jo<strong>in</strong>s first.If you’ve ever read anyth<strong>in</strong>g about SQL, you probably have an idea what a database jo<strong>in</strong>is. It takes two tables and creates a result by match<strong>in</strong>g one set of rows aga<strong>in</strong>st anotherset of rows. A LINQ jo<strong>in</strong> is similar, except it works on sequences. Three types of jo<strong>in</strong>sare available, although not all of them use the jo<strong>in</strong> keyword <strong>in</strong> the query expression.We’ll start with the jo<strong>in</strong> that is closest to a SQL <strong>in</strong>ner jo<strong>in</strong>.11.5.1 Inner jo<strong>in</strong>s us<strong>in</strong>g jo<strong>in</strong> clausesInner jo<strong>in</strong>s <strong>in</strong>volve two sequences. One key selector expression is applied to each elementof the first sequence and another key selector (which may be totally different) isapplied to each element of the second sequence. The result of the jo<strong>in</strong> is a sequenceof all the pairs of elements where the key from the first element is the same as the keyfrom the second element.NOTETerm<strong>in</strong>ology clash! Inner and outer sequences—The MSDN documentation forthe Jo<strong>in</strong> method used to evaluate <strong>in</strong>ner jo<strong>in</strong>s unhelpfully calls thesequences <strong>in</strong>volved <strong>in</strong>ner and outer. This has noth<strong>in</strong>g to do with <strong>in</strong>nerjo<strong>in</strong>s and outer jo<strong>in</strong>s—it’s just a way of differentiat<strong>in</strong>g between thesequences. You can th<strong>in</strong>k of them as first and second, left and right, Bertand Ernie—anyth<strong>in</strong>g you like that helps you. I’ll use left and right for thischapter. Aside from anyth<strong>in</strong>g else, it makes it obvious which sequence iswhich <strong>in</strong> diagram form.The two sequences can be anyth<strong>in</strong>g you like: the right sequence can even be the sameas the left sequence, if that’s useful. (Imag<strong>in</strong>e f<strong>in</strong>d<strong>in</strong>g pairs of people who were bornon the same day, for example.) The only th<strong>in</strong>g that matters is that the two key selectorexpressions must result <strong>in</strong> the same type of key 4 . You can’t jo<strong>in</strong> a sequence of people4It is also valid for there to be two key types <strong>in</strong>volved, with an implicit conversion from one to the other. Oneof the types must be a better choice than the other, <strong>in</strong> the same way that the compiler <strong>in</strong>fers the type of animplicitly typed array. In my experience you rarely need to consciously consider this detail.Licensed to Rhona Hadida


298 CHAPTER 11 Query expressions and LINQ to Objectsto a sequence of cities by say<strong>in</strong>g that the birth date of the person is the same as thepopulation of the city—it doesn’t make any sense.The syntax for an <strong>in</strong>ner jo<strong>in</strong> looks more complicated than it is:[query select<strong>in</strong>g the left sequence]jo<strong>in</strong> right-range-variable <strong>in</strong> right-sequenceon left-key-selector equals right-key-selectorSee<strong>in</strong>g equals as a contextual keyword rather than us<strong>in</strong>g symbols can be slightly disconcert<strong>in</strong>g,but it makes it easier to dist<strong>in</strong>guish the left key selector from the right keyselector. Often (but not always) at least one of the key selectors is a trivial one that justselects the exact element from that sequence.Let’s look at an example from our defect system. Suppose we had just added thenotification feature, and wanted to send the first batch of emails for all the exist<strong>in</strong>gdefects. We need to jo<strong>in</strong> the list of notifications aga<strong>in</strong>st the list of defects, where theirprojects match. List<strong>in</strong>g 11.12 performs just such a jo<strong>in</strong>.List<strong>in</strong>g 11.12Jo<strong>in</strong><strong>in</strong>g the defects and notification subscriptions based on projectvar query = from defect <strong>in</strong> SampleData.AllDefectsjo<strong>in</strong> subscription <strong>in</strong> SampleData.AllSubscriptionson defect.Project equals subscription.Projectselect new { defect.Summary, subscription.EmailAddress };foreach (var entry <strong>in</strong> query){Console.WriteL<strong>in</strong>e("{0}: {1}", entry.EmailAddress, entry.Summary);}List<strong>in</strong>g 11.12 will show each of the media player bugs twice—once for “mediabugs@skeetysoft.com”and once for “theboss@skeetysoft.com” (because the boss reallycares about the media player project).In this particular case we could easily have made the jo<strong>in</strong> the other way round,revers<strong>in</strong>g the left and right sequences. The result would have been the same entriesbut <strong>in</strong> a different order. The implementation <strong>in</strong> LINQ to Objects returns entries sothat all the pairs us<strong>in</strong>g the first element of the left sequence are returned (<strong>in</strong> theorder of the right sequence), then all the pairs us<strong>in</strong>g the second element of the leftsequence, and so on. The right sequence is buffered, but the left sequence isstreamed—so if you want to jo<strong>in</strong> a massive sequence to a t<strong>in</strong>y one, it’s worth us<strong>in</strong>g thet<strong>in</strong>y one as the right sequence if you can.One error that might trip you up is putt<strong>in</strong>g the key selectors the wrong way round.In the left key selector, only the left sequence range variable is <strong>in</strong> scope; <strong>in</strong> the rightkey selector only the right range variable is <strong>in</strong> scope. If you reverse the left and rightsequences, you have to reverse the left and right key selectors too. Fortunately thecompiler knows that it’s a common mistake and suggests the appropriate course ofaction.Just to make it more obvious what’s go<strong>in</strong>g on, figure 11.6 shows the sequences asthey’re processed.Licensed to Rhona Hadida


Jo<strong>in</strong>s299from defect <strong>in</strong>SampleData.AllDefectsDefect: { ID=1, Project=Media player, Summary="MP3 files ..." ... }D efect: { ID=2, Project=Media player, Summary="Text is too big", ...}D efect: { ID=3, Project=Talk, Summary="Sky is wrong ..." ...}...from subscription <strong>in</strong>SampleData.AllSubscriptionsjo<strong>in</strong> subscription <strong>in</strong>SampleData.AllSubscriptionson defect.Project equals subscription.ProjectNotificationSubscription sequence:{ Media player, "media-bugs@..." },{ Talk, "talk-bugs@..." },{ Office, "office-bugs@..." }{ Media player, "theboss@..." }{ defect={ ID=1 ... }, subscription={ Media player, "media-bugs@..." }{ defect={ ID=1 ... }, subscription={ Media player, "theboss@..." }{ defect={ ID=2 ... }, subscription={ Media player, "media-bugs@..." }{ defect={ ID=2 ... }, subscription={ Media player, "theboss@..." }{ defect={ ID=3 ... }, subscription={ Talk, "talk-bugs@..." }select new { defect.Summary,subscription.EmailAddress }{ Summary="MP3 files ...", EmailAddress="media-bugs@..." }{ Summary="MP3 files ...", EmailAddress="theboss@..." }{ Summary="Text is too big", EmailAddress="media-bugs@..." }{ Summary="Text is too big", EmailAddress="theboss@... }{ Summary="Sky is wrong ...", EmailAddress="talk-bugs@..." }(Result of query)Figure 11.6 The jo<strong>in</strong> from list<strong>in</strong>g 11.12 <strong>in</strong> graphical form, show<strong>in</strong>g two different sequences(defects and subscriptions) used as data sourcesOften you want to filter the sequence, and filter<strong>in</strong>g before the jo<strong>in</strong> occurs is more efficientthan filter<strong>in</strong>g it afterward. At this stage the query expression is simpler if the leftsequence is the one requir<strong>in</strong>g filter<strong>in</strong>g. For <strong>in</strong>stance, if we wanted to show only defectsthat are closed, we could use this query expression:from defect <strong>in</strong> SampleData.AllDefectswhere defect.Status == Status.Closedjo<strong>in</strong> subscription <strong>in</strong> SampleData.AllSubscriptionson defect.Project equals subscription.Projectselect new { defect.Summary, subscription.EmailAddress }We can perform the same query with the sequences reversed, but it’s messier:from subscription <strong>in</strong> SampleData.AllSubscriptionsjo<strong>in</strong> defect <strong>in</strong> (from defect <strong>in</strong> SampleData.AllDefectswhere defect.Status == Status.Closedselect defect)on subscription.Project equals defect.Projectselect new { defect.Summary, subscription.EmailAddress }Licensed to Rhona Hadida


300 CHAPTER 11 Query expressions and LINQ to ObjectsNotice how you can use one query expression <strong>in</strong>side another—<strong>in</strong>deed, the languagespecification describes many of the compiler translations <strong>in</strong> these terms. Nested queryexpressions are useful but hurt readability as well: it’s often worth look<strong>in</strong>g for an alternative,or us<strong>in</strong>g a variable for the right-hand sequence <strong>in</strong> order to make the code clearer.NOTE Are <strong>in</strong>ner jo<strong>in</strong>s useful <strong>in</strong> LINQ to Objects? Inner jo<strong>in</strong>s are used all the time<strong>in</strong> SQL. They are effectively the way that we navigate from one entity to arelated one, usually jo<strong>in</strong><strong>in</strong>g a foreign key <strong>in</strong> one table to the primary keyon another. In the object-oriented model, we tend to navigate from oneobject to another via references. For <strong>in</strong>stance, retriev<strong>in</strong>g the summary ofa defect and the name of the user assigned to work on it would require ajo<strong>in</strong> <strong>in</strong> SQL—<strong>in</strong> <strong>C#</strong> we just use a cha<strong>in</strong> of properties. If we’d had a reverseassociation from Project to the list of NotificationSubscriptionobjects associated with it <strong>in</strong> our model, we wouldn’t have needed the jo<strong>in</strong>to achieve the goal of this example, either. That’s not to say that <strong>in</strong>nerjo<strong>in</strong>s aren’t useful sometimes even with<strong>in</strong> object-oriented models—butthey don’t naturally occur nearly as often as <strong>in</strong> relational models.Inner jo<strong>in</strong>s are translated by the compiler <strong>in</strong>to calls to the Jo<strong>in</strong> method. The signatureof the overload used for LINQ to Objects is as follows (when imag<strong>in</strong><strong>in</strong>g it to be an<strong>in</strong>stance method of IEnumerable):IEnumerable Jo<strong>in</strong> (IEnumerable <strong>in</strong>ner,Func outerKeySelector,Func <strong>in</strong>nerKeySelector,Func resultSelector)The first three parameters are self-explanatory when you’ve remembered to treat <strong>in</strong>nerand outer as right and left, respectively, but the last one is slightly more <strong>in</strong>terest<strong>in</strong>g. It’s aprojection from two elements (one from the left sequence and one from the rightsequence) <strong>in</strong>to a s<strong>in</strong>gle element of the result<strong>in</strong>g sequence. When the jo<strong>in</strong> is followedby anyth<strong>in</strong>g other than a select clause, the <strong>C#</strong> 3 compiler <strong>in</strong>troduces a transparentidentifier <strong>in</strong> order to make the range variables used <strong>in</strong> both sequences available forlater clauses, and creates an anonymous type and simple mapp<strong>in</strong>g to use for theresultSelector parameter.However, if the next part of the query expression is a select clause, the projectionfrom the select clause is used directly as the resultSelector parameter—there’s nopo<strong>in</strong>t <strong>in</strong> creat<strong>in</strong>g a pair and then call<strong>in</strong>g Select when you can do the transformation<strong>in</strong> one step. You can still th<strong>in</strong>k about it as a “jo<strong>in</strong>” step followed by a “select” stepdespite the two be<strong>in</strong>g squished <strong>in</strong>to a s<strong>in</strong>gle method call. This leads to a more consistentmental model <strong>in</strong> my view, and one that is easier to reason about. Unless you’relook<strong>in</strong>g at the generated code, just ignore the optimization the compiler is perform<strong>in</strong>gfor you.The good news is that hav<strong>in</strong>g learned about <strong>in</strong>ner jo<strong>in</strong>s, our next type of jo<strong>in</strong> ismuch easier to approach.Licensed to Rhona Hadida


Jo<strong>in</strong>s30111.5.2 Group jo<strong>in</strong>s with jo<strong>in</strong> … <strong>in</strong>to clausesWe’ve seen that the result sequence from a normal jo<strong>in</strong> clause consistsof pairs of elements, one from each of the <strong>in</strong>put sequences. A group jo<strong>in</strong>looks similar <strong>in</strong> terms of the query expression but has a significantly differentoutcome. Each element of a group jo<strong>in</strong> result consists of an elementfrom the left sequence (us<strong>in</strong>g its orig<strong>in</strong>al range variable), and alsoa sequence of all the match<strong>in</strong>g elements of the right sequence, exposed asa new range variable specified by the identifier com<strong>in</strong>g after <strong>in</strong>to <strong>in</strong> thejo<strong>in</strong> clause.Let’s change our previous example to use a group jo<strong>in</strong>. List<strong>in</strong>g 11.13 aga<strong>in</strong> shows allthe defects and the notifications required for each one, but breaks them out <strong>in</strong> a perdefectmanner. Pay particular attention to how we’re display<strong>in</strong>g the results, and to thenested foreach loop.Resultconta<strong>in</strong>sembeddedsubsequencesList<strong>in</strong>g 11.13Jo<strong>in</strong><strong>in</strong>g defects and subscriptions with a group jo<strong>in</strong>var query = from defect <strong>in</strong> SampleData.AllDefectsjo<strong>in</strong> subscription <strong>in</strong> SampleData.AllSubscriptionson defect.Project equals subscription.Project<strong>in</strong>to groupedSubscriptionsselect new { Defect=defect,Subscriptions=groupedSubscriptions };foreach (var entry <strong>in</strong> query){Console.WriteL<strong>in</strong>e(entry.Defect.Summary);foreach (var subscription <strong>in</strong> entry.Subscriptions){Console.WriteL<strong>in</strong>e (" {0}", subscription.EmailAddress);}}The Subscriptions property of each entry is the embedded sequence of subscriptionsmatch<strong>in</strong>g that entry’s defect. Figure 11.7 shows how the two <strong>in</strong>itial sequences arecomb<strong>in</strong>ed.One important difference between an <strong>in</strong>ner jo<strong>in</strong> and a group jo<strong>in</strong>—and <strong>in</strong>deedbetween a group jo<strong>in</strong> and normal group<strong>in</strong>g—is that for a group jo<strong>in</strong> there’s a one-toonecorrespondence between the left sequence and the result sequence, even if someof the elements <strong>in</strong> the left sequence don’t match any elements of the right sequence.This can be very important, and is sometimes used to simulate a left outer jo<strong>in</strong> fromSQL. The embedded sequence is empty when the left element doesn’t match any rightelements. As with an <strong>in</strong>ner jo<strong>in</strong>, a group jo<strong>in</strong> buffers the right sequence but streamsthe left one.List<strong>in</strong>g 11.14 shows an example of this, count<strong>in</strong>g the number of bugs created oneach day <strong>in</strong> August. It uses a DateTimeRange (as described <strong>in</strong> chapter 6) as the leftsequence, and a projection that calls Count on the embedded sequence <strong>in</strong> the resultof the group jo<strong>in</strong>.Licensed to Rhona Hadida


302 CHAPTER 11 Query expressions and LINQ to Objectsfrom defect <strong>in</strong>SampleData.AllDefectsDefect: { ID=1, Project=Media player, ... }Defect: { ID=2, Project=Media player, ...}Defect: { ID=3, Project=Talk, ...}...jo<strong>in</strong> subscription <strong>in</strong>SampleData.AllSubscriptionson defect.Project equals subscription.Project<strong>in</strong>to groupedSubscriptionsfrom subscription <strong>in</strong>SampleData.AllSubscriptions(NotificationSubscription sequence){ defect=Defect: { ID=1, Project=Media player ... },groupedSubscriptions= { Media player, "media-bugs@..." ... }{ Media player, "theboss@..." ...}{ defect=Defect: { ID=2, Project=Media player ... },groupedSubscriptions= { Media player, "media-bugs@..." ... }{ Media player, "theboss@..." ...}{ defect=Defect: { ID=3, Project=Talk ... },groupedSubscriptions= { Talk, "talk-bugs@..." ... }...select new { Defect=defect,Subscriptions=groupedSubscriptions }(Result of query){ Defect=Defect: { ID=1, Project=Media player ... },Subscriptions= { Media player, "media-bugs@..." ... }{ Media player, "theboss@..." ...}{ Defect=Defect: { ID=2, Project=Media player ... },Subscriptions= { Media player, "media-bugs@..." ... }{ Media player, "theboss@..." ...}{ Defect=Defect: { ID=3, Project=Talk ... },Subscriptions= { Talk, "talk-bugs@..." ... }...Figure 11.7 Sequences <strong>in</strong>volved <strong>in</strong> the group jo<strong>in</strong> from list<strong>in</strong>g 11.13. The short arrows<strong>in</strong>dicate embedded sequences with<strong>in</strong> the result entries. In the output, some entriesconta<strong>in</strong> multiple email addresses for the same bug.List<strong>in</strong>g 11.14Count<strong>in</strong>g the number of bugs raised on each day <strong>in</strong> Augustvar dates = new DateTimeRange(SampleData.Start, SampleData.End);var query = from date <strong>in</strong> datesjo<strong>in</strong> defect <strong>in</strong> SampleData.AllDefectson date equals defect.Created.Date<strong>in</strong>to jo<strong>in</strong>edselect new { Date=date, Count=jo<strong>in</strong>ed.Count() };Licensed to Rhona Hadida


Jo<strong>in</strong>s303foreach (var grouped <strong>in</strong> query){Console.WriteL<strong>in</strong>e("{0:d}: {1}", grouped.Date, grouped.Count);}Count itself uses immediate execution, iterat<strong>in</strong>g through all the elements of the sequenceit’s called on—but we’re only call<strong>in</strong>g it <strong>in</strong> the projection part of the query expression,so it becomes part of a lambda expression. This means we still have deferred execution:noth<strong>in</strong>g is evaluated until we start the foreach loop.Here is the first part of the results of list<strong>in</strong>g 11.14, show<strong>in</strong>g the number of bugs createdeach day <strong>in</strong> the first week of August:08/01/2007: 108/02/2007: 008/03/2007: 208/04/2007: 108/05/2007: 008/06/2007: 108/07/2007: 1The compiler translation <strong>in</strong>volved for a group jo<strong>in</strong> is simply a call to the GroupJo<strong>in</strong>method, which has the follow<strong>in</strong>g signature:IEnumerable GroupJo<strong>in</strong> (IEnumerable <strong>in</strong>ner,Func outerKeySelector,Func <strong>in</strong>nerKeySelector,Func resultSelector)The signature is exactly the same as for <strong>in</strong>ner jo<strong>in</strong>s, except that the resultSelectorparameter has to work with a sequence of right-hand elements, not just a s<strong>in</strong>gle one.As with <strong>in</strong>ner jo<strong>in</strong>s, if a group jo<strong>in</strong> is followed by a select clause the projection is usedas the result selector of the GroupJo<strong>in</strong> call; otherwise, a transparent identifier is <strong>in</strong>troduced.In this case we have a select clause immediately after the group jo<strong>in</strong>, so thetranslated query looks like this:dates.GroupJo<strong>in</strong>(SampleData.AllDefects,date => date,defect => defect.Created.Date,(date, jo<strong>in</strong>ed) => new { Date=date,Count=jo<strong>in</strong>ed.Count() })Our f<strong>in</strong>al type of jo<strong>in</strong> is known as a cross jo<strong>in</strong>—but it’s not quite as straightforward as itmight seem at first.11.5.3 Cross jo<strong>in</strong>s us<strong>in</strong>g multiple from clausesSo far all our jo<strong>in</strong>s have been equijo<strong>in</strong>s—a match has been performed between elementsof the left and right sequences. Cross jo<strong>in</strong>s don’t perform any match<strong>in</strong>g between thesequences: the result conta<strong>in</strong>s every possible pair of elements. They’re achieved by simplyus<strong>in</strong>g two (or more) from clauses. For the sake of sanity we’ll only consider two fromclauses for the moment—when there are more, just mentally perform a cross jo<strong>in</strong> onLicensed to Rhona Hadida


304 CHAPTER 11 Query expressions and LINQ to Objectsthe first two from clauses, then cross jo<strong>in</strong> the result<strong>in</strong>g sequence with the next fromclause, and so on. Each extra from clause adds its own range variable.List<strong>in</strong>g 11.15 shows a simple (but useless) cross jo<strong>in</strong> <strong>in</strong> action, produc<strong>in</strong>g asequence where each entry consists of a user and a project. I’ve deliberately pickedtwo completely unrelated <strong>in</strong>itial sequences to show that no match<strong>in</strong>g is performed.List<strong>in</strong>g 11.15Cross jo<strong>in</strong><strong>in</strong>g users aga<strong>in</strong>st projectsvar query = from user <strong>in</strong> SampleData.AllUsersfrom project <strong>in</strong> SampleData.AllProjectsselect new { User=user, Project=project };foreach (var pair <strong>in</strong> query){Console.WriteL<strong>in</strong>e("{0}/{1}",pair.User.Name,pair.Project.Name);}The output of list<strong>in</strong>g 11.15 beg<strong>in</strong>s like this:Tim Trotter/Skeety Media PlayerTim Trotter/Skeety TalkTim Trotter/Skeety OfficeTara Tutu/Skeety Media PlayerTara Tutu/Skeety TalkTara Tutu/Skeety OfficeFigure 11.8 shows the sequences <strong>in</strong>volved to get this result.If you’re familiar with SQL, you’re probably quite comfortable so far—it looks justlike a Cartesian product obta<strong>in</strong>ed from a query specify<strong>in</strong>g multiple tables. Indeed,most of the time that’s exactly how cross jo<strong>in</strong>s are used. However, there’s more poweravailable when you want it: the right sequence used at any particular po<strong>in</strong>t <strong>in</strong> time candepend on the “current” value of the left sequence. When this is the case, it’s not across jo<strong>in</strong> <strong>in</strong> the normal sense of the term. The query expression translation is thesame whether or not we’re us<strong>in</strong>g a true cross jo<strong>in</strong>, so we need to understand the morecomplicated scenario <strong>in</strong> order to understand the translation process.Before we dive <strong>in</strong>to the details, let’s see the effect it produces. List<strong>in</strong>g 11.16 shows asimple example, us<strong>in</strong>g sequences of <strong>in</strong>tegers.List<strong>in</strong>g 11.16Cross jo<strong>in</strong> where the right sequence depends on the left elementvar query = from left <strong>in</strong> Enumerable.Range(1, 4)from right <strong>in</strong> Enumerable.Range(11, left)select new { Left=left, Right=right };foreach (var pair <strong>in</strong> query){Console.WriteL<strong>in</strong>e("Left={0}; Right={1}",pair.Left, pair.Right);}Licensed to Rhona Hadida


Jo<strong>in</strong>s305from user <strong>in</strong>SampleData.AllUsersUser: { Name="Tim Trotter" ... }User: { Name="Tara Tutu" ... }User: { Name="Dave Denton" ... }User: { Name="Darren Dahlia" ...}User: { Name="Mary Malcop" ...}User: { Name="Col<strong>in</strong> Carton" ...}SampleData.AllProjectsfrom project <strong>in</strong>SampleData.AllProjectsProject: { Name="Skeety Media Player }Project: { Name="Skeety Talk" }Project: { Name="Skeety Office" }user=User (Tim Trotter), project=Project (Media Player)user=User (Tim Trotter), project=Project (Talk)user=User (Tim Trotter), project=Project (Office)user=User (Tara Tutu), project=Project (Media Player)user=User (Tara Tutu), project=Project (Talk)user=User (Tara Tutu), project=Project (Office)...select new { User=user,Project=project }User=User (Tim Trotter), Project=Project (Media Player)User=User (Tim Trotter), Project=Project (Talk)User=User (Tim Trotter), Project=Project (Office)U ser=User (Tara Tutu), Project=Project (Media Player)User=User (Tara Tutu), Project=Project (Talk)User=User (Tara Tutu), Project=Project (Office)...(Result of query)Figure 11.8 Sequences from list<strong>in</strong>g 11.15, cross jo<strong>in</strong><strong>in</strong>g users and projects. All possiblecomb<strong>in</strong>ations are returned <strong>in</strong> the results.List<strong>in</strong>g 11.16 starts with a simple range of <strong>in</strong>tegers, 1 to 4. For each of those <strong>in</strong>tegers,we create another range, beg<strong>in</strong>n<strong>in</strong>g at 11 and hav<strong>in</strong>g as many elements as the orig<strong>in</strong>al<strong>in</strong>teger. By us<strong>in</strong>g multiple from clauses, the left sequence is jo<strong>in</strong>ed with each of thegenerated right sequences, result<strong>in</strong>g <strong>in</strong> this output:Left=1; Right=11Left=2; Right=11Left=2; Right=12Licensed to Rhona Hadida


306 CHAPTER 11 Query expressions and LINQ to ObjectsLeft=3; Right=11Left=3; Right=12Left=3; Right=13Left=4; Right=11Left=4; Right=12Left=4; Right=13Left=4; Right=14The method the compiler calls to generate this sequence is SelectMany. It takes a s<strong>in</strong>gle<strong>in</strong>put sequence (the left sequence <strong>in</strong> our term<strong>in</strong>ology), a delegate to generateanother sequence from any element of the left sequence, and a delegate to generate aresult element given an element of each of the sequences. Here’s the signature of themethod, aga<strong>in</strong> written as if it were an <strong>in</strong>stance method on IEnumerable:public IEnumerable SelectMany (Func collectionSelector,Func resultSelector)As with the other jo<strong>in</strong>s, if the part of the query expression follow<strong>in</strong>g the jo<strong>in</strong> is a selectclause, that projection is used as the f<strong>in</strong>al argument; otherwise, a transparent identifieris <strong>in</strong>troduced to make both the left and right sequences’ range variables available.Just to make this all a bit more concrete, here’s the query expression of list<strong>in</strong>g 11.16,as the translated source code:Enumerable.Range(1, 4).SelectMany (left => Enumerable.Range(11, left),(left, right) => new {Left=left,Right=right})One <strong>in</strong>terest<strong>in</strong>g feature of SelectMany is that the execution is completely streamed—it only needs to process one element of each sequence at a time, because it uses afreshly generated right sequence for each different element of the left sequence.Compare this with <strong>in</strong>ner jo<strong>in</strong>s and group jo<strong>in</strong>s: they both load the right sequencecompletely before start<strong>in</strong>g to return any results. You should bear <strong>in</strong> m<strong>in</strong>d theexpected size of sequence, and how expensive it might be to evaluate it multipletimes, when consider<strong>in</strong>g which type of jo<strong>in</strong> to use and which sequence to use as theleft and which as the right.This behavior of flatten<strong>in</strong>g a sequence of sequences, one produced from each element<strong>in</strong> an orig<strong>in</strong>al sequence, can be very useful. Consider a situation where youmight want to process a lot of log files, a l<strong>in</strong>e at a time. We can process a seamlesssequence of l<strong>in</strong>es, with barely any work. The follow<strong>in</strong>g pseudo-code is filled <strong>in</strong> morethoroughly <strong>in</strong> the downloadable source code, but the overall mean<strong>in</strong>g and usefulnessshould be clear:var query = from file <strong>in</strong> Directory.GetFiles(logDirectory, "*.log")from l<strong>in</strong>e <strong>in</strong> new FileL<strong>in</strong>eReader(file)let entry = new LogEntry(l<strong>in</strong>e)where entry.Type == EntryType.Errorselect entry;Licensed to Rhona Hadida


Group<strong>in</strong>gs and cont<strong>in</strong>uations307In just five l<strong>in</strong>es of code we have retrieved, parsed, and filtered a whole collection oflog files, return<strong>in</strong>g a sequence of entries represent<strong>in</strong>g errors. Crucially, we haven’t hadto load even a s<strong>in</strong>gle full log file <strong>in</strong>to memory all <strong>in</strong> one go, let alone all of the files—all the data is streamed.Hav<strong>in</strong>g tackled jo<strong>in</strong>s, the last items we need to look at are slightly easier to understand.We’re go<strong>in</strong>g to look at group<strong>in</strong>g elements by a key, and cont<strong>in</strong>u<strong>in</strong>g a queryexpression after a group … by or select clause.11.6 Group<strong>in</strong>gs and cont<strong>in</strong>uationsOne common requirement is to group a sequence of elements by one of its properties.LINQ makes this easy with the group … by clause. As well as describ<strong>in</strong>g this f<strong>in</strong>altype of clause, we’ll also revisit our earliest one (select) to see a feature called querycont<strong>in</strong>uations that can be applied to both group<strong>in</strong>gs and projections. Let’s start with asimple group<strong>in</strong>g.11.6.1 Group<strong>in</strong>g with the group … by clauseGroup<strong>in</strong>g is largely <strong>in</strong>tuitive, and LINQ makes it simple. To group a sequence <strong>in</strong> aquery expression, all you need to do is use the group … by clause, with this syntax:group projection by group<strong>in</strong>gThis clause comes at the end of a query expression <strong>in</strong> the same way a select clausedoes. The similarities between these clauses don’t end there: the projection expressionis the same k<strong>in</strong>d of projection a select clause uses. The outcome is somewhat different,however.The group<strong>in</strong>g expression determ<strong>in</strong>es what the sequence is grouped by—the key ofthe group<strong>in</strong>g. The overall result is a sequence where each element is itself a sequenceof projected elements, and also has a Key property, which is the key for that group;this comb<strong>in</strong>ation is encapsulated <strong>in</strong> the IGroup<strong>in</strong>g <strong>in</strong>terface, whichextends IEnumerable.Let’s have a look at a simple example from the SkeetySoft defect system: group<strong>in</strong>gdefects by their current assignee. List<strong>in</strong>g 11.17 does this with the simplest form of projection,so that the result<strong>in</strong>g sequence has the assignee as the key, and a sequence ofdefects embedded <strong>in</strong> each entry.List<strong>in</strong>g 11.17Group<strong>in</strong>g defects by assignee—trivial projectionvar query = from defect <strong>in</strong> SampleData.AllDefectswhere defect.AssignedTo != nullgroup defect by defect.AssignedTo;foreach (var entry <strong>in</strong> query){Console.WriteL<strong>in</strong>e(entry.Key.Name);foreach (var defect <strong>in</strong> entry){Console.WriteL<strong>in</strong>e(" ({0}) {1}",Uses key of eachentry: the assigneeEDBFilters outunassigned defectsGroups byassigneeCIterates over entry’ssubsequenceLicensed to Rhona Hadida


308 CHAPTER 11 Query expressions and LINQ to Objects}}Console.WriteL<strong>in</strong>e();defect.Severity,defect.Summary);List<strong>in</strong>g 11.17 might be useful <strong>in</strong> a daily build report, to quickly see what defects eachperson needs to look at. We’ve filtered out all the defects that don’t need any moreattention B and then grouped us<strong>in</strong>g the AssignedTo property. Although this timewe’re just us<strong>in</strong>g a property, the group<strong>in</strong>g expression can be anyth<strong>in</strong>g you like—it’s justapplied to each entry <strong>in</strong> the <strong>in</strong>com<strong>in</strong>g sequence, and the sequence is grouped basedon the result of the expression. Note that group<strong>in</strong>g cannot stream the results,although it streams the <strong>in</strong>put, apply<strong>in</strong>g the key selection and projection to each elementand buffer<strong>in</strong>g the grouped sequences of projected elements.The projection we’ve applied <strong>in</strong> the group<strong>in</strong>g C is trivial—it just selects the orig<strong>in</strong>alelement. As we go through the result<strong>in</strong>g sequence, each entry has a Key property,which is of type User D, and each entry also implements IEnumerable,which is the sequence of defects assigned to that user E.The results of list<strong>in</strong>g 11.17 start like this:Darren Dahlia(Showstopper) MP3 files crash system(Major) Can't play files more than 200 bytes long(Major) DivX is choppy on Pentium 100(Trivial) User <strong>in</strong>terface should be more caramellyAfter all of Darren’s defects have been pr<strong>in</strong>ted out, we see Tara’s, then Tim’s, and soon. The implementation effectively keeps a list of the assignees it’s seen so far, andadds a new one every time it needs to. Figure 11.9 shows the sequences generatedthroughout the query expression, which may make this order<strong>in</strong>g clearer.With<strong>in</strong> each entry’s subsequence, the order of the defects is the same as the orderof the orig<strong>in</strong>al defect sequence. If you actively care about the order<strong>in</strong>g, considerexplicitly stat<strong>in</strong>g it <strong>in</strong> the query expression, to make it more readable.If you run list<strong>in</strong>g 11.17, you’ll see that Mary Malcop doesn’t appear <strong>in</strong> the output atall, because she doesn’t have any defects assigned to her. If you wanted to produce afull list of users and defects assigned to each of them, you’d need to use a group jo<strong>in</strong>like the one used <strong>in</strong> list<strong>in</strong>g 11.14.The compiler always uses a method called GroupBy for group<strong>in</strong>g clauses. When theprojection <strong>in</strong> a group<strong>in</strong>g clause is trivial—<strong>in</strong> other words, when each entry <strong>in</strong> the orig<strong>in</strong>alsequence maps directly to the exact same object <strong>in</strong> a subsequence—the compileruses a simple method call, which just needs the group<strong>in</strong>g expression, so it knows howto map each element to a key. For <strong>in</strong>stance, the query expression <strong>in</strong> list<strong>in</strong>g 11.17 istranslated <strong>in</strong>to this nonquery expression:SampleData.AllDefects.Where(defect => defect.AssignedTo != null).GroupBy(defect => defect.AssignedTo)Licensed to Rhona Hadida


Group<strong>in</strong>gs and cont<strong>in</strong>uations309from defect <strong>in</strong>SampleData.AllDefectsDefect: { ID=1, AssignedTo=Darren ...}Defect: { ID=2, AssignedTo=null ...}Defect: { ID=3, AssignedTo=Tara ...}Defect: { ID=4, AssignedTo=Darren ...}Defect: { ID=5, AssignedTo=Tim ...}Defect: { ID=6, AssignedTo=Darren ...}...where defect.AssignedTo != nullDefect: { ID=1, AssignedTo=Darren ...}Defect: { ID=3, AssignedTo=Tara ...}Defect: { ID=4, AssignedTo=Darren ...}Defect: { ID=5, AssignedTo=Tim ...}Defect: { ID=6, AssignedTo=Darren ...}...group defect by defect.AssignedToKey=DarrenKey=TaraKey=TimDefect: { ID=1, AssignedTo=Darren ...}Defect: { ID=4, AssignedTo=Darren ...}Defect: { ID=6, AssignedTo=Darren ...}...Defect: { ID=3, AssignedTo=Tara ...}Defect: { ID=13, AssignedTo=Tara ...}...Defect: { ID=5, AssignedTo=Tim ...}Defect: { ID=8, AssignedTo=Tim ...}...(Result of query)Figure 11.9 Sequences used when group<strong>in</strong>g defects by assignee. Eachentry of the result has a Key property and is also a sequence of defect entries.When the projection is nontrivial, a slightly more complicated version is used. List<strong>in</strong>g11.18 gives an example of a projection so that we only capture the summary ofeach defect rather than the Defect object itself.List<strong>in</strong>g 11.18Group<strong>in</strong>g defects by assignee—projection reta<strong>in</strong>s just the summaryvar query = from defect <strong>in</strong> SampleData.AllDefectswhere defect.AssignedTo != nullgroup defect.Summary by defect.AssignedTo;foreach (var entry <strong>in</strong> query)Licensed to Rhona Hadida


310 CHAPTER 11 Query expressions and LINQ to Objects{}Console.WriteL<strong>in</strong>e(entry.Key.Name);foreach (var summary <strong>in</strong> entry){Console.WriteL<strong>in</strong>e(" {0}", summary);}Console.WriteL<strong>in</strong>e();I’ve highlighted the differences between list<strong>in</strong>g 11.18 and list<strong>in</strong>g 11.17 <strong>in</strong> bold. Hav<strong>in</strong>gprojected a defect to just its summary, the embedded sequence <strong>in</strong> each entry is just anIEnumerable. In this case, the compiler uses an overload of GroupBy withanother parameter to represent the projection. The query expression <strong>in</strong> list<strong>in</strong>g 11.18is translated <strong>in</strong>to the follow<strong>in</strong>g expression:SampleData.AllDefects.Where(defect => defect.AssignedTo != null).GroupBy(defect => defect.AssignedTo,defect => defect.Summary)There are more complex overloads of GroupBy available as extension methods onIEnumerable, but they aren’t used by the <strong>C#</strong> 3 compiler when translat<strong>in</strong>g queryexpressions. You can call them manually, of course—if you f<strong>in</strong>d you want more powerfulgroup<strong>in</strong>g behavior than query expressions provide natively, then they’re worthlook<strong>in</strong>g <strong>in</strong>to.Group<strong>in</strong>g clauses are relatively simple but very useful. Even <strong>in</strong> our defect-track<strong>in</strong>gsystem, you could easily imag<strong>in</strong>e want<strong>in</strong>g to group defects by project, creator, severity,or status, as well as the assignee we’ve used for these examples.So far, we’ve ended each query expression with a select or group … by clause, andthat’s been the end of the expression. There are times, however, when you want to domore with the results—and that’s where query cont<strong>in</strong>uations are used.11.6.2 Query cont<strong>in</strong>uationsQuery cont<strong>in</strong>uations provide a way of us<strong>in</strong>g the result of one query expression as the<strong>in</strong>itial sequence of another. They apply to both group … by and select clauses, andthe syntax is the same for both—you simply use the contextual keyword <strong>in</strong>to and thenprovide the name of a new range variable. That range variable can then be used <strong>in</strong> thenext part of the query expression.The <strong>C#</strong> 3 specification expla<strong>in</strong>s this <strong>in</strong> terms of a translation from one queryexpression to another, chang<strong>in</strong>g<strong>in</strong>tofirst-query <strong>in</strong>to identifiersecond-query-bodyfrom identifier <strong>in</strong> (first-query)second-query-bodyAn example will make this a lot clearer. Let’s go back to our group<strong>in</strong>g of defects byassignee, but this time imag<strong>in</strong>e we only want the count of the defects assigned to eachLicensed to Rhona Hadida


Group<strong>in</strong>gs and cont<strong>in</strong>uations311person. We can’t do that with the projection <strong>in</strong> the group<strong>in</strong>g clause, because that onlyapplies to each <strong>in</strong>dividual defect. We want to project an assignee and the sequence oftheir defects <strong>in</strong>to the assignee and the count from that sequence, which is achievedus<strong>in</strong>g the code <strong>in</strong> list<strong>in</strong>g 11.19.List<strong>in</strong>g 11.19Cont<strong>in</strong>u<strong>in</strong>g a group<strong>in</strong>g with another projectionvar query = from defect <strong>in</strong> SampleData.AllDefectswhere defect.AssignedTo != nullgroup defect by defect.AssignedTo <strong>in</strong>to groupedselect new { Assignee=grouped.Key,Count=grouped.Count() };foreach (var entry <strong>in</strong> query){Console.WriteL<strong>in</strong>e("{0}: {1}",entry.Assignee.Name,entry.Count);}The changes to the query expression are highlighted <strong>in</strong> bold. We can use the groupedrange variable <strong>in</strong> the second part of the query, but the defect range variable is nolonger available—you can th<strong>in</strong>k of it as be<strong>in</strong>g out of scope. Our projection simply createsan anonymous type with Assignee and Count properties, us<strong>in</strong>g the key of eachgroup as the assignee, and count<strong>in</strong>g the sequence of defects associated with eachgroup. The results of list<strong>in</strong>g 11.19 are as follows:Darren Dahlia: 14Tara Tutu: 5Tim Trotter: 5Deborah Denton: 9Col<strong>in</strong> Carton: 2Follow<strong>in</strong>g the specification, the query expression from list<strong>in</strong>g 11.19 is translated <strong>in</strong>tothis one:from grouped <strong>in</strong> (from defect <strong>in</strong> SampleData.AllDefectswhere defect.AssignedTo != nullgroup defect by defect.AssignedTo)select new { Assignee=grouped.Key, Count=grouped.Count() }The rest of the translations are then performed, result<strong>in</strong>g <strong>in</strong> the follow<strong>in</strong>g code:SampleData.AllDefects.Where (defect => defect.AssignedTo != null).GroupBy(defect => defect.AssignedTo).Select(grouped => new { Assignee=grouped.Key,Count=grouped.Count() })An alternative way of understand<strong>in</strong>g cont<strong>in</strong>uations is to th<strong>in</strong>k of two separate statements.This isn’t as accurate <strong>in</strong> terms of the actual compiler translation, but I f<strong>in</strong>d itmakes it easier to see what’s go<strong>in</strong>g on. In this case, the query expression (and assignmentto the query variable) can be thought of as the follow<strong>in</strong>g two statements:Licensed to Rhona Hadida


312 CHAPTER 11 Query expressions and LINQ to Objectsvar tmp = from defect <strong>in</strong> SampleData.AllDefectswhere defect.AssignedTo != nullgroup defect by defect.AssignedTo;var query = from grouped <strong>in</strong> tmpselect new { Assignee=grouped.Key,Count=grouped.Count() };Of course, if you f<strong>in</strong>d this easier to read there’s noth<strong>in</strong>g to stop you from break<strong>in</strong>g upthe orig<strong>in</strong>al expression <strong>in</strong>to this form <strong>in</strong> your source code. Noth<strong>in</strong>g will be evaluateduntil you start try<strong>in</strong>g to step through the query results anyway, due to deferred execution.Let’s extend this example to see how multiple cont<strong>in</strong>uations can be used. Ourresults are currently unordered—let’s change that so we can see who’s got the mostdefects assigned to them first. We could use a let clause after the first cont<strong>in</strong>uation,but list<strong>in</strong>g 11.20 shows an alternative with a second cont<strong>in</strong>uation after our currentexpression.List<strong>in</strong>g 11.20Query expression cont<strong>in</strong>uations from group and selectvar query = from defect <strong>in</strong> SampleData.AllDefectswhere defect.AssignedTo != nullgroup defect by defect.AssignedTo <strong>in</strong>to groupedselect new { Assignee=grouped.Key,Count=grouped.Count() } <strong>in</strong>to resultorderby result.Count descend<strong>in</strong>gselect result;foreach (var entry <strong>in</strong> query){Console.WriteL<strong>in</strong>e("{0}: {1}",entry.Assignee.Name,entry.Count);}The changes between list<strong>in</strong>g 11.19 and 11.20 are highlighted <strong>in</strong> bold. We haven’t hadto change any of the output code as we’ve got the same type of sequence—we’ve justapplied an order<strong>in</strong>g to it. This time the translated query expression is as follows:SampleData.AllDefects.Where (defect => defect.AssignedTo != null).GroupBy(defect => defect.AssignedTo).Select(grouped => new { Assignee=grouped.Key,Count=grouped.Count() }).OrderByDescend<strong>in</strong>g(result => result.Count);By pure co<strong>in</strong>cidence, this is remarkably similar to the first defect track<strong>in</strong>g query wecame across, <strong>in</strong> section 10.3.5. Our f<strong>in</strong>al select clause effectively does noth<strong>in</strong>g, so the<strong>C#</strong> compiler ignores it. It’s required <strong>in</strong> the query expression, however, as all queryexpressions have to end with either a select or a group … by clause. There’s noth<strong>in</strong>gto stop you from us<strong>in</strong>g a different projection or perform<strong>in</strong>g other operations with thecont<strong>in</strong>ued query—jo<strong>in</strong>s, further group<strong>in</strong>gs, and so forth. Just keep an eye on the readabilityof the query expression as it grows.Licensed to Rhona Hadida


Summary31311.7 SummaryIn this chapter, we’ve looked at how LINQ to Objects and <strong>C#</strong> 3 <strong>in</strong>teract, focus<strong>in</strong>g onthe way that query expressions are first translated <strong>in</strong>to code that doesn’t <strong>in</strong>volve queryexpressions, then compiled <strong>in</strong> the usual way. We’ve seen how all query expressionsform a series of sequences, apply<strong>in</strong>g a transformation of some description at eachstep. In many cases these sequences are evaluated us<strong>in</strong>g deferred execution, fetch<strong>in</strong>gdata only when it is first required.Compared with all the other features of <strong>C#</strong> 3, query expressions look somewhatalien—more like SQL than the <strong>C#</strong> we’re used to. One of the reasons they look so oddis that they’re declarative <strong>in</strong>stead of imperative—a query talks about the features of theend result rather than the exact steps required to achieve it. This goes hand <strong>in</strong> handwith a more functional way of th<strong>in</strong>k<strong>in</strong>g. It can take a while to click, and it’s certa<strong>in</strong>lynot suitable for every situation, but where declarative syntax is appropriate it can vastlyimprove readability, as well as mak<strong>in</strong>g code easier to test and also easier to parallelize.Don’t be fooled <strong>in</strong>to th<strong>in</strong>k<strong>in</strong>g that LINQ should only be used with databases: pla<strong>in</strong><strong>in</strong>-memory manipulation of collections is common, and as we’ve seen it’s supportedvery well by query expressions and the extension methods <strong>in</strong> Enumerable.In a very real sense, you’ve seen all the new features of <strong>C#</strong> 3 now! Although wehaven’t looked at any other LINQ providers yet, we now have a clearer understand<strong>in</strong>gof what the compiler will do for us when we ask it to handle XML and SQL. The compileritself doesn’t know the difference between LINQ to Objects, LINQ to SQL, or anyof the other providers: it just follows the same rules bl<strong>in</strong>dly. In the next chapter we’llsee how these rules form the f<strong>in</strong>al piece of the LINQ jigsaw puzzle when they convertlambda expressions <strong>in</strong>to the expression trees so that the various clauses of queryexpressions can be executed on different platforms.Licensed to Rhona Hadida


LINQ beyond collectionsThis chapter covers■■■■■■LINQ to SQLBuild<strong>in</strong>g providers with IQueryableLINQ to DataSetLINQ to XMLThird party-LINQFuture Microsoft LINQ technologiesIn the previous chapter, we saw how LINQ to Objects works, with the <strong>C#</strong> 3 compilertranslat<strong>in</strong>g query expressions <strong>in</strong>to normal <strong>C#</strong> code, which for LINQ to Objects justhappens to call the extension methods present <strong>in</strong> the Enumerable class. Even withoutany other features, query expressions would have been useful for manipulat<strong>in</strong>gdata <strong>in</strong> memory. It probably wouldn’t have been worth the extra complexity <strong>in</strong> thelanguage, though. In reality, LINQ to Objects is just one aspect of the big picture.In this chapter we’ll take a whirlw<strong>in</strong>d tour of other LINQ providers and APIs.First we’ll look at LINQ to SQL, an Object Relational Mapp<strong>in</strong>g (ORM) solution fromMicrosoft that ships as part of .NET 3.5. After we’ve seen it work<strong>in</strong>g as if by magic,we’ll take a look at what’s happen<strong>in</strong>g beh<strong>in</strong>d the scenes, and how query expressionswritten <strong>in</strong> <strong>C#</strong> end up execut<strong>in</strong>g as SQL on the database.314Licensed to Rhona Hadida


LINQ to SQL315LINQ to DataSet and LINQ to XML are both frameworks that tackle exist<strong>in</strong>g problems(manipulat<strong>in</strong>g datasets and XML respectively) but do so <strong>in</strong> a LINQ-friendly fashion.Both of them use LINQ to Objects for the underly<strong>in</strong>g query support, but the APIshave been designed so that query expressions can be used to access data <strong>in</strong> a pa<strong>in</strong>lessand consistent manner.Hav<strong>in</strong>g covered the LINQ APIs shipped with .NET 3.5, we’ll take a peek at someother providers. Microsoft developers aren’t the only ones writ<strong>in</strong>g LINQ providers,and I’ll show a few third-party examples, before reveal<strong>in</strong>g what Microsoft has <strong>in</strong> storefor us with the ADO.NET Entity Framework and Parallel LINQ.This chapter is not meant to provide you with a comprehensive knowledge of us<strong>in</strong>gLINQ by any means: it’s a truly massive topic, and I’m only go<strong>in</strong>g to scratch the surfaceof each provider here. The purpose of the chapter is to give you a broad idea of whatLINQ is capable of, and how much easier it can make development. Hopefully there’llbe enough of a “wow” factor that you’ll want to study some or all of the providers further.To this end, there are great onl<strong>in</strong>e resources, particularly blog posts from the variousLINQ teams (see this book’s website for a list of l<strong>in</strong>ks), but I also thoroughlyrecommend LINQ <strong>in</strong> Action (Mann<strong>in</strong>g 2008).As well as understand<strong>in</strong>g LINQ itself, by the end of this chapter you should see howthe different pieces of the <strong>C#</strong> 3 feature set all fit together, and why they’re all present<strong>in</strong> the first place. Just as a rem<strong>in</strong>der, you shouldn’t expect to see any new features of <strong>C#</strong>at this po<strong>in</strong>t—we’ve covered them all <strong>in</strong> the previous chapters—but they may wellmake more sense when you see how they help to provide unified query<strong>in</strong>g over multipledata sources.The change <strong>in</strong> pace, from the detail of the previous chapters to the spr<strong>in</strong>t throughfeatures <strong>in</strong> this one, may be slightly alarm<strong>in</strong>g at first. Just relax and enjoy the ride,remember<strong>in</strong>g that the big picture is the important th<strong>in</strong>g here. There won’t be a testafterward, I promise. There’s a lot to cover, so let’s get crack<strong>in</strong>g with the most impressiveLINQ provider <strong>in</strong> .NET 3.5: LINQ to SQL.12.1 LINQ to SQLI’m sure by now you’ve absorbed the message that LINQ to SQL converts query expressions<strong>in</strong>to SQL, which is then executed on the database. There’s more to it than that,however—it’s a full ORM solution. In this section we’ll move our defect system <strong>in</strong>to aSQL Server 2005 database, populate it with the sample defects, query it, and update it.We won’t look at the details of how the queries are converted <strong>in</strong>to SQL until the nextsection, though: it’s easier to understand the mechanics once you’ve seen the endresult. Let’s start off by gett<strong>in</strong>g our database and entities up and runn<strong>in</strong>g.12.1.1 Creat<strong>in</strong>g the defect database and entitiesTo use LINQ to SQL, you need a database (obviously) and some classes represent<strong>in</strong>gthe entities. The classes have metadata associated with them to tell LINQ to SQL howthey map to database tables. This metadata can be built directly <strong>in</strong>to the classes us<strong>in</strong>gLicensed to Rhona Hadida


316 CHAPTER 12 LINQ beyond collectionsattributes, or specified with an XML file. To keep th<strong>in</strong>gs simple, we’re go<strong>in</strong>g to useattributes—and by us<strong>in</strong>g the designer built <strong>in</strong>to Visual Studio 2008, we won’t evenneed to specify the attributes ourselves. First, though, we need a database. It’s possibleto generate the database schema from the entities, but my personal preference is towork “database first.”CREATING THE SQL SERVER 2005 DATABASE SCHEMAThe mapp<strong>in</strong>g from the classes we had before to SQL Server 2005 database tables isstraightforward. Each table has an auto<strong>in</strong>crement<strong>in</strong>g <strong>in</strong>teger ID column, with anappropriate name: ProjectID, DefectID, and so forth. The references between tablessimply use the same name, so the Defect table has a ProjectID column, for <strong>in</strong>stance,with a foreign key constra<strong>in</strong>t. There are a few exceptions to this simple set of rules:■■■User is a reserved word <strong>in</strong> T-SQL, so the User class is mapped to the DefectUsertable.The enumerations (status, severity, and user type) don’t have tables: their valuesare simply mapped to t<strong>in</strong>y<strong>in</strong>t columns <strong>in</strong> the Defect and DefectUser tables.The Defect table has two l<strong>in</strong>ks to the DefectUser table, one for the user whocreated the defect and one for the current assignee. These are represented withthe CreatedByUserId and AssignedToUserId columns, respectively.The database is available as part of the downloadable source code so that you can useit with SQL Server 2005 Express yourself. If you leave the files <strong>in</strong> the same directorystructure that they come <strong>in</strong>, you won’t even need to change the connection str<strong>in</strong>gwhen you use the sample code.CREATING THE ENTITY CLASSESOnce our tables are created, creat<strong>in</strong>g the entity classes from Visual Studio 2008 iseasy. Simply open Server Explorer (View, Server Explorer) and add a data source tothe SkeetySoftDefects database (right-click on Data Connections and select Add Connection).You should be able to see four tables: Defect, DefectUser, Project, andNotificationSubscription.In a <strong>C#</strong> project target<strong>in</strong>g .NET 3.5, you should then be able to add a new item oftype “LINQ to SQL classes.” When choos<strong>in</strong>g a name for the item, bear <strong>in</strong> m<strong>in</strong>d thatamong the classes it will create for you, there will be one with the selected name followedby DataContext—this is go<strong>in</strong>g to be an important class, so choose the namecarefully. Visual Studio 2008 doesn’t make it terribly easy to refactor this after you’vecreated it, unfortunately. I chose DefectModel—so the data context class is calledDefectModelDataContext.The designer will open when you’ve created the new item. You can then drag thefour tables from Server Explorer <strong>in</strong>to the designer, and it will figure out all the associations.After that, you can rearrange the diagram, and adjust various properties of theentities. Here’s a list of what I changed:■■I renamed the DefectID property to ID to match our previous model.I renamed DefectUser to User (so although the table is still called DefectUser,we’ll generate a class called User, just like before).Licensed to Rhona Hadida


LINQ to SQL317■ I changed the type of the Severity, Status, and UserType properties to theirenum equivalents (hav<strong>in</strong>g copied those enumerations <strong>in</strong>to the project).■ I renamed the parent and child properties used for the associations betweenDefect and DefectUser—the designer guessed suitable names for the otherassociations, but had trouble here because there were two associations betweenthe same pair of tables.Figure 12.1 shows the designer diagram after all of these changes.As you can see, the model <strong>in</strong> figure 12.1 is exactly the same as the code modelshown <strong>in</strong> figure 11.3, except without the enumerations. The public <strong>in</strong>terface is so similarthat we can create <strong>in</strong>stances of our entities with the same sample data code. If youlook <strong>in</strong> the <strong>C#</strong> code generated by the designer (DefectModel.designer.cs), you’llf<strong>in</strong>d five partial classes: one for each of the entities, and the DefectModelDataContextclass I mentioned earlier. The fact that they’re partial classes is important when itcomes to mak<strong>in</strong>g the sample data creation code work seamlessly. I had created a constructorfor User, which took the name and user type as parameters. By creat<strong>in</strong>ganother file conta<strong>in</strong><strong>in</strong>g a partial User class, we can add that constructor to our modelaga<strong>in</strong>, as shown <strong>in</strong> list<strong>in</strong>g 12.1.Figure 12.1 The LINQ to SQL classes designershow<strong>in</strong>g the rearranged and modified entitiesLicensed to Rhona Hadida


318 CHAPTER 12 LINQ beyond collectionsList<strong>in</strong>g 12.1Add<strong>in</strong>g a constructor to the User entity classpublic partial class User{public User (str<strong>in</strong>g name, UserType userType): this(){Name = name;UserType = userType;}}It’s important to call the parameterless constructor from our new one—the generatedcode <strong>in</strong>itializes some members there. In the same way as we added a constructor, wecan override ToStr<strong>in</strong>g <strong>in</strong> all the entities, so that the results will be exactly as they were<strong>in</strong> chapter 11. The generated code conta<strong>in</strong>s many partial method declarations, so youcan easily react to various events occurr<strong>in</strong>g on particular <strong>in</strong>stances.At this po<strong>in</strong>t, we can simply copy over the SampleData.cs file from chapter 11 andbuild the project. Now for the tricky bit—copy<strong>in</strong>g the sample data <strong>in</strong>to the databasefrom the <strong>in</strong>-memory version.12.1.2 Populat<strong>in</strong>g the database with sample dataOK, I lied. Populat<strong>in</strong>g the database is ludicrously easy. List<strong>in</strong>g 12.2 shows just how simpleit is.List<strong>in</strong>g 12.2Populat<strong>in</strong>g the database with our sample dataus<strong>in</strong>g (var context = new DefectModelDataContext()){context.Log = Console.Out; C Enables console logg<strong>in</strong>g Bcontext.Users.InsertAllOnSubmit(SampleData.AllUsers);context.Projects.InsertAllOnSubmit(SampleData.AllProjects);context.Defects.InsertAllOnSubmit(SampleData.AllDefects);context.NotificationSubscriptions.InsertAllOnSubmit(SampleData.AllSubscriptions);Populatescontext.SubmitChanges();Flushes changes entities}to databaseECreatescontext towork <strong>in</strong>I’m sure you’ll agree that’s not a lot of code—but all of it is new. Let’s take it one stepat a time. First we create a new data context to work with B. Data contexts are prettymultifunctional, tak<strong>in</strong>g responsibility for connection and transaction management,query translation, track<strong>in</strong>g changes <strong>in</strong> entities, and deal<strong>in</strong>g with identity. For the purposesof this chapter, we can regard a data context as our po<strong>in</strong>t of contact with thedatabase. We won’t be look<strong>in</strong>g at the more advanced features here, but there’s oneuseful capability we’ll take advantage of: at C we tell the data context to write out allthe SQL commands it executes to the console.The four statements at D add the sample entities to the context. The four propertiesof the context (Users, Projects, Defects, and NotificationSubscriptions) areDLicensed to Rhona Hadida


LINQ to SQL319each of type Table for the correspond<strong>in</strong>g type T (User, Project, and so on). Weaccess particular entities via these tables—<strong>in</strong> this case, we’re add<strong>in</strong>g our sample datato the tables.At this po<strong>in</strong>t noth<strong>in</strong>g has actually been sent “across the wire” to the database—it’sjust <strong>in</strong> the context, <strong>in</strong> memory. All the associations <strong>in</strong> our model have been generatedbi-directionally (a user entity knows all the defects currently assigned to it, as well asthe defect know<strong>in</strong>g the user it’s assigned to, for example), which means that we couldhave just called InsertAllOnSubmit once, with any of the entity types, and everyth<strong>in</strong>gelse would have cascaded. However, I’ve explicitly added all the entities here for clarity.The data context makes sure that everyth<strong>in</strong>g is <strong>in</strong>serted once, only once, and <strong>in</strong> anappropriate order to avoid constra<strong>in</strong>t violations.Statement E is where the SQL is actually executed—it’s only here that we see anylog entries <strong>in</strong> the console, too. SubmitChanges is the equivalent of DataAdapter.Update from ADO.NET 1.0—it calls all the necessary INSERT, DELETE, and UPDATE commandson the actual database.Runn<strong>in</strong>g list<strong>in</strong>g 12.2 multiple times will <strong>in</strong>sert the data multiple times too. Thereare many ways of clean<strong>in</strong>g up the database before we start populat<strong>in</strong>g it. We could askthe data context to delete the database and re-create it, assum<strong>in</strong>g we have enoughsecurity permissions: the metadata captured by the designer conta<strong>in</strong>s all the <strong>in</strong>formationrequired for simple cases like ours.Alternatively, we can delete the exist<strong>in</strong>g data. This can either be done with a bulkdelete statement, or by fetch<strong>in</strong>g all the exist<strong>in</strong>g entities and ask<strong>in</strong>g LINQ to SQL todelete them <strong>in</strong>dividually. Clearly delet<strong>in</strong>g <strong>in</strong> bulk is more efficient, but LINQ to SQLdoesn’t provide any mechanism to do this without resort<strong>in</strong>g to a direct SQL command.The ExecuteCommand on DataContext makes this easy to accomplish without worry<strong>in</strong>gabout connection management, but it certa<strong>in</strong>ly sidesteps many of the benefits of us<strong>in</strong>gan ORM solution to start with. It’s nice to be able to execute arbitrary SQL where necessary,but you should only do so when there’s a compell<strong>in</strong>g reason.We won’t exam<strong>in</strong>e the code for any of these methods of wip<strong>in</strong>g the database clean,but they’re all available as part of the source code you can download from the website.So far, we’ve seen noth<strong>in</strong>g that isn’t available <strong>in</strong> a normal ORM system. What makesLINQ to SQL different is the query<strong>in</strong>g…12.1.3 Access<strong>in</strong>g the database with query expressionsI’m sure you’ve guessed what’s com<strong>in</strong>g, but hopefully that won’t make it any less impressive.We’re go<strong>in</strong>g to execute query expressions aga<strong>in</strong>st our data source, watch<strong>in</strong>g LINQto SQL convert the query <strong>in</strong>to SQL on the fly. For the sake of familiarity, we’ll use someof the same queries we saw execut<strong>in</strong>g aga<strong>in</strong>st our <strong>in</strong>-memory collections <strong>in</strong> chapter 11.FIRST QUERY: FINDING DEFECTS ASSIGNED TO TIMI’ll skip over the trivial examples from early <strong>in</strong> the chapter, start<strong>in</strong>g <strong>in</strong>stead with thequery from list<strong>in</strong>g 11.7 that checks for open defects assigned to Tim. Here’s the querypart of list<strong>in</strong>g 11.7, for the sake of comparison:Licensed to Rhona Hadida


320 CHAPTER 12 LINQ beyond collectionsUser tim = SampleData.Users.TesterTim;var query = from defect <strong>in</strong> SampleData.AllDefectswhere defect.Status != Status.Closedwhere defect.AssignedTo == timselect defect.Summary;The LINQ to SQL equivalent is shown <strong>in</strong> list<strong>in</strong>g 12.3.List<strong>in</strong>g 12.3Query<strong>in</strong>g the database to f<strong>in</strong>d all Tim’s open defectsus<strong>in</strong>g (var context = new DefectModelDataContext()){context.Log = Console.Out;User tim = (from user <strong>in</strong> context.Userswhere user.Name=="Tim Trotter"select user).S<strong>in</strong>gle();var query = from defect <strong>in</strong> context.Defectswhere defect.Status != Status.Closedwhere defect.AssignedTo == timselect defect.Summary;Creates contextto work withQueries databaseto f<strong>in</strong>d TimQueries databaseto f<strong>in</strong>d Tim’s opendefects}foreach (var summary <strong>in</strong> query){Console.WriteL<strong>in</strong>e(summary);}We can’t use SampleData.Users.TesterTim <strong>in</strong> the ma<strong>in</strong> query because that objectdoesn’t know the ID of Tim’s row <strong>in</strong> the DefectUser table. Instead, we use one query toload Tim’s user entity, and then a second query to f<strong>in</strong>d the open defects. The S<strong>in</strong>glemethod call at the end of the query expression just returns a s<strong>in</strong>gle result from asequence, throw<strong>in</strong>g an exception if there isn’t exactly one element. In a real-life situation,you may well have the entity as a product of other operations such as logg<strong>in</strong>g <strong>in</strong>—and if you don’t have the full entity, you may well have its ID, which can be used equallywell with<strong>in</strong> the ma<strong>in</strong> query.With<strong>in</strong> the second query expression, the only difference between the <strong>in</strong>-memoryquery and the LINQ to SQL query is the data source—<strong>in</strong>stead of us<strong>in</strong>g Sample-Data.Defects, we use context.Defects. The f<strong>in</strong>al results are the same (although theorder<strong>in</strong>g isn’t guaranteed), but the work has been done on the database. The consoleoutput shows both of the queries executed on the database, along with the queryparameter values: 1SELECT [t0].[UserID], [t0].[Name], [t0].[UserType]FROM [dbo].[DefectUser] AS [t0]WHERE [t0].[Name] = @p0-- @p0: Input Str<strong>in</strong>g (Size = 11; Prec = 0; Scale = 0) [Tim Trotter]SELECT [t0].[Summary]1Additional log output is generated show<strong>in</strong>g some details of the data context, which I’ve cut to avoid distract<strong>in</strong>gfrom the SQL.Licensed to Rhona Hadida


LINQ to SQL321FROM [dbo].[Defect] AS [t0]WHERE ([t0].[AssignedToUserID] = @p0) AND ([t0].[Status] @p1)-- @p0: Input Int32 (Size = 0; Prec = 0; Scale = 0) [2]-- @p1: Input Int32 (Size = 0; Prec = 0; Scale = 0) [4]Notice how the first query fetches all of the properties of the user because we’re populat<strong>in</strong>ga whole entity—but the second query only fetches the summary as that’s all weneed. LINQ to SQL has also converted our two separate where clauses <strong>in</strong> the secondquery <strong>in</strong>to a s<strong>in</strong>gle filter on the database.NOTEAn alternative to console logg<strong>in</strong>g: the debug visualizer—For a more <strong>in</strong>teractiveview <strong>in</strong>to LINQ to SQL queries, you can use the debug visualizer, whichScott Guthrie has made available. This shows the SQL correspond<strong>in</strong>g tothe query, and allows you to execute and even edit it manually with<strong>in</strong> thedebugger. It’s free, and <strong>in</strong>cludes source code: http://weblogs.asp.net/scottgu/archive/2007/07/31/l<strong>in</strong>q-to-sql-debug-visualizer.aspxLINQ to SQL is capable of translat<strong>in</strong>g a wide range of expressions. Let’s look at somemore examples from chapter 11, just to see what SQL is generated.SQL GENERATION FOR A MORE COMPLEX QUERY: A LET CLAUSEOur next query shows what happens when we <strong>in</strong>troduce a sort of “temporary variable”with a let clause. In chapter 11 we considered quite a bizarre situation, if you remember—pretend<strong>in</strong>gthat calculat<strong>in</strong>g the length of a str<strong>in</strong>g took a long time. Aga<strong>in</strong>, thequery expression is exactly the same as <strong>in</strong> list<strong>in</strong>g 11.11, with the exception of the datasource. List<strong>in</strong>g 12.4 shows the LINQ to SQL code.List<strong>in</strong>g 12.4Us<strong>in</strong>g a let clause <strong>in</strong> LINQ to SQLus<strong>in</strong>g (var context = new DefectModelDataContext()){context.Log = Console.Out;}var query = from user <strong>in</strong> context.Userslet length = user.Name.Lengthorderby lengthselect new { Name = user.Name, Length = length };foreach (var entry <strong>in</strong> query){Console.WriteL<strong>in</strong>e("{0}: {1}", entry.Length, entry.Name);}The generated SQL is very close to the spirit of the sequences we saw <strong>in</strong> figure 11.5—the <strong>in</strong>nermost sequence (the first one <strong>in</strong> the diagram) is the list of users; that’s transformed<strong>in</strong>to a sequence of name/length pairs (as the nested select), and then theno-op projection is applied, with an order<strong>in</strong>g by length:SELECT [t1].[Name], [t1].[value]FROM (SELECT LEN([t0].[Name]) AS [value], [t0].[Name]Licensed to Rhona Hadida


322 CHAPTER 12 LINQ beyond collectionsFROM [dbo].[DefectUser] AS [t0]) AS [t1]ORDER BY [t1].[value]This is a good example of where the generated SQL is wordier than it needs to be.Although we couldn’t reference the elements of the f<strong>in</strong>al output sequence when perform<strong>in</strong>gan order<strong>in</strong>g on the query expression, you can <strong>in</strong> SQL. This simpler querywould have worked f<strong>in</strong>e:SELECT LEN([t0].[Name]) AS [value], [t0].[Name]FROM [dbo].[DefectUser] AS [t0]ORDER BY [value]Of course, what’s important is what the query optimizer does on the database—theexecution plan displayed <strong>in</strong> SQL Server Management Studio Express is the same forboth queries, so it doesn’t look like we’re los<strong>in</strong>g out.Next we’ll have a look at a couple of the jo<strong>in</strong>s we used <strong>in</strong> chapter 11.EXPLICIT JOINS: MATCHING DEFECTS WITH NOTIFICATION SUBSCRIPTIONSWe’ll try both <strong>in</strong>ner jo<strong>in</strong>s and group jo<strong>in</strong>s, us<strong>in</strong>g the examples of jo<strong>in</strong><strong>in</strong>g notificationsubscriptions aga<strong>in</strong>st projects. I suspect you’re used to the drill now—the pattern ofthe code is the same for each query, so from here on I’ll just show the query expressionand the generated SQL unless someth<strong>in</strong>g else is go<strong>in</strong>g on.// Query expression (modified from list<strong>in</strong>g 11.12)from defect <strong>in</strong> context.Defectsjo<strong>in</strong> subscription <strong>in</strong> context.NotificationSubscriptionson defect.Project equals subscription.Projectselect new { defect.Summary, subscription.EmailAddress }-- Generated SQLSELECT [t0].[Summary], [t1].[EmailAddress]FROM [dbo].[Defect] AS [t0]INNER JOIN [dbo].[NotificationSubscription] AS [t1]ON [t0].[ProjectID] = [t1].[ProjectID]Unsurpris<strong>in</strong>gly, it uses an <strong>in</strong>ner jo<strong>in</strong> <strong>in</strong> SQL. It would be easy to guess at the generatedSQL <strong>in</strong> this case. How about a group jo<strong>in</strong>, though? Well, this is where th<strong>in</strong>gs getslightly more hectic:// Query expression (modified from list<strong>in</strong>g 11.13)from defect <strong>in</strong> context.Defectsjo<strong>in</strong> subscription <strong>in</strong> context.NotificationSubscriptionson defect.Project equals subscription.Project<strong>in</strong>to groupedSubscriptionsselect new { Defect = defect, Subscriptions = groupedSubscriptions }-- Generated SQLSELECT [t0].[DefectID] AS [ID], [t0].[Created],[t0].[LastModified], [t0].[Summary], [t0].[Severity],[t0].[Status], [t0].[AssignedToUserID],[t0].[CreatedByUserID], [t0].[ProjectID],[t1].[NotificationSubscriptionID],[t1].[ProjectID] AS [ProjectID2], [t1].[EmailAddress],(SELECT COUNT(*)Licensed to Rhona Hadida


LINQ to SQL323FROM [dbo].[NotificationSubscription] AS [t2]WHERE [t0].[ProjectID] = [t2].[ProjectID]) AS [count]FROM [dbo].[Defect] AS [t0]LEFT OUTER JOIN [dbo].[NotificationSubscription] AS [t1]ON [t0].[ProjectID] = [t1].[ProjectID]ORDER BY [t0].[DefectID], [t1].[NotificationSubscriptionID]That’s a pretty major change <strong>in</strong> the amount of SQL generated! There are two importantth<strong>in</strong>gs to notice. First, it uses a left outer jo<strong>in</strong> <strong>in</strong>stead of an <strong>in</strong>ner jo<strong>in</strong>, so we wouldstill see a defect even if it didn’t have anyone subscrib<strong>in</strong>g to its project. If you want aleft outer jo<strong>in</strong> but without the group<strong>in</strong>g, the conventional way of express<strong>in</strong>g this is touse a group jo<strong>in</strong> and then an extra from clause us<strong>in</strong>g the DefaultIfEmpty extensionmethod on the embedded sequence. It looks quite odd, but it works well. See the samplesource code for this chapter on the book’s website for more details.The second odd th<strong>in</strong>g about the previous query is that it calculates the count foreach group with<strong>in</strong> the database. This is effectively a trick performed by LINQ to SQL tomake sure that all the process<strong>in</strong>g can be done on the server. A naive implementationwould have to perform the group<strong>in</strong>g <strong>in</strong> memory, after fetch<strong>in</strong>g all the results. In somecases the provider could do tricks to avoid need<strong>in</strong>g the count, simply spott<strong>in</strong>g whenthe group<strong>in</strong>g ID changes, but there are issues with this approach for some queries. It’spossible that a later implementation of LINQ to SQL will be able to switch courses ofaction depend<strong>in</strong>g on the exact query.You don’t need to explicitly write a jo<strong>in</strong> <strong>in</strong> the query expression to see one <strong>in</strong> theSQL, however. We’re able to express our query <strong>in</strong> an object-oriented way, even thoughit will be converted <strong>in</strong>to SQL. Let’s see this <strong>in</strong> action.IMPLICIT JOINS: SHOWING DEFECT SUMMARIES AND PROJECT NAMESLet’s take a simple example. Suppose we want to list each defect, show<strong>in</strong>g its summaryand the name of the project it’s part of. The query expression is just a matterof a projection:// Query expressionfrom defect <strong>in</strong> context.Defectsselect new { defect.Summary, ProjectName=defect.Project.Name }-- Generated SQLSELECT [t0].[Summary], [t1].[Name]FROM [dbo].[Defect] AS [t0]INNER JOIN [dbo].[Project] AS [t1]ON [t1].[ProjectID] = [t0].[ProjectID]Notice how we’ve navigated from the defect to the project via a property—LINQ toSQL has converted that navigation <strong>in</strong>to an <strong>in</strong>ner jo<strong>in</strong>. It’s able to use an <strong>in</strong>ner jo<strong>in</strong>here because the schema has a non-nullable constra<strong>in</strong>t on the ProjectID column ofthe Defect table—every defect has a project. Not every defect has an assignee, however—theAssignedToUserID field is nullable, so if we use the assignee <strong>in</strong> a projection<strong>in</strong>stead, a left outer jo<strong>in</strong> is generated:// Query expressionfrom defect <strong>in</strong> context.Defectsselect new { defect.Summary, Assignee=defect.AssignedTo.Name }Licensed to Rhona Hadida


324 CHAPTER 12 LINQ beyond collections-- Generated SQLSELECT [t0].[Summary], [t1].[Name]FROM [dbo].[Defect] AS [t0]LEFT OUTER JOIN [dbo].[DefectUser] AS [t1]ON [t1].[UserID] = [t0].[AssignedToUserID]Of course, if you navigate via more properties, the jo<strong>in</strong>s get more and more complicated.I’m not go<strong>in</strong>g <strong>in</strong>to the details here—the important th<strong>in</strong>g is that LINQ to SQLhas to do a lot of analysis of the query expression to work out what SQL is required.Before we leave LINQ to SQL, I ought to show you one more feature. It’s part ofwhat you’d expect from any decent ORM system, but leav<strong>in</strong>g it out would just feelwrong. Let’s update some values <strong>in</strong> our database.12.1.4 Updat<strong>in</strong>g the databaseAlthough <strong>in</strong>sertions are straightforward, updates can be handled <strong>in</strong> a variety of ways,depend<strong>in</strong>g on how concurrency is configured. If you’ve done any serious databasework you’ll know that handl<strong>in</strong>g conflicts <strong>in</strong> updates from different users at the sametime is quite hairy—and I’m not go<strong>in</strong>g to open that particular can of worms here. I’lljust show you how easy it is to persist a changed entity when there are no conflicts.Let’s change the status of one of our defects, and its assignee, and that person’sname, all <strong>in</strong> one go. As it happens, I know that the defect with an ID of 1 (as createdon a clean system) is a bug that was created by Tim, and is currently <strong>in</strong> an “accepted”state, assigned to Darren. We’ll imag<strong>in</strong>e that Darren has now fixed the bug, andassigned it back to Tim. At the same time, Tim has decided he wants to be a bit moreformal, so we’ll change his name to Timothy. Oh, and we should remember to updatethe “last modified” field of the defect too. (In a real system, we’d probably handle thatwith a trigger—<strong>in</strong> LINQ to SQL we could implement partial methods to set the lastmodified time when any of the other fields changed. For the sake of simplicity here,we’ll do it manually.)List<strong>in</strong>g 12.5 accomplishes all of this and shows the result—load<strong>in</strong>g it <strong>in</strong> a freshDataContext to show that it has gone back to the database.List<strong>in</strong>g 12.5Updat<strong>in</strong>g a defect and show<strong>in</strong>g the new detailsus<strong>in</strong>g (var context = new DefectModelDataContext()){context.Log = Console.Out;Defect defect = context.Defects.Where(d => d.ID==1).S<strong>in</strong>gle();BF<strong>in</strong>ds defect withextension methodsUser tim = defect.CreatedBy;}defect.AssignedTo = tim;tim.Name = "Timothy Trotter";defect.Status = Status.Fixed;defect.LastModified = SampleData.August(31);context.SubmitChanges();C Updatesentity detailsD Submits changes to databaseLicensed to Rhona Hadida


LINQ to SQL325us<strong>in</strong>g (var context = new DefectModelDataContext()){Defect d = (from defect <strong>in</strong> context.Defectswhere defect.ID==1select defect).S<strong>in</strong>gle();EF<strong>in</strong>ds defect withquery expression}Console.WriteL<strong>in</strong>e (d);List<strong>in</strong>g 12.5 is easy enough to follow—we open up a context and fetch the firstdefect B. After chang<strong>in</strong>g the defect and the entity represent<strong>in</strong>g Tim Trotter C, weask LINQ to SQL to save the changes to the database D. F<strong>in</strong>ally, we fetch thedefect E aga<strong>in</strong> <strong>in</strong> a new context and write the details to the console. Just for a bit ofvariety, I’ve shown two different ways of fetch<strong>in</strong>g the defect—they’re absolutelyequivalent, because the compiler translates the query expression form <strong>in</strong>to the“method call” form anyway.That’s all the LINQ to SQL we’re go<strong>in</strong>g to see—hopefully it’s shown you enough ofthe capabilities to understand how it’s a normal ORM system, but one that has goodsupport for query expressions and the LINQ standard query operators.12.1.5 LINQ to SQL summaryThere are lots of ORMs out there, and many of them allow you to build up queries programmatically<strong>in</strong> a way that can look like LINQ to SQL—if you ignore compile-timecheck<strong>in</strong>g. It’s the comb<strong>in</strong>ation of lambda expressions, expression trees, extensionmethods, and query expressions that make LINQ special, giv<strong>in</strong>g these advantages:■ We’ve been able to use familiar syntax to write the query (at least, familiar whenyou know LINQ to Objects!).■ The compiler has been able to do a lot of validation for us.■ Visual Studio 2008 is able to help us build the query with IntelliSense.■ If we need a mixture of client-side and server-side process<strong>in</strong>g, we can do both <strong>in</strong>a consistent manner.■ We’re still us<strong>in</strong>g the database to do the hard work.Of course, this comes at a cost. As with any ORM system, you want to keep an eye onwhat SQL queries are be<strong>in</strong>g executed for a particular query expression. That’s wherethe logg<strong>in</strong>g is <strong>in</strong>valuable—but don’t forget to turn it off for production! In particular,you will need to be careful of the <strong>in</strong>famous “N+1 selects” issue, where an <strong>in</strong>itial querypulls back results from a s<strong>in</strong>gle table, but us<strong>in</strong>g each result transparently executesanother query to lazily load associated entities. Sometimes you’ll be able to f<strong>in</strong>d anelegant query expression that results <strong>in</strong> exactly the SQL you want to use; other timesyou’ll need to bend the query expression out of shape somewhat. Occasionally you’llneed to write the SQL manually or use a stored procedure <strong>in</strong>stead—as is often the casewith ORMs.I f<strong>in</strong>d it <strong>in</strong>terest<strong>in</strong>g just to take query expressions that you already know work <strong>in</strong>LINQ to Objects and see what SQL is generated when you run them aga<strong>in</strong>st a database.Licensed to Rhona Hadida


326 CHAPTER 12 LINQ beyond collectionsSometimes you can predict how th<strong>in</strong>gs will work, but sometimes there’s more go<strong>in</strong>gon than you might expect. There’s sample code on the book’s website for various queries,but query expressions are pretty easy to write, and I’d strongly encourage you towrite your own for fun. The SkeetySoft defects model is quite a simple one to query, ofcourse—but you’ve seen how Visual Studio 2008 makes it easy to generate entities, sogive it a try with a schema from a real system. I can’t emphasize enough that this brieflook at LINQ to SQL should not be taken as sufficient <strong>in</strong>formation to start build<strong>in</strong>g productioncode. It should be enough to let you experiment, but please read moredetailed documentation before embark<strong>in</strong>g on a real application!I’ve deliberately not gone <strong>in</strong>to how query expressions are converted <strong>in</strong>to SQL <strong>in</strong>this section. I wanted you to get a feel for the capabilities of LINQ to SQL before start<strong>in</strong>gto pick it apart.12.2 Translations us<strong>in</strong>g IQueryable and IQueryProviderIn this section we’re go<strong>in</strong>g to f<strong>in</strong>d out the basics of how LINQ to SQL manages to convertour query expressions <strong>in</strong>to SQL. This is the start<strong>in</strong>g po<strong>in</strong>t for implement<strong>in</strong>g yourown LINQ provider, should you wish to. This is the most theoretical section <strong>in</strong> thechapter, but it’s useful to have some <strong>in</strong>sight as to how LINQ is able to decide whetherto use <strong>in</strong>-memory process<strong>in</strong>g, a database, or some other query eng<strong>in</strong>e.In all the query expressions we’ve seen <strong>in</strong> LINQ to SQL, the source has been aTable. However, if you look at Table, you’ll see it doesn’t have a Where method,or Select, or Jo<strong>in</strong>, or any of the other standard query operators. Instead, it uses thesame trick that LINQ to Objects does—just as the source <strong>in</strong> LINQ to Objects alwaysimplements IEnumerable (possibly after a call to Cast or OfType) and then usesthe extension methods <strong>in</strong> Enumerable, so Table implements IQueryable andthen uses the extension methods <strong>in</strong> Queryable. We’ll see how LINQ builds up anexpression tree and then allows a provider to execute it at the appropriate time. Let’sstart off by look<strong>in</strong>g at what IQueryable consists of.12.2.1 Introduc<strong>in</strong>g IQueryable and related <strong>in</strong>terfacesIf you look up IQueryable <strong>in</strong> the documentation and see what members it conta<strong>in</strong>sdirectly (rather than <strong>in</strong>herit<strong>in</strong>g), you may be disappo<strong>in</strong>ted. There aren’t any. Instead,it <strong>in</strong>herits from IEnumerable and the nongeneric IQueryable.IQueryable <strong>in</strong> turn<strong>in</strong>herits from the nongeneric IEnumerable. So, IQueryable is where the new andexcit<strong>in</strong>g members are, right? Well, nearly. In fact, IQueryable just has three properties:QueryProvider, ElementType, and Expression. The QueryProvider property isof type IQueryProvider—yet another new <strong>in</strong>terface to consider.Lost? Perhaps figure 12.2 will help out—a class diagram of all the <strong>in</strong>terfacesdirectly <strong>in</strong>volved.The easiest way of th<strong>in</strong>k<strong>in</strong>g of IQueryable is that it represents a query that, when executed,will yield a sequence of results. The details of the query <strong>in</strong> LINQ terms are held<strong>in</strong> an expression tree, as returned by the Expression property of the IQueryable. Execut<strong>in</strong>ga query is performed by beg<strong>in</strong>n<strong>in</strong>g to iterate through an IQueryable (<strong>in</strong> otherLicensed to Rhona Hadida


Translations us<strong>in</strong>g IQueryable and IQueryProvider327Figure 12.2Class diagram based on the <strong>in</strong>terfaces <strong>in</strong>volved <strong>in</strong> IQueryablewords, call<strong>in</strong>g the GetEnumerator method) or by a call to the Execute method on anIQueryProvider, pass<strong>in</strong>g <strong>in</strong> an expression tree.So, with at least some grasp of what IQueryable is for, what is IQueryProvider?Well, we can do more than execute a query—we can also build a bigger query from it,which is the purpose of the standard query operators <strong>in</strong> LINQ. 2 To build up a query,we need to use the CreateQuery method on the relevant IQueryProvider. 3Th<strong>in</strong>k of a data source as a simple query (SELECT * FROM SomeTable <strong>in</strong> SQL, for<strong>in</strong>stance)—call<strong>in</strong>g Where, Select, OrderBy, and similar methods results <strong>in</strong> a differentquery, based on the first one. Given any IQueryable query, you can create a new queryby perform<strong>in</strong>g the follow<strong>in</strong>g steps:1 Ask the exist<strong>in</strong>g query for its query expression tree (us<strong>in</strong>g the Expression property).2 Build a new expression tree that conta<strong>in</strong>s the orig<strong>in</strong>al expression and the extrafunctionality you want (a filter, projection, or order<strong>in</strong>g, for <strong>in</strong>stance).2Well, the ones that keep deferr<strong>in</strong>g execution, such as Where and Jo<strong>in</strong>. We’ll see what happens with the aggregationssuch as Count <strong>in</strong> a little while.3Both Execute and CreateQuery have generic and nongeneric overloads. The nongeneric versions make iteasier to create queries dynamically <strong>in</strong> code. Compile-time query expressions use the generic version.Licensed to Rhona Hadida


328 CHAPTER 12 LINQ beyond collections3 Ask the exist<strong>in</strong>g query for its query provider (us<strong>in</strong>g the Provider property).4 Call CreateQuery on the provider, pass<strong>in</strong>g <strong>in</strong> the new expression tree.Of those steps, the only tricky one is creat<strong>in</strong>g the new expression tree. Fortunately,there’s a whole bunch of extension methods on the static Queryable class that do allthat for us. Enough theory—let’s start implement<strong>in</strong>g the <strong>in</strong>terfaces so we can see allthis <strong>in</strong> action.12.2.2 Fak<strong>in</strong>g it: <strong>in</strong>terface implementations to log callsBefore you get too excited, we’re not go<strong>in</strong>g to build our own fully fledged query provider<strong>in</strong> this chapter. However, if you understand everyth<strong>in</strong>g <strong>in</strong> this section, you’ll be<strong>in</strong> a much better position to build one if you ever need to—and possibly more importantly,you’ll understand what’s go<strong>in</strong>g on when you issue LINQ to SQL queries. Most ofthe hard work of query providers goes on at the po<strong>in</strong>t of execution, where they needto parse an expression tree and convert it <strong>in</strong>to the appropriate form for the targetplatform. We’re concentrat<strong>in</strong>g on the work that happens before that—how LINQ preparesto execute a query.We’ll write our own implementations of IQueryable and IQueryProvider, andthen try to run a few queries aga<strong>in</strong>st them. The <strong>in</strong>terest<strong>in</strong>g part isn’t the results—wewon’t be do<strong>in</strong>g anyth<strong>in</strong>g useful with the queries when we execute them—but theseries of calls made up to the po<strong>in</strong>t of execution. We’ll write types FakeQueryProviderand FakeQuery. The implementation of each <strong>in</strong>terface method writes out the currentexpression <strong>in</strong>volved, us<strong>in</strong>g a simple logg<strong>in</strong>g method (not shown here). Let’s look firstat FakeQuery, as shown <strong>in</strong> list<strong>in</strong>g 12.6.List<strong>in</strong>g 12.6A simple implementation of IQueryable that logs method callsclass FakeQuery : IQueryable{public Expression Expression { get; private set; }public IQueryProvider Provider { get; private set; }public Type ElementType { get; private set; }<strong>in</strong>ternal FakeQuery(IQueryProvider provider,Expression expression){Expression = expression;Provider = provider;ElementType = typeof(T);}BDeclares simpleautomaticproperties<strong>in</strong>ternal FakeQuery(): this(new FakeQueryProvider(), null){Expression = Expression.Constant(this);}CUses this query as<strong>in</strong>itial expressionpublic IEnumerator GetEnumerator(){Logger.Log(this, Expression);Licensed to Rhona Hadida


Translations us<strong>in</strong>g IQueryable and IQueryProvider329}}return Enumerable.Empty().GetEnumerator();IEnumerator IEnumerable.GetEnumerator(){Logger.Log(this, Expression);return Enumerable.Empty().GetEnumerator();}public override str<strong>in</strong>g ToStr<strong>in</strong>g(){return "FakeQuery";Overrides for}sake of logg<strong>in</strong>gEThe property members of IQueryable are implemented <strong>in</strong> FakeQuery with automaticproperties B, which are set by the constructors. There are two constructors: a parameterlessone that is used by our ma<strong>in</strong> program to create a pla<strong>in</strong> “source” for the query,and one that is called by FakeQueryProvider with the current query expression.The use of Expression.Constant(this) as the <strong>in</strong>itial source expression C is just away of show<strong>in</strong>g that the query <strong>in</strong>itially represents the orig<strong>in</strong>al object. (Imag<strong>in</strong>e animplementation represent<strong>in</strong>g a table, for example—until you apply any query operators,the query would just return the whole table.) When the constant expression islogged, it uses the overridden ToStr<strong>in</strong>g method, which is why we’ve given a short,constant description E. This makes the f<strong>in</strong>al expression much cleaner than it wouldhave been without the override. When we are asked to iterate over the results of thequery, we always just return an empty sequence D to make life easy. Production implementationswould parse the expression here, or (more likely) call Execute on theirquery provider and just return the result.As you can see, there’s not a lot go<strong>in</strong>g on <strong>in</strong> FakeQuery, and list<strong>in</strong>g 12.7 shows thatFakeQueryProvider is equally simple.DReturns emptyresult sequenceList<strong>in</strong>g 12.7An implementation of IQueryProvider that uses FakeQueryclass FakeQueryProvider : IQueryProvider{public IQueryable CreateQuery(Expression expression){Logger.Log(this, expression);return new FakeQuery(this, expression);}public IQueryable CreateQuery(Expression expression){Logger.Log(this, expression);return new FakeQuery(this, expression);}public T Execute(Expression expression){Logger.Log(this, expression);return default(T);Licensed to Rhona Hadida


330 CHAPTER 12 LINQ beyond collections}}public object Execute(Expression expression){Logger.Log(this, expression);return null;}There’s even less to talk about <strong>in</strong> terms of the implementation of FakeQueryProviderthan there was for FakeQuery. The CreateQuery methods do no real process<strong>in</strong>g but actas factory methods for FakeQuery. The Execute method overloads just return emptyresults after logg<strong>in</strong>g the call. This is where a lot of analysis would normally be done, alongwith the actual call to the web service, database, or whatever the target platform is.Even though we’ve done no real work, when we start to use FakeQuery as thesource <strong>in</strong> a query expression <strong>in</strong>terest<strong>in</strong>g th<strong>in</strong>gs start to happen. I’ve already let sliphow we are able to write query expressions without explicitly writ<strong>in</strong>g methods to handlethe standard query operators: it’s all about extension methods, this time the ones<strong>in</strong> the Queryable class.12.2.3 Glu<strong>in</strong>g expressions together: the Queryable extension methodsJust as the Enumerable type conta<strong>in</strong>s extension methods on IEnumerable to implementthe LINQ standard query operators, the Queryable type conta<strong>in</strong>s extensionmethods on IQueryable. There are two big differences between the implementations<strong>in</strong> Enumerable and those <strong>in</strong> Queryable.First, the Enumerable methods all use delegates as their parameters—the Selectmethod takes a Func, for example. That’s f<strong>in</strong>e for <strong>in</strong>-memorymanipulation, but for LINQ providers that execute the query elsewhere, we need a formatwe can exam<strong>in</strong>e more closely—expression trees. For example, the correspond<strong>in</strong>goverload of Select <strong>in</strong> Queryable takes a parameter of type Expression. The compiler doesn’t m<strong>in</strong>d at all—after query translation, it hasa lambda expression that it needs to pass as a parameter to the method, and lambdaexpressions can be converted to either delegate <strong>in</strong>stances or expression trees.This is the reason that LINQ to SQL is able to work so seamlessly. The four key elements<strong>in</strong>volved are all new features of <strong>C#</strong> 3: lambda expressions, the translation ofquery expressions <strong>in</strong>to “normal” expressions that use lambda expressions, extensionmethods, and expression trees. Without all four, there would be problems. If queryexpressions were always translated <strong>in</strong>to delegates, for <strong>in</strong>stance, they couldn’t be usedwith a provider such as LINQ to SQL, which requires expression trees. Figure 12.3shows the two paths taken by query expressions; they differ only <strong>in</strong> what <strong>in</strong>terfacestheir data source implements.Notice how <strong>in</strong> figure 12.3 the early parts of the compilation process are <strong>in</strong>dependentof the data source. The same query expression is used, and it’s translated <strong>in</strong> exactly thesame way. It’s only when the compiler looks at the translated query to f<strong>in</strong>d the appropriateSelect and Where methods to use that the data source is truly important. At thatLicensed to Rhona Hadida


Translations us<strong>in</strong>g IQueryable and IQueryProvider331from user <strong>in</strong> userswhere user.Name.StartsWith("D")select user.NameQuery expression translationusers.Where(user => user.Name.StartsWith("D").Select(user => user.Name)Overload resolutionIQueryable not implementedIQueryable implementedExtension methods on Enumerableare chosen, which use delegatesas parametersExtension methods on Queryableare chosen, which use expressiontrees as parametersIL to create delegate <strong>in</strong>stances, withcalls to Enumerable.Where andEnumerable.SelectIL to create expression trees, withcalls to Queryable.Where andQueryable.SelectFigure 12.3 A query tak<strong>in</strong>g two paths, depend<strong>in</strong>g on whether the data source implementsIQueryable or only IEnumerablepo<strong>in</strong>t, the lambda expressions can be converted to either delegate <strong>in</strong>stances or expressiontrees, potentially giv<strong>in</strong>g radically different implementations: typically <strong>in</strong>-memoryfor the left path, and SQL execut<strong>in</strong>g aga<strong>in</strong>st a database <strong>in</strong> the right path.The second big difference between Enumerable and Queryable is that theEnumerable extension methods do the actual work associated with the correspond<strong>in</strong>gquery operator. There is code <strong>in</strong> Enumerable.Where to execute the specified filter andonly yield appropriate elements as the result sequence, for example. By contrast, thequery operator “implementations” <strong>in</strong> Queryable do very little: they just create a newquery based on the parameters or call Execute on the query provider, as described atthe end of section 12.2.1. In other words, they are only used to build up queries andrequest that they be executed—they don’t conta<strong>in</strong> the logic beh<strong>in</strong>d the operators. Thismeans they’re suitable for any LINQ provider that uses expression trees.With the Queryable extension methods available and mak<strong>in</strong>g use of our IQueryableand IQueryProvider implementations, it’s f<strong>in</strong>ally time to see what happens when we usea query expression with our custom provider.Licensed to Rhona Hadida


332 CHAPTER 12 LINQ beyond collections12.2.4 The fake query provider <strong>in</strong> actionList<strong>in</strong>g 12.8 shows a simple query expression, which (supposedly) f<strong>in</strong>ds all the str<strong>in</strong>gs<strong>in</strong> our fake source beg<strong>in</strong>n<strong>in</strong>g with “abc” and projects the results <strong>in</strong>to a sequence of thelengths of the match<strong>in</strong>g str<strong>in</strong>gs. We iterate through the results, but don’t do anyth<strong>in</strong>gwith them, as we know already that they’ll be empty. Of course, we have no sourcedata, and we haven’t written any code to do any real filter<strong>in</strong>g—we’re just logg<strong>in</strong>gwhich calls are made by LINQ <strong>in</strong> the course of creat<strong>in</strong>g the query expression and iterat<strong>in</strong>gthrough the results.List<strong>in</strong>g 12.8A simple query expression us<strong>in</strong>g the fake query classesvar query = from x <strong>in</strong> new FakeQuery()where x.StartsWith("abc")select x.Length;foreach (<strong>in</strong>t i <strong>in</strong> query) { }What would you expect the results of runn<strong>in</strong>g list<strong>in</strong>g 12.8 to be? In particular, whatwould you like to be logged last, at the po<strong>in</strong>t where we’d normally expect to do somereal work with the expression tree? Here are the results of list<strong>in</strong>g 12.8, reformattedslightly for clarity:FakeQueryProvider.CreateQueryExpression=FakeQuery.Where(x => x.StartsWith("abc"))FakeQueryProvider.CreateQueryExpression=FakeQuery.Where(x => x.StartsWith("abc")).Select(x => x.Length)FakeQuery.GetEnumeratorExpression=FakeQuery.Where(x => x.StartsWith("abc")).Select(x => x.Length)The two important th<strong>in</strong>gs to note are that GetEnumerator is only called at the end, noton any <strong>in</strong>termediate queries, and that by the time GetEnumerator is called we have allthe <strong>in</strong>formation present <strong>in</strong> the orig<strong>in</strong>al query expression. We haven’t manually had tokeep track of earlier parts of the expression <strong>in</strong> each step—a s<strong>in</strong>gle expression tree capturesall the <strong>in</strong>formation “so far” at any po<strong>in</strong>t <strong>in</strong> time.Don’t be fooled by the concise output, by the way—the actual expression tree isquite deep and complicated, particularly due to the where clause <strong>in</strong>clud<strong>in</strong>g an extramethod call. This expression tree is what LINQ to SQL would be exam<strong>in</strong><strong>in</strong>g to workout what query to execute. LINQ providers could build up their own queries (<strong>in</strong> whateverform they may need) as calls to CreateQuery are made, but usually look<strong>in</strong>g at thef<strong>in</strong>al tree when GetEnumerator is called is simpler, as all the necessary <strong>in</strong>formation isavailable <strong>in</strong> one place.The f<strong>in</strong>al call logged by list<strong>in</strong>g 12.8 was to FakeQuery.GetEnumerator, and you maybe wonder<strong>in</strong>g why we also need an Execute method on IQueryProvider. Well, not allquery expressions generate sequences—if you use an aggregation operator such as Sum,Licensed to Rhona Hadida


Translations us<strong>in</strong>g IQueryable and IQueryProvider333Count, or Average, we’re no longer really creat<strong>in</strong>g a “source”—we’re evaluat<strong>in</strong>g a resultimmediately. That’s when Execute is called, as shown by list<strong>in</strong>g 12.9 and its output.List<strong>in</strong>g 12.9 IQueryProvider.Executevar query = from x <strong>in</strong> new FakeQuery()where x.StartsWith("abc")select x.Length;double mean = query.Average();// OutputFakeQueryProvider.CreateQueryExpression=FakeQuery.Where(x => x.StartsWith("abc"))FakeQueryProvider.CreateQueryExpression=FakeQuery.Where(x => x.StartsWith("abc")).Select(x => x.Length)FakeQueryProvider.ExecuteExpression=FakeQuery.Where(x => x.StartsWith("abc")).Select(x => x.Length).Average()The FakeQueryProvider can be quite useful when it comes to understand<strong>in</strong>g what the<strong>C#</strong> compiler is do<strong>in</strong>g beh<strong>in</strong>d the scenes with query expressions. It will show the transparentidentifiers <strong>in</strong>troduced with<strong>in</strong> a query expression, along with the translated callsto SelectMany, GroupJo<strong>in</strong>, and the like.12.2.5 Wrapp<strong>in</strong>g up IQueryableWe haven’t written any of the significant code that a real query provider would need<strong>in</strong> order to get useful work done, but hopefully our fake provider has given you<strong>in</strong>sight <strong>in</strong>to how LINQ providers are given the <strong>in</strong>formation from query expressions.It’s all built up by the Queryable extension methods, given an appropriate implementationof IQueryable and IQueryProvider.We’ve gone <strong>in</strong>to a bit more detail <strong>in</strong> this section than we will for the rest of thechapter, as it’s <strong>in</strong>volved the foundations that underp<strong>in</strong> the LINQ to SQL code we sawearlier. You’re unlikely to want to write your own query provider—it takes a lot of workto produce a really good one—but this section has been important <strong>in</strong> terms of conceptualunderstand<strong>in</strong>g. The steps <strong>in</strong>volved <strong>in</strong> tak<strong>in</strong>g a <strong>C#</strong> query expression and (atexecution time) runn<strong>in</strong>g some SQL on a database are quite profound and lie at theheart of the big features of <strong>C#</strong> 3. Understand<strong>in</strong>g why <strong>C#</strong> has ga<strong>in</strong>ed these features willhelp keep you more <strong>in</strong> tune with the language.In fact, LINQ to SQL is the only provider built <strong>in</strong>to the framework that actually usesIQueryable—the other “providers” are just APIs that play nicely with LINQ to Objects.I don’t wish to dim<strong>in</strong>ish their importance—they’re still useful. However, it does meanthat you can relax a bit now—the hardest part of the chapter is beh<strong>in</strong>d you. We’re stillstay<strong>in</strong>g with database-related access for our next section, though, which looks at LINQto DataSet.Licensed to Rhona Hadida


334 CHAPTER 12 LINQ beyond collections12.3 LINQ to DataSetSee<strong>in</strong>g all the neat stuff that LINQ to SQL can achieve is all very well, but most developersare likely to be improv<strong>in</strong>g an exist<strong>in</strong>g application rather than creat<strong>in</strong>g a new onefrom the ground up. Rather than ripp<strong>in</strong>g out the entire persistence layer and replac<strong>in</strong>git with LINQ to SQL, it would be nice to be able to ga<strong>in</strong> some of the advantages ofLINQ while us<strong>in</strong>g exist<strong>in</strong>g technology. Many ADO.NET applications use datasets,whether typed or untyped 4 —and 4 LINQ to DataSet gives you access to a lot of the benefitsof LINQ with little change to your current code.The query expressions used with<strong>in</strong> LINQ to DataSet are just LINQ to Objects queries—there’sno translation <strong>in</strong>to a call to DataTable.Select, for example. Instead,data rows are filtered and ordered with normal delegate <strong>in</strong>stances that operate onthose rows.Unsurpris<strong>in</strong>gly, you’ll get a better experience us<strong>in</strong>g typed datasets, but a set ofextension methods on DataTable and DataRow make it at least possible to work withuntyped datasets too. In this section we’ll look at both k<strong>in</strong>ds of datasets, start<strong>in</strong>g withuntyped ones.12.3.1 Work<strong>in</strong>g with untyped datasetsUntyped datasets have two problems as far as LINQ is concerned. First, we don’t haveaccess to the fields with<strong>in</strong> the tables as typed properties; second, the tables themselvesaren’t enumerable. To some extent both are merely a matter of convenience—wecould use direct casts <strong>in</strong> all the queries, handle DBNull explicitly and so forth, as well asenumerate the rows <strong>in</strong> a table us<strong>in</strong>g dataTable.Rows.Cast. Theseworkarounds are quite ugly, which is why the DataTableExtensions and DataRow-Extensions classes exist.Code us<strong>in</strong>g untyped datasets is never go<strong>in</strong>g to be pretty, but us<strong>in</strong>g LINQ is far nicerthan filter<strong>in</strong>g and sort<strong>in</strong>g us<strong>in</strong>g DataTable.Select. No more escap<strong>in</strong>g, worry<strong>in</strong>gabout date and time formatt<strong>in</strong>g, and similar nast<strong>in</strong>ess.List<strong>in</strong>g 12.10 gives a simple example. It just fills a s<strong>in</strong>gle defect table and pr<strong>in</strong>ts thesummaries of all the defects that don’t have a status of “closed.”List<strong>in</strong>g 12.10Display<strong>in</strong>g the summaries of open defects from an untyped DataTableDataTable dataTable = new DataTable();us<strong>in</strong>g (var connection = new SqlConnection(Sett<strong>in</strong>gs.Default.SkeetySoftDefectsConnectionStr<strong>in</strong>g)){str<strong>in</strong>g sql = "SELECT Summary, Status FROM Defect";new SqlDataAdapter(sql, connection).Fill(dataTable);}var query = from defect <strong>in</strong> dataTable.AsEnumerable()BCFills tablefromdatabaseMakes tableenumerable4An untyped dataset is one that has no static <strong>in</strong>formation about the contents of its tables. Typed datasets, usuallygenerated <strong>in</strong> the Visual Studio designer, know the tables which can be present <strong>in</strong> the dataset, and the columnswith<strong>in</strong> the rows <strong>in</strong> those tables.Licensed to Rhona Hadida


LINQ to DataSet335where defect.Field("Status") != Status.Closedselect defect.Field("Summary");foreach (str<strong>in</strong>g summary <strong>in</strong> query){Console.WriteL<strong>in</strong>e (summary);}SelectsSummary fieldCom<strong>in</strong>g so soon after the nice, clean world of LINQ to SQL, list<strong>in</strong>g 12.10 makes mefeel somewhat dirty. There’s hard-coded SQL, column names, and casts all over theplace. However, we’ll see that th<strong>in</strong>gs are better when we have a typed dataset—and thiscode does get the job done. If you’re us<strong>in</strong>g both LINQ to SQL and LINQ to DataSet, youcan fill a DataTable us<strong>in</strong>g the DataTableExtensions.CopyToDataTable extensionmethod, but I wanted to keep to just one new technology at a time for this example.The first part B is “old-fashioned” ADO.NET code to fill the data table. I haven’tused an actual dataset for this example because we’re only <strong>in</strong>terested <strong>in</strong> a s<strong>in</strong>gletable—putt<strong>in</strong>g it <strong>in</strong> a dataset would have made th<strong>in</strong>gs slightly more complicated forno benefit. It’s only when we reach the query expression (C D and E) that LINQstarts com<strong>in</strong>g <strong>in</strong>.The source of a query expression has to be enumerable <strong>in</strong> some form—and theDataTable type doesn’t even implement IEnumerable, let alone IEnumerable. TheDataTableExtensions class provides the AsEnumerable extension method (C), whichmerely returns an IEnumerable that iterates over the rows <strong>in</strong> the table.Access<strong>in</strong>g fields with<strong>in</strong> a row is made slightly easier <strong>in</strong> LINQ to DataSet us<strong>in</strong>g theField extension method on DataRow. This not only removes the need to castresults, but it also deals with null values for you—it converts DBNull to a null referencefor you, or the null value of a nullable type.I won’t give any further examples of untyped datasets here, although there are acouple more queries <strong>in</strong> the book’s sample code. Hopefully you’ll f<strong>in</strong>d yourself <strong>in</strong> thesituation where you can use a typed dataset <strong>in</strong>stead.12.3.2 Work<strong>in</strong>g with typed datasetsAlthough typed datasets aren’t as rich as us<strong>in</strong>g LINQ to SQL directly, they providemuch more static type <strong>in</strong>formation, which lets your code stay cleaner. There’s a bit ofwork to start with: we have to create a typed dataset for our defect-track<strong>in</strong>g systembefore we can beg<strong>in</strong> us<strong>in</strong>g it.CREATING THE TYPED DATASET WITH VISUAL STUDIOThe process for generat<strong>in</strong>g a typed dataset <strong>in</strong> Visual Studio 2008 is almost exactly thesame as it is to generate LINQ to SQL entities. Aga<strong>in</strong>, you add a new item to the project(this time select<strong>in</strong>g DataSet <strong>in</strong> the list of options), and aga<strong>in</strong> you can drag and droptables from the Server Explorer w<strong>in</strong>dow onto the designer surface.There aren’t quite as many options available <strong>in</strong> the property panes for typeddatasets, but we can still rename the DefectUser table, the DefectID field, along withthe associations. Likewise, we can still tell the Status, Summary, and UserType propertiesto use the enumeration types from the model. Figure 12.4 shows the designerafter a bit of edit<strong>in</strong>g and rearrang<strong>in</strong>g.EAccessesStatusfieldDLicensed to Rhona Hadida


336 CHAPTER 12 LINQ beyond collectionsFigure 12.4The defect-track<strong>in</strong>g database <strong>in</strong> the dataset designerEach table <strong>in</strong> the dataset has its own types for the table, rows, notification event handlers,notification event arguments, and adapters. The adapters are placed <strong>in</strong>to a separatenamespace, based on the name of the dataset.Once you’ve created the dataset, us<strong>in</strong>g it is easy.POPULATING AND QUERYING TYPED DATASETSAs <strong>in</strong> previous versions of ADO.NET, typed datasets are populated us<strong>in</strong>g adapters. Theadapters generated by the designer already use the connection str<strong>in</strong>g of the server orig<strong>in</strong>allyused to create the dataset: this can be changed <strong>in</strong> the application sett<strong>in</strong>gs. Thismeans that <strong>in</strong> many cases, you can fill the tables with barely any code. List<strong>in</strong>g 12.11achieves the same results as the query on the untyped dataset, but <strong>in</strong> a considerablycleaner fashion.List<strong>in</strong>g 12.11Display<strong>in</strong>g the summaries of open defects from a typed datasetDefectDataSet dataSet = new DefectDataSet();new DefectTableAdapter().Fill(dataSet.Defect);var query = from defect <strong>in</strong> dataSet.Defectwhere defect.Status != Status.Closedselect defect.Summary;foreach (str<strong>in</strong>g summary <strong>in</strong> query){Console.WriteL<strong>in</strong>e (summary);}Licensed to Rhona Hadida


LINQ to DataSet337Creat<strong>in</strong>g and populat<strong>in</strong>g the dataset is now a breeze—and even though we use onlyone table, it’s as easy to do that us<strong>in</strong>g a complete dataset as it would have been if we’donly created a s<strong>in</strong>gle data table. Of course, we’re pull<strong>in</strong>g more data down this timebecause we haven’t specified a SQL projection, but you can access the “raw” adapter ofa typed data adapter and modify the query yourself if you need to.The query expression <strong>in</strong> list<strong>in</strong>g 12.11 looks like it could have come straight fromLINQ to SQL or LINQ to Objects, other than us<strong>in</strong>g dataSet <strong>in</strong>stead of context orSampleData.AllDefects. The DefectDataSet.Defect property returns a DefectData-Table, which implements IEnumerable already (via its base class,TypedTableBase) so we don’t need any extension methods, and eachrow is strongly typed.For me, one of the most compell<strong>in</strong>g aspects of LINQ is this consistency between differentdata access mechanisms. Even if you only query a s<strong>in</strong>gle data source, LINQ is useful—butbe<strong>in</strong>g able to query multiple data sources with the same syntax is phenomenal.You still need to be aware of the consequences of query<strong>in</strong>g aga<strong>in</strong>st the different technologies,but the fundamental grammar of query expressions rema<strong>in</strong>s constant.Us<strong>in</strong>g associations between different tables is also simple with typed datasets. Inlist<strong>in</strong>g 12.12 we group the open defects accord<strong>in</strong>g to their status, and display thedefect ID, the name of the project, and the name of the user assigned to work on it.List<strong>in</strong>g 12.12Us<strong>in</strong>g associations <strong>in</strong> a typed datasetDefectDataSet dataSet = new DefectDataSet();var query = from defect <strong>in</strong> dataSet.Defectwhere defect.Status != Status.Closedgroup defect by defect.Status;BDef<strong>in</strong>esquerynew DefectTableAdapter().Fill(dataSet.Defect);new UserTableAdapter().Fill(dataSet.User);new ProjectTableAdapter().Fill(dataSet.Project);CPopulatesdatasetforeach (var group <strong>in</strong> query){Console.WriteL<strong>in</strong>e (group.Key);foreach (var defect <strong>in</strong> group){Console.WriteL<strong>in</strong>e (" {0}: {1}/{2}",defect.ID,defect.ProjectRow.Name,defect.UserRowByAssignedTo.Name);}}DisplaysresultsUnlike <strong>in</strong> LINQ to SQL (which can lazily load entities), we have to load all the datawe’re <strong>in</strong>terested <strong>in</strong> before we execute the query. Just for fun, this time I’ve populatedthe dataset C after we’ve def<strong>in</strong>ed the query B but before we’ve executed it D. Apartfrom anyth<strong>in</strong>g else, this proves that LINQ is still work<strong>in</strong>g <strong>in</strong> a deferred executionmode—it doesn’t try to look at the list of defects until we iterate through them.DLicensed to Rhona Hadida


338 CHAPTER 12 LINQ beyond collectionsWhen we display the results D, the associations (ProjectRow, UserRowBy-AssignedTo) aren’t quite as neat as they are <strong>in</strong> LINQ to SQL, because the datasetdesigner doesn’t give us as much control over those names as the entity model designerdoes. It’s still a lot nicer than the equivalent code for untyped datasets, however.As you can see, Microsoft has made LINQ work nicely with databases and associatedtechnologies. That’s not the only form of data access we can use, though—<strong>in</strong> a worldwhere XML is so frequently used for stor<strong>in</strong>g and exchang<strong>in</strong>g data, LINQ makes process<strong>in</strong>gdocuments that much simpler.12.4 LINQ to XMLWhereas LINQ to DataSet is really just some extensions to the exist<strong>in</strong>g dataset capabilities(along with a new base class for typed data tables), LINQ to XML is a completelynew XML API. If your immediate reaction to hear<strong>in</strong>g about yet another XML API is to askwhether or not we need one, you’re not alone. However, LINQ to XML simplifies documentmanipulation considerably, and is designed to play well with LINQ to Objects.It doesn’t have much extra query<strong>in</strong>g functionality <strong>in</strong> itself, and it doesn’t perform anyquery translations <strong>in</strong> the way that LINQ to SQL does, but the <strong>in</strong>tegration with LINQ viaiterators makes it a pleasure to work with.It’s not just about query<strong>in</strong>g, though—one of the major benefits of LINQ to XMLover the exist<strong>in</strong>g APIs is its “developer friendl<strong>in</strong>ess” when it comes to creat<strong>in</strong>g andtransform<strong>in</strong>g and XML documents too. We’ll see that <strong>in</strong> a moment, when we create anXML document conta<strong>in</strong><strong>in</strong>g all our sample data, ready for query<strong>in</strong>g. Of course, thiswon’t be a comprehensive guide to LINQ to XML, but the deeper topics that we won’tcover here (such as XML namespace handl<strong>in</strong>g) have been well thought out to makedevelopment easier.Let’s start by look<strong>in</strong>g at two of the most important classes of LINQ to XML.12.4.1 XElement and XAttributeThe bulk of LINQ to XML consists of a set of classes <strong>in</strong> the System.Xml.L<strong>in</strong>qnamespace, most of which are prefixed with X: XName, XDocument, XComment, and soforth. Even though lots of classes are available, for our purposes we only need to knowabout XElement and XAttribute, which obviously represent XML elements andattributes, respectively.We’ll also be us<strong>in</strong>g XName <strong>in</strong>directly, which is used for names of elements andattributes, and can conta<strong>in</strong> namespace <strong>in</strong>formation. The reason we only need to use it<strong>in</strong>directly is that there’s an implicit conversion from str<strong>in</strong>g to XName, so every time wecreate an element or attribute by pass<strong>in</strong>g the name as a str<strong>in</strong>g, it’s convert<strong>in</strong>g it <strong>in</strong>to anXName. I mention this only so that you won’t get confused about what’s be<strong>in</strong>g called ifyou look at the available methods and constructors <strong>in</strong> MSDN.One of the great th<strong>in</strong>gs about LINQ to XML is how easy it is to construct elementsand attributes. In DOM, to create a simple document with a nested element you’dhave to go through lots of hoops:Licensed to Rhona Hadida


LINQ to XML339■ Create a document.■ Create a root element from the document.■ Add the root element to the document.■ Create a child element from the document.■ Add the child element to the root element.Add<strong>in</strong>g attributes was relatively pa<strong>in</strong>ful, too. None of this would be naturally nested <strong>in</strong>the code to give an at-a-glance <strong>in</strong>dication of the f<strong>in</strong>al markup. LINQ to XML makes lifeeasier <strong>in</strong> the follow<strong>in</strong>g ways:■ You don’t need an XDocument unless you want one—you can create elementsand attributes separately.■ You can specify the contents of an element (or document) <strong>in</strong> the constructor.These don’t immediately sound important, but see<strong>in</strong>g the API <strong>in</strong> action makes th<strong>in</strong>gsclearer. List<strong>in</strong>g 12.13 creates and pr<strong>in</strong>ts out an XML element with attributes and anested element conta<strong>in</strong><strong>in</strong>g some text.List<strong>in</strong>g 12.13Creat<strong>in</strong>g a simple piece of XMLvar root = new XElement("root",new XAttribute ("attr", "value"),new XElement("child","text"));Console.WriteL<strong>in</strong>e (root);Choos<strong>in</strong>g how you format LINQ to XML creation code is a personal decision, <strong>in</strong> termsof where to use whitespace and how much to use, where to place parentheses, and soforth. However, the important po<strong>in</strong>t to note from list<strong>in</strong>g 12.13 is how it’s quite clear thatthe element “root” conta<strong>in</strong>s two nodes: an attribute and a child element. The child elementthen has a text node (“text”). In other words, the structure of the result is apparentfrom the structure of the code. The output from list<strong>in</strong>g 12.13 is the XML we’d hope for:textThe equivalent code <strong>in</strong> DOM would have been much nastier. If we’d wanted to <strong>in</strong>cludean XML declaration (, for <strong>in</strong>stance), it wouldhave been easy to do so with XDocument—but I’m try<strong>in</strong>g to keep th<strong>in</strong>gs as simple aspossible for this brief tour. Likewise, you can modify XElements after creat<strong>in</strong>g them <strong>in</strong>a DOM-like manner, but we don’t need to go down that particular path here.XElementconstructorrecurses <strong>in</strong>toargumentsThe constructor for XElement accepts any number of objects us<strong>in</strong>g aparams parameter, but importantly it will also then recurse <strong>in</strong>to any enumerablearguments that are passed to it. This is absolutely crucial whenus<strong>in</strong>g LINQ queries with<strong>in</strong> XML creation expressions. Speak<strong>in</strong>g of which,let’s start build<strong>in</strong>g some XML with our familiar defect data.Licensed to Rhona Hadida


340 CHAPTER 12 LINQ beyond collections12.4.2 Convert<strong>in</strong>g sample defect data <strong>in</strong>to XMLWe’ve currently got our sample data available <strong>in</strong> three different forms—<strong>in</strong> memory asobjects, <strong>in</strong> the database, and <strong>in</strong> a typed dataset. Of course, the typed dataset can writeXML directly, but let’s convert our orig<strong>in</strong>al <strong>in</strong>-memory version for simplicity. To startwith, we’ll just generate a list of the users. List<strong>in</strong>g 12.14 creates a list of user elementswith<strong>in</strong> a users element, and then writes it out; the output is shown beneath the list<strong>in</strong>g.List<strong>in</strong>g 12.14Creat<strong>in</strong>g an element from the sample usersvar users = new XElement("users",from user <strong>in</strong> SampleData.AllUsersselect new XElement("user",new XAttribute("name", user.Name),new XAttribute("type", user.UserType)));Console.WriteL<strong>in</strong>e (users);// OutputI hope you’ll agree that list<strong>in</strong>g 12.14 is simple, once you’ve got your head around theidea that the contents of the top-level element depend on the result of the embeddedquery expression. It’s possible to make it even simpler, however. I’ve written a smallextension method on object that generates an IEnumerable based onthe properties of the object it’s called on, which are discovered with reflection. This isideal for anonymous types—list<strong>in</strong>g 12.15 creates the same output, but without theexplicit XAttribute constructor calls. With only two attributes, there isn’t much difference<strong>in</strong> the code, but for more complicated elements it’s a real boon.List<strong>in</strong>g 12.15Us<strong>in</strong>g object properties to generate XAttributes with reflectionvar users = new XElement("users",from user <strong>in</strong> SampleData.AllUsersselect new XElement("user",new { name=user.Name, type=user.UserType }.AsXAttributes()));Console.WriteL<strong>in</strong>e (users);For the rest of the chapter I’ll use the “vanilla” LINQ to XML calls, but it’s worth be<strong>in</strong>gaware of the possibilities available with a bit of reusable code. (The source for theextension method is available as part of the code for the book.)Licensed to Rhona Hadida


LINQ to XML341You can nest query expressions and do all k<strong>in</strong>ds of other clever th<strong>in</strong>gs with them.List<strong>in</strong>g 12.16 generates an element conta<strong>in</strong><strong>in</strong>g the project <strong>in</strong>formation: the notificationsubscriptions for a project are embedded with<strong>in</strong> the project element by us<strong>in</strong>g anested query. The results are shown beneath the list<strong>in</strong>g.List<strong>in</strong>g 12.16Generat<strong>in</strong>g projects with nested subscription elementsvar projects = new XElement("projects",from project <strong>in</strong> SampleData.AllProjectsselect new XElement("project",new XAttribute("name", project.Name),new XAttribute("id", project.ProjectID),from subscription <strong>in</strong> SampleData.AllSubscriptionswhere subscription.Project == projectselect new XElement("subscription",new XAttribute("email", subscription.EmailAddress))));Console.WriteL<strong>in</strong>e (projects);// OutputThe two queries are highlighted <strong>in</strong> bold. There are alternative ways of generat<strong>in</strong>g thesame output, of course—you could use a s<strong>in</strong>gle query that groups the subscriptions byproject, for <strong>in</strong>stance. The code <strong>in</strong> list<strong>in</strong>g 12.16 was just the first way I thought of tackl<strong>in</strong>gthe problem—and <strong>in</strong> cases where the performance isn’t terribly important itdoesn’t matter that we’ll be runn<strong>in</strong>g the nested query multiple times. In productioncode you’d want to consider possible performance issues more carefully, of course!I won’t show you all the code to generate all the elements—even with LINQ toXML, it’s quite tedious just because there are so many attributes to set. I’ve placed it all<strong>in</strong> a s<strong>in</strong>gle XmlSampleData.GetElement() method that returns a root XElement. We’lluse this method as the start<strong>in</strong>g po<strong>in</strong>t for the examples <strong>in</strong> our f<strong>in</strong>al avenue of<strong>in</strong>vestigation: query<strong>in</strong>g.12.4.3 Queries <strong>in</strong> LINQ to XMLYou may well be expect<strong>in</strong>g me to reveal that XElement implements IEnumerable andthat LINQ queries come for free. Well, it’s not quite that simple, because there are soLicensed to Rhona Hadida


342 CHAPTER 12 LINQ beyond collectionsmany different th<strong>in</strong>gs that an XElement could iterate through. XElement conta<strong>in</strong>s anumber of axis methods that are used as query sources. If you’re familiar with XPath,the idea of an axis will no doubt be familiar to you. Here are the axis methods useddirectly for query<strong>in</strong>g, each of which returns an appropriate IEnumerable:■ Ancestors ■ DescendantNodes■ AncestorsAndSelf ■ DescendantNodesAndSelf■ Annotations ■ Elements■ Attributes ■ ElementsAfterSelf■ Descendants ■ ElementsBeforeSelf■ DescendantsAndSelfAll of these are fairly self-explanatory (and the MSDN documentation provides moredetails). There are useful overloads to retrieve only nodes with an appropriate name:call<strong>in</strong>g Descendants("user") on an XElement will return all user elements underneaththe element you call it on, for <strong>in</strong>stance. A number of extension methods alsomake these axes available to whole sequences of nodes: the result is a concatenatedsequence as if the method has been called on each node <strong>in</strong> turn.As well as these calls return<strong>in</strong>g sequences, there are some methods that return as<strong>in</strong>gle result—Attribute and Element are the most important, return<strong>in</strong>g the namedattribute and the first descendant element with the specified name, respectively.One aspect of XAttribute that is particularly relevant to query<strong>in</strong>g is the set of theexplicit conversions from an XAttribute to any number of other types, such as <strong>in</strong>t,str<strong>in</strong>g, and DateTime. These are important for both filter<strong>in</strong>g and project<strong>in</strong>g results.Enough talk! Let’s see some code. We’ll start off simply, just display<strong>in</strong>g the userswith<strong>in</strong> our XML structure, as shown <strong>in</strong> list<strong>in</strong>g 12.17.List<strong>in</strong>g 12.17Display<strong>in</strong>g the users with<strong>in</strong> an XML structureXElement root = XmlSampleData.GetElement();var query = from user <strong>in</strong> root.Element("users").Elements()select new { Name=(str<strong>in</strong>g)user.Attribute("name"),UserType=(str<strong>in</strong>g)user.Attribute("type") };foreach (var user <strong>in</strong> query){Console.WriteL<strong>in</strong>e ("{0}: {1}", user.Name, user.UserType);}After creat<strong>in</strong>g the data at the start, we navigate down to the users element, and ask itfor its direct child elements. This two-step fetch could be shortened to justroot.Descendants("user"), but it’s good to see the more rigid navigation so youcan use it where necessary. It’s also more robust <strong>in</strong> the face of changes to the documentstructure, such as another (unrelated) user element be<strong>in</strong>g added elsewhere <strong>in</strong>the document.The rest of the query expression is merely a projection of an XElement <strong>in</strong>to ananonymous type. I’ll admit that we’re cheat<strong>in</strong>g slightly with the user type: we’ve kept itas a str<strong>in</strong>g <strong>in</strong>stead of call<strong>in</strong>g Enum.Parse to convert it <strong>in</strong>to a proper UserType value.The latter approach works perfectly well—but it’s quite longw<strong>in</strong>ded when you onlyLicensed to Rhona Hadida


LINQ to XML343need the str<strong>in</strong>g form, and the code becomes hard to format sensibly with<strong>in</strong> the strictlimits of the pr<strong>in</strong>ted page.List<strong>in</strong>g 12.17 isn’t do<strong>in</strong>g anyth<strong>in</strong>g particularly impressive, of course. In particular,it would be easy to achieve a similar effect with a s<strong>in</strong>gle XPath expression. Jo<strong>in</strong>s, however,are harder to express <strong>in</strong> XPath. They work, but they’re somewhat messy. WithLINQ to XML, we can use our familiar query expression syntax. List<strong>in</strong>g 12.18 demonstratesthis, show<strong>in</strong>g each open defect’s ID with its assignee and project.List<strong>in</strong>g 12.18Two jo<strong>in</strong>s and a filter with<strong>in</strong> a LINQ to XML queryXElement root = XmlSampleData.GetElement();var query = from defect <strong>in</strong> root.Descendants("defect")jo<strong>in</strong> user <strong>in</strong> root.Descendants("user")on (<strong>in</strong>t?)defect.Attribute("assigned-to") equals(<strong>in</strong>t)user.Attribute("id")jo<strong>in</strong> project <strong>in</strong> root.Descendants("project")on (<strong>in</strong>t)defect.Attribute("project") equals(<strong>in</strong>t)project.Attribute("id")where (str<strong>in</strong>g)defect.Attribute("status") != "Closed"select new { ID=(<strong>in</strong>t)defect.Attribute("id"),Project=(str<strong>in</strong>g)project.Attribute("name"),Assignee=(str<strong>in</strong>g)user.Attribute("name") };foreach (var defect <strong>in</strong> query){Console.WriteL<strong>in</strong>e ("{0}: {1}/{2}",defect.ID,defect.Project,defect.Assignee);}I’m not go<strong>in</strong>g to pretend that list<strong>in</strong>g 12.18 is particularly pleasant. It has lots of str<strong>in</strong>gliterals (which could easily be turned <strong>in</strong>to constants) and it’s generally pretty wordy. Onthe other hand, it’s do<strong>in</strong>g quite a lot of work, <strong>in</strong>clud<strong>in</strong>g cop<strong>in</strong>g with the possibility of adefect not be<strong>in</strong>g assigned to a user (the <strong>in</strong>t? conversion <strong>in</strong> the jo<strong>in</strong> of defect toassignee). Consider how horrible the correspond<strong>in</strong>g XPath expression would have tobe, or how much manual code you’d have to write to perform the same query <strong>in</strong> directcode. The other standard query operators are available, too: once you’ve got a querysource, LINQ to XML itself takes a back seat and lets LINQ to Objects do most of thework. We’ll stop there, however—you may have seen enough query expressions to makeyou dizzy by now, and if you want to experiment further it’s easy enough to do so.12.4.4 LINQ to XML summaryLike the other topics <strong>in</strong> this chapter, we’ve barely scratched the surface of the LINQ toXML API. I haven’t touched the <strong>in</strong>tegration with the previous technologies such as DOMand XPath, nor have I given details of the other node types—not even XDocument!Even if I were to go through all of the features, that wouldn’t come close toexpla<strong>in</strong><strong>in</strong>g all the possible uses of it. Practically everywhere you currently deal withXML, I expect LINQ to XML will make your life easier. To reiterate a cliché, the onlylimit is your imag<strong>in</strong>ation.Licensed to Rhona Hadida


344 CHAPTER 12 LINQ beyond collectionsJust as an example, remember how we created our XML from LINQ queries? Well,there’s noth<strong>in</strong>g to stop the sources <strong>in</strong>volved <strong>in</strong> those queries be<strong>in</strong>g LINQ to XML <strong>in</strong>the first place: lo and behold, a new (and powerful) way of transform<strong>in</strong>g one form ofXML to another is born.Even though we’ve been so brief, I hope that I’ve opened the door for you—givenyou an <strong>in</strong>kl<strong>in</strong>g of the k<strong>in</strong>d of XML process<strong>in</strong>g that can be achieved relatively simply us<strong>in</strong>gLINQ to XML. You’ll need to learn a lot more before you master the API, but my aim wasto whet your appetite and show you how LINQ query expressions have a consistencybetween providers that easily surpasses previous query techniques and technologies.We’ve now seen all of the LINQ providers that are built <strong>in</strong>to the .NET 3.5 Framework.That’s not the same th<strong>in</strong>g as say<strong>in</strong>g that you’ve seen all the LINQ providers you’llever use, however.12.5 LINQ beyond .NET 3.5Even before .NET 3.5 was f<strong>in</strong>ally released, developers outside Microsoft were hard atwork writ<strong>in</strong>g their own providers, and Microsoft isn’t rest<strong>in</strong>g on its laurels either. We’llround this chapter off with a quick look at what else is available now, and some ofwhat’s still to come. In this section we’re mov<strong>in</strong>g even faster than before, cover<strong>in</strong>g providers<strong>in</strong> even less depth. We don’t need to know much about them, but their variety isimportant to demonstrate LINQ’s flexibility.12.5.1 Third-party LINQFrom the start, LINQ was designed to be general purpose. It would be hard to deny itsSQL-like feel, but at the same time LINQ to Objects proves that you don’t need to bework<strong>in</strong>g with a database <strong>in</strong> order to benefit from it.A number of third-party providers have started popp<strong>in</strong>g up, and although at thetime of this writ<strong>in</strong>g most are “proof of concept” more than production code—ways ofexplor<strong>in</strong>g LINQ as much as anyth<strong>in</strong>g else—they give a good <strong>in</strong>dication of the widerange of potential uses for LINQ. We’ll only look at three examples, but providers forother data sources (SharePo<strong>in</strong>t and Flickr, for example) are emerg<strong>in</strong>g, and the list willonly get longer. Let’s start off with a slightly closer look at the example we first saw <strong>in</strong>chapter 1.LINQ TO AMAZONOne of the flagship e-commerce sites, Amazon has always tried to drive technologyforward, and it has a number of web services available for applications to talk to. Somecost money—but fortunately search<strong>in</strong>g for a list of books is free. Simply visit http://aws.amazon.com, sign up to the scheme, and you’ll receive an access ID by email.You’ll need this if you want to run the example for yourself.As part of Mann<strong>in</strong>g’s LINQ <strong>in</strong> Action book, Fabrice Marguerie implemented a LINQprovider to make requests to the Amazon web service. 5 List<strong>in</strong>g 12.19 shows an exampleof us<strong>in</strong>g the provider to query Amazon’s list of books with “LINQ” <strong>in</strong> the title.5See http://l<strong>in</strong>q<strong>in</strong>action.net for more <strong>in</strong>formation and updates.Licensed to Rhona Hadida


LINQ beyond .NET 3.5345List<strong>in</strong>g 12.19Query<strong>in</strong>g Amazon’s web service with LINQAmazonBookSearch source = new AmazonBookSearch();var webQuery = from book <strong>in</strong> sourcewhere book.Title.Conta<strong>in</strong>s("LINQ")select book;CCreates providerB with your access IDCreates queryfor web servicevar query = from book <strong>in</strong> webQuery.AsEnumerable()orderby book.Year, book.Titleselect book;DPerforms moreoperations <strong>in</strong> memoryforeach (AmazonBook book <strong>in</strong> query){Console.WriteL<strong>in</strong>e ("{0}: {1}", book.Year, book.Title);}I’ve taken Fabrice’s provider and tweaked it slightly so that we can pass our own AmazonAccess ID <strong>in</strong>to the provider’s constructor B. The source is part of the Visual Studio2008 solution conta<strong>in</strong><strong>in</strong>g the examples for this chapter.You may be slightly surprised to see two query expressions <strong>in</strong> list<strong>in</strong>g 12.19. As aproof of concept, the LINQ to Amazon provider only allows a limited number ofoperations, not <strong>in</strong>clud<strong>in</strong>g order<strong>in</strong>g. We use the web query C as the source for an<strong>in</strong>-memory LINQ to Objects query expression D. The web service call still only takesplace when we start execut<strong>in</strong>g the query E, due to the deferred execution approachtaken by LINQ.The output at the time of this writ<strong>in</strong>g is <strong>in</strong>cluded here. By the time you read this, Iexpect the list may be considerably longer.1998: A l<strong>in</strong>q between nonwovens and wovens. (...)2007: Introduc<strong>in</strong>g Microsoft LINQ2007: LINQ for VB 20052007: LINQ for Visual <strong>C#</strong> 20052007: Pro LINQ: Language Integrated Query <strong>in</strong> <strong>C#</strong> 20082008: Beg<strong>in</strong>n<strong>in</strong>g ASP.NET 3.5 Data Access with LINQ , <strong>C#</strong> (...)2008: Beg<strong>in</strong>n<strong>in</strong>g ASP.NET 3.5 Data Access with LINQ, VB (...)2008: LINQ <strong>in</strong> Action2008: Professional LINQE Executesquery,displaysresultsEven though LINQ to Amazon is primitive, it demonstrates an important po<strong>in</strong>t: LINQis capable of more than just database and <strong>in</strong>-memory queries. Our next providerproves that even when it’s talk<strong>in</strong>g to databases, there’s more to LINQ than just LINQto SQL.LINQ TO NHIBERNATENHibernate is an open source ORM framework for .NET, based on the Hibernateproject for Java. It supports textual queries <strong>in</strong> its own query language (HQL) and alsoa more programmatic way of build<strong>in</strong>g up queries—the Criteria API.Prolific blogger and NHibernate contributor Ayende 6 has <strong>in</strong>itiated a LINQ to NHibernateprovider that converts LINQ queries not <strong>in</strong>to SQL but <strong>in</strong>to NHibernate Criteriaqueries, tak<strong>in</strong>g advantage of the SQL translation code <strong>in</strong> the rest of the project. Aside6www.ayende.com/Blog/Licensed to Rhona Hadida


346 CHAPTER 12 LINQ beyond collectionsfrom anyth<strong>in</strong>g else, this means that the feature of RDBMS <strong>in</strong>dependence comes forfree. The same LINQ queries can be run aga<strong>in</strong>st SQL Server, Oracle, or Postgres, forexample: any system that NHibernate knows about, with SQL tailored for that particularimplementation.Before writ<strong>in</strong>g this book, I hadn’t used NHibernate (although I am reasonablyexperienced with its cous<strong>in</strong> <strong>in</strong> the Java world), and it’s a testament to the project thatwith<strong>in</strong> about an hour I was up and runn<strong>in</strong>g with the SkeetySoft defect database, us<strong>in</strong>gnoth<strong>in</strong>g but the onl<strong>in</strong>e tutorial. List<strong>in</strong>g 12.20 shows the same query we used aga<strong>in</strong>stLINQ to SQL <strong>in</strong> list<strong>in</strong>g 12.3 to list all of Tim’s open defects.List<strong>in</strong>g 12.20LINQ to NHibernate query to list defects assigned to Tim TrotterISessionFactory sessionFactory =new Configuration().Configure().BuildSessionFactory();us<strong>in</strong>g (ISession session = sessionFactory.OpenSession()){us<strong>in</strong>g (ITransaction tx = session.Beg<strong>in</strong>Transaction()){User tim = (from user <strong>in</strong> session.L<strong>in</strong>q()where user.Name == "Tim Trotter"select user).S<strong>in</strong>gle();}}var query = from defect <strong>in</strong> session.L<strong>in</strong>q()where defect.Status != Status.Closedwhere defect.AssignedTo == timselect defect.Summary;foreach (var summary <strong>in</strong> query){Console.WriteL<strong>in</strong>e(summary);}tx.Commit();As you can see, once the session and transaction have been set up, the code is similarto that used <strong>in</strong> LINQ to SQL. The generated SQL is different, although it executes thesame sort of queries. In other cases, identical query expressions can generate differentSQL, mostly due to decisions regard<strong>in</strong>g the lazy or eager load<strong>in</strong>g of entities. This is anexample of a leaky abstraction 7 —where <strong>in</strong> theory the abstraction layer of LINQ mightbe considered to isolate the developer from the implementation perform<strong>in</strong>g theactual query, but <strong>in</strong> practice the implementation details leak through. Don’t fall forthe abstraction: it takes noth<strong>in</strong>g away from the value of LINQ, but you do need to beaware of what you’re cod<strong>in</strong>g aga<strong>in</strong>st, and keep an eye on what queries are be<strong>in</strong>g executedfor you.So, we’ve seen LINQ work<strong>in</strong>g aga<strong>in</strong>st both web services and multiple databases.There’s another piece of <strong>in</strong>frastructure that is commonly queried, though: an enterprisedirectory.7www.joelonsoftware.com/articles/LeakyAbstractions.htmlLicensed to Rhona Hadida


LINQ beyond .NET 3.5347LINQ TO ACTIVE DIRECTORYAlmost all companies runn<strong>in</strong>g their IT <strong>in</strong>frastructure on W<strong>in</strong>dows use Active Directoryto manage users, computers, sett<strong>in</strong>gs, and more. Active Directory is a directoryserver that implements LDAP as one query protocol, but other LDAP servers are available,<strong>in</strong>clud<strong>in</strong>g the free OpenLDAP platform. 8Bart De Smet 9 (who now works for Microsoft) implemented a prototype LINQ toActive Directory provider as a tutorial on how LINQ providers work—but it has alsoproved to be a valuable rem<strong>in</strong>der of how broadly LINQ is targeted. Despite the name,it is capable of query<strong>in</strong>g non-Microsoft servers. He has also repeated the feat with aLINQ to SharePo<strong>in</strong>t provider, which we won’t cover here, but which is a more completeprovider implementation.If you’re not familiar with directories, you can th<strong>in</strong>k of them as a sort of cross betweenfile system directory trees and databases. They form hierarchies rather than tables, buteach node <strong>in</strong> the tree is an entity with properties and an associated schema. (For thoseof you who are familiar with directories, please forgive this gross oversimplification.)Install<strong>in</strong>g and populat<strong>in</strong>g a directory is a bit of an effort, so for the sample codeI’ve connected to a public server. If you happen to have access to an <strong>in</strong>ternal server,you’ll probably f<strong>in</strong>d the results more mean<strong>in</strong>gful if you connect to that.List<strong>in</strong>g 12.21 connects to an LDAP server and lists all the users whose first namebeg<strong>in</strong>s with K. The Person type is described elsewhere <strong>in</strong> the sample source code,complete with attributes to describe to LINQ to Active Directory how the propertieswith<strong>in</strong> the object map to attributes with<strong>in</strong> the directory.List<strong>in</strong>g 12.21LINQ to Active Directory sample, query<strong>in</strong>g users by first namestr<strong>in</strong>g url = "LDAP://ldap.andrew.cmu.edu/dc=cmu,dc=edu";DirectoryEntry root = new DirectoryEntry(url);root.AuthenticationType = AuthenticationTypes.None;var users = new DirectorySource(root, SearchScope.Subtree);users.Log = Console.Out;var query = from user <strong>in</strong> userswhere user.FirstName.StartsWith("K")select user;foreach (Person person <strong>in</strong> query){Console.WriteL<strong>in</strong>e (person.DisplayName);foreach (str<strong>in</strong>g title <strong>in</strong> person.Titles){Console.WriteL<strong>in</strong>e(" {0}", title);}}Are you gett<strong>in</strong>g a sense of déjà vu yet? List<strong>in</strong>g 12.21 shows yet another query expression,which just happens to target LDAP. If you didn’t have the first part, which sets the8www.openldap.org9http://blogs.bartdesmet.net/bartLicensed to Rhona Hadida


348 CHAPTER 12 LINQ beyond collectionsscene, you wouldn’t know that LDAP was <strong>in</strong>volved at all. As we’ve already noted, youshould be aware of the data source particularly with regard to the limitations of anyone provider—but I’m sure you understood the query expression despite not know<strong>in</strong>gLDAP. The pla<strong>in</strong> text version is (&(objectClass=person)(givenName=K*)), whichisn’t horrific but isn’t nearly as familiar as the query expression should be by now.The third-party examples we’ve seen are all “early adoption” code, largely writtenfor the sake of <strong>in</strong>vestigat<strong>in</strong>g what’s possible rather than creat<strong>in</strong>g production code. Ipredict that 2008 and 2009 will see a number of more feature-complete and high-qualityproviders emerg<strong>in</strong>g. At least two of these are likely to come from Microsoft.12.5.2 Future Microsoft LINQ technologiesLINQ providers can work <strong>in</strong> very different ways. As we’ve already seen, LINQ to XMLand LINQ to DataSet are just APIs that offer easy <strong>in</strong>tegration with LINQ to Objects,whereas LINQ to SQL is a more fully fledged provider, offer<strong>in</strong>g translation of queriesto SQL. There are more providers be<strong>in</strong>g developed by Microsoft at the time of thiswrit<strong>in</strong>g, however—another ORM system, and an <strong>in</strong>trigu<strong>in</strong>g project to provide “no cost”parallelism where appropriate.THE ADO.NET ENTITY FRAMEWORKWhile LINQ to SQL is far from a toy, it doesn’t have all the features that many developersexpect from a modern ORM solution. For example, fetch<strong>in</strong>g strategies can be seton a per-context basis, but they can’t be set for <strong>in</strong>dividual queries. Likewise, the entity<strong>in</strong>heritance strategies of LINQ to SQL are somewhat limited. Also, LINQ to SQL onlysupports SQL Server, which will obviously rule out its use for many projects that havealready chosen a different RDBMS.The “ADO.NET Entity Framework” forms the basis of Microsoft’s “Data Access Strategy”and will ship with SQL Server 2008. The entity framework is a much more powerfulsolution than LINQ to SQL, <strong>in</strong>clud<strong>in</strong>g its own text-based query language (Entity SQL, alsoknown as eSQL) and allow<strong>in</strong>g a flexible mapp<strong>in</strong>g between the conceptual model (which iswhat the bus<strong>in</strong>ess layer of your code will use) and the logical model (which is what the databasesees). An Object Services aspect of the entity framework deals with object identity andupdate management, and an Entity Client layer is responsible for all queries. The LINQpart of the entity framework is known as LINQ to Entities.As with so many aspects of software development, flexibility comes with the burdenof complexity—the two models and the mapp<strong>in</strong>g between them require their ownXML files, for example. Microsoft will release an update to Visual Studio 2008 whenthe entity framework is released, with designers to help lighten the load of the variousmapp<strong>in</strong>g tasks. The framework itself is still more complicated than LINQ to SQL, however,and will take longer to master.Rather than provide yet another ORM query aga<strong>in</strong>st the SkeetySoft defect database—andone that would look nearly identical to those we’ve already seen—I’ve just<strong>in</strong>cluded some examples <strong>in</strong> the downloadable source code. 1010 I will update these when the entity framework is released, if necessary.Licensed to Rhona Hadida


LINQ beyond .NET 3.5349All of the LINQ providers we’ve seen so far have acted on a particular data source,and performed the appropriate transformations. Our next topic is slightly different—but it’s one I’m particularly excited about.PARALLEL LINQ (PLINQ)Ten years ago, the idea of even fairly low-to-middl<strong>in</strong>g laptops hav<strong>in</strong>g dual-processorcores would have seemed ridiculous. Today, that’s taken for granted—and if the chipmanufacturers’ plans are anyth<strong>in</strong>g to go by, that’s only the start. Of course, it’s only usefulto have more than one processor core if you’ve got tasks you can run <strong>in</strong> parallel.Parallel LINQ, or PLINQ for short, is a project with one “simple” goal: to execute LINQto Objects queries <strong>in</strong> parallel, realiz<strong>in</strong>g the benefits of multithread<strong>in</strong>g with as few headachesas possible. At the time of this writ<strong>in</strong>g, PLINQ is targeted to be released as part ofParallel Extensions, the next generation of .NET concurrency support. The sample Idescribe is based on the December 2007 Community Technology Preview (CTP).Us<strong>in</strong>g PLINQ is simple, if (and only if) you have to perform the same task on eachelement <strong>in</strong> a sequence, and those tasks are <strong>in</strong>dependent. If you need the result ofone calculation step <strong>in</strong> order to f<strong>in</strong>d the next, PLINQ is not for you—but many CPU<strong>in</strong>tensivetasks can <strong>in</strong> fact be done <strong>in</strong> parallel. To tell the compiler to use PLINQ, youjust need to call AsParallel (an extension method on IEnumerable) on your datasource, and let PLINQ handle the thread<strong>in</strong>g. As with IQueryable, the magic is justnormal compiler method resolution: AsParallel returns an IParallelEnumerable,and the ParallelEnumerable class provides static methods to handle the standardquery operators.List<strong>in</strong>g 12.22 demonstrates PLINQ <strong>in</strong> an entirely artificial way, putt<strong>in</strong>g threads tosleep for random periods <strong>in</strong>stead of actually hitt<strong>in</strong>g the processor hard.List<strong>in</strong>g 12.22Execut<strong>in</strong>g a LINQ query on multiple threads with Parallel LINQstatic <strong>in</strong>t Obta<strong>in</strong>LengthSlowly(str<strong>in</strong>g name){Thread.Sleep(StaticRandom.Next(10000));return name.Length;}...str<strong>in</strong>g[] names = {"Jon", "Holly", "Tom", "Rob<strong>in</strong>", "William"};var query = from name <strong>in</strong> names.AsParallel(3)select Obta<strong>in</strong>LengthSlowly(name);foreach (<strong>in</strong>t length <strong>in</strong> query){Console.WriteL<strong>in</strong>e(length);}List<strong>in</strong>g 12.22 will pr<strong>in</strong>t out the length of each name. We’re us<strong>in</strong>g a random 11 sleep tosimulate do<strong>in</strong>g some real work with<strong>in</strong> the call to Obta<strong>in</strong>LengthSlowly. Without the11 The StaticRandom class used for this is merely a thread-safe wrapper of static methods around a normalRandom class. It’s part of my miscellaneous utility library.Licensed to Rhona Hadida


350 CHAPTER 12 LINQ beyond collectionsAsParallel call, we would only use a s<strong>in</strong>gle thread, but AsParallel and the result<strong>in</strong>gcalls to the ParallelEnumerable extension methods means that the work is split <strong>in</strong>toup to three threads. 12One caveat about this: unless you specify that you want the results <strong>in</strong> the sameorder as the str<strong>in</strong>gs <strong>in</strong> the orig<strong>in</strong>al sequence, PLINQ will assume you don’t m<strong>in</strong>d gett<strong>in</strong>gresults as soon as they’re available, even if results from earlier elements haven’tbeen returned yet. You can prevent this by pass<strong>in</strong>g QueryOptions.PreserveOrder<strong>in</strong>gas a parameter to AsParallel.There are other subtleties to us<strong>in</strong>g PLINQ, such as handl<strong>in</strong>g the possibility of multipleexceptions occurr<strong>in</strong>g <strong>in</strong>stead of the whole process stopp<strong>in</strong>g on the first problematicelement—consult the documentation for further details when ParallelExtensions is fully released. More examples of PLINQ are <strong>in</strong>cluded <strong>in</strong> the downloadablesource code.As you can see, PLINQ isn’t a “data source”—it’s a k<strong>in</strong>d of meta-provider, alter<strong>in</strong>ghow a query is executed. Many developers will never need it—but I’m sure that thosewho do will be eternally grateful for the coord<strong>in</strong>ation it performs for them beh<strong>in</strong>dthe scenes.These won’t be the only new providers Microsoft comes up with—we should expectnew APIs to be built with LINQ <strong>in</strong> m<strong>in</strong>d, and that should <strong>in</strong>clude your own code as well.I confidently expect to see some weird and wonderful uses of LINQ <strong>in</strong> the future.12.6 SummaryPhew! This chapter has been the exact opposite of most of the rest of the book.Instead of focus<strong>in</strong>g on a s<strong>in</strong>gle topic <strong>in</strong> great detail, we’ve covered a vast array of LINQproviders, but at a shallow level.I wouldn’t expect you to feel particularly familiar with any one of the specific technologieswe’ve looked at here, but I hope you’ve got a deeper understand<strong>in</strong>g of whyLINQ is important. It’s not about XML, or <strong>in</strong>-memory queries, or even SQL queries—it’sabout consistency of expression, and giv<strong>in</strong>g the <strong>C#</strong> compiler the opportunity to validateyour queries to at least some extent, regardless of their f<strong>in</strong>al execution platform.You should now appreciate why expression trees are so important that they areamong the few framework elements that the <strong>C#</strong> compiler has direct <strong>in</strong>timate knowledgeof (along with str<strong>in</strong>gs, IDisposable, IEnumerable, and Nullable, for example).They are passports for behavior, allow<strong>in</strong>g it to cross the border of the local mach<strong>in</strong>e,express<strong>in</strong>g logic <strong>in</strong> whatever foreign tongue is catered for by a LINQ provider.It’s not just expression trees—we’ve also relied on the query expression translationemployed by the compiler, and the way that lambda expressions can be converted toboth delegates and expression trees. Extension methods are also important, as withoutthem each provider would have to give implementations of all the relevant methods. If12 I’ve explicitly specified the number of threads <strong>in</strong> this example to force parallelism even on a s<strong>in</strong>gle-core system.If the number of threads isn’t specified, the system acts as it sees fit, depend<strong>in</strong>g on the number of coresavailable and how much other work they have.Licensed to Rhona Hadida


Summary351you look back at all the new features of <strong>C#</strong>, you’ll f<strong>in</strong>d few that don’t contribute significantlyto LINQ <strong>in</strong> some way or other. That is part of the reason for this chapter’s existence:to show the connections between all the features.I shouldn’t wax lyrical for too long, though—as well as the upsides of LINQ, we’veseen a few “gotchas.” LINQ will not always allow us to express everyth<strong>in</strong>g we need <strong>in</strong> aquery, nor does it hide all the details of the underly<strong>in</strong>g data source. The impedancemismatches that have caused developers so much trouble <strong>in</strong> the past are still with us:we can reduce their impact with ORM systems and the like, but without a properunderstand<strong>in</strong>g of the query be<strong>in</strong>g executed on your behalf, you are likely to run <strong>in</strong>tosignificant issues. In particular, don’t th<strong>in</strong>k of LINQ as a way of remov<strong>in</strong>g your need tounderstand SQL—just th<strong>in</strong>k of it as a way of hid<strong>in</strong>g the SQL when you’re not <strong>in</strong>terested<strong>in</strong> the details.Despite the limitations, LINQ is undoubtedly go<strong>in</strong>g to play a major part <strong>in</strong> future .NETdevelopment. In the f<strong>in</strong>al chapter, I will look at some of the ways development is likelyto change <strong>in</strong> the next few years, and the part I believe <strong>C#</strong> 3 will play <strong>in</strong> that evolution.Licensed to Rhona Hadida


Elegant code<strong>in</strong> the new eraThis chapter covers■Reasons for language evolution■ Changes of emphasis for <strong>C#</strong> 3■■Readability: “what” over “how”Effects of parallel comput<strong>in</strong>gYou’ve now seen all the features that <strong>C#</strong> 3 has to offer, and you’ve had a taste ofsome of the flavors of LINQ available now and <strong>in</strong> the near future. Hopefully I’vegiven you a feel<strong>in</strong>g for the directions <strong>C#</strong> 3 might guide you <strong>in</strong> when cod<strong>in</strong>g, and thischapter puts those directions <strong>in</strong>to the context of software development <strong>in</strong> general.There’s a certa<strong>in</strong> amount of speculation <strong>in</strong> this chapter. Take everyth<strong>in</strong>g with agra<strong>in</strong> of salt—I don’t have a crystal ball, after all, and technology is notoriously difficultto predict. However, the themes are fairly common ones and I am confidentthat they’ll broadly hit the mark, even if the details are completely off.Life is all about learn<strong>in</strong>g from our mistakes—and occasionally fail<strong>in</strong>g to do so.The software <strong>in</strong>dustry has been both <strong>in</strong>novative and shock<strong>in</strong>gly backward at times.There are elegant new technologies such as <strong>C#</strong> 3 and LINQ, frameworks that do352Licensed to Rhona Hadida


The chang<strong>in</strong>g nature of language preferences353more than we might have dreamed about ten years ago, and tools that hold our handsthroughout the development processes… and yet we know that a large proportion ofsoftware projects fail. Often this is due to management failures or even customer failures,but sometimes developers need to take at least some of the blame.Many, many books have been written about why this is the case, and I won’t pretendto be an expert, but I believe that ultimately it comes down to human nature.The vast majority of us are sloppy—and I certa<strong>in</strong>ly <strong>in</strong>clude myself <strong>in</strong> that category.Even when we know that best practices such as unit test<strong>in</strong>g and layered designs willhelp us <strong>in</strong> the long run, we sometimes go for quick fixes that eventually come back tohaunt us.There’s only so much a language or a platform can do to counter this. The onlyway to appeal to laz<strong>in</strong>ess is to make the right th<strong>in</strong>g to do also the easiest one. Someareas make that difficult—it will always seem easier <strong>in</strong> some ways to not write unit teststhan to write them. Quite often break<strong>in</strong>g our design layers (“just for this one littleth<strong>in</strong>g, honest”) really is easier than do<strong>in</strong>g the job properly—temporarily.On the bright side, <strong>C#</strong> 3 and LINQ allow many ideas and goals to be expressedmuch more easily than before, improv<strong>in</strong>g readability while simultaneously speed<strong>in</strong>gup development. If you have the opportunity to use <strong>C#</strong> 3 for pleasure before putt<strong>in</strong>g it<strong>in</strong> a bus<strong>in</strong>ess context, you may well f<strong>in</strong>d yourself be<strong>in</strong>g frustrated at the shacklesimposed when you have to go back to <strong>C#</strong> 2 (or, heaven forbid, <strong>C#</strong> 1). There are somany shortcuts that you may often f<strong>in</strong>d yourself surprised at just how easy it is toachieve what might previously have been a time-consum<strong>in</strong>g goal.Some of the improvements are simply obvious: automatic properties replace severall<strong>in</strong>es of code with a s<strong>in</strong>gle one, at no cost. There’s no need to change the way youth<strong>in</strong>k or how you approach design and development—it’s just a common scenariothat is now more streaml<strong>in</strong>ed.What I f<strong>in</strong>d more <strong>in</strong>terest<strong>in</strong>g are the features that do ask us to take a step back. Theysuggest to us that while we haven’t been do<strong>in</strong>g th<strong>in</strong>gs “wrong,” there may be a better wayof look<strong>in</strong>g at the world. In a few years’ time, we may look back at old code and be amazedat the way we used to develop. Whenever a language evolves, it’s worth ask<strong>in</strong>g what thechanges mean <strong>in</strong> this larger sense. I’ll try to answer that question now, for <strong>C#</strong> 3.13.1 The chang<strong>in</strong>g nature of language preferencesThe changes <strong>in</strong> <strong>C#</strong> 3 haven’t just added more features. They’ve altered the idiom ofthe language, the natural way of express<strong>in</strong>g certa<strong>in</strong> ideas and implement<strong>in</strong>g behavior.These shifts <strong>in</strong> emphasis aren’t limited to <strong>C#</strong>, however—they’re part of what’s happen<strong>in</strong>gwith<strong>in</strong> our <strong>in</strong>dustry as a whole.13.1.1 A more functional emphasisIt would be hard to deny that <strong>C#</strong> has become more functional <strong>in</strong> the move from version2 to version 3. Delegates have been part of <strong>C#</strong> 1 s<strong>in</strong>ce the first version, but theyhave become <strong>in</strong>creas<strong>in</strong>gly convenient to specify and <strong>in</strong>creas<strong>in</strong>gly widely used <strong>in</strong> theframework libraries.Licensed to Rhona Hadida


354 CHAPTER 13 Elegant code <strong>in</strong> the new eraThe most extreme example of this is LINQ, of course, which has delegates at itsvery core. While LINQ queries can be written quite readably without us<strong>in</strong>g queryexpressions, if you take away lambda expressions and extension methods they becomefrankly hideous. Even a simple query expression requires extra methods to be writtenso that they can be used as delegate actions. The creation of those delegates is ugly,and the way that the calls are cha<strong>in</strong>ed together is un<strong>in</strong>tuitive. Consider this fairly simplequery expression:from user <strong>in</strong> SampleData.AllUserswhere user.UserType == UserType.Developerorderby user.Nameselect user.Name.ToUpper();That is translated <strong>in</strong>to the equally reasonable set of extension method calls:SampleData.AllUsers.Where(user => user.UserType == UserType.Developer).OrderBy(user => user.Name).Select(user => user.Name.ToUpper());It’s not quite as pretty, but it’s still clear. To express that <strong>in</strong> a s<strong>in</strong>gle expression withoutany extra local variables and without us<strong>in</strong>g any <strong>C#</strong> 2 or 3 features beyond genericsrequires someth<strong>in</strong>g along these l<strong>in</strong>es:Enumerable.Select(Enumerable.OrderBy(Enumerable.Where(SampleData.AllUsers,new Func(AcceptDevelopers)),new Func(OrderByName)),new Func(ProjectToUpperName));Oh, and the AcceptDevelopers, OrderByName, and ProjectToUpperName methods allneed to be def<strong>in</strong>ed, of course. It’s an abom<strong>in</strong>ation. LINQ is just not designed to be usefulwithout a concise way of specify<strong>in</strong>g delegates. Where previously functional languageshave been relatively obscure <strong>in</strong> the bus<strong>in</strong>ess world, some of their benefits arenow be<strong>in</strong>g reaped <strong>in</strong> <strong>C#</strong>.At the same time as ma<strong>in</strong>stream languages are becom<strong>in</strong>g more functional, functionallanguages are becom<strong>in</strong>g more ma<strong>in</strong>stream. The Microsoft Research “F#” language1 is <strong>in</strong> the ML family, but execut<strong>in</strong>g on the CLR: it’s ga<strong>in</strong>ed enough <strong>in</strong>terest tonow have a dedicated team with<strong>in</strong> the nonresearch side of Microsoft br<strong>in</strong>g<strong>in</strong>g it <strong>in</strong>toproduction so that it can be a truly <strong>in</strong>tegrated language <strong>in</strong> the .NET family.The differences aren’t just about be<strong>in</strong>g more functional, though. Is <strong>C#</strong> becom<strong>in</strong>g adynamic language?13.1.2 Static, dynamic, implicit, explicit, or a mixture?As I’ve emphasized a number of times <strong>in</strong> this book, <strong>C#</strong> 3 is still a statically typed language.It has no truly dynamic aspects to it. However, many of the features <strong>in</strong> <strong>C#</strong> 2 and 3 are those1http://research.microsoft.com/projects/cambridge/fsharp/fsharp.aspxLicensed to Rhona Hadida


Delegation as the new <strong>in</strong>heritance355associated with dynamic languages. In particular, the implicitly typed local variables andarrays, extra type <strong>in</strong>ference capabilities for generic methods, extension methods, andbetter <strong>in</strong>itialization structures are all th<strong>in</strong>gs that <strong>in</strong> some ways look like they belong <strong>in</strong>dynamic languages.While <strong>C#</strong> itself is currently statically typed, the Dynamic Language Runtime (DLR)will br<strong>in</strong>g dynamic languages to .NET. Integration between static languages anddynamic ones such as IronRuby and IronPython should therefore be relativelystraightforward—this will allow projects to pick which areas they want to write dynamically,and which are better kept statically typed.Should <strong>C#</strong> become dynamic <strong>in</strong> the future? Given recent blog posts from the <strong>C#</strong>team, it seems likely that <strong>C#</strong> 4 will allow dynamic lookup <strong>in</strong> clearly marked sections ofcode. Call<strong>in</strong>g code dynamically isn’t the same as respond<strong>in</strong>g to calls dynamically, however—andit’s possible that <strong>C#</strong> will rema<strong>in</strong> statically typed at that level. That doesn’tmean there can’t be a language that is like <strong>C#</strong> <strong>in</strong> many ways but dynamic, <strong>in</strong> the sameway that Groovy is like Java <strong>in</strong> many ways but with some extra features and dynamicexecution. It should be noted that Visual Basic already allows for optionally dynamiclookups, just by turn<strong>in</strong>g Option Strict on and off. In the meantime, we should begrateful for the <strong>in</strong>fluence of dynamic languages <strong>in</strong> mak<strong>in</strong>g <strong>C#</strong> 3 a lot more expressive,allow<strong>in</strong>g us to state our <strong>in</strong>tentions without as much fluff surround<strong>in</strong>g the really usefulbits of code.The changes to <strong>C#</strong> don’t just affect how our source code looks <strong>in</strong> pla<strong>in</strong> text terms,however. They should also make us reconsider the structure of our programs, allow<strong>in</strong>gdesigns to make much greater use of delegates without fear of forc<strong>in</strong>g thousands ofone-l<strong>in</strong>e methods on users.13.2 Delegation as the new <strong>in</strong>heritanceThere are many situations where <strong>in</strong>heritance is currently used to alter the behavior ofa component <strong>in</strong> just one or two ways—and they’re often ways that aren’t so much<strong>in</strong>herent <strong>in</strong> the component itself as <strong>in</strong> how it <strong>in</strong>teracts with the world around it.Take a data grid, for example. A grid may use <strong>in</strong>heritance (possibly of a typerelated to a specific row or column) to determ<strong>in</strong>e how data should be formatted. Inmany ways, this is absolutely right—you can build up a flexible design that allows forall k<strong>in</strong>ds of different values to be displayed, possibly <strong>in</strong>clud<strong>in</strong>g images, buttons,embedded tables, and the like. The vast majority of read-only data is likely to consist ofsome pla<strong>in</strong> text, however. Now, we could have a TextDataColumn type with an abstractFormatData method, and derive from that <strong>in</strong> order to format dates, pla<strong>in</strong> str<strong>in</strong>gs,numbers, and all k<strong>in</strong>ds of other data <strong>in</strong> whatever way we want.Alternatively, we could allow the user to specify the formatt<strong>in</strong>g by way of a delegate,which simply converts the appropriate data type to a str<strong>in</strong>g. With <strong>C#</strong> 3’s lambda expressions,this makes it easy to provide a custom display of the data. Of course, you may wellwant to provide easy ways of handl<strong>in</strong>g common cases—but delegates are immutable <strong>in</strong>.NET, so simple “constant” delegates for frequently used types can fill this need neatly.Licensed to Rhona Hadida


356 CHAPTER 13 Elegant code <strong>in</strong> the new eraThis works well when a s<strong>in</strong>gle, isolated aspect of the component needs to be specialized.It’s certa<strong>in</strong>ly not a complete replacement of <strong>in</strong>heritance, nor would I want itto be (the title of this section notwithstand<strong>in</strong>g)—but it allows a more direct approachto be used <strong>in</strong> many situations. Us<strong>in</strong>g <strong>in</strong>terfaces with a small set of methods has oftenbeen another way of provid<strong>in</strong>g custom behavior, and delegates can be regarded as anextreme case of this approach.Of course, this is similar to the po<strong>in</strong>t made earlier about a more functional bias, butit’s applied to the specific area of <strong>in</strong>heritance and <strong>in</strong>terface implementation. It’s notentirely new to <strong>C#</strong> 3, either: List made a start <strong>in</strong> .NET 2.0 even when only <strong>C#</strong> 2 wasavailable, with methods such as Sort and F<strong>in</strong>dAll. Sort allows both an <strong>in</strong>terface-basedcomparison (with IComparer) and a delegate-based comparison (with Comparison),whereas F<strong>in</strong>dAll is purely delegate based. Anonymous methods made these calls relativelysimple and lambda expressions add even more readability.In short, when a type or method needs a s<strong>in</strong>gle aspect of specialized behavior, it’sworth at least consider<strong>in</strong>g the ability to specify that behavior <strong>in</strong> terms of a delegate<strong>in</strong>stead of via <strong>in</strong>heritance or an <strong>in</strong>terface.All of this contributes to our next big goal: readable code.13.3 Readability of results over implementationThe word readability is bandied around quite casually as if it can only mean one th<strong>in</strong>gand can somehow be measured objectively. In real life, different developers f<strong>in</strong>d differentth<strong>in</strong>gs readable, and <strong>in</strong> different ways. There are two k<strong>in</strong>ds of readability I’dlike to separate—while acknowledg<strong>in</strong>g that many more categorizations are possible.First, there is the ease with which a reader can understand exactly what your codeis do<strong>in</strong>g at every step. For <strong>in</strong>stance, mak<strong>in</strong>g every conversion explicit even if there’s animplicit one available makes it clear that a conversion is <strong>in</strong>deed tak<strong>in</strong>g place. This sortof detail can be useful if you’re ma<strong>in</strong>ta<strong>in</strong><strong>in</strong>g code and have already isolated the problemto a few l<strong>in</strong>es of code. However, it tends to be longw<strong>in</strong>ded, mak<strong>in</strong>g it harder tobrowse large sections of source. I th<strong>in</strong>k of this as “readability of implementation.”When it comes to gett<strong>in</strong>g the broad sweep of code, what is required is “readabilityof results”—I want to know what the code does, but I don’t care how it does it rightnow. Much of this has traditionally been down to refactor<strong>in</strong>g, careful nam<strong>in</strong>g, andother best practices. For example, a method that needs to perform several steps canoften be refactored <strong>in</strong>to a method that simply calls other (reasonably short) methodsto do the actual work. Declarative languages tend to emphasize readability of results.<strong>C#</strong> 3 and LINQ comb<strong>in</strong>e to improve readability of results quite significantly—at thecost of readability of implementation. Almost all the cleverness shown by the <strong>C#</strong> 3compiler adds to this: extension methods make the <strong>in</strong>tention of the code clearer, butat the cost of the visibility of the extra static class <strong>in</strong>volved, for example.This isn’t just a language issue, though; it’s also part of the framework support.Consider how you might have implemented our earlier user query <strong>in</strong> .NET 1.1. Theessential <strong>in</strong>gredients are filter<strong>in</strong>g, sort<strong>in</strong>g, and project<strong>in</strong>g:Licensed to Rhona Hadida


Life <strong>in</strong> a parallel universe357ArrayList filteredUsers = new ArrayList();foreach (User user <strong>in</strong> SampleData.AllUsers){if (user.UserType==UserType.Developer){filteredUsers.Add(user);}}filteredUsers.Sort(new UserNameComparer());ArrayList upperCasedNames = new ArrayList();foreach (User user <strong>in</strong> filteredUsers){upperCasedNames.Add(user.Name.ToUpper());}Each step is clear, but it’s relatively hard to understand exactly what’s go<strong>in</strong>g on! Theversion we saw earlier with the explicit calls to Enumerable was shorter, but the evaluationorder still made it difficult to read. <strong>C#</strong> 3 hides exactly how and where the filter<strong>in</strong>g,sort<strong>in</strong>g, and projection is tak<strong>in</strong>g place—even after translat<strong>in</strong>g the query expression<strong>in</strong>to method calls—but the overall purpose of the code is much more obvious.Usually this type of readability is a good th<strong>in</strong>g, but it does mean you need to keepyour wits about you. For <strong>in</strong>stance, captur<strong>in</strong>g local variables makes it a lot easier towrite query expressions—but you need to understand that if you change the values ofthose local variables after creat<strong>in</strong>g the query expression, those changes will applywhen you execute the query expression.One of the aims of this book has been to make you sufficiently comfortable withthe mechanics of <strong>C#</strong> 3 that you can make use of the magic without f<strong>in</strong>d<strong>in</strong>g it hard tounderstand what’s go<strong>in</strong>g on when you need to dig <strong>in</strong>to it—as well as warn<strong>in</strong>g you ofsome of the potential hazards you might run <strong>in</strong>to.So far these have all been somewhat <strong>in</strong>ward-look<strong>in</strong>g aspects of development—changes that could have happened at any time. The next po<strong>in</strong>t is very much due towhat a biologist might call an “external stimulus.”13.4 Life <strong>in</strong> a parallel universeIn chapter 12 we looked briefly at Parallel LINQ, and I mentioned that it is part of awider project called Parallel Extensions. This is Microsoft’s next attempt to make concurrencyeasier. I don’t expect it to be the f<strong>in</strong>al word on such a daunt<strong>in</strong>g topic, but it’sexcit<strong>in</strong>g nonetheless.As I write this, most computers still have just a few cores. Some servers have eightor possibly even 16 (with<strong>in</strong> the x86/x64 space—other architectures already supportfar more than this). Given how everyth<strong>in</strong>g <strong>in</strong> the <strong>in</strong>dustry is progress<strong>in</strong>g, it may not belong before that looks like small fry, with genu<strong>in</strong>e massively parallel chips becom<strong>in</strong>gpart of everyday life. Concurrency is at the tipp<strong>in</strong>g po<strong>in</strong>t between “nice to have” and“must have” as a developer skill.We’ve already seen how the functional aspects of <strong>C#</strong> 3 and LINQ enable some concurrencyscenarios—parallelism is often a matter of break<strong>in</strong>g down a big task <strong>in</strong>to lotsLicensed to Rhona Hadida


358 CHAPTER 13 Elegant code <strong>in</strong> the new eraof smaller ones that can run at the same time, after all, and delegates are nice build<strong>in</strong>gblocks for that. The support for delegates <strong>in</strong> the form of lambda expressions—andeven expression trees to express logic <strong>in</strong> a more data-like manner—will certa<strong>in</strong>ly helpparallelization efforts <strong>in</strong> the future.There will be more advances to come. Some improvements may come throughnew frameworks such as Parallel Extensions, while others may come through futurelanguage features. Some of the frameworks may use exist<strong>in</strong>g language features <strong>in</strong>novel ways, just as the Concurrency and Coord<strong>in</strong>ation Runtime uses iterator blocks aswe saw <strong>in</strong> chapter 6.One area we may well see becom<strong>in</strong>g more prom<strong>in</strong>ent is provability. Concurrency isa murky area full of hidden pitfalls, and it’s also very hard to test properly. Test<strong>in</strong>gevery possibility is effectively impossible—but <strong>in</strong> some cases source code can be analyzedfor concurrency correctness automatically. Mak<strong>in</strong>g this applicable to bus<strong>in</strong>esssoftware at a level that is usable by “normal” developers such as ourselves is likely to bechalleng<strong>in</strong>g, but we may see progress as it becomes <strong>in</strong>creas<strong>in</strong>gly important to use thelarge number of cores becom<strong>in</strong>g available to us.There are clearly dozens of areas I could have picked that could become crucial <strong>in</strong>the next decade—mobile comput<strong>in</strong>g, service-oriented architectures (SOA), humancomputer <strong>in</strong>terfaces, rich Internet applications, system <strong>in</strong>teroperability, and so forth.These are all likely to be transformed significantly—but parallel comput<strong>in</strong>g is likely tobe at the heart of many of them. If you don’t know much about thread<strong>in</strong>g, I stronglyadvise you to start learn<strong>in</strong>g right now.13.5 FarewellSo, that’s <strong>C#</strong>—for now. I doubt that it will stay at version 3 forever, although I would personallylike Microsoft to give us at least a few years of explor<strong>in</strong>g and becom<strong>in</strong>g comfortablewith <strong>C#</strong> 3 before mov<strong>in</strong>g the world on aga<strong>in</strong>. I don’t know about you, but I coulddo with a bit of time to use what we’ve got <strong>in</strong>stead of learn<strong>in</strong>g the next version. If weneed a bit more variety and spice, there are always other languages to be studied…In the meantime, there will certa<strong>in</strong>ly be new libraries and architectures to come togrips with. Developers can never afford to stand still—but hopefully this book hasgiven you a rock-solid foundation <strong>in</strong> <strong>C#</strong>, enabl<strong>in</strong>g you to learn new technologies withoutworry<strong>in</strong>g about what the language is do<strong>in</strong>g.There’s more to life than learn<strong>in</strong>g about the new tools available, and while you mayhave bought this book purely out of <strong>in</strong>tellectual curiosity, it’s more likely that you justwant to get the most out of <strong>C#</strong> 3. After all, there’s relatively little po<strong>in</strong>t <strong>in</strong> acquir<strong>in</strong>g a skillif you’re not go<strong>in</strong>g to use it. <strong>C#</strong> 3 is a wonderful language, and .NET 3.5 is a great platform—buton their own they mean very little. They need to be used to provide value.I’ve tried to give you a thorough understand<strong>in</strong>g of <strong>C#</strong> 3, but that doesn’t mean thatyou’ve seen all that it can do, any more than play<strong>in</strong>g each note on a piano <strong>in</strong> turnmeans you’ve heard every possible tune. I’ve put the features <strong>in</strong> context and givensome examples of where you might f<strong>in</strong>d them helpful. I can’t tell you exactly whatground-break<strong>in</strong>g use you might f<strong>in</strong>d for <strong>C#</strong> 3—but I wish you the very best of luck.Licensed to Rhona Hadida


appendix:LINQ standardquery operatorsThere are many standard query operators <strong>in</strong> LINQ, only some of which are supporteddirectly <strong>in</strong> <strong>C#</strong> query expressions—the others have to be called “manually” asnormal methods. Some of the standard query operators are demonstrated <strong>in</strong> thema<strong>in</strong> text of the book, but they’re all listed <strong>in</strong> this appendix. For the examples, I’vedef<strong>in</strong>ed two sample sequences:str<strong>in</strong>g[] words = {"zero", "one", "two", "three", "four"};<strong>in</strong>t[] numbers = {0, 1, 2, 3, 4};For completeness I’ve <strong>in</strong>cluded the operators we’ve already seen, although <strong>in</strong> mostcases chapter 11 conta<strong>in</strong>s more detail on them than I’ve provided here. For eachoperator, I’ve specified whether it uses deferred or immediate execution.A.1 AggregationThe aggregation operators (see table A.1) all result <strong>in</strong> a s<strong>in</strong>gle value rather than asequence. Average and Sum all operate either on a sequence of numbers (any of thebuilt-<strong>in</strong> numeric types) or on a sequence of elements with a delegate to convert fromeach element to one of the built-<strong>in</strong> numeric types. M<strong>in</strong> and Max have overloads fornumeric types, but can also operate on any sequence either us<strong>in</strong>g the default comparerfor the element type or us<strong>in</strong>g a conversion delegate. Count and LongCount areequivalent to each other, just with different return types. Both of these have two overloads—onethat just counts the length of the sequence, and one that takes a predicate:only elements match<strong>in</strong>g the predicate are counted.The most generalized aggregation operator is just called Aggregate. All the otheraggregation operators could be expressed as calls to Aggregate, although it wouldbe relatively pa<strong>in</strong>ful to do so. The basic idea is that there’s always a “result so far,”start<strong>in</strong>g with an <strong>in</strong>itial seed. An aggregation delegate is applied for each element of359Licensed to Rhona Hadida


360 APPENDIX LINQ standard query operatorsthe <strong>in</strong>put sequence: the delegate takes the result so far and the <strong>in</strong>put element, and producesthe next result. As a f<strong>in</strong>al optional step, a conversion is applied from the aggregationresult to the return value of the method. This conversion may result <strong>in</strong> adifferent type, if necessary. It’s not quite as complicated as it sounds, but you’re stillunlikely to use it very often.All of the aggregation operators use immediate execution.Table A.1Examples of aggregation operatorsExpressionResultnumbers.Sum() 10numbers.Count() 5numbers.Average() 2numbers.LongCount(x => x%2 == 0)words.M<strong>in</strong>(word => word.Length)words.Max(word => word.Length)numbers.Aggregate("seed",(soFar, elt) => soFar+elt.ToStr<strong>in</strong>g(),result => result.ToUpper())3 (as a long; there are three even numbers)3 ("one" and "two")5 ("three")SEED01234A.2 ConcatenationThere is a s<strong>in</strong>gle concatenation operator: Concat (see table A.2). As you might expect,this operates on two sequences, and returns a s<strong>in</strong>gle sequence consist<strong>in</strong>g of all the elementsof the first sequence followed by all the elements of the second. The two <strong>in</strong>putsequences must be of the same type, and execution is deferred.Table A.2Concat exampleExpressionResultnumbers.Concat(new[] {2, 3, 4, 5, 6}) 0, 1, 2, 3, 4, 2, 3, 4, 5, 6A.3 ConversionThe conversion operators (see table A.3) cover a fair range of uses, but they all come<strong>in</strong> pairs. AsEnumerable and AsQueryable allow a sequence to be treated as IEnumerable or IQueryable respectively, forc<strong>in</strong>g further calls to convert lambda expressions <strong>in</strong>todelegate <strong>in</strong>stances or expression trees respectively, and use the appropriate extensionmethods. These operators use deferred execution.ToArray and ToList are fairly self-explanatory: they read the whole sequence <strong>in</strong>tomemory, return<strong>in</strong>g it either as an array or as a List. Both use immediate execution.Licensed to Rhona Hadida


Conversion361Cast and OfType convert an untyped sequence <strong>in</strong>to a typed one, either throw<strong>in</strong>gan exception (for Cast) or ignor<strong>in</strong>g (for OfType) elements of the <strong>in</strong>put sequence thataren’t implicitly convertible to the output sequence element type. This may also beused to convert typed sequences <strong>in</strong>to more specifically typed sequences, such as convert<strong>in</strong>gIEnumerable to IEnumerable. The conversions are performed<strong>in</strong> a stream<strong>in</strong>g manner with deferred execution.ToDictionary and ToLookup both take delegates to obta<strong>in</strong> the key for any particularelement; ToDictionary returns a dictionary mapp<strong>in</strong>g the key to the element type,whereas ToLookup returns an appropriately typed ILookup. A lookup is like a dictionarywhere the value associated with a key isn’t one element but a sequence of elements.Lookups are generally used when duplicate keys are expected as part of normal operation,whereas a duplicate key will cause ToDictionary to throw an exception. Morecomplicated overloads of both methods allow a custom IEqualityComparer to beused to compare keys, and a conversion delegate to be applied to each element beforeit is put <strong>in</strong>to the dictionary or lookup.The examples <strong>in</strong> table A.3 use two additional sequences to demonstrate Cast andOfType:object[] allStr<strong>in</strong>gs = {"These", "are", "all", "str<strong>in</strong>gs"};object[] notAllStr<strong>in</strong>gs = {"Number", "at", "the", "end", 5};Table A.3Conversion examplesExpressionallStr<strong>in</strong>gs.Cast()allStr<strong>in</strong>gs.OfType()notAllStr<strong>in</strong>gs.Cast()notAllStr<strong>in</strong>gs.OfType()Result"These", "are", "all", "str<strong>in</strong>gs"(as IEnumerable)"These", "are", "all", "str<strong>in</strong>gs"(as IEnumerable)Exception is thrown while iterat<strong>in</strong>g, at po<strong>in</strong>t of fail<strong>in</strong>gconversion"Number", "at", "the", "end"(as IEnumerable)numbers.ToArray() 0, 1, 2, 3, 4(as <strong>in</strong>t[])numbers.ToList() 0, 1, 2, 3, 4(as List)words.ToDictionary(word =>word.Substr<strong>in</strong>g(0, 2))Dictionary contents:"ze": "zero""on": "one""tw": "two""th": "three""fo": "four"Licensed to Rhona Hadida


362 APPENDIX LINQ standard query operatorsTable A.3Conversion examples (cont<strong>in</strong>ued)Expression// Key is first character of wordwords.ToLookup(word => word[0])words.ToDictionary(word => word[0])ResultLookup contents:'z': "zero"'o': "one"'t': "two", "three"'f': "four"Exception: Can only have one entry per key, sofails on 't'I haven’t provided examples for AsEnumerable or AsQueryable because they don’taffect the results <strong>in</strong> an immediately obvious way. Instead, they affect the manner <strong>in</strong> whichthe query is executed. Queryable.AsQueryable is an extension method on IEnumerablethat returns an IQueryable (both types be<strong>in</strong>g generic or nongeneric, depend<strong>in</strong>gon which overload you pick). If the IEnumerable you call it on is already an IQueryable,it just returns the same reference—otherwise it creates a wrapper around the orig<strong>in</strong>alsequence. The wrapper allows you to use all the normal Queryable extension methods,pass<strong>in</strong>g <strong>in</strong> expression trees, but when the query is executed the expression tree is compiled<strong>in</strong>to normal IL and executed directly, us<strong>in</strong>g the LambdaExpression.Compilemethod shown <strong>in</strong> section 9.3.2.Enumerable.AsEnumerable is an extension method on IEnumerable and has atrivial implementation, simply return<strong>in</strong>g the reference it was called on. No wrappersare <strong>in</strong>volved—it just returns the same reference. This forces the Enumerable extensionmethods to be used <strong>in</strong> subsequent LINQ operators. Consider the follow<strong>in</strong>gquery expressions:// Filter the users <strong>in</strong> the database with LIKEfrom user <strong>in</strong> context.Userswhere user.Name.StartsWith("Tim")select user;// Filter the users <strong>in</strong> memoryfrom user <strong>in</strong> context.Users.AsEnumerable()where user.Name.StartsWith("Tim")select user;The second query expression forces the compile-time type of the source to beIEnumerable <strong>in</strong>stead of IQueryable, so all the process<strong>in</strong>g is done <strong>in</strong>memory <strong>in</strong>stead of at the database. The compiler will use the Enumerable extensionmethods (tak<strong>in</strong>g delegate parameters) <strong>in</strong>stead of the Queryable extension methods(tak<strong>in</strong>g expression tree parameters). Normally you want to do as much process<strong>in</strong>g aspossible <strong>in</strong> SQL, but when there are transformations that require “local” code, yousometimes have to force LINQ to use the appropriate Enumerable extension methods.Licensed to Rhona Hadida


Equality operations363A.4 Element operationsThis is another selection of query operators that are grouped <strong>in</strong> pairs (see table A.4).This time, the pairs all work the same way. There’s a simple version that picks a s<strong>in</strong>gleelement if it can or throws an exception if the specified element doesn’t exist, and aversion with OrDefault at the end of the name. The OrDefault version is exactly thesame except that it returns the default value for the result type <strong>in</strong>stead of throw<strong>in</strong>g anexception if it can’t f<strong>in</strong>d the element you’ve asked for. All of these operators useimmediate execution.The operator names are easily understood: First and Last return the first andlast elements of the sequence respectively (only default<strong>in</strong>g if there are no elements),S<strong>in</strong>gle returns the only element <strong>in</strong> a sequence (default<strong>in</strong>g if there isn’t exactly oneelement), and ElementAt returns a specific element by <strong>in</strong>dex (the fifth element, forexample). In addition, there’s an overload for all of the operators other thanElementAt to filter the sequence first—for example, First can return the first elementthat matches a given condition.Table A.4S<strong>in</strong>gle element selection examplesExpressionResultwords.ElementAt(2)words.ElementAtOrDefault(10)words.First()words.First(word => word.Length==3)words.First(word => word.Length==10)words.FirstOrDefault(word => word.Length==10)words.Last()words.S<strong>in</strong>gle()words.S<strong>in</strong>gleOrDefault()words.S<strong>in</strong>gle(word => word.Length==5)words.S<strong>in</strong>gle(word => word.Length==10)"two"null"zero""one"Exception: No match<strong>in</strong>g elementsnull"four"Exception: More than one elementnull"three"Exception: No match<strong>in</strong>g elementsA.5 Equality operationsThere’s only one equality operation: SequenceEqual (see table A.5). This just comparestwo sequences for element-by-element equality, <strong>in</strong>clud<strong>in</strong>g order. For <strong>in</strong>stance,the sequence 0, 1, 2, 3, 4 is not equal to 4, 3, 2, 1, 0. An overload allows a specificIEqualityComparer to be used when compar<strong>in</strong>g elements. The return value is justa Boolean, and is computed with immediate execution.Licensed to Rhona Hadida


364 APPENDIX LINQ standard query operatorsTable A.5Sequence equality examplesExpressionwords.SequenceEqual(new[]{"zero","one","two","three","four"})words.SequenceEqual(new[]{"ZERO","ONE","TWO","THREE","FOUR"})words.SequenceEqual(new[]{"ZERO","ONE","TWO","THREE","FOUR"},Str<strong>in</strong>gComparer.Ord<strong>in</strong>alIgnoreCase)ResultTrueFalseTrueA.6 GenerationOut of all the generation operators (see table A.6), only one acts on an exist<strong>in</strong>gsequence: DefaultIfEmpty. This returns either the orig<strong>in</strong>al sequence if it’s not empty,or a sequence with a s<strong>in</strong>gle element otherwise. The element is normally the defaultvalue for the sequence type, but an overload allows you to specify which value to use.There are three other generation operators that are just static methods <strong>in</strong>Enumerable:■ Range generates a sequence of <strong>in</strong>tegers, with the parameters specify<strong>in</strong>g the firstvalue and how many values to generate.■ Repeat generates a sequence of any type by repeat<strong>in</strong>g a specified s<strong>in</strong>gle valuefor a specified number of times.■ Empty generates an empty sequence of any type.All of the generation operators use deferred execution.Table A.6Generation examplesExpressionResultnumbers.DefaultIfEmpty() 0, 1, 2, 3, 4new <strong>in</strong>t[0].DefaultIfEmpty()new <strong>in</strong>t[0].DefaultIfEmpty(10)0 (with<strong>in</strong> an IEnumerable)10 (with<strong>in</strong> an IEnumerable)Enumerable.Range(15, 2) 15, 16Enumerable.Repeat(25, 2) 25, 25Enumerable.Empty()An empty IEnumerableLicensed to Rhona Hadida


Jo<strong>in</strong>s365A.7 Group<strong>in</strong>gThere are two group<strong>in</strong>g operators, but one of them is ToLookup (which we’ve alreadyseen <strong>in</strong> A.3 as a conversion operator). That just leaves GroupBy, which we saw <strong>in</strong> section11.6.1 when discuss<strong>in</strong>g the group … by clause <strong>in</strong> query expressions. It usesdeferred execution, but buffers results.The result of GroupBy is a sequence of appropriately typed IGroup<strong>in</strong>g elements.Each element has a key and a sequence of elements that match that key. In many ways,this is just a different way of look<strong>in</strong>g at a lookup—<strong>in</strong>stead of hav<strong>in</strong>g random access tothe groups by key, the groups are enumerated <strong>in</strong> turn. The order <strong>in</strong> which the groupsare returned is the order <strong>in</strong> which their respective keys are discovered. With<strong>in</strong> agroup, the order is the same as <strong>in</strong> the orig<strong>in</strong>al sequence.GroupBy (see table A.7) has a daunt<strong>in</strong>g number of overloads, allow<strong>in</strong>g you to specifynot only how a key is derived from an element (which is always required) but alsooptionally the follow<strong>in</strong>g:■■■How to compare keys.A projection from orig<strong>in</strong>al element to the element with<strong>in</strong> a group.A projection from a key and an enumeration of elements to a result type. If thisis specified, the result is just a sequence of elements of this result type.Frankly the last option is very confus<strong>in</strong>g. I’d recommend avoid<strong>in</strong>g it unless it def<strong>in</strong>itelymakes the code simpler for some reason.Table A.7GroupBy examplesExpressionwords.GroupBy(word => word.Length)words.GroupBy(word => word.Length, // Keyword => word.ToUpper() // Group element)ResultKey: 4; Sequence: "zero", "four"Key: 3; Sequence: "one", "two"Key: 5; Sequence: "three"Key: 4; Sequence: "ZERO", "FOUR"Key: 3; Sequence: "ONE", "TWO"Key: 5; Sequence: "THREE"A.8 Jo<strong>in</strong>sTwo operators are specified as jo<strong>in</strong> operators: Jo<strong>in</strong> and GroupJo<strong>in</strong>, both of which we saw<strong>in</strong> section 11.5 us<strong>in</strong>g jo<strong>in</strong> and jo<strong>in</strong> … <strong>in</strong>to query expression clauses respectively. Eachmethod takes several parameters: two sequences, a key selector for each sequence, a projectionto apply to each match<strong>in</strong>g pair of elements, and optionally a key comparison.For Jo<strong>in</strong> the projection takes one element from each sequence and produces aresult; for GroupJo<strong>in</strong> the projection takes an element from the left sequence (<strong>in</strong> thechapter 11 term<strong>in</strong>ology—the first one specified, usually as the sequence the extensionmethod appears to be called on) and a sequence of match<strong>in</strong>g elements from the rightLicensed to Rhona Hadida


366 APPENDIX LINQ standard query operatorssequence. Both use deferred execution, and stream the left sequence but buffer theright sequence.Table A.8Jo<strong>in</strong> examplesExpressionnames.Jo<strong>in</strong> // Left sequence(colors, // Right sequencename => name[0], // Left key selectorcolor => color[0], // Right key selector// Projection for result pairs(name, color) => name+" - "+color)names.GroupJo<strong>in</strong>(colors,name => name[0],color => color[0],// Projection for key/sequence pairs(name, matches) => name+": "+str<strong>in</strong>g.Jo<strong>in</strong>("/", matches.ToArray()))Result"Rob<strong>in</strong> - Red","Ruth - Red","Bob - Blue","Bob - Beige""Rob<strong>in</strong>: Red","Ruth: Red","Bob: Blue/Beige","Emma: "For the jo<strong>in</strong> examples <strong>in</strong> table A.8, we’ll match a sequence of names (Rob<strong>in</strong>, Ruth,Bob, Emma) aga<strong>in</strong>st a sequence of colors (Red, Blue, Beige, Green) by look<strong>in</strong>g at thefirst character of both the name and the color, so Rob<strong>in</strong> will jo<strong>in</strong> with Red and Bobwill jo<strong>in</strong> with both Blue and Beige, for example.Note that Emma doesn’t match any of the colors—the name doesn’t appear at all<strong>in</strong> the results of the first example, but it does appear <strong>in</strong> the second, with an emptysequence of colors.A.9 Partition<strong>in</strong>gThe partition<strong>in</strong>g operators either skip an <strong>in</strong>itial part of the sequence, return<strong>in</strong>g only therest, or take only the <strong>in</strong>itial part of a sequence, ignor<strong>in</strong>g the rest. In each case you caneither specify how many elements are <strong>in</strong> the first part of the sequence, or specify a condition—thefirst part of the sequence cont<strong>in</strong>ues until the condition fails. After the conditionfails for the first time, it isn’t tested aga<strong>in</strong>—it doesn’t matter whether laterelements <strong>in</strong> the sequence match or not. All of the partition<strong>in</strong>g operators (see table A.9)use deferred execution.Table A.9Partition<strong>in</strong>g examplesExpressionResultwords.Take(3)words.Skip(3)"zero", "one", "two""three", "four"Licensed to Rhona Hadida


Quantifiers367Table A.9Partition<strong>in</strong>g examples (cont<strong>in</strong>ued)Expressionwords.TakeWhile(word => word[0] > 'k')words.SkipWhile(word => word[0] > 'k')"zero", "one", "two", "three""four"ResultA.10 ProjectionWe’ve seen both projection operators (Select and SelectMany) <strong>in</strong> chapter 11. Selectis a simple one-to-one projection from element to result. SelectMany is used when thereare multiple from clauses <strong>in</strong> a query expression: each element <strong>in</strong> the orig<strong>in</strong>al sequenceis used to generate a new sequence. Both projection operators (see table A.10) usedeferred execution.There are overloads we didn’t see <strong>in</strong> chapter 11. Both methods have overloads thatallow the <strong>in</strong>dex with<strong>in</strong> the orig<strong>in</strong>al sequence to be used with<strong>in</strong> the projection, andSelectMany either flattens all of the generated sequences <strong>in</strong>to a s<strong>in</strong>gle sequence without<strong>in</strong>clud<strong>in</strong>g the orig<strong>in</strong>al element at all, or uses a projection to generate a result elementfor each pair of elements. Multiple from clauses always use the overload thattakes a projection. (Examples of this are quite long-w<strong>in</strong>ded, and not <strong>in</strong>cluded here.See chapter 11 for more details.)Table A.10Projection examplesExpressionResultwords.Select(word => word.Length) 4, 3, 3, 5, 4words.Select((word, <strong>in</strong>dex) =><strong>in</strong>dex.ToStr<strong>in</strong>g()+": "+word)words.SelectMany(word => word.ToCharArray())words.SelectMany((word, <strong>in</strong>dex) =>Enumerable.Repeat(word, <strong>in</strong>dex))"0: zero", "1: one", "2: two","3: three", "4: four"'z', 'e', 'r', 'o', 'o', 'n', 'e', 't','w', 'o', 't', 'h', 'r', 'e', 'e', 'f','o', 'u', 'r'"one", "two", "two", "three","three", "three", "four", "four","four", "four"A.11 QuantifiersThe quantifier operators (see table A.11) all return a Boolean value, us<strong>in</strong>g immediateexecution:■ All checks whether all the elements <strong>in</strong> the sequence satisfy a specified condition.■ Any checks whether any of the elements <strong>in</strong> the sequence satisfy a specified condition,or if no condition is specified, whether there are any elements at all.■ Conta<strong>in</strong>s checks whether the sequence conta<strong>in</strong>s a particular element, optionallyspecify<strong>in</strong>g a comparison to use.Licensed to Rhona Hadida


368 APPENDIX LINQ standard query operatorsTable A.11Quantifier examplesExpressionwords.All(word => word.Length > 3)words.All(word => word.Length > 2)words.Any()words.Any(word => word.Length == 6)words.Any(word => word.Length == 5)words.Conta<strong>in</strong>s("FOUR")words.Conta<strong>in</strong>s("FOUR",Str<strong>in</strong>gComparer.Ord<strong>in</strong>alIgnoreCase)Resultfalse ("one" and "two" have exactly threeletters)Truetrue (the sequence is not empty)false (no six-letter words)true ("three" satisfies the condition)FalseTrueA.12 Filter<strong>in</strong>gThe two filter<strong>in</strong>g operators are OfType and Where. For details and examples of theOfType operator, see the conversion operators section (A.3). The Where operator (seetable A.12) has overloads so that the filter can take account of the element’s <strong>in</strong>dex. It’sunusual to require the <strong>in</strong>dex, and the where clause <strong>in</strong> query expressions doesn’t usethis overload. Where always uses deferred execution.Table A.12Filter<strong>in</strong>g examplesExpressionwords.Where(word => word.Length > 3)words.Where((word, <strong>in</strong>dex) =><strong>in</strong>dex < word.Length)Result"zero", "three", "four""zero", // length=4, <strong>in</strong>dex=0"one", // length=3, <strong>in</strong>dex=1"two", // length=3, <strong>in</strong>dex=2"three", // length=5, <strong>in</strong>dex=3// Not "four", length=4, <strong>in</strong>dex=4A.13 Set-based operationsIt’s natural to be able to consider two sequences as sets of elements. The four setbasedoperators all have two overloads, one us<strong>in</strong>g the default equality comparison forthe element type, and one where the comparison is specified <strong>in</strong> an extra parameter.All of them use deferred execution.The Dist<strong>in</strong>ct operator is the simplest—it acts on a s<strong>in</strong>gle sequence, and just returnsa new sequence of all the dist<strong>in</strong>ct elements, discard<strong>in</strong>g duplicates. The other operatorsalso make sure they only return dist<strong>in</strong>ct values, but they act on two sequences:■■■Intersect returns elements that appear <strong>in</strong> both sequences.Union returns the elements that are <strong>in</strong> either sequence.Except returns the elements that are <strong>in</strong> the first sequence but not <strong>in</strong> the second.(Elements that are <strong>in</strong> the second sequence but not the first are not returned.)Licensed to Rhona Hadida


Sort<strong>in</strong>g369For the examples of these operators <strong>in</strong> table A.13, we’ll use two new sequences: abbc("a", "b", "b", "c") and cd ("c", "d").Table A.13Set-based examplesExpressionabbc.Dist<strong>in</strong>ct()abbc.Intersect(cd)abbc.Union(cd)abbc.Except(cd)cd.Except(abbc)Result"a", "b", "c""c""a", "b", "c", "d""a", "b""d"A.14 Sort<strong>in</strong>gWe’ve seen all the sort<strong>in</strong>g operators before: OrderBy and OrderByDescend<strong>in</strong>g providea “primary” order<strong>in</strong>g, while ThenBy and ThenByDescend<strong>in</strong>g provide subsequent order<strong>in</strong>gsfor elements that aren’t differentiated by the primary one. In each case a projectionis specified from an element to its sort<strong>in</strong>g key, and a comparison (between keys)can also be specified. Unlike some other sort<strong>in</strong>g algorithms <strong>in</strong> the framework (such asList.Sort), the LINQ order<strong>in</strong>gs are stable—<strong>in</strong> other words, if two elements areregarded as equal <strong>in</strong> terms of their sort<strong>in</strong>g key, they will be returned <strong>in</strong> the order theyappeared <strong>in</strong> the orig<strong>in</strong>al sequence.The f<strong>in</strong>al sort<strong>in</strong>g operator is Reverse, which simply reverses the order of thesequence. All of the sort<strong>in</strong>g operators (see table A.14) use deferred execution, butbuffer their data.Table A.14Sort<strong>in</strong>g examplesExpressionwords.OrderBy(word => word)// Order words by second characterwords.OrderBy(word => word[1])// Order words by length;// equal lengths returned <strong>in</strong> orig<strong>in</strong>al// orderwords.OrderBy(word => word.Length)words.OrderByDescend<strong>in</strong>g(word => word.Length)// Order words by length and then// alphabeticallywords.OrderBy(word => word.Length).ThenBy(word => word)Result"four", "one", "three", "two","zero""zero", "three", "one", "four","two""one", "two", "zero", "four","three""three", "zero", "four", "one","two""one", "two", "four", "zero","three"Licensed to Rhona Hadida


370 APPENDIX LINQ standard query operatorsTable A.14Sort<strong>in</strong>g examples (cont<strong>in</strong>ued)// Order words by length and then// alphabetically backwardswords.OrderBy(word => word.Length).ThenByDescend<strong>in</strong>g(word => word)words.Reverse()ExpressionResult"two", "one", "zero", "four","three""four", "three", "two", "one","zero"Licensed to Rhona Hadida


<strong>in</strong>dexSymbols!= 82with generics 76#pragma 197*. See transparent identifiers:: 194== 82with generics 76=> 233? modifier 121?? operator. See null coalesc<strong>in</strong>g operatorAabout:blank, as URL equivalent of nullreference 113abstract 175implicit for static classes 191abstract classes 256, 271abstraction 68academic purity 220access modifiersdefault 193for partial types 186for properties 192Action 12, 97, 145, 232, 236Active Directory 347Active Server Pages (ASP) 19Adapter 98adapters 336Add 219, 286addition us<strong>in</strong>g expression trees 239ADO.NET 114, 319, 334–336Entity Framework 22, 315, 348Adobe 22, 24aesthetics 222Aggregate. See Standard Query Operators,AggregateAJAX 24aliases 213aliases. See namespace aliasesAll. See Standard Query Operators, AllAmazon 344web service 17Ancestors 342AncestorsAndSelf 342angle brackets 68Annotations method 342anonymous functions 232, 249, 253conversions 252anonymous methods 10, 55, 144–159, 245,265, 277, 356ambiguity 150compared with lambda expressions 232implicit typ<strong>in</strong>g 212lifetime 159parametersignor<strong>in</strong>g 149us<strong>in</strong>g 145return<strong>in</strong>g from methods 154return<strong>in</strong>g values 147anonymous object <strong>in</strong>itializers 57, 224anonymous types 57, 224, 236, 266, 340members 226number of dist<strong>in</strong>ct types 225transparent identifiers 296Any. See Standard Query Operators, AnyAppend 265applet 18application doma<strong>in</strong>s 49Application.Run 139371Licensed to Rhona Hadida


372INDEXarchitecture 275AreaComparer 105argument evaluation 189argument, parameter pass<strong>in</strong>g 53arithmetic 239Array 43array <strong>in</strong>dex<strong>in</strong>g 239ArrayList 5, 8, 46, 52comparisons with List 65, 73, 88, 96use with LINQ 289arrays 278, 360as reference types 49covariance 45covariance support 103implicit typ<strong>in</strong>g. See implicitly typed arrays<strong>in</strong>itialization 223ASCII 44AsEnumerable, DataTable extension method 335See also Standard Query Operators,AsEnumerableASP Classic 20See also Active Server Pages (ASP)ASP.NET 20, 26, 198AsParallel 349AsQueryable. See Standard Query Operators,AsQueryableAsReadOnly 98assemblies 225, 261extern aliases 196assembly reference 28AssemblyInfo.cs 203assignment 50–51of null to a nullable type variable 120assignments 242associations 316, 337asterisk 297AsXAttributes 340asynchronous code 178asynchronous pattern 81attorney 34, 38Attribute method 342Attribute suffix 201attributes 316Attributes method 342automatically implemented properties208–210, 235Average. See Standard Query Operators, Averageaxis methods 342Ayende 345Bback tick 93BackgroundWorker 41backward compatibility 253Base Class Libraries (BCL) 27base class library 207base type 191specify<strong>in</strong>g for partial types 185BCL. See Base Class Libraries (BCL)Beg<strong>in</strong>XXX/EndXXX pattern 81behavioradd<strong>in</strong>g us<strong>in</strong>g <strong>in</strong>heritance 255specify<strong>in</strong>g with delegates 33behavioral pattern 161best practices 209, 274, 356better conversion 251Bill Gates 31b<strong>in</strong>ary operators 125B<strong>in</strong>aryExpression 239black box view of iterator blocks 167bloat of type responsibility 187blockanonymous methods 146lambda expressions 234restrictions of expression trees 242blogs 270, 272, 315, 345bluepr<strong>in</strong>ts 68Booleanflags 115logic 127bounds of type variables 248box<strong>in</strong>g 53–54, 58, 89–90, 115, 119, 178<strong>in</strong> Java 110<strong>in</strong> Java 1.5 20of Nullable. See Nullable, box<strong>in</strong>g and unbox<strong>in</strong>gbraces 147, 234break<strong>in</strong>g changes from <strong>C#</strong> 1 to <strong>C#</strong> 2 144buffer 257buffer<strong>in</strong>g 265, 281, 308bugsdue to lack of generics 65us<strong>in</strong>g Stream 257bulk delete 319bus<strong>in</strong>esscode 174layer 348logic 187requirements 269rules 130Button 139, 193byte, range of values 113bytecode 110ByValArray 201by-value argument. See parameter pass<strong>in</strong>gCC ω 21, 24, 33, 44<strong>C#</strong>def<strong>in</strong>ition of events 40evolution 4Licensed to Rhona Hadida


INDEX 373<strong>C#</strong> (cont<strong>in</strong>ued)<strong>in</strong>fluences 18language dist<strong>in</strong>ction 25language specification 117version numbers 26<strong>C#</strong> 1comparison of delegate usage with <strong>C#</strong> 3 233delegate creation expressions 138event handler plumb<strong>in</strong>g 138iterator implementation 162patterns for null values 114revision of key topics 32–54<strong>C#</strong> 2delegate improvements 137–160fixed size buffers 199generics 63–111iterators 161–182namespace aliases 193nullable types 112–136partial types 184pragma directives 197separate property access modifiers 192static classes 190support for nullable types 120term<strong>in</strong>ology for nullable types 117, 120<strong>C#</strong> 3anonymous types 224automatically implemented properties 208–210collection <strong>in</strong>itializers 218–221expression trees 238–245extension methods 255–274implicitly typed arrays 223implicitly typed local variables 210, 215lambda expressions 230object <strong>in</strong>itializers 215–218overload<strong>in</strong>g 251–253partial methods 188query expressions 275–313two-phase type <strong>in</strong>ference 248type <strong>in</strong>ference 245–253<strong>C#</strong> 4 106, 358<strong>C#</strong> team 48C++ 20, 24, 83, 102, 108, 201C++/CLI 24C++0x 108cach<strong>in</strong>g of delegate <strong>in</strong>stances 237caller, parameter pass<strong>in</strong>g 52captured variables 150–159, 285extended lifetime 154<strong>in</strong>stantiations 155Cartesian product 304cascad<strong>in</strong>g <strong>in</strong>sertion 319cast 45Cast. See Standard Query Operators, Castcast<strong>in</strong>ganonymous methods 150as a necessary evil 9, 64migrat<strong>in</strong>g to generics 74necessity removed by generics 67, 178range variables 290reference identity 51type <strong>in</strong>ference 80, 223, 247unsafe abuse <strong>in</strong> C 44untyped datasets 334–335Catchphrase 271CCR. See Concurrency and Coord<strong>in</strong>ation Runtime(CCR)cha<strong>in</strong><strong>in</strong>gextension methods 265–266, 354iterators 265change track<strong>in</strong>g 318, 348characters 278checksum pragmas 198clarity 272clashes of names 194class declaration, absence of <strong>in</strong> snippets 28class libraries 255class. See reference typesclasses as generic types 67classification of variables 151CLI. See Common Language Infrastructure (CLI)Click 139, 142, 149Clone 46, 98clon<strong>in</strong>g 226closed types 69, 86closures 145, 151, 160CLR. See Common Language Runtime (CLR)codeas data 238bloat 108generation 115generators 184, 186, 188shape 222smells 136CodeDOM 239, 241, 244cod<strong>in</strong>gconventions 209standards 182style, harmony with language 4collection <strong>in</strong>itializers 8, 218–221, 235, 286, 355CollectionBase 46collections 56generic 66generic collections <strong>in</strong> .NET 2.0 96remov<strong>in</strong>g multiple elements 98specialized 46strongly typed vs weakly typed 45Licensed to Rhona Hadida


374INDEXCOM 211See also Component Object Model (COM)COM3 19Comb<strong>in</strong>e 38, 138command l<strong>in</strong>e 196commas, separat<strong>in</strong>g type parameters 68comments 158Common Language Infrastructureannotated standard 103designers 103specification 77, 120Common Language Infrastructure (CLI) 19, 25Common Language Runtime (CLR)anonymous types 226aspects of generics 85described 25fixed buffer allocation 199generics 88lack of change for .NET 3.5 207modifications for box<strong>in</strong>g of Nullable 118multi-language support 354nullable types 128type safety 45version numbers 26community 184, 213, 272<strong>in</strong>fluence on nullable type behavior 118Community Technology Preview (CTP)Parallel Extensions 349compact code 146, 232, 253compactness 146Compare 133Comparer 83Comparer.Default 134CompareTo 82Comparison 9, 138, 148, 236, 356comparisons 239<strong>in</strong> generics 82us<strong>in</strong>g the null coalesc<strong>in</strong>g operator 133with null, un<strong>in</strong>tended 126compatibility<strong>in</strong>ferred type parameters 249, 253of delegate signatures 55compatible declarations for partial types 185competition between technologies 22compilation errors 288, 294Compile method of LambdaExpression 241, 362compiled code 211compiler 25application of Attribute suffix 201apply<strong>in</strong>g complex transformations 181captured variable implementation 158check<strong>in</strong>g delegate return value types 147check<strong>in</strong>g value type mutation 116choices <strong>in</strong> overload<strong>in</strong>g 251code validation 111comb<strong>in</strong><strong>in</strong>g files for partial types 185creat<strong>in</strong>g extra classes for captured variables 155creat<strong>in</strong>g state mach<strong>in</strong>e for iterators 166creation of expression trees 244generat<strong>in</strong>g anonymous methods 146giv<strong>in</strong>g <strong>in</strong>formation with casts 53, 64handl<strong>in</strong>g of null 122handl<strong>in</strong>g of query expressions 276implementation-specific features 197<strong>in</strong>ferr<strong>in</strong>g array types 223<strong>in</strong>ferr<strong>in</strong>g delegate types 139<strong>in</strong>ferr<strong>in</strong>g local variable types 57, 211knowledge of framework 350knowledge of static classes 191lambda expression expansion 237logic for type <strong>in</strong>ference 80, 248prevent<strong>in</strong>g assignment of null to normal valuetypes 113prevent<strong>in</strong>g def<strong>in</strong>itely <strong>in</strong>valid casts 45quirks <strong>in</strong> iterator blocks 172remov<strong>in</strong>g degenerate query expressions 292responsibility for language improvements54, 208tim<strong>in</strong>g of lambda expression body check<strong>in</strong>g 251transform<strong>in</strong>g delegate shorthand 39translations of query expressions 284warn<strong>in</strong>gs 197compile-timecheck<strong>in</strong>g 65, 244, 325, 350knowledge of variable type 44lack of type safety with weakly typedcollections 46overload<strong>in</strong>g 83safety 245type 211–212, 223complete programs 28complex numbers 107complexity 145, 159, 184, 244, 314, 348<strong>in</strong> appropriate place 174of language 48Component Object Model (COM) 47computer science 231Concat. See Standard Query Operators, Concatconcatenation. See Standard Query Operators,concatenationconcepts <strong>in</strong> C++ templates 108conceptual model 348concurrency 324, 357Concurrency and Coord<strong>in</strong>ation Runtime(CCR) 178, 358conditional logical operators 127conditional operator 121, 129configuration files 187conflicts 324Licensed to Rhona Hadida


INDEX 375confusion 150, 222over captured variables 157connection management 318connection str<strong>in</strong>g 316, 336consistencycode formatt<strong>in</strong>g 147, 149language design 253of immutable types 39of <strong>in</strong>ferred type arguments 80of nullable equality rules 119of nullable syntax choice 121of query<strong>in</strong>g with LINQ 281, 325, 337, 344, 350Console 67, 194, 237, 320Constant 329constant collections 221constants 212<strong>in</strong> expression trees 243constra<strong>in</strong>t violations 319constra<strong>in</strong>ts on type parameters. See typeconstra<strong>in</strong>tsconstructed types 68, 92constructor parameters, anonymous types 226constructor type constra<strong>in</strong>ts 76constructors 108, 191generated 188<strong>in</strong> generic types 70of Nullable 116parameterless 77, 210, 217private <strong>in</strong> utility classes 190signatures other than parameterless 77static. See static constructorsXElement 339Conta<strong>in</strong>s. See Standard Query Operators, Conta<strong>in</strong>sConta<strong>in</strong>sKey 99context 151, 271, 287contextual keywords 298, 310partial 185cont<strong>in</strong>uation-pass<strong>in</strong>g style 180contracts <strong>in</strong> Spec# 21contravariance 103, 105message group conversions 141of delegates 55supported by CLR 106conventions 193, 209, 271, 273for events 142conversionof lists us<strong>in</strong>g delegates 72standard query operators. See Standard QueryOperators, conversionconversionsfrom method group to delegate <strong>in</strong>stance 55preference <strong>in</strong> overload<strong>in</strong>g 251ConvertAll 71, 96, 227, 267, 277, 281Converter 72, 175conveyor belt 278COOL 19coord<strong>in</strong>ation 178, 350Copy 257copy behavior for value/reference types 49CopyToDataTable 335corner cases 85corout<strong>in</strong>e 182correspondence between queries on different datasources 16Count 98Count. See Standard Query Operators, Countcovariance 103method group conversions 141of arrays 45of delegates 55of return type 46supported by the CLR 106CPU 88CreateQuery 327, 330criteria for match<strong>in</strong>g 97cross jo<strong>in</strong>s 303CTP. See Community Technology Preview (CTP)Current 90, 162, 164, 167, 172, 279Ddatacontexts 17, 318models 276pipel<strong>in</strong>e 161source 14sources 276transfer 51transfer type 57Data Access Strategy 348data b<strong>in</strong>d<strong>in</strong>g 283Data Connections 316data models 276DataAdapter 319databases 179, 269, 278, 313–314access 131column selection 228<strong>in</strong>dependence 346jo<strong>in</strong>s 297nullable fields 58, 114recreat<strong>in</strong>g 319data-centric approach 11DataReader 265, 278DataRow 335DataRowExtensions 334DataSet 265datasetstyped 335–338untyped 334–335DataTable 334–335Licensed to Rhona Hadida


376INDEXDataTableExtensions 334–335DateTime 51, 82, 116, 125, 283non-nullability 113DateTimeRange 301DBNull 114, 334–335debugger 198, 321declarationof local variables 211, 213of partial methods 189declarative 14declarative programm<strong>in</strong>g 313decompilation 123, 225deduction event handlers 139defaultaccess modifier 193mode of parameter pass<strong>in</strong>g 54default constructor 191of Nullable 116, 120, 122Default property of EqualityComparer andComparer 83default value expressions 81default values 172, 210for type parameters 75See also default value expressionsDefaultIfEmpty. See Standard Query Operators,DefaultIfEmptydeferred execution 264, 267, 279, 290, 303,312, 337def<strong>in</strong>ite assignment 210output parameters 81variables 65degenerate query expressions 291delegate creation expression 143delegate <strong>in</strong>stance 34–38, 284always referr<strong>in</strong>g to a method 35clums<strong>in</strong>ess of <strong>C#</strong> 1 54created by anonymous methods 146delegate keyword 146, 149for anonymous methods 10delegate type <strong>in</strong>ference 227delegate types 34compatibility 143declaration 34, 37<strong>in</strong>fererred by the compiler 139delegate, used for sort<strong>in</strong>g 9delegates 354action 35, 140, 142as generic types 67as reference types 49<strong>C#</strong> 2 improvements 137–160<strong>C#</strong> <strong>in</strong>vocation shorthand 36candidate signatures 35comb<strong>in</strong><strong>in</strong>g 38–40compil<strong>in</strong>g from expression trees 240–241contravariance 35, 55, 138covariance 138creat<strong>in</strong>g <strong>in</strong>stances 34creat<strong>in</strong>g one <strong>in</strong>stance from another 143events 40–41examples of comb<strong>in</strong>ation and removal 39exceptions 40function po<strong>in</strong>ter 33immutability 38<strong>in</strong> <strong>C#</strong> 1 33–42<strong>in</strong> the framework 138, 145, 153, 160<strong>in</strong>stances. See delegate <strong>in</strong>stances<strong>in</strong>vocation 34, 36–38, 138, 154Invoke method 36lambda expressions 232–238method group conversions 73specify<strong>in</strong>g behavior 33summary of <strong>C#</strong> 1 features 41target 140target of action 36types. See delegate typesused <strong>in</strong> List 96used <strong>in</strong> the CCR 181us<strong>in</strong>g for specialization 355DELETE 319Dequeue 100derivation type constra<strong>in</strong>ts 77example with default values 81restrictions 78DescendantNodes 342DescendantNodesAndSelf 342Descendants 342DescendantsAndSelf 342descend<strong>in</strong>g order<strong>in</strong>g 294design layers 353design overhead 255design patterns 161, 181, 222, 274designers 186DataSets 335LINQ to SQL 316, 319W<strong>in</strong>dows Forms 184developers 353development platform 26diagrams, LINQ to SQL 316dictionaries 219Dictionary 66, 99, 211Dictionary`2 93DictionaryEntry 67directories, LDAP 347disabl<strong>in</strong>g warn<strong>in</strong>gs 198disassembly 123discretion 208discussion groups 113Dispose 25, 92, 168, 171, 193Dist<strong>in</strong>ct. See Standard Query Operators, Dist<strong>in</strong>ctDLR. See Dynamic Language Runtime (DLR)Licensed to Rhona Hadida


INDEX 377Document Object Model (DOM) 338–339, 343documentation 273DOM. See Document Object Model (DOM)doma<strong>in</strong> specific languages 271doubly l<strong>in</strong>ked lists 101DSL. See doma<strong>in</strong> specific languagesduck typ<strong>in</strong>g, relationship to query expressions 286Dyer, Wes 270dynamic configuration 92Dynamic Language Runtime (DLR) 22, 355dynamic languages 24, 211, 354dynamic typ<strong>in</strong>g 42–43, 48, 65, 211, 354Eearly out, of iterator blocks 169ECMA 19, 85e-commerce application 113Effective Java 85efficiency 265–266, 280of value types 54elegance 136of static classes 190Element method 342element operations. See Standard QueryOperators, element operationsElementAt. See Standard Query Operators,ElementAtElementAtOrDefault. See Standard QueryOperators, ElementAtOrDefaultElements method 342ElementsAfterSelf 342ElementsBeforeSelf 342ElementType 326ellipsis, <strong>in</strong> snippets 29embedded objects 218emphasis 133empty str<strong>in</strong>gs 50, 67Empty. See Standard Query Operators, Emptyenabl<strong>in</strong>g warn<strong>in</strong>gs 198encapsulation 178, 184, 192, 216, 222, 267events 41of behavior <strong>in</strong> delegates 33of iteration 174of the iterator pattern 161of variables with<strong>in</strong> properties 6sequences 278enhanced for loop 20Enqueue 100entities 187, 316Entity Client 348Entity Framework. See ADO.NET EntityFrameworkentity <strong>in</strong>heritance strategies 348Entity SQL 348Enum 342enum. See enumerationsEnumerable 263–277, 314, 326, 364enumerationand iteration 162of a l<strong>in</strong>ked list 102of generic collections 86enumerations 316, 335as value types 49enums 20, 271environment context of a closure 151equality operators 119, 125on nullable types 125EqualityComparer 83Equals 53, 82–83, 99, 133, 226equals, contextual keyword 298equijo<strong>in</strong>s 303erasure, transparent identifiers 297Error List w<strong>in</strong>dow 198error, dur<strong>in</strong>g type <strong>in</strong>ference 248errors 150eSQL. See Entity SQLEvans, Eric 271event handlers 36, 138, 142, 148, 235, 237, 336See also eventsEventArgs 141–142, 186, 238EventHandler 55, 138, 141–142events 40–41, 108, 188, 235, 318logg<strong>in</strong>g 141no-op handlers 149subscrib<strong>in</strong>g with method group conversions 140Except. See Standard Query Operators, Exceptexceptions 81, 262, 273, 361thrown by delegate actions 40thrown by iterators 173thrown by nullable conversions 118thrown for <strong>in</strong>valid casts 51Exchange Server 2007 21exclusive OR operator for bool? 127exclusive ranges 175Execute 327, 329, 332ExecuteCommand 319executionmodel 244of static constructors 87plan 322executive summary of nullable type conversionsand operators 124Exists 96exit po<strong>in</strong>ts 148experimentation 66explicit conversions, XAttribute 342explicit <strong>in</strong>terface implementation 46, 90, 177explicit type arguments 79Licensed to Rhona Hadida


378INDEXexplicit typ<strong>in</strong>g 43, 214, 354lambda expression parameters 234range variables 289Express editions of Visual Studio 30Expression 239–240, 326expression to <strong>in</strong>itialize an object 217expression trees 238–245, 284, 325–326, 350, 358comb<strong>in</strong><strong>in</strong>g with IQueryProvider 327compilation 241parameters for Queryable methods 330ExpressionType 239expressive 3expressiveness 111, 122, 154, 162of <strong>C#</strong> 3 355extended type 258extend<strong>in</strong>g the world 270Extensible Application Markup Language(XAML) 187extension methods 10, 57, 255–274call<strong>in</strong>g 259compiler discovery 261declaration 258Enumerable 285, 354extended type 258guidel<strong>in</strong>es 270LINQ to XML 342Queryable 325–326restrictions 258ExtensionAttribute 261extern aliases 194, 196FF# 354factory methods 239FakeQuery 328FakeQueryProvider 328false operator 125fetch<strong>in</strong>g strategies 348Field 335field-like events 40fields 216back<strong>in</strong>g anonymous type properties 226FIFO. See first <strong>in</strong> first outfiles, spread<strong>in</strong>g a type across multiple 185FileStream 257films 236filter expressions 291filter<strong>in</strong>g<strong>in</strong> SQL 321LINQ to Objects 265, 279, 290LINQ to XML 342lists 15, 236, 356Filter<strong>in</strong>g. See Standard Query Operators, filter<strong>in</strong>gfilters 161f<strong>in</strong>alizers 108f<strong>in</strong>ally 92f<strong>in</strong>ally blocks<strong>in</strong> iterator blocks 170–172restrictions <strong>in</strong> iterator blocks 166F<strong>in</strong>dAll 96, 153, 236, 356F<strong>in</strong>dFirst 96F<strong>in</strong>dLast 96first <strong>in</strong> first out 100First. See Standard Query Operators, FirstFirstOrDefault. See Standard Query Operators,FirstOrDefaultfixed 199fixed type variables 248FixedBufferAttribute 199FixedSize 98fix<strong>in</strong>g of type parameters 250flagfor fak<strong>in</strong>g nullity 13to dist<strong>in</strong>guish null values 115Flex 22flexibility 175, 281, 344, 348Flickr 344flow of iterator blocks 167fluent <strong>in</strong>terfaces 271fluff 12, 28, 55, 219for loop 157, 159, 162, 174implicit typ<strong>in</strong>g 213ForEach 146, 236foreachand anonymous types 225and captured variables 159deferred execution 303implicit cast<strong>in</strong>g 9implicit typ<strong>in</strong>g 213iterator pattern 162nested collections 301over generic collections 90sequence diagrams 168, 280foreach loop 170, 173ForEach method 97foreach statement 181foreign keys 300, 316formatt<strong>in</strong>gcode 149LINQ to XML source code 339us<strong>in</strong>g delegates 355with<strong>in</strong> DataTable.Select 334forward references 185Fowler, Mart<strong>in</strong> 271framework 128, 352framework libraries 25, 207absence of range type 174frequency count<strong>in</strong>g 66friend assemblies 201–204from clauses. See query expressions, from clausefrustration with <strong>C#</strong> 1 144Licensed to Rhona Hadida


INDEX 379fully qualified type name 197Func 56, 232, 288function 151functional 136, 160cod<strong>in</strong>g style 98, 244languages 231programm<strong>in</strong>g 235, 313, 353G.g.cs file extension for generated code 187garbage collectionanonymous methods 155, 159fixed-size buffers 201iterator blocks 173language requirements 25of delegate <strong>in</strong>stances 36value types 51, 115garbage collector 88performance penalty for excessive box<strong>in</strong>g 54generation. See Standard Query Operators,generationgeneric collections 289generic delegate types 72generic methods 67, 71, 74, 79, 106, 258<strong>in</strong> non-generic types 73generic type def<strong>in</strong>ition 92, 94generic type <strong>in</strong>ference used withSystem.Nullable 120generic types 67nest<strong>in</strong>g 86restriction on extension methods 258generics 56, 58–111benefits vs complexity 85delegate types 145experiment<strong>in</strong>g with 74implementation 85implement<strong>in</strong>g your own 81<strong>in</strong> Java 20lack of contravariance 103lack of covariance 103lack of generic properties and <strong>in</strong>dexers 108limitations 102native code shar<strong>in</strong>g 88nooks and crannies 85overload<strong>in</strong>g by type parameter numbers 70performance benefits 88read<strong>in</strong>g generic declarations 71type declaration syntax 69used for anonymous types 225GetConsoleScreenBufferEx 200GetEnumerator 90, 162–163, 169, 327, 332GetGenericArguments 94GetGenericTypeDef<strong>in</strong>ition 94GetHashCode 53, 83, 99, 226GetMethods 95GetNextValue 175GetRange 98getter 192GetType 94GetValueOrDefault method of Nullable. SeeNullable, GetValueOrDefault methodglobal namespace alias 194–195golden rule of captured variables 159grammar 271Groovy 22, 48, 272, 355group … by clauses. See query expressions, group… by clausesgroup jo<strong>in</strong>s 301, 306GroupBy. See Standard Query Operators, GroupBygroup<strong>in</strong>g 269, 307See also Standard Query Operators, group<strong>in</strong>gGroupJo<strong>in</strong> 333See also Standard Query Operators, GroupJo<strong>in</strong>GTK# 26guesswork 66GUID 199guidel<strong>in</strong>esfor captured variables 158for events 55, 142Gunnerson, Eric 107gut feel<strong>in</strong>gs 214, 273Guthrie, Scott 321Hhandwritten code 190, 226hash code, anonymous types 226hash codes 85HashSet, <strong>in</strong> Java 110Hashtable 46, 66–67, 99, 131HasValue 13See also Nullable, HasValue propertyheap 51–52, 54, 155, 201allocation of wrappers 115Hejlsberg, Anders 107helper class for partial comparisons 134helper methods 190Hewlett-Packard 19Hibernate Query Language (HQL) 345Hibernate 22, 345hierarchy 222higher-order functions 235history 18hooks 188Hotspot 18HQL. See Hibernate Query Language (HQL)HTML 24Human Computer Interfaces 358human nature 353hyperbole 275Licensed to Rhona Hadida


380INDEXIIBM 20ICloneable 46, 100ICollection 220IComparable 8, 53, 77–78, 82, 175IComparer 8–9, 85, 99, 105, 138, 356icon, for extension methods 260IDE. See Integrated Development Environment(IDE)identifiers 283, 287identity management 318, 348idioms 131, 233, 277changes regard<strong>in</strong>g delegates 54IDisposable 25, 92, 162, 223IEnumerable 335IEnumerablecollection <strong>in</strong>itializers 219explicit <strong>in</strong>terface implementation 90, 177extension methods 58, 263, 277–278, 286generic form 70, 90iterator blocks 165, 168iterator pattern 161LINQ and non-generic iterators 290parallelization 349relation to IQueryable 326, 330specified behavior 172IEnumerator 90, 161, 168, 171–172IEqualityComparer 83, 99, 361, 363IEquatable 82, 85IGroup<strong>in</strong>g 270, 307, 365IHashCodeProvider 99IL. See Intermediate Language (IL)ildasm 123, 128, 146, 172IList 46, 96, 145, 256ILookup 361immediate execution 281, 303immutability 240anonymous types 226of delegates 38of Nullable 116of ranges 177impedance mismatch 276, 351imperative programm<strong>in</strong>g 313implementation 346of captured variables 158of partial methods 189implicitconversion from method groups todelegates 140conversion operators 115conversions 230type arguments 79implicit conversionoverload<strong>in</strong>g 251str<strong>in</strong>g to XName 338type <strong>in</strong>ference 248implicit typ<strong>in</strong>g 43, 57, 219, 354arrays 223, 246, 248guidel<strong>in</strong>es 213local variables 210, 215, 235, 267parameter lists 234, 246restrictions 212Visual Basic 47<strong>in</strong>clusive OR operator for bool? 127<strong>in</strong>clusive ranges 175Increment 210<strong>in</strong>dentation 222<strong>in</strong>dexer 99, 102, 108separate access modifiers 193IndexOfKey 101IndexOfValue 101<strong>in</strong>direction 33, 38, 139<strong>in</strong>ferred return types 247, 253role <strong>in</strong> overload<strong>in</strong>g 252<strong>in</strong>formation loss 114<strong>in</strong>heritance 184, 255, 355prohibited for value types 51<strong>in</strong>itializationof local variables 211, 213order<strong>in</strong>g <strong>in</strong> partial types 186<strong>in</strong>itialization expression 211, 213<strong>in</strong>itobj 123<strong>in</strong>l<strong>in</strong>ecode for delegate actions. See anonymousmethodsfilter<strong>in</strong>g 265<strong>in</strong>itialization 221<strong>in</strong>l<strong>in</strong><strong>in</strong>g 109<strong>in</strong>-memory list 281<strong>in</strong>ner jo<strong>in</strong>s 297, 306<strong>in</strong>ner sequences 297INSERT 319InsertAllOnSubmit 319<strong>in</strong>stance variables created <strong>in</strong> iterators from localvariables 167<strong>in</strong>stantiation of captured variables 155<strong>in</strong>st<strong>in</strong>cts 273<strong>in</strong>t 52<strong>in</strong>teger literals 214Integrated Development Environment (IDE) 111Intel 19<strong>in</strong>tellectual curiosity 358Intellisense 47, 64, 260, 263, 325<strong>in</strong>terfaces 90, 177, 191, 220, 226, 255, 285, 356and extension methods 58as generic types 67as reference types 49explicit implementation 46implemented by partial types 185Licensed to Rhona Hadida


INDEX 381<strong>in</strong>terfaces (cont<strong>in</strong>ued)no covariant return types 46specify<strong>in</strong>g multiple <strong>in</strong> derivation typeconstra<strong>in</strong>ts 78Interlocked 210Intermediate Language (IL) 25code as data 238compiled forms of LINQ queries 277disassembl<strong>in</strong>g to discover implementationdetails 123, 146, 156, 172, 189, 225generic type names 93<strong>in</strong>struction to obta<strong>in</strong> MethodInfo directly 244relation to Java bytecode 110support for non-virtual calls 262when us<strong>in</strong>g ? modifier 121<strong>in</strong>ternal access modifier 201InternalsVisibleTo 201–204<strong>in</strong>teroperability 49, 358<strong>in</strong>terpreted 18Intersect. See Standard Query Operators, Intersect<strong>in</strong>to. See query expressions, jo<strong>in</strong> ... <strong>in</strong>to andcont<strong>in</strong>uations<strong>in</strong>tuition 138InvalidCastException 53InvalidOperationException 116, 124<strong>in</strong>variance 103Inversion of Control 92<strong>in</strong>vocation of delegates 36–38Invoke 36, 239IParallelEnumerable 349IQueryable 326–328, 333IQueryable 261, 263, 286, 326–333, 360IQueryProvider 326–328, 333IronPython 22, 24, 355IronRuby 355IsGenericMethod 95IsGenericType 93–94IsGenericTypeDef<strong>in</strong>ition 94IShape 105IsInterned 245IsNullOrEmpty 262ITask 180iterable 162iterationmanual vs foreach 172over generic collections 90iterator blocks 165–173, 177, 179, 258, 358Microsoft implementation 172sequence of events 168us<strong>in</strong>g method parameters 169yield type 166iterator pattern 161iterators 161–182cha<strong>in</strong><strong>in</strong>g together 265implement<strong>in</strong>g <strong>in</strong> <strong>C#</strong> 1 162iterator blocks. See iterator blocksJrestart<strong>in</strong>g with<strong>in</strong> iterator block 167yield statements. See yield statementsJ++ 19jargon 68Java 102, 190, 345, 3551.5 features 20<strong>in</strong>fluence on .NET 22<strong>in</strong>fluence over .NET 18Java Virtual Mach<strong>in</strong>e (JVM) 19Javascript 22JavaServer Pages 18JIT. See Just-In-Time (JIT) compilerjo<strong>in</strong> … <strong>in</strong>to clauses. See query expressions, jo<strong>in</strong> ...<strong>in</strong>to clausesjo<strong>in</strong> clauses. See query expressions, jo<strong>in</strong> clausesJo<strong>in</strong>. See Standard Query Operators, Jo<strong>in</strong>jo<strong>in</strong>s 15, 297LINQ to SQL 322LINQ to XML 343standard query operators. See Standard QueryOperators, jo<strong>in</strong>sJust-In-Time (JIT) compiler 18, 25, 201behavior with generics 65, 82, 88, 107treat<strong>in</strong>g code as data 238JVM. See Java Virtual Mach<strong>in</strong>e (JVM)KKeyproperty of DictionaryEntry 67property of IGroup<strong>in</strong>g 307keyof group<strong>in</strong>g 307of jo<strong>in</strong>s 297key selectors 297key/value pair 101KeyPress 139KeyPressEventArgs 141KeyPressEventHandler 140–141Keys 101keys, <strong>in</strong> dictionaries 84keystrokes 214KeyValuePair 67, 108, 220Llambda calculus 231lambda expressions 10, 56, 137, 230–254, 277body 234, 246, 251built from query expressions 284–285, 288,290, 303, 330conversion to expression trees 241Licensed to Rhona Hadida


382INDEXlambda expressions (cont<strong>in</strong>ued)<strong>in</strong> fluent <strong>in</strong>terfaces 271readability 356role <strong>in</strong> LINQ 325second phase of type <strong>in</strong>ference 249simplicity 12use when filter<strong>in</strong>g 265Lambda method of Expression class 240LambdaExpression 240, 362language changes 208language designers 113, 128, 136, 183, 204, 253Language INtegrated Query (LINQ) 14, 21,56–57, 288–352<strong>C#</strong> 2 as stepp<strong>in</strong>g stone 160–161core concepts 275–281deferred execution 279functional style 354jo<strong>in</strong>s 297Standard Query Operators. See Standard QueryOperatorstype <strong>in</strong>ference 251use of anonymous types 224use of expression trees 239, 244use of extension methods 256, 263, 265use of lambda expressions 231, 235use of projection <strong>in</strong>itializers 226language specificationcollection <strong>in</strong>itializers 220embedded objects 218extended type term<strong>in</strong>ology 258generics 85iterator blocks 172method group conversions 141overload resolution 251pragmas 197query expression pattern 285transparent identifiers 295, 297type <strong>in</strong>ference 248language-lawyer 80languagesadd<strong>in</strong>g functionality on nullable types 119different behavior for nullable types 128last <strong>in</strong> first out 100Last. See Standard Query Operators, LastLastOrDefault. See Standard Query Operators,LastOrDefaultlaz<strong>in</strong>ess 214lazy code generation 88lazy load<strong>in</strong>g 325, 337LDAP. See Lightweight Directory Access Protocolleaf nodes 240leaky abstraction 346learn<strong>in</strong>g 358left outer jo<strong>in</strong>s 301, 323Length 295LessThan 239let clauses. See query expressions, let clausesLibraries 25library changes 208library design 81lifetime of captured variables 154LIFO last <strong>in</strong> first out 100lifted conversions 124lifted operators 125, 128Lightn<strong>in</strong>g 19Lightweight Directory Access Protocol 347limitations of generics 102l<strong>in</strong>e breaks 146L<strong>in</strong>kedList 101LINQ <strong>in</strong> Action 18, 315, 344LINQ providers 244, 276–277, 281, 292,314, 333, 344LINQ. See Language INtegraged Query (LINQ)LINQ to Active Directory 347LINQ to Amazon 18, 344LINQ to DataSet 315, 334–338LINQ to Entities 348LINQ to NHibernate 345LINQ to Objects 275–313lambda expressions 244us<strong>in</strong>g with LINQ to DataSet 333–334, 337us<strong>in</strong>g with LINQ to XML 343LINQ to SharePo<strong>in</strong>t 347LINQ to SQL 17, 22, 314–326compared with Entity Framework 348debug visualizer 321implicit jo<strong>in</strong>s 323<strong>in</strong>itial population 318–319jo<strong>in</strong>s 322model creation 315–318queries 319–324updat<strong>in</strong>g data 324–325use of expression trees 244use of IQueryable 330LINQ to XML 16, 315, 338–344LINQ. See Language INtegraged Query (LINQ)Lippert, Eric 106Listcollection <strong>in</strong>itializers 219comparison with ArrayList 88, 96ConvertAll 71, 267, 277Enumerable.ToList 360F<strong>in</strong>dAll 153ForEach 146lambda expressions 235–236unstable sort 369use of delegates <strong>in</strong> .NET 2.0 160, 356Licensed to Rhona Hadida


INDEX 383List`1 93lists 278local variables 151, 155, 285declared <strong>in</strong> anonymous methods 146implicit typ<strong>in</strong>g. See impicitly typed local variablesseparate <strong>in</strong>stances 155stored <strong>in</strong> iterators 166lock 209lock statement 181lock<strong>in</strong>g 192log files 306logg<strong>in</strong>g 188, 192, 228, 237, 318, 325, 332logic 239logic gates, <strong>in</strong> electronics 127logical AND operator for bool? 127logical model 348LongCount. See Standard Query Operators,LongCountloop<strong>in</strong>g <strong>in</strong> anonymous methods 146loopsand anonymous methods 159captured variables 156<strong>in</strong> lambda expressions 234lowest common denom<strong>in</strong>ator 32Mmacros 20, 108magic valueas substitute for null 114for fak<strong>in</strong>g nullity 13Ma<strong>in</strong> 28ma<strong>in</strong>stream 354ma<strong>in</strong>ta<strong>in</strong>ability 159ma<strong>in</strong>tenance 273MakeGenericMethod 95MakeGenericType 94management 353Mann<strong>in</strong>g 315, 344mapp<strong>in</strong>g ADO.NET Entity Framework 348Marguerie, Fabrice 344marshaller 199match<strong>in</strong>g with predicates 236Math 190mathematical comput<strong>in</strong>g 106Max. See Standard Query Operators, Maxmaybe sense of null bool? value 128mcs 199MemberAccess 239memory 50–52, 114cost of l<strong>in</strong>ked lists 102MemoryStream 51, 88, 142, 257mental process 122metadata 110, 315, 319meta-provider 350method argumentscomb<strong>in</strong><strong>in</strong>g for type <strong>in</strong>ference 248compared with type arguments 69<strong>in</strong> expression trees 243used <strong>in</strong> type <strong>in</strong>ference 80method calls<strong>in</strong> expression trees 243result used for implicit typ<strong>in</strong>g 212method cha<strong>in</strong><strong>in</strong>g 354method exit po<strong>in</strong>ts 169method groups 140, 230, 244, 246ambiguous conversions 141convert<strong>in</strong>g <strong>in</strong>to delegates 140covariance and contravariance 141–144implicit typ<strong>in</strong>g 212second phase of type <strong>in</strong>ference 249method parameterscompared with type parameters 69contravariance 105used <strong>in</strong> iterator blocks 169method signatures, read<strong>in</strong>g generics 72MethodInfo 95, 243MethodInvoker 143methods as value type members 51metrics 269micro-optimization 266Microsoft<strong>C#</strong> compiler implementation 158, 172,197–198, 237design guidel<strong>in</strong>es 93, 95language designers 120, 184, 358language specification 85LINQ 338, 344, 347–350reasons for creat<strong>in</strong>g .NET 18, 20, 24, 31Research 21, 354Microsoft Robotics Studio 178M<strong>in</strong>. See Standard Query Operators, M<strong>in</strong>m<strong>in</strong>dset 271m<strong>in</strong>dshare 20m<strong>in</strong>i-CLR 22M<strong>in</strong>Value of DateTime 114Miscellaneous Utility Library 174, 257, 349MiscUtil. See Miscellaneous Utility Librarymiss<strong>in</strong>g features of Stream 257miss<strong>in</strong>g keys <strong>in</strong> Hashtables 132mistakes 352ML 231, 354mock<strong>in</strong>g 272Mono 20, 25, 199Monty Python 221MouseClick 139MouseEventArgs 141MouseEventHandler 55MoveNext 162, 164, 166, 169, 172, 279Licensed to Rhona Hadida


384INDEXMSDNconfus<strong>in</strong>g jo<strong>in</strong> term<strong>in</strong>ology 297EventArgs 141expression trees 239–240extension methods 10generic reflection 93–95LINQ standard query operators 281LINQ to XML 338, 342multi-core processors 357Multiply 239multi-thread<strong>in</strong>g 116, 178, 349mutability wrappers for value types 115myths 48, 51NN+1 selects problem 325name/value pairs 226namespace aliases 193–194namespace hierarchy 195namespaces 261for extension methods 262, 273XML 338nam<strong>in</strong>g 356conventions for type parameters 71NAND gates 127native code 49, 87, 199, 239native language of LINQ provider targets 244natural language 271natural result type 132navigation via references 300nested classes, restriction on extensionmethods 258nested types 191partial types 185used for iterator implementation 164.NET<strong>in</strong>fluences 18term<strong>in</strong>ology 24version numbers 27.NET 1.0 20.NET 1.1 20–21.NET 2.0 21, 201, 232, 261.NET 3.0 21.NET 3.5 21, 208, 232, 261, 266, 277, 315, 344extension methods 263, 289.NET Passport 25NetworkStream 257new 224anonymous types 57with constructor type constra<strong>in</strong>ts 77NHibernate 345Criteria API 345nodes 101NodeType 239non-abstract 76non-nullable constra<strong>in</strong>t 323NOR gates 127Norås, Anders 272Novell 20nullargument when <strong>in</strong>vok<strong>in</strong>g static methods 95as empty delegate <strong>in</strong>vocation list 39as failure value 131behavior as an argument value 53comparisons <strong>in</strong> generics 81–82DBNull 114, 335default value for reference types 82extension methods 262implicit typ<strong>in</strong>g 212<strong>in</strong>itial value <strong>in</strong> iterator blocks 172nullable types 13, 58, 112–136returned by Hashtable <strong>in</strong>dexer 99semantic mean<strong>in</strong>gs 114target of extension methods 262–263, 273type <strong>in</strong>ference 223null coalesc<strong>in</strong>g operator 128, 133use with reference types 130null literal 122null value 335See also nullable types, null valueNullable class 119Compare method 120Equals method 120GetUnderly<strong>in</strong>gType method 120nullable logic 127–128nullable types 58–59, 112–136, 258, 335CLR support 118–119conversions 124conversions and operators 124–128framework support 115–118<strong>in</strong> generic comparisons 82logical operators 127not valid with value type constra<strong>in</strong>ts 76novel uses 131–136null value 120operators 125syntactic sugar <strong>in</strong> <strong>C#</strong> 2 120underly<strong>in</strong>g type 116, 120Value property 127Nullable 13, 116equality 119Equals behavior 119explicit conversion to non-nullable type 117GetHashCode behavior 116HasValue property 116implicit conversion from non-nullable type 117struct 116ToStr<strong>in</strong>g behavior 117Licensed to Rhona Hadida


INDEX 385Nullable (cont<strong>in</strong>ued)Value property 116wrapp<strong>in</strong>g and unwrapp<strong>in</strong>g 117nullity check 149NullReferenceException 82, 118–119, 262numbers 272numeric constra<strong>in</strong>ts 106numeric types 125Oobject creation 215object <strong>in</strong>itializers 215–218, 283, 355object orientation, <strong>in</strong>direction 38Object Relational Mapp<strong>in</strong>g (ORM) 22, 351Entity Framework 348LINQ to SQL 314, 325NHibernate 345use of partial types 187Object Services 348object-orientation 20object-oriented data model 276, 300objectslocation <strong>in</strong> memory 51parameter pass<strong>in</strong>g 52ObjectSpaces 22obscure code 214Office 2007 21OfType. See Standard Query Operators, OfTypeopen source 345open types 68closed at execution time 93operands, of null coalesc<strong>in</strong>g operator 128operator constra<strong>in</strong>ts 106operators 108, 191default. See default value expressionstypeof. See typeof operatoroptimization 109, 242, 266of jo<strong>in</strong> ... select 300Option Explicit 47Option Strict 47, 355optional fields 130Oracle 346OrDefault 363OrderBy 10, 237orderby clauses. See query expressions, orderbyclausesOrderBy. See Standard Query Operators, OrderByOrderByDescend<strong>in</strong>g. See Standard QueryOperators, OrderByDescend<strong>in</strong>gorder<strong>in</strong>g 15, 267, 290group<strong>in</strong>g 308of object <strong>in</strong>itialization 217ORM. See Object Relational Mapp<strong>in</strong>g (ORM)out 234, 258outer sequences 297outer variables 151output parameters 81, 131Output w<strong>in</strong>dow 198over-eng<strong>in</strong>eer<strong>in</strong>g 136overkill 12overloaded methods and generics 83overload<strong>in</strong>g 290ambiguity 140, 252changes <strong>in</strong> <strong>C#</strong> 3 251–253extension methods 261GroupBy 310of Add method 220PP/Invoke 199pairs of values 83parallel comput<strong>in</strong>g 358Parallel Extensions 349, 357Parallel LINQ (PLINQ) 315, 349, 357ParallelEnumerable 349parallelism 179, 357parallelization 313parameter 52covariance 230pass<strong>in</strong>g 51–52type contravariance XE 47type of object 64types <strong>in</strong> anonymous method declarations 146parameter wildcard<strong>in</strong>g 149ParameterExpression 243parameterized typ<strong>in</strong>g. See genericsParameterizedThreadStart 150, 153parameterless constructors 76, 217entity classes 318parameters 151for delegates 35<strong>in</strong> expression trees 243positional 217params 20parentheses 234, 339Parse 342partial 185classes 317comparisons 134methods 188, 318modifier 189types 184–190, 258PartialComparer 135partition of the Common Language Infrastructurespecification 25partition<strong>in</strong>g. See Standard Query Operators,partition<strong>in</strong>gpass by reference. See parameter pass<strong>in</strong>gLicensed to Rhona Hadida


386INDEXpatterns 131, 136for comparison 135paus<strong>in</strong>g, apparent effect <strong>in</strong> iterators 167PDC. See Professional Developers Conferencepeek 100performance 48, 111, 149, 190, 265–266, 291, 341improvements due to generics 65of value types 51permissions 203persistence layer 334personal preference for conditional operator 129personality 214philosophy of nullability 112photocopy<strong>in</strong>g, comparison with value types 49p<strong>in</strong>n<strong>in</strong>g 201pipel<strong>in</strong>es 266placeholders, type parameters 68PLINQ. See Parallel LINQ (PLINQ)Po<strong>in</strong>t, as reference type or value type 50po<strong>in</strong>ters 199pollution of classes with methods fordelegates 145Pop 100portable 19positional parameters 217Postgres 346pragma directives 197, 199Predicate 97, 147, 236predicates 97, 291preprocess<strong>in</strong>g query expressions 276, 284,286, 290PreserveOrder<strong>in</strong>g 350primary keys 300primitive types 199private access modifier 193private variables, back<strong>in</strong>g properties 209Process 222ProcessStartInfo 222producer/consumer pattern 101production assemblies 202productivity 214, 273Professional Developers Conference 19, 21project management 269project plan 175projection expressions 287, 307projection <strong>in</strong>itializers 227, 269projection. See Standard Query Operators,projectionprojections 266, 283, 300, 303, 342, 356properties 57, 108, 212, 215, 340anonymous types 226automatic implementation 7automatically implemented. See automaticallyimplemented propertiescomparison with events 40differ<strong>in</strong>g access for getter/setter 6forced equality of access <strong>in</strong> <strong>C#</strong> 1 6<strong>in</strong> value types 51separate access modifiers 192Properties w<strong>in</strong>dow 197property descriptors 238property evaluations <strong>in</strong> expression trees 243provability 358pseudo-code 306pseudo-synchronous code 178public key 203token 203public variables 209Push 100Python 48Qqualified namespace aliases 194quantifiers. See Standard Query Operators,quantifiersquery cont<strong>in</strong>uations 310query expression pattern 285query expressions 14, 275–314consistency across data sources 337cont<strong>in</strong>uations 310embedded with<strong>in</strong> XElement constructors 340from clauses 283, 303, 367functional style 354jo<strong>in</strong> ... <strong>in</strong>to clauses 301let clauses 295, 321LINQ to SQL 325nested 341nest<strong>in</strong>g 300orderby clauses 292Queryable extension methods 330range variables. See range variablesreadability 357select clauses 283, 287, 292, 300, 303translations 284transparent identifiers. See transparentidentifierstype <strong>in</strong>ference 246where clauses 290, 321, 332, 368query parameters 320query translation 318Queryable 263, 266, 326, 328, 330, 333, 362query<strong>in</strong>gloop<strong>in</strong>g <strong>in</strong> <strong>C#</strong> 1 11separat<strong>in</strong>g concerns <strong>in</strong> <strong>C#</strong> 2 11QueryOptions 350QueryProvider 326Queue 100quirks <strong>in</strong> iterator block implementation 172Licensed to Rhona Hadida


INDEX 387RRails 22random access 102random numbers 278range 174required functionality 174range variables 287, 290, 295, 300, 304, 306, 310explicitly typed 289Range. See Standard Query Operators, RangeRange 264, 268Read 257readability 12, 208anonymous methods 138, 151, 159, 230<strong>C#</strong> 3 <strong>in</strong>itializers 217, 222–223C++ templates 108extension methods 260, 263, 266, 270, 272generic method type <strong>in</strong>ference 80generics 111implicit typ<strong>in</strong>g 213lambda expressions 10, 232, 234namespace aliases 193null coalesc<strong>in</strong>g operator 130–131, 133of implementation 356of nullable type syntax 120of results 356query expressions 291, 308, 312ReadFully 257ReadOnly 98read-onlyautomatically implemented properties 210properties 215recursion, XElement constructor 339redundancy of specify<strong>in</strong>g type arguments 79ref 234, 258refactor<strong>in</strong>g 21, 85, 133, 187, 316, 356reference 113obta<strong>in</strong>ed via box<strong>in</strong>g 53passed a a parameter 52reference parameters 210reference types 48, 75use with null coalesc<strong>in</strong>g operator 130ReferenceCompare 135reflection 77, 92, 238, 244, 340, 355Reflector 123, 146, 172, 292register optimizations 88relational data model 300relational model 276relational operators 125on nullable types 125reliability 21remote execution 245Remove 38, 138RemoveAll 97RemoveAt 101Repeat 98Repeat. See Standard Query Operators, RepeatReset 172resource requirements 175responsibilities <strong>in</strong> design 187restor<strong>in</strong>g warn<strong>in</strong>gs 198restra<strong>in</strong>t 208return keyword, anonymous methods 148return statements 169, 233, 253prohibited with<strong>in</strong> iterator blocks 166return typecovariance 46<strong>in</strong>ference 288of object 64requirement for delegate match<strong>in</strong>g 35return types 108covariance 141covariance for delegates 143lambda expressions 233methods differ<strong>in</strong>g only by 90of partial methods 189type <strong>in</strong>ference 247return values 131Reverse. See Standard Query Operators, Reverserevers<strong>in</strong>g 265of ranges 175RIA. See Rich Internet Applications (RIA)Rich Internet Applications (RIA) 22, 24, 358richness of libraries <strong>in</strong>creas<strong>in</strong>g with extensionmethods 271right associativity of null coalesc<strong>in</strong>g operator 130Robotics Studio 178robustness 21, 194Ruby 22, 48runtime 25cast check<strong>in</strong>g 51Ssafety 111sample data 283scalability 181schema 316LDAP 347Scheme 151scientific comput<strong>in</strong>g 106scope 151, 155of f<strong>in</strong>ally blocks <strong>in</strong> iterators 171range variables 311script<strong>in</strong>g 24SDK. See Software Development Kit (SDK)sealed 255implicit for static classes 191sealed types 190security 203Licensed to Rhona Hadida


388INDEXsecurity permissions 319seed aggregation 359select clauses. See query expressions, select clausesSelect, method of DataTable 334Select. See Standard Query Operators, SelectSelectMany. See Standard Query Operators,SelectManysemi-colons 234sender events 237sequence diagram 279SequenceEqual. See Standard Query Operators,SequenceEqualsequences 278, 290embedded 301flatten<strong>in</strong>g 306generated by multiple from clauses 306<strong>in</strong>f<strong>in</strong>ite 278, 294<strong>in</strong>ner and outer 297untyped 289, 361serialization 226Server Explorer 316, 335Service Oriented Architecture (SOA) 358servlet 18sessions, NHibernate 346set-based operations. See Standard QueryOperators, set-based operationsSetRange 98setter 192SharePo<strong>in</strong>t 344shar<strong>in</strong>g variables between anonymousmethods 158short-circuit<strong>in</strong>g operators 127, 133shorthand for nullable types 59signaturescomplicated generics 250of event handlers 142of lifted operators 125of methods 35overload<strong>in</strong>g <strong>in</strong> <strong>C#</strong> 3 251real vs effective 288signed assemblies 203sign-off 175Silverlight 22simplicity 151, 159, 175, 214s<strong>in</strong>gle <strong>in</strong>heritance of implementation 256s<strong>in</strong>gle responsibility 163, 187S<strong>in</strong>gle. See Standard Query Operators, S<strong>in</strong>gleS<strong>in</strong>gleOrDefault. See Standard Query Operators,S<strong>in</strong>gleOrDefaults<strong>in</strong>gleton pattern 190SkeetySoft 282, 289, 307Skip. See Standard Query Operators, SkipSkipWhile. See Standard Query Operators,SkipWhileslopp<strong>in</strong>ess 353snippets 28–30, 264expansions 28mean<strong>in</strong>g of ellipsis 29Snippy 30, 144, 187, 241, 263SOA. See Service Oriented Architecture (SOA)software development 352Software Development Kit (SDK) 123software eng<strong>in</strong>eer<strong>in</strong>g 274Sort 105, 148, 236, 369SortedDictionary 101SortedList 101SortedList 101sort<strong>in</strong>g 133, 265, 267, 356files by name and size 148<strong>in</strong> <strong>C#</strong> 1 8See also Standard Query Operators, sort<strong>in</strong>gsource assembly 201source code 25available to download 30source files 198Spec# 21special types, restrictions <strong>in</strong> derivation typeconstra<strong>in</strong>ts 78specialization 255, 356of templates 109specification 25Common Language Infrastructure 25See also language specificationspeculation 352SQL 15, 22, 275–276, 351behavior of NULL values 128jo<strong>in</strong>s 294, 300–301, 304, 322LINQ to SQL 244, 314, 325logg<strong>in</strong>g 318order<strong>in</strong>g of clauses 286SQL Server 346, 348SQL Server 2005 21, 315Express edition 316SQL Server 2008 348SQL Server Management Studio Express 322square brackets 94square root 267stable sort 369stack 50, 52, 54, 155–156, 201stack frame 152, 155Stack 100Standard Query Operators 274, 281, 327,343, 359–370Aggregate 359aggregation 332, 359All 367Any 367AsEnumerable 360AsQueryable 360Average 333, 359Licensed to Rhona Hadida


INDEX 389Standard Query Operators (cont<strong>in</strong>ued)Cast 289, 326, 361Concat 360concatenation 360Conta<strong>in</strong>s 367conversion 360Count 301, 333, 359DefaultIfEmpty 323Dist<strong>in</strong>ct 368element operations 363ElementAt 363ElementAtOrDefault 363Empty 364equality operations 363Except 368filter<strong>in</strong>g 368First 363FirstOrDefault 363generation 364GroupBy 269, 308, 365group<strong>in</strong>g 365GroupJo<strong>in</strong> 303, 365Intersect 368Jo<strong>in</strong> 365jo<strong>in</strong>s 365Last 363LastOrDefault 363LongCount 359Max 359M<strong>in</strong> 359OfType 289, 326, 361, 368OrderBy 267, 271, 294, 369OrderByDescend<strong>in</strong>g 267, 294, 369partition<strong>in</strong>g 366quantifiers 367Range 264, 364Repeat 364Reverse 264, 281, 369Select 266, 277, 281, 285, 292, 296, 367SelectMany 306, 367SequenceEqual 363set-based operations 368S<strong>in</strong>gle 320, 363S<strong>in</strong>gleOrDefault 363Skip 366SkipWhile 366sort<strong>in</strong>g 369Sum 269, 332, 359Take 366TakeWhile 366ThenBy 267, 271, 294, 369ThenByDescend<strong>in</strong>g 267, 294, 369ToArray 360ToDictionary 361ToList 360ToLookup 361, 365Union 368Where 265, 285, 291, 368standardization 19StartsWith 243statelack of <strong>in</strong> utility classes 190requirement for iterators 163state mach<strong>in</strong>e 181for iterators 166static classes 190–197, 256, 258, 356System.Nullable 119static constructors 87, 221static generic methods, workaround for lack ofgeneric constructors 108static languages 354static membersof generic types 85thread safety 209static methods 190, 255, 258as action of delegate 36static modifier for static classes 191static properties 209static typ<strong>in</strong>g 42–43, 57, 65, 141, 211, 354compiler enforcement 43of Expression 240StaticRandom 349storage 278Stream 51, 142, 257, 278stream<strong>in</strong>g 265, 279, 289–290, 306, 308streaml<strong>in</strong><strong>in</strong>g 353StreamUtil 257str<strong>in</strong>gas a reference type 50as example of immutability 39Str<strong>in</strong>gBuilder 52Str<strong>in</strong>gCollection 46Str<strong>in</strong>gComparer 364Str<strong>in</strong>gDictionary 99Str<strong>in</strong>gProcessor 34strong typ<strong>in</strong>g 42, 45, 56of implementation 74strongly typed <strong>in</strong>terface 85Stroustrup, Bjarne 109struct. See value typesstructs as lightweight classes 51structure 222, 339structures as generic types 67SubmitChanges 319subtraction of nullable values 123Sum. See Standard Query Operators, SumSun 18, 20, 24symmetry 135Synchronized 98SynchronizedCollection 99–100Licensed to Rhona Hadida


390INDEXSynchronizedKeyedCollection 71, 100synchronous code 179syntactic shortcut for anonymous methods 149syntactic sugar 13, 162, 181, 216, 276for nullable types 120syntax 25System namespace 232System.L<strong>in</strong>q namespace 263System.L<strong>in</strong>q.Expressions namespace 239System.Nullable class. See Nullable classSystem.Nullable struct. See Nullable structSystem.Runtime.CompilerServices namespace 261System.Web.UI.WebControls 193System.W<strong>in</strong>dows.Forms 193System.Xml.L<strong>in</strong>q 338TTable 319, 326Take. See Standard Query Operators, TakeTakeWhile. See Standard Query Operators, Take-WhileTatham, Simon 182TDD. See Test-Driven Development (TDD)technologyhistory 18milestones 22pace of change 3tedious cod<strong>in</strong>g 229template arguments 109template metaprogramm<strong>in</strong>g 109templates 20, 83, 102, 108term<strong>in</strong>ology 253, 258, 297automatically implemented properties 209ternary operator. See conditional operatortest-bed for captured variables behavior 160Test-Driven Development (TDD) 174text editor 222TextReader 278ThenBy. See Standard Query Operators, ThenByThenByDescend<strong>in</strong>g. See Standard QueryOperators, ThenByDescend<strong>in</strong>gth<strong>in</strong>k<strong>in</strong>g <strong>in</strong> <strong>C#</strong> 18third party libraries 273this 151, 155, 163, 221, 258, 329Thread constructor 140thread safe 39thread safety 39, 116, 209, 226uses for queues 100threadpool 153threads 153, 178ThreadStart 34, 140, 143, 150, 152throwaway code 209TIME_ZONE_INFORMATION 199TimeSpan 116, 125ToArray. See Standard Query Operators, ToArrayToDictionary. See Standard Query Operators,ToDictionaryToList 236See also Standard Query Operators, ToListToLookup. See Standard Query Operators,ToLookuptooltips 211, 225, 260ToStr<strong>in</strong>g 53, 226, 238, 240, 268, 284, 318, 329traditional managed developers 24transaction management 318transactions, NHibernate 346transformations 161, 279translations 284transparent identifiers 294, 300, 306erasure 297trial and error 66TrimExcess 98TrimToSize 98tri-state logic 127trivial key selector 298trivial properties 209, 228–229Trojan horse 24true operator 125TrueForAll 96truth tables 127TryGetValue 99TryParse 81, 132TryXXX pattern 81, 131T-SQL 316two-phase type <strong>in</strong>ference 248type arguments 68, 74, 92, 246<strong>in</strong>ference 79of anonyous types 225type constra<strong>in</strong>ts 74–75, 185comb<strong>in</strong><strong>in</strong>g 78lack of operator/numeric 106restricted comb<strong>in</strong>ations 78type conversions and type safety 44type declarations 185type erasure 110type hierarchy expression classes 240type <strong>in</strong>ference 43, 79, 288, 290, 355anonymous types 225changes <strong>in</strong> <strong>C#</strong> 3 245–253<strong>in</strong> <strong>C#</strong> 2 74local variables 210return types of anonymous functions 247two-phase <strong>in</strong> <strong>C#</strong> 3 248type <strong>in</strong>itialization 85type parameter constra<strong>in</strong>ts 77type parameters 67, 92, 109, 211and typeof 93constra<strong>in</strong>ts. See type constra<strong>in</strong>tsLicensed to Rhona Hadida


INDEX 391type parameters (cont<strong>in</strong>ued)<strong>in</strong>ference 80unconstra<strong>in</strong>ed 75Type property of Expression 239type safety 42, 44, 103, 111type systemcharacteristics of <strong>C#</strong> 42, 48<strong>in</strong> <strong>C#</strong> 1 33richness 45summary of <strong>C#</strong> 1 48type systems 42, 59type variables, fixed and unfixed 248typed datasets 335–338typedef 106TypedTableBase 337typeless expressions 223typeof 244with nullable types 121typeof operator 69, 81, 92type-specific optimizations 109typ<strong>in</strong>g, IComparer 9Uunary logical negation operatorfor bool? 127unary operators 125unbound generic types 68, 92unbox<strong>in</strong>g 53–54, 89of Nullable. See Nullable, box<strong>in</strong>g and unbox<strong>in</strong>guncerta<strong>in</strong>ty 121unconstra<strong>in</strong>ed type parameters 75undef<strong>in</strong>ed behavior of IEnumerator.Current 173unfixed type variables 248unified data access 276un<strong>in</strong>tended comparisons with nullable values 126Union. See Standard Query Operators, Unionunit test<strong>in</strong>g 202, 353unit tests 131, 222, 272Unix 19unknown values 13UnmanagedType 201unsafe code 42, 199unsafe conversions 44unsigned assemblies 203untyped datasets 334–335unwrapp<strong>in</strong>g 123UPDATE 319Update 319URLscomparison with references 49, 54, 113fetch<strong>in</strong>g <strong>in</strong> new threads 153user <strong>in</strong>terface 269user-def<strong>in</strong>ed operators 125us<strong>in</strong>g directive 135, 193, 261us<strong>in</strong>g statement 25, 28, 181, 193implicit typ<strong>in</strong>g 213UTF-8 44utility classes 190, 256, 266Vvalidation 188, 192, 209, 273Valueproperty of DictionaryEntry 67property of Nullable. See Nullable, Valuepropertyvalue type constra<strong>in</strong>ts 76on Nullable 116value types 48, 178, 255automatic properties 210generic type parameters 76handled efficiently with generics 65non-nullability 58, 113Values 101var 43, 211See also implicit typ<strong>in</strong>gvarargs 20variableimpossibility of hid<strong>in</strong>g <strong>in</strong> <strong>C#</strong> 1 6untyped 43variablesdef<strong>in</strong>ite assignment 65implicitly typed 57<strong>in</strong>stance 50local 50location <strong>in</strong> memory 50–52range variables 287static 51Variant 211VB Classic 20VB.NET 20, 24, 201VB6 20, 24, 211VB8 21VB9 277, 281version number 26version<strong>in</strong>g 110, 261, 273virtual methods 184, 188visibility 356visible constructors 191Vista 21, 200Visual Basic 47, 355version numbers 27Visual C++ team 109Visual Studio .NET 2002 20Visual Studio .NET 2003 20Visual Studio 2005 21, 187, 197Visual Studio 2008 22, 263DataSet designer 335display<strong>in</strong>g extension methods 260Licensed to Rhona Hadida


392INDEXVisual Studio 2008 (cont<strong>in</strong>ued)display<strong>in</strong>g <strong>in</strong>ferred types 211, 225Intellisense <strong>in</strong> LINQ 325LINQ to SQL designer 316reference properties w<strong>in</strong>dow 197update for Entity Framework 348Visual Studio, version numbers 27void 147, 189Wwarn<strong>in</strong>g numbers 198warn<strong>in</strong>g pragmas 197warn<strong>in</strong>gs 144, 197weak typ<strong>in</strong>g 42, 211of <strong>in</strong>terface 74web page, comparison with reference types 49web service proxies 187web services 179, 269, 344, 346web site of book 30WebRequest 258WebResponse 258where 15type constra<strong>in</strong>ts 75Where. See Standard Query Operators, Wherewhitespace 147, 203, 218–219, 267wiki 49wildcards 110will 34, 38W<strong>in</strong>dows 19, 26, 199, 347W<strong>in</strong>dows Forms 26, 184W<strong>in</strong>dows Live ID 24W<strong>in</strong>dows Presentation Foundationused to write Snippy 30W<strong>in</strong>dows Presentation Foundation (WPF) 21, 187W<strong>in</strong>dows Server 2008 200workflow, of iterators 167WPF. See W<strong>in</strong>dows Presentation Foundation(WPF)wrapperfor fak<strong>in</strong>g nullity 13for value types 115write once, run anywhere 19WriteL<strong>in</strong>e 67XXAML. See Extensible Application Markup Language(XAML)XAttribute 338, 340, 342XComment 338XDocument 338–339, 343XElement 338–339, 342axis methods 342XML 16, 222, 277, 338declaration 339documentation 209mapp<strong>in</strong>g files for ADO.NET EntityFramework 348metadata for LINQ to SQL 316used to describe standard libraries 26XName 338XPath 343Yyield statements 165f<strong>in</strong>ally blocks 170–172restrictions 166yield break 169–170yield return 165–169yield type 166Zzerorepresentation of null 113result of comparison 134Licensed to Rhona Hadida

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!