C# Query Expressions And Supporting Features in C# 3.0

Preview Release 1.0

Jamie King, Neumont University
Bruce Eckel, MindView, Inc.

©2008 MindView, Inc. All Rights Reserved
Available only from www.MindViewInc.com

About this sample

This is a sampling of a full book, which as of March 2008 is not yet available to the public. However, this sample covers C# 3.0 fundamentals and provides a full grounding in C# 3.0 Query Expressions.

It’s common for authors to offer a few pages or a chapter of their text to the public as a means of marketing. However, interpreting chapters out of context is challenging for a reader. Our aim is to provide not only a sample, but also a useful stand-alone text. By itself, this sample gives any C# 2.0 programmer a foundation in C# 3.0.

The full text includes more material from the Query Expressions chapter. It also includes additional chapters that cover LINQ to SQL and LINQ to XML, respectively.

We hope you find this sample useful. If you wish to purchase the full text, you may do so when the full book is released; watch the website www.MindView.net for further details. Remember that this is a preview, and you will see items that have not yet been completed. The finished work will fix all such errata.

Copyright

All rights reserved. This publication is protected by copyright, and permission must be obtained from the authors prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise.

C# is a trademark of Microsoft Corporation. Windows 95, Windows NT, Windows 2000, and Windows XP are trademarks of Microsoft Corporation. All other product names and company names mentioned herein are the property of their respective owners.

The authors have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.

Contents

Preface
  • What makes this book stand out?
  • Reading vs. Wrestling
  • The build system
  • Reviews
  • Errors
  • Source Code
  • Using this book to teach a class
  • Dedications
      • Jamie:

Simple New Features
  • Extension methods
      • Inheritance vs. extension methods
      • Utilities for this book
      • Extended delegates
      • Other rules
  • Implicitly-typed local variables
  • Automatic properties
  • Implicitly-typed arrays
  • Object initializers
  • Collection initializers
  • Anonymous types
  • Lambda expressions
      • Func

Query Expressions
  • Basic LINQ
  • Translation
      • Degeneracy
      • Chained where clauses
  • Introduction to Deferred Execution
  • Multiple froms
  • Transparent identifiers
      • Iteration Variable Scope
  • More complex data
  • let clauses
  • Ordering data
  • Grouping data
  • Joining data
  • Nested Queries
      • into
      • let clause translations
      • let vs. into
      • joining into
      • Outer joins
  • Other query operators

Exercise Solutions
  • Chapter 1
  • Chapter 2

Preface

This book introduces experienced C# 2.0 programmers to version 3.0’s enhancements.1 We relied on one another’s particular strengths, expertise, and mutual professionalism to realize our shared goals, which are to:

  • Guide readers toward enriched knowledge of this often-complex material.
  • Cover every relevant topic.
  • Illustrate every concept with corresponding exercises.
  • Clarify each idea with illustrative examples.
  • Explore all the avenues in each topic.

We did not write this text from a pure C# point of view. The synergy of our expertise in our respective fields (C++, C#, Java, Python, etc.) contributes greatly to the variety in this text. Indeed, you will see many areas where features from other languages grant different points of view on each topic. We met daily via the Internet, sharing the document and conversing for an hour or two at a time. These frequent meetings sparked ideas that led to the improvement of all areas of the book.

What makes this book stand out?

We collaborated on every aspect of planning, research, writing, testing, and editing this book. Some key features make this book superior:

  • The incremental teaching approach.
  • Our rigorously accurate build system.
  • The combined efforts of two skilled programmers working in tandem.
  • Various teaching approaches to concepts and exercises.
  • Integrated exercises and solution guide.

1 Appendix A covers some fundamental topics you must understand from previous versions of C#.

Reading vs. Wrestling

No one can gain a Master’s degree in a weekend, nor become a guru at any topic in a matter of hours. Often my students ask, “How did you become so good at .NET?” The answer is generally not what they want to hear: “Time, lots of time struggling in the saddle.” There is no magic bullet in software, nor is there any in becoming a guru at any topic. The dollar amount you pay for any education is the cheapest part. It’s the work, dedication, and sweat you put forth to acquire your education that increases your value as a programmer (and your salary).

This book is an experience, not a show. It is not enough to merely read it. Nor is it enough to load an example here and there just to see it execute. At the end of almost every section are well thought-out exercises. These are, we feel, more than just exercises; in fact, we think of them as “wrestles.” The term “exercise” implies breaking a sweat doing some simple task over and over. When you wrestle with something or someone, you’re generally tied up struggling with them, learning how they fight back, and what you must do to be victorious. Wrestling with someone requires much more work than a basic exercise. And each individual you wrestle behaves differently.

Just like individuals, any new technology has its own “behaviors,” and learning those behaviors is key to becoming an expert on the topic. The only way to do so is by struggling with each one, trying every possible path. We strongly encourage doing the exercises as you study. These “wrestles” prepare you to understand concepts in the subsequent sections. We also encourage loading as many examples as you can, modifying them to answer the “what if I do this” questions that surely come to mind as you study.

The build system

Unit testing ensures the reliability of this book’s code: this book builds.2 We find that technical texts (especially those about new technologies) all too often include code that either won’t compile or produces incorrect output. We rely on our automated build system, written specifically for this book, to verify that every example and all output in this book are correct. The build system alone raises this book to a class above the norm.

As Microsoft released Community Technology Previews (CTPs), beta releases, and final releases, we automatically ran each example in this book with them. Every time our build system reported an error, we researched the revision that triggered it (including some that stopped the compile process or affected output). Our build system caught every change, and we corrected each one. The build system also ensures such crucial features as:

  • No duplicate example file names.
  • No unnecessary or duplicate using declarations.
  • No code wrapping in the printed book.
  • All lines commented as //c! introduce a compiler error.
  • All lines commented as //e! throw exceptions.
  • Opening braces never appear on a line by themselves.
  • Examples that do not compile have the error text embedded directly in them, which the build system also verifies.
  • Etc.

2 As this is a “sample” of the full text, the sample by itself will not build because of files required from the full text. However, the full text builds.

We also wrote several automated tests for the build system itself (we tested our tester). We used the test-driven approach, adding each test before modifying the build system to pass that test. This is state-of-the-art, two-tiered “test-driven development.” We test the build system, and the build system tests the book.

Finally, hidden examples catch changes in the language, testing verbal statements that do not require a full example to prove. These examples do not show in the printed version of the book; however, the build system executes them as it does any other. You will find these examples in the HiddenCheck directory of the book’s source code.


[[[Appendix XXX]]] describes the build system directives that you’ll encounter throughout the book. Although it’s not necessary for you to understand the build system to learn from this book, it helps.

Reviews

No test is as effective as a code review. We’ve used this book in three separate courses (two at Neumont University, and one for consulting at a professional development house). Each run weeded out its own issues and errors in this text.

Errors

Painstaking attention to detail can guarantee many, but not all, aspects of this work. If you find an error or omission, please call it to our attention at [[[tic#[email protected]]]].

Source Code

On www.MindView.net, you will find the installer for this book’s code. It installs Visual Studio projects that contain all the code organized by chapter. It also contains each example in its compiled version.

The MindView.Util.dll assembly

Verbiage on what it is, etc. We scatter examples throughout the book for the MindView.Util, but many features pulled from the main volumes (printing collections). The book’s build system compiles all .cs files within the MindView\Util directory into a single DLL file called MindView.Util.dll.

Talk about response files and how MindView.Util is included in the response file; the deploy system puts the line in the response file.

Using this book to teach a class

Because all the solutions to exercises are included in the back of the book, the exercises may initially seem unsuitable as homework when using this book for teaching a seminar or a college class. There are two approaches we suggest when teaching from the book:

1. Choose selected book exercises as homework assignments, but do not grade the homework. Notify the students that you will use the exercises as the basis for quiz and test questions. This provides incentive for students to thoroughly understand the exercises and solutions that you’ve assigned.

2. Create variations of our exercises as homework questions.

Dedications

Jamie: Many technical authors dedicate their work to close loved ones. My wife thinks this is strange, so I’ll forgo doing so. My siblings and in-laws already think my excitement over “dry” technical texts is abnormal. My in-laws are very artistic, and they often impress everyone by displaying their finished work. In return, I tried framing some of my code for a family party, but unfortunately, it didn’t have the same effect.

I hope that for you, the book has meaning. It is people like you who truly appreciate how code is an art. Therefore, I dedicate this text to all who get excited when wrestling any sort of technical text, especially this one. Being a teacher, I know quite a few who do. Many thanks to Neumont University’s Advanced C# classes, whose rigorous review enabled Bruce and me to correct flaws we would have missed.


Simple New Features

C# 3.0’s basic features are simple but powerful. More sophisticated features follow in later chapters.

Extension methods

Extension methods1 make an existing type meet your needs by effectively (but not literally) adding new methods.2 You can call an extension method like any other non-static method of that class. For example, even with all DateTime’s existing functionality, it doesn’t support combining two DateTime instances when one holds only a date and the other only a time. A normal static method provides one solution, and a normal static method becomes an extension method with the addition of the this modifier to the first argument:

//: SimpleNewFeatures\CombiningDateTimes.cs
using System;
static class CombiningDateTimes {
  // "this" indicates an extension method:
  static DateTime Merge(this DateTime datePart, DateTime timePart) {
    return new DateTime(datePart.Year, datePart.Month, datePart.Day,
      timePart.Hour, timePart.Minute, timePart.Second);
  }
  static void Main() {
    DateTime datePart = DateTime.Parse("11/15/2008 12:00:00 AM");
    DateTime timePart = DateTime.Parse("1/1/1001 9:23:00 PM");
    // Normal static syntax:
    Console.WriteLine(Merge(datePart, timePart));
    // Extension method syntax:
    Console.WriteLine(datePart.Merge(timePart));
  }
} /* Output:
11/15/2008 9:23:00 PM
11/15/2008 9:23:00 PM
*///:~

1 Some languages call these multimethods.

2 Extension methods are different from inheritance; we compare the two approaches later. Techniques like inheritance, composition, and operator overloading are covered in [[[reference]]].

In this example, we Merge( ) datePart’s date with timePart’s time. An extension method must be a static member of a non-generic, non-nested static class. The this modifier applied to the first argument signals the compiler that Merge( ) is an extension method for the DateTime type. When it discovers that Merge( ) is not an ordinary, non-static DateTime instance method, the compiler searches all in-scope static classes. If it finds a class that contains a Merge( ) extension method which takes a DateTime as its first argument and the rest of the arguments match, the compiler rewrites the call as a normal static method call. Thus, we effectively extend the DateTime structure to include a Merge( ) method.

If DateTime had a matching Merge( ) instance method, you could only invoke the extension method using static method syntax, CombiningDateTimes.Merge( ). The compiler issues an ambiguity error when it finds that multiple possible matching extension methods are in scope. Extension methods are usually public, but you can also make private extension methods (as Merge( ) is here, by default) that are usable only within that static class. Extension methods are essentially static methods with special lookup help from the compiler. They have fewer privileges than native methods; normal accessibility rules apply.3
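One consequence of the rewrite into a static call is worth seeing in isolation: because no instance is dereferenced at the call site, an extension method can be invoked through a null reference. The following is a minimal sketch, not one of the book’s examples; StringChecks and IsBlank( ) are hypothetical names introduced only for illustration.

```csharp
using System;

public static class StringChecks {
  // Hypothetical helper, not part of MindView.Util.
  public static bool IsBlank(this string text) {
    return text == null || text.Trim().Length == 0;
  }
}

public static class NullReceiverDemo {
  public static void Main() {
    string missing = null;
    // Extension syntax: compiled as StringChecks.IsBlank(missing),
    // so the null reference is passed as an ordinary argument.
    Console.WriteLine(missing.IsBlank());             // True
    // The equivalent ordinary static call:
    Console.WriteLine(StringChecks.IsBlank(missing)); // True
    Console.WriteLine("  x ".IsBlank());              // False
  }
}
```

Calling a true instance method on missing would throw a NullReferenceException, so whether an extension method tolerates null is a design choice its author must make and document.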

3 To achieve full access, a programming language must support open classes (e.g., Ruby and Smalltalk).

Extension methods allow clients to add their own functionality, thus DateTime designers needn’t guess what methods their clients will need. This feature is especially useful for interfaces, where having fewer methods benefits the implementer but not necessarily the client. For example, using IEnumerable’s only method, GetEnumerator( ), we can derive such information as whether two sequences are the same length and whether their corresponding elements are equal.4 We’ll do this while beginning to create the MindView.Util.Enumerable class, which includes some features we think that System.Linq.Enumerable (a class that we explore in great detail later in the book) could have included:

//: MindView.Util\Enumerable.1.cs
// {CF: /target:library}
using System.Collections;
namespace MindView.Util {
  public static partial class Enumerable {
    public static bool SequenceEqual(this IEnumerable first, IEnumerable second) {
      if(object.ReferenceEquals(first, second))
        return true;
      IEnumerator enumerator1 = first.GetEnumerator();
      IEnumerator enumerator2 = second.GetEnumerator();
      while(enumerator1.MoveNext()) {
        if(!enumerator2.MoveNext())
          return false; // Not same length
        object leftVal = enumerator1.Current;
        object rightVal = enumerator2.Current;
        // If either is null, but not both, then not equal:
        if(leftVal == null || rightVal == null) {
          if(leftVal != rightVal)
            return false;
        } else if(leftVal is IEnumerable && rightVal is IEnumerable) {
          // Recursively check IEnumerable sequences
          if(!(leftVal as IEnumerable)
            .SequenceEqual(rightVal as IEnumerable))
            return false;
        } else if(!leftVal.Equals(rightVal))
          return false;
      }
      if(enumerator2.MoveNext())
        return false; // Not same length
      return true;
    }
  }
} ///:~

4 The “CF:” at the top of the file stands for “CompileFlags.” Our build system inserts text after the CF: on the compiler command line, as you would to compile the examples by hand. (A guide to the build system’s meta-instructions appears in the Introduction.)

Now we can use SequenceEqual( ) without the IEnumerable designers adding it to their interface:

//: SimpleNewFeatures\SequenceEqualDemo.cs
using MindView.Util;
using System.Collections;
using System.Diagnostics;
class SequenceEqualDemo {
  static void Main() {
    ArrayList list1 = new ArrayList();
    ArrayList list2 = new ArrayList();
    Debug.Assert(list1.SequenceEqual(list2));
    Debug.Assert(list2.SequenceEqual(list1));
    list1.Add(7);
    Debug.Assert(!list1.SequenceEqual(list2));
    Debug.Assert(!list2.SequenceEqual(list1));
    list2.Add(7);
    Debug.Assert(list1.SequenceEqual(list2));
    Debug.Assert(list2.SequenceEqual(list1));
    list2.Add(8);
    Debug.Assert(!list1.SequenceEqual(list2));
    Debug.Assert(!list2.SequenceEqual(list1));
    list1.Add(8);
    Debug.Assert(list1.SequenceEqual(list2));
    Debug.Assert(list2.SequenceEqual(list1));
    list1.Add("now a string object");
    Debug.Assert(!list1.SequenceEqual(list2));
    Debug.Assert(!list2.SequenceEqual(list1));
  }
} ///:~


System.Linq.Enumerable already has a SequenceEqual( ) method like this one, but it works only with the generic IEnumerable<T>.5 For a non-generic IEnumerable, you must use our version of SequenceEqual( ) instead of System.Linq.Enumerable’s.6 We use Debug.Assert( ) here to prove our results. The book’s build system catches failed assertions. You will soon see a cleaner approach.

Extension methods are similar to operator overloading; both are static methods with special help from the compiler, which calls the method passing the operands as arguments.
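To make the contrast with System.Linq.Enumerable’s SequenceEqual( ) concrete, here is a small sketch (not one of the book’s examples) that uses only the standard library. It shows that the LINQ version compares elements with their default equality, so nested arrays holding equal values still compare as different; NestedSequenceDemo and its method names are ours, chosen for illustration.

```csharp
using System;
using System.Linq;

public static class NestedSequenceDemo {
  public static bool FlatEqual() {
    int[] a = { 1, 2, 3 };
    int[] b = { 1, 2, 3 };
    // Elements are ints, compared by value:
    return a.SequenceEqual(b);
  }
  public static bool NestedEqual() {
    int[][] a = { new[] { 1, 2 }, new[] { 3 } };
    int[][] b = { new[] { 1, 2 }, new[] { 3 } };
    // Elements are int[] references; the default comparer
    // does not look inside them:
    return a.SequenceEqual(b);
  }
  public static void Main() {
    Console.WriteLine(FlatEqual());   // True
    Console.WriteLine(NestedEqual()); // False
  }
}
```

A recursive comparison such as the one in Enumerable.1.cs treats the nested case as equal, which is why the two methods are not interchangeable.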

Exercise 1: Make an Add( ), Subtract( ), Multiply( ), and Divide( ) extension method for int.

Exercise 2: Make two extension methods: a ToArrayList( ) that converts its IEnumerable argument to an ArrayList, and a generic ForEach( ) method that iterates over its IEnumerable<T> argument, executing its second argument (a System.Action<T> delegate) on each element.

Inheritance vs. extension methods

You normally create an extension method to add only a method or two to an existing type. You can also use extension methods to extend the functionality of sealed types, from which you cannot inherit, so sometimes extension methods are your only option, even though the semantics are not identical to inheritance. You can inherit a new type from a base type and add an ordinary instance method, but you must use a downcast to call that method with a base-type reference. Also, if you want that method to be common among all derivations then you must add it to all derived types. C# 3.0 lets you add extension methods directly to the base type:7

//: SimpleNewFeatures\ExtendingTheBase.cs
abstract class Vehicle {
  public void Start() {}
  public void Stop() {}
}
class Scooter : Vehicle {}
class Bus : Vehicle {}
static class VehicleExtensions {
  public static void StartAndStop(this Vehicle vehicle) {
    vehicle.Start();
    // Sleep a minute or two...
    vehicle.Stop();
  }
}
class ExtendingTheBase {
  static void Main() {
    Vehicle vehicle = new Scooter();
    vehicle.StartAndStop();
    vehicle = new Bus();
    vehicle.StartAndStop();
    // You can still use the derived reference:
    new Scooter().StartAndStop();
  }
} ///:~

5 Microsoft’s version doesn’t recursively check nested IEnumerables as ours does.

6 Microsoft told us that the omission of a SequenceEqual( ) for a non-generic IEnumerable was an oversight.

7 You should add an extension method only when it makes sense for a given type. Note that the Visitor design pattern is sometimes used for this purpose.

StartAndStop( ) works on any Vehicle subclass because it extends Vehicle. If we want to use ordinary method-call syntax in C# 2.0 but Vehicle’s declaration is inaccessible, we must add StartAndStop( ) to an abstract subclass of Vehicle.8 Adding an intermediate class is not an ideal solution, but is preferable to requiring each Vehicle subclass to define a StartAndStop( ). However, we still cannot call StartAndStop( ) on other types that inherit directly from Vehicle.

8 Such an exercise appears at the end of the section.

Extension methods are not polymorphic:

//: SimpleNewFeatures\ExtensionMethodsNotPolymorphic.cs
using System.Diagnostics;
abstract class Vehicle {}
class Motorcycle : Vehicle {}
static class VehicleExtensions2 {
  public static string StartAndStop(this Vehicle vehicle) {
    return "Vehicle.StartAndStop()";
  }
  public static string StartAndStop(this Motorcycle motorcycle) {
    return "Motorcycle.StartAndStop()";
  }
}
class ExtensionsNotPolymorphic {
  static void Main() {
    Vehicle vehicle = new Motorcycle();
    Debug.Assert(
      vehicle.StartAndStop() == "Vehicle.StartAndStop()");
    Debug.Assert((vehicle as Motorcycle)
      .StartAndStop() == "Motorcycle.StartAndStop()");
  }
} ///:~

At runtime, vehicle references a Motorcycle, but the compiler, not the runtime, resolves the StartAndStop( ) call. Extension methods are useful only for simple situations; use normal object-oriented techniques when you require polymorphic methods. If you need an extension method determined by the runtime type of your object, you can use the is operator, but the result is a messy maintenance nightmare:

//: SimpleNewFeatures\DynamicExtensionMethods.cs
abstract class Vehicle {}
class MotorCycle : Vehicle {}
class Scooter : Vehicle {}
class Bus : Vehicle {}
class Car : Vehicle {}
class Truck : Vehicle {}
class SUV : Truck {}
static class VehicleExtensions {
  public static void Accelerate(this Vehicle vehicle) {
    if(vehicle is MotorCycle)
      (vehicle as MotorCycle).Accelerate();
    else if(vehicle is Scooter)
      (vehicle as Scooter).Accelerate();
    else if(vehicle is Bus)
      (vehicle as Bus).Accelerate();
    else if(vehicle is Car)
      (vehicle as Car).Accelerate();
    else if(vehicle is Truck) {
      // Must first check Truck subtypes:
      if(vehicle is SUV)
        (vehicle as SUV).Accelerate();
      // Otherwise, treat it as a basic Truck:
      else
        (vehicle as Truck).Accelerate();
    }
  }
  static void Accelerate(this MotorCycle motorcycle) {
    "MotorCycle.Accelerate()".P();
  }
  static void Accelerate(this Scooter scooter) {
    "Scooter.Accelerate()".P();
  }
  static void Accelerate(this Bus bus) {
    "Bus.Accelerate()".P();
  }
  static void Accelerate(this Car car) {
    "Car.Accelerate()".P();
  }
  static void Accelerate(this Truck truck) {
    "Truck.Accelerate()".P();
  }
  static void Accelerate(this SUV suv) {
    "SUV.Accelerate()".P();
  }
}
class DynamicExtensionMethods {
  static void Main() {
    Vehicle vehicle = new Truck();
    vehicle.Accelerate();
    vehicle = new MotorCycle();
    vehicle.Accelerate();
    vehicle = new SUV();
    vehicle.Accelerate();
  }
} /* Output:
Truck.Accelerate()
MotorCycle.Accelerate()
SUV.Accelerate()
*///:~

In Accelerate(Vehicle) we use is to detect the runtime type, then use as to cast the argument, and then call Accelerate( ) again. The compiler resolves each of these calls to the Accelerate( ) overload for the cast type, and invokes the proper version.

The code above looks highly suspect; it’s not easy to read and will be expensive to maintain. If you add a class to the hierarchy, you must also remember to add the appropriate checks. This is error prone because you could add a class and easily forget to add the checks. If you introduce a new subtype, like SUV, you must check first for the supertype (Truck), then all possible subtypes within that check. If the object isn’t one of the subtypes, then you must treat it as “just” a supertype. Anyone who inserts new subtypes must be aware of, and conform to, this convention. This is the kind of programming you want to avoid; if you’re using extension methods this heavily you may want to rethink your solution.
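For comparison, the “normal object-oriented techniques” recommended earlier can be sketched with virtual methods. This is only an illustration under the assumption that you are able to edit the Vehicle hierarchy (which is exactly what the extension-method scenario may not allow); the string return values are a choice made here to keep the sketch self-contained.

```csharp
using System;

public abstract class Vehicle {
  public virtual string Accelerate() {
    return "Vehicle.Accelerate()";
  }
}
public class Truck : Vehicle {
  public override string Accelerate() {
    return "Truck.Accelerate()";
  }
}
public class SUV : Truck {
  public override string Accelerate() {
    return "SUV.Accelerate()";
  }
}
// Bus adds no override, so it inherits the base behavior:
public class Bus : Vehicle {}

public static class VirtualDispatchDemo {
  public static void Main() {
    Vehicle[] vehicles = { new Truck(), new SUV(), new Bus() };
    // The runtime type selects the method; no is/as chain needed:
    foreach(Vehicle vehicle in vehicles)
      Console.WriteLine(vehicle.Accelerate());
  }
} /* Output:
Truck.Accelerate()
SUV.Accelerate()
Vehicle.Accelerate()
*/
```

Adding a new subtype here requires no change to any dispatch code, which is the maintenance cost the is/as chain cannot avoid.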

Exercise 3: Change ExtendingTheBase.cs to use an intermediate base class rather than extension methods.

Utilities for this book

We use extension methods to simplify code throughout the book. For example, our P( ) method prints its argument to the Console:

//: MindView.Util\Printer.1.cs
// {CF: /target:library}
using System;
using System.Collections;
public enum POptions { NoNewLines, InsertNewLines }
public static partial class Printer {
  public static void P<T>(this T item) {
    if(item is IEnumerable && !(item is string))
      (item as IEnumerable).P();
    else
      Console.WriteLine(item);
  }
  public static void P(this object item, string message) {
    item.P(message, POptions.NoNewLines);
  }
  public static void P(this object item, string message, POptions pOptions) {
    (message + ": " + GetSeperator(pOptions) + item).P();
  }
  static string GetSeperator(POptions pOptions) {
    return pOptions == POptions.InsertNewLines ?
      Environment.NewLine : string.Empty;
  }
} ///:~

P( ) simply replaces the Console.WriteLine( ) statement. Notice that the overloaded versions invoke the first version. The filename Printer.1.cs follows this book’s naming convention for files that hold partial types. We’ll later add partial declaration files to Printer (Printer.2.cs, etc.). We print IEnumerable sequences of items differently than other objects, as upcoming overloads demonstrate. POptions is a simple enum that determines whether the output should include newlines in the appropriate locations. We use this to make the output more readable.

Our build system compiles all the .cs files in the MindView.Util directory to create the MindView.Util.dll assembly. P( ) is automatically available after our installer configures your compiler’s response file, and requires no using statement (as most MindView.Util.dll components do).9 Here is a simple demonstration of P( ):

//: SimpleNewFeatures\SimplePrintStatements.cs
using System;
class SimplePrintStatements {
  static void Main() {
    Console.WriteLine("Hello");
    "Hello".P();
    Console.WriteLine(5);
    5.P();
    Console.WriteLine("some message" + ": " + "some text");
    "some text".P("some message");
  }
} /* Output:
Hello
Hello
5
5
some message: some text
some message: some text
*///:~

9 See the Introduction for details. Note that when you create your own Visual Studio project, you must add a reference to MindView.Util.dll to access the library.

Note that the output of each pair of print statements is identical. P( ) reduces visual noise10 in this book.11 Besides saving space, it reads more intuitively at the end of statements (as opposed to the more common Console.WriteLine( ), which you must ignore at the beginning to decipher print statements). Mentally, you can naturally truncate P( ) from the end of the print statement.

MindView.Util’s True( ) method Assert( )s that its argument is true; the False( ) method asserts that its argument is false. AssertEquals( ) simplifies asserting object equivalence:

//: MindView.Util\Asserter.cs
// {CF: /target:library}
using MindView.Util;
using System.Collections;
using System.Diagnostics;
public static class Asserter {
  public static void True(this bool value) {
    Debug.Assert(value);
  }
  public static void False(this bool value) {
    Debug.Assert(!value);
  }
  public static void AssertEquals<T>(this T left, T right) {
    if(left == null || right == null)
      object.ReferenceEquals(left, right).True();
    else if(left is IEnumerable && right is IEnumerable)
      (left as IEnumerable)
        .AssertEquals(right as IEnumerable);
    else
      left.Equals(right).True();
  }
  public static void AssertEquals(this IEnumerable left, IEnumerable right) {
    left.SequenceEqual(right).True();
  }
  public static void ExceptionFailed() {
    Debug.Assert(false, "Exception failed to throw");
  }
} ///:~

10 You could also use methods like P( ) for logging. For example, a T( ) method could call Trace.Write( ).

11 You won’t see it elsewhere outside this book unless others adopt it.

Note how SequenceEqual( ) from Enumerable.1.cs handles the special case of two IEnumerable operands in the last overload of AssertEquals( ).12 We now insert verifications inline where possible rather than printing output that appears at the end of listings. This simplifies interpreting the examples:

//: SimpleNewFeatures\BasicAssertions.cs
using System;
using MindView.Util;
using System.Collections.Generic;
class BasicAssertions {
  static void Main() {
    Random random = new Random(47);
    random.Next().Equals(601795864).True();
    (random.Next() == 1305670887).True();
    random.Next().AssertEquals(1332423928);
    random.Next().Equals(5).False();
    List<int> list = new List<int>();
    list.SequenceEqual(list).True();
    // Shorter syntax to above:
    list.AssertEquals(list);
  }
} ///:~

12 We wrote our non-generic SequenceEqual( ) just for cases like this version of AssertEquals( ). Since our SequenceEqual( ) doesn’t need to know the exact element types, you can compare any containers regardless of their contained types.

We rely on Asserter’s methods throughout this book. Our build system catches failed assertions, so when you see True( ), False( ), or AssertEquals( ), you know the code is correct.

Exercise 4: Use Asserter to show that (1) the compiler interns string literals;13 (2) boxing a value type twice produces two different objects; (3) by default, an enum member not assigned an explicit value takes its predecessor’s value plus one (unless it is the first member, whose value is zero); and (4) System.Linq.Enumerable.Range( ) returns a numeric sequence of values.

13 Look up string.IsInterned( ) in the documentation.

Extended delegates

A delegate instance normally references a method with a signature and return type that matches the delegate. Here we demonstrate how you can create delegates to extension methods:

//: SimpleNewFeatures\ExtendedDelegates.cs
using System;
class ExtendedDelegates {
  static void Main() {
    object obj = new object();
    Action<object> action;
    //c! action = Asserter.AssertEquals;
    // Must use extension syntax:
    action = obj.AssertEquals;
    action(obj); // obj.AssertEquals(obj)
    action.Method.Name.AssertEquals("AssertEquals");
    action.Target.AssertEquals(obj);
    // Doesn't work with value types:
    int value = 5;
    /*c!
    Action<int> action2 = value.AssertEquals;
    */
  }
} ///:~

Remember that the build system uncomments lines marked //c!, or blocks marked /*c!, and ensures that they cause a compile error.

A delegate simply points to a Method and a Target object.14 The built-in generic Action delegate references methods that take one object and return void. Notice the compiler prevents direct assignment of the extension method to action. Instead, we must bind the action’s Target to the extended obj and AssertEquals( ) to its Method using extension method syntax.

AssertEquals( ) takes two arguments. However, when we use extension method syntax to assign obj.AssertEquals to action, the compiler binds the first argument to obj, and thus treats AssertEquals( ) as if it took only one argument (instead of two).
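The binding described above can be observed without Asserter. The following is only a sketch under our own assumptions: Describer, Describe( ), and ExtensionDelegateSketch are hypothetical names invented for this illustration, and we use the built-in Func delegate so that the call returns a value we can inspect.

```csharp
using System;

// Hypothetical extension method, invented for this sketch only.
static class Describer {
  public static string Describe(this object self, object other) {
    return self + " -> " + other;
  }
}

static class ExtensionDelegateSketch {
  // Summarizes the call result, the bound method name,
  // and whether Target is the extended object.
  public static string Run() {
    object obj = "first";
    // Extension syntax binds obj as the first argument:
    Func<object, string> describe = obj.Describe;
    // Equivalent to Describer.Describe(obj, "second"):
    string result = describe("second");
    bool sameTarget = object.ReferenceEquals(describe.Target, obj);
    return result + " | " + describe.Method.Name + " | " + sameTarget;
  }
  static void Main() {
    Console.WriteLine(Run());
  }
}
```

Although Describe( ) declares two parameters, the delegate type has only one, because the compiler has already supplied obj as the first argument.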

Exercise 5: Use extension method syntax to assign a delegate to System.Linq.Enumerable’s Min( ) invoked upon a List of random numbers. Make sure that invoking the delegate returns the smallest value.

14 The details are more complex. (See [[[reference]]] for more on delegates.)

Other rules

Extension methods can never redefine existing methods. For example, ToLower( ) can mean nothing else for a string:

//: SimpleNewFeatures\ExtensionPrecedence.cs
static class StringExtensions {
  public static string ToLower(this string str) {
    return str.ToUpper();
  }
}

class ExtensionPrecedence {
  static void Main() {
    string s = "ABCdef";
    s.ToLower().AssertEquals("abcdef");
    // Must call it directly:
    StringExtensions.ToLower(s).AssertEquals("ABCDEF");
  }
} ///:~

Because string’s existing ToLower( ) method takes no arguments, StringExtensions’ attempt to redefine ToLower( ) fails. The compiler searches the type’s own methods for a match first and considers extension methods only when it finds none. The this modifier in StringExtensions.ToLower( ) is wasted here. However, the compiler applies our ToLower( ) when it has different arguments than string.ToLower( )’s:

//: SimpleNewFeatures\CompilerSearchContinues.cs
static class StringExtensions2 {
  public static string ToLower(this string s, int i) {
    return s.ToUpper() + new string('a', i);
  }
}

class CompilerSearchContinues {
  static void Main() {
    string s = "afds";
    s.ToLower(5).AssertEquals("AFDSaaaaa");
  }
} ///:~

An extension method call on a null reference causes no NullReferenceException, because the compiler translates extension method calls into static method calls:

//: SimpleNewFeatures\NoNullReferenceException.cs
static class NoNullReferenceException {
  static void SomeMethod(this object o) {}
  static void Main() {
    object obj = null;
    obj.SomeMethod();
  }
} ///:~

The compiler rewrites our method call to NoNullReferenceException.SomeMethod(obj), so we never call SomeMethod( ) directly on obj.

You cannot call extension methods with an implicit this:

//: SimpleNewFeatures\SelfPrinter.cs
class SelfPrinter {
  public void PrintYourSelf() {
    //c! P();
    this.P();
  }
  static void Main() {
    new SelfPrinter().PrintYourSelf();
  }
} /* Output:
SelfPrinter
*///:~

You cannot implicitly invoke the in-scope P( ) in PrintYourSelf( ), as you can with ordinary methods. You must instead use this to trigger the compiler to search through the available extension methods.

A properly deployed extension method improves your code’s legibility, but too many extension methods can make your code harder to understand (as does excessive operator overloading). Consider whether your extension method clarifies or obfuscates your code’s meaning. Can you achieve the same result with normal static methods?
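Because an extension call is only alternate syntax for a static call, both spellings reach the same method, and the method itself must cope with a null receiver. Here is a minimal sketch of that idea; StringChecks, IsBlank( ), and NullReceiverSketch are hypothetical names chosen for this illustration and appear nowhere else in the book.

```csharp
using System;

static class StringChecks {
  // Hypothetical extension method. It checks for null itself,
  // because s.IsBlank() is really StringChecks.IsBlank(s).
  public static bool IsBlank(this string s) {
    return s == null || s.Trim().Length == 0;
  }
}

static class NullReceiverSketch {
  public static bool ExtensionSyntax(string s) {
    return s.IsBlank();
  }
  public static bool StaticSyntax(string s) {
    return StringChecks.IsBlank(s);
  }
  static void Main() {
    string nothing = null;
    // No NullReferenceException, even on a null reference:
    Console.WriteLine(ExtensionSyntax(nothing));
    Console.WriteLine(StaticSyntax("   "));
    Console.WriteLine(ExtensionSyntax("text"));
  }
}
```

Whether the extension spelling reads better than the static one is the judgment call that the preceding paragraph asks you to make.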

Exercise 6: Put two extension methods with the same signature in two separate classes, each within a unique namespace. Put each class in its own file. In a third file, bring both classes into scope with using statements, and invoke the methods using extension syntax rather than normal static-method call syntax. Does the compiler issue an ambiguity error? What happens when you declare a third class after the using statements to introduce a third, same-signature extension method?

Exercise 7: Create an IndexesOf( ) string extension method that returns every index at which a given string occurs instead of just the first one (as IndexOf( ) does).

Implicitly-typed local variables

Object declarations are often redundant. Consider the following:

  Letter letter = new Letter();

That code uses the same type name to declare the type of the variable (the first “Letter”) and to create an instance (the second “Letter”). C# 3.0’s implicitly typed local variables don’t require you to repeat the type name when you define such a variable inside a method or property. Instead, when you declare the type as var, the compiler infers the type from the initialization expression as if you had defined it explicitly. The compiler issues an error if you have no initialization expression, or if you later assign an object of an incompatible type to the variable:

//: SimpleNewFeatures\ImplicitVariableTypes.cs
class ImplicitVariableTypes {
  // Implicit typing doesn't work with fields:
  //c! static var field = new ImplicitVariableTypes();
  static void Main() {
    ImplicitVariableTypes a = new ImplicitVariableTypes();
    // Identical to above:
    var b = new ImplicitVariableTypes();
    // Can't change b's type:
    //c! b = 5;
    // Somewhat ambiguous with primitive types:
    var c = 5;
    c.GetType().AssertEquals(typeof(int));
    // Less ambiguous with a suffix code:
    var d = 1.1f;
    //c! var e; // Error, no expression
  }
} ///:~

In this example, the variable c is an int because 5 falls into the range of an int. When an unsuffixed integer literal is greater than int.MaxValue, type inference instead picks the first of uint, long, and ulong that can hold the value. We declare primitives explicitly herein, since we needn’t repeat their type names to declare and initialize them.

Type inference is most valuable for longer declarations, particularly generics:

  var robots = new Dictionary>();

By eliminating redundancies, var makes the declaration simpler to write and read. Also, if you later change the generic type arguments, you only need to update the initialization.

You cannot use var for fields.15 In fact, C# 3.0 includes var because it is the only way to declare variables of anonymous types (shown later).

The compiler gives each var-declared variable the exact type of its initialization expression. To make a variable’s compile-time type a base type when you initialize it with a derived type, you must declare the variable’s type explicitly.
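The way var follows an integer literal’s size can be checked directly. This is a small sketch rather than one of the book’s build-verified listings; LiteralInferenceSketch and InferredTypes( ) are names we made up for it.

```csharp
using System;

static class LiteralInferenceSketch {
  // Returns the types the compiler inferred for each literal.
  public static Type[] InferredTypes() {
    var a = 5;                     // fits in int
    var b = 3000000000;            // too big for int, fits in uint
    var c = 5000000000;            // too big for uint, fits in long
    var d = 10000000000000000000;  // too big for long, fits in ulong
    var e = 5L;                    // suffix forces long
    return new[] {
      a.GetType(), b.GetType(), c.GetType(),
      d.GetType(), e.GetType()
    };
  }
  static void Main() {
    foreach (var type in InferredTypes())
      Console.WriteLine(type.Name);
  }
}
```

A suffix, as on e, removes the guesswork for the reader, which is one reason to prefer explicit declarations for primitives.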

Exercise 8: Use var to declare a variable initialized with a derived type. Prove that you cannot assign a base object or a different derived type to your variable.

Exercise 9: Prove that the compiler still infers the correct type when you assign a var variable using a complicated initialization expression, for example:

  var value = ((new Random().Next().ToString() +
    "some string").Substring(3, 3) + AnotherMethod())
    .ToUpper();

15 C# author Anders Hejlsberg’s email to the authors explains how this might eventually be possible: “One reason [implicitly-typed fields are not supported] is complexity. If we support ‘var’ for fields we would really need to also support it for properties and method return values . . . [and] resolve ‘var’ usages that depend on other ‘var’ usages. The other reason is the . . . CLR only supports ‘nominal types,’ i.e., types that have a declared name, and we emulate structural types by having the compiler generate anonymous type names which are never revealed to the user. . . . If ‘var’ was permitted for fields, properties, and methods, it would be possible to capture an anonymous type and make it public. . . . One way to avoid the issue altogether is to support ‘var’ only with local variables, which are never public. We may indeed relax the restrictions on ‘var’ in the future, once we make progress on these underlying issues. Meanwhile, we have the simple rule that ‘var’ can only be used for local variables.”

Exercise 10: Prove that var is not a keyword by creating a variable named var. Can the compiler still infer a variable’s type when it is declared using var?

Automatic properties

It’s common for types to have several properties that only expose a backing field. You could instead just make such fields public, but when you change a field to a property after client programmers compile against it, the client code breaks. The differences between the metadata for a field and a property force them to recompile, and this is the primary reason for the use of properties even when those properties don’t do anything extra.

Threading is another reason public fields are problematic, because you can guard code, but not data. A field accessible from outside the class is always vulnerable to access by multiple threads.16

Properties that only expose a get and set for a backing field are inherently noisy, even when Visual Studio and some third party plug-ins assume the burden of generating the code for them. Automatic properties add an implicit backing field for you:

//: SimpleNewFeatures\Person.cs
// {CF: /target:library}
using System.Text;

public class Person {
  int id; // Normal backing field
  public Person() {}
  public Person(int id) { ID = id; }
  // Normal property:
  public int ID {
    get { return id; }
    set { id = value; }
  }
  // Automatic properties:
  public string FirstName { get; set; }
  public string LastName { get; set; }
  // We use ToString() in later examples:
  public override string ToString() {
    var ret = new StringBuilder();
    ret.AppendLine("ID: " + id);
    ret.AppendLine(" FirstName: " + FirstName);
    ret.Append(" LastName: " + LastName);
    return ret.ToString();
  }
} ///:~

16 MethodImplAttribute with MethodImplOptions.Synchronized is the only way to make an automatic property thread-safe, but it produces other problems.

Automatic properties must have both an empty set and an empty get (the single semicolon makes the “empty body”). The compiler generates a hidden backing field that you cannot access. An empty set by itself is useless. Likewise, an empty get by itself would let you read but never change the backing field’s default value (also pointless).

In C# 2.0, you can apply access modifiers to decrease the visibility of either the get or set accessors, a feature C# 3.0’s automatic properties also support:

//: SimpleNewFeatures\LimitedAccess.cs
class Sibling {
  int irritationCount;
  public void Irritate() {
    Angry = ++irritationCount >= 3;
  }
  public bool Angry { private set; get; }
}

class LimitedAccess {
  static void Main() {
    var sibling = new Sibling();
    sibling.Irritate();
    sibling.Irritate();
    sibling.Irritate();
    sibling.Angry.True();
    //c! sibling.Angry = false;
  }
} ///:~

Angry’s set accessor is accessible only from within the Sibling class; its get accessor remains public.

Automatic properties satisfy both interfaces’ and abstract classes’17 requirements:

//: SimpleNewFeatures\ImplementingInterfaces.cs
// {CF: /target:library}
// Interface property syntax
// resembles automatic property syntax:
interface SomeInterface {
  int WriteOnly { set; }
  int ReadOnly { get; }
  int ReadAndWrite { get; set; }
}

// To use automatic properties when implementing an
// interface, you must also add the missing accessors:
class Implementer : SomeInterface {
  public int WriteOnly { get; set; }
  public int ReadOnly { get; set; }
  public int ReadAndWrite { get; set; }
} ///:~

Implementer’s automatic properties satisfy SomeInterface’s requirements. To use automatic properties in the class definition you must add the missing get or set. Implementer demonstrates the case of a class that defines an implementation larger than its interface (SomeInterface) requires. If Implementer used public fields rather than automatic properties, the compiler would issue an error.

When you have a private backing field, there’s a tendency to use the field directly within the class instead of using the property. However, if you later add some code within the field’s associated property, the direct field access bypasses the new code. Automatic properties hide the backing field, preventing you from accessing the field directly. If you later change your automatic property to a normal property with some boundary checks, you need not update all references to the property.
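The last point can be sketched as follows. Thermostat and its 5 to 30 range are hypothetical, invented only for this illustration; assume that Target began as the automatic property public int Target { get; set; } and was later expanded to add the check.

```csharp
using System;

class Thermostat {
  int target = 20;

  // Assumed to have started as: public int Target { get; set; }
  public int Target {
    get { return target; }
    set {
      if (value < 5 || value > 30)
        throw new ArgumentOutOfRangeException("value");
      target = value;
    }
  }

  // Written against the property, so it needed no change
  // and automatically picks up the new boundary check:
  public void Nudge() { Target = Target + 1; }
}

static class PropertyUpgradeSketch {
  public static bool RejectsOutOfRange() {
    var thermostat = new Thermostat();
    try {
      thermostat.Target = 99;
      return false;
    } catch (ArgumentOutOfRangeException) {
      return true;
    }
  }
  static void Main() {
    var thermostat = new Thermostat();
    thermostat.Nudge();
    Console.WriteLine(thermostat.Target);
    Console.WriteLine(RejectsOutOfRange());
  }
}
```

Had Nudge( ) been written against a visible backing field, it would have skipped the check after the change.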

Exercise 11: Change Person’s ID field to an automatic property.

17 See exercises for a twist with abstract classes.

Exercise 12: Change ImplementingInterfaces.cs to prove that automatic properties also satisfy abstract property definitions.

Exercise 13: Create a simple Rectangle class with Width, Height, and Area properties.

Exercise 14: Make one of Implementer’s properties a public field and note the compiler error.

Implicitly-typed arrays

The compiler can infer an array’s type using C# 3.0’s new array-creation syntax:

//: SimpleNewFeatures\NewArrayCreationSyntax.cs
class NewArrayCreationSyntax {
  static void Main() {
    // C# 1.0:
    int[] ints1 = { 5, 4, 3 };
    int[] ints2 = new int[] { 5, 4, 3 };
    // C# 3.0:
    int[] ints3 = new[] { 5, 4, 3 };
    var ints4 = new[] { 5, 4, 3 };
    // ints3 and ints4 are arrays of ints:
    ints3.GetType().AssertEquals(typeof(int[]));
    ints4.GetType().AssertEquals(typeof(int[]));
  }
} ///:~

C# 3.0’s syntax enables the compiler to determine the array type by examining the types of all the expressions within curly braces. ints3’s and ints4’s array initializer values are ints, so that makes them int[]s. The compiler infers the type of a C# 1.0 curly-brace-initialized array from the type on the left of the assignment:

//: SimpleNewFeatures\NakedBracesRequireLeftHandType.cs
class NakedBracesRequireLeftHandType {
  static void Main() {
    int[] ints = { 1, 2, 3 };
    //c! var ints2 = { 1, 2, 3 };
    // Must state type directly or
    // use implicitly-typed array:
    var ints3 = new int[] { 1, 2, 3 };
    var ints4 = new[] { 1, 2, 3 };
  }
} ///:~

In 3.0 syntax, the compiler infers the array type by finding exactly one type, among the types of the listed expressions, to which every expression implicitly converts. When it finds no single type, the compiler issues an error:

//: SimpleNewFeatures\ConvertibleToOnetype.cs
class Base {}
class Derived1 : Base {}
class Derived2 : Base {}

class ConvertibleToOnetype {
  static void Main() {
    // Everything implicitly convertible to double:
    var doubles = new[] { 5, 4, 3.5 };
    doubles.GetType().AssertEquals(typeof(double[]));
    // Not OK, decimal incompatible with double:
    //c! var unknown1 = new[] { 5, 3m, 7.5 };
    /*c!
    // Not OK, Derived1 not convertible to Derived2,
    // and Derived2 not convertible to Derived1:
    var unknown2 = new[] { new Derived1(),
      new Derived2() };
    */
    // Now OK, everything convertible to Base:
    var baseArray = new[] { new Derived1(),
      new Derived2(), new Base() };
    baseArray.GetType().AssertEquals(typeof(Base[]));
    // OK, everything convertible to object:
    var objectArray = new[] { new Derived1(),
      new Derived2(), new Base(), new object() };
    objectArray.GetType().AssertEquals(typeof(object[]));
  }
} ///:~

The compiler does not consider Base as a possible array type for unknown2, since no Base appears in the initialization list. However, after baseArray adds a Base to the initialization list, the compiler can convert everything to a Base, so it makes baseArray a Base[].

The objectArray definition adds an object to baseArray’s list. Derived1 and Derived2 are implicitly convertible to both Base and object. The compiler infers the type to be object[] because Base is convertible to object, making object the only common type.

The term “implicit conversion” covers the gamut of conversions, including user-defined conversion operators, not just upcasts and widening conversions:

//: SimpleNewFeatures\ImplictConversionOperators.cs
class CommonType {}

class Convertible1 {
  public static implicit operator
  CommonType(Convertible1 toConvert) {
    return null;
  }
}

class Convertible2 {
  public static implicit operator
  CommonType(Convertible2 toConvert) {
    return null;
  }
}

class ImplictConversionOperators {
  static void Main() {
    /*c!
    var unknown = new[] { new Convertible1(),
      new Convertible2() };
    */
    var commonTypeArray = new[] { new Convertible1(),
      new Convertible2(), new CommonType() };
    commonTypeArray.GetType()
      .AssertEquals(typeof(CommonType[]));
  }
} ///:~

In this example, Convertible1’s and Convertible2’s conversion operators convert them to CommonType.18 Because unknown’s initialization contains no CommonType, the compiler doesn’t consider CommonType a candidate for the array type.

The following example demonstrates an ambiguity error when there is more than one possible type for conversion:

//: SimpleNewFeatures\AmbiguityErrors.cs
class CommonType1 {}
class CommonType2 {}

// Two types with implicit conversions to both
// CommonType1 and CommonType2:
class ConvertibleType1 {
  public static implicit operator
  CommonType1(ConvertibleType1 toConvert) {
    return null;
  }
  public static implicit operator
  CommonType2(ConvertibleType1 toConvert) {
    return null;
  }
}

class ConvertibleType2 {
  public static implicit operator
  CommonType1(ConvertibleType2 toConvert) {
    return null;
  }
  public static implicit operator
  CommonType2(ConvertibleType2 toConvert) {
    return null;
  }
}

class AmbiguityErrors {
  static void Main() {
    /*c!
    var ambiguous1 = new[] { new ConvertibleType1(),
      new ConvertibleType2() };
    */
    var nonAmbiguous1 = new[] { new ConvertibleType1(),
      new ConvertibleType2(), new CommonType1() };
    nonAmbiguous1.GetType()
      .AssertEquals(typeof(CommonType1[]));
    /*c!
    var ambiguous2 = new[] { new ConvertibleType1(),
      new ConvertibleType2(), new CommonType1(),
      new CommonType2() };
    */
    var nonAmbiguous2 = new[] { new ConvertibleType1(),
      new ConvertibleType2(), new CommonType1(),
      new CommonType2(), new object() };
    nonAmbiguous2.GetType()
      .AssertEquals(typeof(object[]));
  }
} ///:~

18 See the Operator Overloading chapter in the [[[reference]]] for help understanding conversion operators.

ConvertibleType1 and ConvertibleType2 are now convertible to both CommonType1 and CommonType2. The compiler issues an error when it finds no single common type defined for ambiguous1. We add a CommonType1 in nonAmbiguous1’s definition, so the other types implicitly convert to CommonType1. However, we only complicate matters by adding a CommonType2 to ambiguous2, for now the compiler cannot choose any single target type. nonAmbiguous2’s initialization expression includes an object that resolves the type of the array. Everything implicitly converts to object, which cannot be implicitly converted to any other type. The example illustrates why you must avoid any possible ambiguity. We shall later demonstrate that implicitly typed arrays are also the only way to create concrete arrays of anonymous types.
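The same rule applies to the built-in types, where the common type is not always obvious at a glance. The following sketch (BestTypeSketch and ElementTypes( ) are our own hypothetical names) records the array type the compiler chooses for a few mixed initializer lists:

```csharp
using System;

static class BestTypeSketch {
  // Returns the runtime type name of each inferred array.
  public static string[] ElementTypes() {
    var longs = new[] { 1, 2L };        // int converts to long
    var floats = new[] { 1, 2.5f };     // int converts to float
    var ints = new[] { 'a', 1 };        // char converts to int
    var strings = new[] { "a", null };  // null converts to string
    return new[] {
      longs.GetType().Name, floats.GetType().Name,
      ints.GetType().Name, strings.GetType().Name
    };
  }
  static void Main() {
    foreach (var name in ElementTypes())
      Console.WriteLine(name);
  }
}
```

In each case only one of the listed types accepts all the others, so no ambiguity arises. Note that null contributes no candidate type of its own.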

Object initializers

Initializing an object with multiple properties presents the potential confusion of writing a separate constructor for each combination of properties. You can instead create an object and then set its properties:

//: SimpleNewFeatures\SettingProperties.cs
// {CF: /reference:Person.dll}
class SettingProperties {
  static void Main() {
    var person = new Person(3);
    person.FirstName = "Suzanne";
    person.LastName = "Barney";
    person.P();
  }
} /* Output:
ID: 3
 FirstName: Suzanne
 LastName: Barney
*///:~

However, the FirstName and LastName should not appear in separate statements, because setting them is part of our initialization process. Using C# 3.0’s object initializers, you can set properties in a member initializer list that follows your constructor call:

//: SimpleNewFeatures\ObjectInitializers.cs
// {CF: /reference:Person.dll}
class ObjectInitializers {
  static void Main() {
    // Parentheses on parameterless
    // constructor are optional:
    var person = new Person { FirstName = "Suzanne",
      LastName = "Barney" };
    person.P();
    // Call parameterized constructor
    // and use initializer list:
    person = new Person(4) { FirstName = "Joe",
      LastName = "Sandstrom" };
    person.P();
    person = new Person { ID = 12, FirstName = "Bob",
      LastName = "Dobbs" };
    person.P();
  }
} /* Output:
ID: 0
 FirstName: Suzanne
 LastName: Barney
ID: 4
 FirstName: Joe
 LastName: Sandstrom
ID: 12
 FirstName: Bob
 LastName: Dobbs
*///:~

This technique works for any field and property as long as you have access to set it. The compiler translates the initializers as if the properties were on separate lines (as we did explicitly in SettingProperties.cs). You can also call parameterized constructors before the initializer list, as we do here.

You can nest initializations of any objects that your object contains, like so:

//: SimpleNewFeatures\NestedInitializers.cs
class Object1 {
  public Object2 Object2;
}

class Object2 {
  public Object3 Object3;
}

class Object3 {
  public int Field;
}

class NestedInitializers {
  static void Main() {
    var object1 = new Object1 {
      Object2 = new Object2 {
        Object3 = new Object3 {
          Field = 5
        }
      }
    };
  }
} ///:~

Take care not to nest initializer lists too deeply. Even with single-field objects like these, an initializer list becomes harder to read as the nesting and the number of fields grow.
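To make the compiler’s translation of an initializer list concrete, here is a sketch that writes the expansion by hand. Point and InitializerExpansionSketch are hypothetical types for this illustration, and ByHand( ) only approximates what the compiler emits: it builds the object through a temporary and hands it over once every member is set.

```csharp
using System;

class Point {
  public int X { get; set; }  // property
  public int Y;               // accessible field, also allowed
  public override string ToString() {
    return "(" + X + ", " + Y + ")";
  }
}

static class InitializerExpansionSketch {
  public static Point WithInitializer() {
    return new Point { X = 1, Y = 2 };
  }
  // Roughly the code the initializer list stands for:
  public static Point ByHand() {
    var temp = new Point();
    temp.X = 1;
    temp.Y = 2;
    return temp;
  }
  static void Main() {
    Console.WriteLine(WithInitializer());
    Console.WriteLine(ByHand());
  }
}
```

Both methods produce an equivalent Point; the initializer form simply keeps the whole initialization in one expression.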

Exercise 15: Change Person’s properties to fields. Prove that accessible fields work in an initialization list.

Exercise 16: Initialize only the FirstName property of a Person object.

Exercise 17: Make both a type with an int Field, and a local variable called Field. Why does it look odd when you initialize an instance of your type’s Field with the local variable Field, and why is this not ambiguous?

Collection initializers

You can initialize any container using C# 3.0’s array-like syntax, as long as it implements the non-generic IEnumerable19 interface and has an Add( ) method. Just separate the elements with commas inside a braced list:

//: SimpleNewFeatures\CollectionInitializers.cs
using System.Collections.Generic;

class CollectionInitializers {
  static void Main() {
    var ints = new List<int>() { 1, 2, 3, 4, 5 };
    ints.AssertEquals(new[] { 1, 2, 3, 4, 5 });
    // Parentheses not required:
    var ints2 = new List<int> { 1, 2, 3, 4, 5 };
    ints2.AssertEquals(ints);
    // Initialized via both parameter and initializer:
    var ints3 = new List<int>(ints) { 6, 7 };
    ints3.AssertEquals(new[] { 1, 2, 3, 4, 5, 6, 7 });
    // Can nest:
    var nested = new List<List<int>> {
      new List<int> { 1, 2, 3, 4, 5 },
      new List<int> { 6, 7, 8, 9, 10 },
      new List<int> { 11, 12, 13, 14, 15 }
    };
    nested.P("nested");
  }
} /* Output:
nested: [
  [1, 2, 3, 4, 5],
  [6, 7, 8, 9, 10],
  [11, 12, 13, 14, 15]
]
*///:~

19 You’re only required to implement the non-generic IEnumerable interface, but its generic counterpart derives from the non-generic version, so it works too.

The compiler translates every element in the comma-separated list into an Add( ) call, starting with the first. In nested’s case, the compiler creates the first nested List, Add( )ing each of its elements. It then Add( )s that first List to nested. The compiler repeats the process for the two subsequent Lists.

IEnumerable has no Add( ) method, so it appears odd to require initializable types to implement it. It would seem that any type with an Add( ) method could theoretically qualify. However, the word “add” is overloaded, for example “add an element to a sequence” or “add these two numbers.” Early drafts of C# 3.0’s standard actually required the initialized type to implement ICollection<T> (which has an Add( )) instead of IEnumerable. However, most “collection” types implement IEnumerable, and many of these have an Add( ) method, but only a few implement ICollection<T>.20 ICollection<T> doesn’t fit many container types, whereas IEnumerable naturally fits almost all of them by its simplicity. Thus C# 3.0 defines “collection” as a type that implements IEnumerable and has an accessible Add( ) method.

The compiler’s normal overload resolution determines the appropriate Add( ) call. You call multi-argument Add( ) methods by enclosing the arguments in additional curly braces:

//: SimpleNewFeatures\AddOverloads.cs
using System.Collections;

class SomeCollection : IEnumerable {
  public IEnumerator GetEnumerator() { return null; }
  public void Add(int i) { ("Add(" + i + ")").P(); }
  public void Add(char c) { ("Add(" + c + ")").P(); }
  public void Add(int i, char c) {
    ("Add(" + i + ", " + c + ")").P();
  }
}

class AddOverloads {
  static void Main() {
    var someCollection =
      new SomeCollection { 5, 'x', { 10, 'z' } };
  }
} /* Output:
Add(5)
Add(x)
Add(10, z)
*///:~

20 You usually satisfy foreach’s requirement by implementing IEnumerable. (See [[[reference]]] for details.)

The compiler resolves the appropriate Add( ) method for each value in the initializer list.
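As with object initializers, it can help to see a collection initializer next to its hand-written equivalent. The Log type below is hypothetical, written only for this sketch, and ByHand( ) is an approximation of the calls the compiler generates rather than its exact output.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical collection: IEnumerable plus two Add() overloads.
class Log : IEnumerable {
  readonly List<string> entries = new List<string>();
  public void Add(int code) {
    entries.Add("code " + code);
  }
  public void Add(string key, int value) {
    entries.Add(key + "=" + value);
  }
  public IEnumerator GetEnumerator() {
    return entries.GetEnumerator();
  }
  public int Count { get { return entries.Count; } }
  public string this[int index] { get { return entries[index]; } }
}

static class CollectionInitializerSketch {
  public static Log WithInitializer() {
    // Inner braces select the two-argument Add():
    return new Log { 7, { "retries", 3 } };
  }
  // Roughly the code the initializer stands for:
  public static Log ByHand() {
    var temp = new Log();
    temp.Add(7);
    temp.Add("retries", 3);
    return temp;
  }
  static void Main() {
    foreach (string entry in WithInitializer())
      Console.WriteLine(entry);
  }
}
```

Log never implements any ICollection interface; implementing IEnumerable and exposing accessible Add( ) methods is enough.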

Exercise 18: Use collection initializer syntax to create a Dictionary that stores a random number for the Key, and an IEnumerable of random numbers and random length for the Value.

Exercise 19: Write code to prove whether a Stack, a LinkedList, and a Queue meet C# 3.0’s definition of “collection.”

Exercise 20: When you have an IEnumerable type without an Add( ) method, but a valid Add( ) extension method is in scope, does your type then qualify as a C# 3.0 collection type?

Exercise 21: Write code that proves the Add( ) method must be perfectly cased (i.e. “Add”, not “ADD” or “add”).

Exercise 22: Write code that proves the compiler initializes nested collections immediately, Add( )ing each one individually instead of creating them all and then adding them as a batch at the end.

Anonymous types

C# 3.0’s anonymous types are types that store data and contain no user code. The compiler generates the classes with properties that have associated backing fields.21 To declare an anonymous type, follow new with curly braces:

//: SimpleNewFeatures\BasicAnonymousTypes.cs
class BasicAnonymousTypes {
  static int SomeProperty { get { return 5; } }
  static void Main() {
    // Create an anonymous type:
    var type1 = new { SomeProperty };
    type1.SomeProperty.AssertEquals(5);
    var type2 = new { DifferentFieldName = SomeProperty };
    type2.DifferentFieldName.AssertEquals(5);
    //c! type2.SomeProperty.AssertEquals(5);
    var type3 = new { AnotherFieldName = SomeProperty,
      SomeProperty };
    type3.AnotherFieldName.AssertEquals(5);
    type3.SomeProperty.AssertEquals(5);
    int someVariable = 10;
    var type4 = new {
      SomeProperty,
      someVariable,
      AnotherProperty = someVariable * 10,
    };
    type4.someVariable.AssertEquals(10);
    type4.SomeProperty.AssertEquals(5);
    type4.AnotherProperty.AssertEquals(100);
  }
} ///:~

You reference the property names with normal syntax (as seen with the AssertEquals( ) calls). The compiler performs the normal checks to ensure that you use the properties correctly.

21 In C++ these are sometimes called “PODS,” i.e., Plain Old Data Structures, and in Java they are called POJOs (Plain Old Java Objects), but neither provides direct language support.

The compiler generates only properties, not public fields. The compiler infers the name and type of type1’s only property, SomeProperty, from its initialization expression. Since we use “SomeProperty” in type1’s anonymous type declaration, the compiler gives the anonymous type a “SomeProperty” property to which it assigns BasicAnonymousTypes.SomeProperty’s value.

You explicitly name an anonymous type’s property by stating a name followed by an assignment (as we did with DifferentFieldName). Note that type2’s type has no SomeProperty property. As type3 and type4 show, you can mix the two techniques any way you like. The optional comma that ends type4’s declaration makes it easier to add lines to it later. Notice that we use a local variable to create type4’s someVariable property.

Anonymous type declarations look much like ordinary object initializer lists but without a type name; they instead take the unique (hidden) type name that the compiler generates. The generated type inherits directly from object.

Anonymous types are immutable,22 so no generated property has a set accessor. The compiler converts anonymous type initializations to constructor arguments rather than setting each property after it runs a default constructor.

var is the only way to declare a reference to an anonymous type or sequence of anonymous types. You use C# 3.0’s new array declaration syntax to create arrays of anonymous types:

//: SimpleNewFeatures\ArraysOfAnonymousTypes.cs
class ArraysOfAnonymousTypes {
  static void Main() {
    var tuples = new[] {
      new { FirstName = "Joe", LastName = "Jewkes"},
      new { FirstName = "Sarah", LastName = "Newby"},
      new { FirstName = "John", LastName = "Freeman"}
    };
    foreach(var tuple in tuples)
      tuple.P();
  }
} /* Output:
{ FirstName = Joe, LastName = Jewkes }
{ FirstName = Sarah, LastName = Newby }
{ FirstName = John, LastName = Freeman }
*///:~

22 One significant advantage to the immutability of anonymous types is in threading; it’s impossible for multiple threads to interfere with each other via immutable objects.

We can’t explicitly state the tuples array type because the compiler generates anonymous type names for the array elements. C# 3.0’s array-creation syntax tells the compiler to insert the compiler-generated type name.

The compiler considers two anonymous types to be the same type only when they have identical property names with identical types in identical order. When an anonymous type fails to meet any of these requirements, the compiler generates a different type:

//: SimpleNewFeatures\DifferentAnonymousTypes.cs
class DifferentAnonymousTypes {
  static void Main() {
    // Same field names of same type in same order
    // use same compiler-generated anonymous type:
    var anonymous1 =
      new { Property1 = 5, Property2 = "Hello" };
    var anonymous2 =
      new { Property1 = 10, Property2 = "London" };
    anonymous1.GetType()
      .AssertEquals(anonymous2.GetType());
    // Different property names produce
    // different anonymous types:
    var type1 = new { SomeProperty = 5 };
    var type2 = new { AnotherProperty = 5 };
    type1.GetType().Equals(type2.GetType()).False();
    // Different property types make
    // different anonymous types:
    var type3 = new { SomeProperty = 5 };
    var type4 = new { SomeProperty = "5" };
    type3.GetType().Equals(type4.GetType()).False();
    // Different property order creates
    // different anonymous types:
    var type5 = new { FirstProperty = 5,
      SecondProperty = 5 };
    var type6 = new { SecondProperty = 5,
      FirstProperty = 5 };
    type5.GetType().Equals(type6.GetType()).False();
  }
} ///:~
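One practical consequence of this type sharing is that you can reassign a var variable, or build an array, from separately written anonymous expressions, as long as their shapes match. The sketch below uses hypothetical names (SharedAnonymousTypeSketch, point, other) and is not one of the book’s build-verified listings.

```csharp
using System;

static class SharedAnonymousTypeSketch {
  public static bool Run() {
    var point = new { X = 1, Label = "start" };
    // Compiles only because both expressions have identical
    // property names, types, and order:
    point = new { X = 2, Label = "end" };
    var other = new { X = 3, Label = "other" };
    // The array's element type is the shared anonymous type:
    var points = new[] { point, other };
    return points.GetType().GetElementType() == point.GetType()
      && points[0].X == 2
      && points[1].Label == "other";
  }
  static void Main() {
    Console.WriteLine(Run());
  }
}
```

Swapping the order of X and Label in either expression would turn the reassignment into a compile error.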

The compiler overrides ToString( ), Equals( ), and GetHashCode( ) when it generates an anonymous type:

//: SimpleNewFeatures\GeneratedMethods.cs
static class GeneratedMethods {
  static void Main() {
    var anonymous =
      new { Property1 = 5, Property2 = "Hello" };
    anonymous.ToString().AssertEquals(
      "{ Property1 = 5, Property2 = Hello }");
    anonymous.GetHashCode().AssertEquals(1231812064);
    anonymous =
      new { Property1 = 5, Property2 = "Hello" };
    // Same hash code:
    anonymous.GetHashCode().AssertEquals(1231812064);
    anonymous =
      new { Property1 = 5, Property2 = "Hi" };
    anonymous.ToString().AssertEquals(
      "{ Property1 = 5, Property2 = Hi }");
    // GetHashCode() value changes:
    anonymous.GetHashCode().AssertEquals(392710151);
    // Generated Equals() compares property values:
    var another =
      new { Property1 = 5, Property2 = "Hi" };
    anonymous.AssertEquals(another); // Call Equals()
    // No == operator generated:
    (anonymous == another).False();
    another =
      new { Property1 = 5, Property2 = "Howdy" };
    anonymous.Equals(another).False();
  }
} ///:~

The compiler-generated GetHashCode( ) calculates its value from the generated type’s property values rather than from the default implementation.23 We show that GetHashCode( ) returns a different value when we change a property value. object’s implementation of Equals( ) returns true when this and the Equals( ) argument reference the same object. However, for anonymous types, the compiler generates an Equals( ) override that compares each property value in the anonymous type to its corresponding property value in the other instance: //: SimpleNewFeatures\AnonymousEquals.cs class HeldType { string id; public HeldType(string id) { this.id = id; } public override bool Equals(object obj) { // Trace statement proves this method is called: "Equals()".P(id); return true; } } class AnonymousEquals { static void Main() { var typeHolder = new { Property = new HeldType("held1") }; // AssertEquals() calls Equals(): typeHolder.AssertEquals(typeHolder); var typeHolder2 = new { Property = new HeldType("held2") }; typeHolder.AssertEquals(typeHolder2); typeHolder = new { Property = (HeldType)null }; "Trying null...".P();

23 For reference types, the default implementation returns a value based on the object’s identity rather than its contents. That value is not guaranteed to be unique, and it can be reused after the garbage collector reclaims the object. For value types, the default implementation uses reflection to compute a value from the field values.


    typeHolder.Equals(typeHolder2).False();
    typeHolder2 = new { Property = (HeldType)null };
    typeHolder.AssertEquals(typeHolder2);
    // Two anonymous types of differing
    // types are never equal:
    var oneType = new { A = 5, B = 5 };
    var twoType = new { B = 5, A = 5 };
    oneType.Equals(twoType).False();
  }
} /* Output:
held1: Equals()
held1: Equals()
Trying null...
*///:~

Notice that typeHolder calls Equals( ) on its members (as shown in the output “held1: Equals()”). For null values to be equal, the matching property values from both objects must be null. In typeHolder and typeHolder2’s definitions, we must cast null to HeldType because the compiler cannot infer type information from just null.

Anonymous types differ from normal types only in that we cannot explicitly state the compiler-generated anonymous type name. However, we can reflect an anonymous type to show what the compiler creates:

//: SimpleNewFeatures\AnonymousReflection.cs
class AnonymousReflection {
  static void Main() {
    var type1 = new { Property = 5 }.GetType();
    type1.Name.AssertEquals("<>f__AnonymousType0`1");
    // Inherits from Object:
    type1.BaseType.AssertEquals(typeof(object));
    // Closed type:
    type1.ToString()
      .AssertEquals("<>f__AnonymousType0`1[System.Int32]");
    // It's a generic type with one type argument:
    type1.GetGenericArguments().Length.AssertEquals(1);
    // Only has one property:
    type1.GetProperties().Length.AssertEquals(1);
    var propertyInfo = type1.GetProperties()[0];
    propertyInfo.Name.AssertEquals("Property");
    propertyInfo.PropertyType.AssertEquals(typeof(int));


    // Immutable:
    propertyInfo.CanRead.True();
    propertyInfo.CanWrite.False();
    // Has one constructor for the initialization:
    type1.GetConstructors().Length.AssertEquals(1);
    type1.GetConstructors()[0].ToString()
      .AssertEquals("Void .ctor(Int32)");
    // Reuses the generic type from before because
    // property names match (but property types don't):
    var type2 = new { Property = "Hi" }.GetType();
    type2.Name.AssertEquals(type1.Name);
    // But the "closed" type differs:
    type2.ToString().AssertEquals(
      "<>f__AnonymousType0`1[System.String]");
    var type3 = new { DifferentName = "Hi" }.GetType();
    // Generates new anonymous type
    // since property names differ:
    type3.Name.AssertEquals("<>f__AnonymousType1`1");
  }
} ///:~

This example uses reflection, which we explore in detail at [[[reference]]]. The compiler generates the name <>f__AnonymousType0`1 for the first anonymous type. The “`1” indicates one generic type argument (the arity). The compiler reuses the same generic type for the second anonymous type because its property names are identical. However, two anonymous types with different closed types are not the same.24 The last anonymous type has a different property name, so the compiler generates a new generic type for it.

Do not rely on such reflected information; we use it here as a learning tool only. These implementation details, current at this book’s printing, could change in future versions.25

24 A closed type is a generic type with actual type arguments. For example, List<int> is a closed type. (See the Generics chapter in [[[reference]]].)

25 Our build system discovers any such changes, so we will update the code distribution accordingly.


Although you can’t use the generated type name explicitly, all the normal compile-time checks still apply; you can’t assign an incorrect type to a generated property, for example. Anonymous types are just compiler-generated classes, so generic type inference is the same. The compiler infers the generic parameter when you pass an anonymous type to a generic method (a simple feature you’ll appreciate in the Query Expressions chapter).
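As a minimal sketch of that inference, the following passes anonymous-type instances to a generic method. ListOf( ) and the class name are hypothetical helpers invented for this illustration, and the sketch uses Console.WriteLine( ) rather than this book's test library so that it stands alone:

```csharp
using System;
using System.Collections.Generic;

public static class AnonymousInference {
  // Hypothetical helper (not part of the book's library):
  // the compiler infers T from the arguments.
  public static List<T> ListOf<T>(params T[] items) {
    return new List<T>(items);
  }
  public static void Main() {
    // T is inferred as the anonymous type, which we
    // could not have named explicitly:
    var list = ListOf(
      new { Name = "a", Length = 1 },
      new { Name = "bc", Length = 2 });
    // Same property names, types, and order, so this
    // is the same anonymous type:
    list.Add(new { Name = "def", Length = 3 });
    foreach(var item in list)
      Console.WriteLine(item.Name + ": " + item.Length);
    // Compile-time checks still apply:
    //c! list.Add(new { Name = 5, Length = 3 });
  }
}
```

Because list has the static type List of that anonymous type, the commented-out Add( ) fails at compile time rather than at run time.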

Exercise 23: Add five instances of some anonymous type to a List. Give the anonymous type three integer fields: a value, its square, and its cube.

Exercise 24: Make two instances of an anonymous type that holds another anonymous type that holds a HeldType from AnonymousEquals.cs. Call Equals( ) on the first anonymous type, and notice that HeldType.Equals( )’s trace statement prints successfully (indicating that the second-level Equals( ) is indeed generated).

Exercise 25: (Advanced.) Use reflection to determine whether the compiler reuses the same generated generic type if you change only the order of the identical property names and property types of two anonymous types.

Lambda expressions

Anonymous methods, inherently verbose,26 can clutter your code, making it hard to read (though they are a convenient way to pass one temporary method to another). C# 3.0 introduces lambda expressions that need only the parameters and the statements to make an anonymous method:

//: SimpleNewFeatures\LambdaIntro.cs
using System;
class LambdaIntro {
  static void
  CheckLessThanFive(Predicate<int> predicate) {
    predicate(10).False();
    predicate(0).True();
  }
  static void Main() {

26 Even anonymous methods that contain only one statement are inherently bulky: you must write the delegate keyword, declare parameters with their types and encapsulate them in parentheses, add braces, and add semicolons to the end of statements. So rather than write an anonymous method that spans multiple lines, make it a member of your class.


    // Anonymous method:
    CheckLessThanFive(delegate(int i) { return i < 5; });
    // Equivalent lambda expression:
    CheckLessThanFive(i => i < 5);
  }
} ///:~

Note that you can declare a lambda expression much more concisely than an anonymous method by removing unnecessary tokens such as parentheses, the delegate keyword, etc. The => symbol identifies a lambda expression, and separates the arguments from the statements. Here, the lambda expression declares i as an argument to the method, and returns the value of the expression i < 5. The anonymous method and the lambda expression both produce nearly identical MSIL.

Note that the compiler preserves static type safety when it infers lambda arguments. Anything you do with the argument that is undefined for the inferred type causes an error. The compiler examines the lambda expression’s defining context to infer i’s type. Here it infers that i is an int because predicate’s type is Predicate<int>.

The type of a single-expression lambda’s result is the lambda’s return type, which must implicitly convert to the target delegate’s return type (the same as with anonymous methods). Although it’s common to write single-statement lambda expressions like the one above, nothing limits your lambda expression to just one line or parameter:

//: SimpleNewFeatures\LambdaVariations.cs
using System;
delegate void Parameterless();
class LambdaVariations {
  static void Main() {
    // Assign the lambda expression to a variable:
    Predicate<int> predicate1 = i => i < 6;
    predicate1(4).True();
    predicate1(7).False();
    // Can't use var because compiler
    // relies on target delegate type:
    //c! var illegal = j => j < 6;


    // Can put parentheses on a single parameter,
    // but it's not necessary:
    Predicate<int> predicate2 = (i) => i < 6;
    // Parentheses required with more than one parameter:
    Comparison<int> comparer = (l, r) => l - r;
    comparer(5, 10).AssertEquals(-5);
    // Explicitly typed parameters
    comparer = (int l, int r) => l - r;
    comparer(5, 10).AssertEquals(-5);
    // Parameterless lambda requires parentheses:
    Parameterless parameterless =
      () => Console.Write("hello, ");
    for(int i = 0; i < 3; i++)
      parameterless();
    Console.WriteLine();
    // More than one statement requires braces:
    parameterless = () => {
      Console.Write(1);
      Console.Write(2);
      Console.WriteLine();
    };
    parameterless();
    // return keyword required with braces
    Predicate<int> predicate3 = i => { 3.P(); return true; };
    predicate3(20).True();
    // Can't have return without braces
    //c! predicate = i => return true;
  }
} /* Output:
hello, hello, hello,
12
3
*///:~

If there’s only one argument, you don’t need parentheses. Multiple-statement lambda expressions require curly braces. The braces clarify which statements make up your lambda expression. You use the return keyword when your braced lambda expression returns a value.

You need not specify an argument type that the compiler can infer. In the definition of illegal, we cannot use var for type inference because there’s not enough information to infer j’s type. In predicate1’s definition, Predicate<int> provides the extra context the compiler requires to infer the argument types and return types of the lambda expression.

Even when they have extra tokens, lambda expressions are much more succinct than anonymous methods. We use several lambda expressions in a single statement in the Query Expressions chapter.

The compiler inserts a return when you assign a lambda expression to a delegate that returns anything but void:

//: SimpleNewFeatures\LambdasMayReturnVoid.cs
using System;
class LambdasMayReturnVoid {
  static void Main() {
    // Func returns int:
    Func<int> func = () => 5;
    // Action takes int, returns void:
    Action<int> action = i => i.P();
  }
} ///:~

Func returns a value (int, in this case; we introduce Func shortly), so the compiler inserts the return keyword before the ‘5’ in its generated method. Action doesn’t return anything, so the compiler doesn’t insert a return statement. Consider the error message in this example:27

//: SimpleNewFeatures\LambdaExpressionIsAnExpression.cs
// {CompileTimeError: Only assignment, call, increment,
// decrement, and new object expressions can be used as
// a statement}
using System;
class LambdaExpressionIsAnExpression {
  static void Main() {
    Action<int> action2 = i => i + 1;
  }
} ///:~

27 The CompileTimeError flag tells our build system that the example should not compile and indicates the compiler error string. Our build system verifies that error.


The compiler detects that Action returns void and inserts no return before i + 1. If we do this:

//: SimpleNewFeatures\PlainExpression.cs
// {CompileTimeError: Only assignment, call, increment,
// decrement, and new object expressions can be used as
// a statement}
using System;
class PlainExpression {
  static void Main() {
    int i = 5;
    i + 1;
  }
} ///:~

The error messages are the same. The last line in Main( ) causes the error because it has no side effects (and makes no sense). Only assignment, call, increment, decrement, and new object expressions can act as statements by themselves because of their side effects.

We’d get an entirely different error message if the compiler inserted a return:

//: SimpleNewFeatures\BadReturn.cs
// {CompileTimeError: Since 'System.Action<int>' returns
// void, a return keyword must not be followed by an
// object expression}
using System;
class BadReturn {
  static void Main() {
    Action<int> action2 = i => { return i + 1; };
  }
} ///:~

We added braces in order to include the return. The compiler never inserts a return inside a braced lambda.

Exercise 26: Write a single-argument lambda expression whose parameter the compiler infers as a DateTime. Access the Day property from within the lambda. Make another lambda whose parameter type the compiler infers as an int. Again, try to access the Day property. Prove that the compiler catches the error.


Exercise 27: Make a List of random TimeSpans between 0 and 24 hours. Use FindAll( ) to locate all TimeSpans less than 12 hours, Exists( ) to see if any have an Hours property value of five, TrueForAll( ) to ensure each is between 0 and 24 hours, and ConvertAll( ) to return just the Hours portion of each. Use lambda expressions for all calls.

Exercise 28: Write a Generate( ) method that returns an IEnumerable<T>. Have it take two arguments: an int with the number of items to generate, and a no-arg delegate that returns a type T. Make it execute the delegate the given number of times, yielding each result. Hint: You may wish to use the Func delegate discussed in the next section instead of making your own.

Exercise 29: Fill two Lists with the same random numbers. Sort( ) both lists in descending order, and use RemoveAll( ) to take out the odd numbers. For the first list, use anonymous methods; for the second list, use lambda expressions. Verify that they produce identical results (TrueForAll( )).

Func

C# 3.0 introduces System.Func, which is the core delegate type for many of C# 3.0’s features. Func replaces most of the generic delegate types introduced in version 2.0. Func has as many as five generic overloads. In the declaration for a Func, the first n-1 generic arguments specify the delegate argument types, and the last generic argument always specifies the return type.

//: SimpleNewFeatures\FuncDemo.cs
using System;
class FuncDemo {
  static void Main() {
    Predicate<int> predicate = x => x < 3;
    predicate(2).True();
    // Can just use Func instead:
    Func<int, bool> predicate2 = x => x < 3;
    predicate2(2).True();
    Comparison<int> comparer1 =
      (left, right) => left - right;
    comparer1(5, 3).AssertEquals(2);
    // Equivalent Func replacement:


    Func<int, int, int> comparer2 =
      (left, right) => left - right;
    comparer2(5, 3).AssertEquals(2);
    // Func can't replace Action because
    // Action doesn't return a value:
    Action<int> action1 = x => x.P();
    action1(3);
    //c! Func action2 = x => x.P();
  }
} /* Output:
3
*///:~

You can use Func if your method returns a value, instead of searching the standard namespaces for an appropriate delegate type. Here, we show how to use Func instead of Predicate or Comparison. However, do use Predicate when it makes your code more readable; it requires only one generic argument, whereas Func requires two. Func’s last generic argument always determines its return type.
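Because Func is an ordinary generic delegate, you can also write methods that accept and return Funcs. The following is a minimal sketch; Compose( ) and the class name are hypothetical, invented for this illustration, and the sketch prints with Console.WriteLine( ) so that it stands alone:

```csharp
using System;

public static class FuncComposition {
  // Hypothetical helper: returns a delegate that
  // applies first, then passes the result to second.
  public static Func<A, C> Compose<A, B, C>(
    Func<A, B> first, Func<B, C> second) {
    return x => second(first(x));
  }
  public static void Main() {
    Func<int, int> timesTen = x => x * 10;
    Func<int, bool> lessThanFifty = x => x < 50;
    // A, B, and C are inferred from the delegate types:
    Func<int, bool> check =
      Compose(timesTen, lessThanFifty);
    Console.WriteLine(check(4)); // Tests 40 < 50
    Console.WriteLine(check(5)); // Tests 50 < 50
  }
}
```

In each Func declaration the last generic argument is the return type, so Func<int, bool> here has the same shape as Predicate<int>.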

Exercise 30: Write a basic selection sort algorithm that returns an IEnumerable and takes an IEnumerable and a Func as the comparer. Test your algorithm.


Query Expressions

C# 3.0 query expressions provide SQL-like syntax for querying objects. Query expressions change the way we think about data. Database programming has typically been considered a foreign discipline, outside the world of “normal programming.” Query expressions don’t require that you leave your programming paradigm to work with a database. They provide a bridge between the two.

Basic LINQ

You can query objects as abstract data stores with LINQ (Language INtegrated Query, pronounced “link”). These objects range from normal in-memory objects to those that abstract away other data stores such as databases, files, web services, etc.

Query syntax is very different from ordinary programming prose. Query expressions return an IEnumerable<T> or IQueryable<T> object.1 Consider these simple query expressions:

//: QueryExpressions\SimpleIntro.cs
using System.Linq;
using System.Collections.Generic;
class SimpleIntro {
  static void Main() {
    int[] numbers = { 1, 2, 3, 4, 5 };
    IEnumerable<int> copy =
      from number in numbers
      select number;
    copy.P(POptions.NoNewLines);
    object.ReferenceEquals(numbers, copy).False();
    var timesTen =

1 In fact, query expressions can return any type (as we will later demonstrate), but they ordinarily return either an IEnumerable or an IQueryable.


      from number in numbers
      select number * 10;
    timesTen.P(POptions.NoNewLines);
  }
} /* Output:
[1, 2, 3, 4, 5]
[10, 20, 30, 40, 50]
*///:~

Every query expression begins with from followed by an identifier, then in followed by the source. from is like foreach, except the compiler infers the iteration variable type,2 and from is not a native looping construct. Most query expressions end with a select clause to choose the objects they return.

In this example, the iteration variable number represents each element of the source numbers. The first query expression selects each element in the array, which produces a copy of numbers. The second query expression performs the same selection as the first, then multiplies each value by ten in the select clause. We usually use var with queries, as we did in the second query. However, in the first query, we wanted to show copy’s compile-time type.

While from may seem like another foreach, it’s declarative because we express our goal but not how to achieve it. So a query expression can say “get all the numbers” or “get all the numbers times ten,” while a foreach requires that you write code to multiply each number by ten, then store the result in a temporary List, and finally return the List upcast to an IEnumerable.

When you understand how query expressions work, you’ll appreciate their power, brevity and clarity (and that of such related technologies as LINQ to SQL, LINQ to XML, etc. found in later chapters).

Contextual keywords such as from are keywords only in the context of a query. To use one as an ordinary identifier within a query, precede it with an @:

//: QueryExpressions\ContextualKeywords.cs
using System.Linq;

2 We will show you how you can also specify the type.


class ContextualKeywords {
  static void Main() {
    var from = new[] { 1, 2, 3 };
    /*c!
    var result1 =
      from f in from
      select f + 1;
    */
    // Using '@':
    var result2 =
      from f in @from
      select f + 1;
  }
} ///:~

We used an @ symbol to access the array from, since its name is also a contextual keyword inside a query. However, you should avoid situations that force you to use @, because they are confusing.
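To make the earlier contrast between a declarative query and a hand-written foreach concrete, here is a minimal sketch of both approaches to “get all the numbers times ten.” The class and method names are hypothetical, chosen only for this illustration, and the sketch uses Console.WriteLine( ) so that it stands alone:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeclarativeVersusImperative {
  // Imperative version: we spell out every step.
  public static IEnumerable<int>
  TimesTenLoop(int[] numbers) {
    List<int> results = new List<int>();
    foreach(int number in numbers)
      results.Add(number * 10);
    return results; // Upcast to IEnumerable<int>
  }
  // Declarative version: we state only the goal.
  public static IEnumerable<int>
  TimesTenQuery(int[] numbers) {
    return from number in numbers
           select number * 10;
  }
  public static void Main() {
    int[] numbers = { 1, 2, 3, 4, 5 };
    // Both produce the same sequence of values:
    Console.WriteLine(TimesTenLoop(numbers)
      .SequenceEqual(TimesTenQuery(numbers)));
  }
}
```

The two methods yield the same values, but the loop version needs the temporary List and three separate steps, whereas the query states the result in a single expression.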

Translation

Query expressions’ type-safe SQL-like syntax requires very few additions to C#.3 The compiler translates query clauses to normal method calls:

//: QueryExpressions\SelectTranslation.cs
using System.Linq;
using System.Collections.Generic;
class SelectTranslation {
  static void Main() {
    int[] numbers = { 1, 2, 3, 4, 5 };
    IEnumerable<int> timesTen =
      from number in numbers
      select number * 10;
    timesTen.P(POptions.NoNewLines);
    // The compiler translates the query into:
    timesTen = numbers.Select(number => number * 10);
    timesTen.P(POptions.NoNewLines);
    // Which further translates,

3 However, notice that all the reserved words in query expressions must be in lower case, whereas SQL keywords are not case-sensitive.


    // using extension method rules:
    timesTen = Enumerable.Select(numbers,
      number => number * 10);
    timesTen.P(POptions.NoNewLines);
  }
} /* Output:
[10, 20, 30, 40, 50]
[10, 20, 30, 40, 50]
[10, 20, 30, 40, 50]
*///:~

Here, the compiler converts the select clause into a Select( ) instance method call on numbers. It also converts the select expression into a full lambda expression, using from’s iteration variable name as the lambda’s argument name. Since Array has no Select( ) method, the compiler’s search for possible extension methods finds System.Linq.Enumerable.Select( ), which takes an IEnumerable<TSource> for its first argument and a Func<TSource, TResult> for its second.

The query expression imports no special types (as foreach does). The compiler won’t find an appropriate Select( ) unless you either include a using System.Linq statement or bring your own Select( ) into scope.

Recall that the generic delegate Func takes as many as four arguments of any type and returns a non-void value. The compiler-converted lambda becomes our Func. The from clause, which has no direct translation, only states the source name and brings the lambda argument name into scope.

Select( ) invokes the Func on each item to yield each result. Its code might look something like this:

//: QueryExpressions\SelectCode.cs
using System;
using System.Collections.Generic;
static class CustomEnumerable {
  public static IEnumerable<S>
  Select<T, S>(this IEnumerable<T> collection,
    Func<T, S> selector) {
    "Select()".P();
    foreach(T element in collection)
      yield return selector(element);


  }
  // Other query expression helper methods go here...
}
class SelectCode {
  static void Main() {
    int[] numbers = { 1, 2, 3, 4, 5 };
    var timesTen =
      from number in numbers
      select number * 10;
    timesTen.AssertEquals(new[] { 10, 20, 30, 40, 50 });
  }
} /* Output:
Select()
*///:~

Once the compiler converts select to .Select( ), our Select( ) method is the only one in scope, and it satisfies the extension method lookup, so the compiler uses it (as its trace statement shows). The compiler resolves query clauses to any extension method with the proper signature, not just Enumerable’s, as the exclusion of “using System.Linq” in this example demonstrates. In fact, even if we brought System.Linq.Enumerable into scope with a using statement, the compiler would resolve to the still “nearer” in scope CustomEnumerable.Select( ).

Enumerable and Queryable4 methods match every possible query expression clause, such as select (and many others shown throughout this chapter). Although you usually want the lookup to resolve to Enumerable’s methods or Queryable’s methods, you can provide your own as we did here.

The compiler first searches the source for any ordinary method that satisfies the lookup, then for possible extensions, so Select( ) needn’t always be an extension method. The compiler issues an error if it finds no matching method:

//: QueryExpressions\NoValidOverload.cs
// {CompileTimeError: Could not find an implementation
// of the query pattern for source type

4 Our focus here is Enumerable; you will see Queryable later, and the idea is much the same.


// 'NoValidOverload'.  'Select' not found.}

class NoValidOverload {
  static void Main() {
    var source = new NoValidOverload();
    var result =
      from s in source
      select s;
  }
} ///:~
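By contrast, giving the source type an ordinary instance Select( ) method is enough to satisfy the lookup, with no extension method and no using System.Linq. The following is a minimal sketch; Box and InstanceSelect are hypothetical names invented for this illustration, and the sketch prints with Console.WriteLine( ) so that it stands alone:

```csharp
using System;

public class Box {
  int value;
  public Box(int value) { this.value = value; }
  // Ordinary instance method. The compiler finds it
  // before it searches for extension methods:
  public string Select(Func<int, string> selector) {
    return selector(value);
  }
}
public class InstanceSelect {
  public static void Main() {
    // Translates to:
    // new Box(21).Select(v => "doubled: " + (v * 2))
    string result =
      from v in new Box(21)
      select "doubled: " + (v * 2);
    Console.WriteLine(result);
  }
}
```

Note that this Select( ) returns a string rather than a sequence; the compiler only requires that the translated method call compiles.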

Select( )’s generic Func takes two type arguments: the source element type and the result type. You can therefore return a different type than your select data source provides, as this example shows:

//: QueryExpressions\SelectingADifferentType.cs
using System.Linq;
class SelectingADifferentType {
  static void Main() {
    int[] numbers = { 1, 2, 3, 4, 5 };
    var lessThanThree =
      from i in numbers
      select i < 3;
    lessThanThree.P(POptions.NoNewLines);
  }
} /* Output:
[True, True, False, False, False]
*///:~

Thus, our lambda takes an int and returns a bool, which indicates whether an item’s value is less than 3. However, knowing the values are less than three isn’t as useful as retrieving them. We use a where to retrieve those items:

//: QueryExpressions\WhereClause.cs
using System.Linq;
class WhereClause {
  static void Main() {
    int[] numbers = { 1, 2, 3, 4, 5 };
    var lessThanThree =
      from i in numbers
      where i < 3
      select i;
    var result = new[] { 1, 2 };
    lessThanThree.AssertEquals(result);


    // Could translate to:
    lessThanThree = numbers
      .Where(i => i < 3)
      .Select(i => i);
    lessThanThree.AssertEquals(result);
  }
} ///:~

Like select, where is a purely syntactic transformation. The compiler chains successive clauses as method invocations; in this example, Where( )’s return value becomes Select( )’s source. We encourage you to complete the translation steps in the exercises where applicable.
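As a minimal sketch of that chaining, here is a where followed by a select that actually changes each value, next to a hand translation. The class and method names are hypothetical, and the sketch uses Console.WriteLine( ) so that it stands alone:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WhereThenSelect {
  public static IEnumerable<int> AsQuery(int[] numbers) {
    return from i in numbers
           where i < 3
           select i * 10;
  }
  // Hand translation: Where()'s return value is
  // the source for Select().
  public static IEnumerable<int>
  AsMethodCalls(int[] numbers) {
    return numbers
      .Where(i => i < 3)
      .Select(i => i * 10);
  }
  public static void Main() {
    int[] numbers = { 1, 2, 3, 4, 5 };
    Console.WriteLine(AsQuery(numbers)
      .SequenceEqual(AsMethodCalls(numbers)));
  }
}
```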

Exercise 1: From a sequence of strings, select the ones with no vowels, uppercasing the ones you select.

Exercise 2: Look up Enumerable.Range( ) and use it to create a number source to write a query that calculates the squares of the odd numbers from 1 to 100.

Exercise 3: Modify Select( ) in SelectCode.cs so the select clause returns 100 for each element instead of using the selector argument.5

Exercise 4: When you add a using for System.Linq to SelectCode.cs, does the compiler still resolve to CustomEnumerable’s Select( ) method or use Enumerable’s instead?

Exercise 5: Write a Where( ) method and use it in a query. Give it a trace statement to prove the compiler calls your version instead of Enumerable’s.

Exercise 6: Here’s a twist: create a Select( ) method that handles the following query expression:

bool returnedValue =
  from s in 5
  select 5.5;

5 We suggest, for practice with these exercises, that you first formulate the query and then translate it into Enumerable calls.


Degeneracy

Query expressions must end with either a select clause or a group…by clause (which we look at later). However, a select with a lambda that only returns its argument just wastes processing time; we call such clauses degenerate. The compiler removes degenerate Select( ) calls:

//: QueryExpressions\DegenerateSelect.cs
using System.Linq;
class DegenerateSelect {
  static void Main() {
    int[] numbers = { 1, 2, 3, 4, 5 };
    var lessThanThree =
      from i in numbers
      where i < 3
      select i;
    var result = new[] { 1, 2 };
    lessThanThree.AssertEquals(result);
    // Actual translation:
    lessThanThree = numbers
      .Where(i => i < 3); // No Select() here
    lessThanThree.AssertEquals(result);
  }
} ///:~

However, the compiler keeps a degenerate Select( ) call when removing it would leave the query as nothing but a direct reference to the original source:

//: QueryExpressions\SelectNotRemoved.cs
using System.Linq;
class SelectNotRemoved {
  static void Main() {
    int[] numbers = { 1, 2, 3, 4, 5 };
    var copy =
      from n in numbers
      select n;
    object.ReferenceEquals(numbers, copy).False();
  }
} ///:~


Here, only the numbers object would remain if the compiler removed the degenerate Select( ) call.

Exercise 7: Prove that the compiler drops degenerate select clauses by relying on the trace statement within Select( ) in SelectCode.cs. Make two queries, one with a degenerate select clause, and one without, and notice that the trace statement only prints for the non-degenerate select clause.

Exercise 8: Does the compiler consider a where true clause to be degenerate? That is, does it consider it a waste of processor cycles and thus eliminate the clause in the translation? Write code to prove your answer.

Chained where clauses

You can chain where clauses, just like Where( ) methods:

//: QueryExpressions\MultipleWhereClauses.cs
using System.Linq;
class MultipleWhereClauses {
  static void Main() {
    int[] numbers = { 1, 2, 3, 4, 5 };
    var betweenOneAndFive =
      from n in numbers
      where n < 5
      where n > 1
      select n;
    betweenOneAndFive.AssertEquals(new[] { 2, 3, 4 });
    // Translates to:
    var betweenOneAndFive2 = numbers
      .Where(n => n < 5)
      .Where(n => n > 1); // Select() is degenerate
    betweenOneAndFive.AssertEquals(betweenOneAndFive2);
    // Which further translates, via
    // extension method rules:
    var betweenOneAndFive3 =
      Enumerable.Where(
        Enumerable.Where(numbers, n => n < 5),
        n => n > 1);
    betweenOneAndFive2.AssertEquals(betweenOneAndFive3);
  }
} ///:~


The compiler invokes the where clauses in the order they appear. The first where filters for values less than 5, and the second determines which of those are greater than 1 (see footnote 6).

The rest of Main( ) shows the full translation process. The compiler first syntactically transforms multiple where clauses into a chain of Where( ) calls, and drops the degenerate Select( ). Since IEnumerable<T> has no Where( ) method, the compiler follows the standard extension method protocol, translating the instance method calls to static calls on Enumerable, and rewriting our chained Where( ) calls as nested Where( ) calls.

Notice that for each anonymous method it generates, the compiler repeats the variable name n. Each where clause becomes a Where( ) call with its own anonymous method argument. We verify identical results by checking all three queries.
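For comparison, a single where with both conditions joined by && produces the same values with one Where( ) call instead of two. The following is a minimal sketch; the class and method names are hypothetical, and the sketch uses Console.WriteLine( ) so that it stands alone:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChainedVersusCombined {
  // Two where clauses: two Where() calls,
  // each with its own lambda.
  public static IEnumerable<int> Chained(int[] numbers) {
    return from n in numbers
           where n < 5
           where n > 1
           select n;
  }
  // One where clause: one Where() call whose
  // lambda tests both conditions.
  public static IEnumerable<int> Combined(int[] numbers) {
    return from n in numbers
           where n < 5 && n > 1
           select n;
  }
  public static void Main() {
    int[] numbers = { 1, 2, 3, 4, 5 };
    Console.WriteLine(Chained(numbers)
      .SequenceEqual(Combined(numbers)));
  }
}
```

Which form reads better is a judgment call: separate clauses keep each condition on its own line, while && keeps the filter in one place.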

Exercise 9: Make a HasWhere object with a Where( ) instance method (instead of the static extension methods we’ve used thus far). Make another class that has Where( ) and Select( ) extension methods. Put trace statements in all methods (be sure to distinguish between the two Where( ) methods). Use an instance of HasWhere for a query expression’s source. Which Where( ) does the compiler use? Does this setup cause any issues with the Select( ) extension method (make sure your select isn’t degenerate)?

Introduction to Deferred Execution

When do queries actually execute? A query is an expression describing what you want. Only when you iterate through the results do the gears actually start to turn. This is called deferred execution.7 We’ll save the details for later in the book, but at this point you must understand the basics:

6 As we will see, the values pass through one by one, not as an entire set.

7 Some programmers will see a similarity to lazy evaluation here.


//: QueryExpressions\DeferredIntro.cs
using System.Linq;
class DeferredIntro {
  static void Main() {
    int[] numbers = { 1, 2, 3 };
    var result =
      from n in numbers
      select n * 10;
    result.AssertEquals(new[] { 10, 20, 30 });
    numbers[1] = 234; // Changes result:
    result.AssertEquals(new[] { 10, 2340, 30 });
  }
} ///:~

You might assume that changing a value in numbers after setting result to a query based on numbers would not affect result. However, you can see here that it does. If you are familiar with iterators8 (and C#’s yield mechanism), this should make some sense, since iterators are simply objects that store a data retrieval mechanism. They don’t execute that mechanism until you enumerate the results. You can also see this behavior in this example:

//: QueryExpressions\DeferredIteration.cs
using System;
using System.Collections.Generic;
static class MyExtensions {
  public static IEnumerable<S>
  Select<T, S>(this IEnumerable<T> collection,
    Func<T, S> selector) {
    "Select()".P();
    foreach(T element in collection)
      yield return selector(element);
  }
}
class DeferredIteration {
  static void Main() {

8 Later in the chapter we show details of how this works with iterators.


    int[] ints = { 1, 2, 3 };
    var result =
      from i in ints
      select i;
    result.P(POptions.NoNewLines);
    result.P(POptions.NoNewLines);
    result.P(POptions.NoNewLines);
  }
} /* Output:
Select()
[1, 2, 3]
Select()
[1, 2, 3]
Select()
[1, 2, 3]
*///:~

We write the query in Main( ) once, but print the results three times. Each time that P( ) iterates through the results, the code within Select( ) executes (as shown by the “Select( )” trace statement). Initially you’d expect the trace statement to print only once, but C# implements iterators in a way that causes it to print each time you iterate through the results.9 Deferred execution is essential when implementing composability, which we cover later in the chapter.
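If you want a query’s results fixed at one moment, you can iterate it once and store the values. The following minimal sketch uses Enumerable.ToList( ), which this chapter has not yet introduced, to take such a snapshot; the class and method names are hypothetical, and the sketch prints with Console.WriteLine( ) so that it stands alone:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredVersusSnapshot {
  // Returns the second element as seen by the deferred
  // query and by the snapshot, after the source changes.
  public static int[] SecondElements() {
    int[] numbers = { 1, 2, 3 };
    IEnumerable<int> deferred =
      from n in numbers
      select n * 10;
    // ToList() iterates the query immediately:
    List<int> snapshot = deferred.ToList();
    numbers[1] = 234;
    // deferred re-reads numbers; snapshot does not:
    return new[] { deferred.ElementAt(1), snapshot[1] };
  }
  public static void Main() {
    int[] seconds = SecondElements();
    Console.WriteLine(seconds[0]); // 2340
    Console.WriteLine(seconds[1]); // 20
  }
}
```

The deferred query runs again when ElementAt( ) iterates it, so it sees the modified array, while the List holds the values computed before the change.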

Multiple froms

We can use multiple from clauses, much like control structures such as foreach and for:

//: QueryExpressions\MultipleFroms.cs
using System.Linq;
using System.Collections.Generic;
class MultipleFroms {
  static void Main() {
    int[] numbers1 = { 1, 2 };
    int[] numbers2 = { 3, 4 };
    // Query expression:

9 See appendix [[[xxx]]] for a refresher on iterators if necessary.


    var additions1 =
      from n1 in numbers1
      from n2 in numbers2
      select n1 + n2;
    additions1.AssertEquals(new[] { 4, 5, 5, 6 });
    // Nested loops:
    var additions2 = new List<int>();
    foreach(int n1 in numbers1)
      foreach(int n2 in numbers2)
        additions2.Add(n1 + n2);
    additions2.AssertEquals(additions1);
  }
} ///:~

The query iterates through every element in numbers2 for every element in numbers1, adding each pair. The nested foreach loops do the same. However, from is not a built-in looping construct like foreach, so you can think of each from clause as virtually iterating through its source elements; from just states the iteration variable’s name and source.

Each from, pairing all its values with other froms, makes a Cartesian product, or a cross-join. This is sometimes referred to as an uncontrolled join because it combines all the elements without a condition. You don’t usually want this, but it has some uses. Later you’ll see inner joins and outer joins, which are more common.

The foreach approach requires an extra List, whereas query expressions generate their own sequence. You’ll learn how this works later in the chapter.

Here’s how the compiler translates the above query:

//: QueryExpressions\MultipleFromsTranslation.cs
using System.Linq;
class MultipleFromsTranslation {
  static void Main() {
    int[] numbers1 = { 1, 2 };
    int[] numbers2 = { 3, 4 };
    var additions1 =
      from n1 in numbers1
      from n2 in numbers2
      select n1 + n2;
    additions1.AssertEquals(new[] { 4, 5, 5, 6 });
    // Translates to:

64

C# Query Expressions

Preview Release 1.0

var additions2 = numbers1.SelectMany( n1 => numbers2, // n1 not used here (n1, n2) => n1 + n2); additions2.AssertEquals(additions1); // Converted via extension method rules: var additions3 = Enumerable.SelectMany(numbers1, n1 => numbers2, (n1, n2) => n1 + n2); additions3.AssertEquals(additions2); } } ///:~

SelectMany( ) takes the source collection and two delegates, expressed here as lambdas. The first delegate[10] returns, for each source element, the sequence of items to pair with it. The second delegate produces a value from every pair. Our custom implementation of SelectMany( ) makes this clearer:

//: QueryExpressions\SelectingMany.cs
using System;
using System.Collections.Generic;
static class ManySelector {
  public static IEnumerable<S> Select<T, S>(
    IEnumerable<T> source, Func<T, S> selector) {
    // One foreach:
    foreach(T tValue in source)
      yield return selector(tValue);
  }
  public static IEnumerable<S> SelectMany<T, C, S>(
    this IEnumerable<T> source,
    Func<T, IEnumerable<C>> collectionSelector,
    Func<T, C, S> resultSelector) {
    // Two foreachs:
    foreach(T tValue in source)
      foreach(C cValue in collectionSelector(tValue))
        yield return resultSelector(tValue, cValue);
  }
}
class SelectingMany {
  static void Main() {
    int[] numbers1 = { 1, 2 };
    int[] numbers2 = { 3, 4 };
    var query =
      from n1 in numbers1
      from n2 in numbers2
      select n1 + n2;
    // Compare our version against Enumerable's:
    var enumerableNumbers =
      System.Linq.Enumerable.SelectMany(numbers1,
        n1 => numbers2, // n1 not used here
        (n1, n2) => n1 + n2);
    var manySelectorNumbers =
      ManySelector.SelectMany(numbers1,
        n1 => numbers2,
        (n1, n2) => n1 + n2);
    enumerableNumbers.AssertEquals(new[] { 4, 5, 5, 6 });
    enumerableNumbers.AssertEquals(manySelectorNumbers);
  }
} ///:~

[10] Notice the first lambda doesn’t even use its n1 argument.

Select( ) has one foreach, whereas SelectMany( ) has two.[11] SelectMany( )’s first foreach iterates over source, passing each element to collectionSelector, which returns a sequence for the second foreach. The second foreach passes both iteration variables to resultSelector.

Another overload of SelectMany( ) takes one delegate instead of two; eliminating resultSelector, it provides the same results as above:

//: QueryExpressions\AlternativeSelectMany.cs
using System.Linq;
class AlternativeSelectMany {
  static void Main() {
    int[] numbers1 = { 1, 2 };
    int[] numbers2 = { 3, 4 };
    var query1 =
      from n1 in numbers1
      from n2 in numbers2
      select n1 + n2;
    // Alternative translation:
    var query2 =
      numbers1.SelectMany(
        n1 => numbers2.Select(n2 => n1 + n2));
    query1.AssertEquals(new[] { 4, 5, 5, 6 });
    query1.AssertEquals(query2);
  }
} ///:~

[11] We repeat Select( )’s implementation for convenience.

This overload of SelectMany( ) takes the source (numbers1 becomes the first parameter) and a lambda that returns an IEnumerable. This technique uses variable captures instead of passing both iteration variables to a second delegate, which becomes more apparent with deeper levels of nesting. We demonstrate by adding another from layer and using a custom implementation of this version of SelectMany( ):

//: QueryExpressions\SelectingManyWithCaptures.cs
using System;
using System.Linq;
using System.Collections.Generic;
static class ManySelector2 {
  public static IEnumerable<S> Select<T, S>(
    IEnumerable<T> source, Func<T, S> selector) {
    foreach(T tValue in source)
      yield return selector(tValue);
  }
  public static IEnumerable<S> SelectMany<T, S>(
    IEnumerable<T> source,
    Func<T, IEnumerable<S>> selector) {
    foreach(T tValue in source)
      foreach(S sValue in selector(tValue))
        yield return sValue;
  }
}
class SelectingManyWithCaptures {
  static void Main() {
    int[] numbers1 = { 1, 2 };
    int[] numbers2 = { 3, 4 };
    int[] numbers3 = { 5, 6 };
    var result =
      from n1 in numbers1
      from n2 in numbers2
      from n3 in numbers3
      select n1 + n2 + n3;
    result.AssertEquals(
      new[] { 9, 10, 10, 11, 10, 11, 11, 12 });
    // Possible translation:
    var sequenceNumbers =
      Enumerable.SelectMany(numbers1, n1 =>
        Enumerable.SelectMany(numbers2, n2 =>
          Enumerable.Select(numbers3, n3 =>
            n1 + n2 + n3)));
    result.AssertEquals(sequenceNumbers);
    // Same nesting, using our implementation:
    var manySelectorNumbers =
      ManySelector2.SelectMany(numbers1, n1 =>
        ManySelector2.SelectMany(numbers2, n2 =>
          ManySelector2.Select(numbers3, n3 =>
            n1 + n2 + n3)));
    sequenceNumbers.AssertEquals(manySelectorNumbers);
  }
} ///:~

Again, while SelectMany( ) has two foreaches, the second foreach yields its value directly instead of the result of a second delegate call. We use variable captures to nest lambdas within lambdas rather than explicitly passing parameters to each method. The captures carry n1 and n2 down to the innermost lambda (the one passed to Select( )), which produces a single value by combining them with n3. Here’s another demonstration:

//: QueryExpressions\NestedCaptures.cs
using System;
class NestedCaptures {
  static void Main() {
    Func<int, int> aDel = a => {
      Func<int, int> bDel = b => {
        Func<int, int> cDel = c => a + b + c;
        return cDel(3);
      };
      return bDel(2);
    };
    aDel(1).AssertEquals(6);
  }
} ///:~

We nest three simple lambdas that all accept and return an int. Invoking the enclosing lambda instantiates each variable capture,[12] binding the current values of its parent’s parameters. Invoking aDel invokes bDel, which in turn invokes cDel; this binds a to 1, b to 2, and c to 3. cDel’s lambda captures a and b, and combines them with its argument c for a sum of 6.

Note that the lambda’s caller determines the argument value, not the lambda itself. So Main( ) determines a’s value (not aDel); aDel determines b’s value (not bDel); and bDel determines c’s value (not cDel). cDel combines all the values, while invoking each delegate determines the current value. This works much like currying (shown in the Functional Programming chapter).

Within a query expression, from clauses state a source and bring parameter names into scope. The translation of from depends on what follows it:

1. The first from produces the query’s initial sequence object.

2. The compiler combines a from followed directly by a select into a single SelectMany( ) call (given that the from is not the query’s initial from, as covered by rule 1).

3. Any other froms not accounted for by rules 1 and 2 translate into a SelectMany( ).

The following example demonstrates these rules and also shows that the compiler does use SelectMany( ) (without using captures):

//: QueryExpressions\ActualCompilerTranslation.cs
using System.Linq;
class ActualCompilerTranslation {
  static void Main() {
    int[] numbers1 = { 1, 2 };
    int[] numbers2 = { 3, 4 };
    int[] numbers3 = { 5, 6 };
    var result =
      from n1 in numbers1
      from n2 in numbers2
      from n3 in numbers3
      select n1 + n2 + n3;
    result.AssertEquals(
      new[] { 9, 10, 10, 11, 10, 11, 11, 12 });
    // Translation:
    var resultTranslation =
      numbers1
      .SelectMany(n1 => numbers2, // n1 not used
        // Compiler packs into anonymous type
        // instead of using variable captures:
        (n1, n2) => new { n1, n2 })
      .SelectMany(temp => numbers3, // temp not used
        (temp, n3) => temp.n1 + temp.n2 + n3);
    result.AssertEquals(resultTranslation);
  }
} ///:~

[12] C# implements variable captures using objects (see the C# standard).

Note the compiler’s approach: it chains rather than nests the SelectMany( ) calls. The compiler translates the first from into the initial source: numbers1. The second from does not fall under rule 1 or 2, so the compiler converts it to a SelectMany( ) call. The last from is not the initial from, and it is followed directly by a select clause, so rule 2 applies. The compiler combines both clauses into a single SelectMany( ), rather than two separate calls (a SelectMany( ) for the last from and a Select( ) for the select).

Our first SelectMany( ) call returns a source for the subsequent clause. The compiler packs multiple values into an anonymous type to pass them to the next call.[13] The first SelectMany( ) produces an IEnumerable of anonymous-type instances; the second SelectMany( ) consumes that IEnumerable. temp represents each anonymous instance. We created the name “temp”; the compiler can call it anything it wants. The select clause adds the values of n1, n2, and n3. The compiler explicitly scopes n1 and n2 from temp in the last lambda. Thus the first SelectMany( ) packs, and the last SelectMany( ) unpacks and combines the values with n3 (its second lambda parameter).
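To make the packing step visible, the sketch below stops the chain after the first SelectMany( ) and prints the packed values, then runs the full chain. It is an illustration only: the class and method names are hypothetical, and Console.WriteLine( ) stands in for the book's P( ) helper.

```csharp
using System;
using System.Linq;

static class PackingSketch {
  // First step of the chained translation:
  // pair every n1 with every n2 in an anonymous type.
  public static string[] PackedPairs(
    int[] numbers1, int[] numbers2) {
    return numbers1
      .SelectMany(n1 => numbers2,
        (n1, n2) => new { n1, n2 })
      .Select(temp => temp.ToString())
      .ToArray();
  }

  // Full chain: pack, then unpack and combine with n3.
  public static int[] Sums(
    int[] numbers1, int[] numbers2, int[] numbers3) {
    return numbers1
      .SelectMany(n1 => numbers2,
        (n1, n2) => new { n1, n2 })
      .SelectMany(temp => numbers3,
        (temp, n3) => temp.n1 + temp.n2 + n3)
      .ToArray();
  }

  static void Main() {
    int[] numbers1 = { 1, 2 };
    int[] numbers2 = { 3, 4 };
    int[] numbers3 = { 5, 6 };
    foreach(string pair in PackedPairs(numbers1, numbers2))
      Console.WriteLine(pair);
    // { n1 = 1, n2 = 3 }
    // { n1 = 1, n2 = 4 }
    // { n1 = 2, n2 = 3 }
    // { n1 = 2, n2 = 4 }
    int[] sums = Sums(numbers1, numbers2, numbers3);
    Console.WriteLine(sums.SequenceEqual(
      new[] { 9, 10, 10, 11, 10, 11, 11, 12 })); // True
  }
}
```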

[13] The capture approach does not require the “packing” technique.

Exercise 10: Use string’s CompareTo( ) method to list all unique pairs of programmer names: Jeff, Andrew, Craig, Susan, Derek, Drew, Katlyn. Store each pair in an anonymous type with First and Second fields.

Exercise 11: Use both the compiler’s approach (packing into temporary anonymous types) and the variable capture approach (nesting lambda expressions) to translate your solution for the previous exercise.

Exercise 12: Generalize your solution to the programming pair exercise to create a generic Combine( ) method. (Hint: You will have to create your own type to contain both members of each pair.)

Exercise 13: How would AlternativeSelectMany.cs’s “alternative translation” change if its query contained a where clause?

Exercise 14: Write code to prove how ActualCompilerTranslation.cs’s translation changes if the compiler converts all froms except the first into SelectMany( ) calls. (That is, it converts the select clause into a normal Select( ) call instead of combining the last from and select into a single SelectMany( ).)

Transparent identifiers

To the compiler, temp is a transparent identifier. A transparent identifier exposes its members without requiring explicit scoping. C# doesn’t provide transparent identifiers for direct use by the programmer. For example:

//: QueryExpressions\TransparentIdentifierDemo.cs
class Holder {
  public int Value { get; set; }
}
class TransparentIdentifierDemo {
  static void Main() {
    var holder = new Holder();
    // If transparent identifiers were
    // available, you could do this:
    //c! Value = 5;
    // (The compiler can do that, we can’t)
    // Instead, we must explicitly scope:
    holder.Value = 5;
  }
} ///:~

If holder were a transparent identifier, then Value would be in scope and the line marked //c! would compile. You would write only Value instead of holder.Value, and the compiler would simply insert the “holder.” for you.

Although the compiler doesn’t make transparent identifiers available to you, it uses them internally when packing and unpacking values into anonymous types. In ActualCompilerTranslation.cs, we omitted a translation step between the original query and the translation that follows it. Here we add another layer of from and include that intermediate step, which uses transparent identifiers:

//: QueryExpressions\WithTransparentIdentifiers.cs
using System.Linq;
class WithTransparentIdentifiers {
  static void Main() {
    int[] numbers1 = { 1, 2 };
    int[] numbers2 = { 3, 4 };
    int[] numbers3 = { 5, 6 };
    int[] numbers4 = { 7, 8 };
    var result =
      from n1 in numbers1
      from n2 in numbers2
      from n3 in numbers3
      from n4 in numbers4
      select n1 + n2 + n3 + n4;
    result.AssertEquals(new[] { 16, 17, 17, 18, 17, 18,
      18, 19, 17, 18, 18, 19, 18, 19, 19, 20 });
    // Intermediate translation:
    /*c!
    var resultTranslation1 =
      numbers1
      .SelectMany(n1 => numbers2, // n1 not used
        (n1, n2) => new { n1, n2 })
      // We use "t" in lieu of "transparent":
      .SelectMany(t => numbers3, // t not used
        (t1, n3) => new { t1, n3 })
      .SelectMany(t2 => numbers4,
        // Notice not scoping here:
        (t2, n4) => n1 + n2 + n3 + n4);
    */
    // Next step in translation is
    // to scope transparent identifiers:
    var resultTranslation2 =
      numbers1
      .SelectMany(n1 => numbers2, // n1 not used
        (n1, n2) => new { n1, n2 })
      .SelectMany(t => numbers3, // t not used
        (t1, n3) => new { t1, n3 })
      .SelectMany(t2 => numbers4,
        // Compiler explicitly scopes for us:
        (t2, n4) => t2.t1.n1 + t2.t1.n2 + t2.n3 + n4);
    result.AssertEquals(resultTranslation2);
  }
} ///:~

The last line of the intermediate translation is the catch: when the compiler rewrites our select clause to a SelectMany( ) call, it places the expression “n1 + n2 + n3 + n4” verbatim within the lambda expression. But n1, n2, and n3 are not in scope because they are packed away in t2 (which in turn holds t1). However, t2 and t1 are transparent identifiers, so the compiler can see right through each layer to n1, n2, and n3. The compiler explicitly scopes them for us (see the last line of the final translation). The compiler uses transparent identifiers combined with the packing technique in several places, as you’ll see throughout the rest of the chapter.

Iteration Variable Scope

Notice in the previous example that many lambda arguments in the SelectMany( ) calls are not used. Although we didn’t need these parameters, the compiler doesn’t omit them in the translation, as this would cause a compile-time error. Remember that from not only states a source, but also brings its iteration variable into scope. So a subsequent from can use an outer from’s iteration variable:

//: QueryExpressions\UsingScopedVariablesInFollowingFrom.cs
using System;
using System.Linq;
using System.Collections.Generic;
class UsingScopedVariablesInFollowingFrom {
  static IEnumerable<int> GetRandomNumbers(int amount) {
    Random rand = new Random(47);
    for(int i = 0; i < amount; i++)
      yield return rand.Next(100);
  }
  static void Main() {
    int[] numbers = new[] { 1, 2, 3 };
    var result =
      from n in numbers
      from randomValue in GetRandomNumbers(n)
      select n * randomValue;
    int[] answer = new[] { 28, 56, 120, 84, 180, 186 };
    result.AssertEquals(answer);
    // Translation:
    var result2 =
      numbers.SelectMany(n => GetRandomNumbers(n),
        (n, randomValue) => n * randomValue);
    result2.AssertEquals(answer);
  }
} ///:~

The second from passes n from the first from to GetRandomNumbers( ). As the translation shows, SelectMany( )’s first lambda uses its argument in this case.

As a side note, it’s usually better to make Random objects static members rather than local variables. However, in this book we wish to verify results via assertions, so GetRandomNumbers( ) must return an identical sequence each time it is called.

You can intermix from and where clauses, but take care to maintain readability:

//: QueryExpressions\MixingWhere.cs
using System.Linq;
class MixingWhere {
  static void Main() {
    int[] odds = { 1, 3, 5, 7, 9 };
    int[] evens = { 2, 4, 6, 8, 10 };
    var confusing =
      from o in odds
      where o < 7
      from e in evens
      where o > 1
      where e < 6
      select o * e;
    var better =
      from o in odds
      where o < 7
      where o > 1
      from e in evens
      where e < 6
      select o * e;
    var better2 =
      from o in odds
      from e in evens
      where o > 1 && o < 7
      where e < 6
      select o * e;
    var best =
      from o in odds
      from e in evens
      where o > 1 && o < 7 && e < 6
      select o * e;
    // All achieve same results:
    confusing.AssertEquals(new[] { 6, 12, 10, 20 });
    confusing.AssertEquals(better);
    better.AssertEquals(better2);
    better2.AssertEquals(best);
  }
} ///:~

As a practice, keep from clauses at the top of the query and where clauses to a minimum.[14] Following the rules outlined before, the first from translates to the query’s original source. The rest translate to SelectMany( ). Here’s part of the first query’s translation:

//: QueryExpressions\MixingWhereConversions.cs
using System.Linq;
class MixingWhereConversions {
  static void Main() {
    int[] odds = { 1, 3, 5, 7, 9 };
    int[] evens = { 2, 4, 6, 8, 10 };
    var confusing =
      from o in odds
      where o < 7
      from e in evens
      where o > 1
      where e < 6
      select o * e;
    // Skipping the transparent identifier step...
    var confusingTranslated =
      odds // Direct translation to the source
      .Where(o => o < 7)
      .SelectMany(o => evens, (o, e) => new { o, e })
      // We use "nt" in lieu of "nonTransparent":
      .Where(nt => nt.o > 1)
      .Where(nt => nt.e < 6)
      .Select(nt => nt.o * nt.e);
    confusing.AssertEquals(confusingTranslated);
  }
} ///:~

[14] Although it’s generally better to sacrifice speed for readability, and we’d expect you’d do so here, we point out an alternative approach in the exercise at the end of this section.

As you list more from clauses, the translation becomes more complex. The compiler, however, continues to apply the same rules:

//: QueryExpressions\SuperFroms.cs
using System.Linq;
class SuperFroms {
  static void Main() {
    int[] numbers1 = { 1, 2 };
    int[] numbers2 = { 3, 4 };
    int[] numbers3 = { 4, 5 };
    int[] numbers4 = { 6, 7 };
    var additions1 =
      from n1 in numbers1
      from n2 in numbers2
      // The last two froms:
      from n3 in numbers3
      from n4 in numbers4
      select n1 + n2 + n3 + n4;
    additions1.AssertEquals(new[] { 14, 15, 15, 16, 15,
      16, 16, 17, 15, 16, 16, 17, 16, 17, 17, 18 });
    // Skipping the transparent identifier step...
    // Translates to:
    var additions2 =
      numbers1 // Rule 1
      .SelectMany(n1 => numbers2, // Rule 3
        (n1, n2) => new { n1, n2 })
      // We use "nt" in lieu of "nonTransparent":
      .SelectMany(nt1 => numbers3, // Rule 3
        (nt1, n3) => new { nt1, n3 })
      .SelectMany(nt2 => numbers4, // Rule 2
        (nt2, n4) =>
          nt2.nt1.n1 + nt2.nt1.n2 + nt2.n3 + n4);
    additions1.AssertEquals(additions2);
  }
} ///:~

The compiler keeps packing values into anonymous types until the last from. The first SelectMany( ) packs n1 and n2 together in an anonymous instance. The second SelectMany( ) packs each of these into a new anonymous instance together with n3. The last SelectMany( ) unpacks each value, explicitly scoping them in the sum to produce the final value. Notice that we have three SelectMany( ) calls instead of four. The query ends with a from clause followed immediately by a select. The compiler combines these two into a single SelectMany( ).

Exercise 15: How does SuperFroms.cs’s translation change when you insert a “where true” between the third and fourth from?

Exercise 16: We performed the first step of translating MixingWhere.cs’s first query in MixingWhereConversions.cs. Perform this same first step on the rest of the queries.

Exercise 17: Below you see a variation on the queries in MixingWhere.cs. One requires fewer iterations than the other. Which is it, and why?

var approach1 =
  from o in odds
  from e in evens
  where o > 1 && o < 7
  where e < 6
  select o * e;
var approach2 =
  from o in odds
  where o > 1 && o < 7
  from e in evens
  where e < 6
  select o * e;

More complex data

For many of the following examples, we use Microsoft’s Northwind sample database, which includes sales data for a fictitious importer/exporter, Northwind Traders.[15] We’ll work with databases directly in the LINQ to SQL chapter. For now, we just need the data that LINQ to SQL loads into in-memory objects.

First, we define a simple abstract class that determines equality by comparing its property values to those of its argument:

//: MindView.Util\Equalable.cs
// {CF: /target:library}
using System.Linq;
namespace MindView.Util {
  public abstract class Equalable {
    public override bool Equals(object other) {
      // Guard against null before calling other.GetType():
      if(other == null)
        return false;
      var myType = GetType();
      if(!myType.Equals(other.GetType()))
        return false;
      var properties = myType.GetProperties();
      var equalProperties =
        from p in properties
        let leftValue = p.GetValue(this, null)
        let rightValue = p.GetValue(other, null)
        where leftValue == rightValue ||
          (leftValue != null &&
           leftValue.Equals(rightValue))
        select p;
      return equalProperties.Count() == properties.Length;
    }
  }
} ///:~

[15] “Sample Databases Included with Access” (Microsoft Office Online, 2007, Microsoft, 19 Feb. 2007); see http://office.microsoft.com/en-us/access/HP051886201033.aspx.

This uses a relatively simple form of reflection in the expression myType.GetProperties( ).[16] All you need to understand at this point is that any object that inherits from Equalable will, by default, compare property values to determine equality, rather than just comparing references. The default object.Equals( ) only compares references, finding them “equal” if both refer to the same object. Equalable.Equals( ) compares all corresponding property values for equality.

The query finds all the properties that have matching values. If the number of matching properties is not equal to the total number of properties, we consider the two objects unequal. Note that the behavior of query expressions is distinctly different from regular code. For example, even though the query expression looks like a loop, you cannot just return false from within that “loop” the first time two values are unequal, as you might do with regular code. The query expression must run to completion before we are able to evaluate the results.[17] Count( ) is an Enumerable extension method that returns the number of items in the sequence.

Notice that the let clauses introduce the temporary variables leftValue and rightValue. We explore let clauses soon, but notice how these two simple variables clarify the code. Without let clauses, we’d have to repeat the GetValue( ) calls everywhere we use the variables.

One major downside to Equalable is that to use it, it must be your base class. This doesn’t make sense as a design, because Equalable is more of a simple utility than a definition of what your object is. However, for this book’s simple demonstrations, we find using Equalable as a base class vital to proving many assertions in the text. It may or may not be useful in your own programming. It keeps several of the examples that follow clear and correct.[18]
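The footnote on Any( ) hints that the comparison need not visit every property. As a rough sketch of that idea (not the book's implementation), the hypothetical helper below asks whether any property differs; Any( ) stops at the first mismatch. Point and PropertyComparer are illustrative names, and the helper is a free-standing method rather than a base class.

```csharp
using System;
using System.Linq;

class Point {
  public int X { get; set; }
  public int Y { get; set; }
}

static class PropertyComparer {
  // Hypothetical helper, not part of MindView.Util.
  public static bool PropertiesEqual(
    object left, object right) {
    if(left == null || right == null)
      return left == null && right == null;
    Type type = left.GetType();
    if(type != right.GetType())
      return false;
    // Any() stops at the first property whose values differ.
    // object.Equals(a, b) handles nulls and boxed values.
    bool anyMismatch = type.GetProperties().Any(p =>
      !object.Equals(
        p.GetValue(left, null), p.GetValue(right, null)));
    return !anyMismatch;
  }

  static void Main() {
    var a = new Point { X = 1, Y = 2 };
    var b = new Point { X = 1, Y = 2 };
    var c = new Point { X = 1, Y = 3 };
    Console.WriteLine(PropertiesEqual(a, b)); // True
    Console.WriteLine(PropertiesEqual(a, c)); // False
  }
}
```

The sketch assumes types with simple, non-indexed properties, as in the data classes used in this chapter.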

[16] If you don’t understand reflection, see [[[Either Richter’s book, or an appendix, or both]]].

[17] Later you will see the Any( ) method, which you could use here to eliminate some processor cycles.

[18] Indeed, even in the FCL there are some deep and complicated hierarchies based on designs like this. It makes one wonder if the functionality is worth the complexity.

In practice, queries involve more complex data structures than the simple types we’ve queried so far (such as ints and strings). We’ve created pure data-oriented classes that use data from the Northwind database: Customer, Order, OrderDetail, and Product. Since they inherit from Equalable, they compare their property values to determine equality (critical in many later assertions):

//: MindView.Util\NorthwindDataClasses.cs
// {CF: /target:library}
using System;
using MindView.Util;
using System.Data.Linq.Mapping;
[Table(Name = "Customers")]
public class Customer : Equalable {
  [Column] public string CustomerID { get; set; }
  [Column] public string ContactName { get; set; }
  [Column] public string Country { get; set; }
  [Column] public string Phone { get; set; }
}
[Table(Name = "Orders")]
public class Order : Equalable {
  [Column] public int OrderID { get; set; }
  [Column] public string CustomerID { get; set; }
  [Column] public DateTime OrderDate { get; set; }
  [Column] public DateTime? ShippedDate { get; set; }
  [Column] public string ShipCountry { get; set; }
}
[Table(Name = "Order Details")]
public class OrderDetail : Equalable {
  [Column] public int OrderID { get; set; }
  [Column] public int ProductID { get; set; }
  [Column] public short Quantity { get; set; }
  [Column] public decimal UnitPrice { get; set; }
}
[Table(Name = "Products")]
public class Product : Equalable {
  [Column] public int ProductID { get; set; }
  [Column] public string ProductName { get; set; }
} ///:~


The MindView.Util.Northwind class populates in-memory objects with data from the database.[19] (Later we’ll study the LINQ to SQL attributes TableAttribute and ColumnAttribute.)[20]

//: MindView.Util\Northwind.cs
// {CF: /target:library}
using System.Data.Linq;
using System.Collections.Generic;
public static class Northwind {
  static DataContext db =
    new DataContext("Data Source=(local);" +
      "Initial Catalog=Northwind;" +
      "Integrated Security=True");
  public static IEnumerable<Customer> Customers {
    get { return db.GetTable<Customer>(); }
  }
  public static IEnumerable<Order> Orders {
    get { return db.GetTable<Order>(); }
  }
  public static IEnumerable<OrderDetail> OrderDetails {
    get { return db.GetTable<OrderDetail>(); }
  }
  public static IEnumerable<Product> Products {
    get { return db.GetTable<Product>(); }
  }
} ///:~

[19] For our purposes in this text, we ignore OrderDetail’s Discount.

[20] You must modify the connection string to point to your own copy of the Northwind database. The hard-coded connection string shown here assumes you installed SQL Server and attached the Northwind database to it. If you download SQL Server Express as freeware, be sure to put the proper path to your database file. Your connection string will be similar to @"Data Source=.\SQLEXPRESS;AttachDbFilename=C:\NORTHWND.MDF;Integrated Security=True;User Instance=True".

The DataContext db object pulls from the Northwind database to populate in-memory objects. We show the details in the LINQ to SQL chapter. Here is a simple example using the Northwind class:

//: QueryExpressions\NorthwindAccess.cs
using System.Linq;
using MindView.Util;
using System.Collections.Generic;
class NorthwindAccess {
  static void ShowFirstThree<T>(
    IEnumerable<T> enumerable, string message) {
    (from item in enumerable.Take(3)
     select Reflector.
       PropertyValuesToString(item)).P(message);
  }
  static void Main() {
    ShowFirstThree(Northwind.Customers, "Customers");
    ShowFirstThree(Northwind.Products, "Products");
    ShowFirstThree(Northwind.Orders, "Orders");
    ShowFirstThree(Northwind.OrderDetails, "OrderDetails");
  }
} /* Output:
Customers: [
  { CustomerID = ALFKI, ContactName = Maria Anders, Country = Germany, Phone = 030-0074321 },
  { CustomerID = ANATR, ContactName = Ana Trujillo, Country = Mexico, Phone = (5) 555-4729 },
  { CustomerID = ANTON, ContactName = Antonio Moreno, Country = Mexico, Phone = (5) 555-3932 }
]
Products: [
  { ProductID = 17, ProductName = Alice Mutton },
  { ProductID = 3, ProductName = Aniseed Syrup },
  { ProductID = 40, ProductName = Boston Crab Meat }
]
Orders: [
  { OrderID = 10248, CustomerID = VINET, OrderDate = 7/4/1996 12:00:00 AM, ShippedDate = 7/16/1996 12:00:00 AM, ShipCountry = France },
  { OrderID = 10249, CustomerID = TOMSP, OrderDate = 7/5/1996 12:00:00 AM, ShippedDate = 7/10/1996 12:00:00 AM, ShipCountry = Germany },
  { OrderID = 10250, CustomerID = HANAR, OrderDate = 7/8/1996 12:00:00 AM, ShippedDate = 7/12/1996 12:00:00 AM, ShipCountry = Brazil }
]
OrderDetails: [
  { OrderID = 10248, ProductID = 11, Quantity = 12, UnitPrice = 14.0000 },
  { OrderID = 10248, ProductID = 42, Quantity = 10, UnitPrice = 9.8000 },
  { OrderID = 10248, ProductID = 72, Quantity = 5, UnitPrice = 34.8000 }
]
*///:~

The Northwind class properties Customers, Products, Orders, and OrderDetails return IEnumerables that produce instances of those classes. (The LINQ to SQL chapter shows how to query the data using IQueryable.) ShowFirstThree( ) dumps the first three objects to the console. The Enumerable extension method Take( ) returns up to the requested number of items from the start of the sequence. MindView.Util.Reflector.PropertyValuesToString( ) dumps an object’s property values to a string (following the same format the compiler uses when generating ToString( ) for an anonymous type).
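A small sketch of Take( ) on its own, using an in-memory array instead of the Northwind data (the class and method names here are hypothetical). It also shows that asking for more items than the sequence holds simply yields what is there.

```csharp
using System;
using System.Linq;

static class TakeSketch {
  public static int[] FirstThree(int[] source) {
    // Take(3) yields at most the first three elements.
    return source.Take(3).ToArray();
  }

  static void Main() {
    int[] five = { 10, 20, 30, 40, 50 };
    int[] two = { 10, 20 };
    Console.WriteLine(FirstThree(five).Length); // 3
    // A short source is not an error:
    Console.WriteLine(FirstThree(two).Length); // 2
  }
}
```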

Exercise 18: Use the Northwind class to list the names of Customers who live in Mexico.

Exercise 19: Write a query using the Northwind class that pairs all Mexican customers with American customers.

let clauses

let clauses place intermediate results in a temporary variable within query expressions. These intermediate variables clarify complex expressions:

//: QueryExpressions\LetClause.cs
using System.Linq;
class LetClause {
  static void Main() {
    int[] numbers = { 1, 2, 3, 4, 5 };
    // Select all numbers and their square
    // for numbers with a square that is
    // greater than five:
    var squareGreaterThanFive =
      from n in numbers
      let square = n * n
      where square > 5
      select new { n, square };
    squareGreaterThanFive.P();
  }
} /* Output:
[
  { n = 3, square = 9 },
  { n = 4, square = 16 },
  { n = 5, square = 25 }
]
*///:~

square is a temporary variable inside our query. We could write this query without the let clause, but we’d have to write the expression “n * n” twice. let clauses, like many local variables, provide a way to factor your code at the smallest level. They also enhance readability. You can see the benefit of intermediate variables in this more complex query expression:

//: QueryExpressions\RootFinder.cs
using System;
using System.Linq;
using System.Collections.Generic;
class Coefficients {
  public double A { get; set; }
  public double B { get; set; }
  public double C { get; set; }
}
class RootFinder {
  static void Main() {
    // Coefficients for the quadratic formula
    var coefficients = new List<Coefficients> {
      new Coefficients { A = 1, B = 3, C = -4 },
      new Coefficients { A = 1, B = -6, C = 9 },
      new Coefficients { A = 1, B = -4, C = 8 }
    };
    // Put both roots into an anonymous type (ugly):
    var roots =
      from c in coefficients
      select new {
        FirstRoot = (-c.B + Math.Sqrt(
          Math.Pow(c.B, 2) - 4 * c.A * c.C))
          / (2 * c.A),
        SecondRoot = (-c.B - Math.Sqrt(
          Math.Pow(c.B, 2) - 4 * c.A * c.C))
          / (2 * c.A)
      };
    // Factor and make more readable:
    var roots2 =
      from c in coefficients
      let negativeB = -c.B
      let bSquared = Math.Pow(c.B, 2)
      let fourAC = 4 * c.A * c.C
      let sqrtBsquaredMinusFourAC =
        Math.Sqrt(bSquared - fourAC)
      let twoA = 2 * c.A
      select new {
        FirstRoot =
          (negativeB + sqrtBsquaredMinusFourAC) / twoA,
        SecondRoot =
          (negativeB - sqrtBsquaredMinusFourAC) / twoA
      };
    roots.AssertEquals(roots2);
    roots.P();
  }
} /* Output:
[
  { FirstRoot = 1, SecondRoot = -4 },
  { FirstRoot = 3, SecondRoot = 3 },
  { FirstRoot = NaN, SecondRoot = NaN }
]
*///:~

The program implements the quadratic formula:

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$


We use a list of Coefficients objects as inputs to the quadratic formula. The duplicate expressions that make the first query more difficult to read also make it harder to maintain. We improve legibility in the second query with variables that match the terms of the formula. We use variable names like twoA and fourAC to make the select clause read more like the actual formula. Note that AssertEquals( ) ensures identical sequences of anonymous types from both select clauses. For math buffs, coefficients contains the three possible situations for the quadratic formula, respectively: two roots, exactly one root, and no real roots.

The compiler translates let clauses into nested queries, which we examine in the Nested Queries section of this chapter.
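Until that section, a rough picture may help. One way to think about the translation of LetClause.cs’s query is as a Select( ) that packs n together with square into an anonymous type, followed by the where and select working on that packed value. The sketch below compares the two forms; the identifier t is our own placeholder for the compiler's transparent identifier, and the class and method names are hypothetical.

```csharp
using System;
using System.Linq;

static class LetTranslationSketch {
  public static string[] WithLet(int[] numbers) {
    var query =
      from n in numbers
      let square = n * n
      where square > 5
      select new { n, square };
    return query.Select(x => x.ToString()).ToArray();
  }

  public static string[] WithoutLet(int[] numbers) {
    var query = numbers
      // The let becomes a Select() that packs n and square:
      .Select(n => new { n, square = n * n })
      // Later clauses unpack from the packed value:
      .Where(t => t.square > 5)
      .Select(t => new { t.n, t.square });
    return query.Select(x => x.ToString()).ToArray();
  }

  static void Main() {
    int[] numbers = { 1, 2, 3, 4, 5 };
    Console.WriteLine(
      WithLet(numbers).SequenceEqual(WithoutLet(numbers)));
    // True
    foreach(string s in WithoutLet(numbers))
      Console.WriteLine(s);
    // { n = 3, square = 9 }
    // { n = 4, square = 16 }
    // { n = 5, square = 25 }
  }
}
```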

Exercise 20: Select the value and the square root of the numbers 1 to 100 with square roots greater than 5 and less than 6. Use a let clause.

Ordering data

orderby sorts data:

//: QueryExpressions\Ordering.cs
using System.Linq;
class Ordering {
  static void Main() {
    int[] numbers = { 3, 6, 4, 8, 2 };
    var ascending =
      from n in numbers
      orderby n
      select n;
    ascending.AssertEquals(new[] { 2, 3, 4, 6, 8 });
    var ascendingTranslation1 =
      // select is degenerate, so just OrderBy():
      numbers.OrderBy(n => n);
    ascending.AssertEquals(ascendingTranslation1);
    var ascendingTranslation2 =
      Enumerable.OrderBy(numbers, n => n);
    ascendingTranslation1
      .AssertEquals(ascendingTranslation2);
    // More realistic example:
    var ordered =
      from o in Northwind.Orders
      orderby o.OrderDate
      select o;
    var orderedTranslation1 =
      Northwind.Orders.OrderBy(o => o.OrderDate);
    ordered.AssertEquals(orderedTranslation1);
    var orderedTranslation2 =
      Enumerable.OrderBy(Northwind.Orders,
        o => o.OrderDate);
    orderedTranslation1.AssertEquals(orderedTranslation2);
  }
} ///:~
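The keys in Ordering.cs (int and DateTime) are already comparable. As a sketch of what a custom key type needs, the hypothetical Severity class below implements IComparable<Severity>, which the default comparer used by OrderBy( ) picks up. All type and member names here are invented for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical key type: sortable because it
// implements IComparable<Severity>.
class Severity : IComparable<Severity> {
  public int Value { get; set; }
  public int CompareTo(Severity other) {
    if(other == null) return 1; // null sorts first
    return Value.CompareTo(other.Value);
  }
}

class Ticket {
  public string Title { get; set; }
  public Severity Severity { get; set; }
}

static class ComparableKeySketch {
  public static string[] TitlesBySeverity(Ticket[] tickets) {
    var ordered =
      from t in tickets
      orderby t.Severity // uses Severity.CompareTo()
      select t.Title;
    return ordered.ToArray();
  }

  static void Main() {
    var tickets = new[] {
      new Ticket { Title = "Crash on start",
        Severity = new Severity { Value = 3 } },
      new Ticket { Title = "Fix typo",
        Severity = new Severity { Value = 1 } },
      new Ticket { Title = "Slow search",
        Severity = new Severity { Value = 2 } }
    };
    foreach(string title in TitlesBySeverity(tickets))
      Console.WriteLine(title);
    // Fix typo
    // Slow search
    // Crash on start
  }
}
```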

When you sort with orderby, the type of each sort key must implement IComparable (or IComparable<T>). In the next example, descending overrides the default ascending (which sorts the data from low to high). We also apply orderby to more than one field:21

//: QueryExpressions\Ordering2.cs
using System.Linq;

class Ordering2 {
  static void Main() {
    var ordered =
      from c in Northwind.Customers
      orderby c.Country descending,
        c.ContactName ascending
      select new { c.Country, c.ContactName };
    // Translates to
    var orderedTranslation1 =
      Northwind.Customers.
      OrderByDescending(c => c.Country).
      ThenBy(c => c.ContactName).
      Select(c => new { c.Country, c.ContactName });
    ordered.AssertEquals(orderedTranslation1);
    // Which translates using extension
    // method lookup rules:
    var orderedTranslation2 =
      Enumerable.Select(
        Enumerable.ThenBy(
          Enumerable.OrderByDescending(Northwind.Customers,
            c => c.Country),
          c => c.ContactName),
        c => new { c.Country, c.ContactName });
    orderedTranslation1.AssertEquals(orderedTranslation2);
    ordered.Take(10).P();
  }
} /* Output:
[
  { Country = Venezuela, ContactName = Carlos González },
  { Country = Venezuela, ContactName = Carlos Hernández },
  { Country = Venezuela, ContactName = Felipe Izquierdo },
  { Country = Venezuela, ContactName = Manuel Pereira },
  { Country = USA, ContactName = Art Braunschweiger },
  { Country = USA, ContactName = Fran Wilson },
  { Country = USA, ContactName = Helvetius Nagy },
  { Country = USA, ContactName = Howard Snyder },
  { Country = USA, ContactName = Jaime Yorres },
  { Country = USA, ContactName = John Steel }
]
*///:~

21 Your query can redundantly specify ascending the same way we specify descending in the next example.

First we order by Country, then by ContactName. We specified descending for Country, so “Venezuela” comes before “USA.” descending changes the OrderBy( ) call to an OrderByDescending( ) call. The ContactNames are ordered within each Country.

The compiler changes successive ordering conditions to ThenBy( ) and ThenByDescending( ) calls. As with chained clauses, each call passes its returned collection to the next. ThenBy( ) and ThenByDescending( ) must preserve the ordering established by the preceding call and use their own key only to break ties. If the compiler used successive OrderBy( ) calls instead of ThenBy( ) calls in this example, the country names would re-scramble, but the ContactNames would be ordered as a whole.

OrderBy( ) and OrderByDescending( ) return IOrderedEnumerable<T> or IOrderedQueryable<T> objects, depending on whether you’re using Enumerable or Queryable. The types that implement these interfaces must maintain the input sequence’s original order criteria while ordering by the new criteria. For example, since we ordered by Country then ContactName, the IOrderedEnumerable<T> must maintain the Country order while ordering the ContactNames. ThenBy( ) and ThenByDescending( ) also return objects that implement these interfaces, so you can order by as many fields as you need.
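The requirement that sort keys be comparable, stated at the start of this section, can be sketched with a key type of our own. Weight, Parcel, and ComparableKeys are hypothetical names invented for this illustration; they are not part of Northwind or of the book's utilities, and the data is made up:

```csharp
using System;
using System.Linq;

// Hypothetical key type (an assumption for this sketch):
class Weight : IComparable<Weight> {
  public readonly int Grams;
  public Weight(int grams) { Grams = grams; }
  // orderby reaches this through the default comparer:
  public int CompareTo(Weight other) {
    return Grams.CompareTo(other.Grams);
  }
}

// Hypothetical element type:
class Parcel {
  public string Label;
  public Weight Weight;
}

static class ComparableKeys {
  public static string[] LabelsByWeight() {
    var parcels = new[] {
      new Parcel { Label = "books", Weight = new Weight(900) },
      new Parcel { Label = "letter", Weight = new Weight(20) },
      new Parcel { Label = "shoes", Weight = new Weight(650) }
    };
    var ordered =
      from p in parcels
      orderby p.Weight // The key is a Weight, not an int
      select p.Label;
    return ordered.ToArray();
  }
  static void Main() {
    Console.WriteLine(string.Join(", ", LabelsByWeight()));
  }
} /* Output:
letter, shoes, books
*/
```

Parcel itself needs no comparison logic; only the key that orderby selects does. Ordering by p.Weight.Grams would avoid the custom type entirely, since int is already comparable.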

Exercise 21: Sort all Orders shipped to Mexico and Germany by CustomerID and descending order of ShippedDate.

Exercise 22: What’s the UnitPrice of the most expensive Product ever sold?

Exercise 23: Prove that an OrderBy( ) followed by a ThenBy( ) does not always produce the same results as an OrderBy( ) followed by another OrderBy( ).

Exercise 24: Sort all the Customer Phone numbers ignoring parentheses. Keep the parentheses, however, in your output.

Grouping data

You often want to group data by selected properties. For example, Ordering2.cs arranges ContactNames by their Country; group…by puts the items into actual groups:

//: QueryExpressions\Grouping.cs
using System.Linq;
using System.Collections.Generic;

class Grouping {
  static void Main() {
    // group...by makes IGrouping objects:
    IEnumerable<IGrouping<string, string>> grouped1 =
      from c in Northwind.Customers
      group c.ContactName by c.Country;
    foreach(IGrouping<string, string> group in
      grouped1.Take(3))
      group.Take(2).P(group.Key, POptions.NoNewLines);
    // Translation:
    var grouped2 =
      Northwind.Customers
      .GroupBy(c => c.Country, c => c.ContactName);
    grouped1.AssertEquals(grouped2);
    var grouped3 =
      Enumerable.GroupBy(Northwind.Customers,
        c => c.Country, c => c.ContactName);
    grouped2.AssertEquals(grouped3);
  }
} /* Output:
Germany: [Maria Anders, Hanna Moos]
Mexico: [Ana Trujillo, Antonio Moreno]
UK: [Thomas Hardy, Victoria Ashworth]
*///:~

The first declaration is explicit (instead of using var), so you can see what the query expression produces. group…by groups one item based on another and returns a sequence of IGrouping objects. Here, grouping the ContactNames by their Country produces grouped1.

An IGrouping<TKey, TElement> has a Key and is an IEnumerable<TElement> of the items in the group. In this example, each IGrouping’s Key is the Country, and each is an IEnumerable of the ContactNames.

The compiler translates group…by to GroupBy( ) (shown above). group…by must be the last clause in your query (like select) and it chooses data (also like select). See, for instance, ContactNames in the example above. Notice that the lambda order is swapped in the translation: the clause names the element (c.ContactName) first and the key (c.Country) second, whereas GroupBy( ) takes the key selector first.

To limit the output, we Take( ) the first three groups, and then the first two items in each group.

You can also group the actual objects, or generate groups of new objects, instead of a grouping of property values:

//: QueryExpressions\GroupingDifferentTypes.cs
using System.Linq;

class GroupingDifferentTypes {
  static void Main() {
    // Group actual elements:
    var grouped1 =
      from c in Northwind.Customers
      group c by c.Country;
    // Print number in each group:
    foreach(var group in grouped1.Take(3))
      group.Count().P(group.Key);
    // Uses a different overload in translation:
    var grouped2 =
      Northwind.Customers.GroupBy(c => c.Country);
    grouped1.AssertEquals(grouped2);
    var grouped3 =
      Enumerable.GroupBy(
        Northwind.Customers, c => c.Country);
    grouped2.AssertEquals(grouped3);
    // Same grouping criteria but
    // creates anonymous instances:
    var differentType =
      from c in Northwind.Customers
      group new { c.CustomerID, c.ContactName }
        by c.Country;
    foreach(var group in differentType.Take(3))
      group.P(group.Key);
  }
} /* Output:
Germany: 11
Mexico: 5
UK: 7
Germany: [
  { CustomerID = ALFKI, ContactName = Maria Anders },
  { CustomerID = BLAUS, ContactName = Hanna Moos },
  { CustomerID = DRACD, ContactName = Sven Ottlieb },
  { CustomerID = FRANK, ContactName = Peter Franken },
  { CustomerID = KOENE, ContactName = Philip Cramer },
  { CustomerID = LEHMS, ContactName = Renate Messner },
  { CustomerID = MORGK, ContactName = Alexander Feuer },
  { CustomerID = OTTIK, ContactName = Henriette Pfalzheim },
  { CustomerID = QUICK, ContactName = Horst Kloss },
  { CustomerID = TOMSP, ContactName = Karin Josephs },
  { CustomerID = WANDK, ContactName = Rita Müller }
]
Mexico: [
  { CustomerID = ANATR, ContactName = Ana Trujillo },
  { CustomerID = ANTON, ContactName = Antonio Moreno },
  { CustomerID = CENTC, ContactName = Francisco Chang },
  { CustomerID = PERIC, ContactName = Guillermo Fernández },
  { CustomerID = TORTU, ContactName = Miguel Angel Paolino }
]
UK: [
  { CustomerID = AROUT, ContactName = Thomas Hardy },
  { CustomerID = BSBEV, ContactName = Victoria Ashworth },
  { CustomerID = CONSH, ContactName = Elizabeth Brown },
  { CustomerID = EASTC, ContactName = Ann Devon },
  { CustomerID = ISLAT, ContactName = Helen Bennett },
  { CustomerID = NORTS, ContactName = Simon Crowther },
  { CustomerID = SEVES, ContactName = Hari Kumar }
]
*///:~

The Enumerable extension method Count( ) returns the number of items in the sequence.

The compiler uses a different overload of GroupBy( ) when we group the actual Customer objects because the source collection already has the objects we are choosing. We no longer need the element selector (the third argument to Enumerable.GroupBy( )). This may be why the group…by clause arguments are swapped in comparison to the GroupBy( ) method arguments. Most clauses pass their arguments to the corresponding method call in the order they appear in the clause. However, group…by breaks this trend, possibly because of this particular overload: if the delegate arguments were swapped, the compiler would always have to include the unnecessary selector.

The last grouping also groups by Country, but we choose a new anonymous instance instead of a Customer or a Customer property value as before.

There is also a version of GroupBy( ) that takes an IEqualityComparer<TKey>, so you can provide a comparer that determines which keys are equal. You must call this overload explicitly (there’s no query expression syntax support).

To group by more than one field, use an anonymous type as the key. For example, here we find Customers who have made more than one Order on the same OrderDate by grouping by the CustomerID and the OrderDate:

//: QueryExpressions\GroupingByMoreThanOneField.cs
using System.Linq;

class GroupingByMoreThanOneField {
  static void Main() {
    var groups =
      from o in Northwind.Orders
      group o by new { o.CustomerID, o.OrderDate };
    var moreThanOneOrderOnASingleDay =
      from g in groups
      where g.Count() > 1
      orderby g.Key.CustomerID
      select g.Key.CustomerID;
    moreThanOneOrderOnASingleDay.P(POptions.NoNewLines);
  }
} /* Output:
[BOTTM, GREAL, KOENE, LACOR, LINOD, SAVEA, SAVEA]
*///:~

Notice the composite grouping condition in the first query expression. The second query takes these groups and finds the ones with more than one element. Such groups mean that the Customer made more than one Order on that OrderDate. The CustomerID “SAVEA” appears twice in the output because that Customer made multiple Orders on more than one OrderDate. We could eliminate the duplicate with Distinct( ), but the output is interesting enough to keep.

group…by maintains the original order of its input sequence. If you want the groups ordered, you must order by the grouping criteria. If you want the elements within each group ordered, you must order them as well:

//: QueryExpressions\OrderedGroups.cs
using System.Linq;
using System.Collections.Generic;

class OrderedGroups {
  static void ShowSomeGroups(string message,
    IEnumerable<Customer> customers) {
    (message + ":").P();
    var grouped =
      from c in customers
      group c.ContactName by c.Country;
    foreach(var group in grouped.Take(3))
      group.Take(3)
        .P(" " + group.Key, POptions.NoNewLines);
  }
  static void Main() {
    ShowSomeGroups("No ordering", Northwind.Customers);
    ShowSomeGroups("Ordered by ContactName",
      Northwind.Customers.OrderBy(c => c.ContactName));
    ShowSomeGroups("Ordered by Country",
      Northwind.Customers.OrderBy(c => c.Country));
    ShowSomeGroups("Ordered by ContactName then Country",
      Northwind.Customers.OrderBy(c => c.ContactName)
        .ThenBy(c => c.Country));
    ShowSomeGroups("Ordered by Country then ContactName",
      Northwind.Customers.OrderBy(c => c.Country)
        .ThenBy(c => c.ContactName));
  }
} /* Output:
No ordering:
 Germany: [Maria Anders, Hanna Moos, Sven Ottlieb]
 Mexico: [Ana Trujillo, Antonio Moreno, Francisco Chang]
 UK: [Thomas Hardy, Victoria Ashworth, Elizabeth Brown]
Ordered by ContactName:
 Spain: [Alejandra Camino, Diego Roel, Eduardo Saavedra]
 Germany: [Alexander Feuer, Hanna Moos, Henriette Pfalzheim]
 Mexico: [Ana Trujillo, Antonio Moreno, Francisco Chang]
Ordered by Country:
 Argentina: [Patricio Simpson, Yvonne Moncada, Sergio Gutiérrez]
 Austria: [Roland Mendel, Georg Pipps]
 Belgium: [Catherine Dewey, Pascale Cartrain]
Ordered by ContactName then Country:
 Spain: [Alejandra Camino, Diego Roel, Eduardo Saavedra]
 Germany: [Alexander Feuer, Hanna Moos, Henriette Pfalzheim]
 Mexico: [Ana Trujillo, Antonio Moreno, Francisco Chang]
Ordered by Country then ContactName:
 Argentina: [Patricio Simpson, Sergio Gutiérrez, Yvonne Moncada]
 Austria: [Georg Pipps, Roland Mendel]
 Belgium: [Catherine Dewey, Pascale Cartrain]
*///:~

ShowSomeGroups( ) displays some sample group names along with some of the items in each group. When we order by just the ContactNames, the groups are unordered, but the actual elements within each group are ordered. This is reversed when we order only by the Country names. Notice that ordering by ContactName then Country puts the ContactNames in order, but the group names are re-scrambled during the grouping process. However, ordering by Country and then ContactName orders both the group names, and the elements within each group, which is usually what you want.
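The GroupBy( ) overload that accepts an IEqualityComparer, mentioned earlier in this section, can be sketched as follows. This is a minimal illustration rather than Northwind code: the country spellings are invented, and we assume the framework's StringComparer.OrdinalIgnoreCase is an acceptable notion of equality for the keys.

```csharp
using System;
using System.Linq;

static class GroupingWithComparer {
  // Invented sample data with inconsistent casing:
  static readonly string[] Countries =
    { "Mexico", "UK", "mexico", "uk", "Germany", "MEXICO" };

  public static string[] CountsIgnoringCase() {
    // The comparer decides which keys are equal. Each
    // group's Key is the first spelling encountered, and
    // groups appear in order of first appearance:
    return Countries
      .GroupBy(c => c, StringComparer.OrdinalIgnoreCase)
      .Select(g => g.Key + ": " + g.Count())
      .ToArray();
  }
  static void Main() {
    foreach(string line in CountsIgnoringCase())
      Console.WriteLine(line);
  }
} /* Output:
Mexico: 3
UK: 2
Germany: 1
*/
```

Because the group…by clause has no place to put a comparer, the call must use method syntax, as here.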


Exercise 25: Use Enumerable.Count( ) to find the number of elements in each group of OrderDetails, grouped by their ProductID.

Exercise 26: List each country and its number of Customers (much like the previous exercise).

Exercise 27: Group all Customers by whether their Phone numbers include parentheses. Although not necessary to solve this exercise, try making a custom IEqualityComparer to accomplish the task.

Joining data

Foreign-key constraints require a value to exist in one table before another table can refer to it. For example, for an Order to exist, its CustomerID must exist in the Customers table:

//: QueryExpressions\ACustomersOrders.cs
using System.Linq;

class ACustomersOrders {
  static void Main() {
    string customerID =
      Northwind.Customers.ElementAt(15).CustomerID;
    var orders =
      from o in Northwind.Orders
      where o.CustomerID.Equals(customerID)
      select o.OrderID;
    orders.P(customerID, POptions.NoNewLines);
  }
} /* Output:
CONSH: [10435, 10462, 10848]
*///:~

Here we first grab an arbitrary CustomerID, and then find all Orders with that same CustomerID. With the Northwind database’s default setup, if you attempt to insert an Order for a CustomerID that doesn’t exist, the database reports a foreign-key constraint violation. It’s common to combine the data from different tables on foreign keys,22 which you do using a join:

22 You can join on any fields you like, but foreign keys are the most common.

//: QueryExpressions\Join.cs
using System.Linq;

class Join {
  static void Main() {
    // Get the customers with their order IDs:
    var customerOrders =
      from c in Northwind.Customers
      join o in Northwind.Orders
        on c.CustomerID equals o.CustomerID
      select new { c.ContactName, o.OrderID };
    // Translation:
    var customerOrdersTranslation1 =
      Northwind.Customers.Join(Northwind.Orders,
        c => c.CustomerID, o => o.CustomerID,
        (c, o) => new { c.ContactName, o.OrderID });
    customerOrders.AssertEquals(
      customerOrdersTranslation1);
    var customerOrdersTranslation2 =
      Enumerable.Join(Northwind.Customers,
        Northwind.Orders,
        c => c.CustomerID, o => o.CustomerID,
        (c, o) => new { c.ContactName, o.OrderID });
    customerOrdersTranslation1.AssertEquals(
      customerOrdersTranslation2);
    // Can forgo the join and use multiple from clauses:
    var customerOrders2 =
      from c in Northwind.Customers
      from o in Northwind.Orders
      where c.CustomerID == o.CustomerID
      select new { c.ContactName, o.OrderID };
    customerOrders.AssertEquals(customerOrders2);
    customerOrders.Take(10).P();
  }
} /* Output:
[
  { ContactName = Maria Anders, OrderID = 10643 },
  { ContactName = Maria Anders, OrderID = 10692 },
  { ContactName = Maria Anders, OrderID = 10702 },
  { ContactName = Maria Anders, OrderID = 10835 },
  { ContactName = Maria Anders, OrderID = 10952 },
  { ContactName = Maria Anders, OrderID = 11011 },
  { ContactName = Ana Trujillo, OrderID = 10308 },
  { ContactName = Ana Trujillo, OrderID = 10625 },
  { ContactName = Ana Trujillo, OrderID = 10759 },
  { ContactName = Ana Trujillo, OrderID = 10926 }
]
*///:~

The compiler changes a join clause to a Join( ) call. Join( ) takes two collections and a key selector for each. Each selector returns the value to compare with the other’s corresponding result. For example, we can’t compare Customer objects directly to Order objects. Instead, the first selector returns a Customer’s CustomerID, and the second selector returns an Order’s CustomerID. Join( ) finds all matching pairs and sends them to its last selector argument.

In the last query, two from clauses achieve the same results without the join. We replace join with from, on with where, and equals with ==. But this is less intuitive and requires a SelectMany( ) and a Where( ) call instead of a single Join( ), which does all the matching in one call.

Notice that the compiler combined the join and select clauses into a single Join( ) call. This is fine because Join( )’s last delegate is considered a selector. We’ve seen similar behavior before with multiple from clauses followed directly by a select (the compiler combines the last from and select into a single SelectMany( ) call instead of a SelectMany( ) followed by Select( )).

When the join clause is not immediately followed by a select, the compiler packs values into an anonymous type for further evaluation:

//: QueryExpressions\JoinNotFollowedBySelect.cs
using System.Linq;

class JoinNotFollowedBySelect {
  static void Main() {
    var customerOrders =
      from c in Northwind.Customers
      join o in Northwind.Orders
        on c.CustomerID equals o.CustomerID
      where c.Country.Contains('a') // Now a where clause
      select new { c.ContactName, o.OrderID };
    // Different translation this time:
    var customerOrdersTranslation =
      Northwind.Customers.Join(
        Northwind.Orders,
        c => c.CustomerID,
        o => o.CustomerID,
        (c, o) => new { c, o })
      .Where(temp => temp.c.Country.Contains('a'))
      .Select(temp => new {
        temp.c.ContactName,
        temp.o.OrderID
      });
    customerOrders.AssertEquals(customerOrdersTranslation);
  }
} ///:~

Now the compiler packs c and o together into an anonymous type, and extracts c in the Where( ), and both c and o in the Select( ).

Since join, like from, introduces another iteration variable and a source, you can refer to the iteration variable later in the query. However, you are restricted as to where you can use the variable:

//: QueryExpressions\JoinVariableRestrictions.cs
using System.Linq;

class JoinVariableRestrictions {
  static void Main() {
    var joined =
      from c in Northwind.Customers
      join o in Northwind.Orders
        // OK:
        on c.CustomerID equals o.CustomerID
      select new { c, o };
    /*c!
    var error =
      from c in Northwind.Customers
      join o in Northwind.Orders
        // c only available on left side of "equals"
        // o only available on right side of "equals"
        on o.CustomerID equals c.CustomerID
      select new { c, o };
    */
    // This makes sense when you
    // look at the translation:
    var error2 =
      Northwind.Customers.Join(Northwind.Orders,
        o => o.CustomerID, // o is really a Customer!
        c => c.CustomerID, // c is really an Order!
        (c, o) => new { c, o });
  }
} ///:~

The second query attempts to use o on the left side of the equals and c on the right. But Customers is the left input, and Orders is the right. Thus swapping the variables doesn’t make sense, and the compiler catches it. The translation further illustrates this. Although the translation compiles, o is really the Customer object, and c is the Order.

You can chain joins:

//: QueryExpressions\JoinChains.cs
using System.Linq;

class JoinChains {
  static void Main() {
    // Get all the products a customer has ordered:
    var customerProducts =
      from c in Northwind.Customers
      join o in Northwind.Orders
        on c.CustomerID equals o.CustomerID
      join d in Northwind.OrderDetails
        on o.OrderID equals d.OrderID
      join p in Northwind.Products
        on d.ProductID equals p.ProductID
      select new { c.ContactName, p.ProductName };
    customerProducts.Take(5).P();
  }
} /* Output:
[
  { ContactName = Maria Anders, ProductName = Rössle Sauerkraut },
  { ContactName = Maria Anders, ProductName = Chartreuse verte },
  { ContactName = Maria Anders, ProductName = Spegesild },
  { ContactName = Maria Anders, ProductName = Vegie-spread },
  { ContactName = Maria Anders, ProductName = Aniseed Syrup }
]
*///:~

We must join several data sources to map each ContactName to the ProductNames they have purchased. The translation of the query above is rather involved because it requires the compiler to pack several layers deep, one layer for each subsequent join:

//: QueryExpressions\JoinChainsTranslated.cs
using System.Linq;

class JoinChainsTranslated {
  static void Main() {
    var customerProducts =
      Northwind.Customers
      .Join(Northwind.Orders,
        c => c.CustomerID,
        o => o.CustomerID,
        (c, o) => new { c, o })
      .Join(Northwind.OrderDetails,
        // nt means "nonTransparent"
        nt => nt.o.OrderID,
        d => d.OrderID,
        (nt, d) => new { nt, d })
      .Join(Northwind.Products,
        nt2 => nt2.d.ProductID,
        p => p.ProductID,
        (nt2, p) => new {
          nt2.nt.c.ContactName,
          p.ProductName
        });
    customerProducts.Take(5).P();
  }
} /* Output:
[
  { ContactName = Maria Anders, ProductName = Rössle Sauerkraut },
  { ContactName = Maria Anders, ProductName = Chartreuse verte },
  { ContactName = Maria Anders, ProductName = Spegesild },
  { ContactName = Maria Anders, ProductName = Vegie-spread },
  { ContactName = Maria Anders, ProductName = Aniseed Syrup }
]
*///:~

Notice the first Join( ) packs c and o together into an anonymous type. The second Join( ) unpacks o in its first lambda (we named the anonymous type instance nt) to select OrderID for comparison. In its final selector lambda, the second Join( ) packs nt and d into yet another anonymous type for the third Join( ). The third Join( )’s first and last lambda expressions also unpack. Notice in the last lambda how far the compiler must unpack to retrieve ContactName.


You can join any two data sources on anything as long as the types on both sides of the equals are compatible. Anonymous types are an excellent way to join on more than one field (or, a more likely situation, composite keys):

//: QueryExpressions\JoiningOnMoreThanOneField.cs
using System.Linq;

class JoiningOnMoreThanOneField {
  static void Main() {
    var moreThanOneField =
      from c in Northwind.Customers
      join o in Northwind.Orders
        on new { c.CustomerID, c.Country }
        equals new { o.CustomerID, Country = o.ShipCountry }
      select new { c.ContactName, o.OrderID };
  }
} ///:~

Here we join Customers and Orders when the CustomerIDs match, and when the Country and ShipCountry match, respectively. We use an anonymous type on both sides of the equals. This is possible because the compiler generates Equals( ) and GetHashCode( ) for anonymous types. joins will also group with into, but we must first examine some other concepts.
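The value equality that makes anonymous types usable as composite join keys can be checked directly. The following is a small sketch using invented in-memory data rather than Northwind; the class name CompositeKeyJoin and the property names are assumptions made for the illustration. Note that both sides of the equals must produce the same anonymous type, which means the same property names, types, and order.

```csharp
using System;
using System.Linq;

static class CompositeKeyJoin {
  public static string[] MatchingPairs() {
    // Invented sample data, not Northwind:
    var customers = new[] {
      new { ID = "A", Country = "UK", Name = "Ann" },
      new { ID = "B", Country = "USA", Name = "Bob" }
    };
    var orders = new[] {
      new { OrderID = 1, CustomerID = "A", ShipCountry = "UK" },
      new { OrderID = 2, CustomerID = "A", ShipCountry = "USA" },
      new { OrderID = 3, CustomerID = "B", ShipCountry = "USA" }
    };
    var joined =
      from c in customers
      join o in orders
        on new { c.ID, c.Country }
        equals new { ID = o.CustomerID, Country = o.ShipCountry }
      select c.Name + " " + o.OrderID;
    return joined.ToArray();
  }
  static void Main() {
    // Two separately created instances with equal contents:
    var k1 = new { ID = "A", Country = "UK" };
    var k2 = new { ID = "A", Country = "UK" };
    Console.WriteLine(k1.Equals(k2));
    Console.WriteLine(k1.GetHashCode() == k2.GetHashCode());
    Console.WriteLine(string.Join(", ", MatchingPairs()));
  }
} /* Output:
True
True
Ann 1, Bob 3
*/
```

Order 2 is dropped because its ShipCountry does not match Ann's Country, even though the CustomerID does.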

Exercise 28: Use a join and a group…by to group all OrderIDs by their Customer ContactNames. (Hint: In the translation, you must pack values in the Join( ) for extraction by GroupBy( )).

Exercise 29: join Customers with itself to find any Customers that have the same ContactName.

Exercise 30: Pair all Customers together, listing the Customer with the earlier name in the alphabet first. (Hint: A join won’t work for this one. Why not?)

Exercise 31: List each CustomerID along with the ProductIDs they ordered, as well as the number of times they have ordered that Product. (Hint: group the OrderDetails by an anonymous type that holds each CustomerID with the ProductID, then count how many are in each group).

Exercise 32: Join Customers with Orders on Customer.Country and Order.ShipCountry.

Exercise 33: You’ve seen iteration variables on the proper sides of the equals. Now try this variation in JoinChains.cs: join Orders to OrderDetails to Customers to Products, in that order. Does this work? Why? (Hint: The translation makes the answer clear.)

Exercise 34: Write two queries to retrieve all Orders, pairing each Order with its OrderDetails into an anonymous type:
a. The first query uses a join
b. The second query uses multiple froms

Nested Queries

A nested query is a query within a query. Since queries evaluate IEnumerable types,23 preparing data via IEnumerable extension methods is a form of nested query:

//: QueryExpressions\DataPrep.cs
using System.Linq;

class DataPrep {
  static void Main() {
    var firstOrderedDates =
      from o in Northwind.Orders.Take(10)
      orderby o.OrderDate
      select o.OrderDate;
    firstOrderedDates.P();
  }
} /* Output:
[
  7/4/1996 12:00:00 AM,
  7/5/1996 12:00:00 AM,
  7/8/1996 12:00:00 AM,
  7/8/1996 12:00:00 AM,
  7/9/1996 12:00:00 AM,
  7/10/1996 12:00:00 AM,
  7/11/1996 12:00:00 AM,
  7/12/1996 12:00:00 AM,
  7/15/1996 12:00:00 AM,
  7/16/1996 12:00:00 AM
]
*///:~

23 The semantics are the same for both IEnumerable and IQueryable, and Enumerable and Queryable.

Calling Take(10) on Orders in the from clause keeps only the first ten Order objects before the query evaluates its results. You must call Take( ) directly because there’s no clause that maps to it in query expressions. In this example, you could consider Take( ) a nested query because it manipulates the data before the from clause.

If you are querying a non-generic type, cast each element by preceding the variable name in its declaration with the proper typename. The compiler prepares the data via a Cast( ) call, making this a form of nested query:

//: QueryExpressions\Casting.cs
using System.Linq;
using System.Collections;

class Casting {
  static void Main() {
    var numbers = new ArrayList(new []{ 1, 2, 3, 4, 5 });
    var timesTwo =
      from int n in numbers // Explicit declaration
      select n * 2;
    // Translates to:
    timesTwo =
      from n in numbers.Cast<int>()
      select n * 2;
    // Translates to:
    timesTwo = numbers.Cast<int>().Select(n => n * 2);
    // Translates to:
    timesTwo =
      Enumerable.Select(
        Enumerable.Cast<int>(numbers), n => n * 2);
  }
} ///:~

Cast( ) converts the non-generic ArrayList of numbers to IEnumerable<T>, where T is the target cast type (int in this example). Of course, the cast will fail if any object in the container is not convertible to the target type.
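A failing cast can be sketched as follows. The mixed ArrayList and the class name CastFailure are invented for this illustration. Enumerable.OfType( ), which this section does not otherwise cover, is shown only for contrast: it skips elements of the wrong type instead of throwing.

```csharp
using System;
using System.Collections;
using System.Linq;

static class CastFailure {
  // Invented data: the second element is not an int
  static readonly ArrayList Mixed =
    new ArrayList { 1, "two", 3 };

  public static bool CastThrows() {
    try {
      // Cast<int>() is deferred; the failure surfaces
      // only when the sequence is enumerated:
      Mixed.Cast<int>().ToList();
      return false;
    } catch(InvalidCastException) {
      return true;
    }
  }
  public static int[] OnlyInts() {
    // OfType<int>() skips elements that are not ints:
    return Mixed.OfType<int>().ToArray();
  }
  static void Main() {
    Console.WriteLine(CastThrows());
    Console.WriteLine(string.Join(", ",
      OnlyInts().Select(i => i.ToString()).ToArray()));
  }
} /* Output:
True
1, 3
*/
```

Because the explicit type in a from clause translates to Cast( ), a query such as "from int n in Mixed" would fail in the same way once it is enumerated.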


You can also cast in join clauses, since join is a form of from:

//: QueryExpressions\CastingJoins.cs
using System.Linq;
using MindView.Util;

static class CastingJoins {
  static void Main() {
    var customers = Northwind.Customers.ToArrayList();
    var orders = Northwind.Orders.ToArrayList();
    var customerOrderDates =
      from Customer c in customers
      join Order o in orders
        on c.CustomerID equals o.CustomerID
      select new { c.ContactName, o.OrderDate };
  }
} ///:~

For demonstration only, we put Customers and Orders into ArrayLists and cast them back.

Since query expressions are expressions, you can use them anywhere an IEnumerable is needed. For example, here we use a nested query (instead of a join) to find all Customers who have Orders:

//: QueryExpressions\CustomersWithOrders.cs
using System.Linq;

class CustomersWithOrders {
  static void Main() {
    // Find all customers that have orders:
    var result1 =
      from c in Northwind.Customers
      where (from o in Northwind.Orders
             select o.CustomerID).Contains(c.CustomerID)
      select c;
    // Two customers that don't have any orders:
    (Northwind.Customers.Count() - result1.Count())
      .AssertEquals(2);
    // Slightly more readable using Select() directly:
    var result2 =
      from c in Northwind.Customers
      where Northwind.Orders.Select(o => o.CustomerID)
        .Contains(c.CustomerID)
      select c;
    result1.AssertEquals(result2);
    // Above query is a step in the translation. The rest:
    var result3 =
      Northwind.Customers.Where(c =>
        Northwind.Orders.Select(o => o.CustomerID)
        .Contains(c.CustomerID)); // Select() is degenerate
    result2.AssertEquals(result3);
    var result4 =
      Enumerable.Where(Northwind.Customers,
        c => Enumerable.Contains(
          Enumerable.Select(Northwind.Orders,
            o => o.CustomerID),
          c.CustomerID)
      );
    result3.AssertEquals(result4);
  }
} ///:~

The outer query embeds a query in the where clause (instead of from, as we just saw).

Exercise 35: Use nested queries to find all the Customer ContactNames in the three countries that have the most Orders. (Hint: Write a query in the where clause to find the three countries with the most orders, then combine that query with Enumerable.Contains( ) in the where.)

into

As we said earlier, query expressions must end in either a select or group…by. Therefore, you must insert all filtering, joining, and ordering clauses before the final selection clause. into, however, places each result of the query above it into a variable for the query below. into concatenates two queries, and the first query appears to continue after its selection clause:

//: QueryExpressions\Into.cs
using System.Linq;

class Into {
  static void Main() {
    int[] numbers = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
    // Without into:
    var results1 =
      from i in
        // Above query:
        from o in numbers
        where o < 7
        select o
      // Below query:
      where i > 1
      select i;
    // into nests its above query into its below query
    var results2 =
      // Above query:
      from o in numbers
      where o < 7
      select o
      into i
      // Below query:
      where i > 1
      select i;
    results1.AssertEquals(results2);
  }
} ///:~

Notice that into simply re-orders the queries, nesting its above query into a hidden from clause for its below query. into allows us to write queries in their order of execution, which improves readability appreciably (and spares you deciphering from the innermost query outward).

into also works with group…by. Here is the same query expressed three different ways, the first two without into:

//: QueryExpressions\NestedGrouping.cs
using System.Linq;

class NestedGrouping {
  static void Main() {
    // How many orders from each country?
    // Get groupings first:
    var groupedOrders =
      from o in Northwind.Orders
      group o by o.ShipCountry;
    // Count each element in each IGrouping:
    var counts1 =
      from g in groupedOrders
      select new {
        Country = g.Key,
        OrderCount = g.Count()
      };
    // Or, just nest the first query:
    var counts2 =
      from g in
        from o in Northwind.Orders
        group o by o.ShipCountry
      select new {
        Country = g.Key,
        OrderCount = g.Count()
      };
    counts1.AssertEquals(counts2);
    // Using into is best:
    var counts3 =
      from o in Northwind.Orders
      group o by o.ShipCountry into g
      select new {
        Country = g.Key,
        OrderCount = g.Count()
      };
    counts2.AssertEquals(counts3);
  }
} ///:~

Remember GroupBy( ) returns IGrouping objects, which are Keys mapped to an IEnumerable sequence of the objects. We Count( ) the number of elements in each group and choose the result using the corresponding Key (ShipCountry name) in an anonymous type. The first approach stores the results of the first of its two queries in the temporary groupedOrders variable. The second approach nests the first query, so it needs no temporary variable. The into clause makes the last query much easier to read. into’s elegance makes our third approach the best of three solutions (though they retrieve the same results). into with a group…by nests the preceding query into a variable for the following query, making the two queries read as one.24

24 Just as we have used a select clause for the same result.
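As a runnable sketch of the same idea, the following query continues past group…into with a where and an orderby before its final select. The country list is invented and merely stands in for Orders; IntoContinuation is a hypothetical name.

```csharp
using System;
using System.Linq;

static class IntoContinuation {
  public static string[] CountsByCountry() {
    // Invented sample data standing in for Orders:
    string[] shipCountries =
      { "UK", "USA", "UK", "Mexico", "USA", "UK" };
    var counts =
      from country in shipCountries
      group country by country into g
      // Only g is in scope from here on:
      where g.Count() > 1
      orderby g.Count() descending
      select g.Key + ": " + g.Count();
    return counts.ToArray();
  }
  static void Main() {
    foreach(string line in CountsByCountry())
      Console.WriteLine(line);
  }
} /* Output:
UK: 3
USA: 2
*/
```

The iteration variable country is not visible after into, which is consistent with the nested translation: the continued query only sees the results of the query above it.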


Exercise 36: Rewrite the first query in GroupingDifferentTypes.cs to combine each group’s Key and Count( ) into a single anonymous object (instead of having to Count( ) within the foreach).

Exercise 37: Use into to find all the Customers in the three countries with the most orders. This is a slight revision of [[[exercise x]]].

Exercise 38: List all CustomerIDs along with the ProductIDs they ordered combined with the number of times they ordered that product. Sort the results by descending number of times that the Customer ordered each Product. Use a group into for this repeat of Exercise {#?field code}.

let clause translations

Now that you understand nested queries, you can decipher the compiler’s translation of let clauses. C# uses transparent identifiers when translating let clauses:

//: QueryExpressions\LetClauseExposed.cs
using System.Linq;

class LetClauseExposed {
  static void Main() {
    int[] numbers = { 1, 2, 3, 4, 5 };
    // You saw this query before
    // in the let clause section:
    var squareGreaterThanFive =
      from n in numbers
      let square = n * n
      where square > 5
      select new { n, square };
    squareGreaterThanFive.P();
    // Translates to:
    /*c!
    var squareGreaterThanFiveTranslation =
      from transparent in
        from n in numbers
        select new { n, square = n * n }
      where square > 5
      select new { n, square };
    */
    // Scope transparent identifiers:
    var squareGreaterThanFiveTranslation2 =
      from nonTransparent in
        from n in numbers
        select new { n, square = n * n }
      where nonTransparent.square > 5
      select new {
        nonTransparent.n,
        nonTransparent.square
      };
    squareGreaterThanFive
      .AssertEquals(squareGreaterThanFiveTranslation2);
    // etc.
  }
} /* Output:
[
  { n = 3, square = 9 },
  { n = 4, square = 16 },
  { n = 5, square = 25 }
]
*///:~

The middle query doesn’t compile because C# doesn’t provide transparent identifiers directly. Internally, however, the compiler uses them to make query expressions easier to read and write. The compiler shifts the let expression into a nested query with a single select (as the second query shows). The nested query packs the original iteration variable (n) and a second field using the let variable name (square = n * n) into an anonymous type. The compiler’s let clause translation poses an issue: we don’t want to be aware of the anonymous type. Transparent variables resolve this by opening their scope. For example, the second query’s transparent variable is an instance of the generated anonymous type with fields {n, square}. The query doesn’t compile because we don’t explicitly scope transparent.square in the where and the last select clause. Yet the first query compiles when the compiler translates it in exactly the same way. The compiler puts square into scope by explicitly scoping it, as shown in the third query (the next step in the translation process). The basic use of let would require deeper knowledge if we had to be aware of and compensate for the compiler-injected anonymous type and query. Writing and reading let clauses would feel unnatural. Notice that if the compiler allowed us to refer directly to the transparent identifier, we could select it rather than having to repack both n and square together into an anonymous type. It so happens that what we’re selecting has


the same structure as the transparent identifier, but this is usually not the case.

When you have multiple let clauses, the rules apply recursively, as each let clause injects a nested from and packs transparent identifiers within transparent identifiers:

//: QueryExpressions\MultipleLetClauses.cs
using System.Linq;

class MultipleLetClauses {
  static void Main() {
    var productDetailPrices =
      from od in Northwind.OrderDetails
      let FullPrice = od.Quantity * od.UnitPrice
      let Key = new { od.OrderID, od.ProductID }
      select new { Key, FullPrice };
    // Rewrite first let to nested from:
    /*c!
    var productDetailPricesTranslation1 =
      from transparent1 in
        from od in Northwind.OrderDetails
        select new {
          od,
          FullPrice = od.Quantity * od.UnitPrice
        }
      let Key = new { od.OrderID, od.ProductID }
      select new { Key, FullPrice };
    */
    // Now rewrite second let clause to nested from:
    /*c!
    var productDetailPricesTranslation2 =
      from transparent2 in
        from transparent1 in
          from od in Northwind.OrderDetails
          select new {
            od,
            FullPrice = od.Quantity * od.UnitPrice
          }
        select new {
          transparent1,
          Key = new { od.OrderID, od.ProductID }
        }
      select new { Key, FullPrice };
    */
    // Scope the transparent identifiers:
    var productDetailPricesTranslation3 =
      from transparent2 in
        from transparent1 in
          from od in Northwind.OrderDetails
          select new {
            od,
            FullPrice = od.Quantity * od.UnitPrice
          }
        select new {
          transparent1,
          Key = new {
            transparent1.od.OrderID,
            transparent1.od.ProductID
          }
        }
      select new {
        transparent2.Key,
        transparent2.transparent1.FullPrice
      };
    productDetailPrices.AssertEquals(
      productDetailPricesTranslation3);
    // Rewrite to extension-method calls:
    var productDetailPricesTranslation4 =
      Northwind.OrderDetails.
      Select(
        od => new {
          od,
          FullPrice = od.Quantity * od.UnitPrice
        }).
      Select(transparent1 =>
        new {
          transparent1,
          Key = new {
            transparent1.od.OrderID,
            transparent1.od.ProductID
          }
        }).
      Select(transparent2 =>
        new {
          transparent2.Key,
          transparent2.transparent1.FullPrice
        });
    productDetailPricesTranslation3.AssertEquals(
      productDetailPricesTranslation4);
    // Finally rewrite to static-method calls:
    var productDetailPricesTranslation5 =
      Enumerable.Select(
        Enumerable.Select(
          Enumerable.Select(
            Northwind.OrderDetails,
            od => new {
              od,
              FullPrice = od.Quantity * od.UnitPrice
            }),
          transparent1 => new {
            transparent1,
            Key = new {
              transparent1.od.OrderID,
              transparent1.od.ProductID
            }
          }),
        transparent2 => new {
          transparent2.Key,
          transparent2.transparent1.FullPrice
        });
    productDetailPricesTranslation4.AssertEquals(
      productDetailPricesTranslation5);
  }
} ///:~

The compiler processes let clauses from the top down. First it makes a nested from clause to introduce a transparent identifier for FullPrice. Then it repeats the step for Key, making the first nested query into yet another nested query. Although we commented out the transparent identifier code, notice that the compiler can scope a variable regardless of the level of nesting. For example, the compiler explicitly qualifies productDetailPricesTranslation2’s FullPrice as transparent2.transparent1.FullPrice in the next step of the translation (productDetailPricesTranslation3), even though it is nested two levels deep. This example makes clear how much work the compiler does when you give it more than one let clause (though the extra compile time is minimal).

To translate a let clause by hand, begin by indenting the from together with its let clause. Then change the let clause to an anonymous-type select that combines the iteration variable and the let variable. Finally, write another from clause at the top that introduces a transparent variable.


Exercise 39: Show all the intermediate steps of translating the last query in RootFinder.cs.

Exercise 40: The Customer’s ContactName field combines a space-separated first name and last name (and sometimes a middle name or middle token, such as “de Castro”). Using let clauses, write a query that retrieves first and last (or last part of) names, and packs the two strings into an anonymous type with FirstName and LastName properties.

let vs. into

Both let and into introduce temporary variables in a query and translate into nested queries for another from clause, with only slightly different semantics.25 Here we’ll reconsider a query that sums the squares of all numbers (as in the LetClause.cs exercise). This time we introduce our square variable using an into instead of a let clause, and compare both approaches:

//: QueryExpressions\IntoVsLet.cs
using System.Linq;

class IntoVsLet {
  static void Main() {
    var numbers = new[] { 1, 3, 2 };
    var withLet =
      from n in numbers
      let square = n * n // "let" version
      select square + square;
    var withInto =
      from n in numbers
      select n * n into square // "into" version
      select square + square;
    withLet.AssertEquals(withInto);
    // Skipping transparent identifier
    // step in the translation here:
    var withLetTranslation1 =
      from nt in
        from n in numbers
        select new { n, square = n * n }
      select nt.square + nt.square;
    var withIntoTranslation1 =
      from square in
        from n in numbers
        select n * n
      select square + square;
    withLet.AssertEquals(withLetTranslation1);
    withInto.AssertEquals(withIntoTranslation1);
    var withLetTranslation2 =
      numbers
      .Select(n => new { n, square = n * n })
      .Select(nt => nt.square + nt.square);
    withLetTranslation1.AssertEquals(withLetTranslation2);
    var withIntoTranslation2 =
      numbers
      .Select(n => n * n)
      .Select(square => square + square);
    withIntoTranslation1
      .AssertEquals(withIntoTranslation2);
  }
} ///:~

25 Our thanks to Levi Beaver for the question that inspired this section.

These two queries sum every element’s square. The only difference between them appears in the second line of each query, where they introduce square. The key difference between the two (very similar) translations lies in whether a temporary anonymous type is used. into shifts its query’s top portion to a nested query, but let’s conversion combines the iteration variable and the new variable, square, into the resulting anonymous type. Therefore, while into’s original iteration variable n falls out of scope, let packs both n and the new variable square into an anonymous type, so both remain in scope for further query clauses.

Consider the readability of both approaches: into gives the impression that we are moving into another query, while let says “make a variable with this value.” let is therefore more appropriate in this situation.

Exercise 41: Try to change both queries in IntoVsLet.cs to select each value and the sum of its square into an anonymous type, instead of just the square. See what kind of compiler messages this produces.


joining into

You have seen how into concatenates queries with select and group…by clauses. join clauses are into’s third and final use, with some semantic changes.

Our previous join examples work until we try to group the results. For example, each Customer is likely to have more than one Order (a one-to-many relationship), so joining Orders with Customers produces duplicate Customers paired with each unique Order. We could fix this by grouping after the join:

//: QueryExpressions\JoiningThenGrouping.cs
using System.Linq;

class JoiningThenGrouping {
  static void Main() {
    // Count how many orders each Customer has:
    var customerOrderCounts =
      from c in Northwind.Customers
      join o in Northwind.Orders on
        c.CustomerID equals o.CustomerID
      group o by c.CustomerID into customerOrders
      select new {
        CustomerID = customerOrders.Key,
        NumOrders = customerOrders.Count()
      };
    customerOrderCounts.Take(3).P();
  }
} /* Output:
[
  { CustomerID = ALFKI, NumOrders = 6 },
  { CustomerID = ANATR, NumOrders = 4 },
  { CustomerID = ANTON, NumOrders = 7 }
]
*///:~

We eliminate duplicate Customer objects by pairing each Customer with its Orders, and then group the Orders by their CustomerID. The above approach first joins, then groups. When we instead place into after the join clause, we group while joining:

//: QueryExpressions\GroupingWhileJoining.cs
using System.Linq;
using System.Collections.Generic;

class GroupingWhileJoining {
  static void Main() {
    var customerOrderCounts =
      from c in Northwind.Customers
      join o in Northwind.Orders on
        c.CustomerID equals o.CustomerID
        into customerOrders
      select new {
        c.CustomerID,
        NumOrders = customerOrders.Count()
      };
    customerOrderCounts.Take(3).P();
    // Translation:
    var customerOrderCountsTranslation1 =
      Northwind.Customers.GroupJoin(Northwind.Orders,
        c => c.CustomerID, o => o.CustomerID,
        (Customer c, IEnumerable<Order> customerOrders) =>
          new {
            c.CustomerID,
            NumOrders = customerOrders.Count()
          });
    customerOrderCountsTranslation1
      .AssertEquals(customerOrderCounts);
    var customerOrderCountsTranslation2 =
      Enumerable.GroupJoin(Northwind.Customers,
        Northwind.Orders,
        c => c.CustomerID, o => o.CustomerID,
        (Customer c, IEnumerable<Order> customerOrders) =>
          new {
            c.CustomerID,
            NumOrders = customerOrders.Count()
          });
    customerOrderCountsTranslation1
      .AssertEquals(customerOrderCountsTranslation2);
  }
} /* Output:
[
  { CustomerID = ALFKI, NumOrders = 6 },
  { CustomerID = ANATR, NumOrders = 4 },
  { CustomerID = ANTON, NumOrders = 7 }
]
*///:~

The first query retrieves the same results as JoiningThenGrouping.cs. However, the into after join now implicitly groups by the join condition, which eliminates the explicit group…by clause. Here we can also reference c.CustomerID directly in the selection, rather than pulling the CustomerID from customerOrders.Key, as we did in JoiningThenGrouping.cs. Moreover, we can now reference the Customer’s ContactName (which is friendlier) rather than CustomerID. In JoiningThenGrouping.cs, we can only do this by re-joining the result to the Customers.

GroupJoin( ) changes customerOrders’s type to an IEnumerable<Order>, instead of JoiningThenGrouping.cs’s IGrouping<string, Order>, as the translation shows. We thereby get the actual Customer object with an IEnumerable<Order> of its orders, instead of an IGrouping (just a Key and the Customer’s Orders).26

Except for their last lambdas, GroupJoin( )’s arguments are identical to Join( )’s. Join( ) invokes its last lambda expression for each Customer-Order pair. GroupJoin( ) invokes its last lambda once for each Customer, along with that Customer’s IEnumerable<Order>.27 Because grouping is implicit in a join into, you see no group…by. Here, into groups the right side of a join (Orders) by the left side (each Customer).

Our previous into examples with select and group…by showed how it nested its “above” query into its “below” query. When into follows join it chains the queries differently, by embedding its below query in the GroupJoin( )’s lambda expression. In the above example, customerOrders is a lambda expression parameter that holds each Customer’s orders. This lambda is the selector, making GroupJoin( ) a selection operation, like select and group…by. GroupJoin( ) returns an IEnumerable<TResult>, as opposed to GroupBy( )’s return of an IEnumerable<IGrouping<TKey, TElement>>. Since GroupJoin( )’s last lambda is a selector, GroupJoin( ), in effect, combines group…by and select.

26 Its IEnumerable<Order> compile-time type masks its actual runtime type, an IGrouping<string, Order>. You could cast it and then assert that its Key is equal to the corresponding Customer parameter’s CustomerID.

27 Notice a subtle difference between Join( ) and GroupJoin( ). Join( ) invokes its last lambda for each matching pair. Customers without Orders won’t have any pairs, so the last lambda never executes. However, GroupJoin( ) produces each Customer with an IEnumerable of its orders, even when that IEnumerable is empty. You can use this empty IEnumerable to perform an outer join, something you’ll see later.

Notice in the example that o drops out of scope once all the Orders (o) are in customerOrders, so you see no o parameter in the translation’s last lambda. The compiler issues an error if you try to use o after the join…into in the original query. In fact, we can now reuse the identifier and, with a from, create an identical new sequence of o from customerOrders:

//: QueryExpressions\ExtractingAGroupedJoin.cs
using System.Linq;

class ExtractingAGroupedJoin {
  static void Main() {
    // Get all ContactNames with their OrderDates:
    var contactNamesToOrderDates =
      from c in Northwind.Customers
      join o in Northwind.Orders on
        c.CustomerID equals o.CustomerID
        into customerOrders
      from o in customerOrders // New o
      select new { c.ContactName, o.OrderDate };
    // Same as doing a normal join:
    var contactNamesToOrderDates2 =
      from c in Northwind.Customers
      join o in Northwind.Orders on
        c.CustomerID equals o.CustomerID
      select new { c.ContactName, o.OrderDate };
    contactNamesToOrderDates
      .AssertEquals(contactNamesToOrderDates2);
  }
} ///:~

The first query counterproductively joins into customerOrders then extracts the results back out in the subsequent from, which translates to a SelectMany( ). The second query just joins with no implicit grouping, and produces the same results. As you’ll see shortly, however, an outer join requires extracting join…into.
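The following stand-alone sketch may help make the SelectMany( ) remark concrete. It does not use Northwind or this book's utility methods; the Person and Purchase types, the sample data, and the method names are all invented for illustration. MethodSyntax( ) is a hand-written approximation of what the compiler is assumed to generate for a join…into followed by a from, with the transparent identifier spelled out as the lambda parameter t.

```csharp
using System;
using System.Linq;

// Hypothetical stand-ins for Customer and Order:
class Person { public string ID; public string Name; }
class Purchase { public string PersonID; public int Number; }

static class GroupJoinFlattening {
  static readonly Person[] people = {
    new Person { ID = "A", Name = "Ann" },
    new Person { ID = "B", Name = "Bob" }, // No purchases
    new Person { ID = "C", Name = "Cy" }
  };
  static readonly Purchase[] purchases = {
    new Purchase { PersonID = "A", Number = 1 },
    new Purchase { PersonID = "C", Number = 2 },
    new Purchase { PersonID = "A", Number = 3 }
  };
  // join...into followed by a from, in query syntax:
  public static string[] QuerySyntax() {
    return (
      from p in people
      join o in purchases on p.ID equals o.PersonID
        into theirs
      from o in theirs // New o
      select p.Name + ":" + o.Number).ToArray();
  }
  // Hand-written approximation of the translation;
  // t plays the role of the transparent identifier:
  public static string[] MethodSyntax() {
    return people
      .GroupJoin(purchases, p => p.ID, o => o.PersonID,
        (p, theirs) => new { p, theirs })
      .SelectMany(t => t.theirs,
        (t, o) => t.p.Name + ":" + o.Number)
      .ToArray();
  }
  // A plain join produces the same pairs:
  public static string[] PlainJoin() {
    return people
      .Join(purchases, p => p.ID, o => o.PersonID,
        (p, o) => p.Name + ":" + o.Number)
      .ToArray();
  }
  static void Main() {
    // Prints: Ann:1, Ann:3, Cy:2
    Console.WriteLine(string.Join(", ", QuerySyntax()));
    // Both comparisons print True:
    Console.WriteLine(
      QuerySyntax().SequenceEqual(MethodSyntax()));
    Console.WriteLine(
      QuerySyntax().SequenceEqual(PlainJoin()));
  }
}
```

Bob never appears in the results because his group is empty, so SelectMany( ) has nothing to flatten for him. That is the gap DefaultIfEmpty( ) fills when you want an outer join.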


Notice that AssertEquals( ) works with neither result ordered. All grouping methods maintain their sequences’ original input order, maintaining both the left and right sequence order as their elements are grouped.

The choice of which sequence is on the left and which is on the right doesn’t matter when you join without an into, but is crucial when into follows join, because they group at the same time:

//: QueryExpressions\JoiningOrder.cs
using System.Linq;

class JoiningOrder {
  static void Main() {
    var customerOrderDates =
      from c in Northwind.Customers
      join o in Northwind.Orders on
        c.CustomerID equals o.CustomerID
      orderby c.CustomerID, o.OrderDate
      select new { c.CustomerID, o.OrderDate };
    // Can swap join order, doesn't affect anything:
    var customerOrderDates2 =
      from o in Northwind.Orders
      join c in Northwind.Customers on
        o.CustomerID equals c.CustomerID
      orderby c.CustomerID, o.OrderDate
      select new { c.CustomerID, o.OrderDate };
    customerOrderDates.AssertEquals(customerOrderDates2);
  }
} ///:~

The first query joins Customers to Orders; the second joins Orders to Customers. We sort both queries to compare the results.

We’d get IEnumerables of one Customer for every single Order if we joined Orders to Customers rather than Customers to Orders in GroupingWhileJoining.cs, because each Order has only one associated Customer. This is the same as a normal join, except each element from the right sequence is placed by itself in a new sequence.

If anything other than a select follows a join…into, the compiler packs the results into a temporary anonymous object (transparent identifier) for the chain’s next clause. Recall that multiple froms do the same:

//: QueryExpressions\CustomersWithMostOrders.cs
using System.Linq;

class CustomersWithMostOrders {
  static void Main() {
    // Which three customers have the most orders?
    var result =
      from c in Northwind.Customers
      join o in Northwind.Orders on
        c.CustomerID equals o.CustomerID
        into customerOrders
      orderby customerOrders.Count() descending
      select c.ContactName;
    result.Take(3).P(POptions.NoNewLines);
    /*c!
    // Transparent identifiers:
    var resultTranslation1 =
      Northwind.Customers
      .GroupJoin(Northwind.Orders,
        c => c.CustomerID, o => o.CustomerID,
        (c, customerOrders) =>
          new { c, customerOrders })
      .OrderByDescending(t => customerOrders.Count())
      .Select(t => c.ContactName);
    */
    var resultTranslation2 =
      Northwind.Customers
      .GroupJoin(Northwind.Orders,
        c => c.CustomerID, o => o.CustomerID,
        (c, customerOrders) =>
          new { c, customerOrders })
      .OrderByDescending(
        nt => nt.customerOrders.Count())
      .Select(nt => nt.c.ContactName);
    result.AssertEquals(resultTranslation2);
    var resultTranslation3 =
      Enumerable.Select(
        Enumerable.OrderByDescending(
          Enumerable.GroupJoin(Northwind.Customers,
            Northwind.Orders,
            c => c.CustomerID, o => o.CustomerID,
            (c, customerOrders) =>
              new { c, customerOrders }),
          nt => nt.customerOrders.Count()),
        nt => nt.c.ContactName);
    resultTranslation2.AssertEquals(resultTranslation3);
  }
} /* Output:
[Jose Pavarotti, Roland Mendel, Horst Kloss]
*///:~

Since GroupJoin( ) also performs the selection, a subsequent Select( ) call is unnecessary when only a select clause follows a join…into. However, when another clause follows the join…into (as orderby does in this example), the compiler packs the results in the GroupJoin( ) lambda and unpacks them in the OrderByDescending( ) lambda. It also inserts the Select( ) call and unpacks there.

Like GroupBy( ), GroupJoin( ) has an overload that takes an IEqualityComparer<TKey>. The exercises for the Grouping Data section explore how this works.
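As a small illustration of that overload, the sketch below joins two string arrays whose keys differ only in letter case. The arrays and method names are made up for this example and are not part of Northwind; StringComparer.OrdinalIgnoreCase is a standard library comparer, chosen here only as a convenient IEqualityComparer<string>.

```csharp
using System;
using System.Linq;

static class GroupJoinWithComparer {
  // Hypothetical keys that differ only by case:
  static readonly string[] ids = { "alfki", "ANATR" };
  static readonly string[] orderIds =
    { "ALFKI", "anatr", "ALFKI" };
  // Default comparer: string keys must match exactly.
  public static int[] ExactCounts() {
    return ids.GroupJoin(orderIds, id => id, o => o,
      (id, matches) => matches.Count()).ToArray();
  }
  // The extra last argument compares keys ignoring case:
  public static int[] IgnoreCaseCounts() {
    return ids.GroupJoin(orderIds, id => id, o => o,
      (id, matches) => matches.Count(),
      StringComparer.OrdinalIgnoreCase).ToArray();
  }
  static string Show(int[] counts) {
    return string.Join(", ",
      counts.Select(n => n.ToString()).ToArray());
  }
  static void Main() {
    Console.WriteLine(Show(ExactCounts()));      // 0, 0
    Console.WriteLine(Show(IgnoreCaseCounts())); // 2, 1
  }
}
```

Query expression syntax has no place to put a comparer, so this overload is only reachable through the method-call form.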

Exercise 42: Find all Products never ordered using join into and Count( ).

Exercise 43: Group Orders with Customers rather than Customers with Orders in GroupingWhileJoining.cs.

Exercise 44: Use a join into to find Customers who have made at least $15,000 in Orders.

Outer joins

Outer joins come in three types: left, right, and full. Right and left outer joins retain all elements on their respective sides, even if they have no opposite-side matches. Full outer joins retain all elements on both sides, even if they have no opposite-side matches.

join is an “inner join” that excludes Customers who have no Orders. Therefore, we must use an outer join to count all Customer Orders (including Orderless Customers). Using left outer joins in a query expression is fairly easy. However, a right outer join requires you to swap the input sequences, and thus convert it to a left outer join.

Query expressions don’t directly support outer joins, so you must use a join into (GroupJoin( )) combined with Enumerable.DefaultIfEmpty( ):

//: QueryExpressions\OuterJoins.cs
using System.Linq;

class OuterJoins {
  static void Main() {
    int[] numbers1 = { 1, 2 };
    int[] numbers2 = { 2, 3 };
    // 1) An inner join is just a normal join:
    var numbers =
      from n1 in numbers1
      join n2 in numbers2 on n1 equals n2
      select new { n1, n2 };
    numbers.P("Inner join");
    // 2) A left outer join requires grouping:
    numbers =
      from n1 in numbers1
      join n2 in numbers2 on n1 equals n2 into n2s
      from n2 in n2s.DefaultIfEmpty(-1)
      select new { n1, n2 };
    numbers.P("Left outer join");
    // 3) A right outer join is just a left
    // outer join with swapped arguments:
    numbers =
      from n2 in numbers2
      join n1 in numbers1 on n2 equals n1 into n1s
      from n1 in n1s.DefaultIfEmpty(-1)
      select new { n1, n2 };
    numbers.P("Right outer join");
  }
} /* Output:
Inner join: [
  { n1 = 2, n2 = 2 }
]
Left outer join: [
  { n1 = 1, n2 = -1 },
  { n1 = 2, n2 = 2 }
]
Right outer join: [
  { n1 = 2, n2 = 2 },
  { n1 = -1, n2 = 3 }
]
*///:~


numbers1 intersects numbers2 at the element 2. Inner joining the two sequences exposes this intersection by returning only the 2s paired together (excluding 1 and 3, as the output shows).

The second query joins on the same condition, but groups the results into n2s. We call DefaultIfEmpty( ) on these IEnumerable objects to re-extract the n2 values from n2s. DefaultIfEmpty( ) returns any nonempty original sequence, and for any empty sequence returns an IEnumerable<T> that holds a single instance of T’s default value. Alternatively, we pass our own default value by supplying -1.

Aside from our addition of a DefaultIfEmpty( ) call, this technique is similar to ExtractingAGroupedJoin.cs. GroupJoin( ) invokes its last lambda expression once for every element on the left side, even when they match no right-side elements. Thus, we still get an empty IEnumerable object that represents elements from the right side (e.g., the second query still gives us 1 from numbers1, although its corresponding numbers2 n2s sequence is empty). Because DefaultIfEmpty( ) supplies a dummy item for 1, we do not lose nonmatching left-side elements from the join.

Remember, we can only accomplish a right outer join in a query expression by swapping the inputs, so the last join is really a left outer join. SQL has syntax for performing right outer joins, but joining right or joining left is determined by which table comes first.

Here we use an outer join to get all Customers with and without Orders:

//: QueryExpressions\AllCustomers.cs
using System.Linq;

class AllCustomers {
  static void Main() {
    var all =
      from c in Northwind.Customers
      join o in Northwind.Orders on
        c.CustomerID equals o.CustomerID
        into customerOrders
      from co in customerOrders.DefaultIfEmpty()
      let OrderID = co == null ? -1 : co.OrderID
      orderby OrderID
      select new { c.ContactName, OrderID };
    all.Take(5).P();
    all.OrderBy(a => a.ContactName).Take(10).
      P("Ordered by ContactName");
  }
} /* Output:
[
  { ContactName = Diego Roel, OrderID = -1 },
  { ContactName = Marie Bertrand, OrderID = -1 },
  { ContactName = Paul Henriot, OrderID = 10248 },
  { ContactName = Karin Josephs, OrderID = 10249 },
  { ContactName = Mario Pontes, OrderID = 10250 }
]
Ordered by ContactName: [
  { ContactName = Alejandra Camino, OrderID = 10281 },
  { ContactName = Alejandra Camino, OrderID = 10282 },
  { ContactName = Alejandra Camino, OrderID = 10306 },
  { ContactName = Alejandra Camino, OrderID = 10917 },
  { ContactName = Alejandra Camino, OrderID = 11013 },
  { ContactName = Alexander Feuer, OrderID = 10277 },
  { ContactName = Alexander Feuer, OrderID = 10575 },
  { ContactName = Alexander Feuer, OrderID = 10699 },
  { ContactName = Alexander Feuer, OrderID = 10779 },
  { ContactName = Alexander Feuer, OrderID = 10945 }
]
*///:~

customerOrders is an IEnumerable<Order> that holds all the Orders for a single Customer. For Customers with no Orders, customerOrders is empty, so we use DefaultIfEmpty( ) to provide a null value. The from that follows extracts the Customer’s Orders one-by-one into co.

The let clause introduces OrderID for the subsequent orderby and select clause. OrderID is the actual Order.OrderID except when co is null. In that case, the let clause sets OrderID to -1, a dummy value.28 Notice the output shows -1 for the first two Customer OrderIDs, which signifies that they have no corresponding Orders.

Sorting the data by ContactName (in the second part of the output) makes it more meaningful. To reduce the noise of changing the order with a full query, we call OrderBy( ) directly, and pass a lambda expression to return the ContactName. However, this second OrderBy( ) call makes the original orderby a waste of processor cycles.

28 You could also use the Null Object pattern, if you wish to join against actual Order objects rather than just the OrderIDs.

There are several ways to find Customers with no Orders:

//: QueryExpressions\CustomersWithoutOrders.cs
using System.Linq;

class CustomersWithoutOrders {
  static void Main() {
    var noOrders =
      from c in Northwind.Customers
      join o in Northwind.Orders on
        c.CustomerID equals o.CustomerID into joined
      where joined.Count() == 0
      select c;
    noOrders.Count().AssertEquals(2);
    // Using Enumerable.Except():
    var customerIDs =
      from c in Northwind.Customers
      select c.CustomerID;
    var orderCustomerIDs =
      from o in Northwind.Orders
      select o.CustomerID;
    var noOrdersIDs =
      customerIDs.Except(orderCustomerIDs);
    noOrdersIDs.Count().AssertEquals(2);
  }
} ///:~

The first query uses join into combined with a where clause to remove the groups of Orders whose Count( ) is not zero, instead of re-extracting the Orders from joined. The second approach retrieves all CustomerIDs from both Customers and Orders and then calls Except( ). This Enumerable extension method returns a sequence of all left-sequence items that are not also in the right sequence (set difference). Thus, here it returns all CustomerIDs not found in Orders.
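To see the set-difference behavior in isolation, here is a minimal sketch on invented string IDs (the arrays and method names are hypothetical and not taken from Northwind). It compares Except( ) with a where clause that tests Contains( ), which is one more possible way to ask the same question.

```csharp
using System;
using System.Linq;

static class ExceptSketch {
  // Hypothetical IDs; "B" and "D" have no orders:
  static readonly string[] customerIDs =
    { "A", "B", "C", "D" };
  static readonly string[] orderCustomerIDs =
    { "A", "C", "A", "C", "C" };
  // Set difference: left items not found on the right.
  public static string[] WithExcept() {
    return customerIDs.Except(orderCustomerIDs).ToArray();
  }
  // The same question asked with where and Contains():
  public static string[] WithContains() {
    return (
      from id in customerIDs
      where !orderCustomerIDs.Contains(id)
      select id).ToArray();
  }
  static void Main() {
    Console.WriteLine(string.Join(", ", WithExcept()));
    Console.WriteLine(string.Join(", ", WithContains()));
    // Both lines print: B, D
  }
}
```

The two versions agree here because customerIDs holds no repeated values. Except( ) is a set operation and also removes duplicates from the left sequence, while the where version keeps them, so the results can differ on other data.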

Exercise 45: Translate the second query in OuterJoins.cs. Notice where the DefaultIfEmpty( ) call comes within the translation. How does this allow for outer joins?


Exercise 46: Use an outer join to again find any Products never ordered (as in Exercise 42).

Exercise 47: Rewrite the last approach in CustomersWithoutOrders.cs so that it does not use the temporary variables customerIDs and orderCustomerIDs.

Other query operators

Enumerable has several extension methods such as Take( ), SequenceEqual( ), and Except( ). These are not mapped to query expression clauses; you invoke them as extension methods on any IEnumerable object (as we’ve seen).

For example, Array has no Contains( ) method. As a workaround, we call the static Array.IndexOf( ) method and check the return value for -1. However, Enumerable provides a Contains( ) extension method in addition to several others for IEnumerable types such as Array:

//: QueryExpressions\EnumerableOperators.cs
using System;
using System.Linq;

class EnumerableOperators {
  static void Main() {
    int[] first = { 1, 2, 3 };
    int[] second = { 4, 5, 6 };
    // Concatenate first and second into one sequence:
    var numbers = first.Concat(second);
    // Compare numbers against another sequence:
    numbers.SequenceEqual(
      new[]{ 1, 2, 3, 4, 5, 6 }).True();
    // Count the elements:
    numbers.Count().AssertEquals(6);
    // Count elements less than four:
    numbers.Count(x => x < 4).AssertEquals(3);
    // There's a LongCount() version for large collections
    // Reverse the sequence:
    numbers.Reverse()
      .AssertEquals(new[]{ 6, 5, 4, 3, 2, 1 });
    // Add all the numbers:
    numbers.Sum().AssertEquals(1 + 2 + 3 + 4 + 5 + 6);
    // Sum the OrderDetail amounts:
    Northwind.OrderDetails.Sum(
      od => od.Quantity * od.UnitPrice)
      .AssertEquals(1354458.5900m);
    // Average the numbers:
    double average = numbers.Average();
    (Math.Abs(average - 3.5) < double.Epsilon).True();
    // Average the OrderDetail amounts:
    Northwind.OrderDetails
      .Average(od => od.Quantity * od.UnitPrice)
      .AssertEquals(628.51906728538283062645011601m);
    // What's the minimum?
    numbers.Min().AssertEquals(1);
    // What's the maximum?
    numbers.Max().AssertEquals(6);
    // Convert numbers to an array:
    int[] intArray = numbers.ToArray();
    intArray.AssertEquals(numbers);
    // Cast all elements from a non-generic
    // container to a generic one:
    var nonGenericList =
      MindView.Util.Enumerable.ToArrayList(numbers);
    var numbers2 = nonGenericList.Cast<int>();
    // Are all elements from numbers equal
    // to all elements in numbers2?
    numbers.SequenceEqual(numbers2).True();
    // Are all numbers less than 7?
    numbers.All(x => x < 7).True();
    // Are any numbers less than 3?
    numbers.Any(x => x < 3).True();
    // Does numbers contain 3 or 9?
    numbers.Contains(3).True();
    numbers.Contains(9).False();
    // First() and Last() throw exceptions
    // when elements cannot be found.
    // Get the first element:
    numbers.First().AssertEquals(1);
    // Get the first even element:
    numbers.First(x => x % 2 == 0).AssertEquals(2);
    // Get the first odd element, or
    // default(T) if none are found:
    numbers.FirstOrDefault(x => x % 2 == 1)
      .AssertEquals(1);
    // Get the first element greater than 6, 6 not
    // present, so default(T) returned (0 for int):
    numbers.FirstOrDefault(x => x > 6).AssertEquals(0);
    // Get the last element:
    numbers.Last().AssertEquals(6);
    // Get last odd element:
    numbers.Last(x => x % 2 == 1).AssertEquals(5);
    // Get last odd element or default(T):
    numbers.LastOrDefault(x => x % 2 == 1).AssertEquals(5);
    // Get the last element greater than 6, 6 not
    // present, so default(T) returned:
    numbers.LastOrDefault(x => x > 6).AssertEquals(0);
    // Get the single element from the sequence:
    new[]{ 8 }.Single().AssertEquals(8);
    // Get the single element, or the default value:
    new[]{ 8 }.SingleOrDefault().AssertEquals(8);
    new int[]{}.SingleOrDefault().AssertEquals(0);
    // Get the single element 2 less than 5:
    numbers.Single(x => x + 2 == 5).AssertEquals(3);
    // Same as above, but default returned if not found:
    numbers.SingleOrDefault(x => x + 2 == 5)
      .AssertEquals(3);
    numbers.SingleOrDefault(x => x > 6).AssertEquals(0);
    // Retrieve by index:
    numbers.ElementAt(2).AssertEquals(3);
    numbers.ElementAtOrDefault(2).AssertEquals(3);
    numbers.ElementAtOrDefault(10).AssertEquals(0);
    // Return a default value if nothing to iterate:
    var enumerable = numbers.DefaultIfEmpty();
    enumerable.AssertEquals(numbers);
    enumerable = numbers.DefaultIfEmpty(5);
    enumerable.AssertEquals(numbers);
    enumerable = new int[]{}.DefaultIfEmpty();
    enumerable.Single().AssertEquals(0);
    enumerable = new int[]{}.DefaultIfEmpty(10);
    enumerable.Single().AssertEquals(10);
    numbers2 = new[]{ 9, 8, 8, 4, 3, 4, };
    // Notice Union(), Intersect(), and Except()
    // (set difference) remove duplicate values
    // because they are set operations:
    var temp = numbers.Union(numbers2);
    temp.AssertEquals(new[]{ 1, 2, 3, 4, 5, 6, 9, 8});
    temp = numbers.Intersect(numbers2);
    temp.AssertEquals(new[]{ 3, 4 });
    temp = numbers.Except(numbers2);
    temp.AssertEquals(new[]{ 1, 2, 5, 6 });
    // Remove duplicates:
    temp = numbers2.Distinct();
    temp.AssertEquals(new[]{ 9, 8, 4, 3 });
    // Get the first four numbers:
    numbers2 = new[]{ 3, 2, 1, 5, 1, 2, 3 };
    temp = numbers2.Take(4);
    temp.AssertEquals(new[]{ 3, 2, 1, 5});
    // Get the first set of numbers less than 5:
    temp = numbers2.TakeWhile(x => x < 5);
    temp.AssertEquals(new[]{ 3, 2, 1 });
    // Skip the first four numbers:
    temp = numbers2.Skip(4);
    temp.AssertEquals(new[]{ 1, 2, 3});
    // Skip the first set of numbers less than 5:
    temp = numbers2.SkipWhile(x => x < 5);
    temp.AssertEquals(new[]{ 5, 1, 2, 3 });
    // Generate a range of 5 numbers starting
    // at 100 (only works with integers):
    numbers = Enumerable.Range(100, 5);
    numbers.AssertEquals(
      new[]{ 100, 101, 102, 103, 104 });
    // Repeat a value n times, (works with any type):
    Enumerable.Repeat(3.14, 3).AssertEquals(
      new[]{ 3.14, 3.14, 3.14 });
    // Generate an empty container:
    Enumerable.Empty<int>().Count().AssertEquals(0);
  }
} ///:~

This example shows most of the Enumerable methods not mapped to query expression clauses, many of which you have seen. Most are self-explanatory, and the comments help decipher the rest.

Enumerable extension methods bridge the gap between an IEnumerable consumer and an IEnumerable implementer. Suppose client A and client B need overlapping subsets of methods. The interface implementer can implement only the intersection, reducing coupling to either client. For example, client A may need only Count( ) but not Take( ), while client B needs Take( ) but not Count( ). However, both need GetEnumerator( ). Whoever implements IEnumerable only wants to provide GetEnumerator( ), not Count( ) or Take( ). Extension methods provide a compromise by allowing A to add Count( ) and B to add Take( ); the implementer need only provide GetEnumerator( ).

The more methods an interface contains, the less likely someone will implement it, because not all the methods make sense for their implementation. When you write your own collection class you must decide, for instance, whether to implement ICollection or just IEnumerable. The ICollection interface provides your clients with more methods, but it also couples you to your clients more than IEnumerable. The Enumerable extension methods provide a compromise.29

29 However, Enumerable extension methods cannot produce ICollection’s Add( ) and Remove( ) methods, so in that case you must decide between the two approaches.

All of Enumerable’s methods are termed standard query operators, meaning that they come with the standard library. As always, you can add your own extension methods in a separate static class.

Where( ), Select( ), SelectMany( ), SkipWhile( ), and TakeWhile( ) all have overloads that pass the item index with the element:
//: QueryExpressions\Indexing.cs
using System;
using System.Linq;
class Indexing {
  static void Main() {
    Func<int, int, bool> func =
      (value, index) => 1 < index && index < 4;
    var elements = new[] { 7, 8, 9, 10, 11 };
    // Get all elements with indices between 1 and 4:
    elements.Where(func).AssertEquals(new[] { 9, 10 });
    // Determine whether the index is between 1 and 4:
    elements.Select(func).AssertEquals(
      new[] { false, false, true, true, false });
    // Skip the first 2 elements:
    elements.SkipWhile((value, index) => index < 2)
      .AssertEquals(new[] { 9, 10, 11 });
    // Take the first three elements:
    elements.TakeWhile((value, index) => index < 3)
      .AssertEquals(new[] { 7, 8, 9 });
    // Take values while they are odd and the index
    // is below 4:
    elements.TakeWhile(
      (value, index) => index < 4 && value % 2 == 1)
      .AssertEquals(new[] { 7 });
  }
} ///:~

func’s lambda doesn’t use the value argument, but the various standard query operator overloads require that signature. Because func has a name (so it can be used in multiple places), it’s almost the same as an ordinary method, but it’s slightly less verbose. In your own code you must decide whether this apparent simplification is worth the potential added confusion for the reader.

Average( ), Sum( ), Max( ), and Min( ) all run a delegate on each element when you provide one:
//: QueryExpressions\TransformingAggregation.cs
using System;
using System.Linq;
class TransformingAggregation {
  static void Main() {
    int[] values = new[] { 8, 9, 10, 11, 12 };
    // Average distance from 10:
    values.Average(val => Math.Abs(val - 10))
      .AssertEquals(1.2);
    // Sum the truncated square roots:
    values.Sum(val => (int)Math.Sqrt(val))
      .AssertEquals(14);
    // Find the max remainder dividing by 7:
    values.Max(val => val % 7).AssertEquals(5);
    // Find the min remainder dividing by 7:
    values.Min(val => val % 7).AssertEquals(1);
  }
} ///:~

Each algorithm executes the lambda on the individual elements, then uses the resulting value. For example, the Average( ) call averages the distance of each element from 10, instead of just the original values.

OfType<T>( ) produces a sequence containing objects of (or derived from) type T:
//: QueryExpressions\OfTypeDemo.cs
using System.Linq;
using System.Collections.Generic;
class Base {}
class D1 : Base {}
class D2 : Base {}
class D2Child : D2 {}
class OfTypeDemo {
  static void Main() {
    var list = new List<Base> {
      new D1(), new D2(), new D1(),
      new D2(), new D2Child()
    };
    IEnumerable<D2> d2s = list.OfType<D2>();
    d2s.P("d2s's Types", POptions.NoNewLines);
    // Can do same thing using a where clause,
    // but it returns an IEnumerable<Base>
    // instead of IEnumerable<D2>:
    IEnumerable<Base> moreD2s =
      from c in list
      where c is D2
      select c;
    moreD2s.P("moreD2s's Types", POptions.NoNewLines);
    moreD2s.GetType().GetGenericArguments()[0]
      .AssertEquals(typeof(Base));
  }
} /* Output:
d2s's Types: [D2, D2, D2Child]
moreD2s's Types: [D2, D2, D2Child]
*///:~

We fill list with all possible derivations of Base. OfType<D2>( ) returns all items of (or derived from) type D2. We explicitly declare d2s’s type instead of using var, to show that OfType<D2>( ) returns an IEnumerable<D2> instead of another IEnumerable<Base>. We could instead use a where clause to retrieve the same results as OfType( ) (the last part of the example), but a where clause returns an IEnumerable<Base> rather than an IEnumerable<D2> (even though the IEnumerable<Base> is only holding D2s).

Sum( ) and Average( ) are basic aggregate functions (they take several inputs and return a single value), but you can perform custom operations using Enumerable’s Aggregate( ) method. For example, you can multiply instead of adding:
//: QueryExpressions\AggregateDemo.cs
using System.Linq;
class AggregateDemo {
  static int PrintWhileMultiplying(int currentTotal,
    int nextValue) {
    currentTotal.P(" currentTotal");
    nextValue.P(" nextValue");
    return currentTotal * nextValue;
  }
  static void Main() {
    int[] numbers = { 4, 5, 6 };
    "Run 1:".P();
    numbers.Aggregate(PrintWhileMultiplying)
      .AssertEquals(4 * 5 * 6);
    // Overloaded to take a custom seed:
    "Run 2:".P();
    numbers.Aggregate(100, PrintWhileMultiplying)
      .AssertEquals(100 * 4 * 5 * 6);
    // Can also do an operation at the end:
    "Run 3:".P();
    numbers
      .Aggregate(100, PrintWhileMultiplying, x => x / 3)
      .AssertEquals((100 * 4 * 5 * 6) / 3);
    // PrintWhileMultiplying is for demonstration.
    // You would normally use a lambda instead:
    "Run 4: (no output)".P();
    numbers.Aggregate((c, n) => c * n)
      .AssertEquals(4 * 5 * 6);
  }
} /* Output:
Run 1:
 currentTotal: 4
 nextValue: 5
 currentTotal: 20
 nextValue: 6
Run 2:
 currentTotal: 100
 nextValue: 4
 currentTotal: 400
 nextValue: 5
 currentTotal: 2000
 nextValue: 6
Run 3:
 currentTotal: 100
 nextValue: 4
 currentTotal: 400
 nextValue: 5
 currentTotal: 2000
 nextValue: 6
Run 4: (no output)
*///:~
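The trace above is easier to follow once you see the fold written out by hand. What follows is only a sketch: Fold( ) and FoldSketch are hypothetical names of our own, and the library’s actual Aggregate( ) implementation may differ in detail. It covers the seedless form, which treats the first element as the starting value, and the seeded form, which starts from the value you pass:

```csharp
using System;
using System.Collections.Generic;

static class FoldSketch {
  // Hypothetical re-creation of the seedless Aggregate() overload;
  // the library's real implementation may differ in detail.
  public static T Fold<T>(
    this IEnumerable<T> source, Func<T, T, T> func) {
    using(IEnumerator<T> e = source.GetEnumerator()) {
      if(!e.MoveNext())
        throw new InvalidOperationException("Empty sequence");
      T result = e.Current; // First element is the starting value
      while(e.MoveNext())
        result = func(result, e.Current);
      return result;
    }
  }
  // Seeded form: start from seed instead of the first element.
  public static TAcc Fold<T, TAcc>(
    this IEnumerable<T> source, TAcc seed,
    Func<TAcc, T, TAcc> func) {
    TAcc result = seed;
    foreach(T item in source)
      result = func(result, item);
    return result;
  }
}

class FoldSketchDemo {
  static void Main() {
    int[] numbers = { 4, 5, 6 };
    Console.WriteLine(numbers.Fold((c, n) => c * n));      // 120
    Console.WriteLine(numbers.Fold(100, (c, n) => c * n)); // 12000
  }
}
```

In the seedless form the delegate runs one time fewer than there are elements, which matches the two currentTotal/nextValue pairs printed for “Run 1.”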

The simplest Aggregate( ) takes a delegate which requires two arguments: the current running result and the next value in the sequence. The output shows the first iteration, starting with the first element (4) and the nextValue (5). The currentTotal in the next iteration provides the result (20), which we then multiply by 6.

You can pass a custom seed value by inserting it before your operation in the Aggregate( ) call. In “Run 2,” Aggregate( ) uses 100 as the starting value rather than the first element in the sequence. “Run 3” performs an operation on, and then returns, the end result. Here, we divide the result by 3.

Enumerable also contains a few methods that convert IEnumerable<T> objects to different collection types; for example, ToArray( ) produces an array and ToList( ) produces a generic List<T>:
//: QueryExpressions\ToConcreteSequences.cs
using System.Linq;
using System.Collections.Generic;
class ToConcreteSequences {
  static void Main() {
    // Convert an IEnumerable<Customer> to an array:
    Customer[] custArray = Northwind.Customers.ToArray();
    // Convert an IEnumerable<Customer> to a List<Customer>:
    List<Customer> custList = Northwind.Customers.ToList();
    // Can just use the List<Customer> constructor:
    custList = new List<Customer>(Northwind.Customers);
  }
} ///:~

Alternatively, you can pass an IEnumerable<T> to List<T>’s constructor, but the above form is more succinct.

To find Customers by their CustomerIDs you could repeat the tedious and inefficient linear search through a sequence. However, ToDictionary( ) creates a Dictionary from linear data, using a lambda to select the key:
//: QueryExpressions\DictionaryTransform.cs
using System.Linq;
using System.Collections.Generic;
class DictionaryTransform {
  static void Main() {
    // Map CustomerIDs to Customers:
    IDictionary<string, Customer> customerIDsToCustomer =
      Northwind.Customers.ToDictionary(c => c.CustomerID);
    customerIDsToCustomer["QUICK"].ContactName
      .AssertEquals("Horst Kloss");
    // Map CustomerIDs to contact names:
    IDictionary<string, string> customerIDsToContactNames =
      Northwind.Customers.ToDictionary(c => c.CustomerID,
        // Second lambda determines the Values:
        c => c.ContactName);
    customerIDsToContactNames["QUICK"]
      .AssertEquals("Horst Kloss");
  }
} ///:~

The first form selects only the keys for the Dictionary. The values default to the objects in the source sequence (here, the actual Customer objects). The second form uses an additional lambda to choose the values.

ToDictionary( ) throws an exception when your lambda returns a duplicate key. For example, if you use the Customer Country field as a key, you’ll get an error, because one Country maps to multiple Customers. ToLookup( ) is the better option for this. ILookup maps keys to IGroupings. If you must find groups by their Key, then ToLookup( ) is better than a group…by alone, because it eliminates the need for a linear search through the groups:
//: QueryExpressions\Lookups.cs
using System.Linq;
class Lookups {
  static void Main() {
    // Group customers by their country:
    ILookup<string, Customer> countryToCustomers =
      Northwind.Customers.ToLookup(c => c.Country);
    countryToCustomers.Count.AssertEquals(21);
    countryToCustomers.Contains("France").True();
    countryToCustomers["USA"].Count().AssertEquals(13);
    // It takes more work to grab individual
    // groups using query expressions:
    var countryGroups =
      from c in Northwind.Customers
      group c by c.Country;
    countryGroups.Count().AssertEquals(21);
    countryGroups
      .Select(g => g.Key)
      .Contains("France").True();
    countryGroups
      .Single(g => g.Key == "USA")
      .Count().AssertEquals(13);
  }
} ///:~

ToLookup( ) and ToDictionary( ) have identical overloads. (We omitted the overloads that take an IEqualityComparer.) You can see that ILookup is an easier way to retrieve individual groups. The runtime type of the object that ToLookup( ) returns is System.Linq.Lookup<TKey, TElement>.
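To see the duplicate-key difference without the Northwind data, here is a small sketch that uses stand-in data. The Contact class, its values, and the DuplicateKeys class are hypothetical and exist only for this illustration:

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for Customer; not the Northwind sample.
class Contact {
  public string Name { get; set; }
  public string Country { get; set; }
}

class DuplicateKeys {
  public static readonly Contact[] Contacts = {
    new Contact { Name = "Ann", Country = "USA" },
    new Contact { Name = "Bob", Country = "USA" },
    new Contact { Name = "Cho", Country = "Mexico" }
  };
  // Returns true if ToDictionary() rejects the repeated "USA" key:
  public static bool DictionaryRejectsDuplicates() {
    try {
      Contacts.ToDictionary(c => c.Country);
      return false;
    } catch(ArgumentException) {
      return true;
    }
  }
  static void Main() {
    Console.WriteLine(DictionaryRejectsDuplicates()); // True
    // ToLookup() accepts repeated keys and groups their elements:
    var byCountry = Contacts.ToLookup(c => c.Country);
    Console.WriteLine(byCountry["USA"].Count());    // 2
    Console.WriteLine(byCountry["Mexico"].Count()); // 1
    // A missing key yields an empty sequence, not an exception:
    Console.WriteLine(byCountry["France"].Count()); // 0
  }
}
```

Note that indexing a lookup with an absent key produces an empty sequence, whereas indexing a Dictionary with an absent key throws.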

Exercise 48: Retrieve the first three Country names from an alphabetical list of Customers’ countries. Find all the ContactNames of Customers in those three countries.

Exercise 49: Write a non-generic version of System.Linq.Enumerable.Concat( ).

Exercise 50: Combine two queries with Concat( ) to retrieve the names and nationalities of Customers from the USA and Mexico. Note in this sample output that nationality differs from Country name:
{ Name = Francisco Chang, Nationality = Mexican },
{ Name = Guillermo Fernández, Nationality = Mexican },
{ Name = Miguel Angel Paolino, Nationality = Mexican },
{ Name = Howard Snyder, Nationality = American },
{ Name = Yoshi Latimer, Nationality = American },
{ Name = John Steel, Nationality = American },

Exercise 51: Find the following:
a. The most parts in any Customer’s ContactName (Hint: use Split( )). Check the results three times using Any( ), All( ), and Count( ).
b. The latest and earliest OrderDate.
c. The first and last Customers whose names contain a lowercase ‘q’ (in ContactName order).
d. The 15th Customer (in ContactName order).
e. All Customer ContactNames that begin with ‘R’ (don’t use where or Where( )).
f. The last three Customer ContactNames (in ContactName order).

Exercise 52: Use Aggregate( ) to output Customer ContactName first names, separated by the vertical-bar character (|), and surround the entire string with square brackets, as seen in this sample output (which only shows three names):
[Maria | Ana | Antonio]

Exercise 53: Put all the OrderDetails in an ILookup, and look them up by OrderID, counting the number in at least three groups.

Exercise 54: Retrieve three different sequences of Customer ContactNames in three different alphabetical ranges: a-g, h-q, and r-z.


Exercise Solutions

Chapter 1

Exercise 1: Make an Add( ), Subtract( ), Multiply( ), and Divide( ) extension method for int.
//: SimpleNewFeaturesSolutions\ExtendingInt.cs
using System.Diagnostics;
static class ExtendingInt {
  static int Add(this int left, int right) {
    return left + right;
  }
  static int Subtract(this int left, int right) {
    return left - right;
  }
  static int Divide(this int left, int right) {
    return left / right;
  }
  static int Multiply(this int left, int right) {
    return left * right;
  }
  static void Main() {
    Debug.Assert(1.Add(2) == 3);
    Debug.Assert(3.Subtract(4) == -1);
    Debug.Assert(6.Divide(2) == 3);
    Debug.Assert(5.Multiply(8) == 40);
    Debug.Assert(5.Add(8).Subtract(2).Multiply(3)
      .Subtract(10) == 23);
  }
} ///:~

We obviously don’t need methods that just mimic existing operators. However, notice that each method takes and returns an int, so we can chain the extension results. You’ll appreciate this flexibility when you use IEnumerable in upcoming query expression exercises. We use Debug.Assert( ) here to show results inline. You will soon see some assertion methods that make this a bit cleaner.


Exercise 2: Make two extension methods: a ToArrayList( ) that converts its IEnumerable argument to an ArrayList, and a generic ForEach( ) method that iterates over its IEnumerable<T> argument, executing its second argument (a System.Action<T> delegate) on each element.
//: MindView.Util\Enumerable.2.cs
// {CF: /target:library}
using System;
using System.Collections;
using System.Collections.Generic;
namespace MindView.Util {
  public static partial class Enumerable {
    public static ArrayList
    ToArrayList(this IEnumerable sequence) {
      var ret = new ArrayList();
      foreach(object item in sequence)
        ret.Add(item);
      return ret;
    }
    public static void ForEach<T>(
      this IEnumerable<T> sequence, Action<T> action) {
      foreach(T element in sequence)
        action(element);
    }
  }
} ///:~
//: SimpleNewFeaturesSolutions\EnumerableTests.cs
using System;
using MindView.Util;
using System.Diagnostics;
using System.Collections.Generic;
class SimpleEnumerableTest {
  static void Main() {
    List<int> list = new List<int>();
    Random rand = new Random(47);
    for(int i = 0; i < 5; i++)
      list.Add(rand.Next());
    rand = new Random(47);
    foreach(int i in list.ToArrayList())
      Debug.Assert(i == rand.Next());
    ((IEnumerable<int>)list).ForEach(Console.WriteLine);
  }
} /* Output:
601795864
1305670887
1332423928
1266970102
914533663
*///:~

ArrayList contains no constructor that takes IEnumerable, so we wrote ToArrayList( ). We use this utility later in the book; hence, it’s in the MindView.Util.dll assembly.

Array has a static ForEach( ) for any type of array, and List<T> has a nonstatic ForEach( ). However, System.Linq.Enumerable doesn’t contain a ForEach( ) for any type of IEnumerable<T>, so ours is another utility method in MindView.Util.dll. Notice we must upcast list to IEnumerable<int> for the compiler to resolve its lookup to our version of ForEach( ).

Also notice that ToArrayList( ) doesn’t require a generic argument, whereas ForEach( ) does. If you do nothing specific for the contained types in your IEnumerable argument, you don’t need the generic IEnumerable<T> argument.
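The need for the upcast can be observed directly. In this sketch, ForEachSketch and its ExtensionCalls counter are hypothetical additions of ours (they are not part of MindView.Util); the counter simply records whether the compiler chose the extension method or List<T>’s own instance method:

```csharp
using System;
using System.Collections.Generic;

static class ForEachSketch {
  public static int ExtensionCalls;
  // Same shape as the solution's ForEach(), plus a counter so we
  // can tell which method the compiler picked:
  public static void ForEach<T>(
    this IEnumerable<T> sequence, Action<T> action) {
    ExtensionCalls++;
    foreach(T element in sequence)
      action(element);
  }
}

class ForEachResolution {
  static void Main() {
    var list = new List<int> { 1, 2, 3 };
    int sum = 0;
    // List<T> has its own instance ForEach(), which wins over
    // an extension method of the same name:
    list.ForEach(x => sum += x);
    Console.WriteLine(ForEachSketch.ExtensionCalls); // 0
    // Through an IEnumerable<int> reference, only the extension
    // method is a candidate:
    ((IEnumerable<int>)list).ForEach(x => sum += x);
    Console.WriteLine(ForEachSketch.ExtensionCalls); // 1
    Console.WriteLine(sum); // 12
  }
}
```

The compiler considers extension methods only when no applicable instance method exists, which is why the first call never reaches ForEachSketch.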

Exercise 3: Change ExtendingTheBase.cs to use an intermediate base class rather than extension methods.
//: SimpleNewFeaturesSolutions\ExtendingTheBase.cs
abstract class Vehicle {
  public void Start() {}
  public void Stop() {}
}
abstract class VehicleAdditions : Vehicle {
  public void StartAndStop() {
    Start();
    // Sleep a minute or two...
    Stop();
  }
}
class Car : VehicleAdditions {}
class Bus : VehicleAdditions {}
class Scooter : Vehicle {}
class ExtendingTheBase {
  static void Main() {
    VehicleAdditions vehicle = new Car();
    vehicle.StartAndStop();
    vehicle = new Bus();
    vehicle.StartAndStop();
    //c! vehicle = new Scooter();
  }
} ///:~

Our approach here differs from the more flexible extension method approach mainly by requiring vehicle to be a VehicleAdditions reference rather than just a plain Vehicle. In addition, Scooter includes no StartAndStop( ) method, because it doesn’t inherit from our intermediate class.

Exercise 4: Use Asserter to show that (1) the compiler interns string literals;1 (2) boxing a value type twice produces two different objects; (3) by default, enum members not assigned an explicit value will take their predecessor’s plus one (unless they are the first member, whose value is zero); and (4) System.Linq.Enumerable.Range( ) returns a numeric sequence of values.
//: SimpleNewFeaturesSolutions\SimpleTruths.cs
using System.Linq;
enum MyEnum {
  Member1,
  Member2,
  Member3 = -5,
  Member4
}
class SimpleTruths {
  static void Main() {
    // (1) Interning:
    string string1 = "my string";
    string string2 = "my string";
    // Both reference same object:
    object.ReferenceEquals(string1, string2).True();
    string.IsInterned("another literal")
      .AssertEquals("another literal");
    // (2) Boxing:
    int i = 5;
    object o1 = i, o2 = i;
    object.ReferenceEquals(o1, o2).False();
    o1.AssertEquals(o2);
    o1 = o2;
    object.ReferenceEquals(o1, o2).True();
    // (3) Enums:
    0.AssertEquals((int)MyEnum.Member1);
    1.AssertEquals((int)MyEnum.Member2);
    (-5).AssertEquals((int)MyEnum.Member3);
    (-4).AssertEquals((int)MyEnum.Member4);
    // (4) Range() produces a sequence:
    Enumerable.Range(5, 5).AssertEquals(
      new int[] { 5, 6, 7, 8, 9 });
  }
} ///:~

1 Look up string.IsInterned( ) in the documentation.

We took different approaches to prove each assertion. While True( ) and False( ) remain useful on occasion, we prefer AssertEquals( ) to something.Equals(somethingElse).True( ).
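The True( ), False( ), and AssertEquals( ) helpers come from the book’s utility library, which this excerpt does not show. As a rough guide to how such helpers might be written, here is a sketch under our own assumptions: the class name AsserterSketch, the exception type, and the comparison rules are all hypothetical, and the real versions may behave differently.

```csharp
using System;
using System.Collections;
using System.Linq;

// Hypothetical stand-ins for the book's assertion helpers; the
// real versions are not shown in this excerpt and may differ.
static class AsserterSketch {
  public static void True(this bool condition) {
    if(!condition)
      throw new InvalidOperationException("Expected true");
  }
  public static void False(this bool condition) {
    if(condition)
      throw new InvalidOperationException("Expected false");
  }
  // Compares sequences element by element; everything else
  // (including strings) falls back to object.Equals():
  public static void
  AssertEquals(this object actual, object expected) {
    var left = actual as IEnumerable;
    var right = expected as IEnumerable;
    bool same =
      left != null && right != null &&
      !(actual is string) && !(expected is string) ?
        left.Cast<object>()
          .SequenceEqual(right.Cast<object>()) :
        object.Equals(actual, expected);
    if(!same)
      throw new InvalidOperationException(
        "Expected " + expected + " but got " + actual);
  }
}

class AsserterSketchDemo {
  static void Main() {
    (1 + 1 == 2).True();
    (1 + 1).AssertEquals(2);
    new[] { 1, 2 }.AssertEquals(new[] { 1, 2 });
    "abc".AssertEquals("abc");
    Console.WriteLine("All assertions passed");
  }
}
```

Writing the helpers as extension methods is what lets an assertion trail the expression it checks, as the solutions in this chapter do.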

Exercise 5: Use extension method syntax to bind a delegate to System.Linq.Enumerable’s Min( ), invoked upon a List of random numbers. Make sure that invoking the delegate returns the smallest value.
//: SimpleNewFeaturesSolutions\MinninAround.cs
using System;
using System.Linq;
using System.Collections.Generic;
delegate int ReturnsInt();
class MinninAround {
  static void Main() {
    Random rand = new Random(47);
    List<int> list = new List<int>();
    for(int i = 0; i < 10; i++)
      list.Add(rand.Next(1000));
    ReturnsInt findMin = list.Min;
    findMin().AssertEquals(21);
    list.AddRange(new int[] { 43, 19, 33, 20 });
    findMin().AssertEquals(19);
  }
} ///:~

Notice that findMin binds to members from list and Enumerable.Min( ), two completely separate types. Later we’ll show you collection initializers, which can add all the items in a single expression.

Exercise 6: Put two extension methods with the same signature in two separate classes, each within a unique namespace. Put each in its own file. In a third file, bring both classes into scope with using statements, and invoke the methods using extension syntax rather than normal static-method call syntax. Does the compiler issue an ambiguity error? What happens when you declare a third class after the using statements to introduce a third, same-signature extension method?
//: SimpleNewFeaturesSolutions\First.cs
// {CF: /target:library}
namespace SomeSpace {
  public static class First {
    public static void SomeMethod(this object o) {}
  }
} ///:~
//: SimpleNewFeaturesSolutions\Second.cs
// {CF: /target:library}
namespace SomeSpace2 {
  public static class Second {
    public static void SomeMethod(this object o) {}
  }
} ///:~
//: SimpleNewFeaturesSolutions\ExtensionAmbiguity.cs
// {CF: /reference:First.dll,Second.dll}
// {IE: NeedlessUsings}
using SomeSpace;
using SomeSpace2;
static class MoreExtensions {
  public static void SomeMethod(this object o) {}
}
class ExtensionAmbiguity {
  static void Main() {
    object o = new object();
    o.SomeMethod();
  }
} ///:~

If you remove MoreExtensions.SomeMethod( ), the compiler issues an ambiguity error upon finding that two equally scoped SomeMethod( )s satisfy the lookup. When MoreExtensions.SomeMethod( ) is available, it’s “nearer” in scope than the other two. Extension methods cannot be members of nested classes, so we can never introduce an even “nearer-in-scope” SomeMethod( ) in a nested class inside ExtensionAmbiguity. NeedlessUsings tells the book’s build system not to flag an error if the example compiles successfully, even when the extra using statements are superfluous.

Exercise 7: Create an IndexesOf( ) string extension method that returns the indices of every occurrence of a given string instead of just the first one (as IndexOf( ) does).
//: SimpleNewFeaturesSolutions\MultipleIndexes.cs
using System.Collections.Generic;
static class StringExtensions {
  public static IEnumerable<int> IndexesOf(
    this string target, string value) {
    for(int i = target.IndexOf(value); i != -1;
        i = target.IndexOf(value, i + 1))
      yield return i;
  }
}
class MultipleIndices {
  static void Main() {
    string str = "abc123def123ghi123j12";
    str.IndexesOf("123").AssertEquals(new[] { 3, 9, 15 });
    // Can call normal, but not as "OOpish":
    StringExtensions.IndexesOf(str, "123")
      .AssertEquals(new[] { 3, 9, 15 });
  }
} ///:~

You can implement the algorithm in a number of ways. We can, for example, make IndexesOf( ) private if we declare MultipleIndices itself as a static class and put the extension method there, rather than in a separate class.
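A sketch of that variation follows. The class name PrivateIndexesOf and the FindAll( ) helper are hypothetical; the helper exists only so that code outside the class has a way to observe the result of the private extension method. We also pass StringComparison.Ordinal, which the original solution does not, to keep the search independent of the current culture.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical variant: the class that holds Main() is itself
// static, so it can host a private extension method.
static class PrivateIndexesOf {
  // Private: invisible outside this class.
  static IEnumerable<int> IndexesOf(
    this string target, string value) {
    for(int i = target.IndexOf(value, StringComparison.Ordinal);
        i != -1;
        i = target.IndexOf(value, i + 1,
          StringComparison.Ordinal))
      yield return i;
  }
  // Internal helper so other code can observe the result:
  internal static List<int>
  FindAll(string target, string value) {
    return new List<int>(target.IndexesOf(value));
  }
  static void Main() {
    foreach(int index in
        FindAll("abc123def123ghi123j12", "123"))
      Console.WriteLine(index); // 3, then 9, then 15
  }
}
```

The trade-off is reach: a private extension method can only be called with extension syntax from inside the class that declares it.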

Exercise 8: Use var to assign a variable to a derived type. Prove that you cannot assign a base object or a different derived type to your variable.
//: SimpleNewFeaturesSolutions\ImplicitDerivedType.cs
class Base {}
class Derived1 : Base {}
class Derived2 : Base {}
class ImplicitDerivedType {
  static void Main() {
    var value = new Derived1();
    //c! value = new Derived2();
    //c! value = new Base();
    // value is not read-only:
    value = new Derived1();
  }
} ///:~

The compiler determines value’s type (Derived1) at compile time just as if we had declared it explicitly.

Exercise 9: Prove that the compiler still infers the correct type when you assign a var variable using a complicated initialization expression, for example:
//: SimpleNewFeaturesSolutions\ComplicatedInitialization.cs
using System;
class ComplicatedInitialization {
  static void Main() {
    var value =
      ((new Random().Next().ToString() + "some string")
        .Substring(3, 3) + AnotherMethod())
      .ToUpper();
    value.GetType().AssertEquals(typeof(string));
    value = "some string";
    //c! value = new object();
  }
  static string AnotherMethod() {
    return "this method returns a string";
  }
} ///:~

The compiler infers an implicitly typed local variable’s type, just as it has always determined expression return types. The GetType( ).AssertEquals( ) call doesn’t verify value’s compile-time type, but only verifies the runtime type. The attempt to assign the new object to value proves value’s compile-time type.
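The difference between the two kinds of type can be made visible with overload resolution, which the compiler performs using compile-time types only. This is a sketch of our own; RuntimeVersusCompileTime and Describe( ) are hypothetical names:

```csharp
using System;

class RuntimeVersusCompileTime {
  // Overload resolution happens at compile time, so the overload
  // chosen reveals the static type of the argument:
  public static string Describe(object o) { return "object"; }
  public static string Describe(string s) { return "string"; }

  static void Main() {
    var asString = "text";
    var asObject = (object)"text";
    // Both variables refer to a string at run time:
    Console.WriteLine(
      asString.GetType() == asObject.GetType()); // True
    // But var inferred two different compile-time types:
    Console.WriteLine(Describe(asString)); // string
    Console.WriteLine(Describe(asObject)); // object
  }
}
```

GetType( ) reports string for both variables, so on its own it cannot tell you what var inferred.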

Exercise 10: Prove that var is not a keyword by creating a variable named var. Can the compiler still infer a variable’s type when it is declared using var?
//: SimpleNewFeaturesSolutions\VarIsNotAKeyword.cs
class VarIsNotAKeyword {
  public void LocalVariableNamedVar() {
    int var = 5;
    var type = new VarIsNotAKeyword();
  }
}
class TypeNamedVar {
  class var {
    public string SomeProperty { get; set; }
  }
  void VarIsNowARealType() {
    //c! var v = new VarIsNotAKeyword();
    var v2 = null;
    v2 = new var();
    v2.SomeProperty = "Some value";
    // Seriously confusing:
    var var = new var(); // Har har har!
  }
  static void Main(string[] args) {
    new VarIsNotAKeyword().LocalVariableNamedVar();
    new TypeNamedVar().VarIsNowARealType();
  }
} ///:~

The variable var that we declare in this example in no way inhibits the compiler from using var to infer a variable’s type from an initialization expression. Both ways work because the compiler determines var’s meaning contextually. (You should never write such confusing code in practice.)

We created var, a nested type in TypeNamedVar that overrides implicit-type inference whenever it is in scope. Therefore, VarIsNowARealType( )’s first line doesn’t compile. The compiler uses the actual type when it cannot infer var from the initialization expression. Notice that the compiler cannot infer v2’s type from null.

Exercise 11: Change Person’s ID field to an automatic property.
//: SimpleNewFeaturesSolutions\Person.cs
// {CF: /target:library}
class Person {
  public Person() {}
  public Person(int id) { ID = id; }
  public int ID { set; get; }
  public string FirstName { get; set; }
  public string LastName { get; set; }
  public void Print() {
    ID.P("ID");
    FirstName.P(" FirstName"); // Spaces for indentation
    LastName.P(" LastName");
  }
} ///:~

Notice that the order doesn’t matter for the get and set accessors.

Exercise 12: Change ImplementingInterfaces.cs to prove that automatic properties also satisfy abstract property definitions.
//: SimpleNewFeaturesSolutions\AbstractProperties.cs
// {CF: /target:library}
// Abstract property syntax resembles
// automatic property syntax:
abstract class SomeAbstract {
  public abstract int ReadAndWrite { get; set; }
}
// An automatic property can override an abstract
// property that declares both accessors:
class Implementer : SomeAbstract {
  public override int ReadAndWrite { get; set; }
} ///:~

Note that interfaces and abstract classes work differently. abstract members are also virtual, thus properties that override cannot define more or less than the property in the abstract base. In other words, if an abstract property only defines a get, the implementer cannot define a set because there is nothing to override. However, if we don’t define a set, then our concrete property cannot be an automatic property. Therefore, we can only use automatic properties to satisfy abstract properties that have both a get and a set.
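For the get-only case described above, the override needs a hand-written accessor. The following sketch is written to C# 3.0’s rules, and Shape, Circle, and GetOnlyOverride are hypothetical names. (Later versions of the language added get-only automatic properties, so treat this as specific to the version the book covers.)

```csharp
using System;

abstract class Shape {
  // Only a get accessor is declared, so an override may
  // supply only a get accessor:
  public abstract string Name { get; }
}

class Circle : Shape {
  // Written for C# 3.0: with no set accessor, the property
  // needs an explicit backing field and a get body.
  private readonly string name = "circle";
  public override string Name { get { return name; } }
}

class GetOnlyOverride {
  static void Main() {
    Shape shape = new Circle();
    Console.WriteLine(shape.Name); // circle
  }
}
```

Adding a set accessor to Circle.Name would fail to compile, because the abstract property gives it nothing to override.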

Exercise 13: Create a simple Rectangle class with Width, Height, and Area properties.
//: SimpleNewFeaturesSolutions\AutomaticRectangle.cs
class Rectangle {
  public Rectangle(int width, int height) {
    Width = width;
    Height = height;
  }
  public double Width { get; set; }
  public double Height { set; get; }
  public double Area {
    get { return Width * Height; }
  }
}
class AutomaticRectangle {
  static void Main() {
    var rectangle = new Rectangle(5, 10);
    rectangle.Width.AssertEquals(5);
    rectangle.Height.AssertEquals(10);
    rectangle.Area.AssertEquals(50);
  }
} ///:~


Notice that Width and Height simply wrap backing fields, which makes them an opportune use of an automatic property. Area’s derived value, however, requires a body for get.

Exercise 14: Make one of Implementer’s properties a public field and note the compiler error.
//: SimpleNewFeaturesSolutions\ImplementingInterfaces.cs
// {CF: /target:library}
// {CompileTimeError: 'Implementer' does not implement
// interface member 'SomeInterface.ReadAndWrite'}
interface SomeInterface {
  int WriteOnly { set; }
  int ReadOnly { get; }
  int ReadAndWrite { get; set; }
}
class Implementer : SomeInterface {
  public int WriteOnly { get; set; }
  public int ReadOnly { get; set; }
  public int ReadAndWrite;
} ///:~

The CompileTimeError flag indicates the error you get when you compile this solution. Our build system verifies that the compiler produces the message after the flag.

Exercise 15: Change Person’s properties to fields. Prove that accessible fields work in an initialization list.
//: SimpleNewFeaturesSolutions\CanInitializePublicFields.cs
class PersonWithPublicFields {
  public int ID;
  public string FirstName;
  public string LastName;
}
class CanInitializePublicFields {
  static void Main() {
    var person = new PersonWithPublicFields {
      FirstName = "Joe"
    };
    person.ID.AssertEquals(0);
    person.FirstName.AssertEquals("Joe");
    person.LastName.AssertEquals(null);
  }
} ///:~

We removed the constructors and ToString( ) for brevity.

Exercise 16: Initialize only the FirstName property of a Person object. (See solution to previous exercise.)

Exercise 17: Make both a type with an int Field, and a local variable called Field. Why does it look odd when you initialize an instance of your type’s Field with the local variable Field, and why is this not ambiguous?
//: SimpleNewFeaturesSolutions\Fields.cs
class MyType {
  public int Field;
}
class Fields {
  static void Main() {
    int Field = 10;
    MyType type = new MyType { Field = Field };
    type.Field.AssertEquals(Field);
  }
} ///:~

Although it looks like we assigned a field to itself, in fact an implicit “type.” is prepended to the left side of the assignment, so the left side names the new object’s Field and the right side names the local variable.

Exercise 18: Use collection initializer syntax to create a Dictionary that stores a random number for the Key, and an IEnumerable of random numbers and random length for the Value.
//: SimpleNewFeaturesSolutions\InitializeADictionary.cs
using System;
using System.Collections.Generic;
class InitializeADictionary {
  static Random rand = new Random(47);
  const int Max = 100;
  static IEnumerable<int> RandomSequence() {
    // C# 3.0 provides better ways to do
    // this that you will see later:
    var ret = new List<int>();
    int amount = rand.Next(5);
    for(int i = 0; i < amount; i++)
      ret.Add(rand.Next(Max));
    return ret;
  }
  static void Main() {
    var dictionary =
      new Dictionary<int, IEnumerable<int>> {
        { rand.Next(Max), RandomSequence() },
        { 55, new List<int> { rand.Next(Max),
          rand.Next(Max), rand.Next(Max), rand.Next(Max) }},
        { rand.Next(Max), RandomSequence() },
      };
    foreach(var pair in dictionary)
      pair.Value.P(pair.Key.ToString(),
        POptions.NoNewLines);
  }
} /* Output:
28: [62, 58, 42]
55: [63, 96, 2, 79]
60: [16, 94, 92, 88]
*///:~

We generate both an IEnumerable of random length and one that is hardcoded. We return a List rather than using yield to avoid deferred execution. You’ll understand what that means later, but for now you can change our solution to use yield and run the foreach in Main( ) twice. You’ll get different output both times.
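Random output makes the effect of deferred execution hard to spot, so this sketch replaces the random numbers with a counter that records how often each method body actually runs. DeferredExecutionSketch, Lazy( ), Eager( ), and BodyRuns are hypothetical names used only here:

```csharp
using System;
using System.Collections.Generic;

class DeferredExecutionSketch {
  public static int BodyRuns;
  // The body of an iterator method does not run when the method
  // is called; it runs each time the result is enumerated:
  public static IEnumerable<int> Lazy() {
    BodyRuns++;
    yield return BodyRuns;
  }
  // Building a List does its work once, during the call:
  public static IEnumerable<int> Eager() {
    BodyRuns++;
    return new List<int> { BodyRuns };
  }
  static void Main() {
    IEnumerable<int> lazy = Lazy();
    Console.WriteLine(BodyRuns);                 // 0
    foreach(int i in lazy) Console.WriteLine(i); // 1
    foreach(int i in lazy) Console.WriteLine(i); // 2
    IEnumerable<int> eager = Eager();
    Console.WriteLine(BodyRuns);                  // 3
    foreach(int i in eager) Console.WriteLine(i); // 3
    foreach(int i in eager) Console.WriteLine(i); // 3
  }
}
```

The two loops over lazy see different values because each loop reruns the iterator body, which is the same reason a yield-based RandomSequence( ) would print different numbers on each pass.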

Exercise 19: Write code to prove whether a Stack, a LinkedList, and a Queue meet C# 3.0’s definition of “collection.”
//: SimpleNewFeaturesSolutions\NotCollections.cs
// {IE: NeedlessUsings}
using System;
using System.Collections.Generic;
class NotCollections {
  static void HasNoAddMethod(Type type) {
    foreach(var methodInfo in type.GetMethods())
      if(methodInfo.Name == "Add")
        false.True(); // Bombs
  }
  static void Main() {
    HasNoAddMethod(typeof(Stack<>));
    HasNoAddMethod(typeof(Queue<>));
    HasNoAddMethod(typeof(LinkedList<>));
    // Compiler agrees with our checks:
    //c! new Stack<int> { 5 };
    //c! new Queue<int> { 5 };
    //c! new LinkedList<int> { 5 };
  }
} ///:~

None of the three has an Add( ) method, therefore none is a “collection” according to 3.0’s definition. Notably though, these are commonly used System.Collections.Generic types. Calling True( ) on false causes a failed assertion, but the solution never reaches that line of code.

Exercise 20: When you have an IEnumerable type without an Add( ) method, but a valid Add( ) extension method is in scope, does your type then qualify as a C# 3.0 collection type?
//: SimpleNewFeaturesSolutions\AddExtension.cs
// {CompileTimeError: 'MyEnumerable' does not contain
// a definition for 'Add'}
using System.Collections;
class MyEnumerable : IEnumerable {
  public IEnumerator GetEnumerator() { yield break; }
}
static class AddExtensionMakesNoDifference {
  static void Add(this MyEnumerable myEnumerable,
    object o) {}
  static void Main() {
    MyEnumerable myEnumerable = new MyEnumerable { 5 };
  }
} ///:~

The Add( ) extension does not qualify MyEnumerable as a “collection” type.

Exercise 21: Write code that proves the Add( ) method must be perfectly cased (i.e. “Add”, not “ADD” or “add”).

//: SimpleNewFeaturesSolutions\PerfectCasing.cs
// {CompileTimeError: 'MyEnumerable' does not contain
// a definition for 'Add'}
using System.Collections;

class MyEnumerable : IEnumerable {
  public IEnumerator GetEnumerator() { yield break; }
  static void add(object o) {}
  static void ADD(object o) {}
}

static class AddExtensionMakesNoDifference {
  static void Main() {
    MyEnumerable myEnumerable = new MyEnumerable { 5 };
  }
} ///:~

This shows the same issue as the previous exercise solution.

Exercise 22: Write code that proves the compiler initializes nested collections immediately, Add( )ing each one individually instead of creating them all and then adding them as a batch at the end.

//: SimpleNewFeaturesSolutions\OrderOfAddCalls.cs
using System.Collections;
using System.Collections.Generic;

class MyCollection<T> : IEnumerable<T> {
  public IEnumerator<T> GetEnumerator() { yield break; }
  IEnumerator IEnumerable.GetEnumerator() {
    return GetEnumerator();
  }
  public void Add(T item) {
    ("Add<" + typeof(T) + ">()").P();
  }
}

class OrderOfAddCalls {
  static void Main() {
    new MyCollection<MyCollection<int>> {
      new MyCollection<int> { 1, 2 },
      new MyCollection<int> { 3, 4 }
    };
  }
} /* Output:
Add<System.Int32>()
Add<System.Int32>()
Add<MyCollection`1[System.Int32]>()
Add<System.Int32>()
Add<System.Int32>()
Add<MyCollection`1[System.Int32]>()
*///:~

The trace statements provide the proof. The first nested MyCollection is initialized then Add( )ed to the outer collection. The process repeats for the second.

Exercise 23: Add five instances of some anonymous type to a List. Give the anonymous type three integer fields: a value, its square, and its cube.

//: SimpleNewFeaturesSolutions\SquaresAndCubes.cs
using System;
using System.Collections.Generic;

class SquaresAndCubes {
  static void Main() {
    var answer = new List<object>();
    for(int Value = 1; Value <= 5; Value++)
      answer.Add(new {
        Value,
        Square = (int)Math.Pow(Value, 2),
        Cube = (int)Math.Pow(Value, 3)
      });
    answer.P();
  }
} /* Output:
[
  { Value = 1, Square = 1, Cube = 1 },
  { Value = 2, Square = 4, Cube = 8 },
  { Value = 3, Square = 9, Cube = 27 },
  { Value = 4, Square = 16, Cube = 64 },
  { Value = 5, Square = 25, Cube = 125 }
]
*///:~

We must use List<object> because we cannot explicitly state the name of the anonymous type as the generic argument. We named our iteration variable Value instead of i to avoid explicitly naming the anonymous type’s first field. You’ll see how the compiler infers a generic sequence’s type when we revisit this in the Query Expressions chapter.

Exercise 24: Make two instances of an anonymous type that holds another anonymous type that holds a HeldType from AnonymousEquals.cs. Call Equals( ) on the first-anonymous type, and notice that HeldType.Equals( )’s trace statement prints successfully (indicating that the second-level Equals( ) is indeed generated). //: SimpleNewFeaturesSolutions\AnonymousEqualsDeeper.cs class HeldType { string id; public HeldType(string id) { this.id = id; } public override bool Equals(object obj) { // Trace statement proves this method is called: "Equals()".P(id); return true; } } class AnonymousEqualsDeeper { static void Main() { var outer = new { Inner = new { Held = new HeldType("first") } }; outer.Equals(outer); var outer2 = new { Inner = new { Held = new HeldType("second") } }; outer.Equals(outer2); outer2.Equals(outer); } } /* Output: first: Equals() first: Equals() second: Equals() *///:~

This exercise shows that the rules for the compiler-generated Equals( ) are recursive, as we would expect. outer calls Equals( ) on its only member, Inner, which again calls Equals( ) on its only member, Held. Notice how the output varies depending on which reference we invoke Equals( ) on.

Exercise 25: (Advanced.) Use reflection to determine whether the compiler reuses the same generated generic type if you change only the order of the identical property names and property types of two anonymous types. //: SimpleNewFeaturesSolutions\DifferentGeneratedTypes.cs class DifferentGeneratedTypes { static void Main() { var anonymous1 = new { Property1 = 5, Property2 = 6 }; var anonymous2 = new { Property2 = 6, Property1 = 5 }; // The types are not the same: anonymous1.GetType() .Equals(anonymous2.GetType()).False(); // What the compiler is doing: anonymous1.GetType().Name .AssertEquals("<>f__AnonymousType0`2"); anonymous2.GetType().Name .AssertEquals("<>f__AnonymousType1`2"); } } ///:~

The compiler generates different generic types. If the compiler reused the same type, then two instances could possibly be equal even though the property order differs.
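As a small illustration of why property order matters, the following sketch (our own, with hypothetical names) compares anonymous instances that hold the same values. It assumes all instances are created within a single assembly, which is where the compiler shares anonymous types:

```csharp
using System;

static class AnonymousOrderDemo {
  // Same names, types, and order: one generated type, value equality.
  public static bool SameOrderEquals() {
    var a = new { X = 1, Y = 2 };
    var b = new { X = 1, Y = 2 };
    return a.GetType() == b.GetType() && a.Equals(b);
  }

  // Same names and values, different order: distinct types, never equal.
  public static bool DifferentOrderEquals() {
    var a = new { X = 1, Y = 2 };
    var c = new { Y = 2, X = 1 };
    return a.Equals(c);
  }

  static void Main() {
    Console.WriteLine(SameOrderEquals());      // True
    Console.WriteLine(DifferentOrderEquals()); // False
  }
}
```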

Exercise 26: Write a single-argument lambda expression whose parameter the compiler infers as a DateTime. Access the Day property from within the lambda. Make another lambda whose parameter type the compiler infers as an int. Again, try to access the Day property. Prove that the compiler catches the error.

//: SimpleNewFeaturesSolutions\LambdaInference.cs
using System;

delegate void DateTimeDelegate(DateTime dateTime);
delegate void IntDelegate(int theInt);

class LambdaInference {
  static void Main() {
    DateTimeDelegate del1 = d => d.Day.P();
    //c! IntDelegate del2 = d => d.Day.P();
  }
} ///:~

The compiler catches the error. Lambdas maintain compile-time type safety, even when using type inference.

Exercise 27: Make a List of random TimeSpans between 0 and 24 hours. Use FindAll( ) to locate all TimeSpans less than 12 hours, Exists( ) to see if any have an Hours property value of five, TrueForAll( ) to ensure each is between 0 and 24 hours, and ConvertAll( ) to return just the Hours portion of each. Use lambda expressions for all calls. //: SimpleNewFeaturesSolutions\ListOperations.cs using System; using System.Collections.Generic; class ListOperations { static Random rand = new Random(47); static IEnumerable<TimeSpan> RandomSpans() { int amount = rand.Next(20); for(int i = 0; i < amount; i++) // Keep each within 24 hours: yield return new TimeSpan(rand.Next(24), rand.Next(60), rand.Next(60)); } static void Main() { // Could call ToList() here, but we haven't // introduced it in the book yet: var list = new List<TimeSpan>(RandomSpans()); list.Sort(); list.P("list", POptions.NoNewLines); list.FindAll(ts => ts.TotalHours < 12) .P("Less than 12 hours", POptions.NoNewLines); list.Exists(t => t.Hours == 5).False(); list.TrueForAll(t => 0 < t.TotalHours && t.TotalHours < 24).True(); list.ConvertAll(ts => ts.Hours).P("Hours", POptions.NoNewLines); } } /* Output:

list: [00:47:36, 10:37:57, 14:37:35, 22:09:56, 22:52:22] Less than 12 hours: [00:47:36, 10:37:57] Hours: [0, 10, 14, 22, 22] *///:~

Compared to anonymous methods, lambda expressions significantly reduce the labor of this exercise. Later we’ll use Enumerable.Select( ), which is a more generic version of ConvertAll( ).
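As a preview, this sketch (ours, with hypothetical names) places the two calls side by side. ConvertAll( ) is a List method, whereas Enumerable.Select( ) accepts any IEnumerable and produces its results lazily:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SelectVersusConvertAll {
  // ConvertAll() is defined on List<T> and returns a new List:
  public static List<int> HoursViaConvertAll(List<TimeSpan> spans) {
    return spans.ConvertAll(ts => ts.Hours);
  }

  // Select() works on any IEnumerable<T> and is evaluated on demand:
  public static IEnumerable<int> HoursViaSelect(
    IEnumerable<TimeSpan> spans) {
    return spans.Select(ts => ts.Hours);
  }

  static void Main() {
    var list = new List<TimeSpan> {
      new TimeSpan(0, 47, 36), new TimeSpan(10, 37, 57)
    };
    // Both approaches produce the same elements:
    Console.WriteLine(
      HoursViaConvertAll(list).SequenceEqual(HoursViaSelect(list)));
  }
}
```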

Exercise 28: Write a Generate( ) method that returns an IEnumerable<T>. Have it take two arguments: an int with the number of items to generate, and a no-arg delegate that returns a type T. Make it execute the delegate the given number of times, yielding each result. Hint: You may wish to use the Func delegate discussed in the next section instead of making your own.

//: MindView.Util\Enumerable.3.cs
// {CF: /target:library}
using System;
using System.Collections.Generic;

namespace MindView.Util {
  public static partial class Enumerable {
    public static IEnumerable<T>
    Generate<T>(int amount, Func<T> generator) {
      for(int i = 0; i < amount; i++)
        yield return generator();
    }
  }
} ///:~

We use Generate( ) in the next exercise.

Exercise 29: Fill two Lists with the same random numbers. Sort( ) both lists in descending order, and use RemoveAll( ) to take out the odd numbers. For the first list, use anonymous methods; for the second list, use lambda expressions. Verify that they produce identical results (TrueForAll( )).

//: SimpleNewFeaturesSolutions\RemovingOdds.cs
using System;
using MindView.Util;
using System.Collections.Generic;

class RemovingOdds {
  static void Main() {
    var rand = new Random(47);
    // You may have made a for loop for this:
    var one = new List<int>(
      Enumerable.Generate(100, rand.Next));
    var two = new List<int>(one);
    one.AssertEquals(two);
    one.Sort(delegate(int left, int right) {
      return right.CompareTo(left);
    });
    two.Sort((left, right) => right.CompareTo(left));
    one.AssertEquals(two);
    one.RemoveAll(delegate(int i) { return i % 2 != 0; });
    two.RemoveAll(i => i % 2 != 0);
    one.AssertEquals(two);
    one.TrueForAll(delegate(int i) { return i % 2 == 0; })
      .True();
    two.TrueForAll(i => i % 2 == 0).True();
  }
} ///:~

The MindView.Util.Enumerable.Generate( ) method executes its given delegate n times, yielding each result. Notice that the anonymous methods and the lambda expressions look much the same, but the lambda expressions are much more succinct. Later in the book we will introduce System.Linq.Enumerable.Where( ), which filters much like RemoveAll( ), but works on any IEnumerable.
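The following sketch (our own, with hypothetical names) contrasts the two filtering styles. Note that the predicates are inverted: RemoveAll( ) names what to discard and changes the list in place, while Where( ) names what to keep and leaves its source untouched:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class WhereVersusRemoveAll {
  public static List<int> EvensViaRemoveAll(IEnumerable<int> source) {
    var copy = new List<int>(source);
    copy.RemoveAll(i => i % 2 != 0); // Discards the odd numbers
    return copy;
  }

  public static IEnumerable<int> EvensViaWhere(
    IEnumerable<int> source) {
    return source.Where(i => i % 2 == 0); // Keeps the even numbers
  }

  static void Main() {
    int[] numbers = { 3, 6, 3, 2, 5, 7, 8 };
    Console.WriteLine(EvensViaRemoveAll(numbers)
      .SequenceEqual(EvensViaWhere(numbers)));
  }
}
```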

Exercise 30: Write a basic selection sort algorithm that returns an IEnumerable<T> and takes an IEnumerable<T> and a Func<T, T, int> as the comparer. Test your algorithm.

//: SimpleNewFeaturesSolutions\FuncSelectionSort.cs
using System;
using MindView.Util;
using System.Collections.Generic;

static class FuncSelectionSort {
  static IEnumerable<T> SelectionSort<T>(
    this IEnumerable<T> sequence,
    Func<T, T, int> comparer) {
    var list = new List<T>(sequence);
    // Selection:
    for(int i = 0; i < list.Count; i++) {
      int smallest = i;
      for(int j = smallest + 1; j < list.Count; j++)
        if(comparer(list[j], list[smallest]) < 0)
          smallest = j;
      T temp = list[i];
      list[i] = list[smallest];
      list[smallest] = temp;
    }
    return list;
  }
  static void Main() {
    Func<IEnumerable<int>> generation =
      () => Enumerable.Generate(5, new Random(47).Next);
    generation().SelectionSort((l, r) => l - r)
      .P(POptions.NoNewLines);
    generation().SelectionSort((l, r) => r - l)
      .P(POptions.NoNewLines);
  }
} /* Output:
[601795864, 914533663, 1266970102, 1305670887, 1332423928]
[1332423928, 1305670887, 1266970102, 914533663, 601795864]
*///:~

Writing SelectionSort( ) helps acquaint you with Func. We also used Func in Main( ) to generate identical sequences of random numbers for sorting in ascending and descending order. Later you will see OrderBy( ), which uses a Func to choose what objects to compare. It then relies on the chosen objects to implement IComparable (instead of using a two-arg Func as we did here).
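As a brief preview of that style, here is a sketch of our own (the names are hypothetical). The Func passed to OrderBy( ) only selects a key; the ordering itself comes from the key type, here int:

```csharp
using System;
using System.Linq;

static class OrderByPreview {
  // The lambda chooses the key (the word's length); int supplies
  // the comparison, so no two-argument comparer is needed.
  public static string[] ByLength(string[] words) {
    return words.OrderBy(w => w.Length).ToArray();
  }

  static void Main() {
    string[] words = { "banana", "fig", "pear" };
    Console.WriteLine(string.Join(", ", ByLength(words)));
    // fig, pear, banana
  }
}
```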

Chapter 2

Exercise 1: From a sequence of strings, select the ones with no vowels, uppercasing the ones you select.

//: QueryExpressionsSolutions\NoVowels.cs
using System.Linq;

class NoVowels {
  static void Main() {
    var strings = new[] { "bcd", "baCd", "bCdF", "hOwdy",
      "l8R", "abracadabra", };
    var noVowels =
      from s in strings
      where s.ToLower().IndexOfAny(
        new[] { 'a', 'e', 'i', 'o', 'u' }) == -1
      select s.ToUpper();
    noVowels.AssertEquals(new[] { "BCD", "BCDF", "L8R" });
    // Translation:
    var noVowels2 = strings.Where(s => s.ToLower()
      .IndexOfAny(new[] { 'a', 'e', 'i', 'o', 'u' }) == -1)
      .Select(s => s.ToUpper());
    noVowels.AssertEquals(noVowels2);
    var noVowels3 =
      Enumerable.Select(
        Enumerable.Where(
          strings, s => s.ToLower().IndexOfAny(new[] {
            'a', 'e', 'i', 'o', 'u' }) == -1),
        s => s.ToUpper());
    noVowels2.AssertEquals(noVowels3);
  }
} ///:~

At first query expressions may seem limited in their ability, but since they are simply syntactic rewrites, normal code within each clause is legal.

Exercise 2: Look up Enumerable.Range( ) and use it to create a number source to write a query that calculates the squares of the odd numbers from 1 to 100.

//: QueryExpressionsSolutions\SquaresOfOddNumbers.cs
using System;
using System.Linq;

class SquaresOfOddNumbers {
  static void Main() {
    var answer1 =
      from i in Enumerable.Range(1, 100)
      where i % 2 == 1
      select (int)Math.Pow(i, 2);
    var answer2 =
      Enumerable.Range(1, 100)
      .Where(i => i % 2 == 1)
      .Select(i => (int)Math.Pow(i, 2));
    answer1.AssertEquals(answer2);
    var answer3 =
      Enumerable.Select(
        Enumerable.Where(Enumerable.Range(1, 100),
          i => i % 2 == 1),
        i => (int)Math.Pow(i, 2));
    answer2.AssertEquals(answer3);
    // Verify answer1 by hand:
    var enumerator = answer1.GetEnumerator();
    foreach(int i in Enumerable.Range(1, 100)) {
      if(i % 2 == 0)
        continue;
      enumerator.MoveNext().True();
      Math.Pow(i, 2).AssertEquals(enumerator.Current);
    }
    enumerator.MoveNext().False();
  }
} ///:~

Range( ) returns a sequence of numbers. The queries in answer1 and answer2 differ mainly in the dots, parentheses, capitalized clause name (select vs. .Select( )), and conversion of the lambda bodies to actual lambdas (i =>). Since all expressions convert into lambdas and method calls, we can use any legal C#. We cast the result of Math.Pow( ) (a double) back to an int so that the query returns an IEnumerable<int> and not an IEnumerable<double>. Later we’ll study Enumerable’s Cast( ) method, which casts all the elements to a target type T. The third approach is harder to read. C# 3.0 greatly enhances the legibility of extension methods with syntactic rewrites, even though extension methods allow you to write query-like code with normal static methods.

Exercise 3: Modify Select( ) in SelectCode.cs so the select clause returns 100 for each element instead of using the selector argument.2

2 We suggest, for practice with these exercises, that you first formulate the query and then translate it into Enumerable calls.

//: QueryExpressionsSolutions\SelectCodeModified.cs
using System;
using System.Collections.Generic;

static class CustomEnumerable {
  public static IEnumerable<int>
  Select(this IEnumerable<int> collection,
    Func<int, int> selector) {
    foreach(int element in collection)
      yield return 100;
  }
  // Other query expression helper methods go here
}

class SelectCode {
  static void Main() {
    int[] numbers = { 1, 2, 3, 4, 5 };
    var timesTen =
      from number in numbers
      select number * 10;
    timesTen.AssertEquals(
      new[] { 100, 100, 100, 100, 100 });
    var timesTenTranslation1 =
      numbers.Select(number => number * 10);
    timesTen.AssertEquals(timesTenTranslation1);
    var timesTenTranslation2 =
      CustomEnumerable.Select(numbers,
        number => number * 10);
    timesTenTranslation1
      .AssertEquals(timesTenTranslation2);
  }
} ///:~

We change all type arguments to int to satisfy the compiler, since 100 is hard-coded as our return value. Notice the expression in the select clause is now useless since our Select( ) ignores it. Having confirmed that the first query yields the expected result, we then show the compiler’s three-step translation, inserting tests between each stage that verify identical results.

Exercise 4: When you add a using for System.Linq to SelectCode.cs, does the compiler still resolve to CustomEnumerable’s Select( ) method or use Enumerable’s instead?

//: QueryExpressionsSolutions\SelectWithNeedlessUsing.cs
// {IE: NeedlessUsings}
using System;
using System.Linq;
using System.Collections.Generic;

static class CustomEnumerable {
  public static IEnumerable<U>
  Select<T, U>(this IEnumerable<T> collection,
    Func<T, U> selector) {
    "Select()".P();
    foreach(T element in collection)
      yield return selector(element);
  }
  // Other query expression helper methods go here
}

class SelectCode {
  static void Main() {
    int[] numbers = { 1, 2, 3, 4, 5 };
    var timesTen =
      from number in numbers
      select number * 10;
    timesTen.AssertEquals(new[] { 10, 20, 30, 40, 50 });
  }
} /* Output:
Select()
*///:~

The compiler chooses CustomEnumerable’s Select( ) — still “nearer” in scope than Enumerable’s — even when we add using.

Exercise 5: Write a Where( ) method and use it in a query. Give it a trace statement to prove the compiler calls your version instead of Enumerable’s.

//: QueryExpressionsSolutions\CustomWhereMethod.cs
using System;
using System.Collections.Generic;

static class MyExtensions {
  public static IEnumerable<T> Where<T>(
    this IEnumerable<T> source, Func<T, bool> predicate) {
    "Where()".P();
    foreach(T item in source)
      if(predicate(item))
        yield return item;
  }
}

class CustomWhereMethod {
  static void Main() {
    int[] numbers = { 3, 6, 3, 2, 5, 7, 8 };
    var result1 =
      from i in numbers
      where i % 2 == 0
      select i;
    result1.AssertEquals(new[] { 6, 2, 8 });
    var result2 =
      numbers.Where(i => i % 2 == 0); // Degenerate select
    result2.AssertEquals(result1);
    var result3 =
      MyExtensions.Where(numbers, i => i % 2 == 0);
    result3.AssertEquals(result2);
  }
} /* Output:
Where()
Where()
Where()
Where()
Where()
*///:~

Where( ) is much like Select( ) except that it uses its delegate for a test instead of a transformation. For each item that passes the predicate, Where( ) yields that item. Of course, you can write the algorithm to do whatever you like instead, though doing so would cause confusion. You might initially expect three “Where( )” trace statements in the output, but there are five. We explain this behavior in the section that follows in the book.
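One ingredient of that explanation can be sketched in isolation. In the hypothetical example below (ours, not the book's), a counter stands in for the trace statement. Because the method is an iterator, its body does not run when the method is called; it runs each time the result is enumerated. We assume here that the extra trace lines arise because the test helper enumerates some queries more than once:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class TraceOnEnumeration {
  public static int TraceCount;

  public static IEnumerable<T> TracedWhere<T>(
    IEnumerable<T> source, Func<T, bool> predicate) {
    TraceCount++; // Stands in for the "Where()".P() trace statement
    foreach(T item in source)
      if(predicate(item))
        yield return item;
  }

  static void Main() {
    int[] numbers = { 3, 6, 3, 2, 5, 7, 8 };
    var query = TracedWhere(numbers, i => i % 2 == 0);
    Console.WriteLine(TraceCount); // Still unchanged: nothing has run
    query.ToList();
    query.ToList();
    Console.WriteLine(TraceCount); // Increased once per enumeration
  }
}
```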

Exercise 6: Here’s a twist: create a Select( ) method that handles the following query expression:

bool returnedValue =
  from s in 5
  select 5.5;

//: QueryExpressions\AbnormalSelectMethod.cs
using System;

static class AbnormalSelectMethod {
  static bool Select(this int i, Func<int, double> func) {
    return false;
  }
  static void Main() {
    (from s in 5 select 5.5).False();
    5.Select(s => 5.5).False();
    Select(5, s => 5.5).False();
  }
} ///:~

We drop the returnedValue variable because False( ) proves that the query returns a bool by requiring bool for its type. This query expression uses an int as the source and returns a bool, whereas most query expressions use an IEnumerable or IQueryable as their source and/or return type. Note the syntactic revision: the compiler inserts a parameter s for the anonymous method even though we don’t use s in the lambda.

Exercise 7: Prove that the compiler drops degenerate select clauses by relying on the trace statement within Select( ) in SelectCode.cs. Make two queries, one with a degenerate select clause, and one without, and notice that the trace statement only prints for the non-degenerate select clause.

//: QueryExpressionsSolutions\DegeneratesDontPrint.cs
using System;
using System.Linq;
using System.Collections.Generic;

static class CustomEnumerable {
  public static IEnumerable<U>
  Select<T, U>(this IEnumerable<T> collection,
    Func<T, U> selector) {
    "Select()".P();
    foreach(T element in collection)
      yield return selector(element);
  }
}

class DegeneratesDontPrint {
  static void Main() {
    int[] numbers = { 1, 2, 3 };
    var notDegenerate1 =
      from number in numbers
      select number;
    notDegenerate1.AssertEquals(numbers);
    var notDegenerate2 =
      from number in numbers
      select number + 5;
    notDegenerate2.AssertEquals(new[] { 6, 7, 8 });
    var degenerate =
      from number in numbers
      where number < 5
      select number;
    "No trace statement here...".P();
    degenerate.AssertEquals(numbers);
  }
} /* Output:
Select()
Select()
No trace statement here...
*///:~

Sure enough, the compiler drops the degenerate select clause, thus there’s no output for the last query.

Exercise 8: Does the compiler consider a where true clause to be degenerate? That is, does it consider it a waste of processor cycles and thus eliminate the clause in the translation? Write code to prove your answer.

//: QueryExpressionsSolutions\WhereTrueNotDegenerate.cs
using System;
using System.Linq;
using System.Collections.Generic;

static class WhereTrueNotDegenerate {
  static IEnumerable<T>
  Where<T>(this IEnumerable<T> source,
    Func<T, bool> predicate) {
    "Where()".P();
    foreach(T item in source)
      if(predicate(item))
        yield return item;
  }
  static void Main() {
    int[] numbers = {};
    (from i in numbers
     where true
     select i + 1).AssertEquals(new int[0]);
  }
} /* Output:
Where()
*///:~

The trace statement proves that where true is not considered degenerate.

Exercise 9: Make a HasWhere object with a Where( ) instance method (instead of the static extension methods we’ve used thus far). Make another class that has Where( ) and Select( ) extension methods. Put trace statements in all methods (be sure to distinguish between the two Where( ) methods). Use an instance of HasWhere for a query expression’s source. Which Where( ) does the compiler use? Does this setup cause any issues with the Select( ) extension method (make sure your select isn’t degenerate)?

//: QueryExpressionsSolutions\IntermixedMethods.cs
using System;
using System.Collections;
using System.Collections.Generic;

static class Extensions {
  public static IEnumerable<U>
  Select<T, U>(this IEnumerable<T> collection,
    Func<T, U> selector) {
    "Select()".P();
    foreach(T element in collection)
      yield return selector(element);
  }
  public static IEnumerable<T>
  Where<T>(this IEnumerable<T> source,
    Func<T, bool> func) {
    "Extension Where()".P();
    foreach(T i in source)
      if(func(i))
        yield return i;
  }
}

class HasWhere : IEnumerable<int> {
  List<int> elements = new List<int> { 5, 2, 7 };
  public IEnumerable<int> Where(Func<int, bool> func) {
    // This body was lost in extraction; it is reconstructed
    // from the trace output and the asserted result:
    "Instance Where()".P();
    foreach(int i in elements)
      if(func(i))
        yield return i;
  }
  public IEnumerator<int> GetEnumerator() {
    return elements.GetEnumerator();
  }
  IEnumerator IEnumerable.GetEnumerator() {
    return GetEnumerator();
  }
}

class IntermixedMethods {
  static void Main() {
    var hasWhere = new HasWhere();
    var result =
      from i in hasWhere
      where i % 2 == 1
      select i * 10;
    result.AssertEquals(new[] { 50, 70 });
  }
} /* Output:
Select()
Instance Where()
*///:~

Following the rules of extension methods, the compiler uses the instance Where( ) method instead of the extension version. However, since HasWhere has no Select( ), the rules simply reapply, and the compiler chooses the Select( ) extension method. This provides the greatest flexibility.
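The lookup rule can be seen apart from query expressions in the small sketch below (our own; Greeter and its methods are hypothetical names). An instance method always wins over an extension method with the same signature, and the compiler falls back to an extension method only when no suitable instance method exists:

```csharp
using System;

class Greeter {
  public string Greet() { return "instance"; }
}

static class GreeterExtensions {
  // Never chosen for g.Greet(): the instance method takes priority.
  public static string Greet(this Greeter g) { return "extension"; }
  // Chosen for g.Farewell(): Greeter has no such instance method.
  public static string Farewell(this Greeter g) {
    return "extension";
  }
}

static class InstanceBeatsExtension {
  static void Main() {
    var g = new Greeter();
    Console.WriteLine(g.Greet());    // instance
    Console.WriteLine(g.Farewell()); // extension
  }
}
```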

Exercise 10: Use string’s CompareTo( ) method to list all unique pairs of programmer names: Jeff, Andrew, Craig, Susan, Derek, Drew, Katlyn. Store each pair in an anonymous type with First and Second fields. //: QueryExpressionsSolutions\ProgrammerPairs.cs using System.Linq;

class ProgrammerPairs { static void Main() { var programmers = new[] { "Jeff", "Andrew", "Craig", "Susan", "Derek", "Drew", "Katlyn" }; var pairs = from programmer1 in programmers from programmer2 in programmers where programmer1.CompareTo(programmer2) < 0 select new { First = programmer1, Second = programmer2 }; pairs.P(); // Translation: var pairs2 = programmers .SelectMany( programmer1 => programmers, (programmer1, programmer2) => new { programmer1, programmer2 }) .Where(temp => temp.programmer1.CompareTo(temp.programmer2)< 0) .Select( temp => new { First = temp.programmer1, Second = temp.programmer2 }); pairs.AssertEquals(pairs2); // Further translation of extension methods: var pairs3 = Enumerable.Select( Enumerable.Where( Enumerable.SelectMany(programmers, programmer1 => programmers, (programmer1, programmer2) => new { programmer1, programmer2 }), temp => temp.programmer1.CompareTo( temp.programmer2) < 0), temp => new { First = temp.programmer1, Second = temp.programmer2 }); pairs2.AssertEquals(pairs3);

} } /* Output: [ { First = Jeff, Second = Susan }, { First = Jeff, Second = Katlyn }, { First = Andrew, Second = Jeff }, { First = Andrew, Second = Craig }, { First = Andrew, Second = Susan }, { First = Andrew, Second = Derek }, { First = Andrew, Second = Drew }, { First = Andrew, Second = Katlyn }, { First = Craig, Second = Jeff }, { First = Craig, Second = Susan }, { First = Craig, Second = Derek }, { First = Craig, Second = Drew }, { First = Craig, Second = Katlyn }, { First = Derek, Second = Jeff }, { First = Derek, Second = Susan }, { First = Derek, Second = Drew }, { First = Derek, Second = Katlyn }, { First = Drew, Second = Jeff }, { First = Drew, Second = Susan }, { First = Drew, Second = Katlyn }, { First = Katlyn, Second = Susan } ] *///:~

When CompareTo( ) returns a negative value, the name on the left is “less than” the name on the right. This enables us to remove duplicates and identical-name pairs. P( ) with a Boolean argument inserts newlines between each element. To show the compiler’s translation, we replace clauses with dots and method names. By moving the method names and inserting commas, we convert extension methods to normal static method calls.

Exercise 11: Use both the compiler’s approach (packing into temporary anonymous types) and the variable capture approach (nesting lambda expressions) to translate your solution for the previous exercise. //: QueryExpressionsSolutions\PairsWithCaptures.cs using System.Linq; class PairsWithCaptures {

static void Main() { var programmers = new[] { "Jeff", "Andrew", "Craig", "Susan", "Derek", "Drew", "Katlyn" }; var pairs = from programmer1 in programmers from programmer2 in programmers where programmer1.CompareTo(programmer2) < 0 select new { First = programmer1, Second = programmer2 }; pairs.P(); // See previous solution to see compiler's approach. // With captures: var pairs2 = programmers.SelectMany( programmer1 => programmers .Where(programmer2 => programmer1.CompareTo(programmer2) < 0) .Select( programmer2 => new { First = programmer1, Second = programmer2 })); pairs.AssertEquals(pairs2); var pairs3 = Enumerable.SelectMany(programmers, programmer1 => Enumerable.Select( Enumerable.Where(programmers, programmer2 => programmer1.CompareTo(programmer2) < 0), programmer2 => new { First = programmer1, Second = programmer2 })); pairs2.AssertEquals(pairs3); } } /* Output: [ { First = Jeff, Second = Susan }, { First = Jeff, Second = Katlyn }, { First = Andrew, Second = Jeff },

  { First = Andrew, Second = Craig },
  { First = Andrew, Second = Susan },
  { First = Andrew, Second = Derek },
  { First = Andrew, Second = Drew },
  { First = Andrew, Second = Katlyn },
  { First = Craig, Second = Jeff },
  { First = Craig, Second = Susan },
  { First = Craig, Second = Derek },
  { First = Craig, Second = Drew },
  { First = Craig, Second = Katlyn },
  { First = Derek, Second = Jeff },
  { First = Derek, Second = Susan },
  { First = Derek, Second = Drew },
  { First = Derek, Second = Katlyn },
  { First = Drew, Second = Jeff },
  { First = Drew, Second = Susan },
  { First = Drew, Second = Katlyn },
  { First = Katlyn, Second = Susan }
]
*///:~

We include the original query for convenience, and to compare the capture results with those of the compiler’s. Notice that captures don’t require packing values into temporary anonymous types. They require nested lambdas, however, so before translating the extension methods the intermediate translation nests rather than chaining. This syntax is easier to read and write, whether in a query expression or with chained extension methods.

Exercise 12: Generalize your solution to the programming pair exercise to create a generic Combine( ) method. (Hint: You will have to create your own type to contain both members of each pair.) //: QueryExpressionsSolutions\GenericCombine.cs using System; using System.Linq; using System.Collections.Generic; class Pair { public T First { get; set; } public U Second { get; set; } public override string ToString() { return "{ First = " + First + ", Second = " + Second + " }";

} } static class GenericCombine { static IEnumerable<Pair> Combine(this IEnumerable left, IEnumerable right) where T : IComparable { return from first in left from second in right where first.CompareTo(second) < 0 select new Pair { First = first, Second = second }; } static void Main() { var programmers = new[] { "Jeff", "Andrew", "Craig", "Susan", "Derek", "Drew", "Katlyn" }; programmers.Combine(programmers) .OrderBy(p => p.First) .ThenBy(p => p.Second).P(); } } /* Output: [ { First = Andrew, Second = Craig }, { First = Andrew, Second = Derek }, { First = Andrew, Second = Drew }, { First = Andrew, Second = Jeff }, { First = Andrew, Second = Katlyn }, { First = Andrew, Second = Susan }, { First = Craig, Second = Derek }, { First = Craig, Second = Drew }, { First = Craig, Second = Jeff }, { First = Craig, Second = Katlyn }, { First = Craig, Second = Susan }, { First = Derek, Second = Drew }, { First = Derek, Second = Jeff }, { First = Derek, Second = Katlyn }, { First = Derek, Second = Susan }, { First = Drew, Second = Jeff }, { First = Drew, Second = Katlyn }, { First = Drew, Second = Susan },

{ First = Jeff, Second = Katlyn }, { First = Jeff, Second = Susan }, { First = Katlyn, Second = Susan } ] *///:~

We built Pair to expand the scope of our select result (a downside of factoring into the Combine( ) method), since anonymous types are usable only within their declared method. We use OrderBy( ) and ThenBy( ) to make the output a bit clearer. Later in the book, you will see how these two methods work.

Exercise 13: How would AlternativeSelectMany.cs’s “alternative translation” change if its query contained a where clause? //: QueryExpressionsSolutions\AlternativeWithWhere.cs using System.Linq; class AlternativeWithWhere { static void Main() { int[] numbers1 = { 1, 2 }; int[] numbers2 = { 3, 4 }; var query1 = from n1 in numbers1 from n2 in numbers2 where n1 + n2 < 10 select n1 + n2; // Alternative translation: var query2 = numbers1.SelectMany( n1 => numbers2.Where(n2 => n1 + n2 < 10) .Select(n2 => n1 + n2)); query1.AssertEquals(new[] { 4, 5, 5, 6 }); query1.AssertEquals(query2); } } ///:~

The Where( ) call falls in place before the Select( ) call. Notice that both capture n1’s value since we nest them within the first lambda expression’s body.

Exercise 14: Write code to prove how ActualCompilerTranslation.cs’s translation changes if the compiler converts all froms except the first into SelectMany( ) calls. (That is, it converts the select clause into a normal Select( ) call instead of combining the last from and select into a single SelectMany( ).)

//: QueryExpressionsSolutions\WastedSelect.cs
using System.Linq;

class WastedSelect {
  static void Main() {
    int[] numbers1 = { 1, 2 };
    int[] numbers2 = { 3, 4 };
    int[] numbers3 = { 5, 6 };
    var result =
      from n1 in numbers1
      from n2 in numbers2
      from n3 in numbers3
      select n1 + n2 + n3;
    result.AssertEquals(
      new[] { 9, 10, 10, 11, 10, 11, 11, 12 });
    // Bogus translation:
    var resultTranslation =
      numbers1
      .SelectMany(n1 => numbers2, // n1 not used
        (n1, n2) => new { n1, n2 })
      .SelectMany(temp1 => numbers3, // temp1 not used
        (temp1, n3) => new { temp1, n3 })
      .Select(
        temp2 => temp2.temp1.n1 + temp2.temp1.n2 +
          temp2.n3);
    result.AssertEquals(resultTranslation);
  }
} ///:~

This exercise produces an unnecessary Select( ) for the last select clause. The second SelectMany( ) further packs the first’s results into another temporary anonymous type for the subsequent Select( ). However, this produces the same result as ActualCompilerTranslation.cs, which performs both steps in one SelectMany( ).
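For comparison, here is a sketch of the combined form. ActualCompilerTranslation.cs is not reproduced in this section, so this is our own reconstruction of the pattern (the class name is hypothetical), using the standard System.Linq methods. The last from and the select collapse into the result selector of the final SelectMany( ):

```csharp
using System;
using System.Linq;

static class CombinedSelectMany {
  public static int[] Sums() {
    int[] numbers1 = { 1, 2 };
    int[] numbers2 = { 3, 4 };
    int[] numbers3 = { 5, 6 };
    return numbers1
      .SelectMany(n1 => numbers2, (n1, n2) => new { n1, n2 })
      // No second temporary type and no trailing Select():
      .SelectMany(temp => numbers3,
        (temp, n3) => temp.n1 + temp.n2 + n3)
      .ToArray();
  }

  static void Main() {
    Console.WriteLine(string.Join(", ", Sums()));
    // 9, 10, 10, 11, 10, 11, 11, 12
  }
}
```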

Exercise 15: How does SuperFroms.cs’s translation change when you insert a “where true” between the third and fourth from? //: QueryExpressionsSolutions\SuperFromsWithWhere.cs using System.Linq;

class SuperFromsWithWhere { static void Main() { int[] numbers1 = { 1, 2 }; int[] numbers2 = { 3, 4 }; int[] numbers3 = { 4, 5 }; int[] numbers4 = { 6, 7 }; var additions1 = from n1 in numbers1 from n2 in numbers2 // The last two froms: from n3 in numbers3 where true from n4 in numbers4 select n1 + n2 + n3 + n4; additions1.AssertEquals(new[] { 14, 15, 15, 16, 15, 16, 16, 17, 15, 16, 16, 17, 16, 17, 17, 18 }); // New translation: var additions2 = numbers1 .SelectMany(n1 => numbers2, (n1, n2) => new { n1, n2 }) .SelectMany(temp1 => numbers3, (temp1, n3) => new { temp1, n3 }) .Where(temp2 => true) .SelectMany(n4 => numbers4, (temp2, n4) => temp2.temp1.n1 + temp2.temp1.n2 + temp2.n3 + n4); additions1.AssertEquals(additions2); } } ///:~

The where clause doesn’t change too much. The four froms still produce three SelectMany( )s, and the packing and unpacking into and out of transparent identifiers is much the same.

Exercise 16: We performed the first step of translating MixingWhere.cs’s first query in MixingWhereConversions.cs. Perform this same first step on the rest of the queries.

//: QueryExpressionsSolutions\MixingWhereTranslations.cs
using System.Linq;

class MixingWhereTranslations {
  static void Main() {
    int[] odds = { 1, 3, 5, 7, 9 };


    int[] evens = { 2, 4, 6, 8, 10 };
    var better =
      from o in odds
      where o < 7
      where o > 1
      from e in evens
      where e < 6
      select o * e;
    var betterTranslation =
      odds
      .Where(o => o < 7)
      .Where(o => o > 1)
      .SelectMany(o => evens, (o, e) => new { o, e })
      .Where(temp => temp.e < 6)
      .Select(temp => temp.o * temp.e);
    var better2 =
      from o in odds
      from e in evens
      where o > 1 && o < 7
      where e < 6
      select o * e;
    var better2Translation =
      odds
      .SelectMany(o => evens, (o, e) => new { o, e })
      .Where(temp => temp.o > 1 && temp.o < 7)
      .Where(temp => temp.e < 6)
      .Select(temp => temp.o * temp.e);
    var best =
      from o in odds
      from e in evens
      where o > 1 && o < 7 && e < 6
      select o * e;
    var bestTranslation =
      odds.SelectMany(o => evens, (o, e) => new { o, e })
      .Where(temp =>
        temp.o > 1 && temp.o < 7 && temp.e < 6)
      .Select(temp => temp.o * temp.e);
    better.AssertEquals(betterTranslation);
    betterTranslation.AssertEquals(better2);
    better2.AssertEquals(better2Translation);
    better2Translation.AssertEquals(best);
    best.AssertEquals(bestTranslation);
  }
} ///:~


If you follow the three rules outlined in this section, the translation is straightforward.

Exercise 17: Below you see a variation on the queries in MixingWhere.cs. One requires fewer iterations than the other. Which is it, and why?

//: QueryExpressionsSolutions\MixingWhereAlternative.cs
using System;
using System.Collections.Generic;

static class Extensions {
  static int selectCount;
  static int selectManyCount;
  static int whereCount;
  public static IEnumerable<R> Select<T, R>(
    this IEnumerable<T> collection, Func<T, R> selector) {
    foreach(T element in collection) {
      selectCount++;
      yield return selector(element);
    }
  }
  public static IEnumerable<R> SelectMany<T, C, R>(
    this IEnumerable<T> source,
    Func<T, IEnumerable<C>> collectionSelector,
    Func<T, C, R> resultSelector) {
    foreach(T tValue in source)
      foreach(C cValue in collectionSelector(tValue)) {
        selectManyCount++;
        yield return resultSelector(tValue, cValue);
      }
  }
  public static IEnumerable<T> Where<T>(
    this IEnumerable<T> source, Func<T, bool> func) {
    foreach(T i in source) {
      whereCount++;
      if(func(i))
        yield return i;
    }
  }
  public static void ReportCounts() {
    whereCount.P("Where() iteration count");


    selectManyCount.P("SelectMany() iteration count");
    selectCount.P("Select() iteration count");
    (whereCount + selectManyCount + selectCount).P("Total");
    whereCount = selectManyCount = selectCount = 0;
  }
}

class MixingWhereAlternative {
  static void Main() {
    int[] odds = { 1, 3, 5, 7, 9 };
    int[] evens = { 2, 4, 6, 8, 10 };
    var approach1 =
      from o in odds
      from e in evens
      where o > 1 && o < 7
      where e < 6
      select o * e;
    approach1.P(POptions.NoNewLines);
    Extensions.ReportCounts();
    var approach2 =
      from o in odds
      where o > 1 && o < 7
      from e in evens
      where e < 6
      select o * e;
    approach2.P(POptions.NoNewLines);
    Extensions.ReportCounts();
  }
} /* Output:
[6, 12, 10, 20]
Where() iteration count: 35
SelectMany() iteration count: 25
Select() iteration count: 4
Total: 64
[6, 12, 10, 20]
Where() iteration count: 15
SelectMany() iteration count: 10
Select() iteration count: 4
Total: 29
*///:~

In the text we recommend putting all froms together at the top of your query; however, this is only a recommendation. We feel, at least in this case, that doing so makes for the best approach. However, we must point out that many times you sacrifice speed for readability, though usually the cost goes unnoticed.

In this solution, we again created our own Select( ), SelectMany( ), and Where( ) methods. However, in our version, we also track the number of iterations each makes. Calling ReportCounts( ) prints the current values and resets them to zero.

The second approach mixes the from and where clauses, which requires fewer iterations (29 versus 64 in the output above). The where removes the unwanted items before the second from clause. The first query, however, combines all items with all other items, even the unwanted items, and then filters using the where.

Exercise 18: Use the Northwind class to list the names of Customers who live in Mexico.

//: QueryExpressionsSolutions\CustomersInMexico.cs
using System.Linq;

class CustomersInMexico {
  static void Main() {
    var mexicanCustomers =
      from c in Northwind.Customers
      where c.Country == "Mexico"
      select c.ContactName;
    mexicanCustomers.P();
    var mexicanCustomersTranslated1 =
      Northwind.Customers
      .Where(c => c.Country == "Mexico")
      .Select(c => c.ContactName);
    mexicanCustomers
      .AssertEquals(mexicanCustomersTranslated1);
    var mexicanCustomersTranslated2 =
      Enumerable.Select(
        Enumerable.Where(Northwind.Customers,
          c => c.Country == "Mexico"),
        c => c.ContactName);
    mexicanCustomersTranslated1
      .AssertEquals(mexicanCustomersTranslated2);
  }
} /* Output:
[
  Ana Trujillo,
  Antonio Moreno,


  Francisco Chang,
  Guillermo Fernández,
  Miguel Angel Paolino
]
*///:~

The static Northwind properties return IEnumerables, making them ideal sources for a query.

Exercise 19: Write a query using the Northwind class that pairs all Mexican customers with American customers.

//: QueryExpressionsSolutions\MexicansAndAmericans.cs
using System.Linq;

class MexicansAndAmericans {
  static void Main() {
    var result =
      from m in Northwind.Customers
      where m.Country == "Mexico"
      from a in Northwind.Customers
      where a.Country == "USA"
      select new {
        Mexican = m.ContactName,
        American = a.ContactName
      };
    var resultTranslated =
      Northwind.Customers
      .Where(m => m.Country == "Mexico")
      .SelectMany(m => Northwind.Customers,
        (m, a) => new { m, a })
      .Where(temp => temp.a.Country == "USA")
      .Select(temp => new {
        Mexican = temp.m.ContactName,
        American = temp.a.ContactName
      });
    result.AssertEquals(resultTranslated);
    var resultTranslated2 =
      Enumerable.Select(
        Enumerable.Where(
          Enumerable.SelectMany(
            Enumerable.Where(Northwind.Customers,
              m => m.Country == "Mexico"),
            m => Northwind.Customers,
            (m, a) => new { m, a }),


          temp => temp.a.Country == "USA"),
        temp => new {
          Mexican = temp.m.ContactName,
          American = temp.a.ContactName
        });
    resultTranslated.AssertEquals(resultTranslated2);
  }
} ///:~

Two froms act like a Cartesian product. We discussed this earlier in the book.

Exercise 20: Select the value and the square root of the numbers 1 to 100 with square roots greater than 5 and less than 6. Use a let clause.

//: QueryExpressionsSolutions\SquareRootsBetween5And8.cs
using System;
using System.Linq;

class SquareRootsBetween5And8 {
  static void Main() {
    // Without a let:
    var answer =
      from val in Enumerable.Range(1, 100)
      where 5 < Math.Sqrt(val) && Math.Sqrt(val) < 6
      select new {
        Value = val,
        SquareRoot = Math.Sqrt(val)
      };
    answer.P();
    var answer2 =
      from val in Enumerable.Range(1, 100)
      let sqrt = Math.Sqrt(val)
      where 5 < sqrt && sqrt < 6
      select new { Value = val, SquareRoot = sqrt };
    answer.AssertEquals(answer2);
    // Alternative approach:
    var answer3 =
      from val in Enumerable.Range(26, 10)
      select new {
        Value = val,
        SquareRoot = Math.Sqrt(val)
      };


    answer2.AssertEquals(answer3);
  }
} /* Output:
[
  { Value = 26, SquareRoot = 5.09901951359278 },
  { Value = 27, SquareRoot = 5.19615242270663 },
  { Value = 28, SquareRoot = 5.29150262212918 },
  { Value = 29, SquareRoot = 5.3851648071345 },
  { Value = 30, SquareRoot = 5.47722557505166 },
  { Value = 31, SquareRoot = 5.56776436283002 },
  { Value = 32, SquareRoot = 5.65685424949238 },
  { Value = 33, SquareRoot = 5.74456264653803 },
  { Value = 34, SquareRoot = 5.8309518948453 },
  { Value = 35, SquareRoot = 5.91607978309962 }
]
*///:~

The temporary sqrt variable in the second query makes it easier to read. It’s also less error-prone, because the expression Math.Sqrt(val), itself composed of the two expressions Math.Sqrt( ) and val, appears once instead of three times. You can eliminate the where clause by setting your Range( ) parameters correctly (as in our third query), which makes the let unnecessary. We show the translations later in the chapter.
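As a rough preview of that translation, the sketch below rewrites the let query by hand: a Select( ) packs val and sqrt into an anonymous type, and the clauses that follow unpack them. The class name LetTranslationSketch and the identifier temp are our own illustrative choices (the compiler generates its own hidden name), and we compare with SequenceEqual( ) instead of the book's AssertEquals( ) so the listing stands alone.

```csharp
using System;
using System.Linq;

static class LetTranslationSketch {
  // Returns true when the query and the hand translation agree.
  public static bool Matches() {
    var withLet =
      from val in Enumerable.Range(1, 100)
      let sqrt = Math.Sqrt(val)
      where 5 < sqrt && sqrt < 6
      select new { Value = val, SquareRoot = sqrt };
    // Hand translation; "temp" stands in for the
    // compiler-generated transparent identifier:
    var byHand =
      Enumerable.Range(1, 100)
      .Select(val => new { val, sqrt = Math.Sqrt(val) })
      .Where(temp => 5 < temp.sqrt && temp.sqrt < 6)
      .Select(temp =>
        new { Value = temp.val, SquareRoot = temp.sqrt });
    return withLet.SequenceEqual(byHand);
  }
  static void Main() {
    Console.WriteLine(Matches());
  }
}
```

The extra Select( ) is the price of the let: every element is wrapped in a temporary object so that later clauses can see both val and sqrt.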

Exercise 21: Sort all Orders shipped to Mexico and Germany by CustomerID and descending order of ShippedDate.

//: QueryExpressionsSolutions\OrdersToMexicoAndGermany.cs
using System.Linq;

class OrdersToMexicoAndGermany {
  static void Main() {
    var solution =
      from o in Northwind.Orders
      where o.ShipCountry == "Germany" ||
        o.ShipCountry == "Mexico"
      orderby o.CustomerID, o.ShippedDate descending
      select new { o.CustomerID, o.ShippedDate };
    solution.Take(10).P();
    var solutionTranslation1 =
      Northwind.Orders
      .Where(o => o.ShipCountry == "Germany" ||
        o.ShipCountry == "Mexico")


      .OrderBy(o => o.CustomerID)
      .ThenByDescending(o => o.ShippedDate)
      .Select(o => new { o.CustomerID, o.ShippedDate });
    solution.AssertEquals(solutionTranslation1);
    var solutionTranslation2 =
      Enumerable.Select(
        Enumerable.ThenByDescending(
          Enumerable.OrderBy(
            Enumerable.Where(Northwind.Orders,
              o => o.ShipCountry == "Germany" ||
                o.ShipCountry == "Mexico"),
            o => o.CustomerID),
          o => o.ShippedDate),
        o => new { o.CustomerID, o.ShippedDate });
    solutionTranslation1.AssertEquals(solutionTranslation2);
  }
} /* Output:
[
  { CustomerID = ALFKI, ShippedDate = 4/13/1998 12:00:00 AM },
  { CustomerID = ALFKI, ShippedDate = 3/24/1998 12:00:00 AM },
  { CustomerID = ALFKI, ShippedDate = 1/21/1998 12:00:00 AM },
  { CustomerID = ALFKI, ShippedDate = 10/21/1997 12:00:00 AM },
  { CustomerID = ALFKI, ShippedDate = 10/13/1997 12:00:00 AM },
  { CustomerID = ALFKI, ShippedDate = 9/2/1997 12:00:00 AM },
  { CustomerID = ANATR, ShippedDate = 3/11/1998 12:00:00 AM },
  { CustomerID = ANATR, ShippedDate = 12/12/1997 12:00:00 AM },
  { CustomerID = ANATR, ShippedDate = 8/14/1997 12:00:00 AM },
  { CustomerID = ANATR, ShippedDate = 9/24/1996 12:00:00 AM }
]
*///:~

We get more meaningful output using an anonymous type to choose the CustomerID and ShippedDate, and Take( ) to limit the output.


Exercise 22: What’s the UnitPrice of the most expensive Product ever sold?

//: QueryExpressionsSolutions\MostExpensiveProduct.cs
using System.Linq;

class MostExpensiveProduct {
  static void Main() {
    var prices =
      from od in Northwind.OrderDetails
      orderby od.UnitPrice descending
      select od.UnitPrice;
    // One way to get the first item:
    var enumerator = prices.GetEnumerator();
    enumerator.MoveNext().True();
    decimal mostExpensive1 = enumerator.Current;
    mostExpensive1.AssertEquals(263.5m);
    // Better way to get the first item:
    decimal mostExpensive2 = prices.First();
    mostExpensive1.AssertEquals(mostExpensive2);
    // No need for temporary "prices" variable:
    decimal mostExpensive3 =
      (from od in Northwind.OrderDetails
       orderby od.UnitPrice descending
       select od.UnitPrice).First();
    mostExpensive2.AssertEquals(mostExpensive3);
    // Translation:
    decimal mostExpensive4 =
      Northwind.OrderDetails
      .OrderByDescending(od => od.UnitPrice)
      .Select(od => od.UnitPrice)
      .First();
    mostExpensive3.AssertEquals(mostExpensive4);
    decimal mostExpensive5 =
      Enumerable.First(
        Enumerable.Select(
          Enumerable.OrderByDescending(
            Northwind.OrderDetails,
            od => od.UnitPrice),
          od => od.UnitPrice));
    mostExpensive4.AssertEquals(mostExpensive5);
    // Query not necessary. Just use Max():
    decimal mostExpensive6 =
      Northwind.OrderDetails.Max(od => od.UnitPrice);


    mostExpensive5.AssertEquals(mostExpensive6);
  }
} ///:~

The last query alone would suffice, but you’ll find that practice with different approaches helps you understand queries. The beginning of the solution shows the initial approach you should take. We order the largest UnitPrices first, and then use enumerators to extract the first item. You use First( ) (as shown later) to extract the first element. We introduce Max( ) here, which extracts the item directly, and is more efficient than ordering the data and then taking the first item.

Exercise 23: Prove that an OrderBy( ) followed by a ThenBy( ) does not always produce the same results as an OrderBy( ) followed by another OrderBy( ).

//: QueryExpressionsSolutions\MessingUpTheOrder.cs
using System.Linq;

class MessingUpTheOrder {
  static void Main() {
    var wantedCountries =
      new[] { "Venezuela", "Italy" };
    var source = Northwind.Customers.Where(c =>
      wantedCountries.Contains(c.Country));
    var twoOrderBys = source
      .OrderBy(c => c.Country)
      .OrderBy(c => c.ContactName)
      .Select(c => new { c.Country, c.ContactName });
    var orderByThenBy = source
      .OrderBy(c => c.Country)
      .ThenBy(c => c.ContactName)
      .Select(c => new { c.Country, c.ContactName });
    // Compare element by element; Equals() would only
    // compare the two sequence references:
    twoOrderBys.SequenceEqual(orderByThenBy).False();
    twoOrderBys.P("twoOrderBys");
    orderByThenBy.P("orderByThenBy");
  }
} /* Output:
twoOrderBys: [
  { Country = Venezuela, ContactName = Carlos González },


  { Country = Venezuela, ContactName = Carlos Hernández },
  { Country = Venezuela, ContactName = Felipe Izquierdo },
  { Country = Italy, ContactName = Giovanni Rovelli },
  { Country = Venezuela, ContactName = Manuel Pereira },
  { Country = Italy, ContactName = Maurizio Moroni },
  { Country = Italy, ContactName = Paolo Accorti }
]
orderByThenBy: [
  { Country = Italy, ContactName = Giovanni Rovelli },
  { Country = Italy, ContactName = Maurizio Moroni },
  { Country = Italy, ContactName = Paolo Accorti },
  { Country = Venezuela, ContactName = Carlos González },
  { Country = Venezuela, ContactName = Carlos Hernández },
  { Country = Venezuela, ContactName = Felipe Izquierdo },
  { Country = Venezuela, ContactName = Manuel Pereira }
]
*///:~

Printing the results of these two queries shows that the order of the Country values is not preserved after the OrderBy( ) of the ContactNames.
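The same effect can be reproduced without the Northwind data. The following is a minimal sketch, assuming a small hand-made table of country/name pairs (the rows array and the class name OrderingSketch are ours, not part of the book's code). The second OrderBy( ) re-sorts the entire sequence by name, whereas ThenBy( ) only breaks ties within each country.

```csharp
using System;
using System.Linq;

static class OrderingSketch {
  // Hypothetical sample data: { country, name }
  static readonly string[][] rows = {
    new[] { "Venezuela", "Carlos" },
    new[] { "Italy", "Paolo" },
    new[] { "Venezuela", "Manuel" },
    new[] { "Italy", "Giovanni" },
  };
  // The second OrderBy() re-sorts everything by name:
  public static string TwoOrderBys() {
    return string.Join(", ", rows
      .OrderBy(r => r[0])
      .OrderBy(r => r[1])
      .Select(r => r[0] + "/" + r[1]));
  }
  // ThenBy() only orders names within each country:
  public static string OrderByThenBy() {
    return string.Join(", ", rows
      .OrderBy(r => r[0])
      .ThenBy(r => r[1])
      .Select(r => r[0] + "/" + r[1]));
  }
  static void Main() {
    Console.WriteLine(TwoOrderBys());
    Console.WriteLine(OrderByThenBy());
  }
}
```

With this data the first line interleaves the two countries (Carlos, Giovanni, Manuel, Paolo), while the second keeps both Italian names ahead of both Venezuelan names.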

Exercise 24: Sort all the Customer Phone numbers ignoring parentheses. Keep the parentheses, however, in your output.

//: QueryExpressionsSolutions\SortingWithNoParens.cs
using System.Linq;

class SortingWithNoParens {
  static void Main() {
    var ignoringParens =
      from c in Northwind.Customers
      // Remove both kinds of parenthesis, as the
      // translations below do:
      orderby c.Phone.Replace("(", "").Replace(")", "")
      select c.Phone;
    ignoringParens.Take(3).P(POptions.NoNewLines);
    var ignoringParensTranslated =
      Northwind.Customers
      .OrderBy(c =>
        c.Phone.Replace("(", "").Replace(")", ""))
      .Select(c => c.Phone);
    ignoringParens.AssertEquals(ignoringParensTranslated);
    var ignoringParensTranslated2 =
      Enumerable.Select(
        Enumerable.OrderBy(Northwind.Customers,
          c => c.Phone.Replace("(", "")
            .Replace(")", "")),


        c => c.Phone);
  }
} /* Output:
[011-4988260, (02) 201 24 67, 0221-0644327]
*///:~

We remove the parentheses in the orderby, but we don’t want to remove them in the select. The output still contains the parentheses.

Exercise 25: Use Enumerable.Count( ) to find the number of elements in each group of OrderDetails, grouped by their ProductID.

//: QueryExpressionsSolutions\GroupingDetails.cs
using System.Linq;

class GroupingDetails {
  static void Main() {
    var answer1 =
      from od in Northwind.OrderDetails
      orderby od.ProductID
      group od by od.ProductID;
    foreach(var group in answer1.Take(5))
      group.Count().P(group.Key.ToString());
    // Translation:
    var answer2 =
      Northwind.OrderDetails
      .OrderBy(od => od.ProductID)
      .GroupBy(od => od.ProductID);
    answer1.AssertEquals(answer2);
    var answer3 =
      Enumerable.GroupBy(
        Enumerable.OrderBy(Northwind.OrderDetails,
          od => od.ProductID),
        od => od.ProductID);
    answer2.AssertEquals(answer3);
    // Alternative method you'll see
    // later in the book:
    var differentApproach =
      from od in Northwind.OrderDetails
      orderby od.ProductID
      group od by od.ProductID into g
      let count = g.Count()
      orderby count descending
      select new { g.Key, Count = count };
    differentApproach.Take(5).P();


  }
} /* Output:
1: 38
2: 44
3: 12
4: 20
5: 10
[
  { Key = 59, Count = 54 },
  { Key = 24, Count = 51 },
  { Key = 31, Count = 51 },
  { Key = 60, Count = 51 },
  { Key = 56, Count = 50 }
]
*///:~

Note that with the translation there may be no Select( ) at the end, because you can end a query with select or group…by (a special form of select). We take advantage of P( )’s optional descriptive string parameter to display the Key in our output. The first part of the output shows the ProductID with the number of times that Product has been ordered. A more sensible alternative query might be, “What are our most popular products?” We group as we did before, but then use into to order the groups by their (descending) number of elements. You’ll see into in depth later in the chapter.

Exercise 26: List each country and its number of Customers (much like the previous exercise).

//: QueryExpressionsSolutions\CustomerCountryCounts.cs
using System.Linq;

class CustomerCountryCounts {
  static void Main() {
    var solution1 =
      from c in Northwind.Customers
      group c by c.Country;
    foreach(var group in solution1)
      group.Count().P(group.Key);
    var solution2 =
      Northwind.Customers.GroupBy(c => c.Country);
    solution1.AssertEquals(solution2);


    var solution3 =
      Enumerable.GroupBy(Northwind.Customers,
        c => c.Country);
    solution2.AssertEquals(solution3);
  }
} /* Output:
Germany: 11
Mexico: 5
UK: 7
Sweden: 2
France: 11
Spain: 5
Canada: 3
Argentina: 3
Switzerland: 2
Brazil: 9
Austria: 2
Italy: 3
Portugal: 2
USA: 13
Venezuela: 4
Ireland: 1
Belgium: 2
Norway: 1
Denmark: 2
Finland: 2
Poland: 1
*///:~

The translations take up most of the space in this rather simple query (as usual).

Exercise 27: Group all Customers by whether their Phone numbers include parentheses. Although not necessary to solve this exercise, try making a custom IEqualityComparer to accomplish the task.

//: QueryExpressionsSolutions\GroupingByParens.cs
using System;
using System.Linq;
using System.Collections.Generic;

static class StringExtension {
  public static bool HasParens(this string s) {
    return s.Contains("(") && s.Contains(")");
  }


}

class ParensComparer : IEqualityComparer<string> {
  public bool Equals(string left, string right) {
    if(left.HasParens() && right.HasParens())
      return true;
    if(!left.HasParens() && !right.HasParens())
      return true;
    return false;
  }
  public int GetHashCode(string str) {
    return str.HasParens() ? 1 : 0;
  }
}

class GroupingByParens {
  static void Main() {
    Func<Customer, bool> phoneHasParens =
      c => c.Phone.HasParens();
    Func<Customer, bool> phoneDoesNotHaveParens =
      c => !phoneHasParens(c);
    // Simple approach:
    var groups =
      from c in Northwind.Customers
      group c by c.Phone.HasParens();
    groups.Count().AssertEquals(2);
    groups.First().Count(phoneHasParens).AssertEquals(0);
    groups.Last().Count(phoneDoesNotHaveParens)
      .AssertEquals(0);
    // Alternative approach with ParensComparer:
    var grouped =
      Northwind.Customers.GroupBy(o => o.Phone,
        o => o, new ParensComparer());
    grouped.Count().AssertEquals(2);
    var group1 = grouped.First();
    var group2 = grouped.Last();
    group1.Key.AssertEquals("030-0074321");
    group2.Key.AssertEquals("(5) 555-4729");
    group1.Count(phoneHasParens).AssertEquals(0);
    group2.Count(phoneDoesNotHaveParens)
      .AssertEquals(0);
  }
} ///:~


The string extension method HasParens( ) returns true when a string contains both left and right parentheses. ParensComparer is a custom IEqualityComparer for this solution. ParensComparer.Equals( ) returns true when the left and the right either both have or both lack parentheses. GetHashCode( ) returns 1 for every string with parentheses and 0 for every string without, so strings that compare equal also hash equal.

The tricky part of this exercise is that GroupBy( ) must select the Phone as the grouping criterion and then forward all the elements to ParensComparer for the rest of the logic. But once the structure is complete, the actual GroupBy( ) call is rather simple.

We verify our results, relying on methods shown later in the book. First( ) returns the first element in the sequence. Last( ) retrieves the last. Since there are only two groups, these methods come in handy. The version of Count( ) we use here counts the number of items for which the predicate returns true. Note how these declarative methods make code much easier to decipher (and write) compared to C# 2.0 techniques.

The first approach is much more succinct, and the group Keys are also more meaningful (each is a bool indicating whether the group has parentheses or not). The second approach ends up with an arbitrary Phone as each Key (in practice the first one encountered, as the Key assertions show), since all elements are considered equal by whether or not they have parentheses.
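To see how a comparer decides both the grouping and the Key, here is a minimal sketch that groups plain strings instead of Customers. HasParenComparer is a cut-down, hypothetical cousin of ParensComparer (it tests only for a left parenthesis), and the phone strings are sample data chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Same idea as ParensComparer, reduced to one flag:
class HasParenComparer : IEqualityComparer<string> {
  static bool Flag(string s) { return s.Contains("("); }
  public bool Equals(string left, string right) {
    return Flag(left) == Flag(right);
  }
  // Strings that compare equal must hash equal:
  public int GetHashCode(string s) {
    return Flag(s) ? 1 : 0;
  }
}

static class ComparerKeySketch {
  public static string[] Keys() {
    string[] phones = {
      "030-0074321", "(5) 555-4729",
      "0621-08460", "(171) 555-7788"
    };
    // Each group's Key is the first string that
    // landed in that group:
    return phones
      .GroupBy(p => p, new HasParenComparer())
      .Select(g => g.Key + " x" + g.Count())
      .ToArray();
  }
  static void Main() {
    foreach(string key in Keys())
      Console.WriteLine(key);
  }
}
```

Running this prints "030-0074321 x2" and "(5) 555-4729 x2": two groups of two, each labeled with whichever phone number reached the group first. Grouping by a bool, as in the simple approach, gives a Key that describes the group instead.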

Exercise 28: Use a join and a group…by to group all OrderIDs by their Customer ContactNames. (Hint: In the translation, you must pack values in the Join( ) for extraction by GroupBy( )).

//: QueryExpressionsSolutions\CustomerOrders.cs
using System.Linq;

class Join {
  static void Main() {
    var customerOrders1 =
      from c in Northwind.Customers
      join o in Northwind.Orders on
        c.CustomerID equals o.CustomerID
      group o.OrderID by c.ContactName;
    // Translation:
    var customerOrders2 =


      Northwind.Customers
      .Join(Northwind.Orders,
        c => c.CustomerID,
        o => o.CustomerID,
        (c, o) => new { c, o })
      .GroupBy(temp => temp.c.ContactName,
        temp => temp.o.OrderID);
    customerOrders1.AssertEquals(customerOrders2);
    var customerOrders3 =
      Enumerable.GroupBy(
        Enumerable.Join(Northwind.Customers,
          Northwind.Orders,
          c => c.CustomerID,
          o => o.CustomerID,
          (c, o) => new { c, o }),
        temp => temp.c.ContactName,
        temp => temp.o.OrderID);
    customerOrders2.AssertEquals(customerOrders3);
  }
} ///:~

join introduces the iteration variable o, so we can use it in the group clause. The translation requires packing values into temporary types because of the chaining nature of the clauses. Grouping by ContactName is not necessarily a good idea because, within the Northwind database, ContactName is not guaranteed to be unique. However, it happens that in the sample database ContactName is unique, so it makes a good exercise.

Exercise 29: join Customers with itself to find any Customers that have the same ContactName.

//: QueryExpressionsSolutions\IdenticalContactNames.cs
using System.Linq;

class IdenticalContactNames {
  static void Main() {
    var identicalContactNames1 =
      from c1 in Northwind.Customers
      join c2 in Northwind.Customers on
        c1.ContactName equals c2.ContactName
      where c1.CustomerID != c2.CustomerID
      select new {
        ID1 = c1.CustomerID,
        ID2 = c2.CustomerID
      };


    identicalContactNames1.Count().AssertEquals(0);
    // Translation:
    var identicalContactNames2 =
      Northwind.Customers
      .Join(Northwind.Customers,
        c1 => c1.ContactName,
        c2 => c2.ContactName,
        (c1, c2) => new { c1, c2 })
      .Where(temp =>
        temp.c1.CustomerID != temp.c2.CustomerID)
      .Select(temp => new {
        ID1 = temp.c1.CustomerID,
        ID2 = temp.c2.CustomerID
      });
    identicalContactNames2.Count().AssertEquals(0);
    var identicalContactNames3 =
      Enumerable.Select(
        Enumerable.Where(
          Enumerable.Join(Northwind.Customers,
            Northwind.Customers,
            c1 => c1.ContactName,
            c2 => c2.ContactName,
            (c1, c2) => new { c1, c2 }),
          temp =>
            temp.c1.CustomerID != temp.c2.CustomerID),
        temp => new {
          ID1 = temp.c1.CustomerID,
          ID2 = temp.c2.CustomerID
        });
    identicalContactNames3.Count().AssertEquals(0);
    // Alternative approach: join becomes from,
    // on becomes where, equals becomes ==
    var identicalContactNames4 =
      from c1 in Northwind.Customers
      from c2 in Northwind.Customers
      where c1.ContactName == c2.ContactName
      where c1.CustomerID != c2.CustomerID
      select new {
        ID1 = c1.CustomerID,
        ID2 = c2.CustomerID
      };
    identicalContactNames4.Count().AssertEquals(0);
    // Translation a little different:
    var identicalContactNames5 =


      Northwind.Customers
      .SelectMany(c1 => Northwind.Customers,
        (c1, c2) => new { c1, c2 })
      .Where(temp =>
        temp.c1.ContactName == temp.c2.ContactName)
      .Where(temp =>
        temp.c1.CustomerID != temp.c2.CustomerID)
      .Select(temp => new {
        ID1 = temp.c1.CustomerID,
        ID2 = temp.c2.CustomerID
      });
    identicalContactNames5.Count().AssertEquals(0);
  }
} ///:~

No Customers have identical ContactNames. But you shouldn’t rely on this, as only unique CustomerIDs are guaranteed. Code that depended on unique ContactNames would have problems with a duplicate. If you know SQL, you understand that a cross-join with WHERE can do the same as a normal JOIN, and you can JOIN on conditions other than equality (less than, greater than, not equal, etc.). In C#, however, you can join only where one item equals the other, which is almost all join cases. C# also requires any other filtering criteria to be placed in where clauses (for example, we join equal contact names but filter out those with identical CustomerIDs). The last query replaces the first query’s join, on, and equals with from, where, and ==, which naturally changes its translation.

Exercise 30: Pair all Customers together, listing the Customer with the earlier name in the alphabet first. (Hint: A join won’t work for this one. Why not?)

//: QueryExpressionsSolutions\CustomerPairs.cs
using System.Linq;

class CustomerPairs {
  static void Main() {
    // Limit output with Take():
    var source = Northwind.Customers.Take(3);
    var pairs =
      from a in source


      from b in source
      where a.ContactName.CompareTo(b.ContactName) < 0
      select new {
        First = a.ContactName,
        Second = b.ContactName
      };
    pairs.P();
    var pairsTranslated =
      source
      .SelectMany(a => source, (a, b) => new { a, b })
      .Where(temp => temp.a.ContactName
        .CompareTo(temp.b.ContactName) < 0)
      .Select(temp => new {
        First = temp.a.ContactName,
        Second = temp.b.ContactName
      });
    pairs.AssertEquals(pairsTranslated);
    var pairsTranslated2 =
      Enumerable.Select(
        Enumerable.Where(
          Enumerable.SelectMany(source,
            a => source, (a, b) => new { a, b }),
          temp => temp.a.ContactName.CompareTo(
            temp.b.ContactName) < 0),
        temp => new {
          First = temp.a.ContactName,
          Second = temp.b.ContactName
        });
    pairsTranslated.AssertEquals(pairsTranslated2);
  }
} /* Output:
[
  { First = Ana Trujillo, Second = Maria Anders },
  { First = Ana Trujillo, Second = Antonio Moreno },
  { First = Antonio Moreno, Second = Maria Anders }
]
*///:~

We must use a Cartesian product (two froms instead of a single from with a join). Query expressions allow only equi-joins by requiring the equals keyword within the join.


In SQL, you can JOIN on other conditions besides equality, but such joins are rare. To do the same in a C# query expression, you must use two froms with a where, which in many cases may not be as fast as a join.

Exercise 31: List each CustomerID along with the ProductIDs they ordered, as well as the number of times they have ordered that Product. (Hint: group the OrderDetails by an anonymous type that holds each CustomerID with the ProductID, then count how many are in each group).

Notice that the original query takes us straight to the easy-to-read answer (which is also relatively easy to write). The alternative is an imperative approach, used in the verification code at the end of the solution. Since anonymous types override Equals( ) and GetHashCode( ), we can use them as a grouping criterion, as we did here. Thus we group every combination of CustomerID and ProductID together, and Count( ) the number in each group.
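The grouping technique described above can be sketched on a few in-memory rows. This is not the book's solution listing: the rows array is a hypothetical stand-in for Orders already joined with their OrderDetails, and CompositeKeySketch is a name we made up. What it shows is that an anonymous type works as a composite grouping key because two instances with equal property values compare equal.

```csharp
using System;
using System.Linq;

static class CompositeKeySketch {
  public static string[] Counts() {
    // Hypothetical stand-in for joined
    // Order/OrderDetail rows:
    var rows = new[] {
      new { CustomerID = "ALFKI", ProductID = 28 },
      new { CustomerID = "ALFKI", ProductID = 39 },
      new { CustomerID = "ANATR", ProductID = 28 },
      new { CustomerID = "ALFKI", ProductID = 28 },
    };
    var counts =
      from r in rows
      // The anonymous type is the composite key:
      group r by new { r.CustomerID, r.ProductID } into g
      select g.Key.CustomerID + " " + g.Key.ProductID +
        ": " + g.Count();
    return counts.ToArray();
  }
  static void Main() {
    foreach(string line in Counts())
      Console.WriteLine(line);
  }
}
```

The first and last rows share both CustomerID and ProductID, so they fall into one group; the output is "ALFKI 28: 2", "ALFKI 39: 1", and "ANATR 28: 1". With the real data, the rows would come from a join of Orders and OrderDetails on OrderID.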

Exercise 32: Join Customers with Orders on Customer.Country and Order.ShipCountry.

//: QueryExpressionsSolutions\FreakyJoin.cs
using System.Linq;

class FreakyJoin {
  static void Main() {
    var goodPracticeButUseless =
      from c in Northwind.Customers
      join o in Northwind.Orders on
        c.Country equals o.ShipCountry
      select new { c.Country, o.ShipCountry };
    goodPracticeButUseless.All(temp =>
      temp.Country.Equals(temp.ShipCountry)).True();
    var translation1 =
      Northwind.Customers.Join(Northwind.Orders,
        c => c.Country, o => o.ShipCountry,
        (c, o) => new { c.Country, o.ShipCountry });
    goodPracticeButUseless.AssertEquals(translation1);
    var translation2 =
      Enumerable.Join(Northwind.Customers,
        Northwind.Orders,
        c => c.Country, o => o.ShipCountry,
        (c, o) => new { c.Country, o.ShipCountry });


    translation1.AssertEquals(translation2);
  }
} ///:~

As long as the types on both sides of the join match, it doesn’t matter where the values come from. This is not necessarily a useful join, but it demonstrates the concept. All( ) is an extension method you’ll see later. It returns true if the lambda returns true for every element in the sequence.

Exercise 33: You’ve seen iteration variables on the proper sides of the equals. Now try this variation in JoinChains.cs: join Orders to OrderDetails to Customers to Products, in that order. Does this work? Why? (Hint: The translation makes the answer clear.)

//: QueryExpressionsSolutions\AlternativeJoinChains.cs
using System.Linq;

class AlternativeJoinChains {
  static void Main() {
    var customerProducts =
      from o in Northwind.Orders
      join od in Northwind.OrderDetails on
        o.OrderID equals od.OrderID
      join c in Northwind.Customers on
        o.CustomerID equals c.CustomerID
      join p in Northwind.Products on
        od.ProductID equals p.ProductID
      select new { c.ContactName, p.ProductName };
    // This works because transparent identifiers
    // fall into place on the left side of the equals:
    /*c!
    var customerProductsTranslation =
      Northwind.Orders
      .Join(Northwind.OrderDetails,
        o => o.OrderID,
        od => od.OrderID,
        (o, od) => new { o, od })
      .Join(Northwind.Customers,
        // We use "t" in lieu of "transparent":
        t1 => o.CustomerID,
        c => c.CustomerID,
        (t1, c) => new { t1, c })
      .Join(Northwind.Products,


        t2 => od.ProductID,
        p => p.ProductID,
        (t2, p) =>
          new { c.ContactName, p.ProductName });
    */
    var customerProductsTranslation2 =
      Northwind.Orders
      .Join(Northwind.OrderDetails,
        o => o.OrderID,
        od => od.OrderID,
        (o, od) => new { o, od })
      .Join(Northwind.Customers,
        nt1 => nt1.o.CustomerID,
        c => c.CustomerID,
        (nt1, c) => new { nt1, c })
      .Join(Northwind.Products,
        nt2 => nt2.nt1.od.ProductID,
        p => p.ProductID,
        (nt2, p) =>
          new { nt2.c.ContactName, p.ProductName });
    customerProducts.AssertEquals(
      customerProductsTranslation2);
    var customerProductsTranslation3 =
      Enumerable.Join(
        Enumerable.Join(
          Enumerable.Join(Northwind.Orders,
            Northwind.OrderDetails,
            o => o.OrderID,
            od => od.OrderID,
            (o, od) => new { o, od }),
          Northwind.Customers,
          nt1 => nt1.o.CustomerID,
          c => c.CustomerID,
          (nt1, c) => new { nt1, c }),
        Northwind.Products,
        nt2 => nt2.nt1.od.ProductID,
        p => p.ProductID,
        (nt2, p) =>
          new { nt2.c.ContactName, p.ProductName });
    customerProductsTranslation2.AssertEquals(
      customerProductsTranslation3);
  }
} ///:~

Transparent identifiers make this work. As the Join( ) calls chain, the compiler keeps packing the iteration variables from any level of from or join into temporary anonymous types. Thus any of these iteration variables are available on the left side of the equals. Your only restriction is that you must use the new iteration variable on the right side of the equals (or capture a variable outside your query, but such a need should be rare or non-existent).

Exercise 34: Write two queries to retrieve all Orders, pairing each Order with its OrderDetails into an anonymous type:
a. The first query uses a join


b. The second query uses multiple froms

//: QueryExpressionsSolutions\JoinVsFrom.cs
using System.Linq;

class JoinVsFrom {
  static void Main() {
    var withJoin =
      from o in Northwind.Orders
      join od in Northwind.OrderDetails
        on o.OrderID equals od.OrderID
      select new { o, od };
    var withJoinTranslated =
      Northwind.Orders.Join(Northwind.OrderDetails,
        o => o.OrderID, od => od.OrderID,
        (o, od) => new { o, od });
    withJoin.AssertEquals(withJoinTranslated);
    var withMultipleFroms =
      from o in Northwind.Orders
      from od in Northwind.OrderDetails
      where o.OrderID == od.OrderID
      select new { o, od };
    withJoinTranslated.AssertEquals(withMultipleFroms);
    var withMultipleFromsTranslated =
      Northwind.Orders
      .SelectMany(o => Northwind.OrderDetails,
        (o, od) => new { o, od })
      .Where(temp => temp.o.OrderID == temp.od.OrderID)
      // The compiler doesn't need to insert this
      // next step, but it does anyway:
      .Select(temp => new { temp.o, temp.od });
    withMultipleFroms.AssertEquals(
      withMultipleFromsTranslated);
  }
} ///:~

Again, the only syntactic difference between a join and two froms is whether you use on vs. where and equals vs. ==. Notice the compiler combines the join and select clause into a single Join( ) call. With multiple from clauses, the compiler follows the rules outlined in the [[[Multiple Froms]]] section of the book. Thus, we get an extra SelectMany( ) and Where( ), whereas the first query requires only a single Join( ).


join is your best bet when joining on equality, which is the most common join of all. It’s so common, in fact, it has its own term: equijoin.
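When the condition is not an equality, join cannot express it, and multiple froms with a where take over. Here is a minimal sketch of such a non-equijoin; the two arrays are made-up sample data, not part of Northwind.

```csharp
using System;
using System.Linq;

static class NonEquijoinSketch {
  public static string[] Run() {
    int[] small = { 1, 2, 3 };
    int[] large = { 2, 3 };
    // "join ... on ... equals" cannot express "less than",
    // so we fall back to two froms and a where:
    var pairs =
      from s in small
      from l in large
      where s < l
      select s + "<" + l;
    return pairs.ToArray();
  }
  static void Main() {
    Console.WriteLine(string.Join(" ", Run()));
  }
}
```

The price is that every pairing is generated before the where filters it, whereas Join( ) can match keys directly.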

Exercise 35: Use nested queries to find all the Customer ContactNames in the three countries that have the most Orders. (Hint: Write a query in the where clause to find the three countries with the most orders, then combine that query with Enumerable.Contains( ) in the where.)

//: QueryExpressionsSolutions\CountriesWithMostOrders.cs
using System.Linq;
class CountriesWithMostOrders {
  static void Main() {
    var topThreeCountries =
      from c in Northwind.Customers
      where (from g in
               from g in
                 from o in Northwind.Orders
                 group o by o.ShipCountry
               orderby g.Count() descending
               select g
             select g.Key).Take(3).Contains(c.Country)
      orderby c.ContactName
      select c.ContactName;
    var topThreeCountriesTranslation1 =
      Northwind.Customers
      .Where(c =>
        Northwind.Orders.GroupBy(o => o.ShipCountry)
        .OrderByDescending(g => g.Count())
        .Select(g => g.Key).Take(3).Contains(c.Country))
      .OrderBy(c => c.ContactName)
      .Select(c => c.ContactName);
    topThreeCountries.AssertEquals(
      topThreeCountriesTranslation1);
    var topThreeCountriesTranslation2 =
      Enumerable.Select(
        Enumerable.OrderBy(
          Enumerable.Where(Northwind.Customers,
            c => Enumerable.Contains(
              Enumerable.Take(
                Enumerable.Select(
                  Enumerable.OrderByDescending(
                    Enumerable.GroupBy(Northwind.Orders,
                      o => o.ShipCountry),
                    g => g.Count()),
                  g => g.Key),
                3),
              c.Country)),
          c => c.ContactName),
        c => c.ContactName);
    topThreeCountriesTranslation1.AssertEquals(
      topThreeCountriesTranslation2);
  }
} ///:~

Nested queries sometimes help clarify a query. Here, for example, we retrieve the top three countries, then get all the customers in those countries. The translation shows that there is really only one nested query, not two as the original query makes it appear. That nested query is embedded in the Where( ) call’s lambda expression. This is a huge benefit of query expressions: they link together seamlessly.

Exercise 36: Rewrite the first query in GroupingDifferentTypes.cs to combine each group’s Key and Count( ) into a single anonymous object (instead of having to Count( ) within the foreach).

//: QueryExpressionsSolutions\GroupingDifferentTypes.cs
using System.Linq;
class GroupingDifferentTypes {
  static void Main() {
    // Group actual elements:
    var grouped =
      from c in Northwind.Customers
      group c by c.Country into g
      select new { Country = g.Key, Count = g.Count() };
    grouped.P();
    var groupedTranslation1 =
      from g in
        from c in Northwind.Customers
        group c by c.Country
      select new { Country = g.Key, Count = g.Count() };
    grouped.AssertEquals(groupedTranslation1);
    var groupedTranslation2 =
      from g in
        Northwind.Customers.GroupBy(c => c.Country)
      select new { Country = g.Key, Count = g.Count() };
    groupedTranslation1.AssertEquals(groupedTranslation2);
    var groupedTranslation3 =
      Northwind.Customers
      .GroupBy(c => c.Country)
      .Select(g =>
        new { Country = g.Key, Count = g.Count() });
    groupedTranslation2.AssertEquals(groupedTranslation3);
    var groupedTranslation4 =
      Enumerable.Select(
        Enumerable.GroupBy(Northwind.Customers,
          c => c.Country),
        g => new { Country = g.Key, Count = g.Count() });
    groupedTranslation3.AssertEquals(groupedTranslation4);
  }
} /* Output:
[
  { Country = Germany, Count = 11 },
  { Country = Mexico, Count = 5 },
  { Country = UK, Count = 7 },
  { Country = Sweden, Count = 2 },
  { Country = France, Count = 11 },
  { Country = Spain, Count = 5 },
  { Country = Canada, Count = 3 },
  { Country = Argentina, Count = 3 },
  { Country = Switzerland, Count = 2 },
  { Country = Brazil, Count = 9 },
  { Country = Austria, Count = 2 },
  { Country = Italy, Count = 3 },
  { Country = Portugal, Count = 2 },
  { Country = USA, Count = 13 },
  { Country = Venezuela, Count = 4 },
  { Country = Ireland, Count = 1 },
  { Country = Belgium, Count = 2 },
  { Country = Norway, Count = 1 },
  { Country = Denmark, Count = 2 },
  { Country = Finland, Count = 2 },
  { Country = Poland, Count = 1 }
]
*///:~

Writing out all the translations, though tedious, helps you understand what the compiler is doing. Notice the first translation has no method calls in it yet. It simply rewrites the top query into a nested bottom query. The first query is much more succinct than GroupingDifferentTypes.cs’s original approach with the extra foreach loop. P( ) simply iterates over the results and displays them on the console.

Exercise 37: Use into to find all the Customers in the three countries with the most orders. This is a slight revision of [[[exercise x]]]

//: QueryExpressionsSolutions\MostOrdersWithInto.cs
using System.Linq;
class MostOrdersWithInto {
  static void Main() {
    var topThreeCountries =
      from c in Northwind.Customers
      where (from o in Northwind.Orders
             group o by o.ShipCountry into g
             orderby g.Count() descending
             select g.Key).Take(3).Contains(c.Country)
      orderby c.ContactName
      select c.ContactName;
    // Translation is same as we saw in
    // previous exercise's solution.
  }
} ///:~

Our solution to the original exercise required a lot of nested queries. However, into removed two layers of from.

Exercise 38: List all CustomerIDs along with the ProductIDs they ordered combined with the number of times they ordered that product. Sort the results by descending number of times that the Customer ordered each Product. Use a group into for this repeat of Exercise {#?field code}.

//: QueryExpressionsSolutions\CustomerProductsWithInto.cs
using System.Linq;
class CustomerProducts {
  static void Main() {
    var customerProducts =
      from c in Northwind.Customers
      join o in Northwind.Orders
        on c.CustomerID equals o.CustomerID
      join od in Northwind.OrderDetails
        on o.OrderID equals od.OrderID
      // Can't use a join into (discussed later)
      // because grouping by different criteria:
      group od by
        new { c.ContactName, od.ProductID } into g
      let Count = g.Count()
      orderby Count descending
      select new {
        g.Key.ContactName,
        g.Key.ProductID,
        Count
      };
    customerProducts.Take(20).P();
    // You don't know how to translate let clauses,
    // but we do the translation for completeness.
    // Change "into" to a nested query:
    var customerProductsTranslation1 =
      from g in
        from c in Northwind.Customers
        join o in Northwind.Orders
          on c.CustomerID equals o.CustomerID
        join od in Northwind.OrderDetails
          on o.OrderID equals od.OrderID
        // Can't use a join into (discussed later)
        // because grouping by different criteria:
        group od by new { c.ContactName, od.ProductID }
      let Count = g.Count()
      orderby Count descending
      select new {
        g.Key.ContactName,
        g.Key.ProductID,
        Count
      };
    customerProducts.AssertEquals(
      customerProductsTranslation1);
    // Convert let clause to transparent identifier,
    // then non-transparent identifier (doing both
    // steps in one here):
    var customerProductsTranslation2 =
      from nt in // "nonTransparent"
        from g in
          from c in Northwind.Customers
          join o in Northwind.Orders
            on c.CustomerID equals o.CustomerID
          join od in Northwind.OrderDetails
            on o.OrderID equals od.OrderID
          // Can't use a join into (discussed later)
          // because grouping by different criteria:
          group od by new { c.ContactName, od.ProductID }
        select new { g, Count = g.Count() }
      orderby nt.Count descending
      select new {
        nt.g.Key.ContactName,
        nt.g.Key.ProductID,
        nt.Count
      };
    customerProductsTranslation1.AssertEquals(
      customerProductsTranslation2);
    // Now onto extension-method syntax:
    var customerProductsTranslation3 =
      Northwind.Customers.
      Join(Northwind.Orders, c => c.CustomerID,
        o => o.CustomerID, (c, o) => new { c, o }).
      Join(Northwind.OrderDetails,
        temp => temp.o.OrderID, od => od.OrderID,
        (temp, od) => new { temp, od }).
      GroupBy(temp2 =>
        new { temp2.temp.c.ContactName,
          temp2.od.ProductID },
        temp2 => temp2.od).
      Select(g => new { g, Count = g.Count() }).
      OrderByDescending(nt => nt.Count).
      Select(nt => new {
        nt.g.Key.ContactName,
        nt.g.Key.ProductID,
        nt.Count
      });
    customerProductsTranslation2.AssertEquals(
      customerProductsTranslation3);
    // Static method calls:
    var customerProductsTranslation4 =
      Enumerable.Select(
        Enumerable.OrderByDescending(
          Enumerable.Select(
            Enumerable.GroupBy(
              Enumerable.Join(
                Enumerable.Join(Northwind.Customers,
                  Northwind.Orders,
                  c => c.CustomerID, o => o.CustomerID,
                  (c, o) => new { c, o }),
                Northwind.OrderDetails,
                temp => temp.o.OrderID, od => od.OrderID,
                (temp, od) => new { temp, od }),
              temp2 => new { temp2.temp.c.ContactName,
                temp2.od.ProductID },
              temp2 => temp2.od),
            g => new { g, Count = g.Count() }),
          nt => nt.Count),
        nt => new {
          nt.g.Key.ContactName,
          nt.g.Key.ProductID,
          nt.Count
        });
    customerProductsTranslation3.AssertEquals(
      customerProductsTranslation4);
  }
} /* Output:
[
  { ContactName = Jose Pavarotti, ProductID = 2, Count = 5 },
  { ContactName = Roland Mendel, ProductID = 24, Count = 4 },
  { ContactName = Roland Mendel, ProductID = 64, Count = 4 },
  { ContactName = Roland Mendel, ProductID = 31, Count = 4 },
  { ContactName = Roland Mendel, ProductID = 17, Count = 4 },
  { ContactName = Patricia McKenna, ProductID = 71, Count = 4 },
  { ContactName = Horst Kloss, ProductID = 42, Count = 4 },
  { ContactName = Horst Kloss, ProductID = 60, Count = 4 },
  { ContactName = Paula Wilson, ProductID = 56, Count = 4 },
  { ContactName = Paula Wilson, ProductID = 17, Count = 4 },
  { ContactName = Paula Wilson, ProductID = 62, Count = 4 },
  { ContactName = Jose Pavarotti, ProductID = 56, Count = 4 },
  { ContactName = Jose Pavarotti, ProductID = 41, Count = 4 },
  { ContactName = Jose Pavarotti, ProductID = 68, Count = 4 },
  { ContactName = Jose Pavarotti, ProductID = 13, Count = 4 },
  { ContactName = Palle Ibsen, ProductID = 77, Count = 4 },
  { ContactName = Thomas Hardy, ProductID = 31, Count = 3 },
  { ContactName = Christina Berglund, ProductID = 75, Count = 3 },
  { ContactName = Christina Berglund, ProductID = 54, Count = 3 },
  { ContactName = Hanna Moos, ProductID = 21, Count = 3 }
]
*///:~

The multiple join clauses cause the compiler to pack values into temporary anonymous types. We’ll show you how to translate let clauses in the next section.

Exercise 39: Show all the intermediate steps of translating the last query in RootFinder.cs.

//: QueryExpressionsSolutions\RootFinderTranslation.cs
using System;
using System.Linq;
using System.Collections.Generic;
class Coefficients {
  public double A { get; set; }
  public double B { get; set; }
  public double C { get; set; }
}
class RootFinder {
  static void Main() {
    // Coefficients for the quadratic formula
    var coefficients = new List<Coefficients> {
      new Coefficients { A = 1, B = 3, C = -4},
      new Coefficients { A = 2, B = -4, C = -3},
      new Coefficients { A = 1, B = -2, C = -4}
    };
    var roots =
      from c in coefficients
      let negativeB = -c.B
      let bSquared = Math.Pow(c.B, 2)
      let fourAC = 4 * c.A * c.C
      let sqrtBsquaredMinusFourAC =
        Math.Sqrt(bSquared - fourAC)
      let twoA = 2 * c.A
      select new {
        FirstRoot =
          (negativeB + sqrtBsquaredMinusFourAC) / twoA,
        SecondRoot =
          (negativeB - sqrtBsquaredMinusFourAC) / twoA
      };
    /*c!
    var rootsTranslation1 =
      from nt1 in // "nonTransparent1"
        from c in coefficients
        select new { c, negativeB = -c.B }
      let bSquared = Math.Pow(c.B, 2)
      let fourAC = 4 * c.A * c.C
      let sqrtBsquaredMinusFourAC =
        Math.Sqrt(bSquared - fourAC)
      let twoA = 2 * c.A
      select new {
        FirstRoot =
          (negativeB + sqrtBsquaredMinusFourAC) / twoA,
        SecondRoot =
          (negativeB - sqrtBsquaredMinusFourAC) / twoA
      };
    */
    /*c!
    var rootsTranslation2 =
      from nt2 in
        from nt1 in
          from c in coefficients
          select new { c, negativeB = -c.B }
        select new { nt1, bSquared = Math.Pow(c.B, 2) }
      let fourAC = 4 * c.A * c.C
      let sqrtBsquaredMinusFourAC =
        Math.Sqrt(bSquared - fourAC)
      let twoA = 2 * c.A
      select new {
        FirstRoot =
          (negativeB + sqrtBsquaredMinusFourAC) / twoA,
        SecondRoot =
          (negativeB - sqrtBsquaredMinusFourAC) / twoA
      };
    */
    /*c!
    var rootsTranslation3 =
      from nt3 in
        from nt2 in
          from nt1 in
            from c in coefficients
            select new { c, negativeB = -c.B }
          select new { nt1, bSquared = Math.Pow(c.B, 2) }
        select new { nt2, fourAC = 4 * c.A * c.C }
      let sqrtBsquaredMinusFourAC =
        Math.Sqrt(bSquared - fourAC)
      let twoA = 2 * c.A
      select new {
        FirstRoot =
          (negativeB + sqrtBsquaredMinusFourAC) / twoA,
        SecondRoot =
          (negativeB - sqrtBsquaredMinusFourAC) / twoA
      };
    */
    /*c!
    var rootsTranslation4 =
      from nt4 in
        from nt3 in
          from nt2 in
            from nt1 in
              from c in coefficients
              select new { c, negativeB = -c.B }
            select new { nt1,
              bSquared = Math.Pow(c.B, 2) }
          select new { nt2, fourAC = 4 * c.A * c.C }
        select new { nt3, sqrtBsquaredMinusFourAC =
          Math.Sqrt(bSquared - fourAC) }
      let twoA = 2 * c.A
      select new {
        FirstRoot =
          (negativeB + sqrtBsquaredMinusFourAC) / twoA,
        SecondRoot =
          (negativeB - sqrtBsquaredMinusFourAC) / twoA
      };
    */
    /*c!
    var rootsTranslation5 =
      from nt5 in
        from nt4 in
          from nt3 in
            from nt2 in
              from nt1 in
                from c in coefficients
                select new { c, negativeB = -c.B }
              select new { nt1,
                bSquared = Math.Pow(c.B, 2) }
            select new { nt2, fourAC = 4 * c.A * c.C }
          select new { nt3, sqrtBsquaredMinusFourAC =
            Math.Sqrt(bSquared - fourAC) }
        select new { nt4, twoA = 2 * c.A }
      select new {
        FirstRoot =
          (negativeB + sqrtBsquaredMinusFourAC) / twoA,
        SecondRoot =
          (negativeB - sqrtBsquaredMinusFourAC) / twoA
      };
    */
    // Scope all the transparent identifier variables:
    var rootsTranslation6 =
      from nt5 in
        from nt4 in
          from nt3 in
            from nt2 in
              from nt1 in
                from c in coefficients
                select new { c, negativeB = -c.B }
              select new { nt1,
                bSquared = Math.Pow(nt1.c.B, 2) }
            select new { nt2,
              fourAC = 4 * nt2.nt1.c.A * nt2.nt1.c.C }
          select new { nt3, sqrtBsquaredMinusFourAC =
            Math.Sqrt(nt3.nt2.bSquared - nt3.fourAC) }
        select new { nt4,
          twoA = 2 * nt4.nt3.nt2.nt1.c.A }
      select new {
        FirstRoot = (nt5.nt4.nt3.nt2.nt1.negativeB +
          nt5.nt4.sqrtBsquaredMinusFourAC) / nt5.twoA,
        SecondRoot = (nt5.nt4.nt3.nt2.nt1.negativeB -
          nt5.nt4.sqrtBsquaredMinusFourAC) / nt5.twoA
      };
    roots.AssertEquals(rootsTranslation6);
    // Convert to extension method syntax:
    var rootsTranslation7 =
      coefficients.
      Select(c => new { c, negativeB = -c.B }).
      Select(nt1 => new { nt1,
        bSquared = Math.Pow(nt1.c.B, 2) }).
      Select(nt2 => new { nt2,
        fourAC = 4 * nt2.nt1.c.A * nt2.nt1.c.C }).
      Select(nt3 => new { nt3, sqrtBsquaredMinusFourAC =
        Math.Sqrt(nt3.nt2.bSquared - nt3.fourAC) }).
      Select(nt4 => new { nt4,
        twoA = 2 * nt4.nt3.nt2.nt1.c.A }).
      Select(nt5 => new {
        FirstRoot = (nt5.nt4.nt3.nt2.nt1.negativeB +
          nt5.nt4.sqrtBsquaredMinusFourAC) / nt5.twoA,
        SecondRoot = (nt5.nt4.nt3.nt2.nt1.negativeB -
          nt5.nt4.sqrtBsquaredMinusFourAC) / nt5.twoA
      });
    rootsTranslation6.AssertEquals(rootsTranslation7);
    // Convert to static method calls:
    var rootsTranslation8 =
      Enumerable.Select(
        Enumerable.Select(
          Enumerable.Select(
            Enumerable.Select(
              Enumerable.Select(
                Enumerable.Select(coefficients,
                  c => new { c, negativeB = -c.B }),
                nt1 => new { nt1,
                  bSquared = Math.Pow(nt1.c.B, 2) }),
              nt2 => new { nt2,
                fourAC = 4 * nt2.nt1.c.A * nt2.nt1.c.C }),
            nt3 => new { nt3, sqrtBsquaredMinusFourAC =
              Math.Sqrt(nt3.nt2.bSquared - nt3.fourAC) }),
          nt4 => new { nt4,
            twoA = 2 * nt4.nt3.nt2.nt1.c.A }),
        nt5 => new {
          FirstRoot = (nt5.nt4.nt3.nt2.nt1.negativeB +
            nt5.nt4.sqrtBsquaredMinusFourAC) / nt5.twoA,
          SecondRoot = (nt5.nt4.nt3.nt2.nt1.negativeB -
            nt5.nt4.sqrtBsquaredMinusFourAC) / nt5.twoA
        });
    rootsTranslation7.AssertEquals(rootsTranslation8);
  }
} ///:~

These laborious translations pay off when your query doesn’t compile. Practice with translations helps you to figure out how the compiler produces error messages.

Exercise 40: The Customer’s ContactName field combines a space-separated first name and last name (and sometimes a middle name or middle token, such as “de Castro”). Using let clauses, write a query that retrieves first and last (or last part of) names, and packs the two strings into an anonymous type with FirstName and LastName properties.

//: QueryExpressionsSolutions\CustomerNames.cs
using System.Linq;
class CustomerNames {
  static void Main() {
    var names =
      from c in Northwind.Customers
      let parts = c.ContactName.Split()
      let FirstName = parts[0]
      let LastName =
        parts.Length == 2 ? parts[1] : parts[2]
      orderby LastName, FirstName
      select new { LastName, FirstName };
    // We'll forgo rewriting the translations
    // having transparent identifiers.
    // Rewrite first let clause:
    var namesTranslation1 =
      from nt1 in // nt1 means "nonTransparent1"
        from c in Northwind.Customers
        select new { c, parts = c.ContactName.Split() }
      let FirstName = nt1.parts[0]
      let LastName = nt1.parts.Length == 2 ?
        nt1.parts[1] : nt1.parts[2]
      orderby LastName, FirstName
      select new { LastName, FirstName };
    names.AssertEquals(namesTranslation1);
    // Rewrite second let clause:
    var namesTranslation2 =
      from nt2 in
        from nt1 in
          from c in Northwind.Customers
          select new { c, parts = c.ContactName.Split() }
        select new { nt1, FirstName = nt1.parts[0] }
      let LastName = nt2.nt1.parts.Length == 2 ?
        nt2.nt1.parts[1] : nt2.nt1.parts[2]
      orderby LastName, nt2.FirstName
      select new { LastName, nt2.FirstName };
    namesTranslation1.AssertEquals(namesTranslation2);
    // Rewrite third let clause:
    var namesTranslation3 =
      from nt3 in
        from nt2 in
          from nt1 in
            from c in Northwind.Customers
            select new { c,
              parts = c.ContactName.Split() }
          select new { nt1, FirstName = nt1.parts[0] }
        select new { nt2,
          LastName = nt2.nt1.parts.Length == 2 ?
            nt2.nt1.parts[1] : nt2.nt1.parts[2] }
      orderby nt3.LastName, nt3.nt2.FirstName
      select new { nt3.LastName, nt3.nt2.FirstName };
    namesTranslation2.AssertEquals(namesTranslation3);
    // Rewrite to extension-method calls:
    var namesTranslation4 =
      Northwind.Customers.
      Select(
        c => new { c, parts = c.ContactName.Split() }).
      Select(nt1 =>
        new { nt1, FirstName = nt1.parts[0] }).
      Select(nt2 => new { nt2,
        LastName = nt2.nt1.parts.Length == 2 ?
          nt2.nt1.parts[1] : nt2.nt1.parts[2] }).
      OrderBy(nt3 => nt3.LastName).
      ThenBy(nt3 => nt3.nt2.FirstName).
      Select(nt3 =>
        new { nt3.LastName, nt3.nt2.FirstName });
    namesTranslation3.AssertEquals(namesTranslation4);
    // Rewrite to static method calls:
    var namesTranslation5 =
      Enumerable.Select(
        Enumerable.ThenBy(
          Enumerable.OrderBy(
            Enumerable.Select(
              Enumerable.Select(
                Enumerable.Select(Northwind.Customers,
                  c => new { c,
                    parts = c.ContactName.Split() }),
                nt1 => new { nt1,
                  FirstName = nt1.parts[0] }),
              nt2 => new { nt2,
                LastName = nt2.nt1.parts.Length == 2 ?
                  nt2.nt1.parts[1] : nt2.nt1.parts[2] }),
            nt3 => nt3.LastName),
          nt3 => nt3.nt2.FirstName),
        nt3 => new { nt3.LastName, nt3.nt2.FirstName });
    namesTranslation4.AssertEquals(namesTranslation5);
  }
} ///:~

The let clauses make this query more readable and easier to maintain than it would be if we had to repeat the expressions behind parts, FirstName, and LastName everywhere they are used.

Exercise 41: Try to change both queries in IntoVsLet.cs to select each value and the sum of its square into an anonymous type, instead of just the square. See what kind of compiler messages this produces.

//: QueryExpressionsSolutions\IntoVsLetTranslation.cs
using System.Linq;
class IntoVsLetTranslation {
  static void Main() {
    var numbers = new[] { 1, 3, 2 };
    var withLet =
      from n in numbers
      let square = n * n // "let" version
      select new { n, SumOfSquare = square + square};
    /*c!
    // Doesn't compile because n not
    // in scope in last select clause:
    var withInto =
      from n in numbers
      select n * n into square // "into" version
      select new { n, SumOfSquare = square + square };
    */
    var withLetTranslation1 =
      from nt in
        from n in numbers
        select new { n, square = n * n }
      select new { nt.n,
        SumOfSquare = nt.square + nt.square };
    /*c!
    // Notice where n is available:
    var withIntoTranslation1 =
      from square in
        // n is only in scope within the nested query:
        from n in numbers
        select n * n
      // So can't access n here!:
      select new { n, SumOfSquare = square + square };
    */
    withLet.AssertEquals(withLetTranslation1);
    var withLetTranslation2 =
      numbers
      .Select(n => new { n, square = n * n })
      .Select(nt => new { nt.n,
        SumOfSquare = nt.square + nt.square});
    withLetTranslation1.AssertEquals(withLetTranslation2);
    /*c!
    var withIntoTranslation2 =
      numbers
      .Select(n => n * n)
      .Select(square =>
        // No n in scope:
        new { n, SumOfSquare = square + square });
    */
  }
} ///:~

This solution, combined with the translations, stresses the scoping differences between a let and an into. withInto’s query seems like it should work, but once you see the last line of the last translation, the problem becomes apparent. It attempts to Select( ) an n that doesn’t exist as a lambda parameter, and there is no outer scope from which to capture it.
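If you do want to use into here, one workaround is to carry n along explicitly. The following is a minimal sketch rather than part of the original solution; the names pair and IntoScopeSketch are made up for illustration.

```csharp
using System;
using System.Linq;

static class IntoScopeSketch {
  public static string[] Run() {
    var numbers = new[] { 1, 3, 2 };
    // "into" starts a fresh scope, so anything needed
    // afterward must travel inside the selected value:
    var withInto =
      from n in numbers
      select new { n, square = n * n } into pair
      select pair.n + ":" + (pair.square + pair.square);
    return withInto.ToArray();
  }
  static void Main() {
    Console.WriteLine(string.Join(", ", Run()));
  }
}
```

This is essentially what let does for you: it packs the existing variables and the new one into an anonymous type behind the scenes.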

Exercise 42: Find all Products never ordered using join into and Count( ).

//: QueryExpressionsSolutions\NeverOrderedProducts.cs
using System.Linq;
class NeverOrderedProducts {
  static void Main() {
    var neverOrderedProducts =
      from p in Northwind.Products
      join od in Northwind.OrderDetails
        on p.ProductID equals od.ProductID into grouped
      where grouped.Count() == 0
      select p;
    neverOrderedProducts.Count().AssertEquals(0);
    /*c!
    // This step has transparent identifiers:
    var neverOrderedProductsTranslation1 =
      Northwind.Products
      .GroupJoin(Northwind.OrderDetails,
        p => p.ProductID, od => od.ProductID,
        (p, grouped) => new { p, grouped })
      .Where(transparent => grouped.Count() == 0)
      .Select(transparent => p);
    */
    var neverOrderedProductsTranslation2 =
      Northwind.Products
      .GroupJoin(Northwind.OrderDetails,
        p => p.ProductID, od => od.ProductID,
        (p, grouped) => new { p, grouped })
      // nt means "non-transparent":
      .Where(nt => nt.grouped.Count() == 0)
      .Select(nt => nt.p);
    neverOrderedProducts.AssertEquals(
      neverOrderedProductsTranslation2);
    var neverOrderedProductsTranslation3 =
      Enumerable.Select(
        Enumerable.Where(
          Enumerable.GroupJoin(Northwind.Products,
            Northwind.OrderDetails,
            p => p.ProductID, od => od.ProductID,
            (p, grouped) => new { p, grouped }),
          nt => nt.grouped.Count() == 0),
        nt => nt.p);
    neverOrderedProductsTranslation2.AssertEquals(
      neverOrderedProductsTranslation3);
    // Better approach using Enumerable.Except():
    var orderedProductIds =
      Northwind.OrderDetails.Select(od => od.ProductID);
    var allProductIds =
      Northwind.Products.Select(p => p.ProductID);
    allProductIds.Except(orderedProductIds).Count()
      .AssertEquals(0);
  }
} ///:~

The join into helps considerably here: it produces one sequence of OrderDetails per Product, so we simply Count( ) the elements in each sequence. (It turns out that every Northwind product has been ordered at least once.)

Joining OrderDetails to Products instead of Products to OrderDetails is incorrect because there are no OrderDetails that do not have an associated Product (all counts are non-zero).
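Because every Northwind product has been ordered, the solution never shows a non-empty result. The sketch below uses made-up in-memory data (the products and orderLines arrays and their property names are assumptions for illustration) in which one product has no matching order lines, so the empty group becomes visible.

```csharp
using System;
using System.Linq;

static class EmptyGroupSketch {
  public static string[] Run() {
    // Hypothetical data: "Jam" has no order lines.
    var products = new[] {
      new { Id = 1, Name = "Tea" },
      new { Id = 2, Name = "Jam" } };
    var orderLines = new[] {
      new { ProductId = 1 },
      new { ProductId = 1 } };
    var neverOrdered =
      from p in products
      join line in orderLines
        on p.Id equals line.ProductId into lines
      // The outer element appears even when its group is empty:
      where lines.Count() == 0
      select p.Name;
    return neverOrdered.ToArray();
  }
  static void Main() {
    Console.WriteLine(string.Join(", ", Run()));
  }
}
```

Swapping the roles of the two sources would start from orderLines, each of which always has a product, so no empty group could ever appear.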

Exercise 43: Group Orders with Customers rather than Customers with Orders in GroupingWhileJoining.cs.

//: QueryExpressions\JoiningBackwards.cs
using System.Linq;
class JoiningBackwards {
  static void Main() {
    var customerOrderCounts =
      from o in Northwind.Orders
      join c in Northwind.Customers
        on o.CustomerID equals c.CustomerID into grouped
      orderby o.CustomerID
      select new {
        o.CustomerID,
        NumCustomers = grouped.Count()
      };
    customerOrderCounts.Take(10).P();
  }
} /* Output:
[
  { CustomerID = ALFKI, NumCustomers = 1 },
  { CustomerID = ALFKI, NumCustomers = 1 },
  { CustomerID = ALFKI, NumCustomers = 1 },
  { CustomerID = ALFKI, NumCustomers = 1 },
  { CustomerID = ALFKI, NumCustomers = 1 },
  { CustomerID = ALFKI, NumCustomers = 1 },
  { CustomerID = ANATR, NumCustomers = 1 },
  { CustomerID = ANATR, NumCustomers = 1 },
  { CustomerID = ANATR, NumCustomers = 1 },
  { CustomerID = ANATR, NumCustomers = 1 }
]
*///:~

As stated in the text, this query produces groups of just one element, and each element is now a Customer object instead of an Order, which isn’t useful.

Exercise 44: Use a join into to find Customers who have made at least $15,000 in Orders.

//: QueryExpressionsSolutions\ProfitableCustomers.cs
using System.Linq;
class ProfitableCustomers {
  static void Main() {
    var solution =
      (from c in Northwind.Customers
       join o in Northwind.Orders
         on c.CustomerID equals o.CustomerID
       join od in Northwind.OrderDetails
         on o.OrderID equals od.OrderID into grouped
       where grouped.Sum(
         od => od.Quantity * od.UnitPrice) > 15000
       select c.ContactName).Distinct();
    /*c!
    // Transparent identifier step:
    var solutionTranslation1 =
      Northwind.Customers
      .Join(Northwind.Orders, c => c.CustomerID,
        o => o.CustomerID, (c, o) => new { c, o })
      .GroupJoin(Northwind.OrderDetails,
        nt1 => o.OrderID, od => od.OrderID,
        (nt1, grouped) => new { nt1, grouped })
      .Where(nt2 => grouped.Sum(od =>
        od.Quantity * od.UnitPrice) > 15000)
      .Select(nt2 => c.ContactName).Distinct();
    */
    var solutionTranslation2 =
      Northwind.Customers
      .Join(Northwind.Orders, c => c.CustomerID,
        o => o.CustomerID, (c, o) => new { c, o })
      .GroupJoin(Northwind.OrderDetails,
        t => t.o.OrderID, od => od.OrderID,
        (nt1, grouped) => new { nt1, grouped })
      .Where(nt2 => nt2.grouped.Sum(od =>
        od.Quantity * od.UnitPrice) > 15000)
      .Select(nt2 => nt2.nt1.c.ContactName).Distinct();
    solution.AssertEquals(solutionTranslation2);
    var solutionTranslation3 =
      Enumerable.Distinct(
        Enumerable.Select(
          Enumerable.Where(
            Enumerable.GroupJoin(
              Enumerable.Join(Northwind.Customers,
                Northwind.Orders, c => c.CustomerID,
                o => o.CustomerID,
                (c, o) => new { c, o }),
              Northwind.OrderDetails,
              t => t.o.OrderID, od => od.OrderID,
              (nt1, grouped) => new { nt1, grouped }),
            nt2 => Enumerable.Sum(nt2.grouped,
              od => od.Quantity * od.UnitPrice) > 15000),
          nt2 => nt2.nt1.c.ContactName));
    solutionTranslation2
      .AssertEquals(solutionTranslation3);
  }
} ///:~

A where that filters on a group works like a SQL HAVING clause. Notice that only the last join uses into.
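The HAVING comparison can be seen in isolation with a plain group ... into followed by a where. This is a small sketch over made-up sales figures (the sales array, its property names, and the threshold are all assumptions for illustration), not a rewrite of the solution above.

```csharp
using System;
using System.Linq;

static class HavingSketch {
  public static string[] Run() {
    // Hypothetical data; names and amounts are made up:
    var sales = new[] {
      new { Customer = "Ana", Amount = 10m },
      new { Customer = "Bo", Amount = 4m },
      new { Customer = "Ana", Amount = 7m }
    };
    var bigSpenders =
      from s in sales
      group s by s.Customer into g
      // Filters whole groups, much like
      // HAVING SUM(Amount) > 15 in SQL:
      where g.Sum(x => x.Amount) > 15m
      select g.Key;
    return bigSpenders.ToArray();
  }
  static void Main() {
    Console.WriteLine(string.Join(", ", Run()));
  }
}
```

A where placed before the group would instead filter individual rows, as a SQL WHERE does.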

Exercise 45: Translate the second query in OuterJoins.cs. Notice where the DefaultIfEmpty( ) call comes within the translation. How does this allow for outer joins?

//: QueryExpressionsSolutions\AnOuterJoinTranslation.cs
using System.Linq;
class AnOuterJoinTranslation {
  static void Main() {
    int[] numbers1 = { 1, 2 };
    int[] numbers2 = { 2, 3 };
    var numbers =
      from n1 in numbers1
      join n2 in numbers2 on n1 equals n2 into n2s
      from n2 in n2s.DefaultIfEmpty(-1)
      select new { n1, n2 };
    /*c!
    // Transparent identifier step:
    var numbersTranslation1 =
      numbers1
      .GroupJoin(numbers2, n1 => n1, n2 => n2,
        (n1, n2s) => new { n1, n2s })
      // We use "t" in lieu of "transparent":
      .SelectMany(t => n2s.DefaultIfEmpty(-1),
        (t, n2) => new { n1, n2 });
    */
    var numbersTranslation2 =
      numbers1
      .GroupJoin(numbers2, n1 => n1, n2 => n2,
        (n1, n2s) => new { n1, n2s })
      // We use "nt" in lieu of "nonTransparent":
      .SelectMany(nt => nt.n2s.DefaultIfEmpty(-1),
        (nt, n2) => new { nt.n1, n2 });
    numbers.AssertEquals(numbersTranslation2);
  }
} ///:~

The DefaultIfEmpty( ) call lands inside the SelectMany( )’s first lambda expression. Recall that a from followed directly by a select (that is not the query’s initial from) converts to a single SelectMany( ). This intermediate from introduces another source and iteration variable, and the source declaration is placed in this first lambda. See the [[[multiple froms]]] section for a refresher if you need it.
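To see why that placement produces an outer join, compare the same GroupJoin( )/SelectMany( ) chain with and without DefaultIfEmpty( ). This is a sketch only; the useDefault switch and the string formatting are additions for illustration and do not appear in OuterJoins.cs.

```csharp
using System;
using System.Linq;

static class DefaultIfEmptySketch {
  public static string[] Pairs(bool useDefault) {
    int[] numbers1 = { 1, 2 };
    int[] numbers2 = { 2, 3 };
    return numbers1
      .GroupJoin(numbers2, n1 => n1, n2 => n2,
        (n1, n2s) => new { n1, n2s })
      // An empty n2s contributes nothing to SelectMany(),
      // so n1 disappears unless a placeholder is supplied:
      .SelectMany(
        nt => useDefault ? nt.n2s.DefaultIfEmpty(-1) : nt.n2s,
        (nt, n2) => nt.n1 + "," + n2)
      .ToArray();
  }
  static void Main() {
    Console.WriteLine(string.Join(" ", Pairs(false)));
    Console.WriteLine(string.Join(" ", Pairs(true)));
  }
}
```

Without the placeholder only the matching pair survives (an inner join); with it, the unmatched 1 is kept and paired with -1.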

Exercise 46: Use an outer join to again find any Products never ordered (as in Exercise 42).

//: QueryExpressionsSolutions\NeverOrderedOuterJoin.cs
using System.Linq;
class NeverOrderedOuterJoin {
  static void Main() {
    var neverOrderedProducts =
      from p in Northwind.Products
      join od in Northwind.OrderDetails
        on p.ProductID equals od.ProductID into ods
      from od in ods.DefaultIfEmpty()
      where od == null
      select p;
    neverOrderedProducts.Count().AssertEquals(0);
    var neverOrderedProductsTranslation1 =
      Northwind.Products.GroupJoin(Northwind.OrderDetails,
        p => p.ProductID, od => od.ProductID,
        (p, ods) => new { p, ods }).
      SelectMany(temp => temp.ods.DefaultIfEmpty(),
        (temp, od) => new { temp, od }).
      Where(temp2 => temp2.od == null).
      Select(temp2 => temp2.temp.p);
    neverOrderedProducts.AssertEquals(
      neverOrderedProductsTranslation1);
    var neverOrderedProductsTranslation2 =
      Enumerable.Select(
        Enumerable.Where(
          Enumerable.SelectMany(
            Enumerable.GroupJoin(Northwind.Products,
              Northwind.OrderDetails,
              p => p.ProductID, od => od.ProductID,
              (p, ods) => new { p, ods }),
            temp => temp.ods.DefaultIfEmpty(),
            (temp, od) => new { temp, od }),
          temp2 => temp2.od == null),
        temp2 => temp2.temp.p);
    neverOrderedProductsTranslation1.AssertEquals(
      neverOrderedProductsTranslation2);
  }
} ///:~

An outer join is not the ideal solution, but this is a good exercise of the feature. This is much like Exercise 42, which solved the problem a bit more elegantly.

Exercise 47: Rewrite the last approach in CustomersWithoutOrders.cs so that it does not use the temporary variables customerIDs and orderCustomerIDs.

//: QueryExpressionsSolutions\DirectCalls.cs
using System.Linq;
class CustomersWithoutOrders {
  static void Main() {
    var noOrdersIDs =
      (from c in Northwind.Customers
       select c.CustomerID).Except(
       from o in Northwind.Orders
       select o.CustomerID);
    noOrdersIDs.Count().AssertEquals(2);
    // We prefer the direct-call approach
    // (which is also the first step in the translation):
    var noOrderIDsTranslation1 =
      Northwind.Customers.Select(c => c.CustomerID)
      .Except(Northwind.Orders.Select(o => o.CustomerID));
    noOrdersIDs.AssertEquals(noOrderIDsTranslation1);
    var noOrderIDsTranslation2 =
      Enumerable.Except(
        Enumerable.Select(Northwind.Customers,
          c => c.CustomerID),
        Enumerable.Select(Northwind.Orders,
          o => o.CustomerID));
    noOrderIDsTranslation1.AssertEquals(
      noOrderIDsTranslation2);
  }
} ///:~

Calling Select( ) directly seems less noisy than writing a full query expression with a from and a select.

Exercise 48: Retrieve the first three Country names from an alphabetical list of Customers’ countries. Find all the ContactNames of Customers in those three countries.

//: QueryExpressionsSolutions\CustomersInCountries.cs
using System.Linq;
class CustomersInCountries {
  static void Main() {
    var countries =
      (from customer in Northwind.Customers
       orderby customer.Country
       select customer.Country).Distinct().Take(3);
    // Alternative approach:
    var countries2 =
      Northwind.Customers
      .Select(customer => customer.Country)
      .Distinct()
      .OrderBy(country => country)
      .Take(3);
    countries.AssertEquals(countries2);
    var answer1 =
      from c in Northwind.Customers
      where countries.Contains(c.Country)
      select c.ContactName;
    answer1.P();
    var answer2 =
      Northwind.Customers
      .Where(c => countries.Contains(c.Country))
      .Select(c => c.ContactName);
    answer1.AssertEquals(answer2);
    var answer3 =
      Enumerable.Select(
        Enumerable.Where(Northwind.Customers,
          c => countries.Contains(c.Country)),
        c => c.ContactName);
    answer2.AssertEquals(answer3);
  }
} /* Output:
[
  Patricio Simpson,
  Roland Mendel,
  Catherine Dewey,
  Yvonne Moncada,
  Georg Pipps,
  Sergio Gutiérrez,
  Pascale Cartrain
]
*///:~

The method call order for countries2 reads more sensibly than for countries, where we must first order by c.Country, then extract each Country from the Customer, then eliminate duplicates, and finally Take( ) the first three. In the second query, once we get all the countries and eliminate duplicates, we just alphabetize what’s left and Take( ) the first three. Notice the Contains( ) Enumerable extension method in the where clause.

Exercise 49: Write a non-generic version of System.Linq.Enumerable.Concat( ).

//: MindView.Util\Enumerable.4.cs
// {CF: /target:library}
using System;
using System.Linq;
using System.Collections;

namespace MindView.Util {
  public static partial class Enumerable {
    public static IEnumerable
    Concat(this IEnumerable left, IEnumerable right) {
      // Cast<object>() turns each non-generic sequence
      // into an IEnumerable<object> for the generic Concat():
      return System.Linq.Enumerable.Concat(
        left.Cast<object>(), right.Cast<object>());
    }
  }
} ///:~

Notice that we adapt System.Linq.Enumerable’s Concat( ) method to our needs, rather than “re-inventing the wheel” by writing two foreach loops. We included our solution in MindView.Util.Enumerable for later use.
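The same adapter idea can be tried in isolation. The following is a sketch under a hypothetical name (ConcatAll( )), written as an ordinary static method so that it stands alone outside MindView.Util. It concatenates an ArrayList and a string array, two sequences that share no generic element type.

```csharp
using System;
using System.Collections;
using System.Linq;

static class NonGenericConcatSketch {
  // Hypothetical name. Treat both sequences as sequences
  // of object and reuse the generic Concat().
  public static IEnumerable
  ConcatAll(IEnumerable left, IEnumerable right) {
    return left.Cast<object>()
      .Concat(right.Cast<object>());
  }
  static void Main() {
    IEnumerable numbers = new ArrayList { 1, 2 };
    IEnumerable words = new[] { "three", "four" };
    foreach(object item in ConcatAll(numbers, words))
      Console.WriteLine(item);
  }
}
```

Because Cast<object>( ) and Concat( ) are both deferred, nothing is enumerated until the foreach runs.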

Exercise 50: Combine two queries with Concat( ) to retrieve the names and nationalities of Customers from the USA and Mexico. Note in this sample output that nationality differs from Country name:

//: QueryExpressionsSolutions\MexicansAndAmericans2.cs
using System.Linq;
using MindView.Util;
using System.Collections;

class MexicansAndAmericans {
  static IEnumerable GetNationality(
    string countryName, string nationalityName) {
    return
      from c in Northwind.Customers
      where c.Country == countryName
      select new {
        Name = c.ContactName,
        Nationality = nationalityName
      };
  }
  static void Main() {
    GetNationality("Mexico", "Mexican").Concat(
      GetNationality("USA", "American")).P();
  }
} /* Output:
[
  { Name = Ana Trujillo, Nationality = Mexican },
  { Name = Antonio Moreno, Nationality = Mexican },
  { Name = Francisco Chang, Nationality = Mexican },
  { Name = Guillermo Fernández, Nationality = Mexican },
  { Name = Miguel Angel Paolino, Nationality = Mexican },
  { Name = Howard Snyder, Nationality = American },
  { Name = Yoshi Latimer, Nationality = American },
  { Name = John Steel, Nationality = American },
  { Name = Jaime Yorres, Nationality = American },
  { Name = Fran Wilson, Nationality = American },
  { Name = Rene Phillips, Nationality = American },
  { Name = Paula Wilson, Nationality = American },
  { Name = Jose Pavarotti, Nationality = American },
  { Name = Art Braunschweiger, Nationality = American },
  { Name = Liz Nixon, Nationality = American },
  { Name = Liu Wong, Nationality = American },
  { Name = Helvetius Nagy, Nationality = American },
  { Name = Karl Jablonski, Nationality = American }
]
*///:~

We refactored the query expression into GetNationality( ) rather than repeat it. However, an anonymous type has no name that we can write in a method signature, so we can only return GetNationality( )’s results by upcasting them to objects. Hence, GetNationality( ) returns a non-generic IEnumerable. No built-in Concat( ) takes and returns a non-generic IEnumerable, so the MindView.Util assembly includes our own simple adapter for the built-in generic Concat( ) (it also solves Exercise 49).
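The following sketch shows the same constraint with only built-in calls. The class name, the Tagged( ) method, and the contact data are hypothetical stand-ins. Instead of the MindView.Util adapter, it applies Cast<object>( ) at the call site before using the generic Concat( ).

```csharp
using System;
using System.Collections;
using System.Linq;

static class AnonymousReturnSketch {
  // Hypothetical stand-in data: { ContactName, Country }.
  static readonly string[][] Contacts = {
    new[] { "Ana Trujillo", "Mexico" },
    new[] { "Howard Snyder", "USA" },
    new[] { "Antonio Moreno", "Mexico" }
  };
  // The anonymous type cannot be named in the signature,
  // so the result is exposed as non-generic IEnumerable.
  public static IEnumerable
  Tagged(string country, string nationality) {
    return
      from c in Contacts
      where c[1] == country
      select new {
        Name = c[0], Nationality = nationality
      };
  }
  static void Main() {
    var all = Tagged("Mexico", "Mexican").Cast<object>()
      .Concat(Tagged("USA", "American").Cast<object>());
    foreach(object o in all)
      Console.WriteLine(o);
  }
}
```

The trade-off is the same either way: once the sequence is typed as object, callers can print the elements but cannot reach Name or Nationality without reflection.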

Exercise 51: Find the following:
a. The most parts in any Customer’s ContactName (Hint: use Split( )). Check the results three times using Any( ), All( ), and Count( ).
b. The latest and earliest OrderDate.
c. The first and last Customers whose names contain a lowercase ‘q’ (in ContactName order).
d. The 15th Customer (in ContactName order).
e. All Customer ContactNames that begin with ‘R’ (don’t use where or Where( )).
f. The last three Customer ContactNames (in ContactName order).

//: QueryExpressionsSolutions\SimpleQueries.cs
using System;
using System.Linq;

class SimpleQueries {
  static void Main() {
    // No customer has more than
    // three parts in their ContactName:
    Northwind.Customers.Max(c => c.ContactName.Split()
      .Length).AssertEquals(3);
    // Can assert with an All() or Any():
    Northwind.Customers
      .All(c => c.ContactName.Split().Length <= 3).True();
    Northwind.Customers
      .Any(c => c.ContactName.Split().Length > 3).False();
    Northwind.Customers.Count(
      c => c.ContactName.Split().Length == 3)
      .AssertEquals(3);
    // Latest and earliest OrderDate:
    Northwind.Orders.Max(o => o.OrderDate)
      .AssertEquals(DateTime.Parse("5/6/1998"));
    Northwind.Orders.Min(o => o.OrderDate)
      .AssertEquals(DateTime.Parse("7/4/1996"));
    var orderedCustomers =
      Northwind.Customers.OrderBy(c => c.ContactName);
    // First and last names containing 'q':
    Func<Customer, bool> containsQ =
      c => c.ContactName.Contains("q");
    orderedCustomers.First(containsQ).ContactName
      .AssertEquals("Dominique Perrier");
    orderedCustomers.Last(containsQ).ContactName
      .AssertEquals("Frédérique Citeaux");
    // ElementAt() takes a zero-based index:
    orderedCustomers.ElementAt(15).ContactName
      .AssertEquals("Christina Berglund");
    // Names beginning with 'R', without Where():
    orderedCustomers
      .Select(c => c.ContactName)
      .SkipWhile(n => !n.StartsWith("R"))
      .TakeWhile(n => n.StartsWith("R"))
      .AssertEquals(new[] { "Renate Messner",
        "Rene Phillips", "Rita Müller",
        "Roland Mendel" });
    // The last three names:
    orderedCustomers.Skip(Northwind.Customers.Count() - 3)
      .Select(c => c.ContactName).AssertEquals(new[] {
        "Yoshi Tannamuri", "Yvonne Moncada",
        "Zbyszek Piestrzeniewicz" });
  }
} ///:~

We’ve factored out as much as possible by creating orderedCustomers and containsQ rather than repeating their expressions within the queries that follow.


We never use a query clause (though it’s possible in some cases). When you’re already inserting other Enumerable extension methods, it’s sometimes simpler just to add the Where( ), Select( ), etc. Notice that to find Customers that start with ‘R,’ we first select the ContactName, then test for ‘R,’ rather than the other way around (which works, but requires more dotting).
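The SkipWhile( )/TakeWhile( ) pairing that replaces Where( ) in part (e) depends on the sequence being sorted, so that all matching names sit next to each other. Here is a minimal sketch of that idea. The class name, the StartingWith( ) method, and the names are hypothetical, and the ordinal comparison is our own choice to keep the sort and the prefix test consistent.

```csharp
using System;
using System.Linq;

static class PrefixRangeSketch {
  // Sort, skip everything before the first match, then
  // take the contiguous run of matches.
  public static string[]
  StartingWith(string[] names, string prefix) {
    return names
      .OrderBy(n => n, StringComparer.Ordinal)
      .SkipWhile(n =>
        !n.StartsWith(prefix, StringComparison.Ordinal))
      .TakeWhile(n =>
        n.StartsWith(prefix, StringComparison.Ordinal))
      .ToArray();
  }
  static void Main() {
    string[] names = {
      "Rita", "Ana", "Sven", "Roland", "Maria"
    };
    Console.WriteLine(
      string.Join(", ", StartingWith(names, "R")));
  }
}
```

On an unsorted sequence the same two calls would return only the first run of matches and silently miss later ones, which is why the ordering step comes first.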

Exercise 52: Use Aggregate( ) to output Customer ContactName first names, separated by the vertical-bar character (|), and surround the entire string with square brackets, as seen in this sample output (which only shows three names):

//: QueryExpressionsSolutions\Aggregating.cs
using System.Linq;

class Aggregating {
  static void Main() {
    // Limit the output:
    var customerNames =
      Northwind.Customers.Select(
        c => c.ContactName.Split()[0]).Take(3);
    customerNames.Aggregate(string.Empty,
      (r, n) => r + n + " | ",
      result =>
        "[" + result.Trim(' ', '|') + "]").P();
    ("[" + customerNames.Aggregate((r, n) =>
      r + " | " + n) + "]").P();
  }
} /* Output:
[Maria | Ana | Antonio]
[Maria | Ana | Antonio]
*///:~

Note that we must seed the initial value with string.Empty in order to be able to write the last lambda, which places brackets around the final result. Of course, we can add the brackets after Aggregate( ) is finished, as we do in the second approach.
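The two Aggregate( ) overloads can be compared on a plain string array. This is a sketch with hypothetical class and method names. The first method uses the seed plus result-selector overload, and the second uses the unseeded overload and adds the brackets afterward.

```csharp
using System;
using System.Linq;

static class AggregateSketch {
  // Seeded overload: the third lambda post-processes
  // the accumulated string.
  public static string WithResultSelector(string[] names) {
    return names.Aggregate(string.Empty,
      (r, n) => r + n + " | ",
      r => "[" + r.Trim(' ', '|') + "]");
  }
  // Unseeded overload: the first element is the initial
  // accumulator, so no trailing separator appears.
  public static string BracketsAfterward(string[] names) {
    return "[" +
      names.Aggregate((r, n) => r + " | " + n) + "]";
  }
  static void Main() {
    string[] names = { "Maria", "Ana", "Antonio" };
    Console.WriteLine(WithResultSelector(names));
    Console.WriteLine(BracketsAfterward(names));
  }
}
```

One practical difference: the unseeded overload throws InvalidOperationException on an empty sequence, while the seeded version simply returns the brackets around an empty string.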

Exercise 53: Put all the OrderDetails in an ILookup, and look them up by OrderID, counting the number in at least three groups.

//: QueryExpressionsSolutions\OrderDetailLookup.cs
using System.Linq;

class OrderDetailLookup {
  static void Main() {
    var detailsLookup =
      Northwind.OrderDetails.ToLookup(od => od.OrderID);
    foreach(int orderID in Northwind.Orders.Take(3)
      .Select(o => o.OrderID))
      detailsLookup[orderID].Count()
        .P(orderID + " detail count");
  }
} /* Output:
10248 detail count: 3
10249 detail count: 2
10250 detail count: 3
*///:~

Although we haven’t used ILookup much in the book, it’s a very useful way to show a master-detail relationship.
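As a small illustration of the master-detail idea, the sketch below builds a lookup from made-up rows of { orderID, quantity }. The class name, the ByOrder( ) method, and the data are all hypothetical.

```csharp
using System;
using System.Linq;

static class LookupSketch {
  // Each row is { orderID, quantity }. The key selector
  // groups by order, the element selector keeps quantity.
  public static ILookup<int, int> ByOrder(int[][] rows) {
    return rows.ToLookup(r => r[0], r => r[1]);
  }
  static void Main() {
    int[][] rows = {
      new[] { 1, 10 }, new[] { 1, 5 }, new[] { 2, 7 }
    };
    var lookup = ByOrder(rows);
    Console.WriteLine(lookup[1].Count());  // details for 1
    Console.WriteLine(lookup[2].Sum());    // quantity for 2
    Console.WriteLine(lookup[99].Count()); // missing key
  }
}
```

Unlike a Dictionary, indexing a lookup with a key it does not contain yields an empty sequence rather than throwing, which suits a master record that has no details.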

Exercise 54: Retrieve three different sequences of Customer ContactNames in three different alphabetical ranges: a-g, h-q, and r-z.

//: QueryExpressionsSolutions\CustomerSections.cs
using System;
using System.Linq;
using MindView.Util;

class CustomerSections {
  static void Main() {
    Func<char, string, bool> lessThanEqual =
      (c, s) => char.ToUpper(s[0]) <= c;
    var curried = lessThanEqual.Curry();
    var contactNames =
      Northwind.Customers
      .Select(c => c.ContactName.Split()[0])
      .Take(20)
      .OrderBy(cn => cn);
    Func<string, bool> lessThanG = curried('G');
    var lessThanQ = curried('Q');
    var aThroughG = contactNames.TakeWhile(lessThanG);
    var hThroughQ =
      contactNames.SkipWhile(lessThanG)
      .TakeWhile(lessThanQ);
    var rThroughZ = contactNames.SkipWhile(lessThanQ);
    aThroughG.P("A-G", POptions.NoNewLines);
    hThroughQ.P("H-Q", POptions.NoNewLines);
    rThroughZ.P("R-Z", POptions.NoNewLines);
  }
} /* Output:
A-G: [Ana, Ann, Antonio, Christina, Elizabeth, Elizabeth, Francisco, Frédérique]
H-Q: [Hanna, Janine, Laurence, Maria, Martín, Patricio, Pedro]
R-Z: [Roland, Sven, Thomas, Victoria, Yang]
*///:~

Again we use Take( ) to limit the output. We Select( ) the contact names before doing anything else. After that, we’re simply working with a sequence of strings, instead of having to dot into the Customer.ContactName in each subsequent call. Don’t be surprised if your solution is very different from ours. We couldn’t resist using currying to solve this exercise, but you won’t see currying until later in the book (in the Functional Programming chapter). The main reason for doing so was to avoid repeating our lambda expression. Instead, we state the lambda expression once, and then bind the first argument to the desired value (Q or G). We also used some composing tricks (shown soon in the book). Once we create contactNames, we use that as our source for the subsequent queries, rather than restating the initial query for every section of the alphabet.
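For readers who want to try the currying step before reaching that chapter, here is a self-contained sketch. The Curry( ) shown is our own hypothetical two-argument version, and the one in MindView.Util may be written differently. The class name, the Ranges( ) method, and the names are stand-ins as well.

```csharp
using System;
using System.Linq;

static class CurrySketch {
  // Hypothetical helper: turns f(a, b) into f(a)(b).
  public static Func<A, Func<B, R>>
  Curry<A, B, R>(this Func<A, B, R> f) {
    return a => b => f(a, b);
  }
  // Splits an already-sorted sequence into A-G, H-Q, R-Z.
  public static string[][] Ranges(string[] sortedNames) {
    Func<char, string, bool> lessThanEqual =
      (c, s) => char.ToUpper(s[0]) <= c;
    var curried = lessThanEqual.Curry();
    // Bind the first argument once per boundary letter:
    Func<string, bool> upToG = curried('G');
    Func<string, bool> upToQ = curried('Q');
    return new[] {
      sortedNames.TakeWhile(upToG).ToArray(),
      sortedNames.SkipWhile(upToG)
        .TakeWhile(upToQ).ToArray(),
      sortedNames.SkipWhile(upToQ).ToArray()
    };
  }
  static void Main() {
    string[] names = {
      "Ana", "Hanna", "Maria", "Roland", "Sven"
    };
    foreach(string[] range in Ranges(names))
      Console.WriteLine(
        "[" + string.Join(", ", range) + "]");
  }
}
```

The lambda is stated once, and each call to curried( ) produces a one-argument predicate that TakeWhile( ) and SkipWhile( ) accept directly.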
