Logo




Subscribe:
RSS 2.0 | Atom 1.0
Categories:

Sign In


[Giagnocavo]Michael::Write()

# Thursday, February 28, 2008
Writing events with System.Diagnostics.Eventing

... or, how the hell to use Vista and 2008's new ETW stuff with managed code. And, introducing ecmanaged: A decent way to do all this stuff.

Quick ETW Overview

Actually, the real ETW overview is here: http://msdn.microsoft.com/msdnmag/issues/07/04/ETW/ <-- This is some of the best overview and documentation on it (the other good stuff is the ecmangen documentation in the Windows SDK bin folder). The MSDN stuff is terribly confusing for the most part. Or maybe I'm too spolied by how easy it is to find stuff in the BCL. My overview is on what you gotta do to make things work in .NET.

ETW is a real pain to use with .NET. Even so, ETW starts off looking really promising. You define everything in a nice XML manifest file, and everything is based off that. But wait, everything? Shouldn't the manifest be the end-all? Yea, that'd make logical sense. No, you run some tools from the Windows SDK. First you run MC, which generates a .h header file. Managed devs are growning now -- why the hell should something as general as event tracing be language specific? The .h file contains the processed event descriptors, ready for C consumption. 

It worsens: MC also generated a resource script. You have to compile that with RC and it'll create a Win32 .res resource. Then you compile that into a binary (the C# compiler has the /win32res option). Then you go back and edit your XML manifest and make sure it points to the final binary. Wait, what? Yes. The resources that MC generates for RC contain all the messages that are in your XML manifest. Someone thought it was a really cute idea to go and make the Event Viewer not only read all the data from your manifest, but also have to go look it up from some binary resources.

Actually, this probably made sense to someone on the Windows team since I'm guessing they already have tools to go and localise Win32 resources or something. Unfortunately, it sucks and makes no sense for anyone NOT in their particular position. Now, I hope I'm wrong (I really, really want to be wrong), but I think there's no way to force the message strings to just stay in the XML file and be read from there. 

Finally, things get easy again. Just run "wevtutil install-manifest Some.man" (wevtutil is in system32). In fact, this utility is so user friendly, it even lets you type "im" instead of "install-manifest". At this point, assuming the other steps went well, your provider shows up in Event Viewer.

ECManGen

But wait, how do I actually make that manifest? This part is almost the easiest. In the Windows SDK, there's a lovely little tool called ECManGen. Just fire it up, and go to town adding Providers, Channels, Templates, and Events.

Providers are the main things that show up in your Event Viewer, such as MyApp-FooProduct-LameComp. Channels separate Admin/Operational/Debug and others. Templates are an argument list for Events. If you have, say, a bunch of events that take the same kinds of parameters, you can share templates among them (I find it helpful to create a "SingleStringTemplate".) It's very straightforward.

*Note: I can't actually get Admin channels to work. If I create an event and stick it in an Admin channel and set its level to Informational, MC complains (as does ECManGen) that the level has to be Critical, Error, or Informational. Uh, OK. Instead, just use Operational.

Except... ECManGen is a free utility. (Free? Perhaps not, seeing as the annual MS bill for a 4-person dev team is around $20,000 (counting just MSDN) -- but it's well worth it.) Part of the docs say: "NOTE: For the Manifest Generator Tool to function correctly, the file winmeta.xml (which contains pre-defined metadata values) must either be in the Include directory of the SDK or in the same directory as the Manifest Generator tool." OK, easy enough. Except... it doesn't work that easily. The only way I got it to work was to copy the xml file over to the same directory, *and* start ECManGen from that directory.

Oh yea, ECManGen won't open your manifest file if you pass it as an argument, so forget about cute VS integration. Just Google ecmangen and go rate up the bugs on Connect :).

Going Managed

OK, so you're not living in the last century and use decent tools -- how does this map to C#? First off, you create an EventProvider with the right Guid (the one from your manifest). Then you create an EventDescriptor for each event, matching up all the little parameters (the MSDN docs for EventDescriptor have more details). Finally, you can call WriteEvent, passing the EventDescriptor *by ref* for some reason (no, I can't figure out why).

Oh yea, and you have to hookup that Win32 resource to your C# project, so if you needed another resource (like another app manifest?), you'll have to go deal with merging them and all that hassle. And, don't forget to make sure the parameters you pass into the object[] array of WriteEvent line up with what your manifest has. And also, the .NET API won't even handle the Boolean->BOOL (4 byte) silliness for you. 

In summary, it's a lot of boring, error-prone work, and you'll have to repeat it every time you edit your manifest. Yuck. Maybe it's just easier to use the old event log stuff and forget about all this fancy ETW stuff.

However, the ETW features are pretty cool. As is ecmangen (well, for the most part). I couldn't tear myself away from the promises the new ETW stuff offered, so I sat down and wrote a hack tool to do everything for me. The goal is to have an XML file, then convert it right into a DLL I can use.

ecmanaged does exactly this. Sorta. It doesn't deal with complicated scenarios or performance counters, but handles straightforward event logging with templates in an easy and type-safe fashion.

Using ecmanaged

Usage: ecmanaged ecgen <manifest> /out:<target assembly> /namespace:<target namespace> /class:<target class>
Usage: ecmanaged msglocate <manifest> <messageFileName>
Usage: ecmanaged install <manifest> <targetManifest> <messageFileName>
Usage: ecmanaged uninstall <targetManifest>

Call ecgen with a manifest, and it'll go build an assembly with the wrappers for calling your provider events. It'll stick the Win32 resource on the assembly too, so it's ready to be used.

Call msglocate with the manifest and the message binary file, and it'll go fix the manifest to point to the right place.

Call install, and it'll copy the manifest to another location, fix the msg location, and call wevtutil on it. It copies because I'm assuming your original manifest is under source control, and hence shouldn't be fixed up.

Uninstall just calls wevutil uninstall-manifest.

Yey! Now you can just make a batch file to call this on your manifest and run it every time you change your manifest. What I do is stick this in a folder in my project called "Eventing". I check the resulting binary into source control (just like a 3rd party DLL). The downside is making sure you always update the DLL if you modify the manifest. I'm sure somehow msbuild can come to the rescue (and autoupdate when the manifest changes, so Intellisense keeps working), but this works for me.

I welcome all comments/questions/insults, but please, no death threats. Download: ecmanaged.exe (45 KB)

P.S. No, I don't hate Windows or the devs on it, although I hate unmanaged code (it IS 2008, innit?). Perhaps I'm just annoyed I have to make a Saving Throw versus Hang every time I click Send in Outlook 2007 (yes, patched up, even with Vista SP1... which made it worse). But I still think this whole ETW toolchain is terribly unpolished.

Code
Thursday, February 28, 2008 9:12:25 PM UTC  #    Comments [1]  |  Trackback

# Monday, December 10, 2007
Access is denied when starting Windows Event Log on Vista

I ran into a strange problem today. The Windows Event Log service would not start, stating an error code 5 "Access is denied". This doesn't make any sense as running the services MSC requires elevated permissions and I'm an administrator. That error message alone didn't give me too much of a clue, especially because I don't know much about the event log service. Also, as the event log service was down, troubleshooting it required a bit more work than usual since I couldn't just turn to the event log.

Fortunately, Vista's ETW (Event Tracing for Windows) provided an easy solution. At a command prompt, I ran "logman query providers". This shows a list of all the installed tracing providers on the system. The interesting one in this case is "Microsoft-Windows-Eventlog". Using this information I could start a trace and generate a report. The report indicated that access was denied creating the System channel. It mentioned a path it was trying to use: %SystemRoot%\system32\winevt\logs\System.evtx

I checked the permissions and they looked ok (SYSTEM had full control). So I renamed the System log and gave permissions to everyone on the folder. Then I started the Event log service and it worked fine. It created a new System.evtx. When I checked the permissions I saw that Event log service runs as local system. Apparently that must be the security required. After resetting the permissions, everything seems to be working fine.

I'm posting this since when I searched for a solution, I found several people asking and no answers. This could be the solution for other people. I'd like to hear if anyone knows what might have caused this mess in the first place.

Misc. Technology
Monday, December 10, 2007 3:18:17 PM UTC  #    Comments [3]  |  Trackback

# Thursday, September 06, 2007
Complicated functions in LINQ to SQL

Rob Conery talks about geocoding with LINQ here. In his post, he provides some code using the Haversine formula to compute the distance between two points on earth. His function is declared as:

Func<double, double, double, double, double> CalcDistance = (lat1, lon1, lat2, lon2) => …

Further, this delegate is wrapped in a normal C# method. Now, read my previous post about calling functions in LINQ to SQL. Can you see where things are going to go wrong? That's right, a normal delegate or method can't be converted into SQL by LINQ, as the engine has nothing to work with. Only when our code is available as data can the LINQ to SQL engine do it's magic. Since there's some insinuation that LINQ to SQL just can't handle things like Haversine, I'll demonstrate how to do it.

If you want to use your own "complicated" functions with LINQ to SQL, you'll need to manually construct the predicate expression. It's not pretty, but, it does let you convert somewhat detailed functions, such as Haversine, inside of LINQ-to-SQL queries. This is not always the right approach: in some cases it'll be better to use a UDF or stored procedure. (If someone knows a native, better way, please let me know! TomasP's Expandable stuff looks cool, but I ran into some bugs on Beta 2. This is really something the compiler should help out with!).

To start off, we need to declare our function as an Expression Tree (I cannot vouch for the accuracy of this code; I'm merely demonstrating LINQ to SQL technique. For the geocoding details, read Rob's post.):

const double R = 6367;
const double RAD = Math.PI / 180;
static Expression<Func<double, double, double, double, double>> dist =
(lat1, lon1, lat2, lon2) =>
    R * 2 *
    (
        Math.Asin(Math.Min(1,Math.Sqrt(
            (
            Math.Pow(Math.Sin(((lat1 * RAD - lat2 * RAD)) / 2.0), 2.0) +
            Math.Cos(lat1 * RAD) * Math.Cos(lat2 * RAD) *
            Math.Pow(Math.Sin(((lon1 * RAD - lon2 * RAD)) / 2.0), 2.0) 
           
        ))) 
    );

OK, that was the easy part. We just had to wrap Expression<> around the type declaration and remove the method calls. But how do we pass this to our query? The Where method has this signature:

public static IQueryable<TSource> Where<TSource>(this IQueryable<TSource> source, Expression<Func<TSource, bool>> predicate);

Hence, we need to provide a predicate of Expression<Func<TSource, bool>> for it do its magic. To work with expressions, we need to start using the System.Linq.Expressions namespace. For more clarity, I aliased E to System.Linq.Expressions.Expression. Building expressions isn't particularly hard, it's just annoying and time consuming. C#'s lack of "symbolics", or a way to use the compiler to create expressions to reference properties, etc. means we have to use strings to do so. I never did claim it'd be pretty.

We start off with some data from elsewhere in our application. In my example, I'm just going to declare some locals:

double coolLat = 43.641852;
double coolLong = -79.387298;
double maxDist = 100;

The first Expression we need is a ParameterExpression to refer to the source item from the table (i.e., the parameter the Where method is going to give to us). In my code, my table is called Accounts, hence my object type is Account. To create the parameter expression:

var acctParam = E.Parameter(typeof(Account), "a");

Next, we need to be able to reference the fields in the table. With LINQ to SQL, these are properties on our object. We create them like this:

var acctLon = E.Property(acctParam, "Longitude");
var acctLat = E.Property(acctParam, "Latitude");

The secret sauce is creating the invoke to our dist expression. This is where all the work comes into play and includes all our complicated code in the LINQ to SQL query. Fortunately, after building up our arguments separately, it's not that hard:

var distCalc = E.Invoke(dist, E.Constant(coolLat), E.Constant(coolLong), acctLat, acctLon);

We have to use ConstantExpressions for our local variables. Using constant allows us to capture the value of those variables. Now we're ready to finish off, by adding a less than comparison and turning it all into a <TSource, bool> LambdaExpression:

var maxComp = E.LessThan(distCalc, E.Constant(maxDist));
var pred = E.Lambda<Func<Account, bool>>(maxComp, acctParam);

Our query can now look like this:

MembersDataContext dc = new MembersDataContext();
var q = dc.Accounts.Where(pred);
Console.WriteLine(q.ToString());

When we run it, we see that LINQ to SQL is quite capable of handling our little bit of math:

exec sp_executesql N'SELECT [t0].[AccountId], [t0].[Latitude], [t0].[Longitude]
FROM [dbo].[Accounts] AS [t0]
WHERE (@p0 * ASIN(
(CASE
WHEN @p1 < SQRT(POWER(SIN(((@p2 * @p3) - ([t0].[Latitude] * @p4)) / @p5), @p6) + (COS(@p7 * @p8) * COS([t0].[Latitude] * @p9) *
   
POWER(SIN(((@p10 * @p11) - ([t0].[Longitude] * @p12)) / @p13), @p14))) THEN @p1
ELSE SQRT(POWER(SIN(((@p2 * @p3) - ([t0].[Latitude] * @p4)) / @p5), @p6) + (COS(@p7 * @p8) * COS([t0].[Latitude] * @p9) *
    POWER(SIN(((@p10 * @p11) - ([t0].[Longitude] * @p12)) / @p13), @p14)))

END))) < @p15'
,N'@p0 float,@p1 float,@p2 float,@p3 float,@p4 float,@p5 float,@p6 float,@p7 float,@p8 float,@p9 float,
    @p10 float,@p11 float,@p12 float,@p13 float,@p14 float,@p15 float'
,@p0=12742,@p1=1,@p2=95.412000000000006,@p3=0.017453292519943295,
@p4=0.017453292519943295,@p5=2,@p6=2,@p7=95.412000000000006,@p8=0.017453292519943295,@p9=0.017453292519943295,@p10=102.63200000000001,
@p11=0.017453292519943295,@p12=0.017453292519943295,@p13=2,@p14=2,@p15=100

I still think there should be some kind of syntax so we could write it like we want to: (Where(a=>dist(coolLat, coolLong, a.Latitude, a.Longitude) > 10)). If anyone knows a built-in way to do it, please let me know. I'm sure it's just something simple I'm overlooking...

Code
Thursday, September 06, 2007 1:56:54 PM UTC  #    Comments [4]  |  Trackback

# Wednesday, September 05, 2007
Calling custom methods in LINQ-to-SQL

This was sparked by the issues raised by Rob Conery here. Basically, if you have some semi-complicated function that you need to apply to a LINQ-to-SQL query, how can you do it? This is somewhat covered by TomasP.NET: Building LINQ Queries at Runtime in C# and Joseph Albahari: Dynamically building LINQ expression predicates. I recommend those two articles; they're very good. I'm going to write a bit about it too. Some of it might be redundant, some of the ideas I took from those articles. The code is all mine.

So, let's say we want to get all accounts where the square root of the account ID is even. This will serve as our placeholder for a totally contrived example. Just calling our method in our LINQ query won't work because the LINQ-to-SQL code isn't going to know what to do with it. A method is just an opaque block of code with a name. Here's an example:

static void Main(string[] args) {
    MembersDataContext dc = new MembersDataContext();
    var q = dc.Accounts.Where(a => IsRightAccount(a));
    Console.WriteLine(q.ToString());
}
static bool IsRightAccount(Account a) {
    return Math.Sqrt(a.AccountId) % 2 == 0;
}

This code crashes with: Unhandled Exception: System.NotSupportedException: Method 'Boolean IsRightAccount(ConsoleApplication1.Account)' has no supported translation to SQL. Which should be expected, as LINQ to SQL cannot know what goes on inside that method and thus can't translate it.

Let's change things to make IsRightAccount be a Func delegate (from a lambda expression):

static Func<Account, bool> IsRightAccount = a => Math.Sqrt(a.AccountId) % 2 == 0;

Now we get: Unhandled Exception: System.NotImplementedException: The method or operation is not implemented. At System.Data.Linq.SqlClient.QueryConverter.VisitInvocation(InvocationExpression invoke). That's a somewhat strange exception, as I'd expect it to be a bit more helpful. At any rate, I'd expect it to crash, because that is just a delegate to a method. It's still an opaque block of code.

Enter the magic Tree of Expressions: Expression<T>. As I mentioned in my last post, Expression Trees provide "introspection" (reflection against code). In the examples above, the lambda that calls IsRightAccount for the Where clause actually turned into an InvocationExpression that represents a call to the delegate provided. Hence me saying that it is "opaque". What we need is to make sure that our code (our IsRightAccount calculation) is visible as data. When it's visible as data, then LINQ-to-SQL can go and say "Oh, you want to take the square root of the account ID, mod it by 2, and see if that's zero… now THAT I can do in SQL".

Declaring an Expression Tree is really simple. First, make sure you import System.Linq.Expressions if you don't want to fully qualify the name. Then, declare your tree just like any lambda Func, except this time make the type Expression<MyFunc>:

static Expression<Func<Account, bool>> IsRightAccount = a => Math.Sqrt(a.AccountId) % 2 == 0;

We will also change our Where clause to accommodate the fact that we are not calling a method:

var q = dc.Accounts.Where(IsRightAccount);

And presto! Our program now shows a SQL conversion:

SELECT [t0].[AccountId], [t0].[Email]
FROM [dbo].[Accounts] AS [t0]
WHERE (SQRT(CONVERT(Float,[t0].[AccountId])) % @p0) = @p1

Things get a bit more complicated when we try to stack expressions together. As far as I know, we must create the Expression manually (using the Expression static methods); the compiler won't help out. If anyone knows a built-in way around this, please let me know. Otherwise, see the links at the top of this article for more information and other workarounds.

This also applies if you have complex logic that doesn't directly map to a item -> bool predicate expression. In those cases, you can still encapsulate most of your code by using the compiler to generate the bulk of the Expression, and then just wrap it with a bit of hand-created expression. In my LINQ to the CRUD article, the code attached uses this approach to generate queries for the select/delete commands. Again, I will note that TomasP has written expansions so you can just write myCoolExpression.Expand(arg) rather than building everything by hand.

If you know of any other links or work done in this area, I'm very interested in seeing other approaches. Thanks!

Code
Wednesday, September 05, 2007 11:58:49 PM UTC  #    Comments [2]  |  Trackback

C# 3.0 and LINQ Misunderstandings

Apparently, there is some considerable confusion over all the new C# language features. People who I would hope are reasonably intelligent are completely misunderstanding some C# fundamentals. Agreed, a lot of the new concepts introduced to C# 3.0 are might seem relatively foreign to C# users. Microsoft's marketing related to LINQ doesn't help much either. I'm going to try to clarify the top few things I've seen. I'll reference the C# spec (http://msdn2.microsoft.com/en-us/library/ms364047(vs.80).aspx) so you can know I'm being accurate.

Myth: LINQ is just a data access technology

While data access is obviously a large part of LINQ (even the name stands for Language Integrated Query), you can do a lot more than just access data. Re-using a previous example:

    args
        .Select(s => new Thread(() => SomeLongProcess(s))) 
        .Process(t => t.Start())
        .ToList()
        .ForEach(t => t.Join());

There's no sign of data access there! The practical functional C# articles on my site go into more detail on this. Suffice to say, while a lot of new features were added to "make LINQ possible", much, much, more is possible than just creating queries. (Alternatively, you might chose to limit the meaning of LINQ, as I'd like to do. In this case, I wouldn't consider using lambdas, extensions, etc. as LINQ. Others may disagree. Marketing?)

Implicitly typed local variables (var keyword)

C# is still statically and strongly typed. But, there's a new feature that lets you declare local variables without specifying the type, if the type can be inferred from the initializing expression. From the spec:

var i = 5;
var s = "Hello";
var d = 1.0;
var numbers = new int[] {1, 2, 3};
var orders = new Dictionary<int,Order>();

The implicitly typed local variable declarations above are precisely equivalent to the following explicitly typed declarations:

int i = 5;
string s = "Hello";
double d = 1.0;
int[] numbers = new int[] {1, 2, 3};
Dictionary<int,Order> orders = new Dictionary<int,Order>();

Oddly enough, the C# spec doesn't even mention LINQ or anonymous types when it talks about "var" locals. Why is there confusion about this simple feature? Let's examine anonymous types:

C# anonymous types allow you to declare a type just by specifying its fields. From the spec: "C# 3.0 permits the new operator to be used with an anonymous object initializer to create an object of an anonymous type." For instance, the following code is a valid expression:

new { Name = "Michael" }

This produces an object of a new, anonymous, type, containing a single string property called Name. Hence, this code works:

Console.WriteLine(new { Name = "Michael" }.Name);

However, how can you assign such an object to a variable? Yes, that is the only "need" for the var keyword. There's no way to name the type, since it's anonymous. Regardless if you agree with anonymous types (versus a Tuple class), this is the place where you *need* to use var: assigning an anonymous type to a local.

As you may have noticed, I still haven't mentioned LINQ. Anonymous types are not LINQ specific. They are, however, particularly helpful for certain LINQ queries:

var topCustomers = MyDatabase.Customers.Where(c => c.GoldStar == true).Select(c => new {c.CustomerId, c.Name});

Because of this, people start to associate anonymous types only with LINQ queries and hence, var with LINQ only. The truth is that these features can be used anywhere.

What's the takeaway here? The var keyword simply allows the compiler to infer the type of the variable so you don't have to specify it. Nothing more, nothing less. Some people still want to explicitly annotate every single variable – hey, that's their choice. But don't be locked into this just because there was no option before. Me? I'll take more concise code any day!

As a side note (if this wasn't clear), the var keyword is NOT dynamic typing, just implicit typing (type inference).

Dynamic C#

Seems there's a lot of confusion about C# being dynamic. C# is not dynamically typed, as some seem to imply. I think perhaps some of the confusion comes from all the nice type inference that C# provides. Using the var keyword as shown above might make some people feel "oh, I'm saying var just like Javascript!". Adding to the confusion is the fact that C# is now a "semi"-functional language. For instance, from "why's (poignant) guide to ruby", we see this Ruby code:

5.times { print "Odelay!" }

C# allows us to write in a similar style:

5.Times(() => Console.WriteLine("Odelay!"));

The next Ruby example in that guide goes like this:

exit unless "restaurant".include? "aura"

In C#, we can write:

exit.Unless(() => "restaurant".Contains("aura"));

The Wikipedia article on Functional Programming lists a few features that dynamic languages usually have:

Eval

Sorta… you can create Expression<T> and execute them. On the other hand, you can't do anything like eval(myString) (which is just asking for runtime failure).

Higher-order functions

Definitely in there.

Runtime alteration of object or type system

No, not really. (I.e., maybe you can hack around with certain APIs to try to do some magic, but it's not a language feature.)

Functional Programming

Yes. But still, functions aren't really first class citizens…yet. Once we can start using method groups as Actions and Funcs, implicitly, then it'll get even better. This is an interesting presentation from Andrew Kennedy, Microsoft Research: C# is a functional programming language.

Closures

Yep, since anonymous methods were introduced in C# 2.0.

Continuations

Extremely limited, in the form of yield return.

Introspection

Not just reflection, but actually inspecting the actual code. C# 3.0 has this in the form of Expression<T> (see below).

Macros

No, not crappy C-style macros. Here, I'm thinking more like macros that'd let you create things like C# query comprehensions, *in source code*. (Which is what I was actually hoping when I saw the new query comprehension syntax… no such luck).

In summary, C# lets you gain a lot of benefits usually associated with dynamic programming, but without the nasty parts of dynamic typing.

A great paper on this subject is Static Typing Where Possible, Dynamic Typing When Needed: The End of the Cold War Between Programming Languages, by Erik Meijer and Peter Drayton of Microsoft.

Myth: Extension methods add methods to a class

This is a tricky one, since extension methods appear to be exactly that. This myth is also somewhat perpetuated by the C# spec: "In effect, extension methods make it possible to extend existing types and constructed types with additional methods." But, right before that, the real explanation is given: "Extension methods are static methods that can be invoked using instance method syntax."

Essentially, Extension methods allow us to use infix notation with certain methods. This explains the line from the spec "in effect". A more helpful way to think of this is by thinking of the "." operator as an overloaded operator that also allows passing the first operand given to it as the first argument to specially marked methods.

An alternative* would be to define an operator like the F# pipeline operator (|>). In C#, this would let us write stuff like:

customers |> Seq.Where(c => c.Name == "Michael")

That doesn't look like an improvement. BUT, we no longer need to mark methods in a special way. We can just use them:

myArray |> Array.BinarySearch("s")

Why do we need infix notation anyways? Well, the normal prefix notation can be difficult to read:

Select(Where(customers, c => c.Cool == true), c => c.Name)
Array.BinarySearch(items, "S")

Extension methods just make those functions easier to pipeline. That's all folks. Think of them like this, and save yourself a headache about what "extending a type" means.

*My guess as to why extension methods are done they way they are is because it could confuse people if you have something like "item |> Stuff.SomeMethod("X")" where SomeMethod returns a function. Or, where you have "item |> Stuff.SomeMethod("X").SomethingElse("y"). I'm still annoyed that I can't use infix semantics where *I* want, but oh well.

Lambda expressions and Expression<T>

Spec: "Lambda expressions provide a more concise, functional syntax for writing anonymous methods.". Spec: "Expression trees permit lambda expressions to be represented as data structures instead of executable code. A lambda expression that is convertible to a delegate type D is also convertible to an expression tree of type System.Query.Expression<D>."

So, "lambda expressions" can either be just code (i.e., directly executable IL) OR they can get turned into a data structure, Expression<T>.

Adding to the confusion, a lambda expression can contain just an expression (i=> i + 1), or it can be a block of statements ( i => {Write(i); i++; Write(i); return i+1;} ). However, a lambda expression with a statement block body cannot become an Expression<T>. (As far as I know, VB's lambdas only allow for expression bodies, not blocks.)

An example:

Expression<Func<int, int>> inc = i => i + 1;

Is equivalent to:

ParameterExpression param_i = Expression.Parameter(typeof(int), "i");
var inc2 = Expression.Lambda<Func<int, int>>(
    Expression.Add(
        param_i, 
    Expression.Constant(1, typeof(int))),
param_i);

At runtime, you can then go inspect the actual code and decide what to do with it. This is exactly the premise for LINQ to SQL. When you create a LINQ-to-SQL query, it is turned into an expression like shown above. Then the LINQ-to-SQL APIs inspect and convert that expression tree into SQL statements.

Here, I can understand the confusion. The word "expression" is used in three distinct manners. Rightfully, Expression Trees should be referred to as Expression or Expression<T>, which could help clear up some of the confusion. Additionally, it doesn't help that lambdas have these different conversion rules (although working around it could be ugly, possibly).

Are there any other features that you've seen misused or you've had questions about? Let me know! I love comments, insults, and suggestions.

Want to correct me on something? Go right ahead! But, if you're going to say something like "C# doesn't have type inference", please make sure to either be an expert on the matter or be able to quote an authoritative source or show a proof. Thanks!

Code
Wednesday, September 05, 2007 3:01:18 AM UTC  #    Comments [10]  |  Trackback

# Wednesday, August 29, 2007
Practical Functional C# - Part IV – Think in [Result]Sets

Don't skip parts I to III to get you up to speed. I have $300 up for grabs if these articles are incorrect in stating that it will greatly improve most C# apps:

    Practical Functional C# - Part I 
    Practical Functional C# - Part II 
    Practical Functional C# - Part III - Loops are Evil

Tell the compiler what you want, rather than how to do it. That's a key concept that can take you very far. However, as imperative programmers, this can sometimes be a hard concept. We are so used to instructing the processor, step-by-step, how do to things, and then only after we're all done, making sure the end effect is to our liking. Functional programming helps us change this.

Consider the following task: Write a function that checks a username against a few disallowed characters "!@#$%^&*". The normal C# implementation looks like this (null checks removed):

static bool IsBadUserName(string userName)
{
    var badChars = "!@#$%^&*";
    foreach (char c in badChars) {
        if (userName.Contains(c)) return true
    }
    return false;
}

It's not horrible; there's only one branch, but it does require a bit of thought to sort out. Now let's take an approach where we deal with string as a set of characters to manipulate at once (requires C# 3.0 compiler):

static bool IsBadUserName2(string userName){ 
    var badChars = "!@#$%^&*";
    return userName
        .Intersect(badChars) 
        .Any();
}

We've reduced the function from a step-by-step loop into something that has only path. The bigger benefit is that there's no need to evaluate end conditions and so on for correctness: The code says exactly what it does. "Are there any bad characters in the user name?" We don't have to worry how it does what we asked, we just need to think in terms of what we want our result to be.

A common function in functional languages is called "map". Map takes a list of something and turns it into a list of something else. For instance, if we had a list of integers ("ourInts"), we could turn them into squares by saying "map ourInts by multiplying each value with itself". In C# (LINQ), they called map "Select". Here's a quick example:

var ourInts = new[] { 2, 5, 13 };
var squares = ourInts.Select(i => i * i);

Squares will contain the list { 4, 25, 169 }. What use is this? Well, it is an extremely common pattern to take some set of data, filter it, modify it a bit, and return a new set of data. Here's an example: You have a variable from containing semicolon-delimited email addresses. You want to turn these into an array of .NET's MailAddress objects to use with some other code. The loop isn't very pretty:

var tempAddresses = new List<MailAddress>();
foreach (string s in semicolonEmails.Split(';')) {
    var ts = s.Trim();
    if (ts == "") continue
    tempAddresses.Add(new MailAddress(ts)); 
}
var myAddresses = tempAddresses.ToArray();

But consider the functional approach:

var myAddresses = semicolonEmails
    .Split(';'
    .Select(s => s.Trim())
    .Where(s => s != ""
    .Select(s => new MailAddress(s))
    .ToArray();

In one statement, we transform the data three times, as well as add filter to remove empty items.

One place where LINQ falls flat on its face is when it comes to processing data. For some reason, there are no methods defined to do "ForEach" or "Process". (Even more interesting: List<T> does define these methods.) Process is a great pattern: on each item, it performs some action, then returns the original item. The code to define it is very simple and looks like this:

static IEnumerable<T> Process<T>(this IEnumerable<T> source, Action<T> f)
{
    foreach (var item in source) {
        f(item); 
        yield return item; 
    }
}

How is this of use? Well, let's chain together some functional and imperative processing. For instance, write a program that does some long process on all files passed in as arguments – on separate threads. If we take the purely imperative approach, our code looks like this:

static void Main(string[] args)
{
    var threads = new List<Thread>();
    foreach (var s in args) {
        Thread t = new Thread(startLongProcess); 
        t.Start(s); 
        threads.Add(t); 
    }
    foreach (var t in threads) {
        t.Join();
    }
}
static void startLongProcess(object data)
{
    SomeLongProcess((string)data);
}

Yes, we actually must declare a separate function just to invoke SomeLongProcess. Let's combine and use the functional approach now:

static void Main(string[] args)
{
    args
        .Select(s => new Thread(() => SomeLongProcess(s))) 
        .Process(t => t.Start())
        .ToList()
        .ForEach(t => t.Join());
}

Which way is going to be easier to edit and change around? I don't know about you, but for me, going from ~12 to ~5 lines, removing extra variables, useless functions and flow control structures: that's a hands-down win in my book.

As a side note, threading, in fact, is a space that is extremely ripe for functional styles. I'm willing to bet that ".NET 4" will include threading extensions that rely heavily on functional concepts. For instance, it's easy to create a method that allows us to replace the previous program with this:

args.Parallel(SomeLongProcess);

But I'll talk about that another day.

In the next article, I'm going to cover C#'s new inner functions capability and how that can be used to help build up more complicated function chains. I'd also like some feedback on which kinds of areas of C# programming you've run into that seem to require more code than necessary.

Code
Wednesday, August 29, 2007 10:45:59 AM UTC  #    Comments [4]  |  Trackback

# Sunday, August 26, 2007
A LINQ to the CRUD

A question many people have run into is: How does CRUD fit into LINQ-to-SQL? While LINQ-to-SQL (I'll abbreviate as DLINQ) provides a very fast and easy way for us to start querying our data, it doesn't handle updates as beautifully. It is particularly noticeable when you are doing multi-tier and hence cannot share a DataContext.

Let's consider an example database called "MembersDatabase" which has a table called "Accounts". Accounts has an int primary key, and a varchar Email field. We use the DLINQ designer and create the dbml that generates a class called MembersDataContext. How does our app-tier code look? I'll give you a hint, it starts with "ug" and rhymes with "nasty":

// Select
int someId = 123; // Passed from another tier
Account someAccount; // Can't use implicit typing -- no anonymous types
using (var dc = new MembersDataContext()) {
    someAccount = dc.Accounts.SingleOrDefault(a => a.AccountId == someId);
}

// Insert
var myAccount = new Account();
myAccount.Email = "me@contoso.com";
using (var dc = new MembersDataContext()) {
    dc.Accounts.Add(myAccount); 
    dc.SubmitChanges();
}

// Update
int myId = 1; // Id and email passed from another tier
string newEmail = "cool@cool.com";
var changedAccount = new Account();
changedAccount.AccountId = myId;
changedAccount.Email = newEmail;
using (var dc = new MembersDataContext()) {
    dc.Accounts.Attach(changedAccount, true); 
    dc.SubmitChanges();
}

// Delete
int idToKill = 2; // Passed from another tier
using (var dc = new MembersDataContext()) {
    using (var txScope = new System.Transactions.TransactionScope()) {
        var acc = dc.Accounts.SingleOrDefault(a => a.AccountId == idToKill); 
        if (acc == null) throw new ChangeConflictException("Row not found."); 
        dc.Accounts.Remove(acc); 
        dc.SubmitChanges();
        txScope.Complete();
    }
}

It's not horrible; it's certainly better than anything before it. But, we can do better. I created some helper classes to do so (see attached file).

First, DatabaseBase<TContext>. This holds our tables, and provides DataContext helper functions Use and Query. Second, TableBase<TItem, TKey>. This actually provides our CRUD methods. I don't think anyone is overly interested in the implementation details (comment if I'm wrong), so here's how you declare your CRUD types:

class MembersDatabase : DatabaseBase<MembersDataContext>
{
    public static readonly TableBase<Account, int> Accounts 
        = CreateTable(dc => dc.Accounts, a => a.AccountId);
}

That's it.

You create a new class to serve as your "database" class. All that's required here is to inherit from DatabaseBase.

Next, for each table, simply create a new field via CreateTable. The first parameter is a lambda function that selects the right table off the DataContext. The second parameter is a lambda expression that selects the primary key. Not much too it.

So, how does our previous chunk of code look with this small helper library?

// Select
int someId = 123; // Passed from another tier
var someAccount = MembersDatabase.Accounts.SelectByKey(someId);

// Insert
var myAccount = new Account();
myAccount.Email = "me@contoso.com";
MembersDatabase.Accounts.Insert(myAccount);

// Update
int myId = 1; // Id and email passed from another tier
string newEmail = "me@me.com";
var changedAccount = new Account();
changedAccount.AccountId = myId;
changedAccount.Email = newEmail;
MembersDatabase.Accounts.Update(changedAccount);

// Delete
int idToKill = 2; // Passed from another tier
MembersDatabase.Accounts.Delete(idToKill);

That's about 40% less code. It’s far more straightforward, being a single block.

To start using this code, just drop DatabaseBase.cs into your project. It adds DatabaseBase to System.Data.Linq. Then subclass as shown above, and you're on your way to LINQ updating bliss. What do you think?

DatabaseBase.cs (5.65 KB)

P.S. At any rate, I should get points for the Zelda pun, right?

Update: I forgot to mention, you'll want to turn UpdateCheck to Never on your columns in the LINQ-to-SQL designer.
Update: The code (including the Tuple class) for the .NET 3.5 RTM release is here: http://www.atrevido.net/blog/2008/06/26/LINQ+To+The+CRUD+RTM.aspx

Code
Sunday, August 26, 2007 8:30:32 PM UTC  #    Comments [10]  |  Trackback

# Wednesday, August 22, 2007
Reason #52 against Visual Basic (Nothing in Visual Basic)

What people do in their own time in the privacy of their homes is none of my business. However, when they mess with reading documentation, then it crosses the line and becomes annoying. How many times do VB developers need to be told that a null is "Nothing"? Consider this snippet from MSDN:

-------
The CreateUser method will return a null reference (Nothing in Visual Basic) if password is an empty string or a null reference (Nothing in Visual Basic), username is an empty string or a null reference (Nothing in Visual Basic) or contains a comma (,), passwordQuestion is not a null reference (Nothing in Visual Basic) and contains an empty string, or passwordAnswer is not a null reference (Nothing in Visual Basic) and contains an empty string.
-------

Five times in one paragraph! I know null type systems are annoying and lead to errors, but that seems a bit excessive. Seriously though, it'd make more sense to make VB developers learn a few words once, rather than having to mess up documentation just in case they get confused.

Code | Humour
Wednesday, August 22, 2007 2:49:34 PM UTC  #    Comments [5]  |  Trackback

# Thursday, August 16, 2007
Practical Functional C# - Part III – Loops are Evil

Be sure to read these first two articles:

    Practical Functional C# - Part I

    Practical Functional C# - Part II

OK let's start with a quick challenge. Write an accurate description of the following program, in English:

static void Main(string[] args)
{
    string output = "";
    if (args.Length > 0) output = args[0]; 
    if (args.Length > 1) {
        for (int i = 1; i < args.Length; i++) {
            output += ", " + args[i]; 
        }
    }
    Console.WriteLine(output);
}

Well? The specification might be something like "a program that writes its arguments separated by a comma and space". But how quickly could you determine that from the code? How much additional time does it take to determine there aren't any bugs? Every time a maintenance developer comes across this code, they have to analyze this code, determine the boundary conditions correctly, verify the indexing, and so on. Every time someone reads this code, she must pay a high tax. The saddest part is that this is an extremely common pattern.

Edit: As Chad Hower pointed out in the comments that you can remove the two if statements by performing a check inside the loop. That shortens it considerably and reduces some of the "tax" that has to be paid when you read it.

"Take this set of values and aggregate them into a single value". How many pieces of code do exactly this, but obscure it behind a for loop? Functional languages realize this and provide functions called "Fold" or "Reduce". Such functions take an accumulator function, apply it to every element in the list, then return the accumulator value when finished. C# 3.0, courtesy of LINQ, provides an equivalent function, called "Aggregate":

static void Main(string[] args)
{
    string output = args.DefaultIfEmpty("").Aggregate(
        (accum, item) => accum += ", " + item); 
    Console.WriteLine(output);
}

Here we are saying that we are going to aggregate args into a single string value. The lambda on the second line takes two arguments. The first is the accumulator and Aggregate passes it to each element ("threads" it through). The second parameter is the current item we are working with. The return value of our lambda is simply the concatenation of the current accumulated value, comma and space, and the current item. This return value becomes the accumulator for the next item. On the first execution, since we did not give it an explicit seed value, it just uses the first item. The final return value becomes the value that Aggregate returns to output. (Edited: Added DefaultIfEmpty -- this overload of Aggregate doesn't work on empty sequences.) Using the lambda provides nicer syntax than this equivalent code:

static void Main(string[] args)

    string output = args.DefaultIfEmpty("").Aggregate(joinCommaSpace); 
    Console.WriteLine(output);
}
static string joinCommaSpace(string a, string b)

    return a + ", " + b;
}

So why is this a good thing? Well, it goes back to the questions about the first set of code: how much effort is required to determine intent and correctness of a particular piece of code? In the imperative way, we need eight lines of code, with three distinct paths. Using a functional approach, we have three lines of code and only one code path. The only serious objection that I've heard is that this code is "unfamiliar". Well, sure, anything new might be unfamiliar, but that does not make it bad.

You wouldn't write your SQL code to make someone only familiar with C "comfortable" or "familiar", would you? You wouldn't write in C# the same way you'd write in C or BASIC. So why stick with outdated programming practices just because some "new" developer might get confused? Functional code is more concise, less error prone, and much more readable. Learning this style might take a couple of days, but it's an invaluable skill. At any rate, LINQ is built upon these concepts, so it will do people good to learn anyways.

Now, let's examine how to create our own functions to hide loops. This time, we're going to look at data access. A very common pattern in data access is creating a SqlDataReader, going through it, adding elements to a list. Like other patterns, overhead obscures the intent:

public static List<Person> GetAllPeople()
{
    // Setup command
    var comm = new SqlCommand("GetAllPeople");
    comm.CommandType = CommandType.StoredProcedure;

    // Setup connection
    using (var conn = new SqlConnection(Settings.ConnectionString)) {
        comm.Connection = conn; 
        conn.Open();

        // Loop and add people to our list
        using (var reader = comm.ExecuteReader()) {
            var people = new List<Person>();
            while (reader.Read()) {
                var p = new Person();
                p.Name = reader.GetString(0); 
                p.Age = reader.GetInt32(1); 
                people.Add(p); 
            }
            // Done
            return people; 
        }
    }
}

Yes, something as simple as reading a list of two-field type can take 14 lines of code. Let's do something about that. The only unique part is where we create a Person from the SqlDataReader. Outside of that, it's a very straightforward, but large, pattern. Refactoring the pattern, we get this:

public static List<T> ListFromReader<T>(string connectionString,
SqlCommand command, Func<SqlDataReader, T> code)
{
    // Setup connection
    using (var conn = new SqlConnection(connectionString)) {
        command.Connection = conn; 
        conn.Open();

        // Loop into list
        using (var reader = command.ExecuteReader()) {
            var list = new List<T>();
            while (reader.Read()) {
                // Here we call the supplied code to add the right item
                T item = code(reader); 
                list.Add(item); 
            }
            return list; 
        }
    }
}

Now our code to GetAllPeople is very simple:

public static List<Person> GetAllPeople2()
{
    var comm = new SqlCommand("GetAllPeople");
    comm.CommandType = CommandType.StoredProcedure;

    return ListFromReader(Settings.ConnectionString, comm, 
        reader => new Person {
            Name = reader.GetString(0), 
            Age = reader.GetInt32(1) 
        });
}

Just be looking at this code, we know all the data-API stuff is handled correctly. This approach is vastly superior to other common SQL "helpers", for example, an ExecuteReader method that gives us a SqlDataReader. First, we have no locals that need disposing or other cleanup: this function scopes the variables to our lambda. Second, we can focus on our actual logic (creating a Person) rather than dealing with loop conditions.

Edit: This is not specific to C# 3.0! You can definately achieve a lot of the same benefits in 2.0, except you need to replace the simple lambda syntax ( => ) with the much more verbose delegate (ArgType arg) { } (anonymous method) syntax. C# 3.0 just makes it much easier to write.

What other common loops do you encounter? I have a few more we'll cover in the next article. As always, comments, insults and suggestions are welcome.

Code
Thursday, August 16, 2007 4:36:01 PM UTC  #    Comments [21]  |  Trackback

# Monday, August 13, 2007
Practical Functional C# - Part II

Previous article: Practical Functional C# - Part I

Last time I demonstrated how to replicate the using keyword as a function that takes a function. In this article, I'm going to show some real-world cases where you'll see a major improvement by taking a more functional approach. I'm going to use WCF as an example space.

WCF allows us to define services as normal C# interfaces. We then use a factory to create a proxy for an interface, specifying the URI, binding types (HTTP, binary, message queue), and other options. However, our interest is how this actually looks from the client side; how we actually make calls.

Since a WCF call is actually invoking code on another machine, any number of bad things can happen to our client channel, resulting in exceptions. When this happens, we have to Abort the channel. But if things complete successfully, we just need to Close the channel. Of course, that's not easy enough: if the Close fails, we then need to Abort the channel anyways. Confused? Well, check out this code snippet that uses an interface called ICalculator to add two numbers:

void WcfExample()
{
    int a = 1; 
    int b = 2; 
    int sum; 
    var chanFactory = GetCachedFactory<ICalculator>();
    ICalculator calc = chanFactory.CreateChannel();
    bool error = true
    try {
        sum = calc.Add(a, b); 
        ((IClientChannel)calc).Close();
        error = false
    }
    finally {
        if (error) {
            ((IClientChannel)calc).Abort();
        }
    }
    Console.WriteLine(sum);
}

Ouch! Out of 17 lines of code, only five relate to our problem. The rest (70%) is pure, ugly overhead. This is a much more complicated pattern than the using pattern, and from experience, I can tell you it is error-prone. Fortunately, this pattern decomposes nicely. The only two unique things are the name of the interface and the statement that acts on the interface. Here is one way you might go about writing this generically:

TReturn UseService<TChannel, TReturn>(Func<TChannel, TReturn> code)
{
    var chanFactory = GetCachedFactory<TChannel>();
    TChannel channel = chanFactory.CreateChannel();
    bool error = true
    try {
        TReturn result = code(channel); 
        ((IClientChannel)channel).Close();
        error = false
        return result; 
    }
    finally {
        if (error) {
            ((IClientChannel)channel).Abort();
        }
    }
}

This is exactly like the previous code, but we've substituted TChannel and a function parameter named code instead of our actual types. The type of function we want is Func<TChannel, TReturn>. You can think of this as saying: "transforms a TChannel into a TReturn". Now, look at the beauty this allows on the client side:

void WcfExample2()
{
    int a = 1; 
    int b = 2; 
    int sum = UseService((ICalculator calc) => 
        calc.Add(a, b)); 
    Console.WriteLine(sum);
}

Presto! Our overhead went from 12 lines to zero. If that didn't sell you on the power of functions, I suggest finding another career. Now, there is a lambda there, but all it is saying is "Here is a function that takes one parameter of type ICalculator and returns the value of Add(a, b)." The C# compiler automatically infers the return type (int).

Some of you mentioned still using C# 2.0. Two things: First, you can use the 3.0 compiler and run on 2.0 because these examples don't reference any new assemblies. Second, you can achieve the same thing using anonymous methods, but they just look a bit uglier:

void WcfExample3()
{
    // C# 2.0 example with anonymous methods
    int a = 1; 
    int b = 2; 
    int sum = UseService<ICalculator, int>(delegate(ICalculator calc) {
        return calc.Add(a, b); 
    });
    Console.WriteLine(sum);
}

By this time, I hope you are thinking about places in your code where you can extract a larger pattern and use functions instead. The key concept to understand here is that your source code should reflect your solution. You should not have extra code sitting around just for the sake of appeasing the platform or meeting some "design pattern". Let the platform and libraries take care of the details, and let your code focus on actually solving problems. As always, I welcome your comments, suggestions, and insults, so please let me know what you think!

Next up, we'll take a look at loops, see why they are harmful, and then go about fixing them.

Code
Monday, August 13, 2007 11:07:26 PM UTC  #    Comments [8]  |  Trackback

# Sunday, August 12, 2007
Practical Functional C# - Part I

Edit: This first article might be a bit dense at times. However, I promise, if you stick through and read at least until Parts II and III (Loops are Evil) you will see major benefits that will totally transform your code (even C# 2.0 code!). In fact, to the first 3 people that can show me that these practices will NOT result in better code in many enterprise apps, I'll give $100 each. Post a blog comment or email me (mgg AT telefinity dot com).

Redundancy in source code is a common degeneration. But for the C# programmer, her weapons to eliminate it have been unwieldy at best. She has been able to eliminate simple, blocky, patterns, but truly writing reusable code at a fine level has not been easy.

This series will demonstrate how you can take advantage of functional programming (FP) in your work, today. The first task is to dispense with the notion that functional programming is difficult and strange. Most FP articles start with a recursive definition of the Fibonacci sequence and then talk about currying functions. Here, we're going to start off with something that every C# programmer has used many times. I can guarantee that after you absorb this series of articles, you'll be writing more concise, less buggy, more powerful software. I've seen many C# modules get cut down drastically in size. I've seen new C# code written that does multithreading and it was written correctly on the first go. I want you to see these things too.

Refactoring out common blocks of code is familiar. Probably every developer has written a helper method to initialize a database command or prepare a commonly used object. Unfortunately, refactoring only tends to happen to contiguous blocks. We do not see refactoring of complex patterns that do not fit into neat blocks. Consider the following visualizations of a program. The red blocks are the unique parts of the program, and the grey ones represent common statements.


                                             becomes

It is easy to refactor the first sequence: move the common parts into another function and call it. However, the second pattern poses quite a problem. The individual "common" blocks are too simple to move into another method. The call to the refactor method would be the same as the method itself! How can we refactor it correctly?

Well, C# itself contains keywords that refactor some of these patterns. Consider the using keyword. The following two methods are nearly equivalent:

void usingPatternExample()
{
    string result;
    StreamReader sr = new StreamReader(filename);
    try {
        result = sr.ReadToEnd();
    }
    finally {
        if (sr != null) sr.Dispose();
    }
}

void
usingKeywordExample()
{
    string result; 
    using (StreamReader sr = new StreamReader(filename)) {
        result = sr.ReadToEnd();
    }
}

I think it is obvious why everyone uses the using keyword over writing out this long init-try-something-finally-dispose pattern repeatedly. When we read code that uses the using keyword, the intent is instantly clear. We don't need to go validate the pattern to make sure it does what we think it does, we can just see "using" and know that it is correct. As an extra benefit, we can declare our disposable variable inside the using scope, so that we don't accidentally use it after it is disposed.

This is all good and dandy, until we realize that we are stuck with the handful of patterns defined as C# keywords! If I wanted to declare the variable result as an output of the using statement, I'm out of luck. If I want to initialize multiple objects in one using block, then that is just too bad. Part of the reason for this is that it was extremely ugly to do this in C# before version 3.0. Consider this C# 1.0 example to recreate the using keyword as a user-defined function:

delegate void UsingAction(IDisposable obj);
void Using(IDisposable obj, UsingAction action)
{
    try { action(obj); }
    finally { if (obj != null) obj.Dispose(); }
}

So far, so good – nothing ugly yet. The usage of this user-defined Using function on the other hand:

string result;
void UsingExample1()
{
    Using(new StreamReader(filename), new UsingAction(readIntoResult));
}
void readIntoResult(IDisposable obj)
{
    StreamReader sr = (StreamReader)obj; 
    result = sr.ReadToEnd();
}

Atrocious! The pattern of the code is shattered and complicated to follow. The main part of the code (getting the result) is forced to sit in a separate method many lines away. We're required to use fields to pass data in or out. And just to top it off, lack of generics forces a cast – we had to write StreamReader three times. The utter syntax hides our intention. In short, it is completely unusable.

C# 2.0 makes some progress. First, we can change UsingAction to Action<T> and allow the caller to specify the type. All that is needed is a constraint on the type to IDisposable. The new declaration of Using looks like this:

void Using<T>(T obj, Action<T> action) where T : IDisposable

This says that we need something of type "T", and the only restriction is that T must be a type that implements IDisposable. Then, we want an Action that works on that same type T, whatever it is. By using T instead of a specific type, we are essentially saying "we don't care about what type your object is; our function works with all sorts of types". This is the source of the name "generics". Our code is more generic; it's not tied to a specific type. Generics are an extremely important tool for refactoring patterns because patterns occur across different types.

To clean up the calling code, C# 2.0 has a feature called anonymous methods. They are exactly as they sound: methods without names. They can be declared inline, right along with the rest of the code. Additionally, they "capture" local variables, making it unnecessary to declare fields to pass data in and out.

void UsingExample2()
{
    string result; 
    Using(new StreamReader(filename), delegate(StreamReader sr) {
        result = sr.ReadToEnd();
    });
}

What a major improvement! C# 2 can infer the type of T for Using, avoiding having to write Using<StreamReader>. Apart from having to type StreamReader twice, this code is starting to approach the level of clarity that the using keyword provides.

C# 3.0 introduces lambdas. Now, that's usually a scary word from functional programming, but it's actually very simple. For C#, it's just an easier way of writing anonymous methods. Instead of having to write "delegate(ParamType paramName) { }", we can just do "paramName =>". That's all there is to it! Nothing scary at all. For example, if I want to write a lambda to add two numbers, I’d write it like this:

Func<int, int, int> add = (int a, int b) => a + b;

The part Func<int, int, int> says we have a function that takes two ints and returns an int. Next, we declare our two integer parameters: (int a, int b). Then, we define the lambda body by using the => operator. Our body consists only of “a + b”. With lambdas, if we only have one statement, we don’t need to explicitly use braces or the return keyword. Alternatively, we could have written it as:  

Func<int, int, int> add = (int a, int b) => { return a + b; };

Behind the scenes, the compiler goes and creates a method and creates the Func delegate on top of it. So, armed with this new syntax and power, let’s take another look at our Using method:

void UsingExample3()
{
  string result;
  Using(new StreamReader(file), sr =>
    result = sr.ReadToEnd()
  );
}

void usingKeywordExample()
{
  string result;
  using(var sr = new StreamReader(file)) {
    result = sr.ReadToEnd();
  }
}

This shows there is nothing that special about the built-in keywords. It is also the first step in demonstrating that some parts of functional programming are not very foreign to the C# programmer. The next articles in this series will provide some real-world application of these ideas and show how you can reduce the amount of code you have to write (and reduce the number of bugs at the same time!).

Code
Sunday, August 12, 2007 11:21:32 PM UTC  #    Comments [5]  |  Trackback

# Thursday, August 02, 2007
C# Frustration

C# with lambda syntax and extension methods (in lieu of compositional operators) gets us so far, but the syntax and compiler could use a bit of polish. I'll show some exact examples later, but meanwhile this picture sums up my feelings at the moment:

Notype1.png

Edited to add: Inspired by http://xkcd.com/297/

Edited to add: Just to be clear, this isn't a compiler or IDE issue: it's a spec issue (AFAIK). The C# spec simply doesn't allow certain things (like type inference from a method group).

Code | Humour
Thursday, August 02, 2007 7:39:10 PM UTC  #    Comments [7]  |  Trackback

# Wednesday, April 04, 2007
SQL Replication on a cluster: Error authenticating proxy

We were rolling out a new database that is transactionally replicated to a few other nodes. In test and staging, everything worked fine, but in production, on a cluster, the distribution job failed. The snapshot runs as the SQL Agent account, but the distro runs as a separate account to distro just that database to the subscribers. The error is:

Unable to start execution of step 2 (reason: Error authenticating proxy DOMAIN\SomeUserName, system error: Logon failure: unknown user name or bad password.). The step failed.

 

We spent about an hour trying to figure out what was going wrong -- all the ACLs were right, the user was in the PAL. Everything was identical in permissions to the other environments.

After a bit of time on the line with PSS, we noticed that if we ran everything as the SQL Agent Account (the cluster service), then it worked. But, this required adding the cluster's account to the subscriber DB, and that was not acceptable.

Finally, our PSS rep suggested we check that the SQL Agent account was trusted for delegation. Bingo. On staging and test, the SQL Agent account is Network Service (or Local System). But in a cluster, it runs as a separate account, and that account is not trusted for delegation. Hence, the impersonation call failed. Simply going into ADUC and trusting it for Kerberos delegation, then restarting the SQL Agent, allowed us to use the proxy accounts without problem.

It seems like this message comes up a lot in context of replication and clusters. Hope this helps someone else!

Misc. Technology
Wednesday, April 04, 2007 1:24:11 AM UTC  #    Comments [1]  |  Trackback

# Tuesday, March 13, 2007
Outsourced Evolutionary Programming

My friend just dealt with an oursourced project. Yes, outsourced as in sending it to a place that charges a lot less money than, say, developers who actually know what they're doing.

One of their deliverables was a program that compressed an XML string into a gzip file. Should be a minor thing, right? The C#/.NET 2 code is less than 10 lines. Well, their first delivery produced files that contained the text "System.Byte[]". This was not accepted and they vowed to look into it in more as they were sure the code was correct.

Their next files were a bit larger and supposedly were really correct this time. But still, no zip program could read the data. Well, a quick look at the beginning of the file shows the bytes EF BB BF -- the UTF-8 BOM. The rest of the file was ASCII digits. Yes, they wrote the bytestream out as a UTF-8 interpretation.

If we define evolution as "the non-random survival of randomly mutating replicators", we can define their approach as "the non-random acceptance of randomly mutating programs."

Code | Humour
Tuesday, March 13, 2007 5:51:08 PM UTC  #    Comments [2]  |  Trackback

# Tuesday, August 08, 2006
Wow, Nikon's VR is as good as they say

Having just got a dSLR (Nikon D50, rather low-end) with the cheap kit-lens (some Quantaray lens), I had been looking for a better lens. Most of the photos I take are inside, mainly snapshots. I've never seen flash photos that look that great, so I prefer to use the existing light. Of course, indoors, this makes for some really nasty blur in most cases. There's just not enough light inside my house to be able to get a fast enough shutter speed.

I read some reviews about the Nikon VR (Vibration Reduction) lenses, and that they supposedly compensate for 3 full stops. I tried one VR lens out in a store, and it seemed pretty interesting, so I ordered the Nikkor 18-200 VR. Unfortunately, Nikon apparently doesn't know anything about logistics or manufacturing, and this lens is backordered 2-6 months (depending on who you ask). Since I don't want to be taking crappy shots for that long, I grabbed the Nikkor 24-120 VR as a temporary lens. Well, I'm not sure it'll help my crappy shots, but at least they won't be blurry.

I got home, it's dusk, and decided to try it out. For fun, I set the exposure to 1 second, and well, let the results speak for themselves:


VR Off:


VR On:


Ignore the actual value of the image (I'm not a photographer), but just look at the blur. Granted, I was lying against a sofa, so I had a bit of something to lean on, but the difference with VR on == wow! So, given a bit of support, I can take pictures, handheld, with exposures > 1/15, no problem. Worth every dollar.

Yes, I know this is nothing new, but I'm so impressed I wanted to write it down (also helps my rationalization on buying the lens :)).

Photography
Tuesday, August 08, 2006 2:01:21 AM UTC  #    Comments [1]  |  Trackback