Semi-Literate Programming with C#

Posted on 12/22/2009

Recently, I've been reading the book Coders at Work, where author Peter Seibel interviews a number of well-known developers.  One of the questions that he often asks is whether they have tried Literate Programming, an idea introduced by Donald Knuth in the early 1980s.  Although most of them say no, some of them have tried it briefly.  They usually come to the conclusion that it's an interesting idea, but that the available tooling doesn't make it a realistic solution.

The author's insistence on asking them about this got me thinking about some of the problems that I've encountered during my career.  Could literate programming be applied today to help solve some of the problems that we all face?

A Use Case

I started thinking about the types of code where these questions tend to come up. It's usually code that implements business rules defined by business analysts.  Let's use an example.  Say you work at a company that needs to create billing statements for clients.  When you process the statements, there is a series of fees that you must attach to the bill based on certain conditions.

The rules may have been explained to the developer in a face to face meeting like this:
  • Apply a 7% tax on the principal when the client is in Florida
  • A $50 flat fee will be applied when a New York based client maintains a principal of more than $1000
So I, as a developer, scuttle away and write the program to specification. The application goes to production and everyone is happy. Two years pass, most of the staff who were around when the system was first deployed have moved on, and the new staff have a question about why certain fees are being charged to a given client.

In an ideal world, the business would refer back to the documentation it created when it wants to know how a system you programmed behaves. In the real world, a more likely scenario is that someone asks you over the phone about some obscure section of the code, and you end up cracking open the source to figure out what that piece is doing.

The Solution?

I want a solution that lets me write code and, without any manual intervention, lets other developers and end users understand what the business logic is doing.  Of course, most people will point to the XML documentation feature of C#, along with auto-documentation products like Sandcastle, and suggest that this is enough.  However, maintaining XML comments violates the "without manual intervention" part of my own requirements. It also produces output that end users can't really consume.

There is another issue that most people probably don't think about: a lot of the code in today's projects is not really useful to document. Serialization code, parsing code, data access code: most of that is pretty standard.  Developers will easily understand it, assuming they already know how to use APIs like ADO.NET and WCF, and users won't care about it.  That leaves the fundamental logic that is the raison d'être for your application in the first place. This is what I am interested in making easily available for a human to read.

For the solution, I wrote a simple Rule class:
    using System;
    using System.Linq.Expressions;

    public class Rule<T>
    {
        private Expression<Action<T>> expression;
        private Expression<Func<T, bool>> evalExpression;
        private Action<T> compiled;
        private Func<T, bool> evalCompiled;

        public void Execute(T context) { if (this.evalCompiled(context)) { this.compiled(context); } }

        // Each setter keeps the expression tree (for printing) and a compiled delegate (for running).
        public Expression<Func<T, bool>> Evaluation
        {
            get { return this.evalExpression; }
            set { this.evalExpression = value; this.evalCompiled = value.Compile(); }
        }

        public Expression<Action<T>> Action
        {
            get { return this.expression; }
            set { this.expression = value; this.compiled = value.Compile(); }
        }

        public override string ToString() { return string.Format("if {0} then {1}", this.Evaluation.Body, this.Action.Body); }
    }
This class takes, as a generic parameter, a context that represents one item to be processed.  You set two lambda expressions: the Evaluation and the Action.  The Evaluation returns true if the Action is to be applied.  An example can be seen below:
    List<Rule<BizContext>> rules = new List<Rule<BizContext>>();

    rules.Add(new Rule<BizContext>()
    {
        Evaluation = c => c.State == Florida,
        Action = c => c.Fees.Add(c.Principal * .07M)
    });

    rules.Add(new Rule<BizContext>()
    {
        Evaluation = c => c.State == NewYork && c.Principal > 1000.00M,
        Action = c => c.Fees.Add(50.00M)
    });
"BizContext" in the above code can contain anything that pertains to the item that needs to be processed.  In our case, the analyst's rules say that we need to operate based on the principal and client's state, and add fees.  So those are the properties that the context contains. Because the rules were added to a list, you can iterate through the list and call the rule class' "Execute" method.
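    // "context" is assumed to be the BizContext item currently being processed.
    foreach (var rule in rules)
    {
        rule.Execute(context);
    }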
So far, there's nothing groundbreaking about the Rule class. I'm sure many of you have written something similar time and time again. But here's where the literate programming comes into play. Because the "Evaluation" and "Action" properties are actually Expressions, we have access to the textual representation of the code in addition to being able to execute it.
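
As a minimal sketch of that dual nature, here is the same trick on a single expression, outside of the Rule class. The names are made up for illustration, and I use an int rather than a decimal only to keep the printed text simple:

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionDemo
{
    public static void Main()
    {
        // Assigning a lambda to Expression<...> makes the compiler build an
        // expression tree (data describing the code) instead of a delegate.
        Expression<Func<int, bool>> overLimit = principal => principal > 1000;

        // The tree can be rendered as text, e.g. (principal > 1000)
        Console.WriteLine(overLimit.Body);

        // ...and compiled into a delegate that actually runs.
        Func<int, bool> compiled = overLimit.Compile();
        Console.WriteLine(compiled(1500));   // True
        Console.WriteLine(compiled(500));    // False
    }
}
```

The Rule class does exactly this twice per rule: once for the condition and once for the action.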

The overridden ToString method on the class outputs an easy-to-understand description of each business rule, built from the actual code that executes when the rule runs. So for the two rules defined above, you can get a printout like this:
    if (c.State = Florida) then c.Fees.Add((c.Principal * 0.07))
    if ((c.State = NewYork) && (c.Principal > 1000.00)) then c.Fees.Add(50.00)
The end user gets a realistic printout of the actual business logic in the system on demand, and the developer doesn't have to do anything to update it when the business logic changes.
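
Putting the pieces together, here is a rough, self-contained sketch of the whole idea. BizContext, its string-typed State property, and the Florida and NewYork constants are my assumptions about what the surrounding code might look like. Rule<T> is restated in a compact constructor-based form only so that the block compiles on its own (it needs C# 6 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical context: just the data the two sample rules touch.
public class BizContext
{
    public string State { get; set; }
    public decimal Principal { get; set; }
    public List<decimal> Fees { get; } = new List<decimal>();
}

// Compact restatement of the Rule<T> idea: keep each expression tree for
// printing, and a compiled delegate for running.
public class Rule<T>
{
    private readonly Func<T, bool> evalCompiled;
    private readonly Action<T> compiled;

    public Rule(Expression<Func<T, bool>> evaluation, Expression<Action<T>> action)
    {
        Evaluation = evaluation;
        Action = action;
        evalCompiled = evaluation.Compile();
        compiled = action.Compile();
    }

    public Expression<Func<T, bool>> Evaluation { get; }
    public Expression<Action<T>> Action { get; }

    public void Execute(T context)
    {
        if (evalCompiled(context)) { compiled(context); }
    }

    public override string ToString()
    {
        return string.Format("if {0} then {1}", Evaluation.Body, Action.Body);
    }
}

public static class Program
{
    // Assumed constants standing in for the Florida / NewYork identifiers.
    private const string Florida = "Florida";
    private const string NewYork = "New York";

    public static void Main()
    {
        var rules = new List<Rule<BizContext>>
        {
            new Rule<BizContext>(c => c.State == Florida,
                                 c => c.Fees.Add(c.Principal * .07M)),
            new Rule<BizContext>(c => c.State == NewYork && c.Principal > 1000.00M,
                                 c => c.Fees.Add(50.00M)),
        };

        // Process one New York client: only the second rule should fire.
        var client = new BizContext { State = NewYork, Principal = 2500.00M };
        foreach (var rule in rules)
        {
            rule.Execute(client);
        }
        Console.WriteLine("Fees charged: {0}", client.Fees.Count);   // Fees charged: 1

        // The on-demand "documentation": one line per rule.
        foreach (var rule in rules)
        {
            Console.WriteLine(rule);
        }
    }
}
```

The exact text printed for each rule depends on how the runtime renders expression trees. As far as I know, newer .NET versions print == for equality, string constants appear in quotes, and decimals follow the current culture, so treat the printout above as representative rather than exact.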

So there you have it. I wouldn't exactly call it full-fledged literate programming with C# in the way that it was originally described.  But I think it embodies the central quality of literate programming, where documentation and code are one and the same. It's a compromise, and it would be interesting to see whether this approach holds up in a real-world scenario.

Any takers? :-)
