Archive for November, 2009

Qizmt: MapReduce Framework in C#

I was recently surprised to find that MySpace had open sourced a distributed “MapReduce Framework” called Qizmt (http://qizmt.myspace.com/).  From the site’s description:

MySpace Qizmt [kiz-mit] is a mapreduce framework for both developing and executing distributed computing applications on large clusters of Windows servers.

This is a topic that I’ve been interested in for a while, so I’m glad to see that someone has been making progress in this space for the .NET world.  Dryad remains an interesting prospect, and has apparently even seen production use in Microsoft’s ad service; however, it’s clear that it hasn’t been “productionized” yet.

One interesting aspect of Qizmt is that it seems they paid a lot of attention to making it easy to deploy.  In my opinion, the ability for a developer to easily get started with a one machine install is a key enabler for the uptake of any new technology.  Hopefully with more competition for technologies such as this, we will see some cool options become available to us as developers for high performance computing.

Executing PowerShell Scripts via C#

When Dave asked me for some help with a little side project of his that he was researching, I jumped at the chance.  The requirement was to execute a PowerShell script programmatically and pass in some parameters that were gathered from a simple form.

I had been wanting to learn more about PowerShell since it came out (its original codename was Monad) and this was the perfect opportunity.  The end result is a simple little static class that you can use to execute a PowerShell script and pass in some parameters in a strongly typed fashion.

Here’s a sample usage:

string result = PowerShell.Execute(
    @"c:\users\joel\dev\script.ps1",
    () => new
    {
        server = formserver,
        fname = formname
    });

Although there are quite a few little nuances involved in the execution from a command line, once you figure them out it’s quite easy.  The class basically just does a Process.Start on the powershell.exe command and passes in some command line arguments that execute the ps1 file.

I opted to do this instead of the cleaner API that is available via hosting the PowerShell runtime because that adds a dependency on System.Management.Automation.dll, which must be installed with the Windows SDK.  I didn’t want to introduce this dependency for the project, so the command line method was preferable.

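Below is the class, along with the using statements it needs.  Enjoy!

using System;
using System.Diagnostics;
using System.Linq.Expressions;
using System.Text;

public static class PowerShell
{
    public static string Execute(string scriptPath, Expression<Func<object>> parameters)
    {
        string shellPath = "powershell.exe";
        StringBuilder sb = new StringBuilder();
        sb.AppendFormat("\"& '{0}'\" ", scriptPath);

        // The lambda body must be an anonymous-type initializer: () => new { name = value, ... }
        NewExpression n = parameters.Body as NewExpression;
        if (n == null || n.Members == null)
            throw new ArgumentException("Expected a lambda of the form () => new { ... }.", "parameters");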

        for (int i = 0; i < n.Members.Count; i++)
        {
            var member = n.Members[i];
            var value = n.Arguments[i];
            string paramValue;
            if (value is MemberExpression)
            {
                paramValue = Expression.Lambda(value).Compile().DynamicInvoke().ToString();
            }
            else
            {
                paramValue = value.ToString().Replace("\"", string.Empty);
            }
            sb.AppendFormat(" -{0} {1}", member.Name.Replace("get_", ""), paramValue);
        }

        string result = ExecuteCommand(shellPath, sb.ToString());
        return result;
    }

    private static string ExecuteCommand(string shellPath, string arguments)
    {
        arguments = "-noprofile " + arguments;
        using (var process = new Process())
        {
            process.StartInfo.UseShellExecute = false;
            process.StartInfo.FileName = shellPath;
            process.StartInfo.Arguments = arguments;
            process.StartInfo.RedirectStandardOutput = true;
            // Standard error is deliberately not redirected: a redirected stream
            // that is never read can fill its buffer and block the child process.

            process.Start();

            // Read all of the output before waiting, so a full pipe cannot stall the script.
            string result = process.StandardOutput.ReadToEnd();
            process.WaitForExit();
            return result;
        }
    }
}
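
To make the expression-tree step easier to follow, here is a minimal, self-contained sketch of just that part: it reads the name/value pairs out of a lambda of the form () => new { ... } and turns them into command line fragments.  The ParameterReader class and its Read and ToArguments methods are hypothetical names used only for illustration, and the quoting rule (wrap each value in single quotes and double any embedded single quote) is my assumption about what PowerShell will accept, not something the class above does.  It is there because the class above passes values through unquoted, so a value containing a space would likely be split into several arguments.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Reflection;
using System.Text;

public static class ParameterReader
{
    // Returns (name, value) pairs for a lambda such as () => new { server = x, fname = y }.
    public static List<KeyValuePair<string, string>> Read(Expression<Func<object>> parameters)
    {
        var body = parameters.Body as NewExpression;
        if (body == null || body.Members == null)
            throw new ArgumentException("Expected () => new { ... }", "parameters");

        var pairs = new List<KeyValuePair<string, string>>();
        for (int i = 0; i < body.Members.Count; i++)
        {
            MemberInfo member = body.Members[i];
            string name = member.Name;

            // Some runtimes report the property getter (get_server) rather than the property.
            if (member is MethodInfo && name.StartsWith("get_", StringComparison.Ordinal))
                name = name.Substring(4);

            // Compile and run only this argument to obtain its current value.
            object value = Expression.Lambda(body.Arguments[i]).Compile().DynamicInvoke();
            pairs.Add(new KeyValuePair<string, string>(name, Convert.ToString(value)));
        }
        return pairs;
    }

    // Builds " -name 'value'" fragments (assumed quoting: single quotes, doubled when embedded).
    public static string ToArguments(Expression<Func<object>> parameters)
    {
        var sb = new StringBuilder();
        foreach (var pair in Read(parameters))
            sb.AppendFormat(" -{0} '{1}'", pair.Key, pair.Value.Replace("'", "''"));
        return sb.ToString();
    }
}
```

Calling ParameterReader.ToArguments(() => new { server = "db 01", fname = "O'Brien" }) yields the string " -server 'db 01' -fname 'O''Brien'", which could be appended after the script invocation in place of the unquoted values.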

Static Access to Request-Specific Data

I wrote a post over on the nGenSoft Blog talking about how to gain Static Access to Request-Specific Data:

Update: repost of original text can now be found below

As we have all come to learn in the last decade plus of web development, web applications are inherently stateless. Unlike their native client cousins, every request must be treated as if it were made in isolation from any other user action. This tends to complicate application-level concerns. More often than not, people just end up polluting their application by mixing code that services the HTTP request with their business logic.

We wanted a way to keep application-related plumbing, such as database connections, neatly maintained without always having to worry about the stateless nature of HTTP requests. We noticed that ASP.NET has a really nice pattern for this in the HttpContext.Current property. This is a static property that contains information about only the current request … at first I couldn’t figure out how this works, because ASP.NET is by nature a multi-threaded environment. How was it segregating the information, which is accessed statically, to each individual request?

After doing some research online, I finally figured out a great way to maintain request-level state across different components (i.e. HTTP module -> HTTP handler -> MVC action filter -> etc.). I did a good bit of searching, and found it succinctly put in a blog post by Scott Hanselman:

http://www.hanselman.com/blog/ATaleOfTwoTechniquesTheThreadStaticAttributeAndSystemWebHttpContextCurrentItems.aspx

I started off by looking at (i.e. Reflectoring) how the enigmatic HttpContext.Current works. It turns out there’s a lot of magic going on under the hood there, involving the web hosting framework and, further down, .NET Remoting. In the end, it looks like there are two simple ways to solve this problem:

  • The [ThreadStatic] attribute, which gives you a separate instance of your static field *per* thread.
  • HttpContext.Current.Items, which is obviously only usable in the context of ASP.NET, but correctly manages your scope for the lifetime of the request.

As hanselman puts it:

Today’s lesson learned: the [ThreadStatic] attribute is only useful when YOU control the ThreadPool (and the lifecycle of the threads).
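
As a minimal sketch of the first option, the console-style snippet below shows that each thread gets its own copy of a [ThreadStatic] field.  The ThreadStaticDemo class and its method names are hypothetical and exist only for illustration.  The same behaviour is what makes the attribute risky under ASP.NET, where you do not control which thread services a request.

```csharp
using System;
using System.Threading;

public static class ThreadStaticDemo
{
    // One independent slot per thread; every thread starts with the default value (0).
    [ThreadStatic]
    private static int counter;

    // Increments the counter on a brand new thread and returns what that thread saw.
    public static int CountOnNewThread(int times)
    {
        int observed = -1;
        var worker = new Thread(() =>
        {
            for (int i = 0; i < times; i++) counter++;
            observed = counter;
        });
        worker.Start();
        worker.Join();
        return observed;
    }

    // Increments the counter on the calling thread; the value accumulates across calls.
    public static int CountOnCurrentThread(int times)
    {
        for (int i = 0; i < times; i++) counter++;
        return counter;
    }
}
```

Each call to CountOnNewThread(n) returns n, because every new thread starts from zero, while repeated calls to CountOnCurrentThread from the same thread keep adding to that thread’s own copy.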

So it seems that in order to solve the problem we need to adapt our strategy to the host. If our app is running in a local client (i.e. stateful), we can either use the [ThreadStatic] attribute, or nothing at all if we don’t plan on doing complex multithreading. However, if we are executing our application’s code in an ASP.NET app, we need to use HttpContext.Current.Items. Armed with this knowledge, we can have a small initialization step that sets up the strategy for managing this per-request information. So in the Application_Start method of Global.asax, we can do something like:

AppContext.SetEnvironment(new WebEnvironment());

Thus, in ASP.NET you have an implementation that knows how to provide the proper scoping for that hosting environment. AppContext is defined as:

public interface IAppEnvironment
{
    // interface members are implicitly public
    AppContext Current { get; set; }
}
public class AppContext
{
    // instance data
    public IDatabase Database { get; set; }

    // static lifecycle
    private static IAppEnvironment environment;

    public static void SetEnvironment(IAppEnvironment env) { environment = env; }

    public static AppContext Current
    {
        get { return environment.Current; }
        set { environment.Current = value; }
    }
}

The instance data can be whatever you want … in the case of a data-driven app, it can maintain a request-level database connection and whatever other information we need to refer to (which you can easily reach by just saying “AppContext.Current.Database”). The static “Current” property that everyone uses simply defers to the environment implementation. Below are two implementations of IAppEnvironment: one that you can use from ASP.NET, and a custom one that you can use in a console app or unit test.

public class WebEnvironment : IAppEnvironment
{
    public AppContext Current
    {
        get { return HttpContext.Current.Items["appcontext"] as AppContext; }
        set { HttpContext.Current.Items["appcontext"] = value; }
    }
}

public class CustomEnvironment : IAppEnvironment
{
    [ThreadStatic]
    private static AppContext context;

    public AppContext Current
    {
        get { return context; }
        set { context = value; }
    }
}

The CustomEnvironment implementation above just uses the simple [ThreadStatic] attribute, since it assumes that you will be managing the hosting environment (threading and all) … whereas in WebEnvironment, you can defer to HttpContext, since that is handled for you.

Techniques such as these let you focus on your application, while limiting the amount of time that you have to spend worrying about the complexities of adapting your application to run in a web application.
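
Here is a rough sketch of how the custom environment might be exercised from a console app or unit test.  The types are repeated in trimmed form so that the snippet compiles on its own; the Name property is a stand-in I added in place of Database, and IsolationDemo is a hypothetical helper, so treat this as an illustration rather than the actual project code.

```csharp
using System;
using System.Threading;

public interface IAppEnvironment
{
    AppContext Current { get; set; }
}

public class AppContext
{
    // Stand-in for the Database property (assumption, for illustration only).
    public string Name { get; set; }

    private static IAppEnvironment environment;

    public static void SetEnvironment(IAppEnvironment env) { environment = env; }

    public static AppContext Current
    {
        get { return environment.Current; }
        set { environment.Current = value; }
    }
}

public class CustomEnvironment : IAppEnvironment
{
    [ThreadStatic]
    private static AppContext context;

    public AppContext Current
    {
        get { return context; }
        set { context = value; }
    }
}

public static class IsolationDemo
{
    // Sets a context on the calling thread, then reports what a second thread sees.
    public static string Run()
    {
        AppContext.SetEnvironment(new CustomEnvironment());
        AppContext.Current = new AppContext { Name = "main" };

        string seenByWorker = null;
        var worker = new Thread(() =>
        {
            // The worker has its own slot, so Current is null until it assigns one.
            bool startedEmpty = AppContext.Current == null;
            AppContext.Current = new AppContext { Name = "worker" };
            seenByWorker = startedEmpty + "/" + AppContext.Current.Name;
        });
        worker.Start();
        worker.Join();

        // The calling thread's context is untouched by the worker's assignment.
        return AppContext.Current.Name + "," + seenByWorker;
    }
}
```

IsolationDemo.Run() returns "main,True/worker": the worker thread starts with no context of its own, and setting one there leaves the calling thread’s context alone.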
