5 things you probably didn’t know about .NET WebSockets


As most of you probably already know, WebSocket provides full-duplex communication over a single TCP connection. .NET 4.5 added support for WebSockets as part of the BCL. In this article I am going to talk about a few of the subtleties that you need to think about.

  1. Connection upgrade is somewhat expensive
    WebSocket connections are initiated as traditional HTTP connections. The client then requests an “upgrade” to a WebSocket session, and this upgrade process is relatively expensive. If you are interested in performance you may want to pool a set of connections that are already upgraded and hand out connections from the pool.
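    To make the pooling idea concrete, here is a minimal sketch; the `WebSocketPool` class and its members are illustrative names of mine, not part of the BCL:

```csharp
using System;
using System.Collections.Concurrent;
using System.Net.WebSockets;
using System.Threading;
using System.Threading.Tasks;

public sealed class WebSocketPool
{
    private readonly ConcurrentBag<ClientWebSocket> pool = new ConcurrentBag<ClientWebSocket>();
    private readonly Uri endpoint;

    public WebSocketPool(Uri endpoint) { this.endpoint = endpoint; }

    // Pay the upgrade cost up front, before the first caller needs a socket.
    public async Task WarmUpAsync(int count)
    {
        for (int i = 0; i < count; ++i)
        {
            var socket = new ClientWebSocket();
            await socket.ConnectAsync(this.endpoint, CancellationToken.None);
            this.pool.Add(socket);
        }
    }

    // Hand out an already-upgraded connection, falling back to a fresh one.
    public async Task<ClientWebSocket> AcquireAsync()
    {
        ClientWebSocket socket;
        if (this.pool.TryTake(out socket) && socket.State == WebSocketState.Open)
            return socket;

        socket = new ClientWebSocket();
        await socket.ConnectAsync(this.endpoint, CancellationToken.None);
        return socket;
    }

    public void Release(ClientWebSocket socket)
    {
        if (socket.State == WebSocketState.Open)
            this.pool.Add(socket);
    }
}
```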
  2. Simultaneous sends or receives
    While you can do a send and a receive simultaneously, you can only do one of each at a time. In other words, at any given point in time you can have only a single pending send and a single pending receive. Various workarounds exist for this limitation. For instance, SignalR uses a queue. The other option is to provide synchronization using a ManualResetEvent – there are pros and cons to both, so you need to think about what makes sense for your specific application.
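    As a sketch of the synchronization approach, here is one way to serialize sends so that only one `SendAsync` is ever pending. The `SerializedSender` wrapper is an illustrative name of mine, and I've used a `SemaphoreSlim` in place of the `ManualResetEvent` mentioned above because it composes better with async code:

```csharp
using System;
using System.Net.WebSockets;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

public sealed class SerializedSender
{
    private readonly WebSocket socket;
    private readonly SemaphoreSlim sendLock = new SemaphoreSlim(1, 1);

    public SerializedSender(WebSocket socket) { this.socket = socket; }

    public async Task SendAsync(string message)
    {
        var buffer = new ArraySegment<byte>(Encoding.UTF8.GetBytes(message));

        // Wait until the previous send has completed before starting a new one.
        await this.sendLock.WaitAsync();
        try
        {
            await this.socket.SendAsync(buffer, WebSocketMessageType.Text, true, CancellationToken.None);
        }
        finally
        {
            this.sendLock.Release();
        }
    }
}
```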
  3. Teardown
    There are two ways to close a WebSocket connection. The graceful way is CloseAsync, which sends a close message to the connected party and waits for an acknowledgement. The keyword here is sends. Remember from the previous point that you can only have a single pending send at any given point in time? So if you are sending data and simultaneously call CloseAsync, you get an exception, because CloseAsync will also try to send a message. The other option is CloseOutputAsync, which is more of a “fire-and-forget” approach.
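    A minimal sketch of the fire-and-forget teardown (the helper name is illustrative):

```csharp
using System.Net.WebSockets;
using System.Threading;
using System.Threading.Tasks;

public static class SocketTeardown
{
    // "Fire-and-forget" close: CloseOutputAsync sends the close frame but
    // does not wait for the peer's acknowledgement, so it cannot collide
    // with a concurrent receive the way a full CloseAsync handshake can.
    public static Task CloseQuietlyAsync(WebSocket socket)
    {
        return socket.CloseOutputAsync(
            WebSocketCloseStatus.NormalClosure, "closing", CancellationToken.None);
    }
}
```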
  4. COM Exceptions?
    Based on some of our testing, certain COM-level exceptions can occur under high-load conditions. Once again, if you look at the SignalR implementation, you can treat these types of exceptions as non-fatal:

    0x800703e3 – The I/O operation has been aborted because of either a thread exit or application request
    0x800704cd – The remote host closed the connection
    0x80070026 – Reached the end-of-file
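    One way to treat these as non-fatal, similar in spirit to what SignalR does, is to compare the `COMException` error code against a known list. The helper below is an illustrative sketch of mine, not SignalR's actual code:

```csharp
using System;
using System.Runtime.InteropServices;

public static class WebSocketErrors
{
    // The HRESULTs listed above, which showed up under load and can be
    // treated as non-fatal (the peer went away; the service is fine).
    private static readonly int[] NonFatal =
    {
        unchecked((int)0x800703e3), // I/O aborted (thread exit / app request)
        unchecked((int)0x800704cd), // remote host closed the connection
        unchecked((int)0x80070026), // reached the end-of-file
    };

    public static bool IsNonFatal(Exception e)
    {
        var com = e as COMException;
        return com != null && Array.IndexOf(NonFatal, com.ErrorCode) >= 0;
    }
}
```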

  5. Unobserved Exceptions
    Any unobserved exceptions (background thread exceptions that weren’t caught) can cause your WebSocket to get into an aborted state. This is because the .NET 4.5 implementation of WebSocket adds a TaskScheduler.UnobservedTaskException handler and aborts the connection for any exceptions that propagate up to it. So you have a couple of choices here. First, make sure that you don’t have unobserved exceptions (which would mean you have an issue you aren’t even aware of): if you call any method that initiates a Task, make sure you store it and add a continuation to observe its exceptions. The other option is to add a TaskScheduler.UnobservedTaskException handler yourself to see what exceptions you are missing.
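    Both options can be sketched as follows (`DoWorkAsync` is a placeholder for any fire-and-forget work in your app):

```csharp
using System;
using System.Threading.Tasks;

public static class Program
{
    public static void Main()
    {
        // Option 2: attach a handler to see which exceptions you are missing.
        TaskScheduler.UnobservedTaskException += (sender, args) =>
        {
            Console.WriteLine("Unobserved: " + args.Exception);
            args.SetObserved(); // mark as observed so it doesn't escalate
        };

        // Option 1: store the task and add a continuation to observe faults.
        Task task = DoWorkAsync();
        task.ContinueWith(
            t => Console.WriteLine("Observed: " + t.Exception),
            TaskContinuationOptions.OnlyOnFaulted);

        task.Wait();
    }

    private static Task DoWorkAsync()
    {
        return Task.Run(() => { /* background work */ });
    }
}
```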

Building a service execution pipeline

Most software built today has a notion of a client and a service. This is even more true with mobile/web applications, because you want your client apps to do as little as possible and your service to do most of the heavy lifting. This allows you to improve your service without requiring constant client updates. Also, since you have a single service that potentially serves various native clients (iOS, Android, or Windows Phone), being able to update it independently of your users gives you a clear competitive edge.

So today I want to focus on building a service based on this notion of a “pipeline”. Most .NET client/service frameworks already have this concept. For instance, Windows Communication Foundation (WCF) has an execution pipeline, and so does ASP.NET. This allows you to extend the behavior of these frameworks at various points during execution. But that’s not really what I want to talk about today. I want to talk about how you can build your services such that you can add or remove functionality using a pipeline-style execution. As we talk through the implementation of this pipeline, I also want to take the opportunity to discuss good software design practices and their uses.



Building distinct components that perform a very specific functionality and nothing more
If you don’t build services this way, our natural inclination as developers will be to pile on top of the code that already exists. Think about authentication, for example: your service may authenticate users against a local database today, but you want to add support for Facebook or Twitter. Assuming you have a component that’s responsible for authenticating against the local database, the immediate thought is to update that authentication logic to support the additional parties. The problem with this is that you then simply cannot tease apart Facebook from Twitter from your local authentication. They are all in one bucket: either you have them all or you don’t. And the same can be said about many other types of functionality within your system.

Being able to enable/disable functionality of your service without code change
One of the key decisions you need to make when building a service is to recognize the inevitable truth that the dependencies of your service will break down. Period. If you program knowing this fact, you are more likely to build resilient systems that can withstand these dramatic eruptions. One of the key ways to do this is to design your services so that they can degrade in performance or functionality instead of completely breaking down. The only way to do this properly is to identify and isolate specific pieces of functionality in your system and have the ability to enable or disable them. Like we said earlier, if all your functionality is buried in a bucket of water, you’ve already muddied it – there’s nothing you can do when a bad drop of oil is poured in. But if each arbitrary subset of water droplets were contained in a packet, and you could identify which was the bad breed, you could easily remove it from the container and let the rest continue to function.


So the way to solve this problem is to build on a somewhat modified chain-of-responsibility pattern, where each item in the chain handles a single responsibility and then forwards the request to the next party in the chain, and the execution continues. Theoretically every non-shared chunk of code could become an item in the chain, but then it becomes difficult to stitch them together. So the right balance is to isolate a feature in an item, and then string together the chain to build the overall functionality.

But we are sort of getting ahead of ourselves. In order to build a fully functional, high-performance execution pipeline, we’ll need to build many foundational pieces. So instead of doing that, today we will start with a simple, synchronous, one-way, non-hierarchical execution pipeline. Let’s get started with a simple console application:

static void Main(string[] args)
{
    // configure the MEF container
    using (var container = ConfigureMef())
    {
        // create a pipeline flow - logging/fake response/terminate
        IPipeline pipeline = new LoggingPipeline(new FakeResponsePipeline());

        // inject the pipeline into the container
        container.ComposeExportedValue(pipeline);

        // resolve the root type - a simple http server
        var server = container.GetExportedValue<Server>();

        // start the server
        var service = server.Start(@"");

        // keep accepting connections
        Console.ReadLine();
    } // dispose
}

First, notice that I am using the Managed Extensibility Framework (MEF) as my DI container. Any other DI container would work here, but MEF let me stick with the .NET Framework and didn’t need any additional configuration – which was nice for this simple example.

The next thing to notice is that we have an interface for our pipeline, which we prepare externally from the actual service. In this case we are saying that the pipeline will consist of Logging and a FakeResponse – and injecting that into our server. Our server will then execute the pipeline following the chain. It’s easy to see how we could externalize this configuration to a config file, allowing us to modify the behavior of the service without making a code change. We will talk more about this a little later; for now, let’s look at how the server is configured:

public class Server
{
    private readonly IHttpListener listener;
    private readonly IPipeline pipeline;

    public Server(IHttpListener listener, IPipeline pipeline)
    {
        this.listener = listener;
        this.pipeline = pipeline;
    }

    public async Task Start(string address)
    {
        // keep listening
        while (true)
        {
            // wait for a listener
            var context = await this.listener.GetContextAsync().ConfigureAwait(false);

            // initiate the pipeline and forget
            this.pipeline.Continue(context);
        }
    }
}

This is probably one of the simplest HTTP servers possible. It accepts a request and passes it forward to the first item in the pipeline. That’s it. Recall that the pipeline was built externally to the server and injected into it.

public interface IPipeline
{
    void Continue(HttpListenerContext listenerContext);
}

The execution pipeline interface is extremely simple: it exposes a single Continue operation that moves the request to the next item in the pipeline. One of the things you want to do when you build interfaces is to think of the minimal set that satisfies what you are trying to do. The leaner your interfaces, the less likely they are to change, and therefore the less impact they have on the overall system. The recommendation for most interfaces is to have no more than 3-4 methods. The .NET Framework interfaces, not surprisingly, average 3.75 members, with a methods-to-properties ratio of 3.5:1. If your interfaces start to have more than 10 methods, you’re probably building more than one responsibility into an interface, and there’s probably an opportunity to separate them. This task is often referred to as decomposition.

Now that we have the base interface, we need to build an abstract concept that allows us to move to the next item in the pipeline, because the interface does not define that – it just says you should be able to continue. For that we create an abstract BaseContinuationPipeline.

public abstract class BaseContinuationPipeline : IPipeline
{
    private IPipeline forward;

    public BaseContinuationPipeline()
    {
        // default is the terminating pipeline
        this.forward = new TerminatingPipeline();
    }

    public BaseContinuationPipeline(IPipeline forward)
    {
        this.forward = forward;
    }

    public virtual void Continue(HttpListenerContext listenerContext)
    {
        Task.Run(() =>
        {
            // execute the "abstract" action
            this.Execute(listenerContext);

            // continue to the next action
            this.forward.Continue(listenerContext);
        });
    }

    protected abstract void Execute(HttpListenerContext listenerContext);
}

There are a few things to point out here. The first is that the default forward continuation is a special item in the pipeline called the terminating pipeline, whose sole purpose is to end the request. This pattern of using a special object to handle the empty case is referred to as the Null Object Pattern. There were several ways to implement this special case. For instance, I could have simply left the forward object null and, if it was null, prevented the forwarding and ended the request. The problem with that is you are implementing a special case when the normal control flow of your logic can already support it. The second thing is that there might be other parts of the pipeline where I may decide to end the request, in which case I only need to forward it to the terminating pipeline (no repeated logic). One good sign of well-designed software is its overall lack of conditional statements. If you think about it, conditional statements (if/else, switch) are sort of a forced detour from your normal control flow. Now, I am not saying remove all conditional statements – that would be absurd; there is no way to check if 5 > 3 without an if statement – but for business objects you should think twice if you are constantly checking whether an object is null. Also notice that the default is the terminating pipeline. This means that even if the caller does not pass any feature sets to the pipeline, the default behavior is to stop the execution. You always want a safe design, meaning your APIs are foolproof such that no matter how you interact with them, the baseline behavior is at least functional.

So let’s keep moving forward. This abstract class does not know about any features, so it simply describes how to move forward to the next continuation and keeps the execution abstract. The current item in the pipeline is executed via this.Execute, and then the request is forwarded to the next responsible party. That’s it.

public class TerminatingPipeline : IPipeline
{
    public void Continue(HttpListenerContext listenerContext)
    {
        // end the response stream
        listenerContext.Response.Close();

        // no more forwarding
    }
}

So, as we discussed, the terminating pipeline implements the IPipeline interface, and its continuation simply ends the request; there’s nowhere to forward to, since this is always the last item in the chain. And to close, let’s look at the two items in our pipeline, the logging and the fake response:

public class LoggingPipeline : BaseContinuationPipeline
{
    public LoggingPipeline(IPipeline forward) : base(forward) { }

    protected override void Execute(HttpListenerContext listenerContext)
    {
        Console.WriteLine(listenerContext.Request.Url);
    }
}

public class FakeResponsePipeline : BaseContinuationPipeline
{
    protected override void Execute(HttpListenerContext listenerContext)
    {
        string response = "hello world";
        byte[] buffer = Encoding.ASCII.GetBytes(response);
        listenerContext.Response.OutputStream.Write(buffer, 0, buffer.Length);
    }
}

They both extend BaseContinuationPipeline because they both support continuations. In fact, all our pipeline elements will implement continuations except the special TerminatingPipeline. The logging pipeline simply prints the incoming URL, and the FakeResponsePipeline writes hello world to the output.
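As an aside, the config-file externalization mentioned earlier could be sketched like this, assuming each pipeline type exposes a constructor that takes the next IPipeline; the PipelineFactory class and the type names are illustrative, not part of the downloadable source:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PipelineFactory
{
    // Build the chain from an ordered list of type names (e.g. read from
    // app.config), so the pipeline can change without a code change.
    public static IPipeline Build(IEnumerable<string> typeNames)
    {
        // Start with the terminator and wrap it right-to-left, so the
        // first configured name becomes the head of the chain.
        IPipeline next = new TerminatingPipeline();
        foreach (var name in typeNames.Reverse())
        {
            var type = Type.GetType(name, throwOnError: true);
            next = (IPipeline)Activator.CreateInstance(type, next);
        }
        return next;
    }
}

// Usage (hypothetical type names):
// IPipeline pipeline = PipelineFactory.Build(
//     new[] { "MyApp.LoggingPipeline", "MyApp.FakeResponsePipeline" });
```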

With that we have a foundation to build on. You can download the source code for this entire implementation. I will continue this discussion with more functionality, adding support for hierarchical, non-sequential, and stateful pipelines – all of which will be required to build a fully functional service.


Sebastian Junger (pronounced Younger) was on Bill Maher last night. Junger is an award-winning Afghanistan war correspondent and director, and his latest documentary, Which Way Is the Front Line from Here?, has been proclaimed a success at the Sundance Film Festival this year.

During the interview Bill asked why war felt like an addiction to some soldiers. In particular, what psychology drove this behavior? To that Sebastian responded:

“The consequences in war are huge. The consequences even of small things. You don’t tie your shoe, you trip in a firefight, someone gets killed. And it gives you this strange, almost Zen-like focus on the details of life — and everything starts to feel very meaningful, and friendships feel meaningful, everything has this kind of intensity. And soldiers miss that sense of meaning and the bond that arises in that situation.”

That’s a very insightful answer, and it’s probably why it stuck with me. No matter what I was doing, I kept thinking about what he said and how it applied to everything we do. His response wasn’t that soldiers do it for our country, or that they do it because they want to help humanity. While I am sure there’s a component of that, the reality is that from an individual’s temporal perspective these grand reasons are too hard to see, and therefore can’t be the source of motivation. I think to a certain degree the same is true for most professionals – it’s debatable whether you can compare a soldier to any other profession, but the general idea still applies. For instance, as software engineers, we build things that can change someone’s life, and we like to think that is the reason why we do what we do. But is it really? When we are trying to solve a hard technical problem at midnight, are we really doing it because it has an impact on someone’s life? Or is it something else? Self-awareness is key to happiness. If we know what it is that attracts us, we can accentuate it, or maybe even find other activities that lead to it.

Without self-awareness the truth tends to hide behind the social norms and hype around you. The truth can hide even from yourself if you’re not careful. If you are lucky enough to love what you do, spend a day, take a walk, and ask yourself: what is it that you really enjoy? Keep peeling away until there’s no more, and hopefully what you’re left with is your kernel of happiness.

Introduction to Machine Learning

In most computer science programs, machine learning is usually a graduate-level course. It’s a specialization within the field of artificial intelligence, which is often thought of as more of a theoretical study than a practical one. And yet machine learning today is used heavily to solve real problems. Our team, for instance, uses it to build acoustic models for speech recognition. It’s no longer just theory; it’s applied science. But if you wanted to start in this field – which I suspect is going to play a major role in software in the future – where do you start? I came across a free textbook from Professor Max Welling of UCI Computer Science. His textbook “A First Encounter with Machine Learning” is available for free. While it’s not exactly bedside reading, it is written for engineers who are interested in learning about the various machine learning algorithms available today.

.NET and Node.JS – Performance Comparison (Updated)

Update (3/31/2013 – 11:41 PM PST): This article has been updated! As many readers have pointed out, the node.js async package is not actually asynchronous, which is what the original article was based on. I made an assumption I should not have. I have since rerun the tests taking this into account, along with some of the changes recommended by Guillaume Lecomte. I decided to update this existing post so that there’s no confusion about the data in the future. Thank you everyone for all the comments and posts, and for keeping me sane.

Update (3/29/2013 – 3:43 PM PST): There have been a lot of valid comments around the use of the async NPM package for node.js. I will take them into account and re-run these tests.

If you talk to any Silicon Valley startup today, chances are you will hear about node.js. One of the key reasons, most argue, is that node.js is fast and scalable because of its forced non-blocking IO and its efficient use of a single-threaded model. I personally love JavaScript, so being able to use JavaScript on the server side seemed like a clear gain. But I was never really sold on the notion that node.js is supremely fast because there aren’t any context switches or thread synchronizations. We all know these practices should be minimized in any multi-threaded program, but to give them up entirely seemed extreme. Still, if that meant consistently higher performance, then sure, it would make sense. So I wanted to test this theory. I wanted to find out exactly how fast node.js was compared to .NET – as empirically as possible.

So I wanted to come up with a problem that involved IO (ideally not a database) and some computation. And I wanted to do it under load, so that I could see how each system behaves under pressure. I came up with the following problem: I have approximately 200 files, each containing somewhere between 10 and 30 thousand random decimals. Each request to the server contains a number, such as /1 or /120; the service then opens the corresponding file, reads the contents, sorts them in memory, and outputs the median value. That’s it. Our goal is to reach a maximum of 200 simultaneous requests, so the idea is that each request has a corresponding file without ever overlapping.
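The per-request computation is small enough to sketch in a few lines; the helper below is illustrative, using the same simple length/2 pick on the sorted data that both services use:

```csharp
using System;

public static class MedianCalculator
{
    // What each request does after reading its file: sort the values
    // in memory and take the middle element.
    public static double Median(double[] values)
    {
        var copy = (double[])values.Clone();
        Array.Sort(copy);
        return copy[copy.Length / 2];
    }
}
```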

I also wanted a level playing field for the two platforms (.NET and node.js). For instance, I didn’t want to host the .NET service on IIS, because it seemed unfair to pay the cost of all the things IIS comes with (caching, routing, performance counters) only to never use them. I also avoided the entire ASP.NET pipeline, including MVC, for the same reason: they all come with features we don’t care about in this case.

Okay, so both .NET and node.js will host a basic HTTP listener. What about the client? The plan here is to create a simple .NET console app that drives load to the service. While the client is written in .NET, the key point is that we test both the .NET and node.js services using the same client, so at a minimum, how the client is written is a negligible factor. Before we delve into the details, let’s look at the graph that shows the results:

.NET and Node.JS - Performance Comparison

Performance of sorting numbers between .NET and Node.JS

On average, node.js wins hands down, even though there are a few spikes that could be attributed to various disk-related anomalies, as some readers have alluded to. I also want to clarify that if you look at the graph carefully, you start to see the two lines intersect towards the end of the test run. While that might give you the impression that over time the performance of .NET and node.js converges, the reality is that .NET starts to suffer even more over time. Let’s look at each aspect of this test more carefully.

We’ll start with the client. The client uses an HttpClient to drive requests to the service. The response times are measured on the client side so that implementation differences in the services can’t skew our numbers. Notice that I avoided doing any Console.Write (which blocks) until the very end.

public void Start()
{
    Task[] tasks = new Task[this.tasks];

    for (int i = 0; i < this.tasks; ++i)
    {
        tasks[i] = this.Perform(i);
    }

    // wait for all outstanding requests to finish
    Task.WaitAll(tasks);
}

public async Task Perform(int state)
{
    string url = String.Format("{0}{1}", this.baseUrl, state.ToString().PadLeft(3, '0'));
    var client = new HttpClient();
    Stopwatch timer = Stopwatch.StartNew();

    string result = await client.GetStringAsync(url);

    this.result.Enqueue(String.Format("{0,4}\t{1,5}\t{2}", url, timer.ElapsedMilliseconds, result));
}

With that client, we can start looking at the services. First, the node.js implementation. One of the beauties of node.js is its succinct syntax. In fewer than 40 lines of code we are able to fork processes based on the number of CPU cores and share the CPU-bound work amongst them.

var http = require('http');
var fs = require('fs');
var cluster = require('cluster');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
    // Fork workers.
    for (var i = 0; i < numCPUs; i++) {
        cluster.fork();
    }

    cluster.on('exit', function(worker, code, signal) {
        console.log('worker ' + worker.process.pid + ' died');
    });
} else {
    http.createServer(function(request, response) {
        var file = parseInt(request.url.substring(1));
        file = file % 200;
        file = String("000" + file).slice(-3);

        // read the file
        fs.readFile('../data/input' + file + '.txt', 'ascii', function(err, data) {
            if (err) {
                response.writeHead(400, {'Content-Type': 'text/plain'});
                response.end();
            } else {
                var results = data.toString().split("\r\n");

                response.writeHead(200, {'Content-Type': 'text/plain'});
                response.end('input' + file + '.txt\t' + results[parseInt(results.length / 2)]);
            }
        });
    }).listen(8080, '');

    console.log('Server running at');
}

And lastly, let’s look at the .NET service implementation. Needless to say, we are using .NET 4.5, with all the glories of async/await. As I mentioned earlier, I wanted to compare pure .NET without IIS or ASP.NET, so I started off with a simple HTTP listener:

public async Task Start()
{
    while (true)
    {
        var context = await this.listener.GetContextAsync();
        this.ProcessRequest(context);
    }
}

With that I am able to start processing each request. As requests come in, I read the file stream asynchronously so I am not blocking a thread-pool thread, and perform the in-memory sort in a simple Task that wraps Array.Sort. With .NET I could have significantly improved performance in this area by using the parallel sorting algorithms that come right out of the Parallel Extensions, but I chose not to because that really isn’t the point of this test.

private async void ProcessRequest(HttpListenerContext context)
{
    try
    {
        var filename = this.GetFileFromUrl(context.Request.Url.PathAndQuery.Substring(1));
        string rawData = string.Empty;

        using (StreamReader reader = new StreamReader(Path.Combine(dataDirectory, filename)))
        {
            rawData = await reader.ReadToEndAsync();
        }

        var sorted = await this.SortAsync(context, rawData);
        var response = encoding.GetBytes(String.Format("{0}\t{1}", filename, sorted[sorted.Length / 2]));

        // set the status before writing the body, since writing sends the headers
        context.Response.StatusCode = (int)HttpStatusCode.OK;
        await context.Response.OutputStream.WriteAsync(response, 0, response.Length);
    }
    catch (Exception)
    {
        context.Response.StatusCode = (int)HttpStatusCode.BadRequest;
    }
    finally
    {
        context.Response.Close();
    }
}

private async Task<string[]> SortAsync(HttpListenerContext context, string rawData)
{
    return await Task.Factory.StartNew(() =>
    {
        var array = rawData.Split(new string[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries);
        Array.Sort(array);

        return array;
    });
}

You can download the entire source code; the zip file includes the client and the service sources for both .NET and node.js. It also includes a tool to generate the random number files, so that you can run the tests on your local machine. You will also find the raw numbers in the zip file.

I hope this was useful as you decide which framework to build your next service on. For most startups the key pivot points are performance and scalability over anything else, and node.js clearly shines, as we’ve shown today.

Also, please remember some of the comments below are based on the original article which was using the async NPM package. This article has since been updated with the corrected information.