The Path from Fibers to Async/Await

The Problem: Run User Code Asynchronously

Our Node-based server product, Synchro Server, allows our customers to build mobile apps that run in the cloud. Our users are developers who write Synchro apps that run under the Synchro Server (Node.js) environment. A significant amount of the code that users run on our platform is asynchronous in nature, for things like talking to databases or other network resources, interacting with REST APIs, etc.

We wanted a way to call our users' module entry points that allowed them to run asynchronous code as needed, but without adding unnecessary complexity for the majority of cases where an entry point was not doing anything asynchronous. We didn't want to pass a Node.js-style completion callback to every entry point, or make every entry point return a Promise, for example, as those approaches would have been ugly and would have added a potential point of implementation failure for every non-asynchronous entry point. And we wanted to avoid callback hell.

Our First Solution: Fibers

Fibers to the rescue! Fibers is a module created and maintained by Marcel Laverdet, a guy who clearly has better things to do. Nevertheless, Marcel actively maintains fibers (the open issues are generally under control, and he has been very responsive in the couple of cases where we have had issues).

In addition to being actively maintained by a rock star developer, fibers is also used by a significant number of projects. Most notable to us was that it is employed by Meteor, a widely-used commercial platform.

We used fibers via a Node package called wait.for, which provides a very thin wrapper around fibers, and is particularly handy for calling Node-style async functions on a fiber. With wait.for, you launch a fiber like this:

var http = require('http');
var wait = require('wait.for');

var server = http.createServer(  
  function(req, res){
    console.log('req!');
    wait.launchFiber(handler,req,res); // handle in a fiber, keep node spinning 
  }).listen(8000);

And then you can do async actions on a fiber like this:

var dns = require("dns");  
var wait = require('wait.for');

function handler(req, res) {
  var addresses = wait.for(dns.resolve4,"google.com"); // async
  for (var i = 0; i < addresses.length; i++) {
    var a = addresses[i];
    console.log("reverse for " + a + ": " + JSON.stringify(wait.for(dns.reverse,a)));
  }
}

Our Problems with Fibers

We used fibers for over a year and it generally worked well, but over that time we developed a number of concerns, as outlined below.

  • Single maintainer - While Marcel has been awesome, he is just one person. If he got bored, or got hit by a meat truck, it's not clear that there is anyone else who would (or could) take over and move the project forward.
  • Complex solution - Fibers is some serious black-magic voodoo code. It's a lot of C code that requires a deep understanding of Node internals, and is multi-platform with some pretty complex platform-specific threading/coroutine code.
  • Binary component - Fibers builds as a binary component. For development in your own controlled environment, this is usually not an issue (you need build tools installed and configured, and npm runs the builds automatically as needed). But it can cause problems when deploying your app to a cloud service, or when users try to install your solution without a full dev environment (including a compiler), particularly on Windows. We have run into these issues and had to deal with them.
  • Not the way Node.js is going - For better or worse, Node.js has been moving (very slowly) toward a supported, integrated approach to asynchronous processing, and that is not based on fibers.

It's not clear that any of these issues by themselves would have made us go looking for a better solution, but the combination was enough to make us revisit our choice to use fibers and to form the criteria we used when evaluating other potential solutions.

The Future: Async/Await

JavaScript does have a proposed solution for asynchronous processing using the async and await keywords. That solution also relies on promises (blech, IMO). The problem is that it isn't coming anytime soon (it's proposed for ES7). When it does materialize, it will almost certainly become the accepted and practiced standard for doing async processing in Node.
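For a sense of what that proposed syntax looks like, here is a hypothetical sketch. The resolve4 function here is a made-up promise-returning stand-in (not the real dns API), just so the example is self-contained:

```javascript
// Hypothetical sketch of the proposed async/await syntax.
// resolve4 is a made-up promise-returning stand-in, not the real dns API.
function resolve4(hostname) {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve(["1.2.3.4"]); }, 0);
  });
}

async function handler() {
  var addresses = await resolve4("google.com"); // suspends without blocking Node
  return addresses.length;
}
```

Note that an async function returns a Promise to its caller, which is why the solution is tied to promises.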

While there are a number of transpilers out there that will provide an implementation of async/await today, we didn't feel that a transpiler solution was appropriate.

The Path to Async/Await

In researching contemporary asynchronous processing techniques and solutions for Node, we found a couple of excellent resources. First is a great article by Tim Caswell that compares fibers to generators. Another really great piece by Thomas Hunter II makes a strong case for the solution we ended up choosing: generators/yield+co. This solution is by far the closest thing to the eventual async/await support, and it will provide a very clean and easy migration path to async/await when that finally gets here.

The CO library is, in their words, "the ultimate generator based flow-control goodness for nodejs (supports thunks, promises, etc)". CO is primarily maintained by a couple of people, but the big difference between CO and fibers in terms of maintainability is that CO is implemented in a couple of hundred lines of fairly straightforward JavaScript. CO is also used by approximately 10x as many projects as fibers (according to a crude survey of dependent projects on npm).
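To give a sense of why a co-style driver can be that small, here is a minimal sketch of the core idea (ours, not the actual co source, and handling only Promise yieldables): step the generator, and whenever it yields a Promise, resume it with the resolved value.

```javascript
// Minimal sketch of a co-style generator driver (Promise yieldables only).
// This is our illustration of the idea, not the real co implementation.
function run(gen) {
  var it = gen();
  return new Promise(function (resolve, reject) {
    function step(value) {
      var r = it.next(value);            // resume the generator
      if (r.done) return resolve(r.value);
      Promise.resolve(r.value).then(step, reject); // wait, then resume again
    }
    step();
  });
}

run(function* () {
  var x = yield Promise.resolve(21);     // "blocks" here without blocking Node
  return x * 2;
}).then(function (v) { console.log(v); }); // logs 42
```

The real co adds error propagation into the generator, and support for thunks, arrays, and other yieldables, but the control flow is essentially this loop.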

CO functions similarly to fibers, in that you use CO to execute a top level function, then inside of that function or anything called from it, you can do asynchronous operations (in this case using yield). Here is how you launch a function to support asynchronous processing with CO:

var http = require('http');
var co = require('co');

var server = http.createServer(  
  function(req, res){
    console.log('req!');
    co(function*(){ yield handler(req,res); }); // handle with CO, keep node spinning  
  }).listen(8000);

And here is how you can do async actions under CO:

var dns = require("dns");

function * dnsReverse(address) {  
  return yield function(cb){ dns.reverse(address, cb); };
}

function * handler(req, res) {  
  var addresses = yield function(cb){ dns.resolve4("google.com", cb); }; // async
  for (var i = 0; i < addresses.length; i++) {
    var a = addresses[i];
    console.log("reverse for " + a + ": " + JSON.stringify(yield dnsReverse(a)));
  }
}

Note that in the call to dns.resolve4, we yield to a thunk that we wrap around the Node-style async function call, while in the dnsReverse case we yield to a generator function (which itself yields to a thunk). We did this to illustrate different techniques. You can also yield to Promises and other "yieldable" object types.

For details on the Synchro async solution using generators/yields+co, please see the relevant document in our docs.

The Transition: Lessons Learned

We ported the Synchro Server codebase over the course of two weeks. We have approximately 75 asynchronous functions in our core server code. We have over 200 automated unit tests on the server, the majority of which are asynchronous, plus we have client test suites for Windows/WinPhone, iOS, and Android that exercise the async functionality of the server.

The Whole Call Stack is Involved

One of our favorite things about fibers was that you could launch a fiber to run a top-level request processor, then way down in the bowels of processing a request you could do something asynchronous, without anyone in the call stack ever needing to know about it. With the new technique, as will also be the case with async/await eventually, every function that calls a function which may be async has to yield to that function. The asynchronous nature of every function is made absolutely clear in this process. In fact, we found a number of issues in our code where it wasn't apparent that code being called several functions down might do something asynchronous, which could allow another request to be processed by Node before that lower-level function returned control. In the end, we really appreciated the explicit way the generators/yield+co solution exposes the asynchronous structure of the application at every level of the call stack, and we decided that what we used to see as an advantage of fibers in this respect was actually a liability.
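The propagation requirement can be seen with plain generators, no co needed (a contrived sketch with made-up function names): once the leaf yields, every caller above it must itself be a generator and delegate with yield*.

```javascript
// Contrived sketch: a yield deep in the stack forces every caller
// above it to be a generator too (here delegating via yield*).
function* leaf() {
  var v = yield "pretend-async-operation"; // value supplied by the driver
  return v + "!";
}

function* middle() {
  return yield* leaf(); // must delegate; a plain function couldn't pause here
}

// Stepping manually, the way a driver like co would:
var it = middle();
console.log(it.next().value);         // "pretend-async-operation"
console.log(it.next("result").value); // "result!"
```

The yield bubbles all the way up to the driver, which is exactly why converting one deep function to async forces changes through the whole call chain above it.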

Make Sure to Yield

One of the biggest issues we ran into was that since calling functions didn't know that async operations might be happening, when we converted those lower level functions to be generator functions, we often did not then call those lower level functions using yield. If you call a generator function like a regular function (without using yield), the result you get back will not be the expected return value, but will rather be a generator object representing the function, and none of the code in the generator will be executed. These cases can be incredibly hard to track down. Our solution was to add the suffix "Awaitable" to every generator function, to make it clear that you should use yield when calling it. We were then able to grep our code and make sure that every call to an "Awaitable" function was preceded by a yield.
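The pitfall looks like this (a contrived sketch using our "Awaitable" naming convention, with a made-up function):

```javascript
// Contrived sketch of the pitfall: calling a generator function without
// yield returns a generator object and runs none of its code.
function* fetchCountAwaitable() {
  console.log("not reached until something steps the generator");
  return 42;
}

var result = fetchCountAwaitable(); // forgot yield: result is NOT 42
console.log(typeof result.next);    // "function" -- it's a generator object
console.log(result.next().value);   // 42, but only once the generator is stepped
```

Because no code in the generator runs and nothing throws, the bug surfaces far from its cause, which is what made these cases so hard to track down.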

Summary

We feel that it was time for us to move from fibers to generators/yield+co, both to address some of our concerns about fibers and to put us on a better path to the eventual solution of async/await. Having made the transition (not without some bumps and bruises), we are happy with the end result.