18 Oct 2017

Invariants hidden in callbacks

In this post I will discuss one of my favorite pet peeves: callbacks. This post is programming language independent, though I guess it will shine through that I mainly work in C, C++ and Node.js. Callbacks are super nice for accomplishing a wide array of tasks: separation of concerns, asynchronous execution, future extension, etc. However, there are several problems hidden in how callbacks can be implemented.

The first problem is an API that does not allow a context to be passed through to the callback. In modern languages with closures this is a non-problem, because the function automatically carries the extra state with it. In C, however, you sometimes see a callback (which is a function pointer) used like this:

// Header

// This is how we declare a function pointer type called MyCallback: no args, returning void.

typedef void (*MyCallback)(void);

void registerCallback(MyCallback cb);



// Source

static MyCallback g_registeredCallback = NULL;

void registerCallback(MyCallback cb) {

 g_registeredCallback = cb;

}



void invokeCallback() {

 if (g_registeredCallback != NULL) {

   g_registeredCallback();

 }

}

We register our own callbacks like this:

void callbackFunction() {

 printf("callback was invoked\n");

}

 registerCallback(callbackFunction);

 invokeCallback();

However, the API doesn’t allow me to pass any extra state to the callback, so I can't attach the callback to any "object". The solution is to always pass in an extra context parameter:

// Header

typedef void (*MyCallback)(void *context);

void registerCallback(MyCallback cb, void *context);



// Source

static MyCallback g_registeredCallback = NULL;

static void *g_registeredCallbackContext = NULL;

void registerCallback(MyCallback cb, void *context) {

 g_registeredCallback = cb;

 g_registeredCallbackContext = context;

}



void invokeCallback() {

 if (g_registeredCallback != NULL) {

   g_registeredCallback(g_registeredCallbackContext);

 }

}

Now we have state, and we can connect a specific callback invocation to an object. This is automatically solved in JavaScript since a function reference contains its closure (for non-JavaScript programmers, here is an example of what this means):

function generateCallback() {

 var closureVariable = 32;

 return function() {

   console.log('the closure variable is', closureVariable);

   closureVariable++;

 }

}



let callback1 = generateCallback();

callback1(); // prints 32

callback1(); // prints 33



let callback2 = generateCallback();

callback2(); // prints 32

callback1(); // prints 34
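To tie the closure back to the C example, here is a minimal sketch of a single-slot registry without any context parameter, where the closure plays the role of the context pointer. The Counter class and the counterA and counterB names are hypothetical and only serve as the "object" we attach the callback to.

```javascript
// Single-slot registry mirroring the C API above, with no context parameter.
let registeredCallback = null;

function registerCallback(cb) {
  registeredCallback = cb;
}

function invokeCallback() {
  if (registeredCallback !== null) {
    registeredCallback();
  }
}

// Hypothetical object that we want the callback attached to.
class Counter {
  constructor() {
    this.count = 0;
  }
  increment() {
    this.count++;
  }
}

const counterA = new Counter();
const counterB = new Counter();

// The arrow function closes over counterA, so no explicit context is needed.
registerCallback(() => counterA.increment());

invokeCallback();
invokeCallback();

console.log(counterA.count, counterB.count);
```

Only counterA is touched, because the registered function carries a reference to it.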

The next problem is that when we invoke a callback we need to consider that the callback can do anything. Take the following example, where we store callbacks in an array and later process them and clear the callback queue (this time implemented in JavaScript, so the context problem from above is automatically solved).

let callbacks = [];

function registerCallback(cb) {

 callbacks.push(cb);

}



function processCallbacks() {

 callbacks.forEach(cb => cb());

 callbacks = [];

}

The implementation looks innocent, but consider the following usage:

registerCallback(() => {

 registerCallback(() => {

   console.log('When is this called?')

 });

});

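and boom, we have a problem, and which one depends on how we iterate. With forEach the range of elements is fixed before the first call, so the inner callback is appended to the array but never visited, and then callbacks = [] silently throws it away. With an index-based loop that re-reads callbacks.length, the inner callback runs in the same pass, and a callback that keeps re-registering itself gives an infinite loop. This problem has many variations, for instance if we allow a callback to be unregistered, can we unregister ourselves from within the callback? The common trait for these problems is that we have some invariant that gets violated, i.e. when we wrote the functions we expected the callbacks array to not be modified while we are processing callbacks.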
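The unregister variation can be sketched as follows. The unregisterCallback function is a hypothetical addition to the API above, and the calls array exists only to record what actually ran.

```javascript
let callbacks = [];
const calls = [];

function registerCallback(cb) {
  callbacks.push(cb);
}

// Hypothetical addition, not part of the API shown above.
function unregisterCallback(cb) {
  const index = callbacks.indexOf(cb);
  if (index !== -1) {
    callbacks.splice(index, 1);
  }
}

function processCallbacks() {
  // Iterates the live array, so a splice during the pass shifts the remaining elements.
  callbacks.forEach(cb => cb());
}

function first() {
  calls.push('first');
  unregisterCallback(first); // unregister ourselves from within the callback
}

registerCallback(first);
registerCallback(() => calls.push('second'));
registerCallback(() => calls.push('third'));

processCallbacks();

// 'second' is missing: it moved into the slot that forEach had already visited.
console.log(calls);
```

Nothing crashes here, which is what makes this kind of bug hard to notice: one callback is simply skipped in that pass.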

To ensure we don't violate the invariant we can rewrite like this:

function processCallbacks() {

 let internalCallbacks = callbacks;

 callbacks = [];



 internalCallbacks.forEach(cb => cb());

}

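Now we first take the current array and replace the globally accessible one with a fresh array, so a callback registered while we process callbacks is not executed in this pass. It is kept and runs the next time processCallbacks is called.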
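As a quick check of the rewritten function, here is a small sketch that runs the nested registration from above against it. The log array and the afterFirstPass and afterSecondPass names are only there to record the order of events.

```javascript
let callbacks = [];
const log = [];

function registerCallback(cb) {
  callbacks.push(cb);
}

function processCallbacks() {
  // Take the current batch and install a fresh array before invoking anything.
  const internalCallbacks = callbacks;
  callbacks = [];

  internalCallbacks.forEach(cb => cb());
}

registerCallback(() => {
  log.push('outer');
  // Registered during processing, so it lands in the fresh array.
  registerCallback(() => log.push('inner'));
});

processCallbacks();
const afterFirstPass = log.slice(); // only the outer callback has run

processCallbacks();
const afterSecondPass = log.slice(); // the inner callback runs in the second pass

console.log(afterFirstPass, afterSecondPass);
```

The array being iterated is never modified by the callbacks, so the invariant holds by construction rather than by convention.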

My final point about callbacks is to always have the same state when invoking them. By state I mean the call stack, mutexes held, etc. Consider the following example:

let globalVariable = 0;

function doProcess(callback) {

 if (Math.random() < 0.5) {

   callback(Math.random());

   return;

 }



 globalVariable++;

 callback(Math.random());

}

When the callback is invoked, we do not know whether the global variable has been updated or not. Common versions of this problem are different execution flows where the callback is invoked with different mutexes held along the different paths, or calling a callback both synchronously and asynchronously. A nice solution is to refactor the code so the callback is only ever invoked from one place. Also note that refactoring your code so the callback is invoked in a different state (for instance holding different mutexes or executing on a different thread) can lead to mysterious bugs that are hard to diagnose.
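One way the single-call-site refactoring could look is sketched below. It is only a sketch under assumptions: the injectable random parameter is there so the behavior can be tested, and the extra info argument with its incremented flag is my own addition to make the otherwise implicit state visible to the callback.

```javascript
let globalVariable = 0;

// Hypothetical variant of doProcess; `random` defaults to Math.random.
function doProcess(callback, random = Math.random) {
  // Decide and finish every state change first.
  const incremented = random() >= 0.5;
  if (incremented) {
    globalVariable++;
  }

  // Single call site: no early-return path invokes the callback in another state.
  // The `incremented` flag is an assumption of this sketch, not part of the original.
  callback(random(), { incremented });
}

doProcess((value, info) => {
  console.log('value', value, 'incremented', info.incremented, 'globalVariable', globalVariable);
});
```

The behavior is the same as before, but there is now exactly one place to look at when reasoning about what state the callback runs in.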

To summarize, the bugs come down to breaking invariants, and when the invariants are implicit it can be hard to spot the problems. The invariants can be broken either by the callback doing things the invoker didn't expect or, conversely, by the invoker calling the callback at times the callback didn't expect.

In Node.js we can often get around the problems by executing the callbacks from a process.nextTick callback:

function processCallbacks() {

 callbacks.forEach(cb => process.nextTick(() => cb()));

 callbacks = [];

}

and then we need to accept that our callbacks will always fire asynchronously. In C and C++ there is no general solution for executing deferred callbacks, so in a future post we will look at what our options are there.
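To see what "always asynchronously" means in practice, here is a small sketch built on the deferred processCallbacks above. The order array and the orderAtReturn name are only there to record the sequence of events.

```javascript
let callbacks = [];
const order = [];

function registerCallback(cb) {
  callbacks.push(cb);
}

function processCallbacks() {
  // Each callback is deferred until the current operation has completed.
  callbacks.forEach(cb => process.nextTick(() => cb()));
  callbacks = [];
}

registerCallback(() => order.push('callback'));

order.push('before processCallbacks');
processCallbacks();
order.push('after processCallbacks');

// Snapshot taken synchronously: the callback has not fired yet.
const orderAtReturn = order.slice();
console.log(orderAtReturn);

// Queued after the callback above, so by the time this runs the callback has fired.
process.nextTick(() => console.log(order));
```

The callback always runs with an empty call stack and after processCallbacks has finished its own bookkeeping, so it sees the same state every time.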
