Topological sort is one of those things that is way simpler than its name might suggest. It is a fundamental problem that arises in many areas of computer science. The thing with a lot of these fundamental concepts is that while they are incredibly important, we do not necessarily see them in code a whole lot. Obviously this highly depends on the kind of code we are considering, but in most environments there are multiple layers of abstraction underneath us. Clearly this enables us to be more productive, because we are provided with solutions to common problems and can focus on things unique to whatever we are working on. But even though we might not have to implement a solution ourselves we still have to identify the problems when they arise. Recently I encountered such a problem instance of topological sort in a popular open source project called Babel and thought it was a super nice illustration worth looking into.

Topological sort: The abstract point of view

Topological sort is a graph problem, which means that it asks a question about a given set of vertices and edges. More precisely, this problem is about directed graphs. But that is all just terminology for a bunch of points connected with arrows. Here is an example from Wikipedia1:

Directed acyclic graph

One of these constructs is the input to our problem. And the question we would like to answer is:

Is there an ordering of all nodes that respects the ordering induced by arrows between nodes?

Or in other words, can you list all nodes such that nodes connected by an arrow come after each other? In the example above such an ordering would have to list 8 before 9 for example. And indeed, one can proof that such an ordering exists if and only if the graph has no cycles. One possible solution for the graph above would be:

3, 7, 8, 5, 11, 10, 2, 9

This problem can be solved with an efficient algorithm that runs in linear time, but I do not want to go into detail about how that works here. The basic idea is simple though: Find a node that has no arrows pointing to it, add it to the ordering and remove it from the graph. Then repeat this until the graph is empty. Simple enough, but to make it run in linear time we have to find a node without incoming arrows directly, without searching the graph. Because if we search it in each step, the worst case scenario is that each step itself takes linear time which results in quadratic time in total. But fixing that is not super hard either, so lets move on and see how solving this problem could be useful in Babel.

Babel plugins: One instance of the topological sort problem

Okay, so Babel is a JavaScript compiler that is widely used to transform ES6 to ES5 and things like that. It can do many awesome things and all the information is available at Babel’s architecture is plugin based which means that all code transformations are defined as individual plugins. The idea is that these can be stacked together via a configuration file. One plugin might transform all let and const to var declarations while another one might transpile ES6 classes into their corresponding constructor functions. There are a lot of these plugins and it is really simple to write custom ones. As the ecosystem grew it became clear that some plugins directly rely on others running before or after them. For example: If you write a plugin that makes modifications to ES6 classes in your code, then it should definitely run before these classes are transformed into functions. In current versions of Babel, the order in which plugins run is determined by the order you specify them in the configuration file. This really is a problem, because now the user of your plugin has to know what the correct order is and will be confronted with weird behavior and errors otherwise. At the time of this writing it is still being discussed how to solve this problem in detail2, but the general idea is clear. Plugins can have dependencies on other plugins and we have to run them in an order that respects these dependencies. Sounds familiar, right?

That is exactly an instance of the abstract problem described above. The nodes are plugins and if plugin A depends on plugin B there is an arrow from B to A. Then a topological sort of all plugins respects the dependencies and can safely ignore the order specified by the user. I think that is a really beautiful illustration of the topological sort problem and shows the power of graphs. They can apply to much more than just things that already look like graphs.